This job is no longer available

The position may have been filled or the posting has expired.


Head of Platform/AI Cluster Management - System Integrator at Hamilton Barnes Associates Limited

Hamilton Barnes Associates Limited | No longer available | $500,000+/year

Job Description

Ready to lead innovation at the intersection of platforms and artificial intelligence?

Join a pioneering technology company driving advancements in cloud, AI, and data-driven solutions across global markets. The organization is recognized for fostering innovation, scalability, and collaboration through cutting-edge platforms that empower enterprises to evolve intelligently.

The team is hiring a Head of Platform/AI Cluster Management to oversee the strategic development, integration, and optimization of AI and platform initiatives. The role will focus on leading cross-functional teams, enhancing performance and scalability, and aligning technology strategy with long-term business goals.

Shape the future of intelligent platforms and transformative innovation. Apply now!

Responsibilities
  • Own the scheduler/runtime layer (Slurm, Kubernetes, Ray), including multi-tenancy, quotas, and GPU/host fleet management.
  • Lead cluster operations across images, CI/CD, repair/health, performance/telemetry, and incident response.
  • Deliver platform services that ensure workload SLOs and reliable runtime execution.
  • Define and implement namespace/tenancy design, node health automation, golden images, admission controls, on-call runbooks, and go-live gates.
  • Collaborate closely with infra, SRE, and network teams to optimize workload placement and cluster efficiency.
  • Provide hands-on expertise in NCCL behaviours, placement strategies, and congestion signal management.
Requirements
  • Deep expertise in cluster management, scheduling, and runtime environments for large-scale compute.
  • Hands-on background with Slurm, Kubernetes, Ray, or similar orchestration platforms.
  • Strong understanding of NCCL performance tuning, workload isolation, and congestion management.
  • Experience scaling multi-tenant, GPU-heavy clusters with strict SLOs.
  • Ability to thrive in a startup environment with full ownership over platform and cluster strategy.
Salary
  • $500,000 gross per year (Negotiable)

$500k beats the market for Emergency Management Directors nationally

National salary averages
This job: $500k (↑ 318% vs typical senior-level)
Market range (10th-90th percentile): $51k to $160k

Top performers earn significantly more—skill and negotiation matter.

Senior roles pay 86% more than entry—experience is well rewarded.

This is a strong offer—weigh total comp and growth potential.

Slight employer advantage

Standard dynamics. Preparation and demonstrated value matter most.

Hiring leverage: Balanced
Wage leverage: Balanced
Mobility: Limited

Where to negotiate

Base salary
Sign-on bonus
Title / level
Remote flexibility
Scope & responsibility
Start date / PTO


Watch out for

Limited mobility: Few adjacent roles—switching employers is harder.

Focus on demonstrating unique fit and value.

Does this path compound?

[Chart: job growth vs. pay upside. Quadrants: High churn (growth, flat pay); Compound (growth + pay upside); Plateau (limited growth); Specialize (experts earn more).]
Expertise pays off

Limited new roles, but specialists earn significantly more.

10-year job growth: +3%
A bachelor's degree is typically expected.

Openings come from turnover, not new growth. Differentiate to advance.

Labor data: BLS 2024