Head of Cloud Inference
- $300,000-$400,000
- Remote
- Permanent
Developers struggle to deploy trained machine learning models because existing solutions are fragmented and incomplete, requiring extensive adjustments and model-specific optimizations. This next-generation platform aims to transform how AI is developed and deployed.
As the Head of Cloud Inference, you will spearhead a critical component of the enterprise offering: a GenAI cluster-level inference product. This Kubernetes-native solution delivers reliable, effortless deployment of customer GenAI models with top-tier performance at scale, regardless of the underlying GPUs or other accelerator hardware.
Join in shaping the future of AI infrastructure at scale—your work will directly power the next generation of AI applications used by millions of people worldwide.
Responsibilities
Team Leadership & Growth: Lead and scale the cloud inference organization to build next-generation cluster-level AI inference solutions.
Strategic Vision: Drive product strategy and design decisions for an enterprise-grade cloud inference serving platform; work closely with customers to ensure their success while advancing the platform as the best choice for AI inference in production.
People Development: Coach, mentor, and develop a high-performance engineering team while fostering cross-functional collaboration with product and customer support teams.
Technical Excellence: Navigate a fast-paced environment with shifting priorities while building deep technical expertise and driving adoption of cutting-edge technology.
Customer Success: Collaborate with engineering leaders to deliver integrated AI inference deployment solutions for enterprise customers.
Requirements
7+ years of experience in people management.
10+ years of experience in cloud infrastructure.
Proven experience developing production-quality, high-performance software.
Proven experience in AI/ML infrastructure, model serving, or a related field.
Proven experience in managing large teams and managers of managers.
A robust understanding of cloud infrastructure principles and of operating large-scale systems.
