Head of Cloud Inference
- $300,000-$400,000
- Remote
- Permanent
Developers struggle to deploy trained machine learning models because existing solutions are fragmented and incomplete, requiring extensive adjustments and model-specific optimizations. This next-generation platform aims to transform how AI is developed and deployed.
As the Head of Cloud Inference, you will spearhead a critical component of the enterprise offering: a GenAI cluster-level inference product. This Kubernetes-native solution delivers reliable, effortless deployment of customer GenAI models with top-tier performance at scale, regardless of the underlying GPUs or other accelerator hardware.
Join in shaping the future of AI infrastructure at scale—your work will directly power the next generation of AI applications used by millions of people worldwide.
Responsibilities
Team Leadership & Growth: Lead and scale the cloud inference organization to build next-generation cluster-level AI inference solutions.
Strategic Vision: Drive product strategy and design decisions for an enterprise-grade cloud inference serving platform; work closely with customers to ensure their success while advancing the platform as the best choice for AI inference in production.
People Development: Coach, mentor, and develop a high-performance engineering team while fostering cross-functional collaboration with product and customer support teams.
Technical Excellence: Navigate a fast-paced environment with shifting priorities while building deep technical expertise and driving adoption of cutting-edge technology.
Customer Success: Collaborate with engineering leaders to deliver integrated AI inference deployment solutions for enterprise customers.
Requirements
7+ years of experience in people management.
10+ years of experience in cloud infrastructure.
Proven experience developing production-quality, high-performance software.
Proven experience in AI/ML infrastructure, model serving, or a related field.
Proven experience in managing large teams and managers of managers.
A robust understanding of cloud infrastructure principles and of operating large-scale systems.
