Cloud Inference Tech Lead

GW28
  • $250,000-$325,000
  • Remote
  • Permanent

About the Role

Developers often struggle to deploy trained ML models because existing solutions are fragmented and require heavy customization. This role leads a GenAI cluster-level inference initiative, built as a Kubernetes-native platform, to deliver reliable, high-performance deployment across diverse GPUs and other accelerators.

What You’ll Do

  • Technical Leadership: Mentor engineers and help set the technical direction for large-scale GenAI inference.

  • Strategic Vision: Drive product strategy and design for an enterprise-grade cloud inference serving platform; partner closely with customers to ensure success. 

  • Technical Excellence: Support rapid growth and new use cases in a fast-moving environment, applying cutting-edge solutions. 

  • Customer Success: Collaborate across product and engineering leadership to deliver integrated, large-scale cloud inference deployments. 

What You Bring

  • 10+ years in cloud infrastructure.

  • Proven experience building production-quality, high-performance cloud software.

  • Background in AI/ML infrastructure or model serving.

  • Demonstrated technical leadership and cross-team collaboration.

  • Track record of customer excellence.

  • Strong understanding of operating large-scale systems on cloud infrastructure. 

Work Setup

Based in the US or Canada; work remotely or from the company’s office. Onboarding is conducted in person at the Los Altos, CA office.

Equal Opportunity & Compliance

Committed to equal employment opportunity and providing reasonable accommodations. Participates in E-Verify in the US. 

Tyler Long, Software Systems & HPC Recruiter
