Software Engineer
GW311
Posted: 25/02/2026
- $200,000-$350,000
- San Francisco, CA
- Permanent
About the job
Software Engineer: Machine Learning Infrastructure - San Francisco (Onsite)
A frontier AI robotics company, backed by over $100M in early funding from top-tier investors including NVIDIA, is hiring a Software Engineer: Machine Learning Infrastructure to join its team.
What will I be doing?
- Lead the buildout and scaling of high-performance GPU clusters supporting large-scale robot model training
- Drive efficiency and usability of compute resources for research teams running distributed experiments
- Architect and optimize data pipelines, storage systems, and throughput across heavily distributed environments
- Oversee low-latency inference systems running on on-premises GPUs deployed alongside robotic fleets
- Continuously improve performance across compute, networking, and storage infrastructure
What we’re looking for:
- Proven experience operating and scaling GPU infrastructure for large distributed ML workloads
- Strong orchestration expertise (Slurm, Kubernetes or similar)
- Hands-on experience building robust ML data pipelines at scale
- Deep systems knowledge spanning hardware, networking, and storage layers
- Familiarity with NVIDIA GPU tooling and ecosystem
What’s in it for me?
- Build and operate the compute backbone behind embodied foundation models
- Work at the frontier of large-scale AI + real-world robotics
- Join a highly senior team with backgrounds at OpenAI, Boston Dynamics, and DeepMind
- Backed by approximately $140–150M in funding from leading investors
- High ownership, real-world impact, and serious compute resources
Apply now for immediate consideration!
Kirstie Moffat
ML Research & Engineering Recruiter