Principal Machine Learning Engineer
- $250,000-$300,000
- Santa Clara, CA
- Permanent
About the job
Principal Machine Learning Engineer - Inference Serving Frameworks
Acceler8 Talent is seeking a Principal Machine Learning Engineer to join an early-stage startup backed by a Tier-1 VC that is rethinking AI infrastructure from first principles.
Founded by highly respected industry veterans, the company is innovating at the chip and system level to deliver an order-of-magnitude improvement in performance-per-watt for inference. That would mean a major economic shift for anyone running large-scale models: larger context windows, longer generations, and new workloads that become economically viable.
Responsibilities:
● Design, develop, and tune multi-node inference techniques to optimize throughput and latency.
● Employ strategies such as tensor-, pipeline-, and expert-parallel (TP/PP/EP) hybrids, continuous batching, and KV cache management to optimize performance across compute, networking, and storage.
● Drive performance improvements in frameworks (vLLM, SGLang, PyTorch) and develop advanced cluster scheduling algorithms to optimize throughput and latency.
● Engage with the open-source community to upstream optimizations, influence roadmaps, and ensure the sustainability of our contributions.
● Apply best practices in performance benchmarking, testing, and debugging to maintain a robust production-grade stack.
Experience:
● Strong proficiency in Python, C++, and PyTorch, with a demonstrated history of shipping high-quality software in a startup or fast-paced environment.
● Experience contributing to one or more LLM inference serving frameworks such as vLLM or SGLang.
● Deep understanding of LLM inference internals (KV cache, batching, attention mechanisms).
● Experience running and optimizing large-scale workloads on heterogeneous clusters. Familiarity with networking, storage management, or distributed scheduling (e.g., Orca, LMCache) is a significant plus.
● Proficiency in performance analysis and GPU kernel development (CUDA, Triton, or ROCm) is a plus.
● Master’s or PhD in Computer Science, Computer Engineering, Electrical Engineering, or a related field.
If you're looking for massive ownership, huge impact, and the opportunity to build from the ground up, please apply here or reach out to me at ltomaszko@acceler8talent.com to hear more.