Machine Learning Engineer (Inference)
- $200,000-$350,000
- San Francisco, CA
- Permanent
About the job
We are seeking an inference-focused Machine Learning Engineer to join a Stanford spin-out scale-up building a foundational infrastructure layer for AI inference.
The company was founded on the back of a successful exit, with the core of the previous founding team launching this new venture. Their aim is to dramatically improve inference efficiency across the stack, tackling bottlenecks in custom compilers, kernels, and distributed orchestration.
They are hiring across the stack, with a particular focus on accelerating inference performance using cutting-edge research and engineering techniques. You will work on lower-level systems, building and optimizing serving stacks.
We are seeking a Machine Learning Engineer (Inference) with:
- A focus on inference and serving environments
- Experience building and optimizing inference stacks
- Exposure to inference-related tools and frameworks, such as vLLM, TensorRT, SGLang, or PyTorch
Location: San Francisco or the Bay Area, on a hybrid basis
Compensation: Competitive base salary + meaningful equity % + potential sign-on bonus