Machine Learning Engineer (Distributed Training)
- $250,000-$400,000
- Santa Clara, CA
- Permanent
About the job
We are seeking a Machine Learning Engineer focused on distributed training to train, accelerate and deploy specialized foundation models for a later-stage scale-up with $50m+ in funding that is building the world's leading 3D foundation models.
The company was founded on the back of state-of-the-art MIT research in computer graphics. The founding team is a mix of some of the most cited researchers in this space globally and commercially experienced veterans from teams like Nvidia and Microsoft. You'll be joining an elite team of engineers and researchers.
You'll join to build and optimize the world's largest 3D-native ML systems, working at the lower levels of the stack to build an end-to-end ML framework that runs from pretraining all the way through training, quantization and inference. You'll work closely with model researchers on the in-house foundation models, optimizing for throughput and efficiency.
We are seeking a Machine Learning Engineer (Distributed Training) with:
- Experience with low-level training optimizations, including quantization and parallelism
- Demonstrable experience with low-level training systems, which could include attention and softmax work. This may include kernel experience.
- Expertise in PyTorch
Location: Bay Area on a hybrid basis
Compensation: Substantial cash and generous equity + potential for sign-on bonuses