MTS: ML Infrastructure

BBBH17298_1719264405
  • US$180,000 - US$200,000 per annum
  • San Francisco, California

Revolutionize AI with Us by Helping Everyone Save Time

Join our mission to redefine human-computer collaboration and automate workflows with cutting-edge AI products. Be part of a team shaping the future of enterprise operations, leveraging Large Language Models (LLMs) to elevate organizational impact.

Your Impact:

  • Collaborate on delivering captivating experiences through Large Language Models.
  • Architect Scalable ML Systems: Design and implement scalable machine learning and distributed systems for LLMs.
  • Optimize Under the Hood: Innovate at lower levels of the stack, creating high-performance infrastructure with custom kernels.
  • Master Parallelism Methods: Develop parallelism methods for large-scale distributed LLM training.

Your Skills:

  • Experience training LLMs with frameworks such as Megatron and DeepSpeed, and deploying them with serving stacks such as vLLM, TGI, and TensorRT-LLM.
  • Strong grasp of the architectures of cutting-edge AI accelerators such as TPUs, IPUs, and HPUs, and their associated tradeoffs.
  • Proficiency working under the hood with kernel languages such as OpenAI Triton and Pallas, and with compilers such as XLA.
  • Proven hands-on experience in tuning LLM workloads. Familiarity with MLPerf or production workloads is a plus.

If you're passionate about driving AI innovation and pushing the boundaries of what's possible, we invite you to join our collaborative and forward-thinking team.

Victor Pascoe, Researcher
