Principal Compiler Developer

GW346
  • $250,000-$325,000
  • Santa Clara, CA
  • Permanent

About the job


Acceler8 Talent is seeking a Principal Compiler Developer to join an early-stage startup, backed by a Tier-1 VC, that is rethinking AI infrastructure from first principles.


Founded by highly respected industry veterans, the company is innovating at the chip and system level to deliver an order-of-magnitude improvement in performance per watt for inference. That would mean a huge economic shift for anyone running large-scale models: larger context windows, longer generations, and new workloads that become economically viable.


You will lead or join a small team (depending on experience) developing new ML compiler capabilities in a modern ML compiler stack, from PyTorch through Triton down to CUDA or machine IR, on leading-edge GPUs and new accelerators.


Responsibilities:

 ● Build and optimize ML compiler infrastructure across the stack

 ● Lower PyTorch/Triton workloads to efficient machine IR

 ● Optimize for GPUs (CUDA/ROCm) and XPU AI hardware

 ● Collaborate on HW–SW co-design in a fast-moving startup


Experience:

 ● PhD + 5 years (or equivalent) in compilers, ML systems, or related fields

 ● Strong compiler fundamentals and performance optimization skills

 ● Experience with PyTorch, Triton, and low-level code generation

 ● GPU expertise and HW–SW co-design experience a plus


If you're looking for massive ownership, huge impact, and the opportunity to build from the ground up, please apply here or reach out to me at ltomaszko@acceler8talent.com to hear more.


Luke Tomaszko, Senior Semiconductor & Chip Design Recruiter

Apply for this role