GPU Kernel Engineer (CUDA/Triton)

BBBH14826_1714578212
  • US$200,000 - US$400,000 per annum
  • San Francisco, California
  • Permanent

Company Overview:

Join a pioneering startup at the forefront of Large Language Models (LLM) technology. Having successfully completed our Series A funding round from top-tier VC firms and industry leaders, we're expanding our foundational team to further our mission in the LLM arena. Our aim is to develop sophisticated LLMs that can revolutionize various industries, from digital assistants to advanced data analysis.

Role Description for GPU Kernel Engineer (CUDA/Triton):

As a specialist in GPU Kernels, you will play a crucial role in ensuring ultra-efficient training and inference for our expansive neural networks. Your responsibilities will include developing high-throughput, low-latency GPU kernels using CUDA or Triton, optimizing kernel fusions, and enhancing multi-GPU communication. Your expertise in GPU programming will be vital in pushing the boundaries of what LLMs can achieve.

Key Responsibilities of this GPU Kernel Engineer (CUDA/Triton):

  • Implement and optimize GPU kernels for large-scale neural network models, with a focus on CUDA or Triton.
  • Innovate in kernel fusion techniques and maximize multi-GPU communication efficiency.
  • Collaborate with our team of experts in LLMs to build and deploy powerful and efficient AI solutions.

What We Offer this GPU Kernel Engineer:

  • Option for remote or hybrid working arrangements, with offices in NYC and SF.
  • Highly competitive salary, positioned at the 90th percentile of the market.
  • Generous equity and benefits package, reflecting your key role in our growth.

This is an exciting opportunity to be part of a team that's shaping the future of LLM technology. Your work will contribute directly to the development of groundbreaking AI solutions that will have a lasting impact on the industry.

Keywords: Artificial Intelligence, Machine Learning, GPU Kernel, Graphics Processing Units, CUDA, Triton, Inference, Training, Optimization, Neural Network Models, Parallel Computing, Low-Level Optimization, Compilers, Software Architecture, Workload Mapping, Implementation, Testing, Bring-up, Debug, Kernel Components, Kernel Infrastructure, Compute Kernels

Rose Noterman, Recruitment Executive

Apply for this role