Inference Engineer

GW475
  • $200,000-$300,000
  • San Francisco, CA
  • Permanent

About the job
📍 Onsite – San Francisco

💰 $200k–300k base + meaningful equity
Acceler8 Talent is partnering with an AI infrastructure startup building the platform that next-generation AI systems will run on.


Fresh out of stealth, the company has already reached eight-figure revenue, raised an $80M Series A, and is scaling a world-class engineering team across inference, distributed systems, compiler infrastructure, and high-performance AI compute.


Their platform automatically maps complex AI workloads across CPUs, GPUs, and emerging accelerators to maximize inference performance and hardware efficiency at scale.

As a Software Engineer focused on inference systems, you’ll own the runtime layer that executes modern models end-to-end under real production constraints.
Responsibilities ⚙️

• Design and optimize production inference pipelines

• Improve batching, scheduling, concurrency, and runtime behavior

• Optimize KV cache systems and memory efficiency

• Debug latency and throughput bottlenecks across model and systems layers

• Partner closely with compiler, kernel, and distributed systems engineers

• Contribute to large-scale distributed inference infrastructure


Requirements

• Hands-on experience building and scaling production ML inference systems

• Experience owning inference or model serving infrastructure end-to-end

• Strong understanding of distributed systems and runtime behavior under load

• Experience optimizing latency, throughput, batching, and memory efficiency

• Strong Python and/or C++ skills

• Comfortable operating in highly technical, high-ownership environments


Bonus Points

• Experience with TensorRT-LLM, vLLM, or custom inference runtimes

• CUDA, kernel optimization, or compiler-adjacent systems experience

• Experience optimizing GPU utilization at scale

• Background in AI infrastructure or high-performance compute systems



If you're interested in building inference infrastructure for next-generation AI systems at massive scale, please apply here or reach out directly to hear more!


Anna Button, Researcher
