Lead LLM Evals Engineer

GW267 Posted: 28/01/2026

$250,000-$350,000
San Francisco, CA
Permanent

About the job

Lead LLM Evals Engineer | SF or Redwood City

I'm hiring a Lead LLM Evals Engineer to join an early-stage physical AI startup building systems with the general physical ability to experiment, engineer, and manufacture anything. They’re a small, deeply technical team pushing agentic LLMs into real autonomous workflows tied to physical systems, factories, and end-to-end execution.

This role owns the evaluation and verification layer for agentic LLM systems operating in complex, long-horizon environments. You’ll build eval harnesses, automated verifiers, and regression gates that determine whether agents can actually plan, execute, recover, and ship real outcomes across simulated and real-world workflows. The work directly shapes how fast these systems improve, how safely they operate, and whether progress is real or illusory.

→ Build eval harnesses for agentic LLM systems in complex workflows

→ Design verifiers for planning, execution, recovery, and constraint adherence

→ Turn eval failures into training signals with research and systems teams

Both Senior & Lead levels considered.

Interested? Apply now!

Nick Bell, ML Research & Engineering Recruiter


