Software Engineer
- $180,000-$300,000
- San Francisco, CA
- Permanent
About the job
Compiler & Runtime Systems
San Francisco, CA (Onsite)
$250,000-$300,000 Base + Equity
I'm working with a rapidly growing AI infrastructure company building the orchestration layer for the future of AI compute.
As AI systems become increasingly complex and hardware becomes increasingly heterogeneous, simply deploying more GPUs is no longer enough. This team is building the software stack that intelligently partitions, schedules, optimizes, and executes AI workloads across diverse hardware architectures.
They've recently emerged from stealth with an $80M Series A, eight-figure revenue, Fortune 500 deployments, and a growing roster of AI-native customers.
This is not a traditional compiler role.
You won't be building language tooling in isolation. You'll be working across compilers, runtime systems, serving infrastructure, scheduling, and execution optimization to improve how large-scale AI workloads run in production.
You'll work on problems such as:
• MLIR and compiler optimizations for AI workloads
• Runtime systems and execution planning
• Scheduling and workload partitioning across heterogeneous hardware
• Memory movement, kernel orchestration, and execution efficiency
• AI inference serving and latency optimization
• Speculative decoding and next-generation serving architectures
• Performance profiling and bottleneck analysis across the AI stack
We're looking for engineers who have:
• Strong systems programming and performance engineering fundamentals
• Experience building compiler systems, runtime systems, or execution infrastructure
• Experience implementing compiler passes, IR transformations, lowering, or code generation systems
• Strong understanding of memory systems, scheduling, and hardware performance
• Strong C++ and/or Python engineering skills
• Experience working on performance-critical systems
Particularly relevant background includes:
• Experience optimizing large-scale inference workloads
• Experience with GPUs, AI accelerators, or heterogeneous compute systems
• Familiarity with kernel dispatch, launch APIs, or memory allocators
• Experience with distributed systems and serving infrastructure
• Experience profiling and debugging production performance bottlenecks
If you're excited about compilers, runtime systems, and helping define how AI workloads execute across the next generation of compute infrastructure, I'd love to chat.