Senior Platform Engineer, Voice AI

Company: Together AI
Location: San Francisco
Posted on: April 2, 2026

Job Description:

About the Role

Together AI is building the best inference infrastructure for voice applications. Our Voice AI platform powers production-grade, real-time voice agents and applications, serving speech-to-text (STT) and text-to-speech (TTS) models with best-in-class latency and reliability.

We're looking for a Senior Platform Engineer to own the API and infrastructure layer for voice workloads. You'll build the real-time WebSocket and HTTP APIs that developers use to ship voice experiences, design autoscaling for latency-sensitive streaming workloads, and ensure our multi-provider voice platform is reliable enough for production voice agents handling millions of calls.

This is a foundational hire on a small, high-impact team. Voice APIs have fundamentally different infrastructure requirements than text-based inference: bidirectional audio streaming, stateful connections, tight latency SLOs, and complex multi-model routing. You'll define how developers interact with Together's voice platform as we grow from early customers to the default infrastructure for voice AI.

Own the real-time API layer (WebSocket and HTTP streaming) that powers Together's voice platform.
Design autoscaling and orchestration for voice workloads running on tens of thousands of GPUs.
Build the developer experience (APIs, observability, and tooling) for a fast-growing product area.
Work with production voice customers (contact centers, AI agents, communication platforms) to ship what they actually need.
Join a small, early-stage team with outsized impact on a new product line.

Responsibilities

Build and harden real-time WebSocket and HTTP streaming APIs for STT and TTS, including connection lifecycle management, backpressure, error handling, and reconnection, at the reliability bar needed for production voice agents.
Design and ship autoscaling for voice model endpoints that handles bursty, real-time traffic patterns, accounting for concurrent connection limits, streaming state, and hard latency ceilings.
Implement voice-specific API features: word-level alignment, real-time speaker diarization, audio format flexibility (g711/mulaw for telephony, PCM, WebRTC formats), pronunciation controls, and multi-context WebSocket support.
Build voice-specific observability: latency breakdowns, audio quality signals, and dashboards that help both the team and customers debug issues.
Own multi-model normalization across our model partners (Cartesia, Deepgram, Rime, and others), ensuring consistent API behavior regardless of the underlying provider.
Collaborate with the ML engineering side of the team on the interface between the API layer and the model serving stack, ensuring latency and reliability requirements are met end-to-end.
Contribute to developer experience: API design, documentation, integration cookbooks, a playground, and showcases of how best-in-class voice agents are built.
Lay the groundwork for multiple new products down the line.

Requirements

5+ years of experience building large-scale, real-time distributed systems and API services.
Deep expertise in real-time streaming infrastructure: WebSocket server architecture, Server-Sent Events, bidirectional streaming, connection multiplexing, and stateful protocol design.
Expert-level programming in TypeScript and Python; experience with Rust is a plus.
Strong distributed systems fundamentals: load balancing, autoscaling, rate limiting, and traffic shaping for latency-sensitive workloads.
Experience with Kubernetes, including custom autoscalers, resource management, and health checking for stateful services.
Strong product sense: you care about API ergonomics and think about what developers building voice apps actually need.
Comfort working on a small, early-stage team where you'll wear multiple hats and move fast.
Experience with audio or media protocols (WebRTC, g711, PCM encoding) is a strong plus.
Familiarity with ML model serving infrastructure and how inference engines work is a plus; you'll interface with the serving layer regularly.
Full-stack experience (React, Next.js) is a nice-to-have for contributing to developer-facing tooling.
Bachelor's or Master's degree in Computer Science, Computer Engineering, or a related field, or equivalent practical experience.

About Together AI

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers on our journey to build the next generation of AI infrastructure.

Compensation

We offer competitive compensation, startup equity, health insurance, and other competitive benefits. The US base salary range for this full-time position is $200,000 - $260,000, plus equity and benefits. Our salary ranges are determined by location, level, and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.

Please see our privacy policy at https://www.together.ai/privacy

Other IT / Software / Systems Jobs

Tech Lead Manager, Billing and Insights
Description: About Glean: Glean is the Work AI platform that helps everyone work smarter with AI. What began as the industry's most advanced enterprise search has evolved into a full-scale Work AI ecosystem, powering (more...)
Company: Glean
Location: San Francisco
Posted on: 04/4/2026

Software Engineer, Backend
Description: About Glean: Glean is the Work AI platform that helps everyone work smarter with AI. What began as the industry's most advanced enterprise search has evolved into a full-scale Work AI ecosystem, powering (more...)
Company: Glean
Location: San Francisco
Posted on: 04/4/2026

Product Manager, Developer Experience
Description: About us At Sierra, we're creating a platform to help businesses build better, more human customer experiences with AI. We are primarily an in-person company based in San Francisco, with growing offices (more...)
Company: Sierra
Location: San Francisco
Posted on: 04/4/2026

Systems Research Engineer, GPU Programming
Description: About the Role As a Systems Research Engineer specialized in GPU Programming, you will play a crucial role in developing and optimizing GPU-accelerated kernels and algorithms for ML/AI applications. Working (more...)
Company: Together AI
Location: San Francisco
Posted on: 04/4/2026

Software Engineer, Full-Stack
Description: About us Encord is the universal data layer for AI that helps 300 AI teams train and run models on the right data. Our platform indexes, curates, annotates, and evaluates data across the full AI lifecycle, (more...)
Company: Encord
Location: San Francisco
Posted on: 04/4/2026

Product Lifecycle Management (PLM) Analyst
Description: Cognizant is a leading provider of IT and BPO services, delivering critical initiatives to a variety of global clients. The PLM Product Lifecycle Management services within New Product Introduction team (more...)
Company: Cognizant
Location: Stanford
Posted on: 04/4/2026

Technical Project Manager
Description: Technical Project Manager Hybrid, 6-month contract with the potential to extend. WHY DEPT We are a Growth Invention company built to help the world's most ambitious brands grow faster. Operating (more...)
Company: Dept
Location: San Francisco
Posted on: 04/4/2026

Product Manager, Agent SDK
Description: About us At Sierra, we're creating a platform to help businesses build better, more human customer experiences with AI. We are primarily an in-person company based in San Francisco, with growing offices (more...)
Company: Sierra
Location: San Francisco
Posted on: 04/4/2026

Product Engineer, Applied AI
Description: Pocus exists to supercharge GTM teams. We make every rep a 10x seller. With Pocus, organizations can have fewer, better reps to drive increased pipeline and revenue. How We've created the world's (more...)
Company: Pocus
Location: San Francisco
Posted on: 04/4/2026

Deal Desk & Pricing Senior Analyst
Description: About Checkr Checkr is building the data platform to power safe and fair decisions. Established in 2014, Checkr's innovative technology and robust data platform help customers assess risk and ensure (more...)
Company: Checkr
Location: San Francisco
Posted on: 04/4/2026
