Senior AI Engineer

Lucida is teaching the world to speak.

Two billion people are trying to learn a language. Almost all of them are stuck, not because they lack motivation, but because the only thing that actually works (talking to a human tutor) is too expensive, too inconvenient, or too embarrassing.

We're building the alternative: a voice-first AI tutor you can actually have a conversation with, anytime, in your pocket. Real-time. Sub-second. Feels-like-a-person. Already serving a million learners.

We're well-funded, seed-stage, and we're hiring the engineer who'll build the backbone behind that product.

The role

You'll own a meaningful surface of our backend: the systems that turn audio, models, prompts, and user state into a working tutor at scale. Day-to-day, you'll:

Design and operate the real-time conversational pipeline: streaming services and WebSocket interfaces that keep latency budgets honest at the scale of a million users
Build and harden the LLM orchestration layer: prompt design as code, structured outputs, streaming, retries, fallbacks, cost control across multiple providers
Treat prompts as engineering artifacts: versioned, evaluated, regression-tested. Vibes are not a methodology
Take open-source models (LLM, ASR, TTS, avatar) from a paper or HF repo and put them on our GPUs: benchmark, optimize, serve, monitor
Fine-tune and train our own models on top of open-source bases: curate datasets, run training jobs, evaluate against production criteria, and ship the result
Design event-driven media flows: webhooks, post-session processing, recording and export pipelines
Own third-party integrations end-to-end: contracts, retries, observability, the boring-important stuff
Make architecture decisions with the founders, not after them

What we're looking for

5+ years writing production Python you're not embarrassed by: typed, tested, readable
Deep fluency in asyncio and concurrent/streaming code
Strong command of HTTP, WebSockets, and event-driven systems
Hands-on experience integrating with LLM APIs in production: streaming, tool use, structured outputs, and the operational realities (rate limits, retries, cost control)
A real sense of prompt engineering as engineering: you've shipped prompts that survived contact with users rather than ones that just "feel good in the playground", and you've iterated on them with data
A real fine-tuning / training track record: you've taken an open-source model, prepared the data, run the training, evaluated it honestly, and shipped the result to users. Not a notebook tutorial. A model that moved a metric
Experience deploying and serving your own models on GPUs: quantization, batching, KV-cache, latency/throughput tradeoffs
A debugging instinct for distributed systems at scale: traces, profiling, backpressure, capacity planning
Comfort with Postgres, Redis, and a queue/broker layer
Pragmatism: you ship, you measure, you iterate. You don't over-engineer, and you don't under-test

Nice to have

Real-time media systems (WebRTC, SFU, streaming pipelines)
Audio or speech model deployment and fine-tuning in production
Distillation, synthetic data generation, or RLHF/DPO-style alignment work
Multi-region or multi-cloud infrastructure
Cost optimization at scale: token economics, GPU utilization, caching strategies
Open-source contributions
