Research Engineer Job at Delphi, San Francisco, CA

  • Delphi
  • San Francisco, CA

Job Description

Our “Clone Brain” architecture allows you to create a digital representation of your mind—reflecting your knowledge, tone, ways of thinking, and even the purpose that drives your conversations. (For example, a leadership coach might direct their clone to mentor emerging managers, while a consultant might want their clone to focus on sales strategy and client onboarding.)

Up until now, many of our improvements have come from intuition, first principles, and a very basic testing suite. We want to increase the fidelity of each Clone Brain, ensuring it captures its owner’s unique style, knowledge, and conversational aims, while also being able to reason in new situations. But to do that, we need rigorous measurements and interpretability tools that transform “it feels right” into “we have metrics & benchmarks that prove it.”

Enter the Research Engineer – Evals & Interpretability. You’ll develop frameworks that quantify how well each digital clone mirrors the authenticity and expertise of its human counterpart, while also building the tooling to open the black box and figure out why the clone behaves the way it does. If you’re curious about cognitive science, neural network interpretability, and the essence of what makes a human mind unique—this role has your name on it.

What You Will Work On

1. Frontier Eval Systems & Metrics

  • Design, implement, and manage robust evaluation frameworks that measure how faithfully a clone reflects its owner’s tone, style, purpose, and reasoning.
  • Develop automated tests and analysis pipelines to compare new models and architectures, ensuring we’re always improving the fidelity of our Clone Brain.

2. Interpretability & Debugging

  • Build interpretability tools that shine a light on the internal workings of our clone models, from attention heads to knowledge graph structures.
  • Investigate model behaviors and anomalies, surfacing insights that guide algorithmic improvements and mitigate unexpected outcomes.

3. Collaboration & Deployment

  • Work closely with our AI, product, and engineering teams to integrate your evaluation suites into production workflows.
  • Contribute to real-time feedback loops that help experts refine their clone’s knowledge and style with confidence.

4. Infrastructure & Tooling

  • Develop the technical infrastructure for large-scale experimentation and analysis, ensuring that interpretability and eval frameworks can scale across thousands of clones.
  • Help define our data schemas, retrieval strategies, and model instrumentation in collaboration with data and infra engineers.

Preferred Abilities

  • Hands-On Research Experience: A track record of designing experiments and running them end-to-end, whether in AI, ML, or another scientific domain.
  • LLM Familiarity: Experience evaluating or fine-tuning large language models, with an emphasis on measuring alignment, style transfer, or interpretability.
  • Python Proficiency: Strong coding skills to build robust pipelines and experiment frameworks.
  • Evals & Benchmarking: Familiarity with common language model benchmarks and an eagerness to develop new ones.
  • Interpretability Fundamentals: Knowledge of mechanistic interpretability, feature attribution, or circuit-level analysis is a huge plus.
  • Infrastructure & Tools: Comfort with containers, scaling experiments on clusters, and building internal tools.
  • Experimental Mindset: Ability to pivot quickly when an approach doesn’t pan out, and a relentless drive to find creative solutions to open-ended questions.

Why You Might Like This Role

  • AI evaluation is at the frontier of research, and how to do evals correctly is still an open question. People who thrive in this role are excited by that challenge and by the opportunity to be at the forefront of research.
  • High level of ownership and impact on product, technical architecture, and company culture
  • Opportunity to define the future of digital cloning, ultimately enabling digital immortality and one-on-one mentorship for the masses
  • Challenging work that pushes you to your limits
  • Collaboration with a team passionate about scaling human potential and personalized learning
  • Chance to join a fast-growing startup creating a new market, approaching problems from first principles while valuing design and brand

Why You Might Not Like This Role

  • Not a 9-to-5: We move fast, iterate often, and tackle ambitious challenges; this isn’t a clock-in/clock-out environment.
  • No Existing Blueprint: If you prefer well-trodden paths and established frameworks, be warned: we’re creating something that’s never existed before.
  • Applied AI Over Foundation Research: Our focus is on building and optimizing real products for end users, not on training new LLMs from scratch.
  • Fully On-Site: We believe in-person collaboration drives better ideas. If you’re looking for remote, this might not be for you.
