Principal Machine Learning Researcher Job at Alldus, San Jose, CA

  • Alldus
  • San Jose, CA

Job Description

Principal / Director, AI Research – Reinforcement Learning for LLMs

We're hiring a Principal or Director-level AI Researcher with deep expertise in Reinforcement Learning and LLM post-training to join our growing AI research group. This is a research-first role, with a mandate to push the frontier of model alignment, safety, and performance — working with foundation models in real-world, high-stakes environments.

You won’t be handed toy problems or legacy systems. Instead, you’ll lead applied research efforts focused on tuning, aligning, and optimizing large models for privacy, security, and interpretability, in one of the few spaces where LLMs have both massive scale and measurable consequences.

What You’ll Work On:

This role centers on building and refining intelligent agents that interact with sensitive data and complex access controls, using modern reinforcement learning and post-training techniques:

  • Post-training of LLMs using RL: Design and run experiments with methods like PPO, DPO, RLAIF, and other fine-tuning strategies to align model behavior with security and privacy goals
  • RL for Self-Correction & Redaction: Enable models to iteratively improve their predictions on document classification, redaction, and identity resolution through self-rewarded feedback loops
  • Model Alignment & Safety: Contribute to the development of our “LLM Firewall” — filtering prompts/responses to prevent jailbreaking, data leakage, and adversarial exploits
  • Inference Stack & Optimization: Collaborate with engineers optimizing our in-house inference stack to make LLaMA-class models performant at scale

What We’re Looking For:

  • Demonstrated expertise in Reinforcement Learning applied to language models or decision-making agents
  • Strong understanding of post-training methodologies (e.g., RLHF, DPO, preference modeling, rejection sampling, offline RL)
  • Solid background in LLMs, token-level reasoning, and language modeling internals
  • Publication record or research contributions in top-tier venues (NeurIPS, ICLR, ICML, ACL, etc.) preferred
  • Ability to work independently and iterate quickly — experience in scrappy, high-output research environments a plus
  • Industry experience is not required — we care more about the depth of your research thinking and experimentation rigor

Why This Role:

  • Join a company with massive real-world data, impactful use cases, and a mature infrastructure
  • Avoid the grind of infra-focused roles — we’ve already solved those problems
  • Shape the next phase of LLM alignment, self-correcting models, and AI safety at inference time
  • Work on problems with technical depth and direct product impact
