Research Scientist – RL & LLM Post-Training | Axioma Search

Research Scientist (LLM/RL)

About

Well-funded frontier AI startup building state-of-the-art agentic systems that automate complex, multi-step tasks. The Models team develops the core LLMs and vision-language models focused on instruction following, tool use, and reliable decision-making at controlled inference cost.

What you'll do

  • Research post-training methods for large multimodal language models, with a focus on RL and feedback-driven learning
  • Design reward models and large-scale reinforcement learning setups
  • Build automated data collection pipelines using human and machine feedback
  • Develop evaluations that capture real capability gains
  • Translate product failures and use cases into improved training signals

What you'll need

  • Strong research background with hands-on experience in LLM post-training, alignment, or reinforcement learning
  • Proficiency in Python and at least one major deep learning framework (PyTorch, JAX, or TensorFlow)
  • Experience training large models on distributed systems
  • Publications at top-tier conferences (NeurIPS, ICML, ICLR, ACL, CVPR, etc.)
  • Comfort working in fast-moving research environments

Shortlisted candidates will be contacted within 48 hours.

  • Location: London, Paris
  • Salary / Compensation: Up to £220k + equity
  • Work Setup: Permanent, hybrid
  • Sectors: Agentic, Frontier AI / Foundation Models
  • Skills: LLMs, Reinforcement Learning, Post-training, PyTorch, Distributed training

Role Contact

Calvin Duffy
