Research Engineer, RL Environment | Agentic AI | Axioma Search

Research Engineer (RL Environment)

About 

This is a well-funded frontier AI company building agentic systems that automate complex, multi-step work. 

Agents do not improve in a vacuum. They need environments to operate in, tasks to solve, and clear signals for what good looks like. This role exists to build that layer.

Following recent progress, the company's agents are highly competitive on computer-use-style benchmarks, and the company has launched a more visible product layer to make the technology easier to demonstrate.

The Environment and Tasks team builds the playground itself: synthetic websites, structured workflows, task sets, and evaluation environments where agents can act, fail, retry, and learn.

What you’ll do

  • Build training and evaluation environments for agentic systems
  • Create synthetic websites, workflows, and task suites that reflect useful real-world work
  • Define reward signals and success criteria for agent behaviour in structured environments
  • Turn documentation, tools, and existing workflows into interactive agent tasks
  • Improve the realism, coverage, and difficulty of training environments over time
  • Partner with research teams to convert product failures into better environments and tasks
  • Build internal tooling to generate, run, and measure large task sets reliably

What you’ll need

  • Strong software engineering skills, ideally in Python plus web or backend systems
  • Experience building internal tools, simulations, evaluation systems, or synthetic environments
  • Ability to structure ambiguous workflows into clear tasks with measurable outcomes
  • Good product instinct for what makes an environment realistic and useful for agents
  • Comfort working at the intersection of engineering, research, and experimentation
  • High ownership and a practical mindset

Nice to have

  • Experience with browser automation, agents, or evaluation harnesses
  • Familiarity with RL, reward design, or synthetic data generation

Shortlisted candidates will be contacted within 48 hours.

  • Location: London, Paris
  • Salary / Compensation: Up to £180k + equity package
  • Sectors: Agentic, Frontier AI / Foundation Models
  • Skills: Python, Full-Stack Engineering, Evaluation, Synthetic Data, Agent Environments, LLM Systems

Role Contact

Alex Jouatte
