Research Engineer (RL Environment)

About

This is a well-funded frontier AI company building agentic systems that automate complex, multi-step work.

Agents do not improve in a vacuum. They need environments to operate in, tasks to solve, and clear signals for what good looks like. This role exists to build that layer.

Recent progress has made the company's agents highly competitive on computer-use-style benchmarks, and the company has launched a more visible product layer to make the technology easier to demonstrate.

The Environment and Tasks team builds the playground itself: synthetic websites, structured workflows, task sets, and evaluation environments where agents can act, fail, retry, and learn.

What you’ll do

Build training and evaluation environments for agentic systems
Create synthetic websites, workflows, and task suites that reflect useful real-world work
Define reward signals and success criteria for agent behaviour in structured environments
Turn documentation, tools, and existing workflows into interactive agent tasks
Improve the realism, coverage, and difficulty of training environments over time
Partner with research teams to convert product failures into better environments and tasks
Build internal tooling to generate, run, and measure large task sets reliably

What you’ll need

Strong software engineering skills, ideally in Python plus web or backend systems
Experience building internal tools, simulations, evaluation systems, or synthetic environments
Ability to structure ambiguous workflows into clear tasks with measurable outcomes
Good product instinct for what makes an environment realistic and useful for agents
Comfort working at the intersection of engineering, research, and experimentation
High ownership and a practical mindset

Bonus

Experience with browser automation, agents, or evaluation harnesses
Familiarity with RL, reward design, or synthetic data generation

Shortlisted candidates will be contacted within 48 hours.

Location: London, Paris
Salary / Compensation: Up to £180k + equity package
Sectors: Agentic, Frontier AI / Foundation Models
Skills: Python, Full-Stack Engineering, Evaluation, Synthetic Data, Agent Environments, LLM Systems

Role Contact

Alex Jouatte

alex@axiomasearch.com

Founding ML Engineer

Founding ML hire at an early-stage startup pre-training foundation models for time-series forecasting. Own the training infrastructure and model architecture from the ground up.

Location
Paris
Type
On-site
Salary
€125k + equity package

Research Engineer (Training Infrastructure)

Build the training stack behind large multimodal models used in agentic AI. This role sits close to research and focuses on distributed training, reliability, and performance at meaningful scale.

Location
London, Paris
Type
Hybrid
Salary
Up to £180k + equity package

Research Engineer (Inference & Serving)

Research Engineer role at a frontier AI company, focused on production LLM serving. Work on latency, throughput, batching, and GPU efficiency.

Location
London, Paris
Type
Hybrid
Salary
Up to £180k + equity package
