About
We are an early-stage robotics and foundational AI company building universal foundation models for general-purpose mobile robots. Our focus is on simulation-first data generation, proprietary multimodal models, and tight hardware integration to enable scalable Physical AI systems.
You will join the group building the core Vision-Language-Action models used for real-world robotic control.
What you'll do
- Build Vision-Language-Action models end-to-end: data curation, architecture design, training, inference, and evaluation
- Work with simulation and robotics teams to curate large-scale embodied datasets
- Curate and refine internet-scale datasets for embodied perception and robot video
- Design generative simulation techniques to scale data diversity
- Run reproducible experiments and push models from prototype to production-ready systems
- Contribute to a focused team building general-purpose Physical AI
What you'll need
- Strong track record in foundation model research (robotics, autonomous driving, or video generation preferred)
- Experience pioneering new ML methods or significantly advancing existing approaches
- High standards for data quality, code quality, and evaluation rigor
- Python expertise
- Strong ownership mindset
Shortlisted candidates will be contacted within 48 hours.