Role Overview

Mercor is partnering with leading AI labs on Project Atlas, an initiative to build realistic enterprise environments in which frontier AI agents are trained and evaluated. We're seeking experienced AI users who really live in this space. Think: you use AI tools daily, you genuinely love working with them, you've built custom GPTs, agents, workflows, or prompts that go beyond the basics, or your friends regularly ask, "Wait, how did you do that?!" Sound like you? We'd love to have you on our team.

Why we're asking: In building realistic enterprise environments, we'd like to understand whether, given the right tools, the right AI power user can solve complex domain problems without being an expert in the field.

Key Responsibilities

Use Claude Code fluently — solving real problems through multi-turn agentic workflows that go beyond single-prompt interactions
Operate across agentic tools like Codex and Claude Code, integrating file environments and MCP tools to execute end-to-end professional workflows
Author benchmark tasks in collaboration with fellow domain experts, ensuring each scenario reflects how the work is actually done in your field
Bring a clear view of how AI agents will be deployed in real professional environments — and concrete ideas for the tasks they'll need to perform well to be trusted at the frontier
Design multi-step tasks grounded in your real workflows — structured around end-to-end matters that require navigating multiple apps, files, and stakeholders in a way that meaningfully challenges frontier AI agents
Collaborate with other experts to design the environment, shape task scope, and review each other's scenarios for realism and rigor
Work asynchronously with research teams to refine task designs and evaluation criteria for litigation agent benchmarks
Contribute to frontier AI research and benchmarking — the work you produce directly informs how leading labs train and evaluate the next generation of AI systems

Compensation Note

This project is expected to begin at an effective hourly rate and then transition to a model in which experts are compensated for throughput of quality work rather than at a flat, accruing hourly rate.

About Mercor

Mercor is a talent marketplace that connects top experts with leading AI labs and research organizations. It is backed by investors including Benchmark, General Catalyst, Adam D'Angelo, and Jack Dorsey. Thousands of professionals across domains contribute to projects shaping the next generation of AI systems.

AI Power User
