Description
Role Overview
Mercor is collaborating with a leading AI lab to contract experienced professionals for an AI model evaluation project. Contractors will assess the quality, accuracy, and safety of AI-generated responses across specialized domains such as finance, law, medicine, and accounting. The project offers an opportunity to directly improve the reliability of AI systems in high-stakes contexts where inaccurate information carries serious risk.
Key Responsibilities
- Write realistic prompts that reflect how professionals and consumers seek domain-specific guidance
- Evaluate AI-generated responses for factual accuracy, regulatory or clinical correctness, and practical usefulness
- Identify fabricated claims, incorrect references, or misleading reasoning across model outputs
- Score and rank multiple model responses against structured rubrics covering multiple dimensions
- Provide written justifications with specific evidence for each evaluation
Ideal Qualifications
- Master’s degree or higher in a relevant professional field (e.g., Finance, Accounting, Law, Medicine, Healthcare, Engineering)
- Professional experience applying domain expertise in a practitioner or advisory capacity
- Familiarity with industry-specific standards, regulations, or clinical guidelines
- Strong written communication and critical reasoning skills
More About the Opportunity
- Expected commitment: ~20 hours/week
Application Process
- Submit your resume to begin
- Complete a Training Assessment