Data and Metrics · Advanced · 6 min read

Label Leakage

Label leakage occurs when information about the target label unintentionally appears in the input features or prompt context, allowing a model to "see" the answer it is supposed to predict.

Why it matters for annotators

Leakage inflates offline metrics: the model latches onto the leaked signal instead of learning the task, so it scores well during evaluation but fails in production, where that signal is absent.
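The effect above can be demonstrated with a toy sketch. Here a hypothetical `source_tag` field encodes the label; a "model" that memorized it looks perfect on a leaky evaluation set and collapses to chance once the leak is gone. All names and data here are illustrative, not from a real project.

```python
import random

random.seed(0)

def make_rows(n, leak=True):
    """Toy binary-classification rows. When leak=True, the hypothetical
    'source_tag' feature encodes the label -- a leaked field."""
    rows = []
    for _ in range(n):
        label = random.randint(0, 1)
        noise = random.random()  # genuinely uninformative feature
        tag = f"src_{label}" if leak else "src_unknown"
        rows.append({"noise": noise, "source_tag": tag, "label": label})
    return rows

def predict(row):
    """A 'model' that latched onto the leaked tag during training."""
    return 1 if row["source_tag"] == "src_1" else 0

def accuracy(rows):
    return sum(predict(r) == r["label"] for r in rows) / len(rows)

leaky_eval = make_rows(1000, leak=True)    # offline eval, leak present
clean_eval = make_rows(1000, leak=False)   # production-like, leak absent

print(accuracy(leaky_eval))  # perfect offline score
print(accuracy(clean_eval))  # roughly chance level
```

The gap between the two numbers is the signature of leakage: a large drop when the suspect feature is removed or neutralized.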

Visual mental model

Audit features and prompts -> detect leaked signals -> remove or mask them before training and evaluation.
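The audit-then-detect step can be sketched as a crude heuristic: flag any feature whose values almost perfectly determine the label. This is an illustrative check, not a standard library routine; the feature names and threshold are assumptions.

```python
from collections import defaultdict

def find_leaky_features(rows, labels, threshold=0.99):
    """Flag features whose values nearly determine the label.
    For each feature, count how often the majority label per value
    would be correct; a near-1.0 rate is a leakage red flag."""
    leaky = []
    for feat in rows[0].keys():
        counts = defaultdict(lambda: defaultdict(int))
        for row, label in zip(rows, labels):
            counts[row[feat]][label] += 1
        correct = sum(max(by_label.values()) for by_label in counts.values())
        if correct / len(rows) >= threshold:
            leaky.append(feat)
    return leaky

# Hypothetical rows: 'source_tag' encodes the label, 'length' does not.
rows = [
    {"length": 12, "source_tag": "pos_batch"},
    {"length": 7,  "source_tag": "neg_batch"},
    {"length": 12, "source_tag": "neg_batch"},
    {"length": 7,  "source_tag": "pos_batch"},
]
labels = [1, 0, 0, 1]

print(find_leaky_features(rows, labels))  # → ['source_tag']
```

Note that high-cardinality identifiers (row IDs, file names) will also trip this check, which is intentional: unique IDs are a classic leakage vector. Flagged features still need human review, since some legitimately strong predictors can score high too.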

Examples (bad vs good)

Scenario: a real annotation task where features or prompt text may reveal the target label.

Bad: Labeling quickly without applying the project rubric, and without checking whether the answer is already visible in the item's features or prompt.

Good: Applying the rubric criteria, flagging any feature or prompt text that reveals the label, documenting the rationale, and escalating uncertainty.

Common mistakes

  • Skipping guideline details for edge cases.
  • Applying inconsistent criteria across similar samples.
  • Failing to escalate even when uncertain.

Submission checklist

  • Read the latest guideline update before each batch.
  • Apply rubric dimensions explicitly in each decision.
  • Escalate ambiguous items with concise rationale.