Game · Compare · A vs B
Versus
Pick the better of two AI answers.
How it plays
1. A task carries a shared prompt and two candidate answers, A and B.
2. Read both side by side, with labels hidden and sides shuffled to remove bias.
3. Choose A, B, or Tie, with an optional reason.
4. With several reviewers, the majority decides, which yields RLHF-grade preference data.
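The blinding and majority steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`blind_pair`, `unblind`, `majority_verdict`) are hypothetical, "majority" is read as "most votes", and an exact draw between the top two options is reported as a tie. The page does not say how the service resolves draws.

```python
import random
from collections import Counter


def blind_pair(a_text, b_text, rng):
    """Return (left, right, swapped): the two answers in display order, no labels.

    `swapped` is True when answer B is shown on the left.
    """
    swapped = rng.random() < 0.5
    if swapped:
        return b_text, a_text, True
    return a_text, b_text, False


def unblind(choice, swapped):
    """Map an on-screen choice ("left", "right", "tie") back to "a", "b", or "tie"."""
    if choice == "tie":
        return "tie"
    picked_left = choice == "left"
    if swapped:
        # B was displayed on the left, so a "left" pick means B.
        return "b" if picked_left else "a"
    return "a" if picked_left else "b"


def majority_verdict(votes):
    """Return the option with the most votes among "a", "b", "tie".

    Assumption: if the top two options draw, the verdict is "tie".
    """
    if not votes:
        raise ValueError("at least one vote is required")
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return "tie"
    return ranked[0][0]
```

For example, three reviewers voting `["a", "a", "b"]` give the verdict `"a"`, while a split `["a", "b"]` falls back to `"tie"` under the assumption above.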
Shortcuts: ← A is better · → B is better · space tie
Best for
- LLM answer-quality evaluation
- A/B of two prompts or models
- RLHF / preference data collection
- Choosing between two generated drafts
Send a task
POST to /api/v1/tasks with module: "versus" and a project API key. The verdict is sent by webhook to your callback_url.
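A minimal Python sketch of that request, using only the standard library. Two things here are assumptions rather than documented behaviour: the host `https://api.example.com` is a placeholder, and the page does not say how the project API key is passed, so the `Authorization: Bearer` header is a guess. The helper names are hypothetical.

```python
import json
import urllib.request


def build_task_request(base_url, api_key, prompt, a, b, callback_url):
    """Build (but do not send) a POST request for a Versus task."""
    body = {
        "module": "versus",
        "payload": {"prompt": prompt, "a": a, "b": b},
        "callback_url": callback_url,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/tasks",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: the key is sent as a bearer token.
            "Authorization": f"Bearer {api_key}",
        },
    )


def send_task(request, timeout=10):
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Example (placeholder host and key):
# req = build_task_request(
#     "https://api.example.com", "YOUR_PROJECT_API_KEY",
#     "Draft a reply to this refund request",
#     {"label": "model-x", "text": "..."},
#     {"label": "model-y", "text": "..."},
#     "https://your-app/verdict",
# )
# send_task(req)
```

The raw JSON body for the same request: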
{
  "module": "versus",
  "payload": {
    "prompt": "Draft a reply to this refund request",
    "a": { "label": "model-x", "text": "..." },
    "b": { "label": "model-y", "text": "..." }
  },
  "callback_url": "https://your-app/verdict"
}
Other games
- Swiper (Binary · yes / no): Approve or flag outputs, one card at a time
- Sorter (Classify · 1 of N): Drop each item into the right category
- Detective (Verify · find the mistake): Hunt errors in long generated text
- Fixer (Correct · check the data): Confirm or correct extracted fields
- Redact (Redact · tap what's sensitive): Mask the sensitive data an AI left in the open
- Grounding (Verify · supported or not): Check an AI's claim against its source