generated-draft | A2 | Real life | hidden-object-vocabulary
Find the passport, gate sign, and boarding pass on a messy office desk.
Find the target objects in the scene.
hidden-object-scene-vocab
Answer: coordinate targets with object labels and visual clues
Distractors: nearby objects, similar shapes, and contextually plausible decoys
test | daily | arena | global
generated-draft | B1 | Real life | instruction-following
Tap actions in order: check the address, choose express delivery, then confirm the order.
Tap the actions in the correct order.
instruction-sequence-task
Answer: ordered action IDs from a plausible action pool
Distractors: right action in wrong order, optional action, and semantically nearby action
test | global | ru
generated-review-candidate | B2 | Real life | shop-simulator
Buy medicine that does not make you sleepy with $12 using a train-station kiosk.
Choose the item or basket that satisfies the request.
shop-mission-basket
Answer: basket that satisfies category constraints and budget
Distractors: tempting items that fit one constraint but break another
test | funnel | friend-challenge | global
generated-draft | B1 | Real life | sorter
Sort strong emotion words into countable and uncountable bins.
Sort each card into the correct group.
category-sorter-round
Answer: card-to-bin mapping
Distractors: cards that are close in register, grammar function, or meaning
test | global
generated-draft | B1 | Real life | crafter
Craft a hotel room change request using Could I / get / a refund / please.
Write a short natural response.
word-crafter-response
Answer: response that includes required meaning, order, and tone
Distractors: missing chip, wrong order, too direct tone, or incomplete message
test | daily | global | ru | es
generated-draft | B1 | Real life | hidden-object-vocabulary
Find the stapler, invoice, and charging cable in an airport waiting area.
Find the target objects in the scene.
hidden-object-scene-vocab
Answer: coordinate targets with object labels and visual clues
Distractors: nearby objects, similar shapes, and contextually plausible decoys
test | funnel | arena | global
generated-draft | B2 | Real life | instruction-following
Tap actions in order: open the settings, tap privacy, then turn off location sharing.
Tap the actions in the correct order.
instruction-sequence-task
Answer: ordered action IDs from a plausible action pool
Distractors: right action in wrong order, optional action, and semantically nearby action
test | global | ru
generated-draft | A1 | Real life | shop-simulator
Buy snacks for a delayed train with $18 using an office supply shelf.
Choose the item or basket that satisfies the request.
shop-mission-basket
Answer: basket that satisfies category constraints and budget
Distractors: tempting items that fit one constraint but break another
test | global
generated-draft | B1 | Real life | sorter
Sort strong emotion words into formal and casual bins.
Sort each card into the correct group.
category-sorter-round
Answer: card-to-bin mapping
Distractors: cards that are close in register, grammar function, or meaning
test | funnel | daily | friend-challenge | global
generated-review-candidate | B2 | Real life | crafter
Craft a polite meeting reschedule using Could we / move / the meeting / tomorrow morning / please.
Write a short natural response.
word-crafter-response
Answer: response that includes required meaning, order, and tone
Distractors: missing chip, wrong order, too direct tone, or incomplete message
test | global | ru | es
generated-draft | B2 | Real life | hidden-object-vocabulary
Find the receipt, kettle, and suitcase on a pharmacy shelf.
Find the target objects in the scene.
hidden-object-scene-vocab
Answer: coordinate targets with object labels and visual clues
Distractors: nearby objects, similar shapes, and contextually plausible decoys
test | arena | global
generated-draft | A1 | Real life | instruction-following
Tap actions in order: unplug the modem, wait ten seconds, then restart the router.
Tap the actions in the correct order.
instruction-sequence-task
Answer: ordered action IDs from a plausible action pool
Distractors: right action in wrong order, optional action, and semantically nearby action
test | funnel | global | ru
generated-draft | A2 | Real life | shop-simulator
Buy a warm drink and the most filling savory breakfast with $25 using a cafe shelf.
Choose the item or basket that satisfies the request.
shop-mission-basket
Answer: basket that satisfies category constraints and budget
Distractors: tempting items that fit one constraint but break another
test | daily | global
generated-draft | B1 | Real life | sorter
Sort strong emotion words into travel and office bins.
Sort each card into the correct group.
category-sorter-round
Answer: card-to-bin mapping
Distractors: cards that are close in register, grammar function, or meaning
test | global
generated-draft | C1 | Real life | crafter
Craft a refund request using I can / send / the update / by Friday.
Write a short natural response.
word-crafter-response
Answer: response that includes required meaning, order, and tone
Distractors: missing chip, wrong order, too direct tone, or incomplete message
test | funnel | friend-challenge | global | ru | es
generated-draft | A1 | Real life | hidden-object-vocabulary
Find the passport, gate sign, and boarding pass in a hotel lobby.
Find the target objects in the scene.
hidden-object-scene-vocab
Answer: coordinate targets with object labels and visual clues
Distractors: nearby objects, similar shapes, and contextually plausible decoys
test | arena | global
generated-review-candidate | A2 | Real life | instruction-following
Tap actions in order: read the message, attach the invoice, then send the reply.
Tap the actions in the correct order.
instruction-sequence-task
Answer: ordered action IDs from a plausible action pool
Distractors: right action in wrong order, optional action, and semantically nearby action
test | daily | global | ru
generated-draft | B1 | Real life | shop-simulator
Buy office supplies for a quick meeting with $9 using a pharmacy shelf.
Choose the item or basket that satisfies the request.
shop-mission-basket
Answer: basket that satisfies category constraints and budget
Distractors: tempting items that fit one constraint but break another
test | funnel | global
generated-draft | B1 | Real life | sorter
Sort strong emotion words into mild and strong bins.
Sort each card into the correct group.
category-sorter-round
Answer: card-to-bin mapping
Distractors: cards that are close in register, grammar function, or meaning
test | global
generated-draft | A2 | Real life | crafter
Craft a short delay update using Could I / get / a refund / please.
Write a short natural response.
word-crafter-response
Answer: response that includes required meaning, order, and tone
Distractors: missing chip, wrong order, too direct tone, or incomplete message
test | global | ru | es
generated-draft | A2 | Real life | hidden-object-vocabulary
Find the stapler, invoice, and charging cable on a messy office desk.
Find the target objects in the scene.
hidden-object-scene-vocab
Answer: coordinate targets with object labels and visual clues
Distractors: nearby objects, similar shapes, and contextually plausible decoys
test | funnel | daily | arena | friend-challenge | global
generated-draft | B1 | Real life | instruction-following
Tap actions in order: check the address, choose express delivery, then confirm the order.
Tap the actions in the correct order.
instruction-sequence-task
Answer: ordered action IDs from a plausible action pool
Distractors: right action in wrong order, optional action, and semantically nearby action
test | global | ru
generated-draft | B2 | Real life | shop-simulator
Buy medicine that does not make you sleepy with $12 using a train-station kiosk.
Choose the item or basket that satisfies the request.
shop-mission-basket
Answer: basket that satisfies category constraints and budget
Distractors: tempting items that fit one constraint but break another
test | global
generated-review-candidate | B1 | Real life | sorter
Sort strong emotion words into countable and uncountable bins.
Sort each card into the correct group.
category-sorter-round
Answer: card-to-bin mapping
Distractors: cards that are close in register, grammar function, or meaning
test | funnel | global
generated-draft | B1 | Real life | crafter
Craft a hotel room change request using Could we / move / the meeting / tomorrow morning / please.
Write a short natural response.
word-crafter-response
Answer: response that includes required meaning, order, and tone
Distractors: missing chip, wrong order, too direct tone, or incomplete message
test | daily | global | ru | es
generated-draft | B1 | Real life | hidden-object-vocabulary
Find the receipt, kettle, and suitcase in an airport waiting area.
Find the target objects in the scene.
hidden-object-scene-vocab
Answer: coordinate targets with object labels and visual clues
Distractors: nearby objects, similar shapes, and contextually plausible decoys
test | arena | global
generated-draft | B2 | Real life | instruction-following
Tap actions in order: open the settings, tap privacy, then turn off location sharing.
Tap the actions in the correct order.
instruction-sequence-task
Answer: ordered action IDs from a plausible action pool
Distractors: right action in wrong order, optional action, and semantically nearby action
test | funnel | friend-challenge | global | ru
generated-draft | A1 | Real life | shop-simulator
Buy snacks for a delayed train with $18 using an office supply shelf.
Choose the item or basket that satisfies the request.
shop-mission-basket
Answer: basket that satisfies category constraints and budget
Distractors: tempting items that fit one constraint but break another
test | global
generated-draft | B1 | Real life | sorter
Sort strong emotion words into formal and casual bins.
Sort each card into the correct group.
category-sorter-round
Answer: card-to-bin mapping
Distractors: cards that are close in register, grammar function, or meaning
test | daily | global
generated-draft | B2 | Real life | crafter
Craft a polite meeting reschedule using I can / send / the update / by Friday.
Write a short natural response.
word-crafter-response
Answer: response that includes required meaning, order, and tone
Distractors: missing chip, wrong order, too direct tone, or incomplete message
test | funnel | global | ru | es
generated-review-candidate | B2 | Real life | hidden-object-vocabulary
Find the passport, gate sign, and boarding pass on a pharmacy shelf.
Find the target objects in the scene.
hidden-object-scene-vocab
Answer: coordinate targets with object labels and visual clues
Distractors: nearby objects, similar shapes, and contextually plausible decoys
test | arena | global
generated-draft | A1 | Real life | instruction-following
Tap actions in order: unplug the modem, wait ten seconds, then restart the router.
Tap the actions in the correct order.
instruction-sequence-task
Answer: ordered action IDs from a plausible action pool
Distractors: right action in wrong order, optional action, and semantically nearby action
test | global | ru
generated-draft | A2 | Real life | shop-simulator
Buy a warm drink and the most filling savory breakfast with $25 using a cafe shelf.
Choose the item or basket that satisfies the request.
shop-mission-basket
Answer: basket that satisfies category constraints and budget
Distractors: tempting items that fit one constraint but break another
test | funnel | daily | friend-challenge | global
generated-draft | B1 | Real life | sorter
Sort strong emotion words into travel and office bins.
Sort each card into the correct group.
category-sorter-round
Answer: card-to-bin mapping
Distractors: cards that are close in register, grammar function, or meaning
test | global
generated-draft | C1 | Real life | crafter
Craft a refund request using Could I / get / a refund / please.
Write a short natural response.
word-crafter-response
Answer: response that includes required meaning, order, and tone
Distractors: missing chip, wrong order, too direct tone, or incomplete message
test | global | ru | es
generated-draft | A1 | Real life | hidden-object-vocabulary
Find the stapler, invoice, and charging cable in a hotel lobby.
Find the target objects in the scene.
hidden-object-scene-vocab
Answer: coordinate targets with object labels and visual clues
Distractors: nearby objects, similar shapes, and contextually plausible decoys
test | funnel | arena | global
generated-draft | A2 | Real life | instruction-following
Tap actions in order: read the message, attach the invoice, then send the reply.
Tap the actions in the correct order.
instruction-sequence-task
Answer: ordered action IDs from a plausible action pool
Distractors: right action in wrong order, optional action, and semantically nearby action
test | daily | global | ru
generated-review-candidate | B1 | Real life | shop-simulator
Buy office supplies for a quick meeting with $9 using a pharmacy shelf.
Choose the item or basket that satisfies the request.
shop-mission-basket
Answer: basket that satisfies category constraints and budget
Distractors: tempting items that fit one constraint but break another
test | global
generated-draft | B1 | Real life | sorter
Sort strong emotion words into mild and strong bins.
Sort each card into the correct group.
category-sorter-round
Answer: card-to-bin mapping
Distractors: cards that are close in register, grammar function, or meaning
test | funnel | friend-challenge | global
generated-draft | A2 | Real life | crafter
Craft a short delay update using Could we / move / the meeting / tomorrow morning / please.
Write a short natural response.
word-crafter-response
Answer: response that includes required meaning, order, and tone
Distractors: missing chip, wrong order, too direct tone, or incomplete message
test | global | ru | es
generated-draft | A2 | Real life | hidden-object-vocabulary
Find the receipt, kettle, and suitcase on a messy office desk.
Find the target objects in the scene.
hidden-object-scene-vocab
Answer: coordinate targets with object labels and visual clues
Distractors: nearby objects, similar shapes, and contextually plausible decoys
test | daily | arena | global
generated-draft | B1 | Real life | instruction-following
Tap actions in order: check the address, choose express delivery, then confirm the order.
Tap the actions in the correct order.
instruction-sequence-task
Answer: ordered action IDs from a plausible action pool
Distractors: right action in wrong order, optional action, and semantically nearby action
test | funnel | global | ru
generated-draft | B2 | Real life | shop-simulator
Buy medicine that does not make you sleepy with $12 using a train-station kiosk.
Choose the item or basket that satisfies the request.
shop-mission-basket
Answer: basket that satisfies category constraints and budget
Distractors: tempting items that fit one constraint but break another
test | global
generated-draft | B1 | Real life | sorter
Sort strong emotion words into countable and uncountable bins.
Sort each card into the correct group.
category-sorter-round
Answer: card-to-bin mapping
Distractors: cards that are close in register, grammar function, or meaning
test | global
generated-review-candidate | B1 | Real life | crafter
Craft a hotel room change request using I can / send / the update / by Friday.
Write a short natural response.
word-crafter-response
Answer: response that includes required meaning, order, and tone
Distractors: missing chip, wrong order, too direct tone, or incomplete message
test | funnel | daily | friend-challenge | global | ru | es
generated-draft | B1 | Real life | hidden-object-vocabulary
Find the passport, gate sign, and boarding pass in an airport waiting area.
Find the target objects in the scene.
hidden-object-scene-vocab
Answer: coordinate targets with object labels and visual clues
Distractors: nearby objects, similar shapes, and contextually plausible decoys
test | arena | global
generated-draft | B2 | Real life | instruction-following
Tap actions in order: open the settings, tap privacy, then turn off location sharing.
Tap the actions in the correct order.
instruction-sequence-task
Answer: ordered action IDs from a plausible action pool
Distractors: right action in wrong order, optional action, and semantically nearby action
test | global | ru
generated-draft | A1 | Real life | shop-simulator
Buy snacks for a delayed train with $18 using an office supply shelf.
Choose the item or basket that satisfies the request.
shop-mission-basket
Answer: basket that satisfies category constraints and budget
Distractors: tempting items that fit one constraint but break another
test | funnel | global
generated-draft | B1 | Real life | sorter
Sort strong emotion words into formal and casual bins.
Sort each card into the correct group.
category-sorter-round
Answer: card-to-bin mapping
Distractors: cards that are close in register, grammar function, or meaning
test | daily | global
generated-draft | B2 | Real life | crafter
Craft a polite meeting reschedule using Could I / get / a refund / please.
Write a short natural response.
word-crafter-response
Answer: response that includes required meaning, order, and tone
Distractors: missing chip, wrong order, too direct tone, or incomplete message
test | global | ru | es
generated-draft | B2 | Real life | hidden-object-vocabulary
Find the stapler, invoice, and charging cable on a pharmacy shelf.
Find the target objects in the scene.
hidden-object-scene-vocab
Answer: coordinate targets with object labels and visual clues
Distractors: nearby objects, similar shapes, and contextually plausible decoys
test | funnel | arena | friend-challenge | global
generated-review-candidate | A1 | Real life | instruction-following
Tap actions in order: unplug the modem, wait ten seconds, then restart the router.
Tap the actions in the correct order.
instruction-sequence-task
Answer: ordered action IDs from a plausible action pool
Distractors: right action in wrong order, optional action, and semantically nearby action
test | global | ru
generated-draft | A2 | Real life | shop-simulator
Buy a warm drink and the most filling savory breakfast with $25 using a cafe shelf.
Choose the item or basket that satisfies the request.
shop-mission-basket
Answer: basket that satisfies category constraints and budget
Distractors: tempting items that fit one constraint but break another
test | daily | global
generated-draft | B1 | Real life | sorter
Sort strong emotion words into travel and office bins.
Sort each card into the correct group.
category-sorter-round
Answer: card-to-bin mapping
Distractors: cards that are close in register, grammar function, or meaning
test | funnel | global
generated-draft | C1 | Real life | crafter
Craft a refund request using Could we / move / the meeting / tomorrow morning / please.
Write a short natural response.
word-crafter-response
Answer: response that includes required meaning, order, and tone
Distractors: missing chip, wrong order, too direct tone, or incomplete message
test | global | ru | es
generated-draft | A1 | Real life | hidden-object-vocabulary
Find the receipt, kettle, and suitcase in a hotel lobby.
Find the target objects in the scene.
hidden-object-scene-vocab
Answer: coordinate targets with object labels and visual clues
Distractors: nearby objects, similar shapes, and contextually plausible decoys
test | arena | global
generated-draft | A2 | Real life | instruction-following
Tap actions in order: read the message, attach the invoice, then send the reply.
Tap the actions in the correct order.
instruction-sequence-task
Answer: ordered action IDs from a plausible action pool
Distractors: right action in wrong order, optional action, and semantically nearby action
test | funnel | daily | friend-challenge | global | ru
generated-draft | B1 | Real life | shop-simulator
Buy office supplies for a quick meeting with $9 using a pharmacy shelf.
Choose the item or basket that satisfies the request.
shop-mission-basket
Answer: basket that satisfies category constraints and budget
Distractors: tempting items that fit one constraint but break another
test | global
generated-review-candidate | B1 | Real life | sorter
Sort strong emotion words into mild and strong bins.
Sort each card into the correct group.
category-sorter-round
Answer: card-to-bin mapping
Distractors: cards that are close in register, grammar function, or meaning
test | global
generated-draft | A2 | Real life | crafter
Craft a short delay update using I can / send / the update / by Friday.
Write a short natural response.
word-crafter-response
Answer: response that includes required meaning, order, and tone
Distractors: missing chip, wrong order, too direct tone, or incomplete message
test | funnel | global | ru | es
generated-draft | A2 | Real life | hidden-object-vocabulary
Find the passport, gate sign, and boarding pass on a messy office desk.
Find the target objects in the scene.
hidden-object-scene-vocab
Answer: coordinate targets with object labels and visual clues
Distractors: nearby objects, similar shapes, and contextually plausible decoys
test | daily | arena | global
generated-draft | B1 | Real life | instruction-following
Tap actions in order: check the address, choose express delivery, then confirm the order.
Tap the actions in the correct order.
instruction-sequence-task
Answer: ordered action IDs from a plausible action pool
Distractors: right action in wrong order, optional action, and semantically nearby action
test | global | ru
generated-draft | B2 | Real life | shop-simulator
Buy medicine that does not make you sleepy with $12 using a train-station kiosk.
Choose the item or basket that satisfies the request.
shop-mission-basket
Answer: basket that satisfies category constraints and budget
Distractors: tempting items that fit one constraint but break another
test | funnel | friend-challenge | global
generated-draft | B1 | Real life | sorter
Sort strong emotion words into countable and uncountable bins.
Sort each card into the correct group.
category-sorter-round
Answer: card-to-bin mapping
Distractors: cards that are close in register, grammar function, or meaning
test | global
generated-draft | B1 | Real life | crafter
Craft a hotel room change request using Could I / get / a refund / please.
Write a short natural response.
word-crafter-response
Answer: response that includes required meaning, order, and tone
Distractors: missing chip, wrong order, too direct tone, or incomplete message
test | daily | global | ru | es
generated-review-candidate | B1 | Real life | hidden-object-vocabulary
Find the stapler, invoice, and charging cable in an airport waiting area.
Find the target objects in the scene.
hidden-object-scene-vocab
Answer: coordinate targets with object labels and visual clues
Distractors: nearby objects, similar shapes, and contextually plausible decoys
test | funnel | arena | global
generated-draft | B2 | Real life | instruction-following
Tap actions in order: open the settings, tap privacy, then turn off location sharing.
Tap the actions in the correct order.
instruction-sequence-task
Answer: ordered action IDs from a plausible action pool
Distractors: right action in wrong order, optional action, and semantically nearby action
test | global | ru
generated-draft | A1 | Real life | shop-simulator
Buy snacks for a delayed train with $18 using an office supply shelf.
Choose the item or basket that satisfies the request.
shop-mission-basket
Answer: basket that satisfies category constraints and budget
Distractors: tempting items that fit one constraint but break another
test | global
generated-draft | B1 | Real life | sorter
Sort strong emotion words into formal and casual bins.
Sort each card into the correct group.
category-sorter-round
Answer: card-to-bin mapping
Distractors: cards that are close in register, grammar function, or meaning
test | funnel | daily | friend-challenge | global
generated-draft | B2 | Real life | crafter
Craft a polite meeting reschedule using Could we / move / the meeting / tomorrow morning / please.
Write a short natural response.
word-crafter-response
Answer: response that includes required meaning, order, and tone
Distractors: missing chip, wrong order, too direct tone, or incomplete message
test | global | ru | es
generated-draft | B2 | Real life | hidden-object-vocabulary
Find the receipt, kettle, and suitcase on a pharmacy shelf.
Find the target objects in the scene.
hidden-object-scene-vocab
Answer: coordinate targets with object labels and visual clues
Distractors: nearby objects, similar shapes, and contextually plausible decoys
test | arena | global
generated-draft | A1 | Real life | instruction-following
Tap actions in order: unplug the modem, wait ten seconds, then restart the router.
Tap the actions in the correct order.
instruction-sequence-task
Answer: ordered action IDs from a plausible action pool
Distractors: right action in wrong order, optional action, and semantically nearby action
test | funnel | global | ru
generated-review-candidate | A2 | Real life | shop-simulator
Buy a warm drink and the most filling savory breakfast with $25 using a cafe shelf.
Choose the item or basket that satisfies the request.
shop-mission-basket
Answer: basket that satisfies category constraints and budget
Distractors: tempting items that fit one constraint but break another
test | daily | global
generated-draft | B1 | Real life | sorter
Sort strong emotion words into travel and office bins.
Sort each card into the correct group.
category-sorter-round
Answer: card-to-bin mapping
Distractors: cards that are close in register, grammar function, or meaning
test | global
generated-draft | C1 | Real life | crafter
Craft a refund request using I can / send / the update / by Friday.
Write a short natural response.
word-crafter-response
Answer: response that includes required meaning, order, and tone
Distractors: missing chip, wrong order, too direct tone, or incomplete message
test | funnel | friend-challenge | global | ru | es
generated-draft | A1 | Real life | hidden-object-vocabulary
Find the passport, gate sign, and boarding pass in a hotel lobby.
Find the target objects in the scene.
hidden-object-scene-vocab
Answer: coordinate targets with object labels and visual clues
Distractors: nearby objects, similar shapes, and contextually plausible decoys
test | arena | global
generated-draft | A2 | Real life | instruction-following
Tap actions in order: read the message, attach the invoice, then send the reply.
Tap the actions in the correct order.
instruction-sequence-task
Answer: ordered action IDs from a plausible action pool
Distractors: right action in wrong order, optional action, and semantically nearby action
test | daily | global | ru
generated-draft | B1 | Real life | shop-simulator
Buy office supplies for a quick meeting with $9 using a pharmacy shelf.
Choose the item or basket that satisfies the request.
shop-mission-basket
Answer: basket that satisfies category constraints and budget
Distractors: tempting items that fit one constraint but break another
test | funnel | global
generated-draft | B1 | Real life | sorter
Sort strong emotion words into mild and strong bins.
Sort each card into the correct group.
category-sorter-round
Answer: card-to-bin mapping
Distractors: cards that are close in register, grammar function, or meaning
test | global
generated-review-candidate | A2 | Real life | crafter
Craft a short delay update using Could I / get / a refund / please.
Write a short natural response.
word-crafter-response
Answer: response that includes required meaning, order, and tone
Distractors: missing chip, wrong order, too direct tone, or incomplete message
test | global | ru | es