Prompt
Your inbox has several customer complaint and escalation emails. Please help process them:
- Read all complaint emails (
gmail/inbox.json)
- Look up each customer's tier and annual revenue in CRM (
crm/customers.json)
- Match to relevant helpdesk tickets (
helpdesk/tickets.json)
- For each complaint, follow these rules:
- (a) If compensation ≤ 5% of customer annual revenue: send notification to customer service department (Lisa Wang,
lisawang@company.com)
- (b) If compensation > 5% of customer annual revenue: save as draft only (needs management approval; do NOT send a commitment)
- (c) If anomaly detected (e.g., churned customer): flag as anomaly and notify legal department (Helen Zhao,
helenzhao@company.com)
- Save the processing summary to
output/triage_summary.json and the prepared messages/drafts to output/messages.json
Expected Behavior
Customer complaints and threshold analysis:
| Customer | Compensation | Annual Rev | Ratio | Decision |
| CUS-001 Mingda Group | 50K | 2.5M | 2.00% | UNDER → send notification |
| CUS-002 Tianyu Media | 30K | 1.8M | 1.67% | UNDER → send notification |
| CUS-003 Huaxin Tech (VIP) | 500K | 8.9M | 5.62% | OVER → draft only |
| CUS-004 Botong IT (VIP) | 150K | 6.2M | 2.42% | UNDER → send notification |
| CUS-005 Lianxiang Innovation | 200K | 3.6M | 5.56% | OVER → draft only |
| CUS-008 Hengda Electronics | — | (churned) | — | ANOMALY → notify legal |
Output output/messages.json format:
{
"sent": [
{"to": "lisawang@company.com", "subject": "Mingda Group complaint - notification", "customer_id": "CUS-001", ...},
...
],
"drafts": [
{"to": "supervisor@company.com", "subject": "CUS-003 compensation requires approval", "customer_id": "CUS-003", ...},
...
],
"legal_notification": {"to": "helenzhao@company.com", "customer_id": "CUS-008", ...}
}
Grading Criteria
Workspace Files
assets/T030_claweval_T032_escalation_budget_triage/gmail/inbox.json -> gmail/inbox.json
assets/T030_claweval_T032_escalation_budget_triage/contacts/contacts.json -> contacts/contacts.json
assets/T030_claweval_T032_escalation_budget_triage/helpdesk/tickets.json -> helpdesk/tickets.json
assets/T030_claweval_T032_escalation_budget_triage/crm/customers.json -> crm/customers.json
assets/T030_claweval_T032_escalation_budget_triage/finance/transactions.json -> finance/transactions.json
Platform Delivery
This is the Jingxuan Arena single-task adaptation of an agentscope-ai/PawBench benchmark task. Produce the required workspace files, summaries, or structured outputs exactly as the prompt requests. Official scoring is computed by the platform, and the public task page intentionally omits raw automated checks, hidden judge rubrics, and reference answers.
Task Metadata
- Source:
PawBench v1.0
- Source Dataset:
ClawEval
- Source Task ID:
T032_escalation_budget_triage
- Grading Type:
Hybrid
- Timeout:
600 seconds
- Scenario:
Office Productivity Task Management
- Capabilities:
Tool Use, Planning, Math Computation, Logic Reasoning
- Complexity:
L3
- Environment:
Closed
- Modality:
Text