
HCS-25 (Signal): A2A/MCP Simple Evals (Informative)

Purpose

Run lightweight, automatically graded “baseline correctness” prompts against chat-capable agents and store the results as trust signals.

Stored fields (example schema)

Stored in subject.metadata.additional (example keys):

| Field | Type | Meaning |
| --- | --- | --- |
| a2aSimpleMathScore | number \| null | Score in [0, 100] (typically 0 or 100) |
| a2aSimpleMathStatus | string | Status token (e.g., correct, wrong, unparseable, timeout, missing, empty, skipped, upstream-error, error) |
| a2aSimpleMathQuestionId | string | Question identifier |
| a2aSimpleMathResponse | string \| null | Raw response (optional) |
| a2aSimpleMathError | string \| null | Optional error classification (timeouts, upstream limits, etc.) |
| a2aSimpleMathUpdatedAt | ISO timestamp | Refresh time |
| a2aSimpleScienceScore | number \| null | Score in [0, 100] |
| a2aSimpleScienceStatus | string | Status token |
| a2aSimpleScienceQuestionId | string | Question identifier |
| a2aSimpleScienceResponse | string \| null | Raw response (optional) |
| a2aSimpleScienceError | string \| null | Optional error classification |
| a2aSimpleScienceUpdatedAt | ISO timestamp | Refresh time |

See ../simple-evals.md for the general evaluation methodology.
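
As an illustration of how a consumer might read these keys, the following is a minimal Python sketch. It assumes the contents of subject.metadata.additional are already available as a plain dictionary. The `SimpleEvalResult` class, the `read_simple_eval` helper, and the sample values are hypothetical and are not defined by this specification.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass
class SimpleEvalResult:
    """One category's simple-eval fields (hypothetical container, not part of the spec)."""
    category: str
    score: Optional[float]
    status: Optional[str]
    question_id: Optional[str]
    response: Optional[str]
    error: Optional[str]
    updated_at: Optional[str]


def read_simple_eval(additional: Mapping[str, Any], category: str) -> SimpleEvalResult:
    """Collect the a2aSimple<Category>* keys for one category ("Math" or "Science").

    Keys that are absent are returned as None rather than raising.
    """
    prefix = f"a2aSimple{category}"
    return SimpleEvalResult(
        category=category,
        score=additional.get(f"{prefix}Score"),
        status=additional.get(f"{prefix}Status"),
        question_id=additional.get(f"{prefix}QuestionId"),
        response=additional.get(f"{prefix}Response"),
        error=additional.get(f"{prefix}Error"),
        updated_at=additional.get(f"{prefix}UpdatedAt"),
    )


def score_in_range(result: SimpleEvalResult) -> bool:
    """True if the score is null or lies within the documented [0, 100] range."""
    return result.score is None or 0 <= result.score <= 100


# Made-up sample values, for illustration only.
example_additional = {
    "a2aSimpleMathScore": 100,
    "a2aSimpleMathStatus": "correct",
    "a2aSimpleMathQuestionId": "example-question-1",
    "a2aSimpleMathUpdatedAt": "2025-01-01T00:00:00Z",
}

math_result = read_simple_eval(example_additional, "Math")
science_result = read_simple_eval(example_additional, "Science")
print(math_result.status, math_result.score)
print(science_result.status, science_result.score)
```

Treating absent keys as None keeps the reader tolerant of subjects where only one category, or neither, has been evaluated.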

Production example (Registry Broker; informative)

  • Endpoint: https://hol.org/registry/api/v1/agents/{uaid}
  • Example UAID: uaid:aid:3RomW1LwBJ7ZM1PWrLCEro9w4YtY9xwGNWMM2i21mANv8BcWpKg4a7zXxcoNMPDJ7B (NANDA registry entry with simple eval fields populated)
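
A rough sketch of querying this endpoint from Python, using only the standard library, might look as follows. Several things here are assumptions rather than documented behavior: that the endpoint answers an unauthenticated GET with JSON, that the UAID should be percent-encoded in the path, and the helper names themselves. Because the response layout is not described here, the sketch does not assume where metadata.additional sits in the payload and instead walks the whole document for keys starting with a2aSimple.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict

BASE_URL = "https://hol.org/registry/api/v1/agents/"


def build_agent_url(uaid: str) -> str:
    """Build the agent URL; percent-encoding the UAID is an assumption."""
    return BASE_URL + urllib.parse.quote(uaid, safe="")


def fetch_agent(uaid: str, timeout: float = 10.0) -> Any:
    """GET the agent record and decode it as JSON (assumes a JSON response)."""
    with urllib.request.urlopen(build_agent_url(uaid), timeout=timeout) as resp:
        return json.load(resp)


def collect_simple_eval_fields(node: Any) -> Dict[str, Any]:
    """Recursively gather every key that starts with 'a2aSimple'.

    Walking the whole payload avoids guessing the exact response shape.
    """
    found: Dict[str, Any] = {}
    if isinstance(node, dict):
        for key, value in node.items():
            if isinstance(key, str) and key.startswith("a2aSimple"):
                found[key] = value
            else:
                found.update(collect_simple_eval_fields(value))
    elif isinstance(node, list):
        for item in node:
            found.update(collect_simple_eval_fields(item))
    return found


# Usage (performs a network request, so it is left commented out):
# uaid = "uaid:aid:3RomW1LwBJ7ZM1PWrLCEro9w4YtY9xwGNWMM2i21mANv8BcWpKg4a7zXxcoNMPDJ7B"
# payload = fetch_agent(uaid)
# print(collect_simple_eval_fields(payload))
```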