Import & Export
Evalify reads and writes eval files in multiple formats. The same pack can be consumed by different tools — Evalify handles the conversion.
Internal model
All formats are normalized to a single internal representation before processing:
interface EvalItem {
  prompt: string;          // The task or query to send to the AI
  expectations: string[];  // Verifiable pass/fail criteria
}
This is the model stored in the registry and returned by the CLI. Import and export are transformations to and from this model.
Anthropic Skill Creator v2
ID: anthropic/skillcreator/v2
Import (read)
Evalify detects this format when the file has a skill_name wrapper, or when items contain id + expected_output + files alongside prompt + expectations. Both the wrapped and bare-array forms are accepted.
{
  "skill_name": "my-skill",
  "evals": [
    {
      "id": 1,
      "prompt": "Summarize this pull request diff in under 100 words",
      "expected_output": "A concise summary that mentions the core change",
      "files": [],
      "expectations": [
        "Response is under 100 words",
        "Response mentions the core change",
        "Response does not include file paths or line numbers"
      ]
    }
  ]
}
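On import, only prompt and expectations are extracted. id, expected_output, and files are not part of the internal model, so their original values are not carried through a round trip; exporting back to Skill Creator v2 regenerates them as described under Export (write).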
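The extraction step can be sketched roughly as follows. This is a minimal illustration, not Evalify's actual code: importSkillCreatorV2 is a hypothetical helper name, and the validation rules and error messages are assumptions. It accepts both the wrapped and bare-array forms and keeps only the two fields of the internal model.

```typescript
// Internal model, as defined in the "Internal model" section.
interface EvalItem {
  prompt: string;
  expectations: string[];
}

// Hypothetical helper (not a documented Evalify API): normalizes a parsed
// Skill Creator v2 file into the internal model.
function importSkillCreatorV2(input: unknown): EvalItem[] {
  // Accept both the wrapped form ({ skill_name, evals }) and a bare array.
  const rawItems: unknown = Array.isArray(input)
    ? input
    : (input as { evals?: unknown } | null | undefined)?.evals;

  if (!Array.isArray(rawItems)) {
    throw new Error("Expected an array of evals or a { skill_name, evals } wrapper");
  }

  return rawItems.map((raw: unknown, index: number): EvalItem => {
    if (typeof raw !== "object" || raw === null) {
      throw new Error(`Item ${index} is not an object`);
    }
    const item = raw as { prompt?: unknown; expectations?: unknown };
    if (typeof item.prompt !== "string" || !Array.isArray(item.expectations)) {
      throw new Error(`Item ${index} is missing prompt or expectations`);
    }
    // Keep only prompt and expectations; id, expected_output, and files
    // are intentionally left out of the result.
    return {
      prompt: item.prompt,
      expectations: item.expectations.map((e: unknown) => String(e)),
    };
  });
}
```

Calling the helper on the example file above would yield a one-element array holding the prompt and its three expectations.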
Export (write)
When exporting back to Skill Creator v2, Evalify produces the full wrapper format. id is re-assigned sequentially from 1. expected_output is left empty (it's a human-authored prose field). files is set to [].
{
  "skill_name": "my-skill",
  "evals": [
    {
      "id": 1,
      "prompt": "...",
      "expected_output": "",
      "files": [],
      "expectations": ["..."]
    }
  ]
}
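A sketch of the export rules above, under stated assumptions: exportSkillCreatorV2 is a hypothetical helper name, the skill name is assumed to be supplied by the caller, and files is typed as string[] only for illustration (the source shows it only as an empty array).

```typescript
// Internal model, as defined in the "Internal model" section.
interface EvalItem {
  prompt: string;
  expectations: string[];
}

// Shape of one exported item, mirroring the JSON example above.
interface SkillCreatorEval {
  id: number;
  prompt: string;
  expected_output: string;
  files: string[]; // element type is an assumption; exports always use []
  expectations: string[];
}

interface SkillCreatorFile {
  skill_name: string;
  evals: SkillCreatorEval[];
}

// Hypothetical helper (not a documented Evalify API).
function exportSkillCreatorV2(skillName: string, items: EvalItem[]): SkillCreatorFile {
  return {
    skill_name: skillName,
    evals: items.map((item, index) => ({
      id: index + 1,        // re-assigned sequentially from 1
      prompt: item.prompt,
      expected_output: "",  // left empty: human-authored prose field
      files: [],            // always an empty array on export
      expectations: [...item.expectations],
    })),
  };
}
```

Passing the result to JSON.stringify(result, null, 2) would produce a file in the wrapper layout shown above.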
Evalify (native)
ID: evalify
Import (read)
Detected when items have prompt + expectations without Skill Creator-specific fields. Accepts both a bare array and an { evals: [...] } wrapper.
[
  {
    "prompt": "Summarize this pull request diff in under 100 words",
    "expectations": [
      "Response is under 100 words",
      "Response mentions the core change"
    ]
  }
]
Export (write)
Exports as a flat array of { prompt, expectations } objects — no wrapper, no extra fields.
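As a rough illustration, the native export amounts to projecting each item onto its two fields and serializing the array. exportEvalify is a hypothetical name, and the two-space JSON indentation is an assumption rather than documented behavior.

```typescript
// Internal model, as defined in the "Internal model" section.
interface EvalItem {
  prompt: string;
  expectations: string[];
}

// Hypothetical helper (not a documented Evalify API): serializes items as a
// flat JSON array of { prompt, expectations } objects.
function exportEvalify(items: EvalItem[]): string {
  // Rebuild each object explicitly so any extra runtime properties are dropped.
  const flat = items.map((item) => ({
    prompt: item.prompt,
    expectations: [...item.expectations],
  }));
  return JSON.stringify(flat, null, 2); // indentation width is an assumption
}
```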
Format detection order
Detection runs most-specific first:
- Skill Creator v2: skill_name wrapper, or items with id + expected_output + files
- Evalify: prompt + expectations fallback
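The detection order above might look like this in code. This is a sketch, not Evalify's implementation: detectFormat, isRecord, and extractItems are hypothetical names, and matching on "at least one item has the fields" (rather than every item) is an assumption the source does not settle.

```typescript
type FormatId = "anthropic/skillcreator/v2" | "evalify";

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Returns the item list from either a bare array or an { evals: [...] } wrapper.
function extractItems(input: unknown): unknown[] {
  if (Array.isArray(input)) return input;
  if (isRecord(input)) {
    const evals = input["evals"];
    if (Array.isArray(evals)) return evals;
  }
  return [];
}

function hasAll(item: Record<string, unknown>, keys: string[]): boolean {
  return keys.every((key) => key in item);
}

// Hypothetical helper: checks the most specific format first.
function detectFormat(input: unknown): FormatId | undefined {
  // 1. Skill Creator v2: a skill_name wrapper is decisive on its own.
  if (isRecord(input) && "skill_name" in input) {
    return "anthropic/skillcreator/v2";
  }

  const items = extractItems(input).filter(isRecord);

  // 1b. Skill Creator v2: items carrying the format-specific fields.
  // Assumption: one matching item is enough.
  const skillCreatorKeys = ["prompt", "expectations", "id", "expected_output", "files"];
  if (items.some((item) => hasAll(item, skillCreatorKeys))) {
    return "anthropic/skillcreator/v2";
  }

  // 2. Evalify: fallback when items have just prompt + expectations.
  if (items.some((item) => hasAll(item, ["prompt", "expectations"]))) {
    return "evalify";
  }

  return undefined; // unrecognized input
}
```

Checking the more specific format first matters because every Skill Creator v2 item also satisfies the Evalify test; reversing the order would classify all files as native.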