# Examples

This directory holds reference payloads in AXF (`.axf`) alongside their
JSON equivalents (`.json`) where applicable. The `.axf` files are normative
test vectors — encoders and decoders MUST be able to round-trip them.

## File index

| File | What it is |
|---|---|
| `tool-call.axf` | A single tool/function call (`get_weather`) with header + metadata. |
| `tool-call.json` | The same logical payload encoded as JSON, for byte/token comparison. |
| `chat-message.axf` | A user-authored chat message with an annotation segment. |

## Wire format primer (cheat sheet)

AXF is a segment-based format inspired by EDI/X12 and adapted to keep LLM token counts low:

- `~` terminates a **segment** (one logical row).
- `*` separates **elements** within a segment.
- `:` separates **sub-elements** (key/value or compound values).
- The first segment is always `FX*<version>*<schema-id>*<message-id>~`.
- The last segment is always `E~` (end-of-message sentinel).
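
The delimiter rules above can be sketched as a small parser. This is a minimal illustration, not the reference implementation: the sample message, the function names, and the tolerance for whitespace between segments are assumptions, and escaping of delimiter characters inside values is not handled because the primer does not describe it.

```python
SEGMENT_TERMINATOR = "~"
ELEMENT_SEPARATOR = "*"
SUB_ELEMENT_SEPARATOR = ":"


def parse_axf(message: str) -> list[list[str]]:
    """Split an AXF message into segments, each a list of elements."""
    body = message.strip()
    if not body.endswith(SEGMENT_TERMINATOR):
        raise ValueError("message must end with the segment terminator '~'")
    # Drop the final terminator, then split on the remaining ones.
    raw_segments = body[:-1].split(SEGMENT_TERMINATOR)
    # Assumption: whitespace (e.g. newlines) between segments is ignored.
    segments = [raw.strip().split(ELEMENT_SEPARATOR) for raw in raw_segments]
    # Envelope: FX*<version>*<schema-id>*<message-id>
    if segments[0][0] != "FX" or len(segments[0]) != 4:
        raise ValueError("first segment must be FX*<version>*<schema-id>*<message-id>")
    if segments[-1] != ["E"]:
        raise ValueError("last segment must be the end-of-message sentinel E")
    return segments


def split_sub_elements(element: str) -> list[str]:
    """Split a compound element on ':'; call only where the schema says so."""
    return element.split(SUB_ELEMENT_SEPARATOR)


# Hypothetical message, written for this sketch (not one of the test vectors).
SAMPLE = "FX*1*tc:weather.v1*msg-001~TC*get_weather*city:Paris~E~"

segments = parse_axf(SAMPLE)
for tag, *elements in segments:
    print(tag, elements)
```

Sub-element splitting is kept separate on purpose: a schema id such as `tc:weather.v1` also contains `:`, so whether an element is compound has to come from the schema rather than from the character alone.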

Tags used in these examples:

| Tag | Meaning |
|---|---|
| `FX` | Envelope / version header |
| `H`  | Message header (role, timestamp, conversation id) |
| `TC` | Tool call (name + arguments) |
| `T`  | Text body |
| `A`  | Annotations (intent, language, etc.) |
| `M`  | Metadata (priority, timeout, …) |
| `E`  | End of message |

Schema ids (e.g. `tc:weather.v1`) are resolved against the well-known
schema directory — see [`../schemas/README.md`](../schemas/README.md).

## Size & token comparison

Counts below are approximate. Token counts use the `cl100k_base` tokenizer
(GPT-4 family) as a reference. Other tokenizers will give different absolute
numbers, though AXF is expected to stay smaller than the equivalent JSON.

| Payload | Bytes (JSON) | Bytes (AXF) | Tokens (JSON) | Tokens (AXF) | Token savings |
|---|---:|---:|---:|---:|---:|
| `tool-call`    | 414 | 164 | ~135 | ~58 | **~57%** |
| `chat-message` | ~340 | 200 | ~95 | ~52 | **~45%** |
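
The last column appears to be derived from the two token columns as `1 - tokens_axf / tokens_json`. The short check below reproduces it from the approximate figures in the table; the function name and the `rows` dictionary are only for illustration.

```python
def savings(json_count: int, axf_count: int) -> float:
    """Fraction of the JSON cost that the AXF encoding avoids."""
    if json_count <= 0:
        raise ValueError("json_count must be positive")
    return 1 - axf_count / json_count


# (tokens as JSON, tokens as AXF), copied from the table above.
rows = {
    "tool-call": (135, 58),
    "chat-message": (95, 52),
}

for name, (json_tokens, axf_tokens) in rows.items():
    print(f"{name}: {savings(json_tokens, axf_tokens):.0%}")
# tool-call: 57%
# chat-message: 45%
```

The same function applied to the byte columns gives a slightly different figure (about 60% for `tool-call`), so the percentages should be read as token savings rather than byte savings.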

Why the savings?

1. **No structural punctuation tax.** JSON spends many tokens on
   `{`, `}`, `"`, `,`, and whitespace, and quotes every key and string
   value. AXF uses one single-character ASCII delimiter per element or
   segment boundary.
2. **No repeated keys.** Field positions inside a segment are defined by
   the segment's schema, so the key names don't have to ride along on the
   wire on every message.
3. **Schema-by-reference.** The envelope names a schema id; the receiver
   already knows the field layout.
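
Points 2 and 3 can be made concrete with a toy positional encoder. The field order in `TC_FIELDS` and the sample values are hypothetical; the real layout comes from the schema that the envelope references, and the actual `TC` segment may structure its arguments differently. Values containing delimiter characters are not handled here.

```python
import json

# Hypothetical field order for a tool-call segment. In AXF this would be
# looked up from the schema id, not hard-coded.
TC_FIELDS = ("name", "city", "units")


def encode_segment(tag: str, fields: tuple[str, ...], values: dict) -> str:
    """Emit values in schema order; key names never reach the wire."""
    elements = [tag] + [str(values[field]) for field in fields]
    return "*".join(elements) + "~"


def decode_segment(segment: str, fields: tuple[str, ...]) -> tuple[str, dict]:
    """Rebuild the keyed form by zipping positions with the schema's fields."""
    tag, *elements = segment.rstrip("~").split("*")
    if len(elements) != len(fields):
        raise ValueError("element count does not match the schema")
    return tag, dict(zip(fields, elements))


call = {"name": "get_weather", "city": "Paris", "units": "metric"}

segment = encode_segment("TC", TC_FIELDS, call)
as_json = json.dumps(call, separators=(",", ":"))

print(segment, len(segment))  # positional form and its length in characters
print(as_json, len(as_json))  # compact JSON form and its length in characters
```

Decoding with the same field tuple recovers the original keys, which is the sense in which the receiver "already knows" the layout.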

## Running the comparison yourself

Once a reference implementation lands under `reference/`, you'll be able to run:

```sh
fx tokens examples/tool-call.axf
fx tokens examples/tool-call.json --as=json
fx diff   examples/tool-call.axf examples/tool-call.json
```

Until then, treat the table above as illustrative.
