Step 4 — Run the plaintext arm

Fetch the same guide as flattened text, with no item ids in the body:

curl -sS "https://guides.co/g/{slug}/gdf?format=plaintext"
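
To keep the fetched text on disk for chunking and later inspection, the command can be wrapped in a small shell function. This is a minimal sketch: the function names plaintext_url and fetch_plaintext and the output path are illustrative assumptions, and only the URL pattern comes from this step. The --fail flag is added so that an HTTP error page is not saved as if it were the guide body.

```shell
# Build the plaintext URL for a guide slug (URL pattern taken from this step).
plaintext_url() {
  printf 'https://guides.co/g/%s/gdf?format=plaintext\n' "$1"
}

# Fetch the plaintext arm into a file.
# Function name and output path are illustrative, not part of any official tooling.
fetch_plaintext() {
  local slug="$1" out="$2"
  # -sS: quiet but still report errors; --fail: non-zero exit on HTTP errors.
  curl -sS --fail "$(plaintext_url "$slug")" -o "$out"
}
```

Usage would look like `fetch_plaintext my-guide plaintext_arm.txt`, where my-guide stands in for your own slug.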

Use the same retriever settings (chunk size, k, model) as the GDF arm. Only the source format changes.

Plaintext is a deliberately weaker baseline: headings may survive, but the stable item_id values and per-page JSONL boundaries are gone. That loss is what you are testing.
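
Before indexing, it can be worth confirming that the plaintext body really has lost that structure. The sketch below counts suspicious lines in a fetched file. It assumes that ids would show up as the literal token item_id and that a leftover JSONL record would be a line starting with "{" and ending with "}". Both are guesses about how the structure would leak, not a documented property of the format, and the function names are hypothetical.

```shell
# Count lines that mention the literal token "item_id".
# grep -c exits non-zero when the count is 0, so "|| true" keeps the status clean.
count_item_id_lines() {
  grep -c 'item_id' "$1" || true
}

# Count lines that look like a single JSON object (assumed JSONL residue).
count_jsonl_like_lines() {
  grep -c '^[[:space:]]*{.*}[[:space:]]*$' "$1" || true
}
```

Both counts should be 0 for a clean plaintext arm. A non-zero count suggests the arm still carries structure that the comparison is meant to remove.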

Fairness rules:

  • Same questions, same k, same embedding model (if any)
  • Same publish state (both public or both via API)
  • Do not hand-tune prompts differently per arm unless you document it
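
One way to enforce these rules mechanically is to record each arm's settings in a key=value file and refuse to run if anything other than the source format differs. This is a sketch under assumptions: the file layout, the source_format key, and the function names are hypothetical, so adapt them to however your harness stores its configuration.

```shell
# Print a settings file with the source_format line removed, sorted by line,
# so that key order in the file does not matter.
settings_without_format() {
  grep -v '^source_format=' "$1" | sort
}

# Succeed (and print "fair") only if both arms share every setting
# other than source_format; otherwise report on stderr and return 1.
check_fair() {
  local a b
  a="$(settings_without_format "$1")"
  b="$(settings_without_format "$2")"
  if [ "$a" = "$b" ]; then
    echo "fair"
  else
    echo "settings differ between arms" >&2
    return 1
  fi
}
```

With files such as gdf.env and plaintext.env that hold lines like chunk_size=512, k=5, and a prompt identifier, `check_fair gdf.env plaintext.env` makes any per-arm difference visible. A deliberate difference then has to be documented rather than slipping in unnoticed.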