- From: Antun Jurkovic <antunjurkovic@gmail.com>
- Date: Mon, 20 Oct 2025 15:46:52 +0200
- To: ietf-http-wg@w3.org
Summary (experience report)
I've deployed a per-resource machine representation pattern that uses
existing HTTP semantics (RFC 9110, RFC 9111) to reduce transfers for
unchanged content. It's been applied to article pages (e.g., for AI
crawlers) but is content-agnostic.
What we do (interplay of standard pieces)
1. Deterministic per-resource machine endpoint (JSON) mapped from the human page
2. Normalized visible content → fingerprint; endpoint serves ETag
equal to that fingerprint (weak/strong ETags handled)
3. Machine-readable index (JSON) publishes
cUrl/mUrl/modified/contentHash where contentHash equals the ETag
4. Clients use If-None-Match; servers return 304 for unchanged
content; HEAD where available, GET fallback otherwise
5. Link relations: HTML rel="alternate" points to JSON; HTTP Link:
rel="canonical" points back (RFC 8288; canonical per RFC 6596)
Built entirely on existing HTTP semantics; no new headers.
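
A minimal Python sketch of steps 2, 4 and 5, for illustration only and not the deployed plugin code. The whitespace-collapsing normalization, the choice of SHA-256, the quoted-hex ETag format, and the function names are all assumptions. If-None-Match is evaluated with the weak comparison function, as RFC 9110 specifies for that header.

```python
import hashlib
import re
from typing import Dict, Optional, Tuple


def fingerprint(visible_text: str) -> str:
    """Hash the normalized text. Collapsing whitespace is an assumed rule."""
    normalized = re.sub(r"\s+", " ", visible_text).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def make_etag(visible_text: str) -> str:
    """Build a strong ETag from the fingerprint (quoted hex is an assumed format)."""
    return '"' + fingerprint(visible_text) + '"'


def _opaque(tag: str) -> str:
    """Drop the W/ prefix so that weak and strong forms compare equal."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def is_not_modified(if_none_match: Optional[str], current_etag: str) -> bool:
    """Weak comparison of If-None-Match against the current ETag."""
    if not if_none_match:
        return False
    if if_none_match.strip() == "*":
        return True
    # Naive split: adequate here because hex ETags never contain commas.
    candidates = [_opaque(t) for t in if_none_match.split(",")]
    return _opaque(current_etag) in candidates


def respond(
    if_none_match: Optional[str],
    visible_text: str,
    json_body: bytes,
    canonical_url: str,
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a GET on the machine endpoint."""
    etag = make_etag(visible_text)
    if is_not_modified(if_none_match, etag):
        return 304, {"ETag": etag}, b""
    headers = {
        "ETag": etag,
        "Content-Type": "application/json",
        # Step 5: point back to the human page.
        "Link": "<" + canonical_url + '>; rel="canonical"',
    }
    return 200, headers, json_body
```

A client that stored the ETag from an earlier 200 and sends it back in If-None-Match gets a 304 with an empty body for as long as the normalized text is unchanged, even if the surrounding HTML template changes.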
Deployment notes
Three production sites (~970 URLs in total). Representative deployments (non-limiting):
- bestdemotivationalposters.com - 500 URLs, 2 months production
- wellbeing-support.com - 400 URLs, 2 months production
- omacedonii.com - 68 URLs, 1 month production
Measured results:
- Average JSON: ~17.7 kB vs HTML: ~103 kB (~83% byte reduction)
- 304 responses for unchanged content on revalidation; high
steady-state skip rate when clients compare published hashes against
the hashes they stored on a previous crawl
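
The hash comparison behind the skip rate can be sketched in Python as follows. The field names cUrl, mUrl and contentHash come from the index described above; treating the index as a flat list of entries, the stored_hashes mapping, and the function name are assumptions for illustration.

```python
from typing import Dict, List, Tuple


def plan_fetches(
    index_entries: List[Dict[str, str]],
    stored_hashes: Dict[str, str],
) -> Tuple[List[str], int]:
    """Return the machine URLs that need fetching and the count of skipped entries.

    stored_hashes maps a canonical URL (cUrl) to the contentHash recorded
    on the previous crawl.
    """
    to_fetch: List[str] = []
    skipped = 0
    for entry in index_entries:
        if stored_hashes.get(entry["cUrl"]) == entry["contentHash"]:
            skipped += 1  # published hash matches: no request is needed
        else:
            to_fetch.append(entry["mUrl"])  # new or changed resource
    return to_fetch, skipped
```

Entries that are not skipped can still be fetched with If-None-Match, so a stale or missing index costs at most one conditional request per resource.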
Artifacts:
- Spec repo: https://github.com/antunjurkovic-collab/collab-tunnel-spec
- WordPress plugin:
https://github.com/antunjurkovic-collab/trusted-collab-tunnel
- Edge worker (optional):
https://github.com/antunjurkovic-collab/trusted-collab-worker
Why I'm writing
I'm looking for the WG's guidance on the best venue and shape for
documenting this application pattern of HTTP semantics, e.g.:
- An individual Informational or BCP draft under httpbis?
- IANA Link Relation Type registrations (e.g., if standard relations
for "terms-of-service"/"pricing" are appropriate, or guidance to use
existing relations)?
- A well-known location (RFC 8615) for discovery (if the group deems
that useful), or simply documenting current practice with sitemaps?
- Any cautions with ETag parity and intermediary behavior that we
should capture?
Questions for the WG
1. HTTP caching/conditional requests: Are there subtleties we should
highlight for intermediaries/CDNs with this ETag pattern?
2. Scope: Is this squarely in scope for httpbis, or better as an
independent Informational?
3. IPR: I have a US provisional patent application related to the
method. I will make any required IPR disclosure per RFC 8179 and am
considering RF licensing; open to WG feedback on expectations here.
Documentation
- Technical overview: https://llmpages.org/developers/
- Live validator: https://llmpages.org/validator/
- Production examples (sitemaps):
- https://bestdemotivationalposters.com/llm-sitemap.json
- https://wellbeing-support.com/llm-sitemap.json
- https://omacedonii.com/llm-sitemap.json
Thanks for your time. Happy to share more measurements or a draft
document for review.
Best regards,
Antun Jurkovikj
GitHub: https://github.com/antunjurkovic-collab
Email: antunjurkovic@gmail.com
Website: https://llmpages.org/
Received on Tuesday, 21 October 2025 10:05:39 UTC