Proposal: AI Domain Data Standard for Authoritative Domain Identity Metadata from dylan larson on 2025-12-02 (public-wicg@w3.org from December 2025)

From: dylan larson <dylanl37@hotmail.com>
Date: Tue, 2 Dec 2025 23:00:28 +0000
To: "public-wicg@w3.org" <public-wicg@w3.org>
Message-ID: <SN7PR84MB313656EBEEA6AB6C0B989DDBD9D8A@SN7PR84MB3136.NAMPRD84.PROD.OUTLOOK.COM>

Hello WICG community,

I would like to introduce the AI Domain Data Standard (AIDD) for discussion. Its goal is to address a gap in the web ecosystem that is becoming more visible as AI systems increasingly act as intermediaries between users and websites.

Problem
AI assistants often misidentify or misrepresent domains because there is no consistent, machine-readable, domain-controlled source of identity data. Today, models rely on scraped pages, inconsistent metadata, third-party aggregators, or outdated indexes. There is no canonical place where a domain owner can declare who they are, what the domain represents, or which resources are authoritative.

Proposal
AIDD defines a small, predictable JSON document served from:
* https://<domain>/.well-known/domain-profile.json
* Optional fallback: _ai.<domain> TXT record containing a base64-encoded JSON copy
The format contains required identity fields (name, description, website, contact) and optional schema.org-aligned fields such as entity type, logo, and JSON-LD. The schema is intentionally minimal to ensure predictable consumption by AI systems, agents, crawlers, and other automated clients.
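
As a rough illustration of the shape described above, here is a minimal Python sketch that builds the well-known URL and checks the four required identity fields. It is not taken from the specification: the helper names (profile_url, missing_fields), the example values, and the assumption that each required field is a non-empty string are all hypothetical, and the published schema remains the authority on types and structure.

```python
import json

# Required identity fields named in the proposal. Treating each one as a
# non-empty string is an assumption; the published schema may differ.
REQUIRED_FIELDS = ("name", "description", "website", "contact")


def profile_url(domain: str) -> str:
    """Build the well-known URL for a domain's profile document."""
    return f"https://{domain}/.well-known/domain-profile.json"


def missing_fields(profile: dict) -> list:
    """Return the required fields that are absent or not a non-empty string."""
    return [
        field
        for field in REQUIRED_FIELDS
        if not isinstance(profile.get(field), str) or not profile[field].strip()
    ]


# Hypothetical document for a placeholder domain; values are illustrative only.
example = json.loads("""
{
  "name": "Example Co",
  "description": "Illustrative placeholder organization.",
  "website": "https://example.com",
  "contact": "hello@example.com"
}
""")

print(profile_url("example.com"))
print(missing_fields(example))
```

A consumer would presumably fetch the URL over HTTPS and run a check like this before trusting the document; the fetch itself is left out so the sketch stays self-contained.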

Specification (v0.1.1):
https://ai-domain-data.org/spec/v0.1
Schema:
https://ai-domain-data.org/spec/schema-v0.1.json

Design Principles
* Self-hosted and vendor-neutral
* Aligns with schema.org vocabulary
* Minimal surface area with clear versioning
* Follows existing web conventions for .well-known/
* Supports both HTTPS and DNS TXT discovery
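
To make the DNS TXT fallback concrete, the sketch below decodes a base64-encoded JSON payload of the kind the _ai.<domain> record would carry. It assumes the TXT value is the bare base64 string with no prefix or chunking, which the specification may define differently; the function names are hypothetical, and the DNS lookup itself is omitted.

```python
import base64
import json
from typing import Optional


def txt_record_name(domain: str) -> str:
    """Name of the fallback TXT record for a domain."""
    return f"_ai.{domain}"


def decode_txt_payload(payload: str) -> Optional[dict]:
    """Decode a base64-encoded JSON object; return None if it is malformed.

    Assumption: the payload is the whole TXT value, already joined into
    one string by whatever DNS client retrieved it.
    """
    try:
        raw = base64.b64decode(payload, validate=True)
        data = json.loads(raw.decode("utf-8"))
    except ValueError:
        # Covers invalid base64, invalid UTF-8, and invalid JSON, all of
        # which raise subclasses of ValueError.
        return None
    return data if isinstance(data, dict) else None


# Round trip with a hypothetical, illustrative profile fragment.
sample = {"name": "Example Co"}
encoded = base64.b64encode(json.dumps(sample).encode("utf-8")).decode("ascii")
print(txt_record_name("example.com"))
print(decode_txt_payload(encoded))
```

Returning None on any malformed input keeps the fallback path from raising in a resolver that tries HTTPS first and DNS second.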

Early Adoption & Tooling

* CLI validator and generator
* Resolver SDK
* Next.js integration
* Jekyll plugin
* WordPress plugin (submitted)
* Online generator and checker tools

Repository:
https://github.com/ai-domain-data/spec

Questions for the community

1. Should this pursue formal standardization (W3C, IETF) or remain a community-driven specification?
2. Are the discovery mechanisms (.well-known + DNS TXT fallback) appropriate for long-term stability?
3. What extension patterns are advisable while preserving strict predictability?
4. Should browsers or other user agents eventually consume this data?
5. Are there concerns around naming (domain-profile.json) that the group would recommend addressing early?

Explainer
A more complete explainer is available here:
https://ai-domain-data.org/spec/v0.1
I would appreciate any feedback from the WICG community on scope, technical direction, and whether this fits the criteria for incubation.
Best regards,
Dylan Larson

Received on Wednesday, 3 December 2025 15:21:06 UTC