Re: page about robots.txt from Brendan Quinn on 2021-02-25 (public-tdmrep@w3.org from February 2021)

From: Brendan Quinn <brendan@cluefulmedia.com>
Date: Thu, 25 Feb 2021 16:02:08 +0200
To: laurent.lemeur@edrlab.org
Cc: public-tdmrep@w3.org
Message-ID: <CAMvELkeaLVKGXaJPBAkmVTS4y3F4kQgJR9VV-tpbiKAovz-JsQ@mail.gmail.com>

Thanks Laurent, that looks good.

It's probably worth mentioning that there are some provider-specific
extensions to robots.txt used in the wild, e.g. sitemap:, which is used by
"Google, Bing, and other major search engines".

https://developers.google.com/search/reference/robots_txt#google-supported-non-group-member-lines
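
To make the extension concrete, here is a minimal sketch of reading a
sitemap: line alongside the usual allow/disallow rules, using Python's
standard urllib.robotparser (site_maps() needs Python 3.8 or later). The
robots.txt content, the example.com URLs and the "ExampleBot" user agent are
made up for illustration; they are not taken from the page under discussion.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
# parse() takes an iterable of lines, so no network access is needed here.
parser.parse(ROBOTS_TXT.splitlines())

# Sitemap lines are not tied to any User-agent group.
sitemaps = parser.site_maps()

# "ExampleBot" is an assumed crawler name; it falls under the "*" group.
allowed_public = parser.can_fetch("ExampleBot", "https://example.com/public/page.html")
allowed_private = parser.can_fetch("ExampleBot", "https://example.com/private/page.html")

print(sitemaps, allowed_public, allowed_private)
```

A parser that does not know the extension would simply ignore the sitemap:
line, which is part of why such extensions can coexist with the basic format.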

I guess we should also document the .well-known folder, with spec here:
https://tools.ietf.org/html/rfc8615 and the quite extensive "well-known URI
repository" at
https://www.iana.org/assignments/well-known-uris/well-known-uris.xhtml
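
As a rough sketch of how a client would locate such a resource: RFC 8615
reserves the "/.well-known/" path prefix at the root of the origin, so the
URI can be derived from any page URL on the site. The helper name
well_known_uri and the suffix "example-policy.json" below are hypothetical;
real suffixes are the ones listed in the IANA registry.

```python
from urllib.parse import urlsplit, urlunsplit


def well_known_uri(site_url: str, suffix: str) -> str:
    """Build the well-known URI for `suffix` on the origin of `site_url`.

    Hypothetical helper: it keeps only the scheme and authority, then
    appends the "/.well-known/" prefix defined by RFC 8615.
    """
    parts = urlsplit(site_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an absolute http(s) URL: {site_url!r}")
    path = "/.well-known/" + suffix.lstrip("/")
    # Query string and fragment of the page URL are deliberately dropped.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


# "example-policy.json" is an assumed, unregistered suffix.
uri = well_known_uri("https://example.com/some/page?x=1", "example-policy.json")
print(uri)
```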

Also see IAB's ads.txt initiative: https://iabtechlab.com/ads-txt/

Sorry I missed the call on Tuesday. I hope it was fruitful.

Best regards,

Brendan.

On Thu, 25 Feb 2021 at 15:29, Laurent Le Meur <laurent.lemeur@edrlab.org>
wrote:

> Dear participants,
>
> I have added a page to the GitHub repo, which tries to summarize what
> robots.txt is and how it is used. Robots.txt was described by Ivan Herman
> as a possible source of inspiration during our last call.
>
> https://github.com/w3c/tdm-reservation-protocol/blob/main/docs/robots.md
>
> Best regards
> Laurent Le Meur
>

Received on Thursday, 25 February 2021 14:02:34 UTC