Re: [w3ctag/design-reviews] ads.txt (#201)

Hi, ads.txt working group member here.  Yes, it would be great to get these concerns addressed in the next version of ads.txt (and related specs).  Items I had previously noted and hope to make more technically precise include:

- Character encoding: we see files published in various character encodings that not all platforms interpret correctly.  We should specify a character encoding such as UTF-8 for the file content so that validators can consistently flag issues.
- Byte-order marks: we see files that begin with a non-visible byte order mark (https://en.wikipedia.org/wiki/Byte_order_mark), which can trip up parsing if not handled properly.  The spec should state whether these are allowed.
- Line endings: the spec does not specify which byte sequences are considered line endings.  We've encountered files that use atypical line endings, or a mix of them, which could trip up parsers.  We should update the spec to state which byte sequences (0x0A, 0x0D 0x0A, 0x0D, etc.) are valid, parseable line endings.
- Public suffix list specificity: the publicsuffix.org list contains two sections: an ICANN section and a private section.  The ads.txt spec doesn't specify whether the private section is valid for use.
- SUBDOMAIN= directive specificity and limitations: I'd like the spec to provide more detail and examples about how SUBDOMAIN= directives behave and interact with each other, and potentially to define a limit on the number of levels.
- Security: I'd like to see if we can be more precise in the standard about how to treat HTTPS URLs, when it is permissible to fall back to HTTP, what validations the crawler should perform (e.g., TLS certificate validation), and which transport security protocols are accepted.  We should identify the security risks and mitigate them with precise rules.
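
To make the first three items concrete, here is a minimal sketch of what a parser's normalization step might look like. It assumes the spec settled on UTF-8, tolerated a single leading BOM, and accepted LF, CRLF, and bare CR as line endings. None of these choices are in the current spec, and the function name `normalize_ads_txt` is hypothetical.

```python
import codecs
import re
from typing import List

# Assumption: LF, CRLF, and bare CR all terminate a line.
# CRLF is listed first so it is matched as one ending, not two.
_LINE_ENDINGS = re.compile(r"\r\n|\r|\n")


def normalize_ads_txt(raw: bytes) -> List[str]:
    """Decode raw ads.txt bytes and split them into lines (illustrative only)."""
    # Assumption: a single leading UTF-8 BOM is tolerated and dropped.
    if raw.startswith(codecs.BOM_UTF8):
        raw = raw[len(codecs.BOM_UTF8):]
    # Assumption: content must be valid UTF-8. Anything else raises
    # UnicodeDecodeError so that a validator can flag the file.
    text = raw.decode("utf-8", errors="strict")
    lines = _LINE_ENDINGS.split(text)
    # A trailing line ending leaves one empty final element; drop it.
    if lines and lines[-1] == "":
        lines.pop()
    return lines
```

If the spec instead chose to reject BOMs or bare CR, only the two marked assumptions would change, which is part of why pinning these rules down in the spec text matters.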

I will work with @slightlyoff on this.

Stepping back from the specific recommendations in this thread, do you have any pointers to documents that explain how to write a good spec, if such a thing exists?  Also, I would like to put together a compatibility test suite that participants can use to confirm that their crawlers and parsers are implemented correctly.  Any tips on this, or examples of well-written suites that do this, would be great to learn from.
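
One shape I could imagine for such a suite is a set of shared test vectors: raw input bytes paired with the expected outcome, which each participant runs against their own parser. A rough sketch follows. The vectors, the `Vector` type, the `run_vectors` helper, and the expected outcomes are all hypothetical placeholders rather than spec decisions.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Vector:
    name: str
    raw: bytes                     # exact bytes served as the ads.txt file
    expected: Optional[List[str]]  # expected lines, or None if the file must be rejected


# Hypothetical vectors; the expected outcomes are placeholders only.
VECTORS = [
    Vector("lf", b"a.com, 1, DIRECT\n", ["a.com, 1, DIRECT"]),
    Vector("crlf", b"a.com, 1, DIRECT\r\n", ["a.com, 1, DIRECT"]),
    Vector("invalid-utf8", b"\xff\xfe", None),
]


def run_vectors(parse: Callable[[bytes], List[str]], vectors: List[Vector]) -> List[str]:
    """Return the names of the vectors on which `parse` disagrees with the expectation."""
    failures = []
    for v in vectors:
        try:
            actual = parse(v.raw)
        except ValueError:  # assumption: parsers signal rejection with ValueError
            actual = None
        if actual != v.expected:
            failures.append(v.name)
    return failures
```

A participant would pass their own parser as `parse`; an empty result means it agrees with every vector. Keeping the vectors as plain data would also let implementations in other languages consume the same set.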

https://github.com/w3ctag/design-reviews/issues/201#issuecomment-579850491

Received on Wednesday, 29 January 2020 16:47:52 UTC