Re: [VE][html5] Scope issue with & in external URL coming up with an issue, when it is legal and ok in URL.

From: Jukka K. Korpela <jkorpela@cs.tut.fi>
Date: Mon, 29 Jul 2013 09:30:16 +0300
Message-ID: <51F60BF8.8040608@cs.tut.fi>
To: Mark Edgar <me4booking@hotmail.co.uk>
CC: www-validator@w3.org
2013-07-28 0:53, Mark Edgar wrote:

> Validating http://www.morayphotovoltaic.com/
> Error [html5]: ""
> Error/Line 12, Column 62/: & did not start a character reference. (&
> probably should have been escaped as &amp;.)
> |…pt src="https://maps.googleapis.com/maps/api/js?v=3.exp*&*sensor=false"></script>|

By HTML5 rules as currently defined, the markup is valid, but the 
validator plays by older rules. Unless I’m missing something, the 
authors of the validator think the older rules are better and therefore 
haven’t fixed the bug.

This is an example of what the boilerplate text in the validator’s 
report (when used in HTML5 mode) says: “The validator checked your 
document with an experimental feature: HTML5 Conformance Checker. This 
feature has been made available for your convenience, but be aware that 
it may be unreliable, or not perfectly up to date with the latest 
development of some cutting-edge technologies”.

> |The src string value is out of scope and externally as a URL it is correct.|

The same holds for the string B&N (a company name abbreviation): it is
correct as such, yet it can be written as B&amp;N in HTML, *must* be
written that way (or using an equivalent numeric character reference)
in XHTML, and *should* be written that way according to HTML 4.01 and
to some versions of HTML5.

URL-valued attributes have not been an exception to HTML parsing. In
HTML5, the rules are being changed somewhat. But still, a URL as a
string may be distinct from the representation of that string in HTML,
just as we type B&N on paper but may/must/should type B&amp;N when
writing HTML document content by hand. (WYSIWYG editors, in their normal
mode, internally convert input of & to &amp; in HTML code.)
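
To illustrate the distinction, here is a minimal Python sketch of what
such an editor (or any page-generating script) might do. It assumes
Python's standard html.escape function as a stand-in for the editor's
conversion; the variable names are mine, chosen for illustration only.

```python
from html import escape

# The URL as a plain string, the way it would be typed in a browser's address bar.
url = "https://maps.googleapis.com/maps/api/js?v=3.exp&sensor=false"

# The representation of that same string inside an HTML attribute value.
# escape() rewrites & as &amp; (and, with quote=True, also escapes quotes).
attr = escape(url, quote=True)

print('<script src="%s"></script>' % attr)
```

The URL itself does not change; only its written form in the HTML
source does.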

My practical advice:
1) If it’s just this src attribute, or a few such cases, use &amp; to 
denote &. It has always been valid to do so and will always remain valid.
2) If you have dozens of such cases, possibly with several ampersands
per URL, consider leaving them as they are and ignoring the error
messages. There have been some versions of the validator that let you
filter out specific types of messages, but I’ve forgotten where they
were. I hope this feature will soon be added to http://validator.w3.org.

> |I don't think you should parse external strings, who’s to say how they pass arguments.|

The src attribute value is not an external string; as such, it is just a
string in the HTML document. Once the parser is done with it, any &amp;
has been changed to & (in the DOM), and this is what will be passed.
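
A small Python sketch of this, using the standard html.parser module as
a stand-in for a browser's parser (it is not identical to one, and the
SrcCollector class and script_sources function are hypothetical names
made up for this illustration):

```python
from html.parser import HTMLParser


class SrcCollector(HTMLParser):
    """Collects src attribute values as the parser reports them."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # attrs holds (name, value) pairs; character references in the
        # value have already been decoded by the parser at this point.
        for name, value in attrs:
            if name == "src":
                self.sources.append(value)


def script_sources(markup):
    """Return the src values found in the given markup, after parsing."""
    collector = SrcCollector()
    collector.feed(markup)
    collector.close()
    return collector.sources


escaped = ('<script src="https://maps.googleapis.com/maps/api/js'
           '?v=3.exp&amp;sensor=false"></script>')
unescaped = ('<script src="https://maps.googleapis.com/maps/api/js'
             '?v=3.exp&sensor=false"></script>')

# Both spellings should yield the same URL, containing a plain &.
print(script_sources(escaped))
print(script_sources(unescaped))
```

So writing &amp; in the markup does not alter the arguments that the
server receives.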

Received on Monday, 29 July 2013 06:30:47 UTC
