Re: Validation error from base 64 encoded data in an object element.

2013-11-29 11:54, Philip Taylor wrote:

> Michael[tm] Smith wrote:
>
>> The validator's correct. The values of the "data" attributes of your
>> <object> elements contain literal newlines. Valid URLs aren't allowed to
>> contain literal newlines. So to make them valid URLs you need to make
>> each of your "data" attribute values a single line, without the line
>> breaks.
>
> The value of the data attributes in the example under discussion is a
> data URI encoded in base64.  According to Wikipedia [1],
>
>  "data URIs encoded with base64 may contain whitespace for readability."
>
> Is Wikipedia therefore incorrect in its assertion ?

Strangely enough, the assertion seems to be true as regards behavior in
popular browsers. A data: URL divided into several lines works OK.
Internally, in the DOM, line breaks are stripped off.
This also applies to normal URLs; the following works:

<a href="http://www.w3.
org">W3C</a>

So this isn't specific to data: URLs or to Base64, where the handling of
line breaks is more or less an open issue; see
https://tools.ietf.org/html/rfc4648#section-3.1
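
To illustrate why line breaks in Base64 are an open issue, here is a
small Python sketch. It only shows how Python's standard base64 module
treats a wrapped string in its lenient and strict modes; it says nothing
about what browsers do, and the sample string is made up for the
example.

```python
import base64
import binascii

# "Hello, world!" encoded as Base64, then wrapped across two lines
# (a made-up sample, not taken from the document under discussion).
wrapped = "SGVsbG8s\nIHdvcmxkIQ=="

# Lenient mode (the default): characters outside the Base64 alphabet,
# including the line break, are discarded before decoding.
lenient = base64.b64decode(wrapped)

# Strict mode: the same input is rejected because of the line break.
try:
    base64.b64decode(wrapped, validate=True)
    strict_ok = True
except binascii.Error:
    strict_ok = False
```

So the same wrapped data is acceptable or not depending on how strict
the decoder chooses to be.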

I cannot find any justification for stripping line breaks in the specs. 
Attribute value parsing rules say nothing of the kind, and in general, 
line breaks in attribute values in HTML source are present in the DOM.

But I can see practical reasons for this. It's probably useful error
recovery. And it could be turned into a simple rule: in an attribute value
declared to be a URL (or "valid non-empty URL potentially surrounded by
spaces" in the HTML5 spec jargon), removing line breaks is OK since URLs
must not contain line breaks.
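
Such a rule could be sketched in Python roughly as follows. This is only
an illustration under the assumption that the rule is "trim surrounding
whitespace, then drop CR and LF"; the function name is hypothetical and
not taken from any spec or browser.

```python
def strip_url_line_breaks(value: str) -> str:
    """Clean up a URL-valued attribute (hypothetical helper).

    Assumption: surrounding whitespace is trimmed, then every CR and LF
    inside the value is removed. Other characters are left untouched.
    """
    trimmed = value.strip(" \t\n\f\r")
    return trimmed.replace("\r", "").replace("\n", "")


# The href from the example above, split across two lines in the source.
href = "http://www.w3.\norg"
cleaned = strip_url_line_breaks(href)
```

With this sketch, the two-line href collapses to http://www.w3.org,
which matches what the link does in practice.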

It's a bit different with spaces. They get inserted into the DOM: as
such in a data: URL, but %-encoded as %20 in an http: URL. And in a
data: URL, they seem to get ignored. I can't find a formal justification
for this, but in practical terms, it might be regarded as useful error
recovery.
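
The two treatments of spaces can be mimicked in Python. This is a sketch
only: it assumes that "ignored" means whitespace is dropped from the
payload before Base64 decoding, the helper name and the sample URLs are
invented for the example, and urllib.parse.quote merely stands in for
the %-encoding step.

```python
import base64
from urllib.parse import quote


def decode_base64_data_url(url: str) -> bytes:
    """Decode a base64 data: URL, ignoring whitespace in the payload.

    Hypothetical helper; assumes the form data:<type>;base64,<payload>.
    """
    header, _, payload = url.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("not a base64 data: URL")
    # Remove spaces, tabs and line breaks before decoding.
    compact = "".join(payload.split())
    return base64.b64decode(compact, validate=True)


# Spaces inside the Base64 payload are dropped before decoding.
data_result = decode_base64_data_url(
    "data:text/plain;base64,SGVs bG8s IHdv cmxk IQ=="
)

# A space in an http: URL path is %-encoded instead.
http_path = quote("/my file.html")
```

Here the data: URL still decodes to the original text, while the path
becomes /my%20file.html.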

Yucca

Received on Friday, 29 November 2013 11:48:47 UTC