> >> It is one thing that %FC needs to work (in some sense - like
> >> quirks-mode pages also have to work even if it is not valid). But if
> >> there is no good necessary usecase for %FC, then we should help
> >> authors avoid problems by encourage validators to warn against it use.
> >
> > There's nothing invalid with %FC.
>
> My suggestion was that it should *become* invalid/get a warning in - let's say -
> HTML5 docs.

Making the literal sequence %FC invalid would be a Bad Thing. It would make it impossible to encode certain resources that are otherwise completely valid.

> > A URI that contains %FC is perfectly valid (check RFC 3986). Because it's a
> > valid URI, it's also a valid IRI.
>
> But an author which -today- inserts %FC is likely to do a mistake - or at least
> make a bad choice, no?

An author who inserts u-umlaut and expects to get %FC is making a mistake. An author who inserts %FC and expects to see u-umlaut is making a mistake (or should be). But an author who inserts %FC because that's what her server expects? Valid. And an author who inserts u-umlaut and expects it to display as u-umlaut and be sent (as %C3%BC in URI form)? Also valid, IMHO.

> > And it's useful in some circumstances. Imagine a server where all the
> > resource names are encoded in iso-8859-1 (or any other legacy (single-byte)
> > encoding). What you tell http (or whatever other scheme/protocol) by
> > using %FC is that you want the resource with the name with the <0xFC> byte in
> > it.
>
> How common are such servers these days?

They should be really really common, since that's what URI *says* %FC means.

> My focus is authors. And of course it could be the author meant %FC. But might
> it not more often be simply a result of a bad %-encoder or on a misconception?

The problem, as I see it, is not with the sequence %FC. It is with the character U+00FC appearing in an HTML document inside a URI path.
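
To make the distinction concrete, here is a minimal sketch, assuming Python's standard `urllib.parse` module. The helper name `try_decode` is hypothetical and exists only for illustration. It shows that U+00FC percent-encodes to different sequences depending on the byte encoding chosen, and that %FC on its own names a byte, not a character.

```python
from urllib.parse import quote, unquote_to_bytes

U_UMLAUT = "\u00fc"  # the character U+00FC (u-umlaut)

# The same character percent-encoded under two different byte encodings.
as_latin1 = quote(U_UMLAUT, encoding="iso-8859-1")
as_utf8 = quote(U_UMLAUT, encoding="utf-8")

# Going the other way, %FC only identifies the byte 0xFC.
raw = unquote_to_bytes("%FC")


def try_decode(data: bytes, encoding: str):
    """Hypothetical helper: decoded text, or None if the bytes are invalid."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


print(as_latin1)                      # %FC
print(as_utf8)                        # %C3%BC
print(raw)                            # b'\xfc'
print(try_decode(raw, "iso-8859-1"))  # ü
print(try_decode(raw, "utf-8"))       # None
```

Read as ISO-8859-1, the byte 0xFC is u-umlaut; read as UTF-8, it is not a valid sequence at all. Nothing in the URI itself says which reading applies.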
I tend to think that the interpretation of %FC using page encoding is bad because an IRI (or URI) lacks the necessary context to make that determination.

I agree with Boris's earlier message on the list that showing %FC is a bad user experience. But shouldn't we be trying to close on a well-defined set of behaviors that content authors (and others) can understand? I think such an approach would include the behavior described above, even at the expense of some usability. And who looks at those really long URIs full of percent gunk anyway? :-))

Addison

Received on Wednesday, 27 July 2011 01:14:01 GMT