- From: Ivan Herman <ivan@w3.org>
- Date: Thu, 13 Nov 2008 13:44:35 +0100
- To: Peter Mika <pmika@yahoo-inc.com>
- CC: public-rdf-in-xhtml-tf@w3.org
- Message-ID: <491C2133.4030403@w3.org>
What I do now in the distiller (not yet uploaded on the system, but will
be in the next release...) is

- strip the URI from trailing and starting white spaces
- always go through the quoting of URIs, ie, to turn the space
  characters into %20, before using them as URI prefixes
- if the (original) URI contains a white space, then a warning is
  generated

I am not sure anything else could be expected from a user agent...

Thanks!

Ivan

Peter Mika wrote:
> I'm not sure either... As I'm too lazy to read the whole spec, I did
> some testing in java, where...
>
> URI uri1 = new URI("http://creativecommons.org/ns #");
>
> throws a URI syntax exception
>
> but interestingly
>
> URI uri2 = new URI("http://creativecommons.org/ns%20#");
>
> doesn't.
>
> In any case, there is an appendix of the URI specification which seems
> to put the burden of removing whitespaces on the processing agent:
>
> http://labs.apache.org/webarch/uri/rfc/rfc3986.html#delimiting
>
> Quoting:
>
> For robustness, software that accepts user-typed URI should attempt to
> recognize and strip both delimiters and embedded whitespace.
>
> For example, the text
>
> Yes, Jim, I found it under "http://www.w3.org/Addressing/",
> but you can probably pick it up from <ftp://foo.example.
> com/rfc/>. Note the warning in <http://www.ics.uci.edu/pub/
> ietf/uri/historical.html#WARNING>.
>
> contains the URI references
>
> http://www.w3.org/Addressing/
> ftp://foo.example.com/rfc/
> http://www.ics.uci.edu/pub/ietf/uri/historical.html#WARNING
>
> End quote.
>
> Cheers,
> Peter
>
> Ivan Herman wrote:
>> I actually wonder...
>>
>> RDFa uses the xmlns syntax for URI prefixing only. Ie, the only thing
>> that counts is whether it is a valid URI. If the result of the
>> processing is to generate
>>
>> http://creativecommons.org/ns&20#
>>
>> that _is_ a valid URI, isn't it? Ie, I guess the bug in the current
>> distiller code is that URI-s should be properly quoted.
>>
>> Having said that, such setting is probably an error, so if there is a
>> space in the string than a warning is probably in order. But, who knows,
>> some crazy users may want to use such a URI...
>>
>> Ivan
>>
>> Ivan Herman wrote:
>>
>>> Hi Peter,
>>>
>>> thanks for the note. I will have a look into it but yes, the tool should
>>> probably warn...
>>>
>>> Ivan
>>>
>>> Peter Mika wrote:
>>>
>>>> Hi All,
>>>>
>>>> We have found another corner case while looking at all the wonderful
>>>> RDFa on the Web:
>>>>
>>>> The page at [1] contains:
>>>>
>>>>
>>>> This
>>>> work by <a
>>>> xmlns:cc="http://creativecommons.org/ns
>>>> #
>>>> "
>>>>
>>>> which is probably not intended (the page is broken in some sense). When
>>>> run through either the XSLT or the Distiller this
>>>> becomes:
>>>>
>>>> <cc:attributionName xmlns:cc="http://creativecommons.org/ns #">New
>>>> Jersey State Auto
>>>> Auction</cc:attributionName>
>>>>
>>>> which is normalized [1] as
>>>> xmlns:cc="http://creativecommons.org/ns
>>>> <http://creativecommons.org/ns >;#">
>>>>
>>>> It seems to me that what you get is XML well-formed but not
>>>> namespace-well-formed [2] because the attribute value is not a valid
>>>> URI.
>>>>
>>>> Not sure really what to do about this but the output is not very
>>>> useful... should the tools raise some warning?
>>>>
>>>> Thanks,
>>>> Peter
>>>>
>>>> [1] http://www.w3.org/TR/REC-xml/#AVNormalize
>>>> [2] http://www.w3.org/TR/REC-xml-names/#Conformance
>>>>
>>>> [1] http://www.njstateauto.com/preowned/index.cfm?make=Mercedes-Benz
>>>>
>>
>

--
Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf
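
For readers who want to try the three steps listed at the top of this message, here is a minimal sketch in Python. It is not the distiller's actual code: the function name normalize_prefix_uri is hypothetical, and it assumes that "the (original) URI contains a white space" means any whitespace in the raw attribute value, including leading or trailing whitespace. The set of characters left unescaped is also an assumption (the RFC 3986 reserved characters plus "%", so that existing escapes such as %20 are not encoded twice).

```python
import warnings
from urllib.parse import quote

# Characters left untouched by quote(): RFC 3986 gen-delims and sub-delims,
# plus "%" so already-escaped sequences (e.g. "%20") are not double-encoded.
# This choice is an assumption, not taken from the distiller.
_SAFE = ":/?#[]@!$&'()*+,;=%"


def normalize_prefix_uri(raw):
    """Hypothetical helper: warn on whitespace, strip, then percent-encode."""
    # Warn if the original value contains any whitespace at all.
    if any(ch.isspace() for ch in raw):
        warnings.warn(f"namespace URI contains whitespace: {raw!r}")
    # Remove leading and trailing whitespace.
    stripped = raw.strip()
    # Percent-encode what remains; an embedded space becomes %20.
    return quote(stripped, safe=_SAFE)
```

With the value from Peter's example, normalize_prefix_uri("http://creativecommons.org/ns #") emits one warning and returns "http://creativecommons.org/ns%20#", which is the form his Java test accepted.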
Received on Thursday, 13 November 2008 12:45:16 UTC