- From: Ivan Herman <ivan@w3.org>
- Date: Thu, 13 Nov 2008 13:44:35 +0100
- To: Peter Mika <pmika@yahoo-inc.com>
- CC: public-rdf-in-xhtml-tf@w3.org
- Message-ID: <491C2133.4030403@w3.org>
What I do now in the distiller (not yet uploaded on the system, but will
be in the next release...) is
- strip leading and trailing white space from the URI
- always go through the quoting of URIs, i.e., turn the space
characters into %20, before using them as URI prefixes
- if the (original) URI contains white space, then a warning is generated
I am not sure anything else could be expected from a user agent...
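For illustration only, here is a minimal Java sketch of the three steps
listed above (Java to match Peter's test quoted below). It is not the
distiller's code: the class PrefixUriCleaner and its clean method are
hypothetical names, and warning on any white space in the original value,
leading and trailing included, is an assumption about what "contains a
white space" should mean.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not taken from the distiller.
final class PrefixUriCleaner {

    // Returns the cleaned URI string and appends a message to `warnings`
    // if the original value contained any white space (assumed policy).
    static String clean(String raw, List<String> warnings) {
        // Step 1: drop leading/trailing characters up to U+0020.
        String trimmed = raw.trim();
        boolean sawWhiteSpace = trimmed.length() != raw.length();

        // Step 2: percent-encode embedded white space (' ' becomes %20).
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < trimmed.length(); i++) {
            char c = trimmed.charAt(i);
            if (c == ' ' || c == '\t' || c == '\n' || c == '\r') {
                sawWhiteSpace = true;
                out.append(String.format("%%%02X", (int) c));
            } else {
                out.append(c);
            }
        }

        // Step 3: warn, since such a value is probably an authoring error.
        if (sawWhiteSpace) {
            warnings.add("white space in prefix URI: \"" + raw + "\"");
        }
        return out.toString();
    }

    public static void main(String[] args) throws URISyntaxException {
        List<String> warnings = new ArrayList<>();
        String cleaned = clean(" http://creativecommons.org/ns # ", warnings);
        URI uri = new URI(cleaned); // parses once the space is escaped
        System.out.println(uri + " (warnings: " + warnings.size() + ")");
    }
}
```

The warning is collected rather than thrown so that processing can continue
with the escaped URI, which is the behaviour described above.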
Thanks!
Ivan
Peter Mika wrote:
> I'm not sure either... As I'm too lazy to read the whole spec, I did
> some testing in Java, where...
>
> URI uri1 = new URI("http://creativecommons.org/ns #");
>
> throws a URI syntax exception
>
> but interestingly
>
> URI uri2 = new URI("http://creativecommons.org/ns%20#");
>
> doesn't.
>
> In any case, there is an appendix of the URI specification which seems
> to put the burden of removing whitespaces on the processing agent:
>
> http://labs.apache.org/webarch/uri/rfc/rfc3986.html#delimiting
>
> Quoting:
>
> For robustness, software that accepts user-typed URI should attempt to
> recognize and strip both delimiters and embedded whitespace.
>
> For example, the text
>
> Yes, Jim, I found it under "http://www.w3.org/Addressing/",
> but you can probably pick it up from <ftp://foo.example.
> com/rfc/>. Note the warning in <http://www.ics.uci.edu/pub/
> ietf/uri/historical.html#WARNING>.
>
> contains the URI references
>
> http://www.w3.org/Addressing/
> ftp://foo.example.com/rfc/
> http://www.ics.uci.edu/pub/ietf/uri/historical.html#WARNING
>
> End quote.
>
> Cheers,
> Peter
>
> Ivan Herman wrote:
>> I actually wonder...
>>
>> RDFa uses the xmlns syntax for URI prefixing only. Ie, the only thing
>> that counts is whether it is a valid URI. If the result of the
>> processing is to generate
>>
>> http://creativecommons.org/ns%20#
>>
>> that _is_ a valid URI, isn't it? Ie, I guess the bug in the current
>> distiller code is that URI-s should be properly quoted.
>>
>> Having said that, such setting is probably an error, so if there is a
>> space in the string then a warning is probably in order. But, who knows,
>> some crazy users may want to use such a URI...
>>
>> Ivan
>>
>> Ivan Herman wrote:
>>
>>> Hi Peter,
>>>
>>> thanks for the note. I will have a look into it but yes, the tool should
>>> probably warn...
>>>
>>> Ivan
>>>
>>> Peter Mika wrote:
>>>
>>>> Hi All,
>>>>
>>>> We have found another corner case while looking at all the wonderful
>>>> RDFa on the Web:
>>>>
>>>> The page at [1] contains:
>>>>
>>>>
>>>> This
>>>> work by <a
>>>> xmlns:cc="http://creativecommons.org/ns
>>>> #
>>>> "
>>>>
>>>> which is probably not intended (the page is broken in some sense). When
>>>> run through either the XSLT or the Distiller this
>>>> becomes:
>>>>
>>>> <cc:attributionName xmlns:cc="http://creativecommons.org/ns #">New
>>>> Jersey State Auto
>>>> Auction</cc:attributionName>
>>>>
>>>> which is normalized [1] as
>>>> xmlns:cc="http://creativecommons.org/ns #">
>>>>
>>>> It seems to me that what you get is XML well-formed but not
>>>> namespace-well-formed [2] because the attribute value is not a valid
>>>> URI.
>>>>
>>>> Not sure really what to do about this but the output is not very
>>>> useful... should the tools raise some warning?
>>>>
>>>> Thanks,
>>>> Peter
>>>>
>>>> [1] http://www.w3.org/TR/REC-xml/#AVNormalize
>>>> [2] http://www.w3.org/TR/REC-xml-names/#Conformance
>>>>
>>>> [1] http://www.njstateauto.com/preowned/index.cfm?make=Mercedes-Benz
>>>>
>>>>
>>
>>
>
--
Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf
Received on Thursday, 13 November 2008 12:45:16 UTC