Re: A question for RDF parser implementers - whitespace

At 12:59 11/07/04 -0400, Norman Walsh wrote:

>/ Graham Klyne <GK@ninebynine.org> was heard to say:
>| At 07:42 09/07/04 -0400, Norman Walsh wrote:
>|>I think if you are parsing a typed literal and you know you're parsing
>|>a typed literal, you should collapse the whitespace before passing the
>|>value on to down-stream applications.
>|>
>|>Given that the RDF spec says that whitespace is eliminated by
>|>validation, I can easily imagine writing an application that assumes
>|>typed values like integers and URIs won't have insignificant
>|>whitespace around them.
>|
>| Hmmm.  Let's try a test case.
>|
>| Does this:
>|
>| <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>|           xmlns:dc="http://purl.org/dc/elements/1.1/">
>|    <rdf:Description rdf:about="http://www.example.org/">
>|       <dc:title>  The trouble with spaces   </dc:title>
>|    </rdf:Description>
>| </rdf:RDF>
>|
>| RDF-entail this:
>|
>| <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>|           xmlns:dc="http://purl.org/dc/elements/1.1/">
>|    <rdf:Description rdf:about="http://www.example.org/">
>|       <dc:title
>| rdf:datatype="http://www.w3.org/2001/XMLSchema#string">  The trouble
>| with spaces   </dc:title>
>|    </rdf:Description>
>| </rdf:RDF>
>|
>| I think it should, but under your suggested regime I think it would not.
>
>Leading and trailing whitespace is significant in strings. The more
>interesting test would be one that includes
>
>      <geo:lat>  42.4   </geo:lat>
>
>Where the datatype of geo:lat is float or double or something like
>that.

Did you mean that to be datatyped?  If not, then the literal graph node 
denotes a string, not a number.  (The <geo:lat> property may define a 
relation that involves interpreting the string as a number, but such an 
interpretation depends on extra-RDF knowledge of the property used.)  As 
far as RDF is concerned, it's no different from:

    <dc:title>  42.4   </dc:title>

If you *did* mean that to be datatyped, as in:

    <geo:lat rdf:datatype="http://www.w3.org/2001/XMLSchema#double"
     >  42.4   </geo:lat>

then this goes to the heart of my question.  I'd expect that to be treated 
the same as:

    <geo:lat rdf:datatype="http://www.w3.org/2001/XMLSchema#double"
     >42.4</geo:lat>

But I think the mere fact of being datatyped is insufficient to determine 
the correct whitespace handling, as the xsd:string case shows.
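
To make the point concrete, here is a minimal Python sketch of whitespace 
handling that is keyed on the datatype's whiteSpace facet rather than on 
the mere presence of a datatype.  The facet values in the table are the 
ones XML Schema Part 2 assigns to these built-in types; the helper name 
normalize_lexical, the selection of table entries, and the choice to leave 
literals of unknown datatype untouched are my own assumptions for 
illustration, not something the RDF specs prescribe.

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"

# whiteSpace facet per datatype, as defined for these XML Schema built-ins.
# Only a handful of types are listed; this table is illustrative, not complete.
WHITESPACE_FACET = {
    XSD + "string": "preserve",
    XSD + "normalizedString": "replace",
    XSD + "token": "collapse",
    XSD + "double": "collapse",
    XSD + "float": "collapse",
    XSD + "integer": "collapse",
    XSD + "anyURI": "collapse",
}


def normalize_lexical(lexical, datatype=None):
    """Hypothetical helper: apply the datatype's whiteSpace facet to a literal."""
    if datatype is None:
        # Plain literal: it denotes the string itself, spaces included.
        return lexical
    # Assumption: for a datatype we do not know, leave the text untouched.
    facet = WHITESPACE_FACET.get(datatype, "preserve")
    if facet == "preserve":
        return lexical
    # "replace": turn each tab, line feed and carriage return into a space.
    replaced = re.sub(r"[\t\n\r]", " ", lexical)
    if facet == "replace":
        return replaced
    # "collapse": additionally squeeze runs of spaces and trim both ends.
    return re.sub(r" +", " ", replaced).strip(" ")


print(repr(normalize_lexical("  42.4   ", XSD + "double")))
print(repr(normalize_lexical("  The trouble with spaces   ", XSD + "string")))
print(repr(normalize_lexical("  The trouble with spaces   ")))
```

Under this sketch the xsd:double value "  42.4   " comes out as "42.4", 
while the xsd:string literal and the plain literal keep their leading and 
trailing spaces, so the dc:title entailment above would still hold.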

#g
--



------------
Graham Klyne
For email:
http://www.ninebynine.org/#Contact

Received on Monday, 12 July 2004 05:36:58 UTC