Re: A question for RDF parser implementers - whitespace

Graham Klyne wrote:
> Hi,

Hi Graham,

You're raising an interesting point here. Currently, Sesame's Rio parser
package does not do any whitespace normalization. However, it does have
'support' (validation & normalization) for a number of the built-in
datatypes.

I just dived a little deeper into the XML Schema datatypes document to
find out what should be done with these built-in datatypes. If I
understood correctly, the value of the whiteSpace facet is 'collapse'
for all built-in datatypes except xsd:string ('preserve') and
xsd:normalizedString ('replace'). I guess this means that for the
other datatypes tabs and line breaks should be replaced by spaces,
runs of spaces reduced to a single space, and any leading and trailing
whitespace removed.
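
As a rough illustration of what such normalization could look like
(a Python sketch, not the Rio code; the function name and the lookup
table are hypothetical, and leaving non-XSD datatypes untouched is my
own assumption):

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"

# whiteSpace facet values per XML Schema Part 2. Built-in XSD datatypes
# not listed here are treated as 'collapse' in this sketch.
WHITESPACE_FACET = {
    XSD + "string": "preserve",
    XSD + "normalizedString": "replace",
}


def normalize_lexical_form(lexical, datatype):
    """Apply the whiteSpace facet of an XSD datatype to a lexical form."""
    # Assumption: datatypes outside the XSD namespace are left alone,
    # because their whitespace rules are unknown to the parser.
    if not datatype.startswith(XSD):
        return lexical
    facet = WHITESPACE_FACET.get(datatype, "collapse")
    if facet == "preserve":
        return lexical
    # 'replace': each tab, line feed and carriage return becomes a space.
    replaced = re.sub(r"[\t\n\r]", " ", lexical)
    if facet == "replace":
        return replaced
    # 'collapse': also reduce runs of spaces to one and trim both ends.
    return re.sub(r" +", " ", replaced).strip(" ")


print(repr(normalize_lexical_form("  42\n", XSD + "integer")))
print(repr(normalize_lexical_form(" a \t b ", XSD + "string")))
```

With this approach the parser can hand over the literal untouched and
the datatype handling decides how much whitespace to strip.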

Arjohn


Graham Klyne wrote:
> I've just noticed something in the RDF syntax which has me wondering 
> how RDF parser implementers are dealing with whitespace in literals.
> 
> The RDF syntax spec 
> (http://www.w3.org/TR/rdf-syntax-grammar/#section-Nodes), section 6.1.9 
> on typed literals, mentions "In XML Schema (part 1) [XML-SCHEMA1], white 
> space normalization occurs during validation according to the value of 
> the whiteSpace facet. The syntax mapping used in this document occurs 
> after this, so the whiteSpace facet formally has no further effect."
> 
> But, given that RDF/XML has an open-ended tag set, schema validation of 
> RDF/XML doesn't make a lot of sense.  Further, I don't have an XML 
> schema processor to do such validation.
> 
> How are other implementers dealing with this?  My inclination is to pass 
> the original literal, with whitespace intact, and allow the subsequent 
> datatype processing to treat it with the same effect as if the 
> whitespace had been eliminated by schema validation.  For this purpose, 
> the whiteSpace facet is implicitly part of the datatype.
> 
> Does anyone have any other ideas on this?
> 
> #g

-- 
arjohn.kampman@aduna.biz
Aduna BV - http://aduna.biz/
Prinses Julianaplein 14-b, 3817 CS Amersfoort, The Netherlands
tel. +31-(0)33-4659987  fax. +31-(0)33-4659987

Received on Wednesday, 7 July 2004 04:58:03 UTC