RE: whitespace facet - xmlsch-02

> -----Original Message-----
> From: ext Jeremy Carroll [mailto:jjc@hpl.hp.com]
> Sent: 12 March, 2003 14:20
> To: w3c-rdfcore-wg@w3.org
> Subject: whitespace facet - xmlsch-02
> 
> 
> 
> 
> In 1.2 in
> http://lists.w3.org/Archives/Public/www-rdf-comments/2003JanMar/0489.html
> the XML Schema WG say:
> [[
> In brief, XML Schema's simple types each define a whitespace facet,
>     which governs the kind of whitespace pre-processing done by an XML
>     Schema processor before the lexical form is checked for type
>     validity.
> ]]
> 
> We have ignored this, and all facets, to date.
> 
> One result, which may cause confusion, is that:
> 
> <rdf:Description>
>   <eg:p rdf:datatype="&xsd;int"> 3 </eg:p>
> </rdf:Description>
> 
> has an ill-formed literal (however we say that).
> 
> It is ill-formed because we make no provision for the whitespace
> normalization envisaged in XML Schema Datatypes.
> 
> We could have decided that:
> [[
> For those RDF datatypes that are XML Schema datatypes,
> - the facets are a part of their formal structure in RDF
> - whitespace handling is done before the lexical-to-value mapping, in
>   conformance with XML Schema Part 2.
> ]]

I think this is the way to go.

> We would have significant editorial obstacles; the most obvious way
> to describe this would be as part of the lexical-to-value mapping,
> which is *not* how XML Schema do it.

Well, XML Schema can be seen as simply partitioning their L2V
mapping into two phases: (1) whitespace handling and (2) mapping.
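As a rough illustration of that two-phase view, here is a minimal
Python sketch. The function names (apply_whitespace_facet, xsd_int_l2v,
rdf_int_l2v) are made up for this example and appear in neither spec,
and the xsd:int check is simplified to an optional sign plus digits
within the 32-bit range. The three facet values (preserve, replace,
collapse) are the ones XML Schema Part 2 defines.

```python
import re


def apply_whitespace_facet(text, facet):
    """Phase 1: whitespace pre-processing per the whiteSpace facet."""
    if facet == "preserve":
        return text
    # "replace": every tab, line feed and carriage return becomes a space.
    replaced = re.sub(r"[\t\n\r]", " ", text)
    if facet == "replace":
        return replaced
    if facet == "collapse":
        # "collapse": after "replace", squeeze runs of spaces to one
        # and drop leading and trailing spaces.
        return re.sub(r" +", " ", replaced).strip(" ")
    raise ValueError("unknown whiteSpace facet: %r" % facet)


def xsd_int_l2v(lexical):
    """Phase 2: strict mapping from a lexical form to a value (simplified)."""
    if not re.fullmatch(r"[+-]?[0-9]+", lexical):
        raise ValueError("not a valid xsd:int lexical form: %r" % lexical)
    value = int(lexical)
    if not -2147483648 <= value <= 2147483647:
        raise ValueError("out of range for xsd:int: %r" % lexical)
    return value


def rdf_int_l2v(chars):
    """The more general mapping: both phases composed into one function."""
    return xsd_int_l2v(apply_whitespace_facet(chars, "collapse"))


print(rdf_int_l2v(" 3 "))  # prints 3
```

Calling xsd_int_l2v(" 3 ") directly raises ValueError, which is the
situation the example above is in today; the composed function accepts
the same string.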

RDF just treats L2V mapping in a more general manner, whereby
some lexical form (character sequence) is mapped to some value.

Most (if not all) scanning functions for programming languages
also take into account whitespace and other superfluous markup
when mapping a character sequence to a value.
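Python's built-in numeric parsers are one concrete case: they skip
leading and trailing whitespace, but not whitespace inside the number.
The helper name scans_as_int below is invented for this sketch.

```python
def scans_as_int(chars):
    """Return True if Python's int() accepts the character sequence."""
    try:
        int(chars)
        return True
    except ValueError:
        return False


print(int(" 3 "))            # prints 3
print(float("\t3.5\n"))      # prints 3.5
print(scans_as_int("3 4"))   # prints False: interior whitespace is rejected
```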

I don't see an issue with RDF offering a more general description
of the L2V mapping process than XML Schema (or any other datatype
framework).

Perhaps we could simply add a note that states that some/most
datatyping frameworks define a means of discarding superfluous
characters from character strings as part of or prior to actual
mapping, and leave it at that.

> In the corresponding XML Schema case the string " 3 " in the document
> has its whitespace stripped to "3" early, and "3" is the lexical value
> that is mapped to the integer three.
>
> I see a large amount of RDF data and implementations that are
> confused about whitespace handling in RDF. Partly this may reflect on
> M&S, but some examples are (under)informed by our WDs.


Perhaps an example in the Primer, with a pointer to the XML Schema
specs on whitespace facets, would suffice?

Patrick

Received on Monday, 17 March 2003 05:33:09 UTC