Re: Canonical form requirement in Turtle (and N3?) spec from Dan Connolly on 2006-01-16 (public-cwm-talk@w3.org from January to March 2006)

From: Dan Connolly <connolly@w3.org>
Date: Mon, 16 Jan 2006 08:58:43 -0600
To: Arjohn Kampman <arjohn.kampman@aduna.biz>
Cc: dave@dajobe.org, public-cwm-talk@w3.org
Message-Id: <3d367c902103d6d7b2617db73aa29900@w3.org>

On Jan 16, 2006, at 8:19 AM, Arjohn Kampman wrote:
> Dear all,
>
> (this is a follow-up on a private e-mail Q&A between Dave and me)
>
> The Turtle specification (2006/01/02 version) indicates that a parser
> for this format should normalize any integers and booleans to their
> canonical form.

Let's see... http://www.dajobe.org/2004/01/turtle/ 2006/01/02 21:39:51

I don't see "normalize". Ah...

[[
Interpreted as an xsd:integer and generates a datatyped literal with
the datatype uriref http://www.w3.org/2001/XMLSchema#integer and 
canonical lexical representation of xsd:integer which includes allowing 
no leading zeros.
]]
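
To make the quoted rule concrete, here is a minimal sketch of what "canonical lexical representation of xsd:integer" could mean for a parser. It assumes the XML Schema rule that the canonical form has no leading zeros and no optional "+" sign; the function name canonical_integer is hypothetical and does not come from the Turtle spec or from cwm.

```python
import re

# Lexical space of xsd:integer: optional sign followed by one or more digits.
_INTEGER = re.compile(r"[+-]?[0-9]+")


def canonical_integer(lexical: str) -> str:
    """Return the canonical lexical form of an xsd:integer (hypothetical helper)."""
    if not _INTEGER.fullmatch(lexical):
        raise ValueError(f"not an xsd:integer lexical form: {lexical!r}")
    # int() discards leading zeros and a leading '+'; str() never emits '+'.
    return str(int(lexical))


print(canonical_integer("0003"))
```

Under this reading, a parser that sees 0003 would emit the literal "3" typed as xsd:integer, which is exactly the behaviour being questioned in this thread.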

>  For doubles and decimals, however, this is not required.
> Dave thought that he tried to align the Turtle spec with cwm/N3 here,
> but wasn't 100% sure about this.

I can understand how the syntax of the languages can be said to
align or not, but I would need more details to understand how the
comments in the turtle grammar can be said to align or not.

Is conformance of "turtle parsers" specified? Ah...
"Systems conforming to Turtle ..."

So the turtle "parser" interface is specified as something that
turns turtle into RDF/XML. I suppose SPARQL's CONSTRUCT
feature can be used to emulate that.

So it's a question of what a query such as this returns:

CONSTRUCT { <#x> rdf:value ?v }
   WHERE { <#x> rdf:value ?v }

given input data:
   <#x> rdf:value 0003 .

or

   <#x> rdf:value 0003.1 .

   <#x> rdf:value 0003.1e7 .
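
As a rough illustration of why these three inputs matter, the sketch below classifies a numeric token by datatype. It assumes lexical rules along the lines of the Turtle grammar (digits only for integers, a decimal point for decimals, an exponent for doubles); the regular expressions and the name classify_numeric are my own approximation, not text from the spec.

```python
import re

# Approximate lexical rules, checked from most to least specific.
_INTEGER = re.compile(r"[+-]?[0-9]+")
_DECIMAL = re.compile(r"[+-]?(?:[0-9]+\.[0-9]*|\.[0-9]+)")
_DOUBLE = re.compile(r"[+-]?(?:[0-9]+\.[0-9]*|\.[0-9]+|[0-9]+)[eE][+-]?[0-9]+")


def classify_numeric(token: str) -> str:
    """Return the XSD datatype a bare numeric token would get (hypothetical helper)."""
    if _INTEGER.fullmatch(token):
        return "xsd:integer"
    if _DECIMAL.fullmatch(token):
        return "xsd:decimal"
    if _DOUBLE.fullmatch(token):
        return "xsd:double"
    raise ValueError(f"not a numeric token: {token!r}")


for token in ("0003", "0003.1", "0003.1e7"):
    print(token, classify_numeric(token))
```

So the three test inputs exercise one datatype each: integer, decimal and double.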

You might try those queries on some SPARQL services.
   http://esw.w3.org/topic/DawgShows

> I find the differences between integers and booleans on the one side,
> and doubles and decimals on the other a bit strange.

Canonical forms for doubles can be hairy, so I can see why there is an
exception for them, but why treat decimals and integers differently?
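
For comparison, canonicalizing a decimal is not much harder than canonicalizing an integer. The sketch below assumes the XML Schema 1.0 rule as I understand it: a decimal point is required, with at least one digit on each side, no other leading or trailing zeros, and no "+" sign. The name canonical_decimal is hypothetical, and mapping a negative zero to "0.0" is my own choice rather than something the thread settles.

```python
import re

# Optional sign, optional whole digits, optional '.' plus fraction digits.
_DECIMAL = re.compile(r"([+-]?)([0-9]*)(?:\.([0-9]*))?")


def canonical_decimal(lexical: str) -> str:
    """Return an XSD 1.0 style canonical form of an xsd:decimal (hypothetical helper)."""
    m = _DECIMAL.fullmatch(lexical)
    # Require at least one digit somewhere; reject "", "+", "." and the like.
    if not m or not (m.group(2) or m.group(3)):
        raise ValueError(f"not an xsd:decimal lexical form: {lexical!r}")
    sign, whole, frac = m.group(1), m.group(2), m.group(3) or ""
    whole = whole.lstrip("0") or "0"  # drop leading zeros, keep one digit
    frac = frac.rstrip("0") or "0"    # drop trailing zeros, keep one digit
    # Assumption: zero is written without a sign.
    negative = sign == "-" and (whole != "0" or frac != "0")
    return ("-" if negative else "") + whole + "." + frac


print(canonical_decimal("0003.1"))
```

This works purely on the digit string, so no precision is lost. Doubles need a mantissa and exponent to be normalized, which is where the hairiness comes in.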

>  IMHO, a parser
> should either normalize all values to their canonical form, or none of
> them. I myself have a strong preference for the latter as I don't think
> of value normalization as a task for a parser.
>
> Comments anyone?
>
> Regards,
>
> Arjohn
>

-- 
Dan Connolly, W3C http://www.w3.org/People/Connolly/

Received on Monday, 16 January 2006 14:58:51 UTC