Re: normalization issues with Turtle spec tests

From: Ruben Verborgh <ruben.verborgh@ugent.be>
Date: Sun, 7 Apr 2013 08:46:21 +0200
Cc: public-rdf-comments@w3.org, Gregg Kellogg <gregg@greggkellogg.net>, gavin@carothers.name
Message-Id: <A6A2C9F0-9044-4B64-AA2F-29322F19BA33@ugent.be>
To: Eric Prud'hommeaux <eric@w3.org>

Dear Eric,

> The tests are enforcing checking that the term generated from e.g. '"+1"^^xsd:integer' is distinct from '1' (which is the same as '"1"^^xsd:integer').
>  https://dvcs.w3.org/hg/rdf/raw-file/default/rdf-concepts/index.html#dfn-literal-equality
> This might be an opportunity to comment out some code.

1) Do you perhaps know the reason for this choice, and has this changed somewhere along the way?
If I take
    @prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
    <a> <b> "1"^^xsd:integer.
    <a> <b> "+1"^^xsd:integer.
    <a> <b> "0001"^^xsd:integer.
and put it through cwm, I get
    <a>     <b> 1,
                1 .
and if I put that again through cwm, I get
    <a>     <b> 1 .
However, does the section you point to mean this has changed, so that '"+1"^^xsd:integer' and '"0001"^^xsd:integer' are no longer equivalent to '1'?
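
To make the distinction in question concrete, here is a minimal Python sketch. It is not cwm's behavior or any library's API; `Literal`, `same_term`, and `same_value` are hypothetical names chosen for illustration. It assumes that term equality compares lexical form and datatype character by character, while value equality compares the integers those lexical forms denote.

```python
from typing import NamedTuple

XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


class Literal(NamedTuple):
    """Hypothetical literal: a lexical form plus a datatype IRI."""
    lexical: str
    datatype: str


def same_term(a: Literal, b: Literal) -> bool:
    # Term equality: lexical form and datatype must match exactly.
    return a == b


def same_value(a: Literal, b: Literal) -> bool:
    # Value equality, sketched for xsd:integer only.
    # Note: Python's int() is more lenient than the xsd:integer
    # lexical space (it accepts whitespace and underscores).
    if a.datatype == b.datatype == XSD_INTEGER:
        return int(a.lexical) == int(b.lexical)
    return a == b


one = Literal("1", XSD_INTEGER)
plus_one = Literal("+1", XSD_INTEGER)
padded = Literal("0001", XSD_INTEGER)
```

Under these assumptions the three literals from the Turtle snippet are distinct terms but denote the same value.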

2) Directly above the “literal equality” section, https://dvcs.w3.org/hg/rdf/raw-file/default/rdf-concepts/index.html#dfn-language-tag says:
"The language tag must be well-formed according to section 2.2.9 of [BCP47], and must be normalized to lowercase."
So test langtagged_LONG_with_subtag seems wrong to answer with '@en-UK'.
Furthermore, the definition of “literal equality” might be slightly off then. Are the following equal?
- "test"@en-uk
- "test"@en-UK
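
If tags are indeed normalized to lowercase, the comparison would amount to a case-insensitive match, which is also how BCP47 treats tags. A rough Python sketch, with hypothetical function names and no well-formedness checking:

```python
def normalize_langtag(tag: str) -> str:
    # Assumption: normalization means plain lowercasing of the whole tag.
    return tag.lower()


def same_langtag(a: str, b: str) -> bool:
    # Two tags match when their lowercase forms are identical.
    return normalize_langtag(a) == normalize_langtag(b)
```

With this reading, `same_langtag("en-uk", "en-UK")` holds, and a parser that normalizes would emit `en-uk` for both.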

3) The whole equality thing makes it quite tricky to find triples. For instance, if I search for:
   triples.find(any, any, 1), what should be returned?
- triples with an object of '1'?
- triples with an object of '"1"^^xsd:integer'?
- triples with an object of '"+1"^^xsd:integer'?
- triples with an object of '"01"^^xsd:integer'?
- triples with an object of '"000001"^^xsd:integer'?
How do existing implementations deal with this?

So yes, I might comment out some code.
But then the result will either be more difficult to work with for the library user (because of inequalities),
or far less performant (as I’d have to index a normalized version and still store and return the original).
I really wonder why the choice against normalization was made.
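
The second option, indexing a normalized form while still storing and returning the original, could look roughly like the Python sketch below. `TripleStore`, `normalize`, and `find_by_object` are hypothetical names, not an existing library's API, and only xsd:integer is normalized. The extra cost is the normalization step on every insert and lookup.

```python
from collections import defaultdict

XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def normalize(lexical, datatype):
    """Return a canonical index key for a literal (xsd:integer only)."""
    if datatype == XSD_INTEGER:
        # "+1", "01" and "000001" all collapse to "1".
        return (str(int(lexical)), datatype)
    return (lexical, datatype)


class TripleStore:
    def __init__(self):
        # Normalized object key -> triples holding the original literal.
        self._by_object = defaultdict(list)

    def add(self, subject, predicate, lexical, datatype):
        key = normalize(lexical, datatype)
        self._by_object[key].append((subject, predicate, (lexical, datatype)))

    def find_by_object(self, lexical, datatype):
        # Look up by value, but return triples with their original forms.
        return list(self._by_object.get(normalize(lexical, datatype), []))


store = TripleStore()
for form in ("1", "+1", "0001"):
    store.add("a", "b", form, XSD_INTEGER)

matches = store.find_by_object("1", XSD_INTEGER)
```

Here `matches` contains all three triples, each with its original lexical form intact, so a caller can still tell '"+1"^^xsd:integer' apart from '"1"^^xsd:integer' after the lookup.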

> I use SWObjects with a command line "-d test.nt --compare ref.nt"
Wonderful, I will try that. Seems much easier.


Received on Sunday, 7 April 2013 06:50:24 UTC
