
Fwd: XML schema for provenance

From: James Cheney <jcheney@inf.ed.ac.uk>
Date: Fri, 23 Nov 2012 16:44:42 +0000
To: Provenance Working Group <public-prov-wg@w3.org>
Message-Id: <8EBB83D7-B299-4E43-B4F8-8D8678806FEA@inf.ed.ac.uk>
Please find below initial comments from Henry Thompson on the XML schema and identifier interoperability issues.  He will try to do a more detailed review of the schema in time for the next release.


Tracker, this is ISSUE-553

Begin forwarded message:

> From: ht@inf.ed.ac.uk (Henry S. Thompson)
> Date: November 23, 2012 3:43:43 PM GMT
> To: James Cheney <jcheney@inf.ed.ac.uk>
> Subject: Re: XML schema for provenance
> James Cheney writes:
>> At the last provenance working group face-to-face meeting there was
>> some discussion of how to align the different types of identifiers
>> used in RDF and XML schema (and in the "convenience" notation used
>> in the PROV specifications, PROV-N, which uses RDF-style
>> namespace-qualified identifiers).
>> For example, in PROV-N or RDF, something like the following is legal:
>> // PROV-N
>> document
>>  prefix ex <http://www.example.com/>
>>  entity(ex:42)
>> endDocument
>> // RDF
>> @prefix ex: <http://www.example.com/> .
>> @prefix prov: <http://www.w3.org/ns/prov#> .
>> ex:42 a prov:Entity.
>> but in XML this is not an acceptable QName.  
> But it is an acceptable CURIE, as used e.g. in RDFa.  That doesn't
> make it acceptable in abbreviated form in RDF/XML (so although foo:43
> is a valid CURIE, and <rdf:Description
> rdf:about="#43">...</rdf:Description> is valid RDF/XML,
> <rdf:Description rdf:ID="43">...</rdf:Description> is _not_ valid).
> Similarly, N3 and Turtle have the same constraint as RDF/XML for
> qnames, but not for full URIs.  Which means, btw, that your "//RDF"
> example above, if it's meant to be N3, is in fact not valid.  At least
> not per the BNF I found -- ah, wait -- the most recent Turtle draft
> _does_ allow all digits [1].
>> I was asked to check with you whether there is a standard way of
>> dealing with this.  Do we just tell people to watch out for this or
>> is there some common way to describe the identifiers that are
>> interoperable between XML and RDF?
> So, that situation is in flux.  Full IRI references have never
> constrained their frag-id component to exclude integers.  Wrt
> abbreviated forms of IRIs, CURIEs don't constrain their local part,
> the most recent Turtle draft [1] doesn't exclude integers either, but
> the equivalent non-IRI parts of RDF/XML (and N3 and original Turtle)
> _do_ use the XML Name or NCName production or an approximation
> thereto, and so _do_ exclude integers.
> So you have to be very careful what you cite, and you should help your
> readers by calling out the point explicitly.
>> I think the group would especially appreciate if you had a moment to
>> look over the XML schema being developed for PROV.  It is at:
>> http://dvcs.w3.org/hg/prov/raw-file/default/xml/prov-xml.html
>> and is slated for release as a first public working draft soon,
>> ultimately to be a "note".
> I've had a brief look at this -- the schema is syntactically sound,
> and it at least validates the first two examples (nearly, but the
> error is a common one, and I'll explain in detail later).
> I'll have to get back to you on the schema, as I think I should look
> at it in some detail, and answer the Review Questions, but my time for
> this today is now used up. . .
> ht
> [1] http://www.w3.org/TR/2012/WD-turtle-20120710/
> -- 
>       Henry S. Thompson, School of Informatics, University of Edinburgh
>      10 Crichton Street, Edinburgh EH8 9AB, SCOTLAND -- (44) 131 650-4440
>                Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk
>                       URL: http://www.ltg.ed.ac.uk/~ht/
> [mail from me _always_ has a .sig like this -- mail without it is forged spam]
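
To make the interoperability point above concrete, here is a minimal sketch of a check for whether a prefixed identifier such as ex:42 would also survive as an XML QName. The function names (is_ncname, portable_as_qname, expand) are hypothetical, and the regular expression is only an ASCII approximation of the NCName production, not the full Unicode definition from the XML namespaces specification.

```python
import re

# ASCII-only approximation of the XML NCName production (assumption: the
# real production also admits many non-ASCII characters). The key rule for
# this discussion is that the first character may not be a digit.
_NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def is_ncname(text: str) -> bool:
    """Return True if text matches the approximate NCName pattern."""
    return _NCNAME.fullmatch(text) is not None


def portable_as_qname(name: str) -> bool:
    """Return True if 'prefix:local' is usable both as a CURIE and as a QName."""
    prefix, sep, local = name.partition(":")
    if not sep:
        return False
    return is_ncname(prefix) and is_ncname(local)


def expand(name: str, prefixes: dict) -> str:
    """Expand 'prefix:local' to a full IRI by plain concatenation."""
    prefix, _, local = name.partition(":")
    return prefixes[prefix] + local


if __name__ == "__main__":
    prefixes = {"ex": "http://www.example.com/"}
    for candidate in ("ex:42", "ex:e42"):
        print(candidate, portable_as_qname(candidate), expand(candidate, prefixes))
```

Under these assumptions ex:42 is rejected as a QName while ex:e42 is accepted, and both expand to full IRIs, which place no such restriction on the local part.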

The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
Received on Friday, 23 November 2012 16:45:25 GMT
