- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Fri, 5 Feb 2010 10:58:27 -0500
- To: Dave Beckett <dave@dajobe.org>
- Cc: Steven Pemberton <Steven.Pemberton@cwi.nl>, Toby Inkster <tai@g5n.co.uk>, pfps@research.bell-labs.com, semantic-web@w3.org, sandro@w3.org, Mark Birbeck <mark.birbeck@gmail.com>
* Dave Beckett <dave@dajobe.org> [2010-02-04 06:12-0800]
> (replying to the latest msg in this thread)
>
> Jeremy Carroll wrote:
> ...
> > > > ?s?o:n1.?s2?p2:n2 as a single CURIE
> ...
> > > > ?s?o:n1. ?s2?p2:n2
>
> OK, that's line noise. Turtle should be readable and this is why whitespace
> is a good idea to sometimes mandate or VERY strongly suggest. The turtle
> spec doesn't say that very well and the sparql spec does let you get away
> with this. I'm tempted to make mandatory spaces between components now.

I don't think there's good ROI on chasing down and eliminating paths
that could allow unpleasantly terse expression. I'd favor
backward-compatibility and compatibility with SPARQL instead. I'd say
forward-compatibility is less of an issue, as folks frequently rev
their SemWeb tools.

CURIEs allow you to eliminate similar prefix declarations. This can
lead to more readability in any graph which includes a two-tier
semantics in its names, e.g. a view on an RDB (<stem>/table/pk.value)
or any system which assigns node names hierarchically
(medications/anticoag/warfarin). In this example, I've tweaked the
Uniprot schema to use the LOD naming convention:

  @prefix u: <http://purl.uniprot.org/> .

  u:Proteins/P30090#it a u:Protein ;
      u:mnemonic "UPA3_HUMAN" ;
      u:annotation u:Annotations/P30090-A1#it .

The nodes ending in "#it" are not expressible as qnames. OTOH, if we
allow #, folks have to put whitespace between localnames and comment
characters.

> I don't see the user need to allow such things. If you are worried about
> the storage or network cost of extra spaces, you should compress.
>
> We probably don't need to go all the way to
> "represent any URI in a compact form" (Steven)
> as CURIES need to do in a constrained place - xml/html attribute value,
> since Turtle has a place to write full URIs in all cases, and also has
> additional syntax constraints in order to allow other abbreviated forms.
> The nearest we could get would be any URI that doesn't use the Turtle
> syntax symbols (anywhere) - [];,_. etc.

We could allow full CURIEs by following e.g. :foo. with a space, as you
suggested above. If we don't, we lose the value of calling it a CURIE
(don't get parser/generator re-use, don't get mindshare with folks
reading the spec).

There is still value to liberalizing the grammar for identifiers.
Unfortunately, many identifiers we want to import into the SemWeb
contain [\._,]. Machine generation of readable, valid Turtle containing
these identifiers is easier if they are allowed in the "local name".

I think our choices look like:

  • full CURIEs: use whitespace to disambiguate e.g. :foo. from :foo .
  • liberalize localname: allow '/'s and other non-punctuating chars
  • leave it alone

My preferences are (descending):

  1. full CURIEs
  2. leave it alone
  3. liberalize localname

--
-ericP
Received on Friday, 5 February 2010 15:59:09 UTC