- From: Sandro Hawke <sandro@w3.org>
- Date: Fri, 09 Sep 2011 11:27:02 -0400
- To: Andy Seaborne <andy.seaborne@epimorphics.com>
- Cc: public-rdf-wg@w3.org
On Fri, 2011-09-09 at 10:05 +0100, Andy Seaborne wrote:
>
> On 07/09/11 18:34, Sandro Hawke wrote:
> > I argued in today's meeting, off the cuff, that option 2 (in Pat's
> > email [1]) offers only aesthetic improvements, while options 3 and 4
> > will result in simpler code.  I claimed that without simpler code,
> > we'd be better off staying with option 1 (no change).  Andy asked me
> > to provide a concrete example of my claimed code simplification.
>
> Thank you for taking the time to write these out.
>
> > Perhaps Andy is thinking, quite correctly, that a good API (like
> > Jena's?) already encapsulates and hides the trifold nature of RDF
> > 2004 literals.  I would agree that simplifying or restructuring RDF
> > literals won't change/simplify application code in that world.
> >
> > So the examples I've come up with depend on stepping out of the clean
> > OO model.  Without making any normative claims here, or getting into
> > the type-system wars, I observe that a lot of code in the world is
> > like this, and it's probably not helpful to tell people who write this
> > kind of code not to.
> >
> > Example 1.  Encode triples in some temporary/hack syntax, e.g. for
> > talking to a front-end which doesn't have an RDF library
> >
> >     def encode_term_option_1(t):
> >         # assumes "simple literals" already folded into datatyped literals
> >         # as per RDF WG Aug 2011; otherwise there would be another branch.
> >         if t.is_literal:
> >             if t.language:
> >                 return "lt_literal("+quoted(t.text)+","+quoted(t.language)+")"
> >             else:
> >                 return "dt_literal("+quoted(t.lexrep)+","+quoted(t.datatype)+")"
> >         else:
> >             return "node("+quoted(t.iri)+")"
> >
> >     def encode_term_option_3_or_4(t):
> >         # additionally assumes folded-in language tags
> >         if t.datatype:
> >             return "literal("+quoted(t.lexrep)+","+quoted(t.datatype)+")"
> >         else:
> >             return "node("+quoted(t.iri)+")"
>
> Obviously it's as much a matter of coding style, but if the literal has 3
> slots (lexrep, language, datatype), encoding to send to a non-RDF app
> suggests to me something completely regular, so this CSV-like approach is
> what would occur to me:
>
>     def encode_term_option_1_or_2_or_3_or_4(t):
>         if t.is_literal:
>             return "literal("+quoted(t.lexrep)+",@"+t.language+","+quoted(t.datatype)+")"
>         else:
>             return "node("+quoted(t.iri)+")"
>
> Empty string for no language (cf. XML).
>
> For any of the proposals, having a 3-slot internal representation seems
> to me quite natural, even option 4, which (contrary to the charter's
> framing) breaks all data which uses a language tag.  And as we already
> altered simple literals, that's now all plain literals.

I certainly agree this works.  I can definitely live with Option 1
(what we have now).  I just think RDF would be more elegant with Option
3a, and the handling of language tags moved off into the handling of
datatypes.

> From experience, sending just the lexical part to many non-RDF-aware
> applications works very well, including URIs as strings.  It's
> information lossy so it's not about passing information that will be
> republished.

I think you and I are agreed Option 4 is problematic for this reason,
among others.

   -- Sandro

>     Andy
>
> >
> >     def encode_triple(s,p,o):
> >         return (encode_term(s)+","+
> >                 encode_term(p)+","+
> >                 encode_term(o))
> >
> >
> > Example 2.  Look for a string in any kind of literal (regardless of
> > language).  This is for a naive search, where the user just types some
> > stuff, without us knowing their language, or whether it's part of a
> > literal.
> >
> >     def search_option_1(triples, keyword):
> >         for s,p,o in triples:
> >             if o.is_literal:
> >                 value = o.text if o.language else o.lexrep
> >                 if keyword in value:
> >                     yield s
> >
> >     def search_option_3_or_4(triples, keyword):
> >         for s,p,o in triples:
> >             if o.is_literal and keyword in o.lexrep:
> >                 yield s
> >
> > I'm sure there are more examples, but hopefully this clarifies what I'm
> > talking about.  I recognize this is not dramatic; it's one or two
> > lines.  But those lines come up a lot (in this non-OO world).  And RDF
> > is trying to be the ultra-elegant core data bus, so there is a very
> > strong light shining on the odd little bits (like language tags).
> >
> > Oh look, I almost got through this without mentioning JSON.  (I think
> > the JSON world will very much like the simplification of 3&4.)
> >
> >     -- Sandro
> >
> > [1] http://lists.w3.org/Archives/Public/public-rdf-wg/2011Sep/0019
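
For anyone who wants to run the two examples quoted above, here is a minimal self-contained sketch. It is only an illustration under stated assumptions, not anyone's actual implementation: the `Term` record, the `quoted` helper (here just `json.dumps`), and the `urn:example:lang:fr` datatype IRI standing in for a "folded-in" language tag are all hypothetical and do not come from the thread or from any RDF library. Under the option 1 shape a language-tagged literal carries `text` and `language`, while a datatyped literal carries `lexrep` and `datatype`; under the option 3/4 shape every literal carries only `lexrep` and `datatype`.

```python
import json
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, Tuple


@dataclass(frozen=True)
class Term:
    """Hypothetical term record; not taken from any RDF library."""
    iri: Optional[str] = None       # set for IRI nodes only
    text: Optional[str] = None      # option 1 only: string of a language-tagged literal
    language: Optional[str] = None  # option 1 only: the language tag
    lexrep: Optional[str] = None    # lexical form of a datatyped literal
    datatype: Optional[str] = None  # datatype IRI

    @property
    def is_literal(self) -> bool:
        return self.iri is None


Triple = Tuple[Term, Term, Term]


def quoted(s: str) -> str:
    # Stand-in for the undefined quoted() helper in the thread (an assumption).
    return json.dumps(s)


def encode_term_option_1(t: Term) -> str:
    # Two literal shapes, so two literal branches.
    if t.is_literal:
        if t.language:
            return "lt_literal(" + quoted(t.text) + "," + quoted(t.language) + ")"
        return "dt_literal(" + quoted(t.lexrep) + "," + quoted(t.datatype) + ")"
    return "node(" + quoted(t.iri) + ")"


def encode_term_option_3_or_4(t: Term) -> str:
    # One literal shape: the language tag is assumed to live in the datatype.
    if t.datatype:
        return "literal(" + quoted(t.lexrep) + "," + quoted(t.datatype) + ")"
    return "node(" + quoted(t.iri) + ")"


def search_option_1(triples: Iterable[Triple], keyword: str) -> Iterator[Term]:
    for s, _p, o in triples:
        if o.is_literal:
            # Pick the string slot that matches the literal's shape.
            value = o.text if o.language else o.lexrep
            if keyword in value:
                yield s


def search_option_3_or_4(triples: Iterable[Triple], keyword: str) -> Iterator[Term]:
    for s, _p, o in triples:
        if o.is_literal and keyword in o.lexrep:
            yield s
```

The difference is the one described in the thread: a branch or two disappears once every literal has the same shape, and nothing more dramatic than that.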
Received on Friday, 9 September 2011 15:27:10 UTC