- From: Andy Seaborne <andy.seaborne@epimorphics.com>
- Date: Fri, 09 Sep 2011 10:05:38 +0100
- To: public-rdf-wg@w3.org
On 07/09/11 18:34, Sandro Hawke wrote:
> I argued in today's meeting, off the cuff, that option 2 (in Pat's
> email [1]) offers only aesthetic improvements, while options 3 and 4
> will result in simpler code. I claimed that without simpler code,
> we'd be better off staying with option 1 (no change). Andy asked me
> to provide a concrete example of my claimed code simplification.
Thank you for taking the time to write these out.
> Perhaps Andy is thinking, quite correctly, that a good API (like
> Jena's?) already encapsulates and hides the trifold nature of RDF
> 2004 literals. I would agree that simplifying or restructuring RDF
> literals won't change/simplify application code in that world.
>
> So the examples I've come up with depend on stepping out of the clean
> OO model. Without making any normative claims here, or getting into
> the type-system wars, I observe that a lot of code in the world is
> like this, and it's probably not helpful to tell people who write this
> kind of code not to.
>
> Example 1. Encode triples in some temporary/hack syntax, eg for
> talking to a front-end which doesn't have an RDF library
>
> def encode_term_option_1(t):
>     # assumes "simple literals" already folded into datatyped literals
>     # as per RDF WG Aug 2011; otherwise there would be another branch.
>     if t.is_literal:
>         if t.language:
>             return "lt_literal("+quoted(t.text)+","+quoted(t.language)+")"
>         else:
>             return "dt_literal("+quoted(t.lexrep)+","+quoted(t.datatype)+")"
>     else:
>         return "node("+quoted(t.iri)+")"
>
> def encode_term_option_3_or_4(t):
>     # additionally assumes folded-in language tags
>     if t.datatype:
>         return "literal("+quoted(t.lexrep)+","+quoted(t.datatype)+")"
>     else:
>         return "node("+quoted(t.iri)+")"
Obviously it's as much a matter of coding style, but if the literal has 3
slots (lexrep, language, datatype), encoding it to send to a non-RDF app
suggests to me something completely regular, so this CSV-like approach is
what would occur to me:
def encode_term_option_1_or_2_or_3_or_4(t):
    if t.is_literal:
        return ("literal("+quoted(t.lexrep)+",@"+t.language+","
                +quoted(t.datatype)+")")
    else:
        return "node("+quoted(t.iri)+")"
An empty string stands for no language (cf. XML).
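
To make the regular three-slot encoding concrete, here is a minimal runnable sketch. The Term class and the quoted() helper are assumptions made up for illustration; they are not taken from Jena or any other RDF library, and the empty-string defaults simply follow the "empty string for no language" convention.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Term:
    """Hypothetical term record: an IRI node or a three-slot literal."""
    iri: Optional[str] = None
    lexrep: Optional[str] = None
    language: str = ""   # empty string means "no language tag"
    datatype: str = ""   # empty string means "no datatype recorded"

    @property
    def is_literal(self) -> bool:
        return self.lexrep is not None


def quoted(s: str) -> str:
    """Hypothetical helper: wrap in double quotes, escaping \\ and "."""
    return '"' + s.replace("\\", "\\\\").replace('"', '\\"') + '"'


def encode_term(t: Term) -> str:
    # Every literal is written with the same three fields in the same order.
    if t.is_literal:
        return ("literal(" + quoted(t.lexrep) + ",@" + t.language + ","
                + quoted(t.datatype) + ")")
    return "node(" + quoted(t.iri) + ")"


print(encode_term(Term(lexrep="chat", language="fr")))
print(encode_term(Term(iri="http://example.org/a")))
```

The receiving side can then split each literal into three fields without first having to work out which kind of literal it was given.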
For any of the proposals, having a 3-slot internal representation seems
to me quite natural, even for option 4, which (contrary to the charter's
framing) breaks all data that uses a language tag. And as we have already
altered simple literals, that now means all plain literals.
From experience, sending just the lexical part to many non-RDF-aware
applications works very well, including URIs as strings. It loses
information, so it is not for passing on information that will be
republished.
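
A rough sketch of that lexical-only approach, assuming terms are plain dicts with either a "lexrep" key (literals) or an "iri" key (nodes). That shape and the function names are hypothetical, chosen only to keep the example self-contained.

```python
def lexical_only(term: dict) -> str:
    """Return just the string a non-RDF consumer would see."""
    if "lexrep" in term:
        return term["lexrep"]   # literal: drop language tag and datatype
    return term["iri"]          # node: pass the URI through as a string


def triple_as_strings(s: dict, p: dict, o: dict) -> tuple:
    return tuple(lexical_only(t) for t in (s, p, o))


french = {"lexrep": "chat", "language": "fr"}
english = {"lexrep": "chat", "language": "en"}

# Both literals collapse to the same string, which is why the result
# cannot be turned back into the original RDF.
print(lexical_only(french) == lexical_only(english))
```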
Andy
>
> def encode_triple(s,p,o):
>     return (encode_term(s)+","+
>             encode_term(p)+","+
>             encode_term(o))
>
>
> Example 2. Look for a string in any kind of literal (regardless of
> language). This is for a naive search, where the user just types some
> stuff, without us knowing their language, or whether it's part of a
> literal.
>
> def search_option_1(triples, keyword):
>     for s,p,o in triples:
>         if o.is_literal:
>             value = o.text if o.language else o.lexrep
>             if keyword in value:
>                 yield s
>
> def search_option_3_or_4(triples, keyword):
>     for s,p,o in triples:
>         if o.is_literal and keyword in o.lexrep:
>             yield s
>
> I'm sure there are more examples, but hopefully this clarifies what I'm
> talking about. I recognize this is not dramatic; it's one or two
> lines. But those lines come up a lot (in this non-OO world). And RDF
> is trying to be the ultra-elegant core data bus, so there is a very
> strong light shining on the odd little bits (like language tags).
>
> Oh look, I almost got through this without mentioning JSON. (I think
> the JSON world will very much like the simplification of 3&4.)
>
> -- Sandro
>
> [1] http://lists.w3.org/Archives/Public/public-rdf-wg/2011Sep/0019
>
Received on Friday, 9 September 2011 09:06:18 UTC