Re: example of options 3 & 4 simplifying code (ACTION-86)

On Fri, 2011-09-09 at 10:05 +0100, Andy Seaborne wrote:
> 
> On 07/09/11 18:34, Sandro Hawke wrote:
> > I argued in today's meeting, off the cuff, that option 2 (in Pat's
> > email [1]) offers only aesthetic improvements, while options 3 and 4
> > will result in simpler code.  I claimed that without simpler code,
> > we'd be better off staying with option 1 (no change).  Andy asked me
> > to provide a concrete example of my claimed code simplification.
> 
> Thank you for taking the time to write these out.
> 
> > Perhaps Andy is thinking, quite correctly, that a good API (like
> > Jena's?) already encapsulates and hides the trifold nature of RDF
> > 2004 literals.  I would agree that simplifying or restructuring RDF
> > literals won't change/simplify application code in that world.
> >
> > So the examples I've come up with depend on stepping out of the clean
> > OO model.  Without making any normative claims here, or getting into
> > the type-system wars, I observe that a lot of code in the world is
> > like this, and it's probably not helpful to tell people who write this
> > kind of code not to.
> >
> > Example 1.  Encode triples in some temporary/hack syntax, eg for
> >              talking to a front-end which doesn't have an RDF library
> >
> > def encode_term_option_1(t):
> >     # assumes "simple literals" already folded into datatyped literals
> >     # as per RDF WG Aug 2011; otherwise there would be another branch.
> >     if t.is_literal:
> >       if t.language:
> >         return "lt_literal("+quoted(t.text)+","+quoted(t.language)+")"
> >       else:
> >         return "dt_literal("+quoted(t.lexrep)+","+quoted(t.datatype)+")"
> >     else:
> >       return "node("+quoted(t.iri)+")"
> >
> > def encode_term_option_3_or_4(t):
> >     # additionally assumes folded-in language tags
> >     if t.datatype:
> >       return "literal("+quoted(t.lexrep)+","+quoted(t.datatype)+")"
> >     else:
> >       return "node("+quoted(t.iri)+")"
> 
> Obviously it's as much a matter of coding style, but if the literal has 3
> slots (lexrep, language, datatype), encoding to send to a non-RDF app
> suggests to me something completely regular, so this CSV-like approach is
> what would occur to me:
> 
> def encode_term_option_1_or_2_or_3_or_4(t):
>      if t.is_literal:
>        return ("literal("+quoted(t.lexrep)+",@"+t.language+","+
>                quoted(t.datatype)+")")
>      else:
>        return "node("+quoted(t.iri)+")"
> 
> Empty string for no language (cf. XML).
> 
> For any of the proposals, having a 3-slot internal representation seems
> to me quite natural, even option 4, which (contrary to the charter's
> framing) breaks all data which uses a language tag.  And as we already
> altered simple literals, that's now all plain literals.

I certainly agree this works.  I can definitely live with Option 1 (what
we have now).   I just think RDF would be more elegant with Option 3a,
and the handling of language tags moved off into the handling of
datatypes.
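
To make the comparison concrete, here is a runnable sketch of the two styles side by side: the regular three-slot encoding, and a two-slot encoding where the language tag has been moved into the datatype slot. Everything named here is an assumption made for illustration: Term, quoted(), fold_language() and especially LANG_DT_PREFIX, a made-up IRI prefix standing in for "one datatype per language tag". It is not meant as a statement of what any of the options actually specifies.

```python
import json
from dataclasses import dataclass
from typing import Optional

# ASSUMPTION: a made-up IRI prefix giving each language tag its own datatype.
LANG_DT_PREFIX = "http://example.org/lang#"


@dataclass(frozen=True)
class Term:
    """Hypothetical term: an IRI node, or a literal with up to three slots."""
    iri: Optional[str] = None
    lexrep: Optional[str] = None
    language: str = ""   # empty string means "no language tag"
    datatype: str = ""   # empty string means "no datatype"

    @property
    def is_literal(self) -> bool:
        return self.lexrep is not None


def quoted(s: str) -> str:
    # Hypothetical helper: JSON string quoting stands in for quoted().
    return json.dumps(s)


def encode_three_slot(t: Term) -> str:
    # Regular, CSV-like encoding: every literal carries all three slots.
    if t.is_literal:
        slots = (t.lexrep, t.language, t.datatype)
        return "literal(" + ",".join(quoted(x) for x in slots) + ")"
    return "node(" + quoted(t.iri) + ")"


def fold_language(t: Term) -> Term:
    # Move a language tag into the datatype slot (assumed convention).
    # A language-tagged literal is assumed to have no datatype of its own.
    if t.is_literal and t.language:
        return Term(lexrep=t.lexrep, datatype=LANG_DT_PREFIX + t.language.lower())
    return t


def encode_two_slot(t: Term) -> str:
    # After folding, a literal is just (lexrep, datatype).
    t = fold_language(t)
    if t.is_literal:
        return "literal(" + quoted(t.lexrep) + "," + quoted(t.datatype) + ")"
    return "node(" + quoted(t.iri) + ")"


chat = Term(lexrep="chat", language="fr")
print(encode_three_slot(chat))
print(encode_two_slot(chat))
```

Both encoders have the same number of branches; the difference is only whether the receiving side has to know about a third slot.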

> From experience, sending just the lexical part to many non-RDF-aware
> applications works very well, including URIs as strings.  It's 
> information lossy so it's not about passing information that will be 
> republished.

I think you and I are agreed Option 4 is problematic for this reason,
among others.
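
A small sketch of the lossiness Andy describes, assuming hypothetical Literal and Node shapes and a made-up lexical_only() helper: once only the string part is sent, distinct terms become indistinguishable on the receiving side.

```python
from collections import namedtuple

# Hypothetical term shapes, invented for this sketch.
Literal = namedtuple("Literal", "lexrep language datatype")
Node = namedtuple("Node", "iri")

XSD = "http://www.w3.org/2001/XMLSchema#"


def lexical_only(term):
    """Return just the string part, dropping kind, language and datatype."""
    return term.iri if isinstance(term, Node) else term.lexrep


terms = [
    Literal("1", "", XSD + "integer"),
    Literal("1", "", XSD + "string"),
    Literal("1", "en", ""),
]

# Three distinct terms collapse to a single string.
print(len(set(terms)), len({lexical_only(t) for t in terms}))

# An IRI and a string literal with the same characters also collide.
iri = Node("http://example.org/a")
text = Literal("http://example.org/a", "", XSD + "string")
print(lexical_only(iri) == lexical_only(text))
```

That is fine for display or search, but the receiver cannot rebuild the original terms from what it was sent.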

    -- Sandro

>  Andy
> 
> >
> > def encode_triple(s,p,o):
> >     return (encode_term(s)+","+
> >             encode_term(p)+","+
> >             encode_term(o))
> >
> >
> > Example 2.  Look for a string in any kind of literal (regardless of
> > language). This is for a naive search, where the user just types some
> > stuff, without us knowing their language, or whether it's part of a
> > literal.
> >
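> > def search_option_1(triples, keyword):
> >     for s,p,o in triples:
> >        if o.is_literal:
> >           # language-tagged literals keep their string in .text,
> >           # datatyped literals in .lexrep (as in Example 1)
> >           value = o.text if o.language else o.lexrep
> >           if keyword in value:
> >              yield s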
> >
> > def search_option_3_or_4(triples, keyword):
> >     for s,p,o in triples:
> >        if o.is_literal and keyword in o.lexrep:
> >           yield s
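
For anyone who wants to run Example 2, here is a self-contained sketch. The term shapes (LangLiteral, TypedLiteral, Literal, Node) and the placeholder datatype strings are hypothetical, chosen only to mimic a non-OO setting where the two kinds of literal are stored differently.

```python
from collections import namedtuple

# Hypothetical term shapes, invented for this sketch.
LangLiteral = namedtuple("LangLiteral", "text language")     # two-shape world
TypedLiteral = namedtuple("TypedLiteral", "lexrep datatype")  # two-shape world
Literal = namedtuple("Literal", "lexrep datatype")            # one-shape world
Node = namedtuple("Node", "iri")


def search_two_shapes(triples, keyword):
    # The caller has to know where each kind of literal keeps its string.
    for s, p, o in triples:
        if isinstance(o, LangLiteral):
            value = o.text
        elif isinstance(o, TypedLiteral):
            value = o.lexrep
        else:
            continue  # not a literal
        if keyword in value:
            yield s


def search_one_shape(triples, keyword):
    # Every literal has a lexrep, so one test is enough.
    for s, p, o in triples:
        if isinstance(o, Literal) and keyword in o.lexrep:
            yield s


two_shape = [
    ("s1", "p", LangLiteral("bonjour le monde", "fr")),
    ("s2", "p", TypedLiteral("monde", "placeholder:string")),
    ("s3", "p", Node("http://example.org/monde")),
]
one_shape = [
    ("s1", "p", Literal("bonjour le monde", "placeholder:lang-fr")),
    ("s2", "p", Literal("monde", "placeholder:string")),
    ("s3", "p", Node("http://example.org/monde")),
]
print(list(search_two_shapes(two_shape, "monde")))
print(list(search_one_shape(one_shape, "monde")))
```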
> >
> > I'm sure there are more examples, but hopefully this clarifies what I'm
> > talking about. I recognize this is not dramatic; it's one or two
> > lines.  But those lines come up a lot (in this non-OO world).  And RDF
> > is trying to be the ultra-elegant core data bus, so there is a very
> > strong light shining on the odd little bits (like language tags).
> >
> > Oh look, I almost got through this without mentioning JSON.  (I think
> > the JSON world will very much like the simplification of 3&4.)
> >
> >      -- Sandro
> >
> > [1] http://lists.w3.org/Archives/Public/public-rdf-wg/2011Sep/0019

Received on Friday, 9 September 2011 15:27:10 UTC