Re: RDF-ISSUE-12 (String Literals): deprecate language tags?

On Sat, 2011-03-05 at 09:24 -0600, Pat Hayes wrote:
> We can allow language tags on xsd:string literals

How?  I don't think so.  As I recall (from watching the group, not being
in it like you), that was the constraint that got us into this mess in
the first place.  The i18n WG said RDF had to have language tags on
text, and the xsd WG said we couldn't put language tags on datatyped
values.  So our best option seemed to be to have strings which were not
datatyped values.

We could try pushing on those constraints again and see what has
changed, given years of additional experience.

My first inclination, in approaching ISSUE-12, is to see whether we
can get rid of language-tagged literals.  Are there people who will
fight to keep them?  If so, please speak up.  I know a lot has been
invested in them over the years, but are people happy with the results?
I genuinely don't know.

To be a little more detailed, this straw proposal is: 
    - we weakly deprecate language-tagged literals, saying folks
      should stop generating them 
    - we recommend a different way of getting the same functionality that
      does not require changes to RDF, SPARQL, OWL, etc.
    - we explain how to map from the old way to the new way, and
      suggest that software do the conversion, offering higher layers
      the new style of access, even if data came in "old style".

There are several "new" ways to go, involving introducing one or more
new nodes.  So, instead of:

  db:cat dbo:abstract "The cat (Felis catus), also known as..."@en,
                      "Le chat domestique (Felis silvestris..."@fr,
                      ...

We could instead have the abstract be a single "MultiLanguageString",
which has versions in various languages, like this:

  db:cat dbo:abstract [ l:en "The cat (Felis catus), also known as...";
                        l:fr "Le chat domestique (Felis silvestris..."],
                      ...

... or the abstract could have multiple values, each of which is a
text-in-some-language, like this:

  db:cat dbo:abstract [ a l:Text-en; 
                        l:text "The cat (Felis catus), also known as..."],
                      [ a l:Text-fr; 
                        l:text "Le chat domestique (Felis silvestris..."],
                      ...

The first option has the advantage of brevity; the second extends more
readily to other kinds of annotations on the strings, aside from
language, like long-version and short-version,
approved/proposed/deprecated, or whatever.  I suspect it would be hard
to pick between these two if we had to.
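
To make the "old way to new way" mapping a bit more concrete, here is a
rough Python sketch of what conversion software might do for each of
the two styles.  It is only an illustration under some assumptions: the
input is a plain list of (subject, predicate, text, lang) tuples rather
than a real RDF library's objects, the "l:" terms are the same
made-up vocabulary used in the examples above, and the function names
are hypothetical.  The first function also assumes at most one string
per language for a given subject and predicate; a later duplicate
would overwrite an earlier one.

```python
from collections import defaultdict

# Hypothetical prefix for the illustrative "l:" vocabulary used above.
L = "l:"


def to_multilanguage(literals):
    """First style: one node per (subject, predicate), keyed by language.

    Assumes at most one text per language for each (subject, predicate);
    a repeated language overwrites the earlier value.
    """
    nodes = defaultdict(dict)
    for subject, predicate, text, lang in literals:
        nodes[(subject, predicate)][L + lang] = text
    return dict(nodes)


def to_tagged_texts(literals):
    """Second style: one node per literal, typed l:Text-<lang>."""
    nodes = defaultdict(list)
    for subject, predicate, text, lang in literals:
        nodes[(subject, predicate)].append(
            {"a": L + "Text-" + lang, L + "text": text}
        )
    return dict(nodes)
```

Each dict in the result stands in for one of the blank nodes written
with [ ... ] in the examples above.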

Personally, I don't have a strong opinion about this issue.  I think
language-tagged literals are an unfortunate design, but I think most
of the cost of them has already been paid, and changing at this point
would probably be more trouble than it's worth.  On the other hand, if
it turns out folks are mostly avoiding language-tagged literals in favor
of one of the above styles, or something else, I wouldn't mind us
changing.

   -- Sandro

Received on Saturday, 5 March 2011 23:50:19 UTC