Re: [LC response] To C. M. Sperberg-McQueen

Thanks for your careful analysis and suggestions, Michael.  While you
were drafting that, I was having a discussion with Addison Phillips, and
ended up drafting some text that he thought would be acceptable to the
I18N WG and which makes sense to me:

      Note that this specification does not, in itself, provide any
      mechanisms for controlling the display of text.  In particular,
      if users need to represent bi-directional text using an rdf:text
      literal, they may need to use Unicode bidirectional control
      characters.  See _Unicode Controls Vs. Markup for BiDi Support_
      [BIDI] for more information.  For some applications, it may be
      appropriate to use another representation, such as rdf:XMLLiteral,
      instead of rdf:text literals.
 
      [BIDI] http://www.w3.org/International/questions/qa-bidi-controls
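
To make the two options in that note concrete, here is a minimal Python
sketch. It is only an illustration: the sample sentence, the helper names
(embed_rtl, bidi_controls) and the XHTML-style span markup are assumptions
made for this example, not anything taken from the rdf:text draft.

```python
import unicodedata

# Unicode bidirectional control characters.
RLE = "\u202B"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING

# Bidi classes of the explicit embedding/override controls.
EXPLICIT_CONTROLS = {"LRE", "RLE", "LRO", "RLO", "PDF"}


def embed_rtl(text):
    """Wrap text in RLE ... PDF so it is treated as a right-to-left run."""
    return RLE + text + PDF


def bidi_controls(value):
    """Return the names of the explicit bidi control characters in value."""
    return [
        unicodedata.name(ch)
        for ch in value
        if unicodedata.bidirectional(ch) in EXPLICIT_CONTROLS
    ]


hebrew = "\u05E9\u05DC\u05D5\u05DD"  # sample data only (hypothetical title)

# Option 1: a plain character sequence, as an rdf:text literal would hold,
# with the direction carried by inline control characters.
plain_value = "The title is " + embed_rtl(hebrew + " 1.0") + " in Hebrew."

# Option 2: direction carried by markup, as an XML literal could hold.
# The span element and its attributes are an assumed XHTML-style example.
xml_value = (
    'The title is <span xmlns="http://www.w3.org/1999/xhtml" '
    'xml:lang="he" dir="rtl">' + hebrew + " 1.0</span> in Hebrew."
)

print(bidi_controls(plain_value))
print(bidi_controls(xml_value))
```

In the first value the direction information is invisible and sits in the
character data itself; in the second it is in the markup, which a plain
character-sequence datatype cannot carry.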

I think this is very similar to your suggested text, below.

> To make a concrete proposal for purposes of discussion, I suggest
> that you add a fourth bullet item to the list at the beginning of
> section 4 of the spec.  If I were drafting it, the first draft
> might read like this:
> 
>    - Like xsd:string and the plain literals (with or without
>      language tags) of RDF, typed rdf:text literals are suitable
>      primarily for text that can be adequately represented as
>      a sequence of UCS characters, without additional information
>      or markup.  They are not satisfactory for the representation
>      of text with Ruby annotation or bidirectional text in
>      which the default Unicode bidirectional algorithm fails
>      to produce acceptable results. For such material, it is
>      recommended that values of the rdf:XMLLiteral datatype
>      be used instead; since it allows embedded markup, it can
>      readily be used for such values.
> 
> Optionally break into two paragraphs before "They are not
> satisfactory" and replace "They" with "Typed rdf:text literals".

I somewhat prefer your text.  Without putting any pressure on him to do
so, I expect Boris will synthesize them both into something even better.

> This draft assumes that the right way to handle the problem is to
> use an XML literal; obviously, if you reach a different
> conclusion, then that bit needs to change.

That's roughly my conclusion as well, although it may be that the
solution is to use something ad hoc, or something we don't know about.

If rdf:XMLLiteral is dropped from OWL 2 (it's 'at risk'), that will be
a problem.  Hopefully someone in the community cares enough about i18n
text and/or markup to push OWL 2 vendors to implement rdf:XMLLiteral.

see http://www.w3.org/2007/OWL/wiki/At_Risk

> But you do need to say something.  It's really not tenable for a
> spec defining an internationalized text datatype to have nothing
> to say about the treatment of textual material that doesn't fit
> comfortably into sequences of UCS characters.  Surely this
> problem has come up before: If RDF can handle them, say how.  If
> RDF cannot handle them, then the entire Semantic Web Activity has
> a problem.

This is a sensitive topic.  I'm not sure anyone else on this list was
involved then, or that I should even bring it up, but many of us thought
RDF should be designed to represent natural language text the way it
represents just about everything else: using triples.  From where I sat
at the time (not on the RDF Core WG, but in daily communication with
people who were), language-tagged-literals were a fairly poor solution,
but one the RDF Core WG felt forced into.  It's not the worst flaw in
the design of RDF, IMHO, but it's probably in the top 5.  :-)  To me,
rdf:text makes the situation a bit better.

Personally, I'd be happy to change the title to something that makes a
smaller claim, e.g.:
      rdf:text -- a datatype for strings with optional language tags


     -- Sandro (not on behalf of any WG)

Received on Wednesday, 6 May 2009 21:30:50 UTC