Proposed NTriples changes for literal notation

This closes action:
  2002-02-26#12  DaveB  Propose n-triples changes to represent the
                        new form of rdf literals.
  in F2F minutes http://www.w3.org/2001/sw/RDFCore/20020225-f2f/

During the F2F I was actioned to come up with a syntax to support the
structured literal form we agreed on.  I discussed what we could do
with Dan Connolly, and we agreed that something like the following
was sufficient:

  xml("<b>foo</b>")              XML content, no language
  xml("<b>foo</b>", "en")        XML content, language given "en"

  "chat"                         Unicode string, no language
  "chat"-en                      Unicode string, language given as "en"
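
To make the four forms above concrete, here is a minimal sketch of a
recogniser for them in Python.  It is only an illustration, not the
parser mentioned in this message: the names Literal and parse_literal
are hypothetical, escapes inside the quoted string are left
undecoded, and the pattern used for language tags (letters and digits
separated by hyphens) is an assumption pending the RFC check noted
under Issues.

```python
import re
from typing import NamedTuple, Optional


class Literal(NamedTuple):
    """Hypothetical container for one parsed literal."""
    value: str               # quoted content, escapes left undecoded
    language: Optional[str]  # None when no language is given
    is_xml: bool             # True for the xml(...) forms


# A double-quoted string; backslash escapes are skipped over, not decoded.
_QUOTED = r'"((?:[^"\\]|\\.)*)"'
# Assumption: a language tag is letters/digits separated by hyphens.
_LANG = r'[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*'

# xml("...") or xml("...", "lang")
_XML_RE = re.compile(
    r'xml\(\s*' + _QUOTED + r'(?:\s*,\s*"(' + _LANG + r')")?\s*\)'
)
# "..." or "..."-lang
_STR_RE = re.compile(_QUOTED + r'(?:-(' + _LANG + r'))?')


def parse_literal(text: str) -> Literal:
    """Parse one literal written in any of the four proposed forms."""
    text = text.strip()
    m = _XML_RE.fullmatch(text)
    if m:
        return Literal(m.group(1), m.group(2), True)
    m = _STR_RE.fullmatch(text)
    if m:
        return Literal(m.group(1), m.group(2), False)
    raise ValueError(f"not a recognised literal: {text!r}")
```

For example, parse_literal('"chat"-en') gives a Literal with value
"chat", language "en" and is_xml False, while parse_literal('"chat"')
gives the same value with language None.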

Features:
  * Makes all existing literals legal
  * Provides only one way to encode the literal structures
    and so in that sense is canonical.

In order to try this out, I've implemented the above in my N-Triples
parser, and it works just fine.

Issues:
  1. "chat"-en might not be good enough if languages can contain
     whitespace or other things (I need to check the RFCs)
    Solution if this is needed:
      "chat"-"en"

  2. Want one way to describe all literal structures:
    Solution:
      literal(unicode string value, unicode string language, boolean isXML)
    and define the abbreviated forms in terms of that

  3. I assume "chat" != "chat"-"" (need to check language RFCs)
    Solution if this is needed:
      Restrict the language string to always 1+ chars
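
Issues 2 and 3 can be sketched together: one structure holding
(value, language, isXML), with the abbreviated forms derived from it.
The sketch below is Python and is an assumption throughout: the names
StructuredLiteral and to_ntriples are hypothetical, "no language" is
modelled as None rather than as an empty string, and the value is
assumed to be already escaped for N-Triples.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StructuredLiteral:
    """Hypothetical model of literal(value, language, isXML)."""
    value: str                      # assumed already escaped
    language: Optional[str] = None  # None means "no language"
    is_xml: bool = False

    def __post_init__(self) -> None:
        # Issue 3: a language, when present, must have 1+ characters.
        if self.language is not None and len(self.language) < 1:
            raise ValueError("language must have at least one character")

    def to_ntriples(self) -> str:
        """Write the literal in the matching abbreviated form."""
        quoted = '"' + self.value + '"'
        if self.is_xml:
            if self.language is None:
                return f'xml({quoted})'
            return f'xml({quoted}, "{self.language}")'
        if self.language is None:
            return quoted
        return f'{quoted}-{self.language}'
```

Under these assumptions each structure has exactly one written form,
and StructuredLiteral("chat") compares unequal to
StructuredLiteral("chat", "en"), while StructuredLiteral("chat", "")
is rejected outright.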


I propose to change the N-Triples specification to use this notation
unless specific problems are raised with it.

Dave

Received on Friday, 8 March 2002 12:38:24 UTC