Re: Proposed NTriples changes for literal notation

>>>Jeremy Carroll said:
> >     This is any allowed xml:lang content as defined in
> >     http://www.w3.org/TR/REC-xml#sec-lang-tag
> 
> YES.
> 
> >
> >     ISSUE #2: I don't think specifying this more precisely here is
> >     worth it.  If the consensus is to do this, it would be something
> >     like this (after RFC 1766):
> >        language ::= [a-zA-Z]{1,8} ('-' [a-zA-Z]{1,8})
> >
> 
> NO, don't go there.
> XML first edition did that, then RFC3066 updated RFC1766 and changed it
> (digits are allowed in some places now).
> 
> I think XML first edition actually went further ...
> 
> Second edition fixed it by removing a load of rules.

That's what I meant by not worth specifying further.


So, re-summarising with slight modifications, the proposed changes are:

-------
Changing production
  http://www.w3.org/TR/2001/rdf-testcases/#literal
to
  literal  ::=  langString | xmlString

and adding new productions:

  langString   ::= '"' string '"' ('-' language)?

  xmlString    ::= 'xml' langString

    ISSUE #1: OR maybe?
      xmlString    ::= 'xml"' string '"' ('-' language)?

  language     ::= character+ with no spaces allowed

    This is any allowed xml:lang content as defined in
    http://www.w3.org/TR/REC-xml#sec-lang-tag

    ISSUE #2 (closed)
-------
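
To make the productions above concrete, here is a rough sketch of
them as a Python regular expression.  It is only an illustration, not
part of the proposal: the names (STRING, LANGUAGE, parse_literal) are
made up, the string production is approximated as "anything but an
unescaped double quote", and the '-' language suffix is assumed to be
optional so that a plain "foo" still parses.  Both forms of xmlString
in ISSUE #1 accept the same text, so one pattern covers either.

```python
import re

# Hypothetical approximations of the proposed productions.
STRING = r'(?:[^"\\]|\\.)*'   # assumption: backslash escapes, no bare '"'
LANGUAGE = r'\S+'             # "character+ with no spaces allowed"

# literal ::= langString | xmlString, with an optional '-' language suffix
# (the optionality is an assumption of this sketch).
LITERAL = re.compile(
    rf'(?P<xml>xml)?"(?P<string>{STRING})"(?:-(?P<language>{LANGUAGE}))?'
)

def parse_literal(text):
    """Return (is_xml, string, language), or None if text is not a literal."""
    m = LITERAL.fullmatch(text)
    if m is None:
        return None
    return (m.group('xml') is not None, m.group('string'), m.group('language'))
```

For example, parse_literal('"foo"-en') gives the string foo with
language en, and parse_literal('xml"<b>x</b>"-en-GB') is flagged as an
XML literal with language en-GB.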

ISSUE #3

I'd also like to slightly modify N-Triples so that white space is
required after every term of
  http://www.w3.org/TR/2001/rdf-testcases/#triple
so that the end of the langString can be found, specifically when
the object is a literal.

At present:
  <a> <b> "foo".
and
  <a> <b> "foo" .
are both allowed.

However, the language production above is defined in terms of the
character production, which includes '.', so we need to define how
the language string terminates.
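
A small sketch of what the whitespace rule would mean in practice,
again in Python and again only illustrative: STRICT_TRIPLE and
parse_strict are invented names, URI terms are simplified to
<...>, and only literal objects are handled.  White space is
required after every term, so the language tag simply runs up to the
next space.

```python
import re

# Hypothetical pattern: white space is mandatory after each term, so a
# language tag made of arbitrary non-space characters ends at the space.
STRICT_TRIPLE = re.compile(
    r'(?P<subject><[^>]*>)\s+'
    r'(?P<predicate><[^>]*>)\s+'
    r'(?P<object>"[^"]*"(?:-\S+)?)\s+'   # literal objects only, for brevity
    r'\.\s*$'
)

def parse_strict(line):
    """Return the three terms as a dict, or None if the line is rejected."""
    m = STRICT_TRIPLE.match(line)
    return m.groupdict() if m else None
```

Under this rule '<a> <b> "foo"-en .' is accepted with object
"foo"-en, while '<a> <b> "foo"-en.' is rejected because the
final '.' cannot be told apart from the language tag.  The cost is
that '<a> <b> "foo".' is rejected too.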

Alternatively, I can define the legal set of characters in language,
and exclude whitespace and '.'.  Given the historic slight
character-creep from RFC 1766 to RFC 3066, this might not be a good
idea.
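
For comparison, the alternative could be sketched like this (Python,
illustrative only; RESTRICTED_OBJECT and parse_object_and_dot are
invented names, and "anything except whitespace and '.'" is just one
guess at the legal character set).  The pattern looks only at the
object and the terminating '.'.

```python
import re

# Hypothetical pattern: the language tag may not contain whitespace or
# '.', so the terminating '.' is unambiguous even with no space before it.
RESTRICTED_OBJECT = re.compile(
    r'"(?P<string>[^"]*)"(?:-(?P<language>[^\s.]+))?\s*\.\s*$'
)

def parse_object_and_dot(tail):
    """Return (string, language) for a literal followed by '.', else None."""
    m = RESTRICTED_OBJECT.match(tail)
    return (m.group('string'), m.group('language')) if m else None
```

Here both '"foo"-en.' and '"foo"-en .' parse with language en,
and '"foo".' stays legal, but any future language tag containing a
'.' would be rejected.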

Dave

Received on Monday, 11 March 2002 09:40:56 UTC