Re: rdfms-literal-is-xml-structure: Why?

Hi Ron,

Ron Daniel wrote:

> I agree with the need to represent the namespace information.
> I am not sure that we have to do it by encoding that
> information directly into the literal. By that I assume you mean
> the RDF processor would change the literal content from
>   "<foo:bar>foobar</foo:bar>"
> to something like
>   "<foo:bar xmlns:foo="http://foo">foobar</foo:bar>

That's not exactly what I had in mind.  My literal objects at the
moment look a bit like:

  struct {
    String  string;
    String  lang;
    boolean wellFormed;
  }

I was thinking of extending this to:

  struct {
    String  string;
    String  lang;
    boolean wellFormed;
    Set     namespaces;
  }

so the object that represents a literal would know which namespaces are
in scope and what their prefixes are.  I think this is similar to, if not
the same as, what you are suggesting.
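
A rough Java sketch of the extended object, purely for illustration.  The
class name RdfLiteral is made up, the namespaces are assumed to be kept as
a prefix-to-URI map rather than a bare Set, and wellFormed is assumed to
flag content that came from parseType="Literal":

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical literal object that carries its in-scope namespace bindings. */
final class RdfLiteral {
    final String string;                  // literal content, stored exactly as parsed
    final String lang;                    // xml:lang value, or null if none was given
    final boolean wellFormed;             // assumed: true for parseType="Literal" content
    final Map<String, String> namespaces; // assumed shape: prefix -> namespace URI

    RdfLiteral(String string, String lang, boolean wellFormed,
               Map<String, String> namespaces) {
        this.string = string;
        this.lang = lang;
        this.wellFormed = wellFormed;
        // Defensive copy so later changes to the caller's map do not leak in.
        this.namespaces = Collections.unmodifiableMap(new LinkedHashMap<>(namespaces));
    }
}

final class RdfLiteralDemo {
    public static void main(String[] args) {
        Map<String, String> ns = new LinkedHashMap<>();
        ns.put("foo", "http://foo");
        RdfLiteral lit = new RdfLiteral("<foo:bar>foobar</foo:bar>", null, true, ns);
        // The content is untouched; the bindings travel alongside it.
        System.out.println(lit.string + " with " + lit.namespaces);
    }
}
```

The point of this shape is that the literal string itself is never
rewritten; the namespace information simply travels with it.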

> 
> My personal preference would be to not rewrite that content if
> we can avoid it. Also - since M&S 1.0 says we don't process
> the content, we probably shouldn't process the content. Third,
> that would be a backward-incompatible change and would
> invalidate any current XML literals held in RDF databases.

Processing it, at least on output, is going to be unavoidable.
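
As a rough sketch of what output processing could look like (one possible
reading, not a proposal for the spec): assume the literal content starts
with a single root element and that its bindings are available as a
prefix-to-URI map.  The names LiteralOutput and withDeclarations are
hypothetical, and the sketch does not check for prefixes the element
already declares.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class LiteralOutput {
    /** Escapes characters that may not appear raw in a double-quoted attribute value. */
    static String escapeAttr(String value) {
        return value.replace("&", "&amp;").replace("<", "&lt;").replace("\"", "&quot;");
    }

    /**
     * Returns a copy of xml with one namespace declaration per binding added
     * to its first start tag.  The stored literal itself is not modified.
     */
    static String withDeclarations(String xml, Map<String, String> namespaces) {
        if (xml.isEmpty() || xml.charAt(0) != '<') {
            throw new IllegalArgumentException("expected content starting with an element");
        }
        // Find the end of the root element's name.
        int i = 1;
        while (i < xml.length()) {
            char c = xml.charAt(i);
            if (Character.isWhitespace(c) || c == '/' || c == '>') break;
            i++;
        }
        StringBuilder out = new StringBuilder(xml.substring(0, i));
        for (Map.Entry<String, String> ns : namespaces.entrySet()) {
            String prefix = ns.getKey();
            // An empty prefix stands for the default namespace.
            out.append(prefix.isEmpty() ? " xmlns" : " xmlns:" + prefix);
            out.append("=\"").append(escapeAttr(ns.getValue())).append('"');
        }
        return out.append(xml.substring(i)).toString();
    }

    public static void main(String[] args) {
        Map<String, String> ns = new LinkedHashMap<>();
        ns.put("foo", "http://foo");
        System.out.println(withDeclarations("<foo:bar>foobar</foo:bar>", ns));
    }
}
```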

> 
> Viewed abstractly, RDF processors must report a tuple of
> (namespace initializations, literal XML content). This,
> along with xml:lang, is missing from the M&S 1.0 model.

We need to analyse what M&S actually says about this.
You were around when this was done, Ron: what did the original WG
have in mind when they added parseType="Literal" to the spec?
I suspect that they just wanted a convenient way of representing
markup as a string without having to escape all those < chars.
Did the working group realise that parseType="Literal" introduced
a new kind of literal into the model?  Was that the intent?

I kinda take issue with the repeated claims that xml:lang is
'missing' from the model described in M&S.  M&S is quite clear
that the language is 'part of' the literal.  The locale is
not explicitly represented as a triple, true, but it's part of
a literal, which is part of the model.

> 
> There are multiple APIs possible, the spec should not require
> or preclude any particular one. However, we will have to
> represent this information in the n-triples format in order to
> test things.
> 
> Before we talk about the exact changes to the n-triples format
> that we may make, we should agree on the info it must pass along.
> (I don't want to let something like escape character considerations
> influence the information we need to convey).
> 
> It seems that a record for a statement in the format must, at
> a minimum, convey:
>   subject  - an (absolute?) URI or URI reference
>   predicate - an (absolute?) URI or URI reference
>   object:
>       an (absolute?) URI or URI reference
>    OR a literal string, and
>       a locale (xml:lang value, if given)
>    OR a literal string,
>       a parseType=Literal flag,
>       a locale, and
>       a set of namespace initializations

That looks about right, assuming we go ahead with this approach of adding
these kinds of literal.  There are other approaches which we should
also explicitly consider.
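
To make the shape of such a record concrete, here is a minimal Java
sketch of the three object alternatives in your list.  All class names are
invented for illustration, the parseType=Literal flag is represented by
the class of the object rather than by a boolean, and none of this is
meant to dictate an API or the n-triples syntax.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Base type for the three kinds of object a statement record can carry. */
abstract class ObjectNode { }

/** Object that is a URI or URI reference. */
final class UriObject extends ObjectNode {
    final String uri;
    UriObject(String uri) { this.uri = uri; }
}

/** Object that is a plain literal string plus an optional locale. */
final class PlainLiteralObject extends ObjectNode {
    final String string;
    final String locale; // xml:lang value, or null if none was given
    PlainLiteralObject(String string, String locale) {
        this.string = string;
        this.locale = locale;
    }
}

/** Object that is parseType="Literal" content plus locale and namespace bindings. */
final class XmlLiteralObject extends ObjectNode {
    final String string;
    final String locale; // xml:lang value, or null if none was given
    final Map<String, String> namespaces; // assumed shape: prefix -> namespace URI
    XmlLiteralObject(String string, String locale, Map<String, String> namespaces) {
        this.string = string;
        this.locale = locale;
        this.namespaces = Collections.unmodifiableMap(new LinkedHashMap<>(namespaces));
    }
}

/** One statement: subject and predicate as URI strings, object as one of the above. */
final class StatementRecord {
    final String subject;
    final String predicate;
    final ObjectNode object;
    StatementRecord(String subject, String predicate, ObjectNode object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }
}
```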

> 
> Depending on our resolution of the aboutEach issue, we may
> also need to add a flag to tell if the subject is simple or
> an aboutEach.

I'm not convinced of that.  An implementation or API might well have
that, but that's not what we are designing here, is it?

Brian

Received on Sunday, 15 July 2001 13:43:21 UTC