Re: Proposal

----- Original Message -----
From: "ext Martin Duerst" <duerst@w3.org>
To: <Patrick.Stickler@nokia.com>; <bwm@hplb.hpl.hp.com>
Cc: <w3c-rdfcore-wg@w3.org>; <w3c-i18n-ig@w3.org>
Sent: 11 July, 2003 01:54
Subject: Re: Proposal


> ...
> I think you forgot the xml:lang="fi"; I assume this is just an oversight.

No. I didn't forget it. It's in the rdf:RDF element wrapper defined
at the beginning of the post.

> But more than that, I think it collapses two things that should
> be distinct: Strings that happen to look like XML fragments, and
> strings that are actually XML fragments. XML makes a clear distinction
> between these, but the above would blur this distinction. It would
> most probably lead to a great deal of confusion among a wide range
> of users. It would also not help with a natural transition from
> 'plain' to 'xml' literals. In particular:
> ...

There would be no such thing as "XML literals". RDF would only
have plain and typed literals. (Whether or not RDF also defines
a datatype with XML-encoded lexical forms, such a datatype does
not constitute a distinct kind of literal.)

But due to the use of XML for the RDF serialization,
the RDF/XML *syntax* provides a means to express literals,
of either type, plain or typed, having XML markup as inline
content in the RDF/XML serialization.
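
As a rough illustration of that point, here is a sketch only. It uses
Python's standard xml.dom.minidom rather than any real RDF parser, and
the helper name lexical_form and the ex: namespace URI are made up for
the example. Inline markup and its escaped equivalent reduce to the
same string:

```python
from xml.dom import minidom


def lexical_form(prop_xml: str) -> str:
    """Hypothetical helper: the string a parser might store in the graph
    for the content of a single property element."""
    elem = minidom.parseString(prop_xml).documentElement
    # Text nodes contribute their already-unescaped characters;
    # any other child node is written back out as markup.
    return "".join(
        node.data if node.nodeType == node.TEXT_NODE else node.toxml()
        for node in elem.childNodes
    )


# Assumed namespace declarations, for illustration only.
NS = (
    'xmlns:ex="http://example.org/ex#" '
    'xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"'
)

# The same content, once as inline markup and once escaped.
inline = f'<ex:foo {NS} rdf:parseType="Literal"><b>bar</b></ex:foo>'
escaped = f'<ex:foo {NS}>&lt;b&gt;bar&lt;/b&gt;</ex:foo>'

same = lexical_form(inline) == lexical_form(escaped)
```

Under the proposal, the comparison in the last line is all that is
needed: plain string equality, with no canonicalization step.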

As for your examples of confusion, I could offer counterexamples
showing that equal confusion can arise with the present solution. I
see both options as having equal potential for confusion and equal
need for education.

And one answer to that confusion is for users to use typed literals
based on a datatype which supports XML markup in its lexical forms.

I don't think the WG is prepared to posit a third type of literal. The
potential for confusion you suggest is IMO (a) speculative at best and
(b) not sufficient motivation, even if true.

> Having to escape the actual characters &amp; and &lt; in the abstract
> syntax looks somewhat ugly, but these are only two out of more than
> 90,000 characters defined in Unicode.
>
> Note: Please note that while escaping is not really something that
> looks related to internationalization, it is something we have had
> to work on extensively since we started with HTML internationalization.

IMO, escaping is an issue for serialization, not representation in
the graph. Characters should *not* be escaped in the graph
representation according to the conventions of one particular
serialization model. To do so is simply wrong.

Any escaping in the RDF/XML serialization that is specific to
that XML encoding should be removed prior to representation
of the sequence of characters in the graph.
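
A minimal sketch of that separation, assuming Python's standard
xml.sax.saxutils helpers stand in for an RDF/XML serializer and parser
(the variable names are illustrative only):

```python
from xml.sax.saxutils import escape, unescape

# The value as it would sit in the graph: raw characters, nothing escaped.
graph_value = "AT&T <b>bold</b>"

# Escaping is applied only when the value is written out as XML
# character data ...
serialized = escape(graph_value)

# ... and is undone again when the serialization is read back in.
restored = unescape(serialized)
```

The escaped form exists only in the serialized document; the value that
goes into the graph is the same before and after the round trip.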

--

Here is a pointed question for you, Martin:

Which of the two solutions would you prefer?

a. The present solution as specified in the latest drafts

 or

b. This new proposal


Patrick


>
> Regards,    Martin.
>
>
> >--
> >
> >Users are then free to choose between legacy M&S literals, with lang
> >tag, with no special distinction made in the graph regarding the
> >presence or absence of markup; or alternatively, typed literals
> >with no lang tags and likewise no distinction made in the graph regarding
> >the presence or absence of markup in the lexical forms.
> >
> >There remains no semantic distinction between a plain literal and
> >an XML literal. An "XML literal" is simply a plain literal with
> >XML markup that is serialized as unescaped XML. Nothing more.
> >
> >RDF continues to have two kinds of literals, plain and typed, and
> >comparison of plain literals, regardless of the presence of markup,
> >is by simple string comparison. All reference to canonicalization
> >is removed from the specs -- hopefully moved to a Note addressing the
> >use of RDF with datatyped literals having XML encoded lexical forms,
> >and including the definition of a datatype equivalent to rdf:XMLLiteral
> >or a similar interpretation of xsd:complexType.
> >
> >Let the market and user community decide which alternative,
> >plain or typed literal, is best for which application.
> >
> >Equivalences between plain literals and typed literals are
> >left to the specification of each individual datatype.
> >
> >Note again, that this alternative proposal introduces nothing
> >substantively new to the mix. And in fact, the minor changes
> >to the RDF/XML syntax represent how most earlier RDF parsers
> >treated rdf:parseType="Literal" to begin with.
> >
> >It also will allow folks to say useful things like
> >
> >    <rdf:Description rdf:about="#x">
> >       <ex:foo rdf:datatype="&xhtml;b" rdf:parseType="Literal">
> >          <xhtml:b>bar</xhtml:b>
> >       </ex:foo>
> >    </rdf:Description>
> >
> >i.e.
> >
> >    <#x> ex:foo "<xhtml:b>bar</xhtml:b>"^^xhtml:b .
> >
> >and thus take advantage of being able to serialize those typed
> >XML encoded lexical forms without escaping.
> >
> >--
> >
> >Martin, does that meet your expectations and wishes
> >better than the present solution?
> >
> >If so, is the WG favorable to such a proposed change?
> >
> >Regards,
> >
> >Patrick
> >
> >--
> >Patrick Stickler
> >Nokia, Finland
> >patrick.stickler@nokia.com
> >
>
>

Received on Friday, 11 July 2003 04:34:41 UTC