RE: JJC's take on I18N concerns from Martin Duerst on 2003-08-18 (w3c-rdfcore-wg@w3.org from August 2003)

From: Martin Duerst <duerst@w3.org>
Date: Mon, 18 Aug 2003 10:17:50 -0400
To: <Patrick.Stickler@nokia.com>, <jjc@hpl.hp.com>, <w3c-i18n-ig@w3.org>, <w3c-rdfcore-wg@w3.org>
Cc: <swick@w3.org>, <timbl@w3.org>, <sandro@w3.org>, <djweitzner@w3.org>
Message-Id: <4.2.0.58.J.20030815132347.05f5d120@localhost>
Hello Patrick,

At 12:14 03/08/15 +0300, Patrick.Stickler@nokia.com wrote:

> > -----Original Message-----
> > From: ext Martin Duerst [mailto:duerst@w3.org]

> > >[[
> > >1. The current approach fails to preserve markup integrity for XML
> > >literals that have been scraped or obtained from another repository.
> > >I18N is not convinced that there will not be use cases where markup
> > >integrity is important, and that the current approach will
> > amount to an
> > >insuperable issue in those situations.
> > >]]

>No XHTML specific solution is required.
>
>I've already pointed out XML Fragment as a solution to
>the XML scraping problem, and to the general problem
>of capturing information about the context of XML
>Fragments.

I just have looked at the XML Fragment spec. Let's for
a moment assume that XML Fragment is a full REC. There are
several issues that make it difficult to see how this could
be used to address the actual problems we are working on.

First, the spec is written in a way that transfers the
context of an XML Fragment separate from the actual
fragment. While it also allows the fragment to be integrated
in the context with some specific packaging solution, we
don't have such a solution yet. This would mean that
we would have to develop such a solution or end up with
a tuple for XML Literals anyway (which I understand is one
of the things that the RDF Core WG seems to want to avoid).

Second, there is the question of whether it would be the
duty of creators of XML Literals (e.g. in RDF/XML) to
include the fragment context, or if that would be handled
by the RDF/XML parser.

In the former case, this would mean that RDF/XML writers
(people and programs) would have to include XML fragment
contexts as content of rdf:parseType='Literal' elements.
This would need a packaging solution. It would also mean
that questions of how an XML fragment context gets included
in another XML document are decided. It would also mean that
people doing scraping,... would probably have a hard time
deciding exactly what to include, and how to include it.

In the later case (parser constructing fragment contexts),
the fragment would come to include all
namespaces from the RDF/XML context (which would correspond to
canonical XML rather than exclusive canonicalization as chosen by
RDF Core), and all anchestral elements from the RDF/XML context
(most of which would be totally unrelated to the actual XML
Literal). This would also mean that the definition of equality,
and the definition of how to write out XML Literals back to
RDF/XML, would become very complex, in essence having to answer
the question of relevance for things such as xml:lang,... in
the RDF specs, an issue that you wanted to avoid by using XML
Fragments.


So overall, the XML Fragment spec doesn't really help.
It is designed to transport freestanding fragments, mostly
smaller parts of huge documents. It is not designed to
transport a series of XML fragments in a compound XML
document. XML itself is designed to transport smaller
XML pieces in bigger XML documents, but not in a completely
independent way.


>I've also pointed out that it is not RDF's duty to
>solve the general XML Fragment context issues.

I have agreed with that. RDF only has to solve these
issues as far as RDF is concerned. RDF makes some very
explicit decisions (it cannot avoid making these decisions).
And it explicitly decides that some stuff is inherited,
in particular:
- 'visibly utilized' namespace declarations
- xml:lang for plain literals


>I don't consider the XML scraping use case to
>identify any failure in the present (editors'
>draft) design.

You don't. We do.


>It is a problem that will exist for *any* case where
>an XML Fragment is taken out of its context and
>encapsulated in *any* formalism.
>
>Any solution that RDF might offer to this problem will
>be an RDF-specific solution. And such a solution will
>be a partial solution, limited only to language
>context, insofar as the general problem of capturing
>XML Fragment context is concerned.

I18N is interested in language context. XML 1.0 defines
how xml:lang is working. This is different from DTD-specific
stuff. RDF inherits language context for plain literals.
What RDF has in the post-lastcall draft is an RDF-specific
solution. It is a partial solution, and a solution that
is affecting I18N negatively.


>Far better for the I18N WG to work with the **XML**
>community on a general, consistent solution to this
>problem using XML Fragment, or similar,

I probably said this earlier, but if the RDF Core WG
had asked the I18N WG for help on this issue at the
Technical Plenary in Cannes, I'm quite sure we would
have helped you. (I don't think it would have been
only the I18N WGs job to do that.) The RDF Core WG
asked the I18N WG about blocking off language information
when inserting XML fragments into a larger document,
and I18N made sure that you got the solution
(xml:lang=''). If you now think that you found out
that you looked for the wrong thing, please don't
blame us.


>than to continue
>to harass the RDF Core WG to provide a proprietary
>solution that will likely be *INCOMPATIBLE* with
>any general solution that might later arise within
>the XML community.

Sorry, but the I18N WG and the RDF Core WG amicably
agreed to a solution in Cannes. The RDF Core WG up
to last call didn't see any problem with this
solution. And as explained above, RDF has to make
some decisions, and does make some.


>--
>
>Here is the last I personally plan to say on this matter:
>
>I've seen two key issues that you have brought to our attention:
>
>1. XML literals don't have a lang tag indicating
>    language scope, so how do XML users capture
>    this contextual information about fragments.
>
>2. The relationship between plain literals and
>    XML literals without markup has not be addressed.
>
>As for #1, I've said enough, I think, above about that.
>The context of an XML Fragment is not RDF's concern. And
>there exists a W3C rec (albeit only CR) for addressing
>such issues in a consistent manner. It is up to the
>XML community to bring it to maturity and promote its use.
>RDF should remain agnostic as to such solutions, supporting
>whatever the XML community decides (which it will do,
>without any need for modification, based on the latest
>editors' draft).

See above. The XML Fragments CR doesn't cover the use case
you are thinking about. And RDF isn't agnostic, it currently
makes some choices, and it always did.


>As for #2, the editors are working on text to clarify this
>relationship. In a nutshell "abc" and
>"abc"^^rdf:XMLLiteral denote two distinct values,
>but applications may choose to "convert" or "coerce"
>between such values, without (substantial) loss of
>meaning, just as they may convert between
>"1"^^xsd:integer and "1.0"^^xsd:decimal. As far
>as the MT is concerned, however, plain literals values
>and XML literal values are disjunct.

With the exception of the question of how language information
is handled, this is okay as far as we are concerned.


Regards,   Martin.
Received on Monday, 18 August 2003 10:26:44 UTC