W3C home > Mailing lists > Public > w3c-rdfcore-wg@w3.org > July 2003

RE: XML observation

From: <Patrick.Stickler@nokia.com>
Date: Mon, 7 Jul 2003 11:21:41 +0300
Message-ID: <A03E60B17132A84F9B4BB5EEDE57957B0263015A@trebe006.europe.nokia.com>
To: <duerst@w3.org>, <phayes@ihmc.us>, <w3c-rdfcore-wg@w3.org>

> -----Original Message-----
> From: ext Martin Duerst [mailto:duerst@w3.org]
> Sent: 04 July, 2003 17:18
> To: pat hayes; w3c-rdfcore-wg@w3.org
> Subject: Re: XML observation
> I would like to answer in more detail, and hope to do so next week,
> but here just a few points.
> - Yes, the usage of XML for both textual markup and data is confusing
>    to many, even many who are working on the specs.
> - Please note that for parseType="Literal", we are actually mostly
>    dealing with textual markup, not with data, and it is not only not
>    adequate, but plain backwards to raise the 'data issue' to suppress
>    the conventions that XML has for textual markup.

On the contrary, it is because RDF/XML is a data markup language,
not a textual markup language, that this 'data issue' is precisely

And by what basis do you presume that all XML literals will 
constitute natural language textual content?!

That's why the encapsulation view is paramount. It does not 
matter in the least what is encapsulated. The semantics of the
encapsulation language have no relevance whatsoever to the
semantics of the encapsulated content.

Again, the error in RDF/XML is that plain literals are infected
by xml:lang and xml:base applies to canonicalization of XML
literals. We should ourselves have been more diligent to preserve
the boundary between encapsulated content and encapsulating structure.

But in this final moments, to the degree we are able, we
should try to minimize those errors and not use them
as justification for proliferating them or their like

> - I myself see the fact that XML can deal with both text and data
>    as a great potential, but a potential that has to be worked on
>    quite a bit to be achieved.

Agreed. But it will take alot more time and work to resolve,
and we are very much at the end of our time.

What we now have is not perfect. It never was expected to be.
But it's an optimal solution, all things considered.

Future WGs, both RDF and others, will surely continue to
work on these issues. We no longer have the time to solve
so far reaching a problem as the distinction between presentational
versus structural content in XML serializations.

> - Whereas the XML conventions for real datatypes in many ways can be
>    taken as just a notational convention for abstract concepts such as
>    'integer' that RDF treats as abstract concepts, in the case of
>    XML literals, we are dealing with marked-up text, and so there the
>    abstraction we are dealing with is XML, not just the notation.
>    (if RDF would want to create their own abstraction of marked-up
>    text, that would be a different thing, but currently, it doesn't)

Again, you seem to be presuming that if it is an XML literal, it
is natural language content. That presumption unfounded.

> - XML Schema is confusing in many ways, because it was created by a
>    group of people that contained both people with a great deal of
>    expertize in textual markup, and people with excellent experience
>    in traditional datatypes, but did not spend too much effort to
>    combine these things. There is a top-level split into simple and
>    complex types, and then two separate worlds. The question 'of which
>    simple type is the text content in complex types' was never asked.
>    Also, it was not realized that 'string' is very special, because it
>    is the simple type for 'everything for which we don't have 
> a type yet'
>    and also the type of the actual representation of all the other
>    datatypes.

Fair enough, but I don't know that it is the RDF Core WG's task
to clarify XML Schema, even if we had time to try, which we don't.

This is perhaps an issue for the TAG.


Patrick Stickler
Nokia, Finland
Received on Monday, 7 July 2003 04:21:44 UTC

This archive was generated by hypermail 2.4.0 : Friday, 17 January 2020 20:24:23 UTC