
Re: Updated summary from I18N on 'XML Literals'

From: Brian McBride <bwm@hplb.hpl.hp.com>
Date: 10 Jul 2003 10:03:44 +0100
To: Martin Duerst <duerst@w3.org>
Cc: rdf core <w3c-rdfcore-wg@w3.org>, w3c-i18n-ig@w3.org
Message-Id: <1057827824.2734.7.camel@dhcp-91-136.hpl.hp.com>

Hi Martin,

Thank you for this.  I'm sure it will be helpful in understanding your
point of view.

However, I note with some frustration that, after three weeks of
discussion, I still don't have a single use case illustrating
significant harm done to the cause of internationalization by RDFCore's
post-last-call decision.

As things stand, RDFCore has no case to answer.


On Thu, 2003-07-10 at 00:35, Martin Duerst wrote:
> As promised yesterday, here is an updated list of arguments
> from the I18N side on the issues surrounding 'XML Literals',
> and our understanding behind it.
> Many thanks to many of you for helping me to understand
> how to better explain things; anything that is not yet
> clear is clearly my fault.
> The rationale is split into various parts, such as technical
> and procedural arguments.
>
> Technical Arguments
> -------------------
> One of the needs in internationalization is the need for what we
> may call 'micro-markup'. Although this is not necessarily very
> frequent, it turns up repeatedly in all kinds of instances such as:
>    - multilingual text fragments
>    - bidirectionality
>    - ruby
>    - special glyph variants
>    - translation hints for localization
> There are other cases where similar micro-markup is desirable,
> such as using mathematical or chemical formulae in text.
> In the RDF context, the typical example for this is a book title.
> The RDF M&S specification uses a Math example, and not an i18n
> example, mainly because it was felt that a Math example was easier
> for a wide audience to understand.
> The need for micro-markup for i18n is not limited to RDF at all.
> We have had long discussions with XML Schema about it, which have
> led to the 'text' type described in the XML Schema primer (see 
> http://www.w3.org/TR/xmlschema-0/#any), which ended up in the
> XML Schema type library (see
> http://www.w3.org/2001/03/XMLSchema/TypeLibrary.xsd
> http://www.w3.org/2001/03/XMLSchema/TypeLibrary-text.xsd
> http://www.w3.org/2001/03/XMLSchema/TypeLibrary-nn-text.xsd,
> you may have to do 'view source' to get the full picture).
> We have also made review comments to other WGs requesting that
> elements rather than attributes be used for anything that looks
> like natural text rather than enumerations, numeric values, and
> the like, in order to be able to use micro-markup. It is also
> leading to changes from XHTML 1.0 to XHTML 2.0.
> The need for micro-markup in various ways makes it quite important
> that 'text' and 'text-with-markup' are not seen as two completely
> different things, but that text-with-markup, to whatever extent
> possible, be seen and handled as an extension of text. This is even
> more important because the need for micro-markup may often not
> appear for very long, in particular if mainly data of particular
> languages is handled. Therefore, it should be as natural as possible
> to make the transition from plain text to text with micro-markup.
> All this is not just an issue from the view of RDF/XML (Pat's
> view X), but very much also, or actually first and foremost,
> applies to the model and the graph (Pat's view G). The best thing
> would be the way M&S treats literals, with language information,
> and with literals containing markup as a natural extension of
> literals with only plain text. Please note that this does not at
> all preclude other usages of XML content in RDF literals, be it
> larger textual pieces (typical example is documentation) or,
> as some people in this discussion have, to my surprise, suggested,
> as blobs of data-oriented XML. There is no need for every XML literal
> to have a language in the Graph (we just want it to be possible to
> have one), and in RDF/XML, xml:lang="" does the job.
> We understand that there are some implementation problems, primarily
> motivated by the desire to store plain strings as strings without
> escaping, but we see this as an implementation issue, not as a
> reason for making plain literals and XML literals completely
> different things, and having them behave completely differently.
> We see parseType='Literal' as what it says, namely an instruction
> about how to parse RDF/XML, similar to the other values of parseType,
> and not as something that should be directly reflected in the
> graph.
> From a user's point of view, users familiar with XML will be surprised
> that the xml:lang attribute does affect plain literals (as the XML 1.0
> REC says) but does not affect XML literals. Why are the rules for XML
> switched off just for XML?
> The idea that applications use an application-specific wrapper element
> is in conflict with the idea of micro-markup, which should be used
> only when necessary, and with the concept of markup integrity,
> which means that in some way or another, markup may always be
> significant, and changing it may be inappropriate.
>
> Procedural
> ----------
> - It is our understanding that RDF Core was chartered with
>    clarifying the RDF M&S spec, not changing it. Already by
>    separating plain literals and XML literals, and much more
>    by removing language information from XML literals, the
>    new spec is a clear change from M&S, rather than a
>    reinterpretation.
> - We agreed in Cannes that the ambiguity in M&S that RDF applications
>    may or may not consider language information would be resolved
>    so that the RDF graph would provide the language information.
> - Later, RDF Core asked us about the problem of integrating
>    arbitrary pieces of XML without language information into
>    an RDF/XML document. The same problem was brought up by
>    XML Signature (or was it encryption) and SOAP. The I18N
>    WG recognized this problem, checked with the experts on
>    language tagging standards, and recommended to XML Core
>    to issue an erratum to define xml:lang="" for this case,
>    which they did.
> - Later, RDF Core asked about the applicability of language
>    information to datatypes such as (XML Schema) integer.
>    We told them that these were designed as language- and
>    locale-independent datatypes, and so it would be appropriate
>    to specify that they did not carry language information.
> - Although this was rather implicit (in the sense of a common
>    understanding that didn't have to be made explicit), I think
>    neither side ever assumed that removing language information
>    from XML Schema simple datatypes would affect plain literals
>    or XML literals.
> - After last call, RDF Core asked us whether we would be okay
>    with removing language information from XML literals. It was
>    nice for them to ask, but it also clearly indicates that they
>    understood it to break our previous agreement. We had a look
>    at it and decided that, for the reasons explained above, it
>    would not be okay. It also helped us to understand that the
>    RDF M&S design for literals had been changed rather substantially,
>    with undesired consequences for internationalization, and that
>    ideally, more than just putting language information back on
>    XML literals was needed, but that if really necessary, we
>    could live with only that change back (to the last call state).
>
> The bigger picture
> ------------------
> Some people have said that the M&S solution for dealing with
> language information is not very good. We don't deny that there
> may be other, potentially better solutions. We also would have
> been ready (and would still be ready), I guess, to discuss such
> solutions with RDF Core. But we are not ready to trade something
> that we have currently (M&S, or with somewhat less satisfaction,
> the last call status) with something that is according to our
> understanding clearly less consistent and less useful, with
> or without the potential to 'get things fixed' in the future.
> Regards,    Martin.
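
For readers unfamiliar with the xml:lang scoping rule that the quoted message refers to, the following is a minimal sketch of how the effective language is inherited in plain XML, and how xml:lang="" resets it for an embedded fragment. It illustrates only the XML 1.0 rule, not the behaviour of any RDF parser; the element names (doc, title, span, blob, n) and the helper effective_langs are hypothetical and chosen purely for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the predefined xml: prefix under this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def effective_langs(xml_text):
    """Return (tag, effective language) pairs in document order.

    An empty string means "no language information", which is also
    what an explicit xml:lang="" declares for an element and its content.
    """
    root = ET.fromstring(xml_text)
    result = []

    def walk(elem, inherited):
        # An element's own xml:lang overrides the inherited value;
        # otherwise the value in scope from the parent applies.
        lang = elem.get(XML_LANG, inherited)
        result.append((elem.tag, lang))
        for child in elem:
            walk(child, lang)

    walk(root, "")
    return result


# Hypothetical document: an English title with a Japanese span
# (micro-markup), plus a data-oriented fragment that opts out of
# the surrounding language with xml:lang="".
DOC = (
    '<doc xml:lang="en">'
    '<title>Tale of <span xml:lang="ja">Genji</span></title>'
    '<blob xml:lang=""><n>42</n></blob>'
    '</doc>'
)

if __name__ == "__main__":
    for tag, lang in effective_langs(DOC):
        print(tag, repr(lang))
```

In this sketch the title inherits "en" from its parent, the span overrides it with "ja", and the blob subtree carries no language at all because of xml:lang="".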
Received on Thursday, 10 July 2003 05:07:47 UTC
