JJC's take on I18N concerns

From: Jeremy Carroll <jjc@hpl.hp.com>
Date: Wed, 13 Aug 2003 21:32:59 +0300
To: w3c-i18n-ig@w3.org, w3c-rdfcore-wg@w3.org
Cc: swick@w3.org, timbl@w3.org, sandro@w3.org, djweitzner@w3.org
Message-Id: <200308132131.05894.jjc@hpl.hp.com>


This is a reply to

http://lists.w3.org/Archives/Member/w3c-i18n-ig/2003Aug/0022

which was (AFAICT) endorsed by I18N at their recent telecon.

I am also copying the recipients of
http://lists.w3.org/Archives/Member/w3c-archive/2003Aug/0027
(other than those who I believe are already on the To lists)


[[
1. The current approach fails to preserve markup integrity for XML
literals that have been scraped or obtained from another repository.
I18N is not convinced that there will not be use cases where markup
integrity is important, and that the current approach will amount to an
insuperable issue in those situations.
]]

A simple reversible algorithm for the XHTML family is:
- take the XML fragment
- take the enclosing language tag
- wrap the XML fragment in a span element if legal, or otherwise in a div
element
- apply the language tag to the wrapping element

This algorithm needs to be applied systematically. In particular, it must be
applied even to XML content consisting of precisely a span or a div element.
This then ensures that the algorithm is reversible. Given reversibility, markup
integrity can be preserved.
For non-XHTML markup, see 3.
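
As a rough illustration only, the wrap/unwrap steps above might be sketched
in Python as follows. The function names, the `inline` flag (a stand-in for
the "span if legal" check, which the caller is assumed to decide), and the
exact serialization of the wrapper are assumptions made for this sketch, not
anything RDF Core has specified. It also assumes the language tag contains no
characters that need escaping in an attribute value.

```python
import re
from typing import Tuple

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Matches exactly the wrapper produced by wrap() below: one outer span or div
# carrying the XHTML namespace and xml:lang, and nothing else.
_WRAPPER = re.compile(
    r'<(span|div) xmlns="http://www\.w3\.org/1999/xhtml" '
    r'xml:lang="([^"]*)">(.*)</\1>',
    re.DOTALL,
)


def wrap(fragment: str, lang: str, inline: bool = True) -> str:
    """Wrap an XML fragment in a span (or div) carrying the language tag.

    The wrapper is always added, even when the fragment is already a single
    span or div; that systematic application is what makes unwrap() safe.
    """
    tag = "span" if inline else "div"
    return f'<{tag} xmlns="{XHTML_NS}" xml:lang="{lang}">{fragment}</{tag}>'


def unwrap(literal: str) -> Tuple[str, str]:
    """Invert wrap(): return (original fragment, language tag)."""
    match = _WRAPPER.fullmatch(literal)
    if match is None:
        raise ValueError("literal was not produced by wrap()")
    return match.group(3), match.group(2)
```

Because the outer element is always the one added by `wrap`, stripping exactly
one outer element recovers the original fragment byte for byte, including the
case where the fragment was itself a lone span or div.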

[[
2. I18N feels that the currently proposed implementation is overly
complicated for the user, and that this will introduce a strong risk
that users do not implement language information properly.
]]

RDF Core had feedback against other implementations on grounds of their
complexity. This was a tradeoff decision.

[[
3.  The current approach assumes the existence of constructs to describe
language and carry language information in the native markup associated
with a fragment.  Such constructs may not exist, in which case it seems
impossible to ascribe such information at a meta level.  I18N feels that
such a situation is very bad.
]]

RDF Core only has compelling use cases for XHTML and friends.
A markup intended to carry natural language without the ability to use XHTML 
constructs and without the ability to add arbitrary language markup is 
deficient, and RDF Core is not tasked with correcting those deficiencies.

[[
4. It seems to I18N that it will be difficult to convert rdf created
using the old syntax to the new syntax. Where legacy documents simply
declared xml:lang at the top of the file, they will now have to declare
it for every XML literal.  Also, there is no provision for automatic
conversion from the old to the new syntax.
]]

The old style was vague: there was no indication that the XHTML namespace
needed declaring (perhaps it predated XHTML?). It was not really usable
because of such problems, certainly not in a portable fashion. The old spec is
sufficiently bad to make this problem a non-starter, since it is not clear
what old-style XML literals are supposed to mean, particularly regarding the
treatment of namespaces. The old spec was also somewhat unclear about how
language was supposed to be treated.

[[
5. I18N considers that it should be possible to conclude that a plain
literal and an XML literal without markup are the same text. Introducing
language markup as proposed in the current solution makes this
impossible, since it is never clear whether the markup was in the
original text or not.
]]
This seems like a more sophisticated, language-string-oriented feature that
belongs near the postponed issue
http://www.w3.org/2000/03/rdf-tracking/#rdfs-lang-vocab

I think RDF Core could consider broadening the scope of the postponed issue.

[[
6. I18N has not been convinced that either of the alternative proposals
for including language information are problematic, and feels they are
more intuitive and workable than the current proposal because they do
not entail the problems cited above.
]]
I think this is answered by Sandro:
http://lists.w3.org/Archives/Public/www-archive/2003Aug/0004
[[
There's a serious concern that people who don't care about XML wont
bother to implement these bits if they are bolted onto to the side
like that.  As just another datatype, it fits in smoothly, with no
particular extra work required.  (except for that language tag...)
Would you rather many implementations not support XML at all?
(Perhaps not really a fair question....)
]]

Jeremy
Received on Wednesday, 13 August 2003 15:33:30 EDT
