Re: Change in definition of RDF literals

Brian McBride wrote

> I couldn't see how it related to I18N issues.  


I will try and be on-topic, and I'll get off to a good start by digressing.


In Budapest, I spoke with the BBC World Service website people.
They have material in 43 languages in an XML workflow in their website 
production system.

I was thinking about the rdf:XMLLiteral issue and their application.

1. Mapping their data to RDF would suggest 43 different subgraphs each 
hanging off a blank node labelled with the correct language.

2. Within the Italian subgraph we would expect all/most plain literals to 
be marked with language tag: it

3. An XMLLiteral (always XHTML?) in this subgraph would need to have <span 
xml:lang="it">  </span> surrounding it.

4. When searching the whole thing for an Italian word "domani"@it we could:
   a) find the italian subgraph and search that for "domani"
  or
   b.1) search for all plain literals with language range @it-* containing 
domani as a substring
   and
   b.2) search for all XMLLiterals with xml:lang="it-*" in them and domani 
occurring in scope.
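
To make 4.b.1 and 4.b.2 concrete, here is a rough Python sketch (standard 
library only). The function names and the data layout (plain literals as 
(text, language tag) pairs, each XMLLiteral as a single well-formed element) 
are my own assumptions for illustration, not anything from the RDF specs. It 
also does not find a word that is split across inline elements.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def lang_matches(tag, prefix):
    """True if tag is prefix or a subtag of it (e.g. "it", "it-CH")."""
    if not tag:
        return False
    tag, prefix = tag.lower(), prefix.lower()
    return tag == prefix or tag.startswith(prefix + "-")


def search_plain(literals, word, prefix):
    """4.b.1: plain literals are (text, language tag) pairs."""
    return [text for text, lang in literals
            if lang_matches(lang, prefix) and word in text]


def xml_contains_in_scope(xml_text, word, prefix):
    """4.b.2: is word in character data whose xml:lang in scope matches?"""

    def walk(elem, inherited):
        # xml:lang is inherited unless the element overrides it.
        lang = elem.get(XML_LANG, inherited)
        in_scope = lang_matches(lang, prefix)
        if in_scope and elem.text and word in elem.text:
            return True
        for child in elem:
            if walk(child, lang):
                return True
            # Tail text follows the child but belongs to this element's scope.
            if in_scope and child.tail and word in child.tail:
                return True
        return False

    return walk(ET.fromstring(xml_text), None)
```

With this, a literal such as 
<span xml:lang="it">ciao <span xml:lang="en">domani</span></span> is not 
reported for "it", because the word sits inside the nested English span.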



Looking at the decision about language tags and XMLLiterals by itself, 
initially I thought the decision looked bad because 4.b.1 would no longer 
work for XMLLiterals. On second thoughts, it seemed that 4.b.2 is the 
correct algorithm, and always was: to operate on language within an 
XMLLiteral it is necessary to treat the XMLLiteral as XML, not merely as a 
text string. That seems fundamental to the problem, and the WG decision 
avoids encouraging implementors to half-solve the problem.
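
A small illustration of the half-solution, as a hedged Python sketch: the 
literal below is a made-up example, and the substring test stands in for any 
code that treats the XMLLiteral as a plain string.

```python
import xml.etree.ElementTree as ET

# Hypothetical XMLLiteral: "domani" occurs only inside an attribute value.
literal = '<span xml:lang="it" class="domani">ciao</span>'

# Treating the literal as a text string matches markup as well as content.
naive_hit = "domani" in literal

# Treating it as XML: look only at the character data of the parsed tree.
text = "".join(ET.fromstring(literal).itertext())
xml_hit = "domani" in text
```

Here naive_hit is true and xml_hit is false, since the only occurrence of 
the word is in markup rather than in the text.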

Martin appears to be advocating treating XMLLiterals just like a text 
string, which, in the end, is likely to encourage incorrect code.


Jeremy

Received on Wednesday, 28 May 2003 05:19:40 UTC