Re: I18n and Linked Data - an important (but fixable) omission?

On Sat, Sep 10, 2011 at 12:07:00PM +0900, "Martin J. Dürst" wrote:
> It's unfortunately somewhat hidden, so you may not be aware of it, but
> in terms of technology, in particular RDF, what it calls "URIs" are
> actually IRIs. Please have a look at
> http://www.w3.org/TR/rdf-concepts/#section-Graph-URIref.
> 
> So (very roughly paraphrased), to write something like "Linked Data
> uses URIs (ASCII only), but you might also think about supporting
> IRIs", would be wrong. It would be better to write something along
> the lines "Linked Data uses URIs. By definition, this includes IRIs
> (see Section 6.4 of RDF Concepts)."

I like this.  The following section is particularly helpful in 
clarifying the issue.  Comments interspersed...

> >...or indeed, whether the advocates of IRIs advocate their use in libraries
> >regardless of scripts used -- i.e., even for Latin-script URIs?
> 
> I'm not sure I understand your question here.
> 
> If you want to ask whether libraries e.g. in the US should make sure
> that the semantic web technology and products they use conforms to
> the specifications and does not limit identifiers to US-ASCII only,
> then the answer would be clearly YES.
> 
> If your question is whether a library e.g. in the US should use a
> URI or an IRI for an identifier such as
> http://en.wikipedia.org/wiki/Football, the answer is that this is
> irrelevant; by definition, all URIs are also IRIs.
> 
> If your question is whether such a library, for such an identifier,
> should add non-ASCII characters to make it an IRI but not a URI,

Re: "an IRI but not a URI", see my comment below.

> e.g. by modifying the above URI/IRI to something like
> http://en.wikipedia.org/wiki/Fóòtbåll, then the answer is of course
> NO. I hope nobody advocates such nonsense.

Based on my own misunderstanding of IRIs, that had indeed been my question.

> If your question is whether a library e.g. in Germany or France or
> Italy, where the languages used are written with the Latin script
> including diacritics, should create IRIs that are not URIs as

As above, perhaps you mean "IRIs that are not US-ASCII-only URIs", since by
definition all IRIs are also URIs, if I correctly understand?

> identifiers, then this may depend on various circumstances, i.e. the
> availability and familiarity of people with US-ASCII fallbacks,...
> As an example, the German Wikipedia has
> http://de.wikipedia.org/wiki/Fußball, but this is also available
> under http://de.wikipedia.org/wiki/Fussball.
> (Please note that I wrote 'create'; for 'use', the answer is
> different because identifiers may come from the outside without a
> choice.)

I think this can be handled without confusing the average reader and will
propose a wording.
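
As a rough illustration of the distinction discussed above, here is a
minimal Python sketch. The helper names is_ascii_only and iri_to_uri are
hypothetical and come from neither message; the mapping assumes the
general approach of RFC 3987, Section 3.1 (percent-encode each non-ASCII
character as UTF-8) and skips the optional special handling of host
names.

```python
from urllib.parse import quote


def is_ascii_only(identifier: str) -> bool:
    """True if the identifier contains only US-ASCII characters.

    This is only a character-range check; it does not validate URI syntax.
    """
    return identifier.isascii()


def iri_to_uri(iri: str) -> str:
    """Percent-encode every non-ASCII character as UTF-8.

    ASCII characters are left untouched, so an ASCII-only identifier
    comes back unchanged.
    """
    return "".join(
        ch if ord(ch) < 128 else quote(ch, safe="")
        for ch in iri
    )


# An ASCII-only identifier is already usable as both a URI and an IRI.
print(is_ascii_only("http://en.wikipedia.org/wiki/Football"))  # True

# The German example contains a non-ASCII character.
print(is_ascii_only("http://de.wikipedia.org/wiki/Fußball"))   # False
print(iri_to_uri("http://de.wikipedia.org/wiki/Fußball"))
# http://de.wikipedia.org/wiki/Fu%C3%9Fball
```

Note that the percent-encoded form is a different string from the
"Fussball" fallback mentioned above; that fallback is an alias the site
chooses to provide, not something the mapping produces.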

Tom

-- 
Tom Baker <tom@tombaker.org>

Received on Saturday, 10 September 2011 18:00:41 UTC