I18n and Linked Data - an important (but fixable) omission?

Date: Fri, 9 Sep 2011 15:33:35 -0400
How about something like this? The first couple of sentences are taken from
the RFC 3986 and RFC 3987 abstracts almost verbatim and may need to be


A Uniform Resource Identifier (URI) is a compact sequence of characters
[in a standardized syntax] that identifies an abstract or physical
resource [RFC 3986]. An Internationalized Resource Identifier (IRI)
[RFC 3987] complements URIs by including characters from the Universal
Character Set (Unicode/ISO 10646). While this report follows the common
Linked Data practice of using the term "URI", readers should note the
increasing prominence of IRIs as non-Latin-script resources and
participants join the Linked Data environment.
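
To make the URI/IRI relationship concrete, here is a minimal sketch (not
part of the proposed report text) of how an IRI can be mapped to a plain
ASCII URI, roughly along the lines RFC 3987 describes: non-ASCII
characters in the path, query and fragment are percent-encoded as UTF-8,
and the host name is converted to its ASCII form. The function name
iri_to_uri and the example addresses are hypothetical; the sketch ignores
userinfo and IPv6 hosts, and it relies on Python's built-in "idna" codec,
which implements the older IDNA 2003 rules.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters left untouched when percent-encoding; "%" is kept so that
# escapes already present in the input are not encoded a second time.
_SAFE = "/:@!$&'()*+,;=-._~%?"


def iri_to_uri(iri: str) -> str:
    """Map an IRI to an ASCII-only URI (illustrative sketch only)."""
    parts = urlsplit(iri)

    # Host: convert non-ASCII labels to their ASCII ("xn--") form.
    host = parts.hostname or ""
    netloc = host.encode("idna").decode("ascii") if host else ""
    if parts.port is not None:
        netloc = f"{netloc}:{parts.port}"

    # Path, query, fragment: percent-encode non-ASCII characters as UTF-8.
    path = quote(parts.path, safe=_SAFE)
    query = quote(parts.query, safe=_SAFE)
    fragment = quote(parts.fragment, safe=_SAFE)

    return urlunsplit((parts.scheme, netloc, path, query, fragment))


if __name__ == "__main__":
    print(iri_to_uri("http://example.org/résumé"))
```

An identifier that is already plain ASCII passes through unchanged, which
is one way to see that every URI is also a valid IRI.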





On 9 Sep 2011, at 19:21, Tom Baker wrote:

On Fri, Sep 09, 2011 at 06:08:21PM +0100, Jodi Schneider wrote:

This complicates the section on Linked Data -- one of the key places I
we need to simplify. So I would propose reverting that change, so that this
paragraph focuses only on Linked Data -- the concept it is

I take your point that squeezing a reference to IRIs into a definition of
"Linked Data" interrupts the flow of that brief definition.

Then, if we do feel the need to cover URIs in the Scope section, I'd

	that we gave it its own line (similar to how we define

That sounds like a good solution. I think the section should end with
"Library Linked Data", so my preference would be to insert a new item
between "Linked Data" and "Open Data" -- i.e., right after "Linked Data".
Maybe it could be called "Uniform Resource Identifiers (URIs)", define
URIs, and refer also to IRIs.


+1 -- particularly your location proposal makes sense.




	Alternately we might want to put it in the "Available Technologies" Appendix
		section of the report: We have considerably simplified a number of issues in
		the main report.

	I think IRIs are important enough to emphasize up-front -- in the Appendix,
	the point would be much less prominent.

	While I'm not sure that the *term* "IRI" is that much harder to
		than "URI" (which is different from the "URL" which is in common practice),
		you make a good point that URIs, rather than IRIs, are currently emphasized
		in Linked Data. It would be helpful to know whether, for instance, the
		National Diet Library is currently using IRIs for Linked

	...or indeed, whether the advocates of IRIs advocate their use in libraries
	regardless of scripts used -- i.e., even for Latin-script URIs? Bottom line:
	since this report is about Linked Data, and the Linked Data message always
	talks about URIs (or even URLs), I think we need to stick with URIs. But we
	can and should draw attention to IRIs up-front. Inserting a separate item
	into Scope would do that even better than the solution I
	Would you like to propose a text?


Here's an attempt:

A URI--or Uniform Resource Identifier--is a string of characters used as
an identifier. ISBNs and Web URLs are both examples of URIs. While this
report follows common practice in emphasizing URIs, readers should note
the increasing role of Internationalized Resource Identifiers (IRIs)
<http://tools.ietf.org/html/rfc3987> as multilingual Web addresses
<http://www.w3.org/International/articles/idn-and-iri/> that support
non-Latin scripts.

Jeff? Others? Thoughts/improvements?




