
RE: URIs and i18n

From: Jean-Guilhem Rouel <jean-gui@w3.org>
Date: Tue, 29 Jul 2008 21:33:00 +0200 (CEST)
Message-ID: <49789.82.122.98.178.1217359980.squirrel@webmail.sophia.w3.org>
To: "Richard Ishida" <ishida@w3.org>
Cc: www-international@w3.org, "'Jean-Guilhem Rouel'" <jean-gui@w3.org>

> I think it is important to note that Jean-Gui plans (if I correctly
> understood his plans from an off-line discussion I had with him) to allow
> people to enter their name into his application in native script, but will
> provide an additional field for them to input an ascii-only version of the
> name that will be used in the URIs.

Initially, the plan was to have users enter their name only in native script
and to convert it to ASCII automatically, because users will be (at least at
first) European. My first thought was that European scripts were close
enough to ASCII to allow such automatic transliteration. I now realize that
this was quite naive and that I was probably wrong. Allowing users to enter
their name both in native script and in ASCII is probably better. It would
then be possible to use both in URIs/IRIs.
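
A minimal sketch of why the automatic approach falls short, assuming Python
and its standard unicodedata module (the thread does not say what the
application is written in; the function name naive_ascii_fold and the sample
names are made up for illustration). It decomposes each character and drops
whatever is not ASCII, which works for accented letters but silently loses
letters that have no decomposition:

```python
import unicodedata


def naive_ascii_fold(name: str) -> str:
    """Naive ASCII folding: decompose (NFKD), then drop every non-ASCII code point.

    Hypothetical helper for illustration only, not the application's code.
    """
    decomposed = unicodedata.normalize("NFKD", name)
    # Combining accents and any letter without an ASCII base are discarded.
    return decomposed.encode("ascii", "ignore").decode("ascii")


if __name__ == "__main__":
    for sample in ["Jérôme", "Müller", "Søren", "Łukasz", "Strauß"]:
        print(sample, "->", naive_ascii_fold(sample))
```

"Jérôme" comes out as "Jerome", but "Søren" becomes "Sren" and "Łukasz"
becomes "ukasz", because ø and Ł do not decompose into an ASCII letter plus
an accent. "Müller" becomes "Muller" even though the person may prefer
"Mueller", which only they can tell us.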

> [...] One question in my mind is why people should be reading/writing the
> URI, rather than using the application's power to automatically
> obtain/present/package the needed information.

My hope for the tool is that people won't have to read/write URIs. If they
have to, I will consider that I have failed in designing it. My concern was
that many people like to see/understand URIs and to be able to type them
directly (perhaps another wrong assumption, but at least I like to; am I a
geek? :)). That's why I'd like to present nice URIs to users.

> [...]
> So I think that asking people, when they supply their name, to provide it
> both in native format and in their preferred ascii-only spelling is
> probably the best way. Then both forms of the name should be available
> whenever a name is used.  This means that if people are expected to
> read/write URIs, the IRI and the ascii-only URIs should be equivalent.

I think that's what I'll do; it seems to be the best solution.
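
One way this could look, as a sketch only: assuming Python, a hypothetical
base URL (http://example.org/people/) and hypothetical helper names
(person_links, build_index), none of which come from the thread. The native
name gives the IRI, its UTF-8 percent-encoded form gives the URI that
corresponds to that IRI, and the user-supplied ASCII spelling gives a second
URI. Both spellings are indexed to the same record so that they are
equivalent for lookup:

```python
import unicodedata
from urllib.parse import quote

# Hypothetical base URL, used only for this illustration.
BASE = "http://example.org/people/"


def person_links(native_name: str, ascii_name: str) -> dict:
    """Build the IRI, its percent-encoded URI form, and the ASCII-only URI."""
    # Normalize to NFC so the same name always yields the same code points.
    native = unicodedata.normalize("NFC", native_name)
    return {
        "iri": BASE + native,
        # quote() percent-encodes the UTF-8 bytes of non-ASCII characters.
        "iri_as_uri": BASE + quote(native, safe=""),
        "ascii_uri": BASE + quote(ascii_name, safe=""),
    }


def build_index(people) -> dict:
    """Map both spellings of each name to the same record id.

    `people` is an iterable of (record_id, native_name, ascii_name) tuples.
    """
    index = {}
    for record_id, native_name, ascii_name in people:
        index[unicodedata.normalize("NFC", native_name)] = record_id
        index[ascii_name] = record_id
    return index


if __name__ == "__main__":
    links = person_links("Jérôme", "Jerome")
    print(links["iri_as_uri"])  # http://example.org/people/J%C3%A9r%C3%B4me
    print(links["ascii_uri"])   # http://example.org/people/Jerome
```

The sketch ignores collisions (two people choosing the same ASCII spelling),
which a real application would have to handle.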

> Note that this can also be extended to postal addresses, company names,
> etc. People should be able to choose to look up, say, a Russian company
> address in either local (Cyrillic) or international (Latin) formats, so
> you need to collect and store both forms.

For this specific app, I don't think that will be necessary as we don't
store addresses or anything else that would need to be written in native
script.

> This also ties in with what I've been saying for years now about the
> general use of IRIs.  I think you should register two domain names, one in
> native script and one in ascii only, if you want your URIs to be used
> internationally as well as by your home market/user base.  I certainly
> encourage the use of IRIs for local use, but there needs to be an
> alternative for others if your URI is exposed to other cultural/linguistic
> groups.

It's a tool for a European project, so I don't think I need two domains,
but the scientists (yes, users will be researchers) can be from all
around the world, working for European institutes. Allowing people to use
both native and ASCII is quite cheap to do.

Thanks all for your input.
Jean-Gui

PS: I won't have a computer for the next two weeks, so I'm unlikely to
participate in the discussion during that time, but I'll check the thread as
soon as I'm back from vacation.
Received on Tuesday, 29 July 2008 19:33:36 GMT
