
RE: How to organize a multilingual website

From: Addison Phillips [wM] <aphillips@webmethods.com>
Date: Tue, 10 Feb 2004 15:59:57 -0800
To: "Richard Ishida" <ishida@w3.org>, <www-international@w3.org>
Message-ID: <PNEHIBAMBMLHDMJDDFLHEEAOHKAA.aphillips@webmethods.com>

At webMethods, our servers work a bit differently. We use "silos", but...

The server keeps the current user language preference in the "session" object (the state mechanism associated with a particular user). For each request, the server searches the most appropriate silo for the file, falling back through the language hierarchy until it reaches the "default" resource (in our case, generally US English).

Thus, a browser with an Accept-Language of "de-CH-1996" requesting page "/boo/foo.html" issues a request that might look like:

http://www.example.com/boo/foo.html

The server actually tries to load:

{webroot}/de-ch-1996/boo/foo.html
{webroot}/de-ch/boo/foo.html
{webroot}/de/boo/foo.html
{webroot}/boo/foo.html
{404}
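
As a rough illustration of that lookup order (a sketch only, not the webMethods implementation), the candidate list can be built by dropping one trailing subtag per step. The function name fallback_chain and the lowercasing of silo names are my assumptions here, chosen to match the paths listed above:

```python
def fallback_chain(language_tag, path):
    """Return candidate paths relative to the web root, most specific silo first.

    Hypothetical helper: assumes silo directories are named after the
    lowercased language tag, as in the list of paths above.
    """
    # Ignore empty subtags so an empty tag yields only the default resource.
    subtags = [s for s in language_tag.lower().split("-") if s]
    relative = path.lstrip("/")

    candidates = []
    # Drop one trailing subtag per step: de-ch-1996 -> de-ch -> de
    for end in range(len(subtags), 0, -1):
        silo = "-".join(subtags[:end])
        candidates.append(silo + "/" + relative)

    # Last resort before a 404: the default (unlocalized) resource.
    candidates.append(relative)
    return candidates
```

Calling fallback_chain("de-CH-1996", "/boo/foo.html") returns the four paths shown above, in the same order, each relative to {webroot}.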

This means that the links in translated files can stay exactly as they are. Relative links work. Absolute links work. This makes the localization process loads easier (no more mangled links--let the TM protect them). The localized language sets are delivered as a single JAR file that is unpacked at install time and requires no changes to the configuration of any part of the running software (it can even be delivered to a RUNNING server).

For shared files: you can sparsely populate any of the languages. The first match is returned. This means that shared files like .js don't have to exist multiple times (as in the pure silo model).
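
To make the "first match is returned" behaviour concrete, here is a minimal sketch of a resolver over a sparsely populated tree. The name resolve, the use of a plain directory as the web root, and returning None to stand in for a 404 are all assumptions for illustration; a real server would also need to guard against paths that escape the web root:

```python
from pathlib import Path


def resolve(webroot, language_tag, path):
    """Return the first existing file along the language fallback chain.

    Hypothetical helper: returns None when nothing matches, which a
    server would turn into a 404.
    """
    subtags = [s for s in language_tag.lower().split("-") if s]
    relative = path.lstrip("/")
    root = Path(webroot)

    # end == 0 is the final pass: no silo, i.e. the default resource.
    for end in range(len(subtags), -1, -1):
        silo = "-".join(subtags[:end])
        candidate = root / silo / relative if silo else root / relative
        if candidate.is_file():
            return candidate  # first match wins
    return None
```

With only {webroot}/de/boo/foo.html and {webroot}/boo/foo.html translated, a German user gets the "de" page, while a request for a shared script such as /boo/u.js falls straight through to the single copy under the default root.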

This is the equivalent of your [5], except that the content isn't modified in any way. Note that you can achieve a similar effect using a taglib or other active content mechanism: I have a JSP taglib that works exactly the same way without a webMethods server...

The downsides to this model are:

a) synchronization of source and target language files. A "fix" to the default page doesn't propagate automagically.
b) there are potentially faster mechanisms.

Best Regards,

Addison

Addison P. Phillips
Director, Globalization Architecture
webMethods | Delivering Global Business Visibility
http://www.webMethods.com
Chair, W3C Internationalization (I18N) Working Group
Chair, W3C-I18N-WG, Web Services Task Force
http://www.w3.org/International

Internationalization is an architecture. 
It is not a feature.

> -----Original Message-----
> From: www-international-request@w3.org
> [mailto:www-international-request@w3.org]On Behalf Of Richard Ishida
> Sent: Tuesday 10 February 2004 14:35
> To: 'Richard Ishida'; www-international@w3.org
> Subject: RE: How to organise a multilingual website
> 
> 
> 
> A fourth and fifth model came to mind as I drove home from the office.  
> 
> [4] eliminates advantages for localization, but offers 
> significant linking benefits when linking to files on a server.
> 
> The arrangement of files is as for model [1] (see below), (ie. 
> all localised versions of a given file in the same place) but all 
> internal links explicitly reference a specific language version.  
> For example, to link from index.fr.html to conversion.fr.html 
> your href would be "conversion.fr.html" (whereas in model 1 that 
> would have been simply "conversion").  This would mean that all 
> links would need to be 'translated'.  So why do it?  Because 
> let's suppose that conversion.html could be linked to directly 
> from outside the site - you'd want to have a single URI and 
> enable content negotiation. If conversion.html and  
> conversion.fr.html were in different directories (as they are in 
> the other models) this wouldn't work.
> 
> The explicit links enable the internal links to work equally well 
> from a CD.
> 
> The motivation behind this model appears to provide a good reason 
> for avoiding any silo based approach.
> 
> 
> [5] comes at the problem from a different angle.  This model uses 
> exactly the same approach as model [1] for accessing files on the 
> server, but before files are accessed from CD some process is run 
> to convert all links to explicit, language-specific ones. 
> 
> This seems to be the only approach that reaps the benefits all 
> round. Whether you supply the server version or the CD version 
> for localisation, they don't need to change any links. Content 
> negotiation works on the server. Converted links also work on the 
> CD.  You just need to be able to convert between the two.
> 
> 
> 
> RI
> 
> 
> 
> > -----Original Message-----
> > From: www-international-request@w3.org 
> > [mailto:www-international-request@w3.org] On Behalf Of Richard Ishida
> > Sent: 10 February 2004 13:36
> > To: www-international@w3.org
> > Subject: How to organise a multilingual website
> > 
> > 
> > 
> > I am exploring alternatives for organising static (ie. Not dynamically 
> > built from a database) multilingual files comprising a site.  I'd 
> > welcome any comments and suggestions.
> > 
> > Let's assume a concrete situation, to help us be practical.
> > I have a bunch of static files on an Apache server that can
> > be viewed in English or French. The base URL is 
> > http://www.example.org/uniview/ Some files are not language 
> > sensitive: this includes .js, and .gif files.  There is one 
> > image, however, which has to be different in the French 
> > version. All internal links are relative.
> > 
> > I'm assuming that,
> > 1. to help avoid errors and reduce time during the localization 
> > process, we'd prefer to avoid the need to change links during 
> > translation
> > 2. to save server space and facilitate maintenance we'd 
> > prefer not to have duplicate copies of the same file
> > 3. we'd want to be able to do language negotiation for files on the
> > server, using the Accept-Language information in the HTTP header, to
> > get people to the right starting point.
> > 
> > Let's also assume that there are two possible usage scenarios:
> > 1. the files are read from the server
> > 2. the files are read from server or CD
> > 
> > I put together a few models for discussion. The tabbed lists show 
> > possible directory structures. The files local.xx.css import the file 
> > shared.css and add any language overrides or extensions needed.
> > 
> > 
> > [1] Content negotiation
> > -----------------------
> > uniview
> > 	index.html
> > 	index.fr.html
> > 	conversion.html
> > 	conversion.fr.html
> > 	u.js
> > 	u.fr.js
> > 	descriptions.js
> > 	descriptions.fr.js
> > 	local.css
> > 	local.fr.css
> > 	shared.css
> > 	functions.js
> > 	conversion.js
> > 	images
> > 		img1.gif
> > 		img2.gif
> > 		img2.fr.gif
> > 
> > This seems to be the simplest model.  All links are specified without 
> > extensions. No links need translation.
> > 
> > Apache can do language negotiation for relative internal links. This, 
> > unfortunately, would not work if read from a CD, and the site would 
> > fail.
> > 
> > Users reading from the server could be automatically directed to the 
> > appropriate starting point based on their Accept-Language preferences. 
> > (CD users would have to specify the correct index file for any of 
> > these alternatives.)
> > 
> > Seems like this would be best for server-only sites; not an option 
> > though for CD-based sites.  Translation cost would be minimal.
> > 
> > 
> > 
> > 
> > 
> > [2] Silos
> > ---------
> > uniview
> > 	en
> > 		index.html
> > 		conversion.html
> > 		u.js
> > 		descriptions.js
> > 		local.css
> > 	fr
> > 		index.html
> > 		conversion.html
> > 		u.js
> > 		descriptions.js
> > 		local.css
> > 		images
> > 			img2.gif
> > 	shared
> > 		functions.js
> > 		conversion.js
> > 		shared.css
> > 		images
> > 			img1.gif
> > 			img2.gif
> > 
> > 
> > In this model all links are specified with standard extensions. No 
> > links need translation apart from those linking to 
> > /fr/images/img2.gif.
> > 
> > The relative internal links would work equally well on server or CD.
> > 
> > I can't see how you could automatically route people to the right 
> > starting point on the server based on their Accept-Language 
> > preferences.  Alternatives include:
> > A. link to the correct index explicitly
> > B. provide a (probably annoying) intermediate page to allow language
> > selection
> > C. link to invisible index pages, uniview/index.html and 
> > uniview/index.fr.html, that automatically redirect you to 
> > uniview/en/index.html or uniview/fr/index.html (seems a 
> > little complicated, but may work)
> > 
> > Seems like this would be fine for CD-based sites, but would require 
> > extra stuff to allow language negotiation to serve up the right index 
> > file. Translation cost would be minimal other than for the localised 
> > img file(s).
> > 
> > 
> > 
> > [3] Half & half
> > ---------------
> > uniview
> > 	index.html
> > 	index.fr.html
> > 	en
> > 		conversion.html
> > 		u.js
> > 		descriptions.js
> > 		local.css
> > 	fr
> > 		conversion.html
> > 		u.js
> > 		descriptions.js
> > 		local.css
> > 		images
> > 			img2.gif
> > 	shared
> > 		functions.js
> > 		conversion.js
> > 		shared.css
> > 		images
> > 			img1.gif
> > 			img2.gif
> > 
> > In this model all links except those to the index files would have 
> > standard extensions.  Links to the index files, and img2.gif files 
> > would need translation.  All links in the index files would need 
> > translation.
> > 
> > All relative internal links would work equally well on server or CD.
> > 
> > Language negotiation could be used on the server to point the person 
> > to the appropriate index file for starting.
> > 
> > Seems like this would work equally well on server- or CD-based sites. 
> > But translation complexity would be higher.
> > 
> > 
> > 
> > Are there any other models? Are these conclusions correct? Comments 
> > and suggestions welcome!
> > 
> > RI
> > 
> > 
> > ============
> > Richard Ishida
> > W3C
> > 
> > contact info: http://www.w3.org/People/Ishida/
> > 
> http://www.w3.org/International/ 
> http://www.w3.org/International/geo/ 
> 
> W3C Internationalization FAQs 
> http://www.w3.org/International/questions.html
> RSS feed: http://www.w3.org/International/questions.rss 
> 
Received on Tuesday, 10 February 2004 19:04:35 UTC
