RE: faq suggestions

Hi Søren,

It depends on the language negotiation strategy of the site.

For sites that use simple URL rewriting, web crawlers can find the alternate language resources on their own. In other words, if the English version lives at http://example.com/en and the French at http://example.com/fr, then crawlers can find the resources under those paths and index them, because the resources are still on the site. Language negotiation typically refers to server-side logic that returns a specific language version of a resource in response to a generic URI. Language-specific URIs still work in this case and retrieve that specific language.
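
To make that distinction concrete, here is a rough sketch in Python. It is not how any particular server does it, and the names SUPPORTED, DEFAULT, parse_accept_language and resolve are made up for illustration. It assumes the language is the first path segment, as in the /en and /fr example above. A generic path is negotiated from the Accept-Language header, while a language-specific path is returned untouched, which is why a crawler can reach it directly.

```python
# Hypothetical set of languages the site publishes, and its fallback.
SUPPORTED = ("en", "fr")
DEFAULT = "en"


def parse_accept_language(header):
    """Return the language tags of an Accept-Language header, best first."""
    prefs = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        q = 1.0  # a tag without a q-value counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat a malformed q-value as "not acceptable"
        prefs.append((q, tag.strip().lower()))
    # The sort is stable, so tags with equal q keep their header order.
    prefs.sort(key=lambda p: p[0], reverse=True)
    return [tag for q, tag in prefs if q > 0]


def resolve(path, accept_language=""):
    """Map a request path to a language-specific path."""
    first_segment = path.strip("/").split("/", 1)[0]
    if first_segment in SUPPORTED:
        # Specific URI: no negotiation, the named language is served.
        return path
    # Generic URI: pick the first preferred language the site has.
    for tag in parse_accept_language(accept_language):
        primary = tag.split("-", 1)[0]  # "fr-CA" -> "fr"
        if primary in SUPPORTED:
            return "/" + primary + "/" + path.lstrip("/")
    return "/" + DEFAULT + "/" + path.lstrip("/")
```

A crawler that sends no Accept-Language header and asks for /index.html would land on the default language here, but a request for /fr/index.html gets the French version regardless of the header.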

If the site uses extension-based language negotiation (as with Apache MultiViews), or if web crawlers cannot automagically find where the alternate languages live (perhaps they are generated dynamically), then things get more complicated. One way to handle it is to put a lot of extraneous META information in page headers (yuck). Another is to put a manual link page on the site that leads to the specific language versions. That page might be the one users already use to change their language manually, or it might just be a page intended to guide web crawlers into specific language "channels". Most web crawlers will both index directories and try to follow the links they find on a site, and you can assist this with ROBOTS meta tags that tell the crawler to follow the links on a page. Providing links that carry a specific language preference gives web crawlers something to index for each language version, and you can use otherwise "hidden" pages to help the robot along...
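
As an illustration of the link-page idea, here is a small Python sketch that generates such a page. The function name build_language_link_page and the "?lang=" query parameter are assumptions made up for the example; your site may carry the language preference in the URL some other way. The ROBOTS meta tag and the hreflang attribute are standard HTML.

```python
from html import escape


def build_language_link_page(base_url, languages):
    """Return an HTML page linking to each language version of base_url.

    languages is a list of (language code, display name) pairs. The
    "?lang=" parameter is a hypothetical way of putting the language
    preference into the link itself.
    """
    items = []
    for code, name in languages:
        url = base_url + "?lang=" + code
        items.append(
            '<li><a href="{0}" hreflang="{1}">{2}</a></li>'.format(
                escape(url), escape(code), escape(name)
            )
        )
    return "\n".join(
        [
            "<html>",
            "<head>",
            "<title>Choose a language</title>",
            # Ask robots to index this page and follow the links on it.
            '<meta name="robots" content="index, follow">',
            "</head>",
            "<body>",
            "<ul>",
        ]
        + items
        + ["</ul>", "</body>", "</html>"]
    )


if __name__ == "__main__":
    print(
        build_language_link_page(
            "http://example.com/index",
            [("en", "English"), ("fr", "Français")],
        )
    )
```

The same page can double as the one users visit to switch languages by hand, so it does not have to be hidden.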

Best regards,

Addison

Addison P. Phillips
Director, Globalization Architecture
webMethods | Delivering Global Business Visibility
http://www.webMethods.com
Chair, W3C Internationalization (I18N) Working Group
Chair, W3C-I18N-WG, Web Services Task Force
http://www.w3.org/International

Internationalization is an architecture. 
It is not a feature.

> -----Original Message-----
> From: www-international-request@w3.org 
> [mailto:www-international-request@w3.org]On Behalf Of Søren Roug 
> (by way of Martin Duerst <duerst@w3.org>)
> Sent: 6 October 2004 5:33
> To: www-international@w3.org
> Subject: faq suggestions
> 
> I just read your FAQ on language negotiation at 
> http://www.w3.org/International/questions/qa-when-lang-neg
> 
> My question is: with language negotiation, how will search engines ever 
> discover pages other than the ones in the default language?
> 
> --
> Sincerely yours / Med venlig hilsen, Soren Roug <soren.roug@eea.eu.int>
> European Environment Agency, Kongens Nytorv 6, DK-1050 Copenhagen K
> Tel: +45 3336 7212 Fax: +45 3336 7199 Jabber: roug@jabber.eionet.eu.int
> Invalid signature? http://www.eionet.eu.int/certificates/untrusted
> 
> 

Received on Thursday, 7 October 2004 01:01:06 UTC