Re: [DOM4] XML lang from Bjoern Hoehrmann on 2011-10-05 (public-webapps@w3.org from October to December 2011)

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Wed, 05 Oct 2011 21:27:10 +0200
To: Marcos Caceres <w3c@marcosc.com>
Cc: public-webapps <public-webapps@w3.org>
Message-ID: <ak7p871ldk4rhh0d04nbmuhkppvdduglmo@hive.bjoern.hoehrmann.de>

* Marcos Caceres wrote:
>1. I need to find elements of a particular type/name that are in a
>particular language (in tree order), so that I can extract that
>information to display to a user.

  .selectNodes("//type[lang('language')]")

>2. I need to check what the language of an element is (if any),
>without walking up the tree to look for an xml:lang attribute.
>Walking the tree is expensive, specially when XML says that xml:lang
>value is inherited by default.

  .selectSingleNode("ancestor-or-self::*[@xml:lang][1]/@xml:lang")

Where these methods are not supported you can use DOM Level 3 XPath,
and in the first case you can use the CSS Selectors API as well.
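
For illustration, here is a rough sketch of both queries through the DOM
Level 3 XPath evaluate() method. The function names, the resolver and the
numeric constants are my own assumptions for the example (the constants
mirror XPathResult.STRING_TYPE and XPathResult.ORDERED_NODE_SNAPSHOT_TYPE),
and the element name and language are assumed to be simple strings without
quote characters, since they are pasted into the expression unescaped.

```javascript
// Namespace URI bound to the reserved "xml" prefix.
const XML_NS = "http://www.w3.org/XML/1998/namespace";

// Numeric values of XPathResult.STRING_TYPE and
// XPathResult.ORDERED_NODE_SNAPSHOT_TYPE in DOM Level 3 XPath.
const STRING_TYPE = 2;
const ORDERED_NODE_SNAPSHOT_TYPE = 7;

// Hypothetical resolver: binds the "xml" prefix used in the expression.
function xmlResolver(prefix) {
  return prefix === "xml" ? XML_NS : null;
}

// Case 2: value of the nearest xml:lang on the element or one of its
// ancestors, or "" when no xml:lang is in scope.
function inheritedLang(doc, element) {
  const expr = "string(ancestor-or-self::*[@xml:lang][1]/@xml:lang)";
  return doc.evaluate(expr, element, xmlResolver, STRING_TYPE, null)
    .stringValue;
}

// Case 1: elements called `name` whose language matches `lang`,
// returned as an array in document order.
function elementsByNameAndLang(doc, name, lang) {
  const expr = "//" + name + "[lang('" + lang + "')]";
  const snapshot = doc.evaluate(
    expr, doc, null, ORDERED_NODE_SNAPSHOT_TYPE, null);
  const found = [];
  for (let i = 0; i < snapshot.snapshotLength; i++) {
    found.push(snapshot.snapshotItem(i));
  }
  return found;
}
```

Both functions take the document as a parameter, so they should work with
any object that implements evaluate() as the specification describes it.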

>//using BCP 47 [lookup] algorithm 
>var listOfElements = document.getElementsByLang("en-us"); 

If you really want to write code, you might want to check out
Silverlight, which has an object model not stuck in the mid-1990s. There
you can use the lazy "LINQ to XML" methods; the latter case would be e.g.

  .AncestorsAndSelf().Attributes(XNamespace.Xml + "lang").First()

I note that your getElementsByLang method does not actually do what
you want, as it does not filter by element type: you would end up with
possibly an enumeration of all elements in the document and would then
have to filter them all by element type. Similarly, if you filter by
element type and then filter by comparing the language as in

>listOfElements[1].lang == "en"; 

you might end up with rather inefficient code: the implementation may
for instance redundantly walk the tree for each element. That's also
true when evaluating query language expressions, but it's easier for
the implementation to recognize you want to filter by both than if it
has to comprehend your custom filtering code. Also note problems such
as comparing "en-us" and "en". It may well be wiser to have a testing
method like .isLanguage('en') instead or in addition, but even then
it's reasonable to expect authors to use this incorrectly and perform
their own substring matches and so on.
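
To make the two points above concrete, here is a minimal sketch of a
language test that follows the matching rule of the XPath lang() function
(equal ignoring case, or a prefix followed by "-"), together with a single
tree walk that filters by element name and language at the same time. The
names matchesLanguage and collectByNameAndLang are hypothetical, and the
nodes are plain { name, lang, children } objects standing in for DOM
elements, not a real DOM API.

```javascript
// True when `tag` equals `range`, or starts with `range` followed by "-",
// ignoring case. A plain == comparison of "en-us" with "en" would be false.
function matchesLanguage(tag, range) {
  if (!tag || !range) return false;
  const t = tag.toLowerCase();
  const r = range.toLowerCase();
  return t === r || t.startsWith(r + "-");
}

// Walks the tree once in document order, handing the language in scope
// down to the children, so no element has to look up its ancestors again.
function collectByNameAndLang(node, name, range, inherited = "", out = []) {
  // An element's own language overrides whatever it inherits.
  const lang = node.lang !== undefined ? node.lang : inherited;
  if (node.name === name && matchesLanguage(lang, range)) {
    out.push(node);
  }
  for (const child of node.children || []) {
    collectByNameAndLang(child, name, range, lang, out);
  }
  return out;
}

// Usage with a made-up document:
const tree = {
  name: "widget", lang: "en", children: [
    { name: "name", children: [] },                 // inherits "en"
    { name: "name", lang: "en-US", children: [] },  // matches range "en"
    { name: "name", lang: "fr", children: [] },     // wrong language
    { name: "description", lang: "en-US", children: [] }, // wrong name
  ],
};
const hits = collectByNameAndLang(tree, "name", "en");
console.log(hits.length); // 2
```

Note that the test is not symmetric: an element in "en" does not match the
range "en-us", which is one of the things hand-written substring checks
tend to get wrong.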

In principle there is a point in exposing some of this at a low level
like the DOM, as there may actually be many sources for an element's
language, but with support for this already in the query languages,
there is not much to be gained by supporting additional APIs for it.
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/

Received on Wednesday, 5 October 2011 19:27:42 UTC