- From: Felix Sasaki <fsasaki@w3.org>
- Date: Thu, 19 Jun 2008 13:30:44 +0900
- To: Asgeir Frimannsson <asgeirf@redhat.com>
- CC: public-i18n-its-ig@w3.org
Hi Asgeir,

Asgeir Frimannsson wrote:
> Hi Felix, all,
>
> On Tuesday 17 June 2008 11:04:49 Felix Sasaki wrote:
>
>> Jirka Kosek wrote:
>>
>>> Asgeir Frimannsson wrote:
>>>
>>>> I guess this is one of the areas where you have a gut feeling that
>>>> something could be done better, but have no implementations to
>>>> justify that claim :) Some of the main drawbacks with ITS at the
>>>> moment are:
>>>> - Having to load the instance document into memory for processing
>>>> - Having to traverse the in-memory DOM for each rule, as most XPath
>>>> processors take one expression and return a node set.
>>>>
>>> Please note that as long as you stick to XPath patterns (not full
>>> expressions) you can use the internal pattern-matching API of an XSLT
>>> processor, which is optimized for this task and gives much better
>>> performance than naively evaluating each XPath against the document tree.
>>>
>> Asgeir, thanks for pointing to the Blog from Jeni, and Jirka, thanks for
>> pointing out the benefit of using XPath (XSLT) patterns here. I'm
>> wondering if these patterns would do the job for Asgeir, and I'm aware
>> that this is no perfect solution. If you, Asgeir, still want to have
>> something more streamable, "Compile a state machine based on a set of
>> rules", it would be good to know how you want to construct these rules:
>> based on XPath, a subset of XPath (like the XSLT patterns or the EBNF in
>> the Wiki), or something completely different.
>>
>
> A bit of background: This topic initially started over a conversation between
> Yves (Savourel), myself and Jim (Hardgrave), where Yves briefly mentioned his
> work on the ITS API. I - perhaps prematurely - argued that there had to be a
> better solution than using a memory-intensive DOM parser for converting XML
> documents to/from typical localisation formats.
>
> Now, much thanks to the wisdom of Jirka and Felix, I do see that this problem
> is not as simple as I initially thought :)
>
> The deeper question I'm asking is perhaps whether the full ITS spec is a bit
> overkill for many situations. For most formats (DocBook, DITA, etc.), isn't a
> very limited knowledge of the structure of a document enough to determine
> these i18n attributes? Look e.g. at the example ITS rules in the 'best
> practices' document, where the majority of rules use a very simple
> "contextual subset" of XPath. In most cases the namespace + element names (or
> attribute + parent element) are enough information to determine the
> i18n-attributes. This looks more like a 'schema'-like language than the ITS
> pattern-based approach, and perhaps a way of annotating the schema/DTD/etc.
> would be a better approach for many formats?
>

In the "beginning" of ITS 1.0 (the first public draft), we had something
called schemaRule; see the example below, which is taken from
http://www.w3.org/TR/2005/WD-its-20051122/

    <xs:element name="p">
      <xs:annotation>
        <xs:appinfo>
          <its:schemaRules>
            <its:schemaRule translate="yes"/>
            <its:schemaRule locInfo="This has to be handled carefully"
                            locInfoType="alert"/>
          </its:schemaRules>
        </xs:appinfo>
      </xs:annotation>
      ...
    </xs:element>

We dropped schemaRule for various reasons, but for streaming we might think
of something similar.

> Now, I'm NOT suggesting a change to ITS itself, as it serves many other
> use-cases than what I deal with. And once we go beyond the use-case I
> described above, ITS suddenly becomes very powerful and attractive. I do not
> have an immediate need for a streaming ITS processor, and hence no time to
> develop one. ...Although at some point when we do start using ITS more
> heavily, I might have to revisit this. It's nevertheless a very interesting
> problem :)
>

Of course :)

Felix
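
The streaming idea discussed in this thread can be illustrated with a minimal sketch. It is not an ITS processor and not anything proposed by the participants: it assumes that rules are keyed on element names only (the "very limited knowledge of the structure" case), that the translate flag is inherited from the parent element, and that content is translatable by default. The rule table RULES, the class TranslateHandler and the function classify are hypothetical names made up for this illustration; only Python's standard xml.sax module is used.

```python
import xml.sax

# Hypothetical rule table (an assumption, not taken from any ITS rules file):
# element name -> whether its text content should be translated.
RULES = {"code": False, "para": True}


class TranslateHandler(xml.sax.ContentHandler):
    """Tracks a translate flag with a stack while SAX events stream by."""

    def __init__(self, rules, default=True):
        super().__init__()
        self.rules = rules
        self.stack = [default]   # flag in effect for the current element
        self.segments = []       # collected (text, translate) pairs

    def startElement(self, name, attrs):
        # An element without a rule inherits the flag of its parent.
        self.stack.append(self.rules.get(name, self.stack[-1]))

    def endElement(self, name):
        self.stack.pop()

    def characters(self, content):
        # Note: a SAX parser may deliver one text node in several chunks;
        # this sketch records each non-empty chunk separately.
        text = content.strip()
        if text:
            self.segments.append((text, self.stack[-1]))


def classify(xml_text, rules=RULES):
    """Return (text, translate) pairs without building a DOM."""
    handler = TranslateHandler(rules)
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.segments


if __name__ == "__main__":
    for text, flag in classify("<doc><para>Run <code>ls</code> now</para></doc>"):
        print(flag, text)
```

Memory use here depends on the nesting depth of the document rather than on its size, which is the property a DOM-based approach lacks. Rules that need more context than the element name (for example attribute + parent element, or full XPath expressions) would require a richer matcher than this dictionary lookup.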
Received on Thursday, 19 June 2008 04:31:53 UTC