- From: Asgeir Frimannsson <asgeirf@redhat.com>
- Date: Mon, 16 Jun 2008 17:47:43 +1000
- To: Felix Sasaki <fsasaki@w3.org>
- Cc: public-i18n-its-ig@w3.org
On Sunday 15 June 2008 00:35:07 Felix Sasaki wrote:
> Asgeir Frimannsson wrote:
> > Felix, Jirka, all,
> >
> > On Saturday 14 June 2008 01:43:02 Felix Sasaki wrote:
> >>> Reading through the ITS spec, it seems like ITS only uses a subset of
> >>> xpath, limited to the child and attribute axes (same as xslt patterns).
> >>
> >> in XSLT patterns you can have predicates, like "*[predicate]" , which
> >> can make use of any axis. Would you limit the content of these too?
> >
> > This would make streaming-implementations slightly more complicated yes
> > :) Thank you both for pointing out this issue.
>
> sorry for the overlap in replies, I did not see Jirka's mail while
> sending mine.
>
> > I guess for many (if not most) formats, limiting the content of
> > predicates would be feasible, and this would also speed up the xpath
> > processing. Creating a streaming-like ITS processor that could handle
> > "most documents" in a more efficient manner could perhaps be a useful
> > alternative to a memory-intensive processor that can handle all
> > documents...
>
> I've created a Wiki page
> http://www.w3.org/International/its/wiki/ITS_Simplified_XPath
> linked from
> http://www.w3.org/International/its/wiki/ITS_Processing
> Which contains a proposal for a simplified EBNF. Asgeir or others: Could
> you see if it fits your needs, and edit the page accordingly? If you
> have problems with the Wiki account please tell me.

This blog-post by Jeni Tennison gives a very good overview of the streamability-problem:
http://www.jenitennison.com/blog/node/61

Quote:
"there is no clear line that can be drawn between a streamable XPath and an unstreamable one, only a scale between “buffering nothing” and “buffering everything” (building an object model). Second, you can’t judge the streamability of an XPath expression on its own: there are multiple other factors that effect how streamable a given XPath expression is for a particular processor."
I guess this is one of the areas where you have a gut feeling that something could be done better, but have no implementations to justify that claim :)

Some of the main drawbacks with ITS at the moment are:
- Having to load the instance document into memory for processing
- Having to traverse the in-memory DOM for each rule, as most xpath processors take one expression and return a node set.

This is naturally costly compared to a solution that could:
1) Compile a state machine based on a set of rules
2) Apply those rules in a pseudo-streaming fashion

ITS is nevertheless much more powerful than the existing approaches to identifying i18n aspects of XML documents :)

cheers,
asgeir
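
As a rough illustration of the two-step idea above (compile the rules once, then apply them in a single pass), here is a minimal Python sketch. It is not an ITS implementation: the rule format (absolute, child-axis-only paths with an optional "*" step, no predicates, no namespaces), the function names compile_rules and stream_translate, and the simplified "last matching rule wins, otherwise inherit from the parent" behaviour are all assumptions made for the example.

```python
import io
import xml.etree.ElementTree as ET


def _new_node():
    return {"children": {}, "value": None, "order": -1}


def compile_rules(rules):
    """Compile (path, value) pairs into a trie that acts as a small state machine.

    Hypothetical rule format: absolute paths using only the child axis,
    e.g. "/doc/body/*/code". Predicates and other axes are not supported.
    """
    root = _new_node()
    for order, (path, value) in enumerate(rules):
        node = root
        for step in path.strip("/").split("/"):
            node = node["children"].setdefault(step, _new_node())
        node["value"], node["order"] = value, order
    return root


def stream_translate(source, machine, default="yes"):
    """Yield (element path, effective value) in document order, in one pass."""
    # Each stack entry: (active trie nodes at this depth, effective value).
    stack = [([machine], default)]
    names = []
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            active, inherited = stack[-1]
            # Advance every active state by the element name or the wildcard.
            nxt = []
            for node in active:
                for key in (elem.tag, "*"):
                    child = node["children"].get(key)
                    if child is not None:
                        nxt.append(child)
            matches = [n for n in nxt if n["value"] is not None]
            if matches:
                # Assumption: the rule listed last takes precedence.
                value = max(matches, key=lambda n: n["order"])["value"]
            else:
                value = inherited
            stack.append((nxt, value))
            names.append(elem.tag)
            yield "/" + "/".join(names), value
        else:
            stack.pop()
            names.pop()
            elem.clear()  # drop content already processed to limit memory use


rules = [
    ("/doc/head", "no"),
    ("/doc/body/*/code", "no"),
    ("/doc/head/title", "yes"),
]
xml = b"<doc><head><title>T</title></head><body><p>Hi <code>x</code></p><code>y</code></body></doc>"
results = list(stream_translate(io.BytesIO(xml), compile_rules(rules)))
for path, value in results:
    print(path, value)
```

All rules are evaluated together while the document is read once, so the cost no longer grows with one DOM traversal per rule. Only the stack of open elements is kept as state, which is what a restricted selector grammar makes possible; predicates that look at following content would need buffering.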
Received on Monday, 16 June 2008 07:48:46 UTC