- From: Anne van Kesteren <annevk@opera.com>
- Date: Fri, 29 May 2009 12:05:32 +0200
- To: "Henri Sivonen" <hsivonen@iki.fi>, "Sam Ruby" <rubys@intertwingly.net>
- Cc: "HTML WG" <public-html@w3.org>
On Thu, 28 May 2009 15:42:56 +0200, Henri Sivonen <hsivonen@iki.fi> wrote:
> On May 28, 2009, at 16:15, Sam Ruby wrote:
>> Anybody care to identify any more specifics?
>
> My understanding is that search engines that process massive amounts of
> data may want to do so with a streaming parser that doesn't abort on
> errors for which compliant recovery isn't streamable. It seems possible
> to perform indexing usefully without complying with the spec in the
> non-streamable cases.
>
> I don't have first-hand experience of working on a search engine, I'm
> not sure how much of a concern full streamability actually is, and I'm
> not sure if it's worthwhile to address this case in the spec.
>
> (It's inconceivable to expect browsers to switch to streamable recovery,
> so that's not an option.)

Yeah, I recall this being discussed on IRC at some point. I think there
was also discussion of actually defining what exactly streaming APIs
should do when they do not have some tree-like representation and do not
want to abort on errors for which a tree-like representation is required
to "recover". Such an algorithm could also be useful for highly optimized
data extraction, e.g. of <title> / <link> / <meta> etc.

-- 
Anne van Kesteren
http://annevankesteren.nl/
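
To illustrate the kind of tree-less data extraction mentioned above, here is a minimal sketch. It assumes Python's standard-library html.parser as a stand-in for a tokenizer-level streaming API; that module is not the HTML5 parsing algorithm and does not perform spec-compliant error recovery. The names HeadDataExtractor and extract are hypothetical and chosen only for this example.

```python
from html.parser import HTMLParser


class HeadDataExtractor(HTMLParser):
    """Collect <title>, <link> and <meta> data from a token stream.

    No tree is built: each token is handled as it arrives and then dropped.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.title = None        # text of the first <title> seen
        self.links = []          # attribute dicts of <link> elements
        self.metas = []          # attribute dicts of <meta> elements
        self._in_title = False
        self._title_parts = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased from HTMLParser.
        if tag == "title" and self.title is None:
            self._in_title = True
        elif tag == "link":
            self.links.append(dict(attrs))
        elif tag == "meta":
            self.metas.append(dict(attrs))

    def handle_data(self, data):
        # Title text may arrive in several pieces when input is chunked.
        if self._in_title:
            self._title_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self.title = "".join(self._title_parts).strip()


def extract(chunks):
    """Feed an iterable of text chunks to the extractor and return it."""
    parser = HeadDataExtractor()
    for chunk in chunks:
        parser.feed(chunk)   # incomplete tags are buffered until the next chunk
    parser.close()
    return parser
```

A caller would pass chunks as they come off the network, for example `extract(['<title>Hel', 'lo</title>'])`, and read `title`, `links` and `metas` afterwards. Mis-nested or unclosed markup elsewhere in the input does not stop the extraction, which is the property the thread is concerned with; what such a parser should do in the cases where compliant recovery needs a tree is left undefined here.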
Received on Friday, 29 May 2009 10:06:25 UTC