Re: RDFa DOM API - New Editor's Draft (regarding DataIterator) from Ivan Herman on 2010-06-03 (public-rdfa-wg@w3.org from June 2010)

From: Ivan Herman <ivan@w3.org>
Date: Thu, 3 Jun 2010 15:16:00 +0200
To: Mark Birbeck <mark.birbeck@webbackplane.com>
Cc: benjamin.adrian@dfki.de, RDFa WG <public-rdfa-wg@w3.org>
Message-Id: <5B288A7D-BE34-4461-804F-53224DEBE391@w3.org>
Hey Mark,

I certainly prefer this approach to the current one. I still feel some unease about having, in effect, a mechanism that can completely bypass our document-store model if wanted, but I understand the need for it in some cases (I had a chat with Manu yesterday, who gave me use cases). So it is fine.

Ivan



On Jun 2, 2010, at 19:30 , Mark Birbeck wrote:

> Hi Benjamin,
> 
> (This is also taking on board some of Ivan's comments about this interface.)
> 
> I wonder if we're looking at this the wrong way round, in that it's
> not so much that the 'normal' mode is to place triples in a store, but
> rather that the 'normal' mode involves a default 'triple handler'
> which has the task of placing triples in a store.
> 
> I'll explain...
> 
> To go back a step, let's say that we had defined the parse method to
> take a callback function which gets called whenever a triple is found:
> 
>  var parser = document.data.createParser( /* no store */ );
> 
>  parser.parse(document, function( t ) { ... });
> 
> Don't worry for now what the parameter is, the key thing is that we're
> passing a triple to this function, and this function then does what it
> wants with each triple. Maybe we also return 'false' if we want to
> continue parsing, and 'true' if we want to abort (or the other way
> round...whatever...).
> 
> Anyway, this means that a programmer could easily set up a SAX-style
> scenario where their code ignores all triples other than the one that
> they are looking for in the document:
> 
>  var parser = document.data.createParser( /* no store */ );
> 
>  parser.parse(document, function( t ) {
>    if (t.predicate === "a" && t.object === "<http://...Person>") {
>      doSomething();
>      return true;
>    }
>    return false;
>  });
> 
> As you can see, no triples are retained in memory because no store is
> used -- which I think fits your use-case.
> 
> Now, if we go up a level we also want to be able to store each triple
> so that we can run queries:
> 
>  var store = document.data.createStore();
>  var parser = document.data.createParser( /* no store */ );
> 
>  parser.parse(document, function( t ) {
>    store.add( t );
>  });
> 
> In this scenario, each time a triple is found in the document the
> callback function places it into the store.
> 
> Of course, the parameter for the callback function is the same as the
> parameter for the add function, so this pattern can be abbreviated to
> this:
> 
>  var store = document.data.createStore();
>  var parser = document.data.createParser( /* no store */ );
> 
>  parser.parse(document, store.add);
> 
> Anyway, this will be such a common pattern that it would be a useful
> convention to say that if there is no callback function, then the
> parser should call the add() method on a store. So authors can also do
> this (note that the store is now passed to the createParser() method):
> 
>  var store = document.data.createStore();
>  var parser = document.data.createParser( store );
> 
>  parser.parse(document);
> 
> And wouldn't you know, that is what we currently support. :)
> 
> In other words, we can achieve both use-cases via the same mechanism,
> and using only one interface; storing triples in a store is actually
> an 'overlaid' feature, that builds upon the default behaviour.
> 
> What do you think?
> 
> Regards,
> 
> Mark
> 
> On Tue, Jun 1, 2010 at 3:54 PM, Benjamin Adrian <benjamin.adrian@dfki.de> wrote:
>> On 31.05.2010 10:39, Ivan Herman wrote:
>>> 
>>> First of all, I think my question had two parts. One is why the
>>> DataParser interface exists separately (and that is what you are arguing
>>> for below) and the second is what the role of the 'DataIterator' method is
>>> within that interface. You did not answer the second...
>>> 
>> 
>> The DataIterator gives developers the chance to parse RDFa content while
>> consuming less memory than the standard parse method, which
>> stores all triples in the store. To that end it provides an iterator
>> that lets you traverse the RDF content inside the DOM tree.
>> For each triple you can decide to store it or to do something else with it.
>> 
>> It's similar to the NodeIterator of the DOM API.
>> 
>> Best regards,
>> 
>> Benjamin
>> 
>> --
>> __________________________________________
>> Benjamin Adrian
>> Email : benjamin.adrian@dfki.de
>> WWW : http://www.dfki.uni-kl.de/~adrian/
>> Tel.: +49631 20575 145
>> __________________________________________
>> Deutsches Forschungszentrum für Künstliche Intelligenz GmbH
>> Registered office: Trippstadter Straße 122, D-67663 Kaiserslautern
>> Management:
>> Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Chairman) Dr. Walter Olthoff
>> Chairman of the Supervisory Board:
>> Prof. Dr. h.c. Hans A. Aukes
>> District Court (Amtsgericht) Kaiserslautern, HRB 2313
>> __________________________________________
>> 
>> 
>> 
> 
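
The idea in Mark's quoted message, that storing triples is just the default "triple handler" layered on top of a callback-driven parse, can be sketched as follows. This is only an illustration under stated assumptions: createStore() and createParser() here are hypothetical stand-alone functions rather than the document.data methods of the draft, the "document" is a plain array of triples instead of a DOM walk, and the "return true to abort" convention is just the one Mark floated, not something the draft defines.

```javascript
// Hypothetical store: only add() is taken from the thread; `triples` is an
// assumption made so the result can be inspected.
function createStore() {
  var triples = [];
  return {
    triples: triples,
    // add() closes over `triples` instead of using `this`, so it can be
    // handed to parse() directly as a callback.
    add: function (t) { triples.push(t); }
  };
}

// Hypothetical parser: `source` is an array of triples standing in for the
// triples that a real parser would discover while walking the document.
function createParser(store) {
  return {
    parse: function (source, callback) {
      if (!callback && !store) {
        throw new Error("parse() needs either a callback or a store");
      }
      // With no callback, fall back to the default handler: add to the store.
      var handler = callback || function (t) { store.add(t); };
      var seen = 0;
      for (var i = 0; i < source.length; i++) {
        seen++;
        // Assumed convention: a handler returning true aborts the parse.
        if (handler(source[i]) === true) { break; }
      }
      return seen; // number of triples handed to the handler
    }
  };
}

var source = [
  { subject: "_:a", predicate: "name", object: "Alice" },
  { subject: "_:a", predicate: "a", object: "Person" },
  { subject: "_:b", predicate: "name", object: "Bob" }
];

// SAX-style use: stop at the first match, nothing is stored.
var visited = createParser().parse(source, function (t) {
  return t.predicate === "a";
});

// Store-backed use: no callback, so the default handler fills the store.
var store = createStore();
createParser(store).parse(source);
```

One caveat about the abbreviated form parser.parse(document, store.add): in JavaScript, passing a method this way detaches it from its object, so it only works if add() does not rely on `this` (as in the sketch above). Otherwise the call would need something like store.add.bind(store).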


----
Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf
Received on Thursday, 3 June 2010 13:14:56 UTC