Re: RDFa DOM API - New Editor's Draft (regarding DataIterator) from Mark Birbeck on 2010-06-02 (public-rdfa-wg@w3.org from June 2010)

From: Mark Birbeck <mark.birbeck@webbackplane.com>
Date: Wed, 2 Jun 2010 18:30:20 +0100
To: benjamin.adrian@dfki.de
Cc: Ivan Herman <ivan@w3.org>, RDFa WG <public-rdfa-wg@w3.org>
Message-ID: <AANLkTilpGfTfzOqS7h7Oh2vqT6Vk3MyA8nGDYkMzrT9N@mail.gmail.com>

Hi Benjamin,

(This is also taking on board some of Ivan's comments about this interface.)

I wonder if we're looking at this the wrong way round, in that it's
not so much that the 'normal' mode is to place triples in a store, but
rather that the 'normal' mode involves a default 'triple handler'
which has the task of placing triples in a store.

I'll explain...

To go back a step, let's say that we had defined the parse method to
take a callback function which gets called whenever a triple is found:

  var parser = document.data.createParser( /* no store */ );

  parser.parse(document, function( t ) { ... });

Don't worry for now about what the parameter is; the key thing is that
we're passing a triple to this function, and the function then does
what it wants with each triple. Maybe it also returns 'false' if we
want to continue parsing, and 'true' if we want to abort (or the other
way round...whatever...).
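
To make that callback contract concrete, here is a minimal sketch in
plain JavaScript. The createParser() below is a hypothetical stand-in,
not the draft's document.data.createParser(): it walks an array of
already-extracted triples rather than a DOM, and it assumes the
'return true to abort' convention floated above. The sample triples
are made up for illustration.

```javascript
// Hypothetical stand-in for the proposed parser: it "parses" an array of
// triples rather than a DOM, so the callback contract can be run anywhere.
function createParser() {
  return {
    // Calls handler(t) for each triple and stops early when the handler
    // returns true. Returns how many triples were handed to the handler.
    parse: function (triples, handler) {
      var seen = 0;
      for (var i = 0; i < triples.length; i++) {
        seen++;
        if (handler(triples[i]) === true) {
          break; // the handler asked to abort
        }
      }
      return seen;
    }
  };
}

// Made-up sample data, for illustration only.
var sampleTriples = [
  { subject: "_:a", predicate: "name", object: "Alice" },
  { subject: "_:a", predicate: "a", object: "Person" },
  { subject: "_:b", predicate: "name", object: "Bob" }
];

var visited = [];
var count = createParser().parse(sampleTriples, function (t) {
  visited.push(t.predicate);
  return t.predicate === "a"; // abort once a type triple is found
});
```

With this sample data the handler is called for the first two triples
only; the third is never visited because the handler asked to abort.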

Anyway, this means that a programmer could easily set up a SAX-style
scenario where their code ignores all triples other than the one that
they are looking for in the document:

  var parser = document.data.createParser( /* no store */ );

  parser.parse(document, function( t ) {
    if (t.predicate === "a" && t.object === "<http://...Person>") {
      doSomething();
      return true;
    }
    return false;
  });

As you can see, no memory is spent on holding triples because no store
is used -- which I think fits your use-case.

Now, if we go up a level we also want to be able to store each triple
so that we can run queries:

  var store = document.data.createStore();
  var parser = document.data.createParser( /* no store */ );

  parser.parse(document, function( t ) {
    store.add( t );
  });

In this scenario, each time a triple is found in the document the
callback function places it into the store.

Of course, the parameter for the callback function is the same as the
parameter for the add function, so this pattern can be abbreviated to
this:

  var store = document.data.createStore();
  var parser = document.data.createParser( /* no store */ );

  // bind() keeps add() attached to its store when the parser calls it
  parser.parse(document, store.add.bind(store));
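
One JavaScript detail worth keeping in mind with this shorthand: a
method handed over as a bare reference loses its receiver, so whether
it works depends on how add() is written. The sketch below uses a
hypothetical store (not the draft's document.data.createStore()) whose
add() relies on 'this', plus a hypothetical parseAll() helper standing
in for the parser. It shows that binding the method keeps it attached
to its store.

```javascript
// Hypothetical store whose add() relies on 'this'; the store in the draft
// may well be implemented differently.
function createStore() {
  return {
    triples: [],
    add: function (t) {
      this.triples.push(t);
    }
  };
}

// Hypothetical parser stand-in: calls the handler once per triple.
function parseAll(triples, handler) {
  for (var i = 0; i < triples.length; i++) {
    handler(triples[i]);
  }
}

// Made-up sample data, for illustration only.
var data = [
  { subject: "_:a", predicate: "name", object: "Alice" },
  { subject: "_:b", predicate: "name", object: "Bob" }
];

// A bare method reference is called without its store as 'this', so the
// push fails with a TypeError (or would land on the wrong object).
var bareStore = createStore();
var bareFailed = false;
try {
  parseAll(data, bareStore.add);
} catch (e) {
  bareFailed = true;
}

// Binding keeps add() attached to its store.
var boundStore = createStore();
parseAll(data, boundStore.add.bind(boundStore));
```

Wrapping the call in a function, as in the longer form above, has the
same effect as bind() and also works where bind() is not available.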

Anyway, this will be such a common pattern that it would be a useful
convention to say that if there is no callback function, then the
parser should call the add() method on a store. So authors can also do
this (note that the store is now passed to the createParser() method):

  var store = document.data.createStore();
  var parser = document.data.createParser( store );

  parser.parse(document);

And wouldn't you know, that is what we currently support. :)

In other words, we can achieve both use-cases via the same mechanism,
and using only one interface; storing triples in a store is actually
an 'overlaid' feature that builds upon the default behaviour.
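
The layering described above can be sketched roughly as follows. All
names are hypothetical stand-ins for the draft's
document.data.createStore() and document.data.createParser(), and the
parser again walks an array of triples rather than a DOM. The only
point is that 'no callback' falls back to a default handler that calls
add() on the store.

```javascript
// Hypothetical store: keeps triples in a private array.
function createStore() {
  var triples = [];
  return {
    add: function (t) { triples.push(t); },
    size: function () { return triples.length; }
  };
}

// Hypothetical parser: the store argument is optional.
function createParser(store) {
  return {
    parse: function (triples, handler) {
      // Fall back to the default triple handler when no callback is given.
      if (typeof handler !== "function") {
        if (!store) {
          throw new Error("parse() needs a callback or a store");
        }
        handler = function (t) { store.add(t); };
      }
      for (var i = 0; i < triples.length; i++) {
        if (handler(triples[i]) === true) {
          break; // assumed convention: true means abort
        }
      }
    }
  };
}

// Made-up sample data, for illustration only.
var data = [
  { subject: "_:a", predicate: "name", object: "Alice" },
  { subject: "_:a", predicate: "a", object: "Person" }
];

// Usage 1: no callback, so every triple lands in the store by default.
var store = createStore();
createParser(store).parse(data);

// Usage 2: no store; the callback decides what happens to each triple.
var predicates = [];
createParser().parse(data, function (t) {
  predicates.push(t.predicate);
  return false; // keep going
});
```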

What do you think?

Regards,

Mark

On Tue, Jun 1, 2010 at 3:54 PM, Benjamin Adrian <benjamin.adrian@dfki.de> wrote:
> On 31.05.2010 10:39, Ivan Herman wrote:
>>
>> First of all, I think my question had two parts. One is why have the
>> DataParser interface separately (and that is what you are arguing for below)
>> and the second is what the role of the 'DataIterator' method is within that
>> interface. You did not answer the second...
>>
>
> The DataIterator gives developers the chance to parse RDFa content while
> consuming less memory than the standard parse method, which
> stores all triples in the store. To that end it provides an iterator
> that lets you traverse the RDF content inside the DOM tree.
> For each triple you can decide to store it or to do something else with it.
>
> It's similar to the NodeIterator of the DOM API.
>
> Best regards,
>
> Benjamin
>
Received on Wednesday, 2 June 2010 17:30:55 UTC