Re: New Editors draft of RDFa API spec

On 09/13/2010 05:17 AM, Ivan Herman wrote:
> I removed everything from the answer that does not require further
> comment and where your changes are fine with me!

Replies to your replies on my replies to your comments, below... :)

>> Fixed. I added a conformance section to 4.1. This is fairly late in
>> the document, but every section up to that point is non-normative
>> and doesn't use MUST, SHOULD, etc. We can always move the
>> conformance section to the top of the document if people don't like
>> how this reads. I also added the Java clause as well as a clause
>> stating that a best effort should be performed for languages other
>> than ECMAScript and Java.
> 
> Fine with Java for now, but that might be a slightly different
> discussion on how to handle the RDFa specific API parts and the
> generic RDF API issue (Sandro's comments). One approach would be to
> really really emphasize the ECMAScript part only, not to have some
> sort of a collision course with all the various RDF toolkit
> implementations out there in Java already (let alone other
> languages).

I don't see how this would put us on a collision course? Anybody can
create an RDF API that isn't conformant to this RDF API... not everyone
needs to conform to this API. All we're doing is providing guidance for
people that want to create interoperable APIs, right?

I probably don't understand what you mean by "collision course".

>>> ---- 2.1. Goals: although there should be a general discussion
>>> on this, it may be worth emphasizing that not only the API allows
>>> for non-RDFa parsers to be used, but the interface offers some
>>> sort of a generic API to RDF...
>> 
>> Fixed. I added a sub-section called "A Modular RDF API" to try and 
>> clarify this a bit more.
> 
> Actually... in view of the discussion we had last week on the call,
> maybe it is better not to have that there for now. It was not an
> original design goal in the first place, it just happened that way.
> It definitely was not part of the charter, for example... Sorry to
> have led you to this!

It may not have been in the charter, but it was a design goal from the
very early stages of the document. Even before we started work, we were
wondering whether or not we'd be able to create an RDFa API before
having an RDF API. Besides, we're still in the design stages. I
distinctly remember Benjamin, Mark and I having a conversation about the
separation between the RDF API and the RDFa API in the very early days
of the spec - probably around May 2010.

Now, I think I disagree that we should take it out :). What's the harm
in keeping it in there?

>>> ---- 2.2 Concept diagram: I am not sure how, but it might be good
>>> to have on the diagram and the accompanying text, references to
>>> some of the 'sections' of the document. We use, for example, the
>>> term 'RDF Interfaces' in the text; maybe using the same term on
>>> the diagram would be good (if the diagram is in SVG, it should be
>>> a clickable link to the relevant section...). Same for the others
>>> and the text itself.
>> 
>> I agree with you in principle - things start to fall apart after
>> that...
>> 
>> I tried SVG without all of the extra non-W3C shim code required to
>> make SVG work cross-browser. I tried to make native SVG work for 4
>> hours straight one day... couldn't get it to work across all
>> browsers - sizing issues. I gave up. The source document is in SVG
>> if someone would like to give it a shot.
> 
> What I did in the past is to use <object> with a fall back to a png
> version. Didn't that work?

It wasn't the fallback that was the issue. As you alluded to, it was
that Firefox and Google Chrome use two different SVG rendering engines,
and one of the two was drawing black squares in the diagram and
resizing incorrectly. I'm sure it had to do with the output of the
diagramming tool I was using... but it was difficult to debug exactly
what was going on.

>> Could you provide a suggestion? It took me 2 hours to come up with
>> and implement that example for an advanced query :). I don't want
>> to implement something else unless we have some kind of general
>> agreement that the example is not esoteric. That and my brain hurts
>> right now... help? :)
> 
> using the 'title' attribute because I want to hire people with a 'Dr'
> degree? It sounds very similar...

Seeing that the W3C has a stigma attached to it of containing ivory
tower academics, we may want to pick an example that is more accessible
to the general public :). Not to say that my example was any more
accessible... but since we're talking about it... those external to W3C
who don't have PhDs could view this in a less than favorable light:
"Oh, so you have to have 'Dr' in your name to not be glossed over for a
job, eh?!"

What about searching for a person's birthday to send them a Happy
Birthday wish?
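
In illustrative terms, something like the following plain-JavaScript
sketch. This is not the draft's DataStore/DataQuery interface, just the
shape of the use case; the triple layout and the use of foaf:birthday
are my assumptions here.

```javascript
// Plain-JavaScript sketch of the birthday use case -- illustrative
// only, not the draft API. Triples are modeled as simple objects.
const FOAF = "http://xmlns.com/foaf/0.1/";

const triples = [
  { subject: "_:alice", predicate: FOAF + "name", object: "Alice" },
  { subject: "_:alice", predicate: FOAF + "birthday", object: "09-14" },
  { subject: "_:bob",   predicate: FOAF + "name", object: "Bob" }
];

// Find everyone whose birthday matches today's month-day, so we can
// send them a Happy Birthday wish.
function birthdaysOn(triples, monthDay) {
  return triples
    .filter(t => t.predicate === FOAF + "birthday" && t.object === monthDay)
    .map(t => t.subject);
}

console.log(birthdaysOn(triples, "09-14")); // prints [ '_:alice' ]
```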

>>> ---- TypedLiteralConverter interface: I do not understand what
>>> the targetType parameter is for. Either give a good (and
>>> convincing:-) example, or drop it if it has only a very
>>> restricted use...
>> 
>> Added an example and another method to the DataContext to aid 
>> TypedLiteral conversion. We can't remove this; Mark has a plan for 
>> stating targetTypes for TypedLiteral converters. I don't fully 
>> understand his plan, so the interface is a bit shoddy as I don't
>> know exactly how Mark wants to see it implemented. It's a bit
>> clunky right now... perhaps Mark has some insight into how we could
>> make it cleaner.
> 
> 
> The example with 'caps' is actually not a 'targetType'. Maybe the
> issue is having a misnomer here. In this context we have types as
> xsd:integer, xsd:boolean, etc, whereas the targetType is some sort of
> a modifier or an extra attribute or something like that.
> 
> I have the feeling of pushing a functionality into the RDFa API
> interface that is not really necessary and can be handled on the
> application level. The example is simply to have an all caps for
> certain strings; this is hardly something I would see on that level,
> to be honest!
> 
> At the minimum I would like to see some editorial note added to the
> interface stating that this is still to be discussed. I am not
> yet ready to agree having it in the interface...

Good points. I've added an editorial issue in the spec. I also changed
'targetType' to 'modifier'.
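
To make the rename concrete, here is a rough plain-JavaScript sketch of
how a converter with a 'modifier' could behave. The registration
function and call signature below are my assumptions for illustration,
not the interface as specced.

```javascript
// Illustrative sketch of typed literal conversion with a 'modifier'
// argument; the function names and signatures are assumptions.
const XSD = "http://www.w3.org/2001/XMLSchema#";
const converters = {};

function registerTypeConversion(type, converter) {
  converters[type] = converter;
}

function convertTypedLiteral(value, type, modifier) {
  const convert = converters[type];
  return convert ? convert(value, modifier) : value;
}

// An xsd:string converter where the optional modifier tweaks the
// result -- e.g. "caps" upper-cases the converted value.
registerTypeConversion(XSD + "string", (value, modifier) =>
  modifier === "caps" ? String(value).toUpperCase() : String(value));

console.log(convertTypedLiteral("rdfa", XSD + "string"));         // "rdfa"
console.log(convertTypedLiteral("rdfa", XSD + "string", "caps")); // "RDFA"
```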

>>> I continue to be puzzled by the filter method, ie, by the fact
>>> that it returns another DataStore, rather than just an array of 
>>> RDFTriple-s. I just do not get it... The PropertyGroup, for
>>> example, returns a 'Sequence' argument, ie, it is possible to
>>> just return an array. This should be discussed.
>> 
>> We could achieve the same result by returning Sequences from the 
>> DataStore.filter() method, but if we take that route, we have to
>> make it easy to construct a new DataStore... and the code is much
>> more verbose/bloated.
> 
> Sorry Manu, still not convinced. I do understand the argument, but as
> you say yourself, this is a /very/ advanced use case. Do we have to
> define an API that makes very advanced use cases easier while making
> the simple use cases more complicated or less natural and also more
> complicated to implement or vice versa? My impression is that the
> current approach chooses the former and I wonder whether the right
> approach would not be the latter. Besides, we also have a query
> interface that can take care of many things!
> 
> What I mean: I have not implemented a data store myself, but I had a
> glimpse into what, say, RDFLib does for what it calls a graph. It _is_
> more complicated than just an array of triples; because it has to be
> prepared to check the presence of a triple, its internal storage is
> more than just an array. I can imagine that it has several
> dictionaries for the triples, depending on whether you are looking for
> them via predicates, subjects, etc. Ie, it is relatively heavy
> stuff. What we have here is that even for simple cases the
> implementation has to use this heavy stuff when a simple array would
> do. I think therefore we optimize in the wrong place.
> 
> If an advanced user wants to use a datastore, he/she can create it.
> Maybe we could extend the DataStore interface by an 'addTriples'
> method that takes a whole array and adds all triples in the array in
> one call. If we do that, then the output of the filter method could
> be directly fed into a new datastore and the next filter could be
> applied if necessary. Yes, it is a little bit more complex but only
> for the advanced examples...
> 
> I think I would like to see an editorial note saying that this is
> still under discussion (and maybe an explicit issue?)

You're convincing me to make the return type a Sequence instead of a
DataStore. I need to think about it a bit more, but your argument that
the DataStore could be a very "heavy" data structure resonated with me.

My issue before was that we would have to construct a DataStore if a
sequence was returned... however, we have to do that anyway if we want
to return a DataStore (unless DataStores supported views... but that is
such an advanced concept that I don't think it would be implemented for
years).

I've placed an issue marker in the spec.
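
For the record, here is the shape of your suggestion as I understand
it, in a plain-JavaScript sketch (the names are provisional, not the
draft interface): filter() returns a plain array, and addTriples() lets
advanced users rebuild a store from that array.

```javascript
// Provisional sketch: filter() returns a Sequence (plain array), and
// Ivan's proposed addTriples() bulk-loads a store from such an array.
class DataStore {
  constructor() { this.triples = []; }
  add(triple) { this.triples.push(triple); }
  // Bulk-add, so a filter result can seed a new store in one call.
  addTriples(triples) { triples.forEach(t => this.add(t)); }
  // Returns a plain array rather than a new DataStore.
  filter(test) { return this.triples.filter(test); }
}

const store = new DataStore();
store.add({ s: "_:a", p: "ex:age", o: 42 });
store.add({ s: "_:b", p: "ex:age", o: 17 });

// Simple case: the returned array is all you need.
const adults = store.filter(t => t.p === "ex:age" && t.o >= 18);

// Advanced case: feed the result into a fresh store and keep going.
const adultStore = new DataStore();
adultStore.addTriples(adults); // adultStore now holds 1 triple
```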

>>> I am not fully convinced about the necessity of having the
>>> 'forEach' method. Sure, I can see its utility, but its
>>> functionality can easily be programmed by a cycle through the
>>> triples of the store and it seems to add too much to the Data
>>> store interface. I would consider removing it altogether,
>>> including the DataStoreIterator interface.
>> ...
>> So, while it's good to simplify... I'd be bothered by removing it
>> at this point in time... perhaps we should discuss this more as I
>> don't necessarily think the explanation I give above is good enough
>> to be used as the reason we have the forEach interface.
> 
> Ok, let us discuss that. Editorial note?

An editorial note has been added.
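
For reference, the point under discussion in a plain-JavaScript sketch
(the store shape is illustrative, not the draft interface): if the
store's triples are reachable at all, forEach adds convenience rather
than new capability.

```javascript
// Sketch: forEach versus the loop every caller could write instead.
// The store shape here is illustrative, not the draft interface.
const store = { triples: [{ s: "_:a", p: "ex:name", o: "Alice" }] };

// A forEach-style method over the store's triples...
function forEachTriple(store, callback) {
  store.triples.forEach(t => callback(t));
}

const names = [];
forEachTriple(store, t => { if (t.p === "ex:name") names.push(t.o); });

// ...versus cycling through the triples by hand.
const names2 = [];
for (const t of store.triples) {
  if (t.p === "ex:name") names2.push(t.o);
}
// Both yield ["Alice"].
```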

>> Perhaps... the division is fairly awkward at present. The idea is
>> that the DataParser has two modes of operation - read-and-store
>> and stream-and-discard:
>> 
>> parse()   -> process the document and store every triple (read-and-store)
>> iterate() -> stream triples as they are found (stream-and-discard)
>> 
>> The first requires quite a bit of memory, the second is far more
>> memory efficient. Think desktop vs. smartphone.
> 
> And we had this issue before, and you convinced me about the
> necessity of having something like iterate. I do not dispute that.
> But having that 'two modes of operation' seems to be very convoluted
> to me.
> 
> As I see it, the iterate approach constitutes a completely
> separate model of operation and programming. One has an iterate, has
> the basic RDF interfaces like triples and... that is it? Nothing of
> the store and the related stuff, no property groups, nada. Right?
> Well then, let us make that explicit, maybe even a completely
> separate top-level section. What is there right now is utterly
> confusing to me:-(

I added an editorial note, we'll revisit this when we get a chance.
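
Until we revisit it, here are the two modes restated as a toy
plain-JavaScript sketch (the "document" is just an array of triples;
real input would be the DOM, and the names are provisional):

```javascript
// Toy sketch of the two modes of operation; names are provisional.
// parse(): read-and-store -- every triple found is kept in memory.
function parse(docTriples, store) {
  for (const t of docTriples) store.push(t);
}

// iterate(): stream-and-discard -- each triple is handed to a
// callback and then dropped, so memory use stays constant.
function iterate(docTriples, onTriple) {
  for (const t of docTriples) onTriple(t);
}

const doc = [{ s: "_:a", p: "ex:name", o: "Alice" }];

const store = [];
parse(doc, store);           // store now holds every triple found

let count = 0;
iterate(doc, () => count++); // only the callback ever sees a triple
```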

>>> ---- Property group interface.
>>> 
>>> Editorial issue: the property group template section comes a bit
>>> out of the blue, because the query is defined later. I would
>>> expect this section to be moved down to the definition of Data
>>> Query...
>> 
>> Unfortunately, if we move it down there, people may not understand
>> that Property Groups are meant to be language-native containers for
>> Linked Data. Perhaps we need a better introduction to that section
>> so it doesn't come from out of the blue? Would that address your
>> concern, Ivan?
> 
> I guess... I would have to see what you mean...

Added an editorial note... One of us will try to come up with something
over the next few months.

> B.t.w., the tabulation of the RDFaDocument interface seems to have
> problems...

Yeah, the problem is with ReSpec identifying WebIDL markup like this:

Sequence<PropertyGroup>

The parser is somewhat broken... going to have to figure out why later on.

-- manu

-- 
Manu Sporny (skype: msporny, twitter: manusporny)
President/CEO - Digital Bazaar, Inc.
blog: Saving Journalism - The PaySwarm Developer API
http://digitalbazaar.com/2010/09/12/payswarm-api/

Received on Tuesday, 14 September 2010 01:52:06 UTC