Re: Comments on the API document (version of Monday) from Manu Sporny on 2010-05-26 (public-rdfa-wg@w3.org from May 2010)

From: Manu Sporny <msporny@digitalbazaar.com>
Date: Wed, 26 May 2010 02:12:30 -0400
To: W3C RDFa WG <public-rdfa-wg@w3.org>
Message-ID: <4BFCBBCE.2000308@digitalbazaar.com>
On 05/20/2010 07:39 AM, Ivan Herman wrote:
>> It is very close to filtering, except that the DataIterator mechanism is
>> meant for low-power or low-memory devices. It was Benjamin's (very good)
>> idea. When I was updating that section, I was thinking that the
>> interface would not require more than 100KB of code + data to execute
>> (that includes RDFa Processor + HTML document + RDFa DOM API).
> 
> That will require more explanation for me.

Basically, .iterate() is set aside for mobile phones that may not have
enough memory to hold a large triple store. A developer can decide to
skip the Data Store part of the API and just process triples directly
via the parser interface. This results in large memory savings and
allows the RDFa DOM API/RDFa Processor to use a purely SAX-based
(streaming) mechanism to extract structured data from the document.
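
To illustrate the difference, here is a minimal sketch of the streaming
idea. It is not the draft's actual interface: iterateTriples and
countNames are names made up for this example, and the "parser" is a
stand-in that walks a plain array instead of a real document.

```javascript
// Hypothetical stand-in for a streaming RDFa parser: it hands out one
// triple at a time instead of building a Data Store first.
function* iterateTriples(rawTriples) {
  for (const [subject, predicate, object] of rawTriples) {
    yield { subject, predicate, object };
  }
}

// The consumer looks at each triple once and keeps only a counter,
// so it never needs a collection of all triples.
function countNames(triples) {
  let count = 0;
  for (const triple of triples) {
    if (triple.predicate === "foaf:name") {
      count += 1;
    }
  }
  return count;
}

// Sample input (assumed data, only for illustration).
const sample = [
  ["#me", "rdf:type", "foaf:Person"],
  ["#me", "foaf:name", "Manu Sporny"],
  ["#ivan", "foaf:name", "Ivan Herman"],
];
const nameCount = countNames(iterateTriples(sample));
```

The point of the design is that the consumer decides what to keep, so a
constrained device only pays for the data it actually uses.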

>> We should discuss that MAY. I was primarily concerned with requiring
>> that the environment be initialized in any particular order. As long as
>> the objects exist when the Web developer requests access to the API,
>> we're okay. I'm worried about over-specifying the exact steps.
>
> I think that there should, somehow, two ways of 'creating' or
> accessing the whole RDFa environment. One way should be one step and
> should return a reasonable default setup. The other way is the
> detailed setup where the user can specialize along the way. The
> latter is obviously for advanced users only; the former is obviously
> a wrapper around the latter with default settings.

Yes, agreed. Hopefully, the restructuring makes this clearer.
Basically, if you are in a browser environment, you should be ready to
go once onload() has been called.
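
A rough sketch of the two entry points, assuming the one-step setup is
just a wrapper around the detailed one. The function names and the
shape of the environment object are assumptions made for this example,
not something the draft defines.

```javascript
// Hypothetical detailed setup: the caller supplies every part.
function createEnvironment(options) {
  return {
    parser: options.parser,
    store: options.store,
    prefixes: { ...options.prefixes },
  };
}

// Hypothetical one-step setup: a thin wrapper that fills in defaults.
function createDefaultEnvironment() {
  return createEnvironment({
    parser: { name: "default-parser" },
    store: { triples: [] },
    prefixes: { foaf: "http://xmlns.com/foaf/0.1/" },
  });
}

// Simple users take the defaults.
const simpleEnv = createDefaultEnvironment();

// Advanced users specialize along the way, e.g. by skipping the store.
const customEnv = createEnvironment({
  parser: { name: "streaming-parser" },
  store: null,
  prefixes: {},
});
```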

>> The main reason that there are a "large" number of interfaces (there are
>> 10 interfaces) is because we're trying to create something that's
>> modular and pluggable. If we integrate too heavily, it means that parts
>> of the API cannot be replaced by the developer.
>>
>> See "modularity and pluggability" in this section:
>>
>> http://www.w3.org/2010/02/rdfa/sources/rdfa-dom-api/Overview.html#the-design-goals
> 
> Hm. To provoke: who and when decided that we need that as a design
> goal?

I did, after discussing this with Benjamin and Mark. :)

That's not to say that we can't remove it - just that the design goals
reflect the API as it stands right now, not the other way around. So, we
figured out what we wanted to do with the API and the design goals of
the API became apparent at that point.

> I do not remember having this discussion... If the price for
> this level of modularity and pluggability is a very complex API that
> the community will reject, then we would lose in the long term. In my
> view, simplicity may be more important.

I agree that simplicity is very important, but we should also be very
careful about being complete in specifying the API. What we were doing
before was a fair amount of hand-waving on how all this stuff would fit
together. We were assuming that it would happen naturally and we'd
figure out how to fit all of these pieces together after a year or two
of field deployment.

Now it is apparent, at least to me, that had we not specified everything
that is in the document now, we'd be in trouble two years down the
line.

In other words, specifying the entire environment helps clarify how all
of this stuff fits together. In fact, for each piece that we remove from
the document at this point, while it may seem like it simplifies the DOM
API... what it is really doing is making the spec more vague.

By making the spec vague, it may seem like we're left with a simpler
API... but what we end up with is an under-specified API.

>> Those are "complex" concepts, not basic API stuff... I can explain more
>> on the call, but the idea was that all basic developers would have to do is:
>>
>> var person = document.getItemsByType("foaf:Person")[0];
>>
>> and then:
>>
>> var name = person.get("foaf:name");
> 
> well... I have difficulty believing that we can get away without the
> IRI, Literal, etc. If I have
> 
> <span typeof="foaf:Person">
>   <span property="ex:dataproperty">http://www.example.com</span>
>   <span rel="ex:objectproperty" resource="http://www.example.com"/>
> </span>
> 
> then I would like n1 and n2 below:
> 
> var n1 = person.get("ex:objectproperty")
> var n2 = person.get("ex:dataproperty")
> 
> to be different. My current feeling is that the IRI, Literal and
> blank node (possibly nothing else) are to be defined as basic
> datatypes that permeate the whole API. Then of course you may get
> something like what you have.

I still disagree. Look at the code that you wrote - not a single mention
of IRIs, Literals or Blank Nodes. The API is string-based, first and
foremost. This means that developers don't have to mess around with
IRIs, PlainLiterals, TypedLiterals and BlankNodes unless they want to do
so. Those concepts are advanced because you don't need to know that they
even exist to use the basic API.
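
Here is a small sketch of what "string-based first" could look like
next to an opt-in typed view. The person object below is a hand-built
stand-in, and getTyped is a made-up name for this example; neither is
taken from the draft.

```javascript
// Hypothetical property group holding one literal and one IRI value.
const person = {
  values: {
    "ex:dataproperty": { kind: "literal", value: "http://www.example.com" },
    "ex:objectproperty": { kind: "iri", value: "http://www.example.com" },
  },
  // Basic, string-based access: the caller never sees the node kind.
  get(property) {
    const node = this.values[property];
    return node === undefined ? undefined : node.value;
  },
  // Made-up advanced accessor that exposes the distinction on request.
  getTyped(property) {
    return this.values[property];
  },
};

const n1 = person.get("ex:objectproperty");
const n2 = person.get("ex:dataproperty");

// As plain strings the two values compare equal...
const sameAsStrings = n1 === n2;

// ...while the typed view still tells an IRI from a literal.
const differentKinds =
  person.getTyped("ex:objectproperty").kind !==
  person.getTyped("ex:dataproperty").kind;
```

In this sketch the basic call gives identical strings for n1 and n2,
and only a developer who asks for the typed view sees the difference
Ivan is after.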

>>> With restrictions: I do not
>>>   want to know about the possibility of redefining the
>>>   datatype->javascript object conversion; 80% of users won't do
>>>   that, some defaults should prevail
>>
>> There will be default converters specified for all XSD types in
>> Javascript. You're right, 80% of developers won't need to do this... but
>> for the ones that define their own datatypes, or use non-XSD datatypes,
>> it'll be important to allow this in the "advanced" API.
> 
> And I agree with that. But there is a presentation issue: I would prefer 
> not to see that in the section for simple users.

This has been moved out to the bottom of the document.
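
For reference, a minimal sketch of how default converters and
developer-supplied ones could coexist. The registry, the convert
function, and the ex:celsius datatype are all assumptions made for this
example; the draft's actual converter interface may look different.

```javascript
// Hypothetical registry mapping datatype names to converter functions.
// Only a few XSD types are shown; a real default set would be larger.
const converters = new Map([
  ["xsd:integer", (text) => parseInt(text, 10)],
  ["xsd:boolean", (text) => text === "true" || text === "1"],
  ["xsd:string", (text) => text],
]);

// Convert a lexical value; unknown datatypes fall back to the raw string.
function convert(text, datatype) {
  const converter = converters.get(datatype);
  return converter ? converter(text) : text;
}

// Advanced use: register a converter for a made-up non-XSD datatype.
converters.set("ex:celsius", (text) => ({ unit: "C", value: Number(text) }));

const age = convert("42", "xsd:integer");
const temperature = convert("21.5", "ex:celsius");
const unknown = convert("abc", "ex:unknown");
```

Most developers would only ever hit the defaults; the set() call is the
part that belongs in the advanced section.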

> I presume it is possible to edit something like:
> 
> section 1:
> 
>   interface IF {
>      method1 .....
>   }
> 
> section 2:
> 
>   interface IF {
>      method2 ...
>   }
> 
> ie, that the full definition of an interface is, editorially, spread
> over the text. Of course, you can have an appendix with the full
> definition. That is what I meant, not to remove that functionality.

You can do that, and I tried it, and it's really ugly. Also, if we take
this approach, it either makes implementers' lives harder or, if we
also put the entire API in an appendix at the bottom of the document,
it duplicates information.

>>> - Notion of PropertyGroup (note that the current definition does not
>>>   include the origin, and should)
>>
>> Yes, Mark wants this too - it was an accidental omission. However, we
>> still need to discuss /what/ the default origin should be. Is it the
>> origin of the subject? Probably.
> 
> My understanding is that the property group is all about all the
> triples bound to one subject. If so, then the origin should be the
> DOM node for that subject...

There is now an .origin property on PropertyGroup.

-- manu

-- 
Manu Sporny (skype: msporny, twitter: manusporny)
President/CEO - Digital Bazaar, Inc.
blog: Bitmunk 3.2.2 - Good Relations and Ditching Apache+PHP
http://blog.digitalbazaar.com/2010/05/06/bitmunk-3-2-2/2/
Received on Wednesday, 26 May 2010 06:16:09 UTC