RDFa API implementation feedback

Hi All, Manu,

I've been implementing the RDFa API in ECMAScript and came to a point 
where I need to give feedback and resolve some issues to be able to 
implement fully, thus, here it is :)


[NoInterfaceObject]
=====================
no interface object basically means that in ECMAScript typeof 
InterfaceName should return undefined, and that there is no constructor 
for an interface of the same name.
IRI, PlainLiteral, TypedLiteral, BlankNode and RDFTriple all have the 
[NoInterfaceObject] extended attribute, which means that if any 
implementation in ECMAScript wants to provide named constructors (in 
order to be somewhat less verbose) they will be nonconforming with the 
RDFa API.
I would ask, and suggest that NoInterfaceObject be removed from IRI, 
PlainLiteral, TypedLiteral, BlankNode and RDFTriple interfaces, and 
quite the opposite, that NamedConstructor[1] be added to PlainLiteral, 
TypedLiteral, BlankNode and RDFTriple (IRI is debatable, but would leave 
is as neither NoInterfaceObject or NamedConstructor and let 
implementations decide what to do).

[1] http://www.w3.org/TR/2008/WD-WebIDL-20080829/#NamedConstructor


IDL and inheritance
======================
RDFNode specifies the `value` attribute, however the RDFResource, 
PlainLiteral, TypedLiteral and BlankNode interfaces all specify it too, 
even though it is inherited - unsure if this is okay in WebIDL, if it is 
then I'd suggest adding a comment beside each to say // inherited from 
RDFNode.


BlankNode ?name
======================
DataStore.createBlankNode, and thus the constructor for an 
implementation of BlankNode accepts an optional ?name argument, this 
troubles me because:
  - it indicates to the user that they can rely on the name in some way
  - it means if somebody specifies the same name twice an exception 
would need thrown when constructing the BlankNode
  - from an implementers point of view it means an internal dictionary 
of all used names is needed which adds weight where it isn't really needed.
Again, I'd ask and suggest that the optional ?name is removed from 
BlankNode.


RDFTriple
======================
RDFTriple essentially extends Array in that it can be accessed by [] 
offsets and I presume iterated over using standard language constructs, 
this seems to complicate the implementation unnecessarily for little 
gain, certainly when implementing in ECMAScript. That said, if it is to 
remain like this, then:
  - issue, there's a slight conflict between .length (native on an 
array) and .size as specified, renaming size to length would remove this 
conflict.
  - note, an ECMAScript implementation won't be able to lock the 
properties to read only, until ECMAScript V5 features are common enough 
(freeze etc)
  - personally, I can't really see the use-case for the indexes? but 
this doesn't really matter :)


The RDF Interfaces in general
=============================
Great job with them, the only thing I see missing or mixed up somewhat 
is that RDFTriple stringifies as NTriples whereas no other interfaces 
do, this essentially means that an implementation of RDFTriple is going 
to have to hold all the logic for turning any RDF Interface in to 
NTriples - seems like a bit of a mismatch, I'm happy to forget this 
minor issue, but would normally suggest having all Interfaces stringify 
as NTriples, or have all interfaces implement a toNT() method, it's an 
easy hit, easy to implement, and is a v nice feature to have. As in, 
either support NTriples stringification fully or not at all.


DataContext.convertType
=======================
?modifier on DataContext.convertType seems completely out of place here, 
there's no need to add in the scope for multiple errors and unexpected 
functionality when it really really isn't needed in any way at all, I'm 
positive that any use case you threw at me for it I could address with a 
minor snippet of code. I'd ask and suggest that the modifier parameter 
is removed from the API. The phrase "asking for trouble" comes to mind.

In line with the comments in the documentation some guidance on two 
things is asked:
- known issue, what to do when conversion fails (exception probably..)
- perhaps new, what to do when there is no converter registered for the 
datatype?

Overall comment, the structure of the Data* interfaces, especially the 
dependencies on each other and specifically DataContext feels a bit 
wrong, the dependencies are implied rather than implicit and unexpected 
behaviour (for users) could result + the implementation is full of cross 
cutting concerns, this is probably because CURIES and TypedLiteral 
conversion has to be global for the API and in some respects it's easy 
to think of there only being one instance of document/the API/Data* 
Interfaces however multiples of each are more than possible and there is 
no global or default scope, you could suggest that document.data is the 
default scope, but you can easily have several documents at the same 
time, each with their own .data, I understand the issues, however feel 
that there may be a way to reorganise the Data* interfaces to be more 
logic, will cover this later in the mail or under separate cover.. 
(really it needs a chat and a whiteboard session!)


DataStore
=======================
?type is completely unused but an optional parameter for the 
constructor, further it implies that you can have a different "type" of 
datastore which is completely undocumented and also implies that a 
certain "type" of DataStore would have to be used with a certain "type" 
or parser, it either needs specified fully or dropped.

The create methods seem entirely out of place on this Interface.

createIRI still has an `optional Node node` param with no usecase, usage 
or information about what it's for - perhaps it's slipped but from all 
angles would ask and suggest it's removed from this method.

It appears that DataStore is designed to extend Array in ECMAScript and 
be able to use indexed getters + native language iterative features. 
However (certainly with ECMAScript):
  - .size is mismatched with .length the default Array property
  - expected functionality will be to be able to use an indexed setter 
store[12] = sometriple; which is undocumented, unhandled, and probably 
inadvisable.
  - filter() is mismatched with the default ECMAScript Array.filter method
  - forEach is mismatched with the default ECMAScript Array.forEach method

Further, .filter introduces `pattern` which again just seems asking for 
trouble, and likewise it introduces `element` which makes the RDFa API 
RDFa/DOM only (it's the only thing that does) - element could be removed 
because the RDFaDocument interface already has methods to cover this 
functionality, and pattern is just a poorman's DataQuery so is generally 
unneeded.

.clear seems entirely unneeded too because you can just dispose of the 
object and create a new one.

.add returns a boolean, however I can't see how add can return anything 
other than true, because if a triple doesn't exists it returns true 
because triple has been added, and if the triple does exist then it 
*must* return true as well, so add is a safe method with no exceptions 
that can always return true, thus needs swapped to void, likewise .merge 
inherits this.

DataStoreIterator is mismatched with the default .forEach callback 
method signature, and the name seems entirely wrong (it's not an iterator).

DataStoreIterator and RDFTripleFilter are mismatched, one accepts 
RDFTriple (nice and light, type safe, expected functionality) and 
DataStoreIterator accepts subject,property,object which adds weight and 
is, well imho not as nice :)

Overall (and in order to implement), DataStore either needs to be an 
object which doesn't use native iterators and indexed getters, or be 
aligned with Array/sequence<T> (thus removing clear and the create* 
methods) or, well there are a few approaches which can be considered :)


DataParser
===================
in short, iterate doesn't make any sense here (iterate over what) is DOM 
/RDFa specific again and overall DataParser feels like it should be a 
callback and a parse method on a different interface.
Additionally parse is now changed to parse( Element domElement) which 
means you can't parse a Document, since Document extends Node (not 
Element) so is virtually unusable and there is now no way to parse an 
RDFa document!


DataIterator
===================
As noted needs changed, has a triplePattern which is undocumented 
anywhere else, and I'm goign to leave it their as all these Interfaces 
need discussed.


DataQuery, RDFaDocument
====================
all cool :)


PropertyGroup
====================
Looks good and aligned with RDFaDocument methods, however I haven't 
tried to implement yet, would prefer to cover separately / at a later date.


DocumentData
====================
Really needs reviewed in with the other Data* interfaces, it would be a 
great home to the create** methods, and could hold a parse( Node n, 
DataParser p) method nicely - a few other bits but will cover separately 
/ hopefully discuss.


FWIW, good work thus far, seems like everything is covered but needs 
some refactoring and aligning with common APIs / ways of doing things 
(to be implementable) - should be quite an easy hit, even if it seems 
like a lot could change (on the Data* interfaces), I can't see it taking 
too long to sort (as long as you are open to change for the better) and 
I can implement quickly this week so that people can try using it.

ps: I started to give feedback to Manu yesterday on skype, as was 
entirely puzzled about some bits, then subsequently realised at some 
point in the week I'd swapped windows to the June version of the API and 
spent two days implementing the wrong thing, rather embarrassing! All 
the above is definitely on the latest editors draft of the RDFa API 
though and I redid the "wrong" work to be correct last night :p

Best,

Nathan

Received on Sunday, 19 September 2010 17:02:56 UTC