API topologies

i see 3 main patterns emerging in software APIs to RDF. given these, it is often simple to classify existing libraries and toolkits. more interesting, perhaps, is the question of whether an 'approaching-perfect' API embraces utmost flexibility and provides for all of these categories at once (linux/*bsd, bazaar), or avoids muddying the waters and provides a more focused and clean 'one way' of doing things (apple, cathedral). which approach composes best, both for building abstractions on top of other RDF APIs and for interacting with the non-RDF world of information (JSON, and so forth)?

1: abstract-concept orientation

an arrangement popular in utility-focused toolkits, like Redland. Triples, Resources, Nodes and Graphs are the main object classes. each class typically has methods which take various other classes and additional specifiers as arguments. generally the implementations are complex and require a fairly large suite of methods on all classes, with some redundancy due to overlapping concepts (URI, Subject, Resource, Node, BlankNode). the functions may be similar but the concepts are indeed separate, necessitating mix-ins/multiple-inheritance (much easier in Ruby than in Redland's C) and similar trickery to flesh things out to completeness.
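a minimal sketch of this arrangement in python, to make the shape concrete. every class and method name here (URI, BlankNode, Literal, Triple, Graph, match) is made up for illustration and is not Redland's actual API:

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class URI:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str


@dataclass(frozen=True)
class Literal:
    value: str


# the overlapping concepts: a subject may be a URI or a blank node,
# an object may additionally be a literal
Subject = Union[URI, BlankNode]
Object = Union[URI, BlankNode, Literal]


@dataclass(frozen=True)
class Triple:
    subject: Subject
    predicate: URI
    obj: Object


class Graph:
    def __init__(self):
        self._triples = set()

    def add(self, triple):
        self._triples.add(triple)

    def match(self, subject=None, predicate=None, obj=None):
        # None acts as a wildcard for that position
        for t in self._triples:
            if subject is not None and t.subject != subject:
                continue
            if predicate is not None and t.predicate != predicate:
                continue
            if obj is not None and t.obj != obj:
                continue
            yield t


FOAF = "http://xmlns.com/foaf/0.1/"
alice = URI("http://example.org/alice")

g = Graph()
g.add(Triple(alice, URI(FOAF + "name"), Literal("Alice")))
g.add(Triple(alice, URI(FOAF + "knows"), BlankNode("b0")))

names = [t.obj.value for t in g.match(subject=alice, predicate=URI(FOAF + "name"))]
```

even at this size the caller has to juggle five classes to say two things about one resource, which is the cost the next two topologies try to hide.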

this is the most likely candidate for a low-level library, with one of the two following topologies built on top of it for easier usage:


2: RDF schema / class-orientation

RDF provides the notion of a 'Class' of resources. a class can be a subclass of another, and so on. this largely overlaps with the popular OO programming topology. indeed, RDF Resources could be thought of as OO Objects, with 'predicates' being object 'slots/accessor-methods' and so on.

a good example of this style is ActiveRDF, a library written in Ruby. it allows <foaf:Person> resources to be created using a 'Person.new' syntax. this is modeled after the ORM (Object-Relational-Mapper) classes designed for persisting to SQL stores. there, SQL's static data layout is simply glossed over and made usable from a more comfortable place than select/insert statements. the inflexibility of the model is still there, and you can't add new property slots to an object without circuitous tricks like persistence-layer data-migration/massaging scripts. this is the most popular model in use today in the non-RDF world, despite these limitations. its popularity probably owes more to SQL's huge installed base than to anything inherent in the model. (maybe there's little demand for dynamic models outside of ad-hoc, realtime-reconfigurable data-mining initiatives. the incidence of static schemas immortalized into RFCs (like Atom) may lend credence to this possibility)

although ActiveRDF does away with SQL's coupling of the persistence layer to the data model, it still encourages encoding a particular static data model, eg: Person.new.employer = Employers.find('there_is_only_goog'). a key point is that Person is a language-implementation symbol, not a URI. a single change to the data in the model could break all existing code, and code cannot be reused on other classes without massive find/replace on strings or changing the underlying Symbol<>URI mapping.
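to illustrate the Symbol<>URI coupling, here is a rough sketch in python (not Ruby, and not ActiveRDF's real API). the Person class, its 'predicates' table and the ex:employer predicate are all assumptions made up for the example:

```python
FOAF = "http://xmlns.com/foaf/0.1/"
EX = "http://example.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


class Person:
    # the language-level symbol 'Person' is tied to one class URI,
    # and each attribute name to one predicate URI
    rdf_type = FOAF + "Person"
    predicates = {"name": FOAF + "name", "employer": EX + "employer"}

    def __init__(self, uri, **attrs):
        self.uri = uri
        for key, value in attrs.items():
            # the static model: slots not declared above are rejected
            if key not in self.predicates:
                raise AttributeError(f"no slot {key!r} on {type(self).__name__}")
            setattr(self, key, value)

    def triples(self):
        yield (self.uri, RDF_TYPE, self.rdf_type)
        for key, predicate in self.predicates.items():
            if hasattr(self, key):
                yield (self.uri, predicate, getattr(self, key))


alice = Person(EX + "alice", name="Alice")
alice_triples = list(alice.triples())

try:
    Person(EX + "bob", nickname="bobby")  # the data has the property, the class does not
    rejected = False
except AttributeError:
    rejected = True
```

code written against 'alice.name' knows nothing of foaf:name. pointing it at another vocabulary means editing the class, not the data.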

the kinds of things one is encouraged to write using this style (using Rails as a guide, controllers and views) can be generalized to work on 'predicates of a subject' and 'objects of a statement', factoring out class-specific code in favor of on-the-fly configurable lenses and faceted browsers/editors, as championed by Fresnel, Tabulator, etc.
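a tiny sketch of what 'predicates of a subject' code might look like, in python. triples are plain (subject, predicate, object) string tuples with prefixed names, and describe/render are hypothetical helpers. this is not Fresnel's vocabulary or Tabulator's code, just the general idea of a view that needs no class:

```python
from collections import defaultdict


def describe(triples, subject):
    """Group the objects of every statement about `subject` by predicate."""
    slots = defaultdict(list)
    for s, p, o in triples:
        if s == subject:
            slots[p].append(o)
    return dict(slots)


def render(triples, subject, labels=None):
    """Plain-text view of any subject; `labels` is an optional predicate -> label map."""
    labels = labels or {}
    lines = [subject]
    for predicate, objects in sorted(describe(triples, subject).items()):
        lines.append(f"  {labels.get(predicate, predicate)}: {', '.join(objects)}")
    return "\n".join(lines)


triples = [
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:knows", "ex:carol"),
    ("ex:bob", "foaf:name", "Bob"),
]

view = render(triples, "ex:alice", labels={"foaf:name": "name"})
```

the same two functions work for a person, a document or anything else, and the 'lens' (here just the labels map) can be swapped at runtime.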

this concise, tightly-coupled-with-Class topology may be useful for business logic, where processing was always meant to be specific to begin with, and something readable, rather than reusable, is desired.


3: one class to rule them all (jQuery style)

the hallmark of jQuery, a popular library for DOM manipulation and smoothing out cross-browser issues and underlying-API ugliness, is a single class named 'jQuery'. methods on it return another jQuery object, ad nauseam. this allows for easy composing of operations using the language's natural facilities, like method-chaining or inline lambdas. novice developers find ease and fun in simply chaining together a few names from the docs and finding out their code already works. they also don't have to learn about all the underlying details, but can think in terms of actions like 'filter', 'hide' and so forth: jQuery('p').parent().filter('.subsection').css('color', 'red')


one could argue that normal developers don't want/need to be confused/turned off by all the RDF concepts that exist in Redland's source and the W3C spec docs. the trump concept is the Resource, appearing in Subject, URI, Node, BlankNode, Predicate, Object, etc. 'everything is a resource' is the RDF parallel to jQuery's 'everything is DOM node(s)', with an API to match. anything that has some sort of powerful generic Resource class begins to fit into this category. the main difference is that libraries here eschew other toplevel classes. everything is expressed in reference to a resource.


jQuery goes a step further unifying nodes with sets of nodes. any operation is implicitly carried out on all in the set (or the set is used as input for filtering). this further cleans up the surface code by eliminating iteration constructs.
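a hedged sketch of what an RDF API in this style could look like, in python. the class R and its methods (out, into, filter, values) are invented for illustration and do not correspond to any existing library. one class stands for a set of resources, and every traversal returns another instance of it:

```python
class R:
    """A set of resources over a shared triple list; every traversal returns another R."""

    def __init__(self, triples, nodes):
        self.triples = triples
        self.nodes = frozenset(nodes)

    def out(self, predicate):
        # follow `predicate` forwards from every node in the set
        return R(self.triples,
                 (o for s, p, o in self.triples if s in self.nodes and p == predicate))

    def into(self, predicate):
        # follow `predicate` backwards to the subjects pointing at this set
        return R(self.triples,
                 (s for s, p, o in self.triples if o in self.nodes and p == predicate))

    def filter(self, test):
        return R(self.triples, (n for n in self.nodes if test(n)))

    def values(self):
        return sorted(self.nodes)


triples = [
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:knows", "ex:carol"),
    ("ex:bob", "foaf:name", "Bob"),
    ("ex:carol", "foaf:name", "Carol"),
]

friend_names = R(triples, ["ex:alice"]).out("foaf:knows").out("foaf:name").values()
c_names = R(triples, ["ex:alice"]).out("foaf:knows").out("foaf:name") \
    .filter(lambda n: n.startswith("C")).values()
who_knows_bob = R(triples, ["ex:bob"]).into("foaf:knows").values()
```

there is no loop in the calling code: each step applies to the whole set, and an empty set simply flows through the rest of the chain.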


4: RDF _is_ the language

N3 hints at this. a syntax approaching python, a notion of 'functions' and basic operators like equality. Adenine compiled RDF to VM bytecode. as far as i can tell this is the domain of rocket scientists and not typical application developers...

Received on Monday, 11 February 2008 08:42:43 UTC