Re: RDFa API for browsers from Mark Birbeck on 2009-10-21 (public-rdf-in-xhtml-tf@w3.org from October 2009)

From: Mark Birbeck <mark.birbeck@webbackplane.com>
Date: Wed, 21 Oct 2009 12:52:06 +0100
To: Toby Inkster <tai@g5n.co.uk>
Cc: Manu Sporny <msporny@digitalbazaar.com>, RDFa mailing list <public-rdf-in-xhtml-tf@w3.org>
Message-ID: <640dd5060910210452r36f5d207v3b717f2c75054d0c@mail.gmail.com>

Hi Toby,

On Wed, Oct 21, 2009 at 11:43 AM, Toby Inkster <tai@g5n.co.uk> wrote:
> On Wed, 2009-10-21 at 11:12 +0100, Mark Birbeck wrote:
>
>> This is interesting...why would you prefer to keep the triples from
>> different formats separate?
>>
>> I'm not saying you shouldn't. :)
>
> It might speed up practical implementations. If you only use RDFa on
> your page, then calling:
>
>        document.meta('RDFa').whatever();
>
> means that the browser doesn't need to waste its time checking all the
> page's profile URIs (if any are given) for GRDDL profileTransformations,
> and namespace declarations (which will almost certainly be given on any
> page that uses RDFa) for GRDDL namespaceTransformations.

Ah, I see.

I was imagining a different model. You have one or more parsers that
run after page load. Each parser is independent, but places its data
into a common location. As a programmer using a library, you don't
need to know whether there is one parser or five running against the
page.

In that scenario, the only issue is whether to put the data into one
location, or multiple ones. I don't have a problem with having
multiple stores, though.


> Also, if different browsers support different serialisations (e.g. RDFa,
> GRDDL, Microdata, eRDF, etc), testing whether document.meta('RDFa') is
> empty might be a useful tool.

I think we should work towards a standard interface for metadata,
regardless of how the data was originally obtained. So I feel that:

  document.meta

should provide the basic 'entry point' and then from that, we have one
or more stores. Anything can create a store, including the programmers
themselves, although a store will most likely be created by a
processor for one of the formats that you mentioned.

Obviously, if you can think of any interfaces that would be specific
to a format, then they could be placed on format-specific objects, but
I think it best to avoid going that route if we can, and just have a
set of common interfaces on the store object.
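
As a rough illustration of the 'one entry point, one or more stores'
idea, here is a minimal sketch. All of the names in it (createMeta,
createStore, store, add, all) are hypothetical and are not taken from
any existing library or browser API; it only shows that every store can
share one interface regardless of which processor filled it.

```javascript
// Hypothetical sketch: none of these names come from a real API.
function createStore() {
  const triples = [];
  return {
    // The same interface for every store, whatever format fed it.
    add(s, p, o) { triples.push({ s, p, o }); },
    all() { return triples.slice(); }
  };
}

function createMeta() {
  const stores = {};
  return {
    // Any parser, or the page author, can create or fetch a named store.
    store(name) {
      if (!stores[name]) stores[name] = createStore();
      return stores[name];
    },
    names() { return Object.keys(stores); },
    // The union of the triples held in every store.
    all() {
      return Object.keys(stores).reduce(
        (acc, name) => acc.concat(stores[name].all()), []);
    }
  };
}

const meta = createMeta();
meta.store("rdfa").add("<#toby>", "foaf:name", "Toby Inkster");
meta.store("custom").add("<#manu>", "foaf:name", "Manu Sporny");
```

Here meta.all() returns the triples from both stores, while
meta.store("rdfa").all() returns only those placed there by the
(assumed) RDFa processor.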


>> It's just that I've always worked on the assumption that everyone
>> would want all the metadata to be bundled into one common, queryable
>> location.
>
> I imagine that most people would, yes. That's why I showed meta() with
> no parameters to return the union graph.

Sure.


>> > for (var i in r)
>> > {
>> >  // r[i].foo typeof 'RDFNode'.
>> >  if (r[i].foo.type == 'literal')
>> >    window.alert(r[i].foo.datatype);
>> > }
>>
>> My preference here is for the default mode to be JSON objects. If you
>> look at it from the point of view of a JS programmer, then a query is
>> essentially a request to construct an array of JSON objects, that are
>> based on a certain template.
>>
>> For example, a query for "?name" is really a request to create an
>> array of objects, each with the single property "name":
>>
>>   [
>>     {
>>       name: "Toby Inkster"
>>     },
>>     {
>>       name: "Manu Sporny"
>>     }
>>   ]
>
> This is essentially the same as what I'm suggesting, but they'd get
> back:
>
> [
>  {
>    "name": {
>      "value" : "Toby Inkster" ,
>      "type"  : "literal" ,
>      "lang"  : "en"
>    }
>  } ,
>  {
>     "name": {
>      "value" : "Manu Sporny" ,
>      "type"  : "literal" ,
>      "lang"  : "en"
>    }
>  }
> ]

But that's not the same. :)

I'm suggesting that we put JavaScript features to the fore, not simply
use the language to mirror triples.
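
To make the 'query as a template for plain objects' idea concrete, here
is a minimal sketch. The query function, the pattern syntax and the
triple layout are all assumptions made for illustration, not an
existing API; terms beginning with "?" are treated as variables and
omitted terms match anything.

```javascript
// Hypothetical sketch: `query` and the triple layout are assumptions.
const triples = [
  { s: "<#toby>", p: "foaf:name", o: "Toby Inkster" },
  { s: "<#manu>", p: "foaf:name", o: "Manu Sporny" },
  { s: "<#manu>", p: "foaf:nick", o: "manu" }
];

// Match a single triple pattern. Each match becomes a plain object whose
// properties are named after the variables in the pattern.
function query(data, pattern) {
  const results = [];
  for (const t of data) {
    const row = {};
    let matched = true;
    for (const pos of ["s", "p", "o"]) {
      const term = pattern[pos];
      if (term === undefined) continue; // omitted term: matches anything
      if (typeof term === "string" && term.charAt(0) === "?") {
        const name = term.slice(1);
        const seen = Object.prototype.hasOwnProperty.call(row, name);
        if (seen && row[name] !== t[pos]) { matched = false; break; }
        row[name] = t[pos];
      } else if (term !== t[pos]) {
        matched = false;
        break;
      }
    }
    if (matched) results.push(row);
  }
  return results;
}

const names = query(triples, { p: "foaf:name", o: "?name" });
```

With the sample data above, names is an array of two objects, each with
the single property "name".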


> This is pretty similar to the SPARQL Results JSON serialisation
> <http://www.w3.org/TR/rdf-sparql-json-res/> and RDF/JSON
> <http://n2.talis.com/wiki/RDF_JSON_Specification>, both of which are
> pretty widely implemented and supported.

Techniques such as RDF/JSON use JS simply as a serialisation mechanism
for triples, whilst what I'm proposing is the creation of 'semantic
JavaScript'.

I'm not suggesting that we get all of this 'semantic JS' in one go --
but I'd like to see us put the foundations in place.


> The main difference would be that these objects like {
>       "value" : "Manu Sporny" ,
>      "type"  : "literal" ,
>      "lang"  : "en"
> } would also have some object methods defined, such as ".token()" which
> outputs a Turtle-compatible token.

My approach is almost completely the other way around.

Why do we need these 'exploded' objects? What we're doing here is
ignoring the objects that are provided to us in the language (strings,
integers, dates, etc.), and replacing them with RDF objects. That's
not something JS programmers will thank us for. :)

If we have an integer, then why not just make it an integer? Same for
floats, functions, and so on.

True, the difference between a literal and a URI is more subtle, but
here we simply add some helper methods to take a string containing "<"
and ">", and convert it to a URI. (In other words, where you have a
function to go from exploded object to a turtle string, I'm using
strings by default, and providing functions to convert URIs.)
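
A minimal sketch of that direction, going from an 'exploded' node to a
native value, might look like the following. The helper names
(toNative, isUri, toUri) and the choice of datatypes handled are
assumptions for illustration only.

```javascript
// Hypothetical sketch: helper names and datatype coverage are assumptions.
const XSD = "http://www.w3.org/2001/XMLSchema#";

// Convert an 'exploded' node ({ value, type, datatype, lang }) into the
// nearest native JavaScript value.
function toNative(node) {
  if (node.type === "uri") return "<" + node.value + ">";
  switch (node.datatype) {
    case XSD + "integer":
      return parseInt(node.value, 10);
    case XSD + "decimal":
    case XSD + "float":
    case XSD + "double":
      return parseFloat(node.value);
    case XSD + "boolean":
      return node.value === "true" || node.value === "1";
    case XSD + "dateTime":
      return new Date(node.value);
    default:
      return node.value; // plain or unrecognised literal: keep the string
  }
}

// A string wrapped in "<" and ">" is treated as a URI.
function isUri(s) {
  return typeof s === "string" && s.length > 2 &&
    s.charAt(0) === "<" && s.charAt(s.length - 1) === ">";
}

// Strip the angle brackets, or return null if the string is not a URI.
function toUri(s) {
  return isUri(s) ? s.slice(1, -1) : null;
}
```

Note that in this sketch a literal that itself happens to start with "<"
and end with ">" would be mistaken for a URI, which is one of the
ambiguities a real design would have to settle.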


>> There's no need for our JavaScript authors to be testing types, etc.
>> -- they should get back an immediately usable set of objects.
>>
>> I know people are going to say "what about the difference between URIs
>> and literals?", "what about data types?"...and so on.
>
> If people care about whether things are URIs or literals, and what
> datatype has been used, and the language, then they can look
> at .type, .datatype and .lang. If they don't care, they can just look
> at .value and ignore the rest.

Exactly. So why not go the further step and use native JS objects,
rather than everything the programmer writes having to have '.value'
on it?


>> For example:
>>
>>   [
>>     {
>>       name: "Toby Inkster",
>>       foo: 17
>>     },
>>     {
>>       name: "Manu Sporny",
>>       bar: 93.25,
>>       action: function () {
>>         ...
>>       }
>>     }
>>   ]
>>
>> As you can see, the principle is always to make the object feel
>> 'natural' to a JS programmer, and in a sense to conceal its RDF
>> origins.
>
> I can see the value in "flattening" the structure, so that {
>  "value" : "foo" ,
>  "type"  : "blah"
> } just becomes a simple string "foo", but the problem is that it's
> impossible to go the other way - to get back to the unflattened
> structure reliably.

It's not impossible, but yes, there are issues.

But before we go any further with that, we'd need to define the
use-cases for going the other way; we don't want to make this more
complicated for programmers just to allow for something that never
happens.

In the main I see the results of a query as being something that
programmers then work with, but not necessarily put back into the
triple store. However, longer term I do have an idea that there would
be 'live' JS objects, which when updated cause the triple store to
also be updated, but I think that's probably too ambitious for now.

As it happens, in my parser I do allow JSON objects to be imported
into the triple store, and I think it's reasonable to map JSON
integers to XSD integers, etc.
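
As a sketch of what such an import could look like (this is not the code
from my parser; importObject, datatypeFor and the output layout are
hypothetical, and mapping non-integer numbers to xsd:double is an
assumption):

```javascript
// Hypothetical sketch: names, output layout and the double mapping are
// assumptions, not the behaviour of any particular parser.
const XSD = "http://www.w3.org/2001/XMLSchema#";

// Choose an XSD datatype for a native value; null means a plain literal.
function datatypeFor(value) {
  if (typeof value === "number") {
    return XSD + (Number.isInteger(value) ? "integer" : "double");
  }
  if (typeof value === "boolean") return XSD + "boolean";
  if (value instanceof Date) return XSD + "dateTime";
  return null;
}

// Turn each non-function property of obj into one triple about subject.
function importObject(subject, obj) {
  return Object.keys(obj)
    .filter(key => typeof obj[key] !== "function")
    .map(key => {
      const v = obj[key];
      const value = v instanceof Date ? v.toISOString() : String(v);
      return {
        s: subject,
        p: key,
        o: { value, type: "literal", datatype: datatypeFor(v) }
      };
    });
}

const imported = importObject("<#toby>", {
  name: "Toby Inkster",
  foo: 17,
  bar: 93.25
});
```

One limitation worth noting: JavaScript has a single number type, so a
value such as 17.0 is indistinguishable from 17 and would be imported as
an integer here.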

Are there other scenarios that you are thinking of?


> If the outer array was replaced by an iterator object that could act
> like an array but have methods...

Right. In fact I see this containing object as having a number of
other properties that would help when processing the data. For
example, it could contain a base URL, against which all relative URLs
within the data objects are resolved. It might also contain prefix
mappings, so that CURIEs could be used in place of full URIs as well.
That kind of thing.
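
A minimal sketch of such a containing object follows. The names
(createResults, expand, resolve) are hypothetical, and the sketch relies
on the standard URL constructor for resolving against the base.

```javascript
// Hypothetical sketch: `createResults`, `expand` and `resolve` are
// illustrative names, not an existing API.
function createResults(rows, base, prefixes) {
  return {
    rows,
    base,
    prefixes,
    length: rows.length,
    // Expand a CURIE such as "foaf:name" using the prefix mappings;
    // anything with an unknown prefix is returned unchanged.
    expand(curie) {
      const i = curie.indexOf(":");
      if (i === -1) return curie;
      const prefix = curie.slice(0, i);
      const known = Object.prototype.hasOwnProperty.call(prefixes, prefix);
      return known ? prefixes[prefix] + curie.slice(i + 1) : curie;
    },
    // Resolve a possibly relative URL against the container's base URL.
    resolve(relative) {
      return new URL(relative, base).href;
    }
  };
}

const results = createResults(
  [{ name: "Toby Inkster", page: "#toby" }],
  "http://example.org/people",
  { foaf: "http://xmlns.com/foaf/0.1/" }
);
```

The data objects themselves stay small (results.rows[0].page is just
"#toby"), and the container supplies the context needed to turn them
into full URIs when a script asks for that.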


> ... then your flattened structure could be
> returned using:
>
> var r = document.meta.query('...').simple();

That's possible, although I would still prefer my 'JS-native' format
to be the default, because I think we should aim to make the default
functionality as 'natural' as possible for JS programmers.


> I wonder how much of this could be prototyped in a JavaScript library
> before browsers pick it up? (And to allow scripts using document.meta to
> function in older browsers.) Probably quite a lot. GRDDL might be
> difficult because of cross-site requests.

Most of what I've described here is in my Ubiquity RDFa parser [1]; it
uses document.meta, but there is only one store for the triples:

  document.meta.store.query( );

However, it wouldn't take much to expose it using the technique we've discussed:

  document.meta.store["rdfa"].query( );

and start to make it play nicely with other parsers.

The library is currently going through a revamp (being merged with my
other JS libraries, such as XForms, notifications, and SMIL), and a
rename -- which is why I haven't mentioned it a great deal yet.

But I'm working on it now, to get it into a position where some of
these ideas can be tested out.

Regards,

Mark

[1] <http://ubiquity-rdfa.googlecode.com/>

--
Mark Birbeck, webBackplane

mark.birbeck@webBackplane.com

http://webBackplane.com/mark-birbeck

webBackplane is a trading name of Backplane Ltd. (company number
05972288, registered office: 2nd Floor, 69/85 Tabernacle Street,
London, EC2A 4RR)
Received on Wednesday, 21 October 2009 11:52:43 UTC