
Re: Experimental RDFa extractor in JS

From: Niklas Lindström <lindstream@gmail.com>
Date: Fri, 20 Apr 2012 18:31:51 +0200
Message-ID: <CADjV5jfXJrujnPvxPvsrX3pUF-AX3z+KeYR=TaTxsYtUt=aJ=Q@mail.gmail.com>
To: Gregg Kellogg <gregg@greggkellogg.net>
Cc: Ivan Herman <ivan@w3.org>, public-rdfa-wg <public-rdfa-wg@w3.org>

Hi!

2012/4/20 Gregg Kellogg <gregg@greggkellogg.net>:
> On Apr 20, 2012, at 8:23 AM, Niklas Lindström wrote:
>
>> Hi Gregg,
>>
>> 2012/4/20 Gregg Kellogg <gregg@greggkellogg.net>:
>>> On Apr 20, 2012, at 3:52 AM, "Niklas Lindström" <lindstream@gmail.com> wrote:
>> [...]
>>>> Yes, that's what I do too, for exactly those reasons. The shape of the
>>>> output is entirely based on the form of the input, i.e. using the same
>>>> terms and CURIEs (populating @context as needed). One thing I haven't
>>>> yet done, but plan to, is to merge descriptions about the same
>>>> resource even if they're dispersed throughout the page.
>>>
>>> Note that you can leave such merging to JSON-LD framing, which does this anyway.
>>>
>>>> While that
>>>> does deviate from the actual shape in the source page, it is so much
>>>> better for consumption, and I think is to be expected. Another thing I
>>>> don't do is any kind of coercion. Literals with datatype or deviating
>>>> from any given @language are represented in expanded JSON-LD form.
>>>> I've yet to decide whether to change that or make it configurable.
>>>
>>> This might also be left to JSON-LD API methods. For instance, the "automatic" flag to compaction could generate the best context for you to use, and coerce your data for you. It can be expensive, though, and for any real application, a JSON-LD context matching the data could be provided to compact or frame.
>>
>> At this point I'd like to stick to a strict and very simple solution,
>> with one predictable result tree (based on the source RDFa structure,
>> but merging anything dispersed). I'd like this to be lightweight and
>> simple, with close to no API. The fact that this solution produces
>> JSON-LD is a benefit, but it is basically skimmed data, mainly usable
>> for simple things. I think of it mostly as an RDFa equivalent to the
>> microdata-to-JSON approach. (And the merging I speak of is roughly
>> corresponding to how that handles the @itemref stuff.)
>
> For some reason, my point is being confused. I think the approach you're taking is just great. My point was that if anything more complicated needs to be done, it can be left to JSON-LD tools. I'm all for keeping your implementation as simple as possible, and close to the form of the document you've processed.
>
> If a developer wants to do more with the data than what you produce, the JSON-LD API has a number of useful tools. There shouldn't be any direct dependencies between your tool and the JSON-LD implementations; leave that up to the developer.

Ah, sorry; then all is good and well, and we're definitely on the same
page! For my part, I suspect that my argument here was equally
directed at myself, since I've actually had impulses to embark on
the very path I argue against. :) That is, instead of making decisions
that keep things basically simple, I've actually pondered whether
to adopt JSON-LD API flags for controlling the resulting shape...
which, as we all agree, is better done in separate steps, with
separate libraries.
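
For concreteness, the "expanded JSON-LD form" for literals mentioned
earlier in the thread looks roughly like the following sketch. The
keywords are standard JSON-LD (@value, @type, @language); the example
values themselves are invented:

```javascript
// A datatyped literal kept in expanded form, rather than coerced
// to a native value via @context:
var expandedLiteral = {
  "@value": "2012-04-20",
  "@type": "http://www.w3.org/2001/XMLSchema#date"
};

// A literal deviating from any page-level @language, likewise
// kept in expanded form:
var languageLiteral = {
  "@value": "ett exempel",
  "@language": "sv"
};

// A coercing extractor would instead emit a plain string or number
// and rely on an @context to restore the datatype or language.
```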

The main thing yet to decide, then, is whether to merge or keep
disparate description objects of the same resource. Do you find any
merit in my reasoning there, regarding expectations, usability, and
the "microdata-using-@itemref to JSON" parallel? To some extent, I
already diverge from the exact "frame" of the source when it comes to
@rev handling, and, as mentioned, when dealing with multiple references
to the same resource. So I think a certain amount of "normalization"
is inevitable. Although I believe it best to keep nesting in general,
rather than to move all resource objects to the top-level @graph. But
that's also debatable. We know one size won't fit all, so the goal
here is to establish some kind of "reasonable expectation based on the
source", if possible...
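
As a rough illustration of the merging discussed above, here is a
minimal sketch: collapsing JSON-LD node objects that share an @id into
one object, turning repeated properties into arrays. The function and
data names are hypothetical, not taken from the actual extractor:

```javascript
// Merge node objects sharing the same @id; repeated non-@id
// properties are collected into arrays.
function mergeById(nodes) {
  var byId = {};
  nodes.forEach(function (node) {
    var target = byId[node["@id"]];
    if (!target) {
      byId[node["@id"]] = target = {};
    }
    Object.keys(node).forEach(function (key) {
      if (!(key in target)) {
        target[key] = node[key];
      } else if (key !== "@id") {
        // Property already present: merge values into an array.
        if (!Array.isArray(target[key])) target[key] = [target[key]];
        target[key] = target[key].concat(node[key]);
      }
    });
  });
  return Object.keys(byId).map(function (id) { return byId[id]; });
}

// Two descriptions of the same resource, dispersed in the page,
// become one node object carrying both properties:
var merged = mergeById([
  { "@id": "#niklas", "name": "Niklas" },
  { "@id": "#niklas", "homepage": "http://example.org/" }
]);
```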

(By the way, I've pushed the changes I mentioned. Still quite a moving
target, but it may be an improvement.)

Best regards,
Niklas
Received on Friday, 20 April 2012 16:32:51 GMT
