Re: WG: Schema.org considered helpful from Rurik Greenall on 2011-06-20 (public-lld@w3.org from June 2011)

From: Rurik Greenall <rurik.greenall@ub.ntnu.no>
Date: Mon, 20 Jun 2011 11:45:32 +0200
To: Lukas Koster <l.koster@uva.nl>
Cc: Dan Brickley <danbri@danbri.org>, public-lld@w3.org
Message-Id: <629D62E0-53AE-41F3-9492-AA8BDB1C420B@ub.ntnu.no>
Lukas, Dan, others,

For the record(!), I'd like to clarify my position on records.

When using the term, I am thinking about data objects not unlike rows in a database (think library catalogue records).

This means that values of a record are treated as information within a context (the record). Because the values exist only within the context of the record, which in turn exists only within the context of the system to which it belongs, and so on, there is a problem when the information is used outside of that system. When this occurs, it becomes difficult to answer questions like "what is the record about?" and "in what context was this created?".

To avoid this problem, values are typically transferred as complete, saturated records, with an implicit provenance and supra-contextual understanding (i.e. here is a record from NTNU's SRU service which provides bibliographic data). This ensures that a) I know where I am getting something from, b) I know what I am getting and c) when I finally get the information, I can assume that there is some cue within the record (or even in the source from which it was retrieved) that will allow me to glean what I need from the record for it to be meaningful to me. Think MARCXML; think an understanding of AACR2, ISBD…

To my mind, both of the difficult questions are easily answered when using linked data because the subject is obvious and the contextual provenance is in the URI (i.e. it comes from ntnu.no). It is also therefore safe to extract values individually.
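
To make the contrast concrete, here is a rough sketch in Python. Everything in it is invented for illustration: the identifier, the catalogue.example.org URI and the MARC-style field keys are assumptions, not NTNU's actual data or any particular system's API. It only shows that a value pulled out of a record loses its subject and source, while a triple pulled out of a set of triples keeps both.

```python
from typing import NamedTuple
from urllib.parse import urlparse


class Triple(NamedTuple):
    subject: str    # URI naming the thing being described
    predicate: str  # URI naming the property
    value: str      # the literal value


# A record-style row (illustrative MARC-like keys): the value "Sult"
# only means something while it stays inside this record.
record = {"id": "991234", "245a": "Sult", "100a": "Hamsun, Knut"}

# The same data as triples. The subject URI is hypothetical.
BOOK = "http://catalogue.example.org/book/991234"
triples = [
    Triple(BOOK, "http://purl.org/dc/terms/title", "Sult"),
    Triple(BOOK, "http://purl.org/dc/terms/creator", "Hamsun, Knut"),
]


def publisher_host(triple: Triple) -> str:
    """Return the host of the subject URI, used here as a rough provenance cue."""
    return urlparse(triple.subject).hostname or ""


lone_value = record["245a"]  # just a string: about what? created where?
lone_triple = triples[0]     # still carries its subject and, via the URI, its source

print(lone_value)
print(lone_triple.subject, publisher_host(lone_triple))
```

The host name is of course only a hint at provenance, not a guarantee, but it is a hint that survives extraction, which is the point.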

In cases where structured data is embedded in and intrinsically linked to a document (without the context-providing "subject" of the triple), this information becomes contextless (and therefore meaningless) if extracted from its context. This seems to me to be the case with microdata.

As someone who works with library data, I see that technologies for record transfer (like Z39.50, SRU, OAI-PMH) are generally hobbled because they transfer information as records, rather than as values that I can manipulate directly.

I don't think I have ever said that the record concept is obsolete (if I have, I apologize for any offense caused); my point is just that we now have a better alternative, and that is something I stand by.

As someone implementing systems, I don't see any problem using a record concept at a superficial level; it's an easy, friendly concept, but your business data shouldn't be affected by it. In the case of libraries, sadly, it generally is.

Cheers,

R.

On Jun 19, 2011, at 2:58 PM, Lukas Koster wrote:

> OK, agreed, point taken. I take that back. I'm all for building bridges. The term "obsolete" could possibly be applied to the concept of "fixed introvert structures" (like MARC records). A microformat could be thought of as one of a number of possible views on a specific subject area, useful in specific circumstances. But there may be other microformats aimed at the same subject that are more useful in other contexts.  Let's see if microformats may bridge the gap.
> 
> PS: I still think Z39.50 works better and faster than SRU in most cases ;-) And yes, it's obsolete in the same sense that I used that term for 'records'
> 
> Lukas
> 
> On 19-6-2011 14:38, Dan Brickley wrote:
>> On 19 June 2011 11:44, Lukas Koster<l.koster@uva.nl>  wrote:
>>> Just a few remarks:
>>> 
>>> - The internet was created by the USA army
>>> (http://en.wikipedia.org/wiki/Internet#History), the World Wide Web came out
>>> of the scientific/research world.
>> 
>> ...btw the first Web browser, "World Wide Web" (later aka "Nexus") was
>> perfectly capable of displaying images (and movies, sounds etc), by
>> passing them on to the local operating system, NeXT. Mosaic's charm
>> was inline images and running on a more consumer-accessible OS.
>> 
>>> - As Rurik Greenall states: microformats are just another form of the
>>> obsolete "record" concept. In linked data it's about networked information
>> 
>> Oh, I disagree here! The idea of a record or document format will be
>> useful forever. RDF and Linked Data basically give us dictionaries of
>> terms (schemas/ontologies) but don't generally say anything about
>> kinds of documents. This has left a bit of a gap: it's very nice
>> having RDF vocab like 'shippingOrder', ... but in the XML tradition we
>> had something more: we could express our expectations for what useful
>> package of info we would find in any particular XML shipping order
>> document. I don't think this is obsolete, but rather something we'll
>> slowly rebuild on top of RDF, eg. DC's notion of application profile
>> is in this area.
>> 
>> And re 'microformats' and calling them 'obsolete' is not the nicest
>> way of building bridges with that community. There has been a lot of
>> pointlessly hostile 'microformats vs rdf' discussion over the last
>> several years, I hope we can move away from that and see things more
>> as explorations of different tradeoffs in design space. Many of the
>> features of RDFa 1.1, for example, address concerns repeatedly raised
>> by microformats and html5 people; and the design discussions around
>> microformats.org/wiki/microformats-2 likewise bring it closer to the
>> approach rdf takes. In this context throwing around language like
>> 'obsolete' can do a lot more harm than good, in terms of the slow
>> drift towards consensus.
>> 
>> cheers,
>> 
>> Dan
>> 
>> ps. I've been calling z39.50 obsolete since mid-90s but it hasn't gone
>> away yet ;)
> 

Rurik Thomas Greenall
NTNU University Library | NTNU Universitetsbiblioteket
rurik.greenall@ub.ntnu.no
@brinxmat
http://folk.ntnu.no/greenall
Received on Monday, 20 June 2011 17:33:29 UTC