Vocabularies for file data, content events, errors from Niklas Lindström on 2009-09-02 (semantic-web@w3.org from September 2009)

From: Niklas Lindström <lindstream@gmail.com>
Date: Wed, 2 Sep 2009 11:37:08 +0200
To: Semantic Web <semantic-web@w3.org>
Message-ID: <cf8107640909020237y745f014es81c3b0739986a160@mail.gmail.com>

Hi all!

I'm looking for any common vocabularies/ontologies describing the following:

* simple file data properties, describing:
  - checksum and algorithm (and/or direct properties for MD5, SHA-1,
    SHA-2 etc.),
  - filename/slug (unless dct:identifier is suitable enough?).

* content-related events, such as "the act of reading from a
dataset/collection (e.g. a feed)", "create", "update" and specifically
"delete" (or "deletion")

* (content) error reporting, primarily for transport and
(RDF-)validation failures
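
For illustration, the file data in the first item could be gathered
roughly as follows. This is only a sketch: the dictionary keys
("filename", "checksums", "algorithm", "value") and the helper name
describe_file are hypothetical placeholders, not terms from any
published vocabulary, and the choice of algorithms is an assumption.

```python
import hashlib

# Assumed set of algorithms; any name accepted by hashlib.new() would do.
ALGORITHMS = ("md5", "sha1", "sha256")


def describe_file(filename, data):
    """Return a plain dict holding a filename and checksum+algorithm pairs.

    The key names are hypothetical; they only stand in for whatever
    vocabulary terms end up being used.
    """
    checksums = [
        {"algorithm": name, "value": hashlib.new(name, data).hexdigest()}
        for name in ALGORITHMS
    ]
    return {"filename": filename, "checksums": checksums}


record = describe_file("example.pdf", b"hello")
for checksum in record["checksums"]:
    print(checksum["algorithm"], checksum["value"])
```

Keeping the algorithm as a value next to the digest (rather than one
property per algorithm) is the "checksum+algorithm" variant; the
"direct properties" variant would instead use one key per algorithm.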

I have looked at e.g. PRISM, OAI, DSpace, Harmony and similar (e.g.
properties used by the Talis Platform, the old(?) PICS metadata, EARL,
voiD, SIOC...), but I'm not sure which would be considered the simplest
and most interoperable in this domain. I'm prepared to dig more into these,
but I'd really appreciate some advice (in case I've overlooked some
initiative or upcoming unification effort).

This is to be used to describe (a log of) the aggregation of several
data sources, specifically reading Atom archives to compile a dataset
of (legal) documents. We would like to expose the "system" events with
these vocabularies (for monitoring) as an auxiliary dataset (the
primary data being the legal document domain and their
interrelationships).

Currently we use AtomOwl to represent versioned entries*, but rely on
our own definitions of "Collect", "DeletedEntry", "TransportError",
"ContentError" etc., which is why I'm looking for something more
general. We describe the source and result datasets as instances of
sioc:Space, connected with dct:source, and reference the subscription
feeds with sioc:feed.
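
A minimal sketch of that dataset description, written as plain Python
tuples and serialized to N-Triples. The example.org URIs are
placeholders, and two details are assumptions rather than something
stated above: that dct:source points from the result dataset to the
source dataset, and that sioc:feed is attached to the source dataset.

```python
SIOC = "http://rdfs.org/sioc/ns#"
DCT = "http://purl.org/dc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Placeholder identifiers, not the real dataset or feed URIs.
SOURCE = "http://example.org/dataset/source"
RESULT = "http://example.org/dataset/result"
FEED = "http://example.org/feed/current"

triples = [
    (SOURCE, RDF_TYPE, SIOC + "Space"),
    (RESULT, RDF_TYPE, SIOC + "Space"),
    # Assumed direction: the result dataset is derived from the source.
    (RESULT, DCT + "source", SOURCE),
    # Assumed subject: the source dataset references its subscription feed.
    (SOURCE, SIOC + "feed", FEED),
]


def to_ntriples(rows):
    """Serialize (subject, predicate, object) URI triples as N-Triples."""
    return "\n".join("<%s> <%s> <%s> ." % row for row in rows)


ntriples = to_ntriples(triples)
print(ntriples)
```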

(* An entry being thought of as a "manifest" (in time) for a resource,
exposing content/variants and optional enclosures of an aggregate
resource (say a web page with pictures, or a PDF with separate
appendices).)

Best regards,
Niklas Lindström

Received on Wednesday, 2 September 2009 09:38:10 UTC