a pluralist account of DC 'application profiles' from Dan Brickley on 2010-11-15 (public-lld@w3.org from November 2010)

From: Dan Brickley <danbri@danbri.org>
Date: Mon, 15 Nov 2010 20:39:19 +0100
To: public-lld <public-lld@w3.org>
Message-ID: <AANLkTi=0dObW+fwU5yifZa=3S1oCtievF_2TEKZrWQEs@mail.gmail.com>

So chatting with Tom, I remembered something I wrote ~2004/2005 that
tries to explain the idea of Application Profiles to a Dublin Core
audience, without attaching it to any particular technical story.

I'm not claiming it's 100% reusable now, but there might be some text
or thoughts in there to salvage. The basic idea is that application
profiles bridge Dublin Core's role as 'community crossroads' with
technical concerns about (mostly) RDF graph patterns. And that even
the slides or audio from a conference talk on some local use of DC is
a valid, valuable form of 'application profile documentation'.

Dan

From http://dublincore.org/usage/meetings/2006/04/profile-review/2005-10-05.danbri-dcap-draft.txt

[[[
A Proposal for Dublin Core Application Profiles

Dan Brickley

Dublin Core Application Profiles are collections of descriptive patterns,
community conventions and multi-vocabulary metadata structures
used by members of the Dublin Core community.

A variety of descriptive techniques are available to the community
in support of this, combining human oriented and mechanical
approaches depending on the level of formality, precision and machine
checkability appropriate. Application Profiles are grounded in real world
metadata practice, user needs and local priorities. As such they
typically express constraints and information needs that simultaneously
extend and restrict global vocabulary standards such as those maintained
by DC itself.

A DCAM community serves as a forum for ongoing documentation of
these restrictions and extensions, relying on the DC Abstract Model
(and associated formalisms such as RDF) to ensure that a consistent
approach to description is taken by all Application Profiles. In the
simplest case, a DC Application Profile characterises the shared
description interests of some community of interest. This can be
achieved using natural language and other human-oriented
materials (eg. case studies, online discussion fora, etc).

In addition to human-centric documentation, Application Profiles can
often usefully be described with various machine-readable techniques.

The applicability of such techniques will vary depending on the degree
of consensus and commonality of interest in the relevant community. Such
techniques can improve the multilingual accessibility of the profile's
documentation, eg. by allowing human-oriented summaries to be
automatically generated in a variety of natural languages. The remainder
of this document outlines some technical approaches that can be used
when documenting an application profile.

A DC Abstract Model

There are various machine-readable ways in which we can
represent the descriptive patterns, community conventions and
vocabulary-mixing scenarios relevant to some DC community of interest.

These techniques are usually grounded in tools and software via some
specific binding of the Dublin Core Abstract Model, and MAY be grounded
via a syntax-neutral abstract binding such as RDF's (ie. a concrete grounding
that itself supports multiple DCAM-compatible notations).

1. DCAM Binding Type

A DCAM profile MAY restrict itself to a subset of the possible
DC Abstract Model bindings.

For example, it may specify that compliant instance data be
expressed using a specific named binding (such as DC-in-XHTML).

Restrictions on binding type SHOULD specify the specific version of
a DCAM concrete binding.

2. Document Exemplars

DCAM documentation MAY include 1 or more exemplar instance documents
indicating typical expected usage.

Exemplar instances MUST be relative to named DCAM bindings, ie. if
sample RDF/XML is shown, some version of a DC-in-RDF Abstract Model
binding would be named.

Notes: Exemplar document instances can be useful for user-facing
documentation, since actual examples are often more accessible and
understandable than schema-level abstractions.

3. Document Instance Syntactic Restrictions (eg. DTDs)

For DCAM bindings that specify a textual notation (typically but
not always XML based), any relevant syntactic profiling mechanisms
MAY be used.

For example, XML DTDs, Relax-NG schemas, W3C XML Schemas or Schematron
schemas could all be used to express the restrictions and extensions
of an application profile in terms of XML-validatable rules.

Notes:
The expression of syntactic restrictions in XML, although relative
to a specific textual notation, provides a powerful and industrially
accepted mechanism for expressing information needs, shared expectations etc.

This technique can be used against RDF/XML and non-RDF XML instance
formats.

For non-RDF XML formats, techniques such as W3C's GRDDL can be used
to indicate transformation techniques (eg. using XSLT) that can
convert instances into DCAM-compatible abstractions and hence into
other formats. These transformations can ensure that instance data
can be converted into other DCAM bindings; however there is no known
mechanism for translating syntactic-binding level constraints into
their equivalent in other bindings.

RSS 1.0 is an example of a document format that has additional
non-RDF syntactic constraints (XML nesting structures, required element
patterns etc.). These conventions can be captured to varying degrees
using various XML schema languages.

Note also that RDF-based profiles need not necessarily use DC terms
in their instance data, since mappings to DC terms can be expressed
within schemas using RDFS/OWL technology.
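
As a rough illustration of such a mapping, the sketch below applies
rdfs:subPropertyOf by hand to a set of triples held as Python tuples.
The ex:topic property and the example.org URIs are hypothetical, and
the code is a minimal sketch of the inference, not an RDFS reasoner.

```python
DC_SUBJECT = "http://purl.org/dc/elements/1.1/subject"
SUBPROPERTY_OF = "http://www.w3.org/2000/01/rdf-schema#subPropertyOf"
EX_TOPIC = "http://example.org/terms/topic"  # hypothetical local property

# Schema-level mapping: the local property is a sub-property of dc:subject.
schema = {(EX_TOPIC, SUBPROPERTY_OF, DC_SUBJECT)}
# Instance data that uses only the local property (hypothetical document URI).
instance = {("http://example.org/doc/1", EX_TOPIC, "Carpentry")}

def super_properties(prop, schema):
    """All properties reachable from prop via rdfs:subPropertyOf (transitive)."""
    found, todo = set(), [prop]
    while todo:
        current = todo.pop()
        for s, p, o in schema:
            if s == current and p == SUBPROPERTY_OF and o not in found:
                found.add(o)
                todo.append(o)
    return found

def with_inferred(instance, schema):
    """Add the statements entailed by rdfs:subPropertyOf to the instance data."""
    inferred = set(instance)
    for s, p, o in instance:
        for sup in super_properties(p, schema):
            inferred.add((s, sup, o))
    return inferred

for triple in sorted(with_inferred(instance, schema)):
    print(triple)
```

The instance data never mentions dc:subject, yet the inferred set
contains a dc:subject statement that DC-aware tools could consume.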

4. Namespace and Term Enumeration

A simple technique for indicating an Application Profile is to list
the namespaces that are typically used in instance data.
The profile SHOULD indicate these namespaces using URIs.

DCAPs SHOULD indicate the status of any enumerated list of namespaces,
ie. whether the list is closed, open/extensible, mandatory, etc.

At a finer granularity, profiles MAY list metadata terms that
are typically used in instance data. The profile SHOULD indicate
these terms using URIs.
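
A minimal sketch of how such an enumeration might be checked against
an XML instance, using Python's standard library. The PROFILE
structure, its "closed" status value, the record element and the
example.org namespace are all assumptions made for illustration; the
draft does not define a machine-readable format for these lists.

```python
import xml.etree.ElementTree as ET

# Hypothetical profile declaration: namespace URIs plus a status flag.
PROFILE = {
    "status": "closed",  # assumption: "closed" means no other namespaces allowed
    "namespaces": {
        "http://purl.org/dc/elements/1.1/",
        "http://xmlns.com/foaf/0.1/",
    },
}

def namespaces_used(xml_text):
    """Return the set of namespace URIs used by element names in the document."""
    used = set()
    for elem in ET.fromstring(xml_text).iter():
        # ElementTree spells qualified names as "{namespace-uri}localname".
        if isinstance(elem.tag, str) and elem.tag.startswith("{"):
            used.add(elem.tag[1:].split("}", 1)[0])
    return used

def unexpected_namespaces(xml_text, profile=PROFILE):
    """Namespaces outside the enumerated list; always empty for an open list."""
    if profile["status"] != "closed":
        return set()
    return namespaces_used(xml_text) - profile["namespaces"]

SAMPLE = """<record xmlns:dc="http://purl.org/dc/elements/1.1/"
                   xmlns:ex="http://example.org/terms/">
  <dc:title>A sample record</dc:title>
  <ex:shelfmark>QA76</ex:shelfmark>
</record>"""

print(unexpected_namespaces(SAMPLE))
```

The same approach extends to term-level enumeration by comparing full
"{namespace}localname" tags rather than namespaces alone.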

TODO: @@pointers to Schemas project etc proposals

5. Use Case Query Patterns

An Application Profile MAY characterise the descriptive patterns
important in some community of practice by expressing use cases
in terms of a query language (or other data access mechanism)
for some DCAM binding.

5a. Syntactic query patterns

For profiles presented in terms of a concrete syntactic binding
(techniques (2) and (3)), syntax-specific data matching and query
languages (eg. XPath, XSLT, XQuery) MAY be used to document 1 or
more common query structures. The Schematron language provides a
suitable high level approach for using XPath to document metadata
expectations in this way.
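
The following sketch shows the general idea in Python, using the
limited XPath subset supported by the standard library's ElementTree.
It is written in the spirit of Schematron but is not Schematron; the
records/record element names and the two expectations are hypothetical
and do not come from any published DC binding.

```python
import xml.etree.ElementTree as ET

NS = {"dc": "http://purl.org/dc/elements/1.1/"}

# Hypothetical expectations: (description, XPath, minimum number of matches).
EXPECTATIONS = [
    ("each record has a title", "dc:title", 1),
    ("each record has a subject", "dc:subject", 1),
]

def check_record(record):
    """Return descriptions of the expectations that a record element fails."""
    return [
        description
        for description, path, minimum in EXPECTATIONS
        if len(record.findall(path, NS)) < minimum
    ]

SAMPLE = """<records xmlns:dc="http://purl.org/dc/elements/1.1/">
  <record><dc:title>First</dc:title><dc:subject>Carpentry</dc:subject></record>
  <record><dc:title>Second</dc:title></record>
</records>"""

root = ET.fromstring(SAMPLE)
# One list of failed expectations per record, in document order.
report = [check_record(record) for record in root.findall("record")]
print(report)
```

Keeping each XPath next to a human-readable description means the same
list can serve as both documentation and a validation aid.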

5b. Abstract query patterns

Profiles MAY document expected use cases in terms of abstract
query patterns, ie. structured in terms of abstract metadata
statements, properties and URIs rather than in terms of a
textual binding of the DCAM.

Given the close relationship between DCAM and RDF, any RDF
query language or data access formalism will be applicable to
this task. W3C's SPARQL language offers particular facilities
(such as datatyping and optional patterns) that make it a good
choice here.

An Application Profile is not itself a collection of such queries.
Rather, an Application Profile's documentation MAY be augmented with
1 or more indicative queries, provided as a way of linking
application requirements, usage scenarios and user needs with
formally represented patterns of properties and metadata statements.

For example, the following query indicates a pattern for using
Dublin Core, FOAF and SKOS vocabularies together for the purpose
of finding the interests of colleagues of some specified Agent.
Note that this is a highly task-centric pattern, and not a
general abstract pattern. Profile documentation MAY present a
range of indicative queries, illustrating both specific scenarios
and more general vocabulary combination patterns.

(@@todo: fix SKOS stuff to be accurate!)

eg. 1: AP Use Case - "combining DC, FOAF + SKOS for
locating colleague subject interests"

PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX skos: <http://www.w3.org/2002/skos-namespace-@@@/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT ?ilabel ?iblurb ?wphp ?name

WHERE
{
[ rdf:type foaf:Agent;
foaf:name ?name;
foaf:workplaceHomepage ?wphp;
foaf:homepage [ dc:subject [ skos:label ?ilabel; skos:scopeNote ?iblurb ] ]
]
}

This says, in effect,

"Find us values for ?ilabel, ?iblurb, ?wphp, ?name where
there is something with a name we call ?name
that is of type Agent, that has a workplace homepage
that is ?wphp, and find us SKOS labels and scopenotes for the dc:subject
of that Agent's homepage".

Note that this:

- combines multiple vocabulary namespaces
- describes exact patterns for some (but not all!) ways in which
such namespaces can be combined
- binds to the DC Abstract Model via RDF
- and hence binds to any concrete DCAM syntax
- corresponds directly to machine-usable queries, for any data that
can be transformed into RDF metadata statements (eg. via GRDDL
per (3) above, for non-RDF/XML notations).

The results of such a query are sets of variable-value associations,
for example some results of this query above might be:

ilabel: Carpentry
iblurb: Making things with wood
name: Eric Miller
wphp: http://www.w3.org/

ilabel: Photography
iblurb: Taking blurry photos
name: Dan Brickley
wphp: http://www.w3.org/

...assuming a dataset in which SKOS, DC and FOAF were combined
appropriately, ie. FOAF for people stuff, DC for 'subject' (we could
also have asked for titles, descriptions etc of the agent's homepage),
and SKOS to elaborate on the details of the subject of the homepage.
(The simplistic scenario assumption here is that people are
interested in the topics their homepages cover...)
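
To make the link between such a pattern and its variable-value results
concrete, here is a small Python sketch that matches the same pattern
against triples held in memory. It is an illustration only, not a
SPARQL engine. The blank nodes of the query become the helper
variables ?agent, ?page and ?topic, ?iblurb is bound to the scope note
as the prose reading describes, prefixed names are kept as plain
strings, and the homepage URI and blank node labels are hypothetical.

```python
def match(pattern, triples, binding=None):
    """Yield variable bindings that satisfy every triple pattern in turn."""
    binding = binding or {}
    if not pattern:
        yield binding
        return
    first, rest = pattern[0], pattern[1:]
    for triple in triples:
        candidate = dict(binding)
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # A variable must keep the value it was first bound to.
                if candidate.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            # Every position matched, so continue with the remaining patterns.
            yield from match(rest, triples, candidate)

PATTERN = [
    ("?agent", "rdf:type", "foaf:Agent"),
    ("?agent", "foaf:name", "?name"),
    ("?agent", "foaf:workplaceHomepage", "?wphp"),
    ("?agent", "foaf:homepage", "?page"),
    ("?page", "dc:subject", "?topic"),
    ("?topic", "skos:label", "?ilabel"),
    ("?topic", "skos:scopeNote", "?iblurb"),
]

TRIPLES = [
    ("_:dan", "rdf:type", "foaf:Agent"),
    ("_:dan", "foaf:name", "Dan Brickley"),
    ("_:dan", "foaf:workplaceHomepage", "http://www.w3.org/"),
    ("_:dan", "foaf:homepage", "http://example.org/dan/"),  # hypothetical URI
    ("http://example.org/dan/", "dc:subject", "_:photo"),
    ("_:photo", "skos:label", "Photography"),
    ("_:photo", "skos:scopeNote", "Taking blurry photos"),
]

SELECTED = ("?ilabel", "?iblurb", "?wphp", "?name")
rows = [{var: b[var] for var in SELECTED} for b in match(PATTERN, TRIPLES)]
for row in rows:
    print(row)
```

Each printed row corresponds to one of the result blocks shown above,
restricted to the four selected variables.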

]]]

Received on Monday, 15 November 2010 19:39:55 UTC