Use cases for multiple graphs in RDF, from SDMX/DDI

I've added three more use cases [1][2][3] for working with multiple graphs in RDF to the wiki page. These arose in my recent work on mapping two major XML-based statistical standards to RDF. Unlike most of the use cases considered so far, these are more about data management than provenance.

I'm copying the text below.

Best,
Richard

[1] http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs-UC#Versioning_in_SDMX_and_DDI
[2] http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs-UC#Marking_published_artefact_as_strongly_versioned
[3] http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs-UC#Composition_of_a_logical_artefact_from_multiple_published_artefacts


== Background ==

DDI and SDMX are two complementary XML-based standards widely used in the social and economic sciences and in national statistics. DDI is concerned with metadata that describes the production of statistical data (surveys, censuses, questionnaires, data cleanup, tabulation, etc.). SDMX is concerned with the dissemination and reporting of aggregated “cube” data. An RDF Schema expression of the core of SDMX has been created as the RDF Data Cube Vocabulary; a similar effort is under way for DDI.


== 1. Versioning in SDMX and DDI ==

SDMX and DDI have strong notions of ownership and versioning. Artefacts such as code lists, question banks and data structure definitions are managed by an authority (“maintenance agency”) and carry, in addition to an authority-assigned “local” identifier, a numeric version identifier. The identity of an artefact consists of its agency, artefact type, local identifier, version number, and (in special cases) the identity of a parent artefact. Different versions of otherwise the same artefact can exist side by side in the same containing XML instance.
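
To make the identity rule concrete, here is a rough Python sketch of the five-part identity as a hashable key. The class name, field names and the sample values (EXAMPLE_AGENCY, CL_SEX, the code values) are my own placeholders and are not taken from either specification; representing the version as a tuple of integers is also just an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ArtefactId:
    """Hypothetical model of an artefact identity; not an SDMX or DDI API."""
    agency: str                       # maintenance agency
    artefact_type: str                # e.g. code list, data structure definition
    local_id: str                     # authority-assigned "local" identifier
    version: Tuple[int, ...]          # numeric version, assumed here to be (major, minor)
    parent: Optional["ArtefactId"] = None  # only used in special cases


# Two versions of an otherwise identical artefact (placeholder values).
v1 = ArtefactId("EXAMPLE_AGENCY", "Codelist", "CL_SEX", (1, 0))
v2 = ArtefactId("EXAMPLE_AGENCY", "Codelist", "CL_SEX", (2, 0))

# Both can live side by side in one container, because the version
# is part of the identity and therefore part of the key.
container = {
    v1: ["M", "F"],
    v2: ["M", "F", "U"],
}
```

In an RDF setting, each such key would presumably correspond to its own graph name, which is what makes this a multiple-graphs use case.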


== 2. Marking published artefacts as strongly versioned ==

DDI and SDMX allow an artefact to be marked as “published”. Once published, it MUST NOT be changed unless a new version number is assigned.
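
A minimal sketch of how a store might enforce this policy, assuming artefacts are keyed by (agency, local identifier, version). The Registry class, its method names and the sample keys are hypothetical and only illustrate the rule; neither standard prescribes such an interface.

```python
class PublishedArtefactError(Exception):
    """Raised when a published artefact would be modified in place."""


class Registry:
    """Hypothetical artefact store that enforces the 'published' policy."""

    def __init__(self):
        self._content = {}       # key -> artefact content
        self._published = set()  # keys that have been marked as published

    def put(self, key, content):
        # Drafts may be overwritten freely; published artefacts may not.
        if key in self._published:
            raise PublishedArtefactError(
                f"{key} is published; assign a new version number instead"
            )
        self._content[key] = content

    def publish(self, key):
        # Only an artefact that exists can be marked as published.
        if key not in self._content:
            raise KeyError(key)
        self._published.add(key)

    def get(self, key):
        return self._content[key]


registry = Registry()
key_v1 = ("EXAMPLE_AGENCY", "CL_SEX", (1, 0))  # placeholder identity
registry.put(key_v1, ["M", "F"])
registry.publish(key_v1)

# A change is only accepted under a new version number.
key_v2 = ("EXAMPLE_AGENCY", "CL_SEX", (2, 0))
registry.put(key_v2, ["M", "F", "U"])
```

Calling registry.put(key_v1, ...) again after publication raises PublishedArtefactError, while the new version is stored alongside the old one.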


== 3. Composition of a logical artefact from multiple published artefacts ==

An actual SDMX or DDI instance is often composed from various artefacts that can be maintained by different agencies. For example, an SDMX dataset that reports national statistics may use a data structure definition maintained by a supranational statistics organization such as Eurostat, and may use code lists defined by various standards bodies. Each of these artefacts is independently versioned. When referencing an artefact, the specific referenced version has to be part of the reference. The exception is “late binding”, where the latest available version is assumed, leading to a more brittle but easier-to-manage setup.

The XML specifications contain special elements that agencies can use to publish collections of re-usable artefacts.

When an actual SDMX dataset or DDI instance is processed, the processor must be able to retrieve the correct versions of all referenced artefacts and build a complete representation.
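
The resolution step a processor has to perform can be sketched roughly as follows, including the late-binding case. The function name, the shape of the lookup table and the sample identifiers are assumptions made for illustration; versions are again assumed to be tuples of integers so that "latest" can be decided by ordinary comparison.

```python
from typing import Dict, Optional, Tuple

Version = Tuple[int, ...]
Key = Tuple[str, str]  # (agency, local identifier)


def resolve(
    available: Dict[Key, Dict[Version, object]],
    agency: str,
    local_id: str,
    version: Optional[Version] = None,
):
    """Return (version, artefact) for a reference.

    If the reference names a version, exactly that version is returned.
    If version is None, the reference is treated as late-bound and the
    highest available version is used.
    """
    versions = available[(agency, local_id)]
    if version is None:
        version = max(versions)  # late binding: latest available version
    return version, versions[version]


# Placeholder artefacts from two hypothetical agencies.
available = {
    ("AGENCY_A", "DSD_EXAMPLE"): {(1, 0): "dsd-1.0"},
    ("AGENCY_B", "CL_EXAMPLE"): {(1, 0): "cl-1.0", (1, 1): "cl-1.1", (2, 0): "cl-2.0"},
}

pinned = resolve(available, "AGENCY_B", "CL_EXAMPLE", (1, 1))
late_bound = resolve(available, "AGENCY_B", "CL_EXAMPLE")
```

Here the pinned reference keeps returning version 1.1 no matter what is published later, whereas the late-bound reference changes meaning as soon as AGENCY_B publishes a newer version, which is the brittleness mentioned above.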

Received on Friday, 23 September 2011 19:12:22 UTC