RE: Recommendations: URIs from Young,Jeff (OR) on 2011-05-01 (public-lld@w3.org from May 2011)

From: Young,Jeff (OR) <jyoung@oclc.org>
Date: Sun, 1 May 2011 09:37:24 -0400
To: <gordon@gordondunsire.com>, <public-lld@w3.org>
Message-ID: <52E301F960B30049ADEFBCCF1CCAEF590C51CF87@OAEXCH4SERVER.oa.oclc.org>
> The simpler ISBD case would output a triple <ResourceURI> <has BNB
> number> "BNB number". Assuming a one-to-one correspondence between the
> BNB number (a "local" identifier) and the Resource instance, there
> would be no duplication of BNB numbers.

I agree this is the best way to handle all identifier schemes that aren't URIs. The general pattern would be this:

tbox:hasXYZIdentifier a rdf:Property;
  rdfs:comment "has XYZ identifier";
  rdfs:subPropertyOf dcterms:identifier.

When creating these properties, the domain and range should probably be left unspecified unless the property is being coined by the managing agency itself. If and when the managing agency does coin a property, any local properties can be deprecated and mapped to the official property using owl:DeprecatedProperty and owl:equivalentProperty.
I see some solutions that try to treat :Identifier as a class with :identifierScheme and :identifierValue as properties, but I think that is a problematic alternative. Coining one subproperty per scheme should alleviate the urgency to get all the various identifier agencies into the same room at the same time.
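
As a rough sketch of the pattern above applied to the BNB number: the ex: and bl: namespaces, the property names ex:hasBNBNumber and bl:bnbNumber, and the sample resource URI and number are all made up for illustration; as far as I know no agency has published such a property.

```turtle
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:     <http://www.w3.org/2002/07/owl#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix ex:      <http://example.org/tbox#> .      # hypothetical local namespace
@prefix bl:      <http://example.org/bl-terms#> .  # hypothetical agency namespace

# Locally coined property; rdfs:domain and rdfs:range are deliberately omitted.
ex:hasBNBNumber a rdf:Property ;
    rdfs:comment "has BNB number" ;
    rdfs:subPropertyOf dcterms:identifier .

# Instance data: the identifier is attached as a plain literal.
<http://example.org/resource/1234> ex:hasBNBNumber "1234" .

# Later, if the agency coins its own property, the local one can be
# deprecated and mapped to the (here hypothetical) official one.
ex:hasBNBNumber a owl:DeprecatedProperty ;
    owl:equivalentProperty bl:bnbNumber .
```

Because the local property is a subproperty of dcterms:identifier, generic consumers that only understand Dublin Core can still find the value.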

> In the FRBR case, only one of W, E, M, or I URIs would be the subject
> of the triple; e.g. <WorkURI> <has BNB number> "BNB number". Then
> <WorkURI> <frbrer:isRealizedThrough> <ExpressionURI>.
> <ExpressionURI> <frbrer:isEmbodiedIn> <ManifestationURI>.
> <ManifestationURI> <frbrer:isExemplifiedBy> <ItemURI>.

This can be enforced by adding an owl:maxCardinality restriction of 1 on the hasXYZIdentifier property.
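
A minimal sketch of such a restriction, assuming a hypothetical ex: namespace and a hypothetical class ex:XYZIdentifiedThing (in OWL a cardinality restriction is stated on a class, with the property named via owl:onProperty):

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <http://example.org/tbox#> .   # hypothetical namespace

# Members of this (hypothetical) class have at most one XYZ identifier value.
ex:XYZIdentifiedThing a owl:Class ;
    rdfs:subClassOf [
        a owl:Restriction ;
        owl:onProperty ex:hasXYZIdentifier ;
        owl:maxCardinality "1"^^xsd:nonNegativeInteger
    ] .
```

If no suitable class is available, declaring the property an owl:FunctionalProperty is another way to say that each resource has at most one value. Either way, the constraint limits the number of identifier values per resource, and a reasoner would only report a problem when it sees two distinct values on the same resource.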

> The other three triples are automatically generated. The issue is with
> the first triple: is a BNB number associated with a Work? Or a
> Manifestation? (Expression and Item seem unlikely to me.) I haven't
> thought that through, and I would expect the national agency to
> determine what is appropriate.

It would be ideal if official agencies coined these properties, but I don't think we need to wait for them to get around to it. When coining unofficial properties to fill in gaps, it's probably best to leave the rdfs:domain unspecified.

> I would expect different answers
> regarding other "local" identifiers, including ISBNs, etc.

ISBNs do have a URI form (urn:isbn), although people don't seem to use it for whatever reason, so they're probably not the best example. The point is important, though.
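
For comparison, here is a sketch of the two ways an ISBN could appear in the data. The ex: namespace, the ex:hasISBN property, the resource URI, the ISBN value, and the title are illustrative assumptions only.

```turtle
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix ex:      <http://example.org/tbox#> .   # hypothetical namespace

# Literal form, following the same pattern as any non-URI identifier scheme.
<http://example.org/resource/1234> ex:hasISBN "0451450523" .

# URI form: the ISBN itself names the resource, so it can be the subject.
<urn:isbn:0451450523> dcterms:title "Sample title" .
```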

Jeff

> The
> essential point is that whatever the "identifier" triple, the FRBR
> relationship properties between Group 1 elements allow easy generation
> of the triples for the remaining three URIS to link to the local
> identifier. I'm not expecting the initial decisions to be easy (that's
> why the ISBD case is simpler).
> 
> Another approach is to determine the relationship property between the
> ISBD Resource class and the FRBR WEMI classes. This is something that
> the ISBD and FRBR Review Groups will do. There is a strong clue in the
> text of the ISBD consolidated edition: "In the terminology of the
> Functional Requirements for Bibliographic Records (FRBR), the ISBD is
> applied to describe manifestations, by means of description of the item
> in hand as an exemplar of the entire manifestation." This suggests that
> ISBD Resource is equivalent to FRBR Manifestation. But it can't be
> owl:sameAs ... because ISBD attribute properties (all with domain
> Resource) have equivalences across FRBR W, E, and M properties (all
> with the appropriate domain of Work, Expression, or Manifestation).
> [The FRBR WEMI classes are water under the bridge, although there may
> be an opportunity to tackle some of the resultant issues in the
> proposed consolidated FR family model.]
> 
> A way forward may also lie with the RDA unbounded properties currently
> under consideration by the DCMI RDA Task Group. That is, relating
> bounded (ranged) ISBD and FRBR properties to unbounded (not ranged)
> equivalents.
> 
> Whatever, the essential thing is to get existing LibraryLand
> identifiers (all basically "local") linked to the basic URIs for
> instance legacy records, so that, in most cases, a project can identify
> the URI via the "local" identifier, and then proceed to publish/migrate
> triples for the record using existing or newly-minted RDF properties.
> This is solely aimed at avoiding minting huge numbers of URIs for the
> same thing.
> 
> Of course, it would be neater all round if the national agencies just
> output all of their legacy records' instance data as triples, before
> any local project, to avoid the duplication of URIs ...
> 
> But, as you have said, there's the elephant of MARC(21) and the lack of
> RDF properties and/or mappings to existing namespaces (including DCT,
> BiBO, etc. as well as ISBD and FRBR).
> 
> I guess all this points to an increasing urgency in getting all the
> parties (national cataloguing agencies, metadata aggregators, and
> standards maintainers) together.
> 
> Cheers
> 
> Gordon
> 
> 
> 
> On 29 April 2011 at 17:36 Karen Coyle <kcoyle@kcoyle.net> wrote:
> 
> > Quoting "gordon@gordondunsire.com" <gordon@gordondunsire.com>:
> >
> >
> > > e.g. http://bl.info/bnb#1234 (Resource), http://bl.info/bnb#1234W (Work),
> > > http://bl.info/bnb#1234E (Expression), http://bl.info/bnb#1234M
> > > (Manifestation).
> > >
> > > What would actually be published is a set of triples of the form:
> > >
> > > <WEM/Resource URI> <has BNB number> "BNB number".
> >
> > Doesn't this result in there being multiple bnb numbers for the same
> > work and the same expression?
> >
> > Alex Haffner demonstrated a flow chart of the Europeana process at the
> > meeting in Cologne last year that was in two steps: the first looked
> > just like this, and the second was where they merged works and
> > expressions and assigned those new "merged" URIs. So you'd have
> >
> > W123  ->  WXX
> > E123  ->  E99
> > M123
> >
> > W789  -> WXX
> > E789  -> E88
> > M789
> >
> > Well, it would be easier to explain on paper than in email, but you
> > probably get the drift. I actually like the idea of the WEM having
> > non-merged and merged identifiers -- although that's based on system
> > management functions rather than the data model. The non-merged IDs
> > can be local, while the merged ones will be ideal for sharing.
> > (Hmmm,,, got to think about that some more.)
> >
> > kc
> >
> > >
> > > This would allow other projects to avoid creating duplicate URIs
> > > subsequently linked with OWL equivalence properties.
> > >
> > > A project would have to know, say, the BNB number ... The same approach
> > > could use other identifiers such as ISBN, etc., although there is
> > > significant ambiguity (not everything has an ISBN, some ISBNs are plain
> > > wrong, etc.). Extending this further, it might be necessary to publish
> > > some additional triples giving further identification data such as title
> > > and edition (i.e. a minimal identification/description set of triples):
> > >
> > > <WEM/Resource URI> <has title proper> "The title".
> > > <WEM/Resource URI> <has publication date> "2008".
> > > etc.
> > >
> > > This approach also minimises the quantity of triples that an agency
> > > needs to publish, reduces barriers due to rights issues, and extends the
> > > formal role of a national bibliographic agency in recording, preserving,
> > > and disseminating the publication output of that nation.
> > >
> > > Also, declaring which URI minting pattern is used will allow projects
> > > to mint future-proof URIs for local stuff.
> > >
> > > OK, in practice things would not be as straightforward (e.g. national
> > > bibliography numbers referencing Manifestations instead of Expressions
> > > or Works, ISBNs usually reference Manifestations but are often used to
> > > reference Works, etc.).
> > >
> > > I guess OCLC could use a similar approach on behalf of its members
> > > (especially those who are national agencies).
> > >
> > > Does this make sense?
> > >
> > > Cheers
> > >
> > > Gordon
> > >
> > >
> > >
> > >
> > >
> > >
> > > On 28 April 2011 at 04:51 Emmanuelle Bermes <manue@figoblog.org> wrote:
> > >
> > >> >
> > >> > We can obviously change the wording. But I still am not sure what we
> > >> > are promoting in terms of prioritizing the creation of URIs. Can we
> > >> > use Tom's wording?
> > >> >
> > >> > "Very broadly, the "library world", along with standards
> > >> > developers such as W3C, FOAF, and DCMI should work on assigning
> > >> > URIs to properties and classes.  But creators of specific
> > >> > Linked Data projects should be concerned first and foremost
> > >> > with _creating_ URIs for their things -- the "instances" about
> > >> > they want to make statements -- then re-use URIs for properties
> > >> > and classes (when possible) in order to make those statements."
> > >>
> > >> +1 for Tom's wording : great summary, as usual ;-)
> > >> Emma
> > >>
> > >>
> > >> >
> > >> > kc
> > >> >
> > >> > Quoting Ed Summers <ehs@pobox.com>:
> > >> >
> > >> >> On Wed, Apr 27, 2011 at 4:24 PM, Thomas Baker <tbaker@tbaker.de> wrote:
> > >> >>>
> > >> >>> I think we're agreeing that "assigning URIs" is a key point
> > >> >>> but that for the sake of readers we need to distinguish "URIs
> > >> >>> for properties and classes" from "URIs for dataset items
> > >> >>> (instances)".
> > >> >>
> > >> >> Nicely put Tom. I second Jeff's recommendation to at least reference
> > >> >> ABox and TBox to ground the more library friendly definitions
> > >> >> wherever that may happen: glossary, etc.
> > >> >>
> > >> >> //Ed
> > >> >>
> > >> >>
> > >> >
> > >> >
> > >> >
> > >> > --
> > >> > Karen Coyle
> > >> > kcoyle@kcoyle.net http://kcoyle.net
> > >> > ph: 1-510-540-7596
> > >> > m: 1-510-435-8234
> > >> > skype: kcoylenet
> > >> >
> > >> >
> > >> >
> > >>
> >
> >
> >
> > --
> > Karen Coyle
> > kcoyle@kcoyle.net http://kcoyle.net
> > ph: 1-510-540-7596
> > m: 1-510-435-8234
> > skype: kcoylenet
> >
> >
Received on Sunday, 1 May 2011 13:37:56 UTC