RE: Back to identifiers from Young,Jeff (OR) on 2013-01-19 (public-schemabibex@w3.org from January 2013)

From: Young,Jeff (OR) <jyoung@oclc.org>
Date: Fri, 18 Jan 2013 22:09:10 -0500
To: "Kevin Ford" <kefo@3windmills.com>, <public-schemabibex@w3.org>
Message-ID: <52E301F960B30049ADEFBCCF1CCAEF5912B540B6@OAEXCH4SERVER.oa.oclc.org>
> At the risk of being simple, the URI identifies whatever its
> description says it is.

This can't be denied. My eyes were opened when Richard Cyganiak made this point a while back because it is a reasonable challenge to httpRange-14. To use or ignore httpRange-14 is a choice that needs to be made by organizational policy-makers who understand the implications (without expecting them to understand the HTTP protocol per se). There are legal/moral/utility implications of collecting and managing the description of the "information resource" as distinct from the "non-information resource" that is being described. The httpRange-14 pattern reserves/preserves that ability regardless of whether early adopters realize/utilize/accept those responsibilities or not.
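
To make the distinction concrete, here is a minimal Python sketch of how a client might read an HTTP GET status code under the httpRange-14 resolution. The names `Reading` and `httprange14_reading` are hypothetical, and treating every other status (including the 301 Corey reports below) as "undetermined" is an assumption: the resolution itself only speaks to 2xx, 303, and 4xx responses.

```python
from enum import Enum


class Reading(Enum):
    """What a GET response lets a client conclude about the requested URI."""
    INFORMATION_RESOURCE = "the URI identifies an information resource (a document)"
    ANY_RESOURCE = "the URI may identify anything; the redirect target describes it"
    UNDETERMINED = "the response does not settle the nature of the resource"


def httprange14_reading(status: int) -> Reading:
    """Classify an HTTP status code (hypothetical helper, not a library API)."""
    if 200 <= status < 300:
        # 2xx: the thing identified is itself a document.
        return Reading.INFORMATION_RESOURCE
    if status == 303:
        # 303 See Other: the URI can stand for a book, a person, and so on.
        return Reading.ANY_RESOURCE
    # 4xx, and (by assumption here) anything else such as 301.
    return Reading.UNDETERMINED


if __name__ == "__main__":
    for code in (200, 301, 303, 404):
        print(code, "->", httprange14_reading(code).value)
```

Ignoring httpRange-14 amounts to skipping this check and letting the description alone say what the URI identifies.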

> In this snippet from Karen's email
> 
>  >>>> <http://bowker.com/identifiers/isbn/9780553479430>
>  >>>>      a schema:Identifier;
>  >>>>      schema:name "9780553479430";
>  >>>>      schema:inStandard "ISBN";
>  >>>>      schema:issuedBy <http://viaf.org/viaf/142397918>;
>  >>>>      schema:issueDate  "1997";
>  >>>>      schema:identifies <http://www.worldcat.org/oclc/38264520>.
> 
> The URI identifies an identifier.  We learn a lot about the identifier:
>   the standard it conforms to, who issued it, when it was issued, what
> it identifies.  Identifying an identifier (minting a URI for it) is a
> bit of an abstract notion, which is why, I guess, it was being
> described as a skos:Concept in the earlier emails of this thread, but
> there is administrative data there someone might benefit from (and I
> imagine more could be added to it, especially if you were Bowker).
> 
> This ancillary information about the actual ISBN number may seem
> unnecessary to some, but may be important to others. MARC certainly
> provides space to record such information about an identifier (maybe
> not about ISBNs, but other fields certainly - check out 028).  Karen
> proposes a way to capture that ancillary information.  It's particularly
> vital when the identifier is not a (relatively) popular, well-known
> one, such as an ISBN or an LCCN or an OCLCNUM. Also, *if* Bowker minted
> such a URI, and provided that type of information, one could plausibly,
> simply reference the schema:identifier by URI.
> 
> Now, in addition to the above URI for the ISBN identifier itself,
> Bowker could choose to mint this URI (and associated RDF), which also
> takes advantage of the ISBN number:
> 
> <http://bowker.com/books/isbn/9780553479430>
>  a schema:Book;
>  schema:title "War and Peace";
>  schema:author <http://viaf.org/viaf/96987389>;
>  schema:pubDate  "1997";
>  schema:identifier
> <http://bowker.com/identifiers/isbn/9780553479430>.

I agree with Kevin's recap and strongly encourage Bowker to publish (and recollect if necessary) this kind of resource as well. (Inconsistency aside, Schema.org has three different ways to encode ISBNs because they care.) My only quibble is to encourage Bowker to make this particular URI pattern "very clean" by eliminating the /books token from their URI. The reason is that, unless I'm mistaken, some ISBNs (and potentially more in the future) don't identify "books".
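
A rough Python sketch of the URI patterns in play, for illustration only. The `bowker.com` URIs are the hypothetical ones from Kevin's example, not anything Bowker actually publishes; the "clean" pattern simply drops the /books token as suggested above, and the function names are made up. The `urn:isbn:` form is the URN that Karen mentions further down the thread.

```python
def isbn13_is_valid(isbn: str) -> bool:
    """Check the ISBN-13 check digit (weights alternate 1, 3 over the first 12 digits)."""
    if len(isbn) != 13 or not (isbn.isascii() and isbn.isdigit()):
        return False
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(isbn[:12]))
    return (10 - total % 10) % 10 == int(isbn[12])


def isbn_uris(isbn: str, base: str = "http://bowker.com") -> dict:
    """Build the candidate URIs discussed in the thread (all patterns are assumptions)."""
    if not isbn13_is_valid(isbn):
        raise ValueError(f"not a valid ISBN-13: {isbn!r}")
    return {
        # URI for the identifier itself (a schema:Identifier in Karen's example)
        "identifier": f"{base}/identifiers/isbn/{isbn}",
        # URI for the thing the ISBN identifies, without the /books token,
        # since the thing may not be a book
        "thing": f"{base}/isbn/{isbn}",
        # URN form of the same ISBN
        "urn": f"urn:isbn:{isbn}",
    }


if __name__ == "__main__":
    for name, uri in isbn_uris("9780553479430").items():
        print(f"{name}: {uri}")
```

Keeping the type out of the path means the URI stays stable even if the description later says the thing is something other than a schema:Book.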

I agree with the rest of Kevin's message, so it's nice to see some convergence!

Jeff

> Generally, an ISBN will be treated like a string of characters, like
> so:
> 
> <http://www.worldcat.org/oclc/38264520> schema:isbn "9780553479430" .
> 
> but there may be cases where you could have something like this:
> 
> <http://www.worldcat.org/oclc/38264520>
>  schema:isbn "9780553479430";
>  schema:identifier _:b123;
>  schema:identifier _:b456;
>  schema:identifier _:b789.
> 
> _:b123 a schema:Identifier;
>  schema:issuedBy "Harvard UL";
>  schema:name "12345678".
> 
> _:b456 a schema:Identifier;
>  schema:issuedBy "Yale UL";
>  schema:name "asdfghj".
> 
> _:b789 a schema:Identifier;
>  schema:issuedBy "Princeton UL";
>  schema:name "qwerttyu".
> 
> Now, whether we want to propose this as part of a Schema extension
> is another matter, but the issue Karen raised is real and present in
> the data.  And, if you're an institution like Bowker, there are two
> ways you can describe an identifier.
> 
> As for the whole business about what an ISBN is or isn't or what it can
> or cannot do, well...  I can see it both ways.  An ISBN is an
> identifier.  It can identify a Book.
> 
> As for a URI, it's whatever the data says it is.
> 
> Yours,
> Kevin
> 
> 
> 
> 
> On 01/18/2013 06:04 PM, Corey Harper wrote:
> > I see your point, Jeff, and you're definitely correct about your use
> > of redirects & to-the-letter adherence to all that fun range-14 stuff,
> > though I'm getting a 301 rather than a 303 (see below)...
> >
> > I'm just a little wary of reusing an identifier that has a pretty
> > specific legacy meaning as both a thing ID and a metadata ID,
> > particularly when the primary usage seems to be the former.
> >
> > I suspect that's just a discomfort that I'll get over when/if the
> > legacy meanings are slowly erased from our collective memories... :)
> >
> > Thanks,
> > -Corey
> >
> > *** 301-ing for me... ***
> >> curl -I http://www.worldcat.org/oclc/38264520
> >> HTTP/1.1 301 Moved Permanently
> >> Date: Fri, 18 Jan 2013 22:59:22 GMT
> >> Server: Apache
> >> Location: /title/war-and-peace/oclc/38264520
> >
> > This new location 200's w/ or without Accept headers...
> >
> >
> > On Fri, Jan 18, 2013 at 4:04 PM, Young,Jeff (OR) <jyoung@oclc.org> wrote:
> >> Corey,
> >>
> >> You're not crazy. A URI is an identifier.
> >>
> >> There is no good reason to model identifiers as both URIs and non-URI
> >> text-strings now-a-days. The latter need to carry too much context to
> >> be effective. Nevertheless, they exist in legacy systems. The
> >> mechanism that's being proposed creates a bridge from legacy string
> >> identifiers to the URI identifiers. Only systems that are coupled with
> >> the legacy forms will care about this bridge. Whether Schema.org cares
> >> enough about the past to adopt such an identifier bridge is unclear.
> >> That's why Richard suggests tabling this discussion in favor of SKOS
> >> patterns (which are effectively the same).
> >>
> >> The reason the example is weird is because you're overlooking the
> >> implications of Cool URIs for the Semantic Web.
> >>
> >> http://www.w3.org/TR/cooluris/
> >>
> >> The example doesn't identify OCLC metadata, it identifies a Book that
> >> OCLC has coined a URI for. The metadata entity has a different URI
> >> identifier. The 303 redirect from the former to the latter is merely a
> >> convenience mechanism.
> >>
> >> Jeff
> >>
> >>> -----Original Message-----
> >>> From: Corey Harper [mailto:corey.harper@gmail.com]
> >>> Sent: Friday, January 18, 2013 2:42 PM
> >>> To: kcoyle@kcoyle.net
> >>> Cc: Young,Jeff (OR); public-schemabibex@w3.org
> >>> Subject: Re: Back to identifiers
> >>>
> >>> Karen, et al.,
> >>>
> >>> How is a URI not an identifier? That's what the "I" stands for, right?
> >>> Am I missing something here? Why would we want two different design
> >>> patterns for actionable http identifiers & text-strings as identifiers?
> >>>
> >>> The kinds of additional metadata one might associate with an
> >>> identifier (who maintains it, when it was issued, &c) seem to apply
> >>> irrespective of whether the identifier is a URI or a string of text,
> >>> no? I agree that the URI for the ISBN does not *need* to be defined.
> >>> But should that prevent an agency that manages library identifiers
> >>> from defining it? I'm not sure I agree that this is out of scope, as
> >>> this is exactly the kind of metadata libraries & related organizations
> >>> provide. Now, it's out of scope for a discussion of schema.org metadata
> >>> about the books themselves; that I agree with.
> >>>
> >>> And I also agree that it's weird that the example claims that the
> >>> ISBN "identifies" some OCLC metadata. That seems wrong to me. If
> >>> anything, both identifiers point, though indirectly, to a book.
> >>>
> >>> Thanks,
> >>> Corey
> >>>
> >>> On Fri, Jan 18, 2013 at 2:21 PM, Karen Coyle <kcoyle@kcoyle.net> wrote:
> >>>> No, a URI is a URI. The identifier property extension that we have
> >>>> talked about is for identifiers that are not URIs. I believe at one
> >>>> point we had something like:
> >>>>
> >>>> Identifier
> >>>>   - value
> >>>>   - source/authority
> >>>>
> >>>> Thus, the URI for the ISBN does not need to be defined using the
> >>>> identifier property extension. Yet the example on the identifier
> >>>> page is:
> >>>>
> >>>> <http://bowker.com/identifiers/isbn/9780553479430>
> >>>>      a schema:Identifier;
> >>>>      schema:name "9780553479430";
> >>>>      schema:inStandard "ISBN";
> >>>>      schema:issuedBy <http://viaf.org/viaf/142397918>;
> >>>>      schema:issueDate  "1997";
> >>>>      schema:identifies <http://www.worldcat.org/oclc/38264520>.
> >>>>
> >>>> Maybe I'm reading this wrong, but as long as there is a URI for the
> >>>> ISBN (and there always is because there is a defined URN for ISBN),
> >>>> then there is no need to re-describe it with the identifier
> >>>> extension. This description of the identifier I believe is out of
> >>>> scope for our work. (And looks a lot like ARK, which possibly had
> >>>> everything right but did not get wide-spread traction). I think we
> >>>> should stick to our task of finding a way to use identifiers that do
> >>>> not yet have URIs. If, instead, you are intending to mint URIs for
> >>>> those identifiers (issuedBy: above) then that is another case.
> >>>> This construct appears in the examples but not in the text, and I
> >>>> don't think we discussed that here. I think it would be
> >>>> over-reaching at this point in time.
> >>>>
> >>>> But what really baffles me here is that the Bowker ISBN is stated
> >>>> as identifying a WorldCat "thing." If anything, that would be
> >>>> reversed since the ISBN is assigned to the book before any library
> >>>> data is created. I do consider the ISBN to be *the* book identifier
> >>>> in our world and that perhaps our examples should look more like
> >>>> publishing examples than library catalog examples.
> >>>>
> >>>> kc
> >>>>
> >>>>
> >>>>
> >>>> On 1/18/13 9:52 AM, Young,Jeff (OR) wrote:
> >>>>>
> >>>>> I'm not sure I follow. The WorldCat URI is a URI, but it wouldn't
> >>>>> make sense to say that its rdf:type is xyz:Identifier. Is that the
> >>>>> concern? That's what I thought Richard was saying for awhile too,
> >>>>> but if you look at his examples he does keep them separate.
> >>>>>
> >>>>> Jeff
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Karen Coyle [mailto:kcoyle@kcoyle.net]
> >>>>>> Sent: Friday, January 18, 2013 12:48 PM
> >>>>>> To: Young,Jeff (OR)
> >>>>>> Subject: Re: Back to identifiers
> >>>>>>
> >>>>>> Worldcat URI is a URI. ISBN URI is a URI. Any problem there?
> >>>>>>
> >>>>>>
> >>>>>> kc
> >>>>>>
> >>>>>> On 1/18/13 9:42 AM, Young,Jeff (OR) wrote:
> >>>>>>>
> >>>>>>> Note that a WorldCat.org URI is not a number. The Linked Data
> >>>>>>> 303 (See Other) redirect is important because the 1st URI
> >>>>>>> identifies "the thing" and the second identifies "a description
> >>>>>>> of the thing" (what Corey calls "a record"). Both can have the
> >>>>>>> same legacy number in them without causing ambiguity.
> >>>>>>>
> >>>>>>> Jeff
> >>>>>>>
> >>>>>>>> -----Original Message-----
> >>>>>>>> From: Karen Coyle [mailto:kcoyle@kcoyle.net]
> >>>>>>>> Sent: Friday, January 18, 2013 12:36 PM
> >>>>>>>> To: Wallis,Richard
> >>>>>>>> Cc: Corey Harper; public-schemabibex@w3.org
> >>>>>>>> Subject: Re: Back to identifiers
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 1/18/13 8:58 AM, Richard Wallis wrote:
> >>>>>>>>
> >>>>>>>>>> For practical reasons, I don't support the notion that an
> >>>>>>>>>> OCLC # or an LCCN are strictly identifiers for a book.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Neither do I
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>
> >>>>>>>> Well, that's news to me, because when I suggested this to you,
> >>>>>>>> you came back with (and I quoted this before):
> >>>>>>>>
> >>>>>>>> "The ISBN is a string of characters (in ISBN scheme that
> >>>>>>>> Bowkers administer) that they have issued to represent the
> >>>>>>>> book - it is not the book.
> >>>>>>>>
> >>>>>>>> The WorldCat URI identifies the Book."
> >>>>>>>>
> >>>>>>>> And in another post:
> >>>>>>>>
> >>>>>>>> ***
> >>>>>>>> URIs are about providing dereferencable identifiers for
> >>>>>>>> 'things'.
> >>>>>>>>
> >>>>>>>> So when for instance the British Library asserts that the URI
> >>>>>>>> for a book in the BNB is sameAs in the German National library
> >>>>>>>> they are saying the books are the same, not the records they
> >>>>>>>> have.
> >>>>>>>>
> >>>>>>>> It is the same with WorldCat - it's not just a pile of records
> >>>>>>>> it is [becoming] a graph (to use the current label) of
> >>>>>>>> relationships between things - people, places, organisations,
> >>>>>>>> concepts, and bibliographic works.
> >>>>>>>>
> >>>>>>>> The URIs represent the things not the records that are being
> >>>>>>>> mined to build descriptions of those things.
> >>>>>>>>
> >>>>>>>> ***
> >>>>>>>>
> >>>>>>>> You might see why I have been confused.
> >>>>>>>>
> >>>>>>>> Here's my take:
> >>>>>>>>
> >>>>>>>> Because of how we have done things in the past, we have
> >>>>>>>> identifiers for records that describe some level of
> >>>>>>>> bibliographic item. De facto, we have also used those
> >>>>>>>> identifiers for the "things" they describe. I suspect that this
> >>>>>>>> is a common situation for anyone in data processing, and I
> >>>>>>>> suggest that we not agonize over it but live with the ambiguity.
> >>>>>>>>
> >>>>>>>> And in this ambiguous world, ISBNs, LCCNs, BNB #s, OCLC#s, all
> >>>>>>>> work reasonably well to identify a creative output. They may
> >>>>>>>> also at times represent the record. That's life.
> >>>>>>>>
> >>>>>>>> So, back to identifiers (and I do NOT want this wrapped up in
> >>>>>>>> the discussion about SKOS because I DO NOT see SKOS:concept as
> >>>>>>>> valid for an identifier), I think our identifier proposal
> >>>>>>>> should be for identifiers that are not in URI format. full stop.
> >>>>>>>>
> >>>>>>>> kc
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Karen Coyle
> >>>>>>>> kcoyle@kcoyle.net http://kcoyle.net
> >>>>>>>> ph: 1-510-540-7596
> >>>>>>>> m: 1-510-435-8234
> >>>>>>>> skype: kcoylenet
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> Karen Coyle
> >>>>>> kcoyle@kcoyle.net http://kcoyle.net
> >>>>>> ph: 1-510-540-7596
> >>>>>> m: 1-510-435-8234
> >>>>>> skype: kcoylenet
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>
> >>>> --
> >>>> Karen Coyle
> >>>> kcoyle@kcoyle.net http://kcoyle.net
> >>>> ph: 1-510-540-7596
> >>>> m: 1-510-435-8234
> >>>> skype: kcoylenet
> >>>>
> >>
> >
>
Received on Saturday, 19 January 2013 03:10:46 UTC