RE: Back to identifiers from Young,Jeff (OR) on 2013-01-23 (public-schemabibex@w3.org from January 2013)

From: Young,Jeff (OR) <jyoung@oclc.org>
Date: Wed, 23 Jan 2013 16:27:49 -0500
To: "Graham Bell" <graham@editeur.org>
Cc: <kcoyle@kcoyle.net>, <public-schemabibex@w3.org>, "Laura Dawson" <ljndawson@gmail.com>
Message-ID: <52E301F960B30049ADEFBCCF1CCAEF5912C014C8@OAEXCH4SERVER.oa.oclc.org>
Graham,

 

Thanks for the background. 

 

Would it be possible to push the "distributed agency" redirect mechanism
I proposed from isbn.org to their agencies (like bowker.com) up an extra
level? For example:

 

http://gtin.info/gtin/{.*} 

crunch, crunch...

303 (See Other) redirect to...

 

http://isbn.org/isbn/{.*}

crunch, crunch...

303 (See Other) redirect to...

 

http://bowker.org/isbn/{.*}

200 (OK) {insert description here}

 

The fact that these were originally conceptualized as string identifiers
doesn't need to be a barrier to upgrading them to Linked Data.
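
A minimal sketch of that delegation chain, for illustration only. The
hosts are the ones named above, but the delegation table, the resolve()
helper, and the rule that each level simply swaps the URI prefix are
assumptions; a real registry would more likely choose the agency from
the ISBN's registration group, and no network request is made here.

```python
from typing import Dict, List, Tuple

# Hypothetical delegation table: each authority hands the request to a
# more specific agency. The mapping is assumed, not an existing service.
DELEGATES: Dict[str, str] = {
    "http://gtin.info/gtin/": "http://isbn.org/isbn/",
    "http://isbn.org/isbn/": "http://bowker.org/isbn/",
}


def resolve(uri: str, max_hops: int = 5) -> List[Tuple[int, str]]:
    """Return the (status, uri) hops a client would see for `uri`."""
    hops: List[Tuple[int, str]] = []
    for _ in range(max_hops):
        for prefix, target in DELEGATES.items():
            if uri.startswith(prefix):
                hops.append((303, uri))           # See Other: ask the delegate
                uri = target + uri[len(prefix):]  # keep the identifier suffix
                break
        else:
            hops.append((200, uri))               # no delegate: describe it here
            return hops
    raise RuntimeError("too many redirects")


for status, hop in resolve("http://gtin.info/gtin/9780553479430"):
    print(status, hop)
```

The point of the sketch is only that every level can keep the same
trailing identifier while handing off responsibility for the description.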

 

Jeff

 

From: Graham Bell [mailto:graham@editeur.org] 
Sent: Wednesday, January 23, 2013 12:36 PM
To: Young,Jeff (OR)
Cc: kcoyle@kcoyle.net; public-schemabibex@w3.org; Laura Dawson
Subject: Re: Back to identifiers

 

 

Yes, Jeff, it's fair to ask. So I checked with the International ISBN
Agency, which is the Registration Authority for the ISBN standard (and
which happens to be based in the next office to me...)

 

ISO is the International Organization for Standardization and this body
is responsible for developing and publishing thousands of international
standards in all fields of work and life. In practice, international
standards are developed by a panel of experts within an ISO-convened
technical committee. Once a draft has been developed, it is shared and
voted on by ISO's national member bodies. For example in the US this
body is ANSI. International standards are considered documents that pass
through a number of different stages on their way to consensus and
publication - it typically takes between 2 and 4 years, though it can be
much longer.

 

Each international standard must have a defined scope. ISBN
(International Standard Book Number) has a scope that includes books
(not serials or music scores) and certain types of related products that
are available to the public. Since it is intended to be an identifier
for the supply chain, there has always tended to be a high level of
flexibility to allow educational products in a variety of formats to
qualify for ISBN, since they often flow through the same supply chain.
Hence, flash reading cards and educational software have been included
within ISBN scope, though games are not. Similarly audio books were
included in scope because the standard states that the format in which
content is delivered is irrelevant - thus the text could be audible and
qualify to receive an ISBN. A feature film presentation would not
qualify though.

 

ISBN was first developed in the late 1960s, and was the first kid on the
block in terms of a supply chain identifier for the book world - and as
such it was quickly and very widely adopted. Over time, and as ISBN was
globally accepted, the need for other more specialised identifiers in
other parts of the media space has been recognised. So for example, ISSN
(International Standard Serial Number), ISMN (International Standard
Music Number) and ISRC (International Standard Recording Code) have been
introduced. It is also a fundamental tenet of international standards
developed by ISO that a particular identifier should not be used when
another, more appropriate identifier designed for that purpose is
available - thus don't assign an ISBN to a music score because an ISMN
should be used instead. So while the scope of ISBN has grown to
encompass many items that flow through the books supply chain, it has to
be clearly differentiated from the other ISO identifier standards.

 

In addition there are other types of standard identifier - such as the
GTIN (Global Trade Item Number) developed by GS1
(http://www.gs1.org/). GTINs can be used to identify ANY item that can
be ordered, priced or invoiced in the supply chain. There are different
types of GTIN (8, 12, 13 or 14 digits) but essentially the common factor
that they share is that they are unrestricted in scope. (ISBN is in
effect a restricted sub-scope of GTIN-13, as is ISMN.)
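
To make the "sub-scope of GTIN-13" point concrete, here is a rough
sketch. The check-digit rule (weights 1 and 3 alternating over the first
twelve digits) is the standard GTIN-13 one; the 978/979 prefixes for ISBN
and the 979-0 prefix for ISMN come from general knowledge of the
standards rather than from this message, and the function names are
purely illustrative.

```python
def gtin13_check_digit(first12: str) -> int:
    """Compute the GTIN-13 check digit from the first 12 digits."""
    if not (len(first12) == 12 and first12.isascii() and first12.isdigit()):
        raise ValueError("expected exactly 12 ASCII digits")
    # Weights alternate 1, 3, 1, 3, ... starting from the leftmost digit.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return (10 - total % 10) % 10


def classify_gtin13(code: str) -> str:
    """Say which (assumed) prefix range a 13-digit code falls into."""
    if not (len(code) == 13 and code.isascii() and code.isdigit()):
        return "invalid"
    if gtin13_check_digit(code[:12]) != int(code[12]):
        return "invalid"
    if code.startswith("9790"):
        return "ISMN"            # 979-0 is reserved for printed music
    if code.startswith(("978", "979")):
        return "ISBN"
    return "other GTIN-13"


print(classify_gtin13("9780553479430"))  # ISBN
```

The ISBN used here is the one from the examples earlier in this thread.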

 

The ISBN standard is currently in its 4th edition since it was first
adopted in 1972 by ISO. Each new edition has brought new developments
and extensions in scope as the market for books has developed. Further
changes in scope would be considered provided they were in keeping with
the broad purpose of ISBN (i.e. to identify textual monographic
publications and certain related products that are available to the
public) and that they did not conflict with other existing (or soon to
be existing) identifier standards.

 

And finally, there have been questions about expressing ISBNs as URIs.
First, remember that ISBNs somewhat predate the Internet. They do have a
URN representation, but obviously that is not resolvable. There is also
the ISBN-A - which is in fact a variety of DOI. This does allow you to
represent some ISBNs using an HTTP URI, and resolution of that URI can
lead to metadata about the book. However, the ISBN-A is not yet widely
used: it's confined largely to Italy and Germany, and someone has to
register the ISBN-A (separately from assigning the ISBN itself) and
maintain the resolution data. There's no 'global' ISBN-A system since -
ultimately - there is no central database of all the ISBNs that have
been assigned. Registration is managed by 160 or so national agencies,
like Bowker in the USA, who individually retain the metadata provided by
publishers for each assigned ISBN.
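
As a small illustration of the difference between the two forms, the
sketch below builds the (non-resolvable) URN and an HTTP URI for the same
ISBN. The urn:isbn: prefix is the registered URN namespace; the HTTP base
http://isbn.org/isbn/ is only the hypothetical pattern floated in this
thread, not a service that exists, and the helper names are made up for
the example. ISBN-A is deliberately left out, since building one needs
the registrant range data.

```python
import re


def normalise_isbn(isbn: str) -> str:
    """Strip hyphens/spaces and check the string is ISBN-shaped."""
    compact = isbn.replace("-", "").replace(" ", "").upper()
    # 13 digits, or 9 digits plus a final digit or X for the older form.
    if not re.fullmatch(r"[0-9]{13}|[0-9]{9}[0-9X]", compact):
        raise ValueError(f"not an ISBN-shaped string: {isbn!r}")
    return compact


def isbn_urn(isbn: str) -> str:
    """URN form: a name, with nothing to dereference."""
    return "urn:isbn:" + normalise_isbn(isbn)


def isbn_http_uri(isbn: str, base: str = "http://isbn.org/isbn/") -> str:
    """HTTP form under an assumed base; resolvable only if someone serves it."""
    return base + normalise_isbn(isbn)


print(isbn_urn("978-0-553-47943-0"))
print(isbn_http_uri("978-0-553-47943-0"))
```

Note that this only checks the shape of the string, not the check digit.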

 

Graham

 

 

 

Graham Bell

EDItEUR

 

EDItEUR Limited is a company limited by guarantee, registered in England
no 2994705. Registered Office: United House, North Road, London N7 9DP,
UK. Website: http://www.editeur.org

 

 

 

On 21 Jan 2013, at 22:26, Young,Jeff (OR) wrote:





It seems fair to ask why ISO limited ISBNs to those types of things. Are
those reasons still valid?

 

Jeff

 

From: Graham Bell [mailto:graham@editeur.org] 
Sent: Saturday, January 19, 2013 11:16 AM
To: kcoyle@kcoyle.net
Cc: Young,Jeff (OR); public-schemabibex@w3.org; Laura Dawson
Subject: Re: Back to identifiers

 

For completeness, some text from the International ISBN Agency
website... and as you can see, not all of these are 'books'.

 

	Some examples of the types of publication that qualify for ISBN are:
	* Printed books and pamphlets
	* Individual chapters or sections of a publication if these are
	  made available separately
	* Braille publications
	* Publications that are not intended by the publisher to be
	  updated regularly or continued indefinitely
	* Individual articles or issues of a particular continuing
	  resource (but not the continuing resource in its entirety)
	* Maps
	* Educational/instructional films, videos and transparencies
	* Audiobooks on cassette, or CD, or DVD (talking books)
	* Electronic publications either on physical carriers (such as
	  machine-readable tapes, diskettes, or CD-ROMs) or on the Internet
	* Digitised copies of print monographic publications
	* Microform publications
	* Educational or instructional software
	* Mixed media publications (where the principal constituent is
	  text-based)

 

Graham

 

 

 

Graham Bell

EDItEUR

 

Tel: +44 20 7503 6418

Mob: +44 7887 754958

 


 

 

 

 

On 19 Jan 2013, at 16:08, Laura Dawson wrote:






It isn't, though. It's ISO's. We just administer it in the US and AUS.


On Jan 19, 2013, at 4:56 PM, Karen Coyle <kcoyle@kcoyle.net> wrote:





 

	 

	On 1/18/13 7:09 PM, Young,Jeff (OR) wrote:

	 

		 

		I agree with Kevin's recap and absolutely agree and encourage
		Bowker to publish (and recollect if necessary) this kind of
		resource as well. (Inconsistency aside, Schema.org has three
		different ways to encode ISBNs because they care.) My only
		quibble is to encourage Bowker to make this particular URI
		pattern "very clean" by eliminating the /books token from their
		URI. The reason is, unless I'm mistaken, some ISBNs (and
		potentially more in the future) don't identify "books".

	 

	I'm not sure what you are referring to as ISBNs that don't
	identify books. But in any case, they could be intending to do as LC
	does and interpolate a level for the type of thing being described:

	 

	http://id.loc.gov/authorities/subjects/sh2006008786.html

	http://id.loc.gov/vocabulary/relators/ill.html

	 

	I think it's best to let Bowker decide, since it's their identifier.

	 

	kc

	 

	 

		 

		I agree with the rest of Kevin's message, so it's nice to see
		some convergence!

		 

		Jeff

		 

			Generally, an ISBN will be treated like a string of characters,
			like so:

			<http://www.worldcat.org/oclc/38264520> schema:isbn "9780553479430" .

			but there may be cases where you could have something like this:

			<http://www.worldcat.org/oclc/38264520> schema:isbn "9780553479430";
			    schema:identifier _:b123;
			    schema:identifier _:b456;
			    schema:identifier _:b789 .

			_:b123 a schema:Identifier; schema:issuedBy "Harvard UL";
			    schema:name "12345678".

			_:b456 a schema:Identifier; schema:issuedBy "Yale UL";
			    schema:name "asdfghj".

			_:b789 a schema:Identifier; schema:issuedBy "Princeton UL";
			    schema:name "qwerttyu".

			 

			Now, whether we want to propose this as part of a Schema
			extension is another matter, but the issue Karen raised is real
			and present in the data.  And, if you're an institution like
			Bowker, there are two ways you can describe an identifier.

			As for the whole business about what an ISBN is or isn't or what
			it can or cannot do, well...  I can see it both ways.  An ISBN is
			an identifier.  It can identify a Book.

			 

			As for a URI, it's whatever the data says it is.

			 

			Yours, Kevin

			 

			 

			 

			 

			On 01/18/2013 06:04 PM, Corey Harper wrote:

				I see your point, Jeff, and you're definitely correct about your
				use of redirects & to-the-letter adherence to all that fun
				range-14 stuff, though I'm getting a 301 rather than a 303 (see
				below)...

				I'm just a little wary of reusing an identifier that has a
				pretty specific legacy meaning as both a thing ID and a metadata
				ID, particularly when the primary usage seems to be the former.

				I suspect that's just a discomfort that I'll get over when/if
				the legacy meanings are slowly erased from our collective
				memories... :)

				 

				Thanks, -Corey

				 

				*** 301-ing for me... ***

				curl -I http://www.worldcat.org/oclc/38264520
				HTTP/1.1 301 Moved Permanently
				Date: Fri, 18 Jan 2013 22:59:22 GMT
				Server: Apache
				Location: /title/war-and-peace/oclc/38264520

				This new location 200's w/ or without Accept headers...

				 

				 

				On Fri, Jan 18, 2013 at 4:04 PM,
Young,Jeff (OR)

				<jyoung@oclc.org>

			wrote:

				Corey,

				 

				You're not crazy. A URI is an
identifier.

				 

				There is no good reason to model identifiers as both URIs and
				non-URI text-strings now-a-days. The latter need to carry too
				much context to be effective. Nevertheless, they exist in legacy
				systems. The mechanism that's being proposed creates a bridge
				from legacy string identifiers to the URI identifiers. Only
				systems that are coupled with the legacy forms will care about
				this bridge. Whether Schema.org cares enough about the past to
				adopt such an identifier bridge is unclear. That's why Richard
				suggests tabling this discussion in favor of SKOS patterns
				(which are effectively the same).

				 

				The reason the example is weird is because you're overlooking
				the implications of Cool URIs for the Semantic Web.

				http://www.w3.org/TR/cooluris/

				The example doesn't identify OCLC metadata, it identifies a
				Book that OCLC has coined a URI for. The metadata entity has a
				different URI identifier. The 303 redirect from the former to
				the latter is merely a convenience mechanism.

				 

				Jeff

				 

				-----Original Message-----
				From: Corey Harper [mailto:corey.harper@gmail.com]
				Sent: Friday, January 18, 2013 2:42 PM
				To: kcoyle@kcoyle.net
				Cc: Young,Jeff (OR); public-schemabibex@w3.org
				Subject: Re: Back to identifiers

				 

				Karen, et al.,

				 

				How is a URI not an identifier? That's what the "I" stands for,
				right? Am I missing something here? Why would we want two
				different design patterns for actionable http identifiers &
				text-strings as identifiers?

				The kinds of additional metadata one might associate with an
				identifier (who maintains it, when it was issued, &c) seem to
				apply irrespective of whether the identifier is a URI or a
				string of text, no? I agree that the URI for the ISBN does not
				*need* to be defined. But should that prevent an agency that
				manages library identifiers from defining it? I'm not sure I
				agree that this is out of scope, as this is exactly the kind of
				metadata libraries & related organizations provide. Now, it's
				out of scope for a discussion of schema.org metadata about the
				books themselves; that I agree with.

				 

				And I also agree that it's weird that the example claims that
				the ISBN "identifies" some OCLC metadata. That seems wrong to
				me. If anything, both identifiers point, though indirectly, to
				a book.

				 

				Thanks, Corey

				 

				On Fri, Jan 18, 2013 at 2:21 PM, Karen
Coyle

				<kcoyle@kcoyle.net>

			wrote:

				No, a URI is a URI. The identifier property extension that we
				have talked about is for identifiers that are not URIs. I
				believe at one point we had something like:

				Identifier - value - source/authority

				Thus, the URI for the ISBN does not need to be defined using
				the identifier property extension. Yet the example on the
				identifier page is:

				 

	
				<http://bowker.com/identifiers/isbn/9780553479430> a schema:Identifier;
				    schema:name "9780553479430";
				    schema:inStandard "ISBN";
				    schema:issuedBy <http://viaf.org/viaf/142397918>;
				    schema:issueDate "1997";
				    schema:identifies <http://www.worldcat.org/oclc/38264520>.

				 

				Maybe I'm reading this wrong, but as long as there is a URI for
				the ISBN (and there always is because there is a defined URN
				for ISBN), then there is no need to re-describe it with the
				identifier extension. This description of the identifier I
				believe is out of scope for our work. (And looks a lot like
				ARK, which possibly had everything right but did not get
				wide-spread traction). I think we should stick to our task of
				finding a way to use identifiers that do not yet have URIs. If,
				instead, you are intending to mint URIs for those identifiers
				(issuedBy: above) then that is another case. This construct
				appears in the examples but not in the text, and I don't think
				we discussed that here. I think it would be over-reaching at
				this point in time.

				 

				But what really baffles me here is that the Bowker ISBN is
				stated as identifying a WorldCat "thing." If anything, that
				would be reversed since the ISBN is assigned to the book before
				any library data is created. I do consider the ISBN to be *the*
				book identifier in our world, and I think that perhaps our
				examples should look more like publishing examples than library
				catalog examples.

				 

				kc

				 

				 

				 

				On 1/18/13 9:52 AM, Young,Jeff (OR)
wrote:

				 

				I'm not sure I follow. The WorldCat URI is a URI, but it
				wouldn't make sense to say that its rdf:type is
				xyz:Identifier. Is that the concern? That's what I thought
				Richard was saying for a while too, but if you look at his
				examples he does keep them separate.

				 

				Jeff

				 

				-----Original Message-----
				From: Karen Coyle [mailto:kcoyle@kcoyle.net]
				Sent: Friday, January 18, 2013 12:48 PM
				To: Young,Jeff (OR)
				Subject: Re: Back to identifiers

				 

				Worldcat URI is a URI. ISBN URI is a
URI. Any problem

				there?

				 

				 

				kc

				 

				On 1/18/13 9:42 AM, Young,Jeff (OR)
wrote:

				 

				Note that a WorldCat.org URI is not a number. The Linked Data
				303 (See Other) redirect is important because the 1st URI
				identifies "the thing" and the second identifies "a description
				of the thing" (what Corey calls "a record"). Both can have the
				same legacy number in them without causing ambiguity.

				 

				Jeff

				 

				-----Original Message-----
				From: Karen Coyle [mailto:kcoyle@kcoyle.net]
				Sent: Friday, January 18, 2013 12:36 PM
				To: Wallis,Richard
				Cc: Corey Harper; public-schemabibex@w3.org
				Subject: Re: Back to identifiers

				 

				 

				 

				On 1/18/13 8:58 AM, Richard Wallis
wrote:

				 

				For practical reasons, I don't support the notion that an OCLC #
				or an LCCN are strictly identifiers for a book.

				 

				 

				Neither do I

				 

				Well, that's news to me, because when I suggested this to you,
				you came back with (and I quoted this before):

				"The ISBN is a string of characters (in ISBN scheme that Bowkers
				administer) that they have issued to represent the book - it is
				not the book.

				The WorldCat URI identifies the Book."

				 

				And in another post:

				 

				*** URIs are about providing dereferenceable identifiers for
				'things'.

				So when for instance the British Library asserts that the URI
				for a book in the BNB is sameAs in the German National library
				they are saying the books are the same, not the records they
				have.

				It is the same with WorldCat - it's not just a pile of records
				it is [becoming] a graph (to use the current label) of
				relationships between things - people, places, organisations,
				concepts, and bibliographic works.

				The URIs represent the things not the records that are being
				mined to build descriptions of those things.

				***

				 

				You might see why I have been confused.

				 

				Here's my take:

				 

				Because of how we have done things in the past, we have
				identifiers for records that describe some level of
				bibliographic item. De facto, we have also used those
				identifiers for the "things" they describe. I suspect that this
				is a common situation for anyone in data processing, and I
				suggest that we not agonize over it but live with the ambiguity.

				 

				 

				And in this ambiguous world, ISBNs, LCCNs, BNB #s, OCLC#s, all
				work reasonably well to identify a creative output. They may
				also at times represent the record. That's life.

				 

				So, back to identifiers (and I do NOT want this wrapped up in
				the discussion about SKOS because I DO NOT see SKOS:concept as
				valid for an identifier), I think our identifier proposal should
				be for identifiers that are not in URI format. full stop.

				 

				kc

				 

				-- Karen Coyle kcoyle@kcoyle.net http://kcoyle.net
				ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet

				 


	 

	--

	Karen Coyle

	kcoyle@kcoyle.net http://kcoyle.net

	ph: 1-510-540-7596

	m: 1-510-435-8234

	skype: kcoylenet
Received on Wednesday, 23 January 2013 21:28:29 UTC