- From: Karen Coyle <kcoyle@kcoyle.net>
- Date: Tue, 04 Dec 2012 17:42:24 -0800
- To: "public-schemabibex@w3.org" <public-schemabibex@w3.org>
I did check these fields against what I can find of the Moen statistics (a large study of MARC field frequency), so there may be some we can defer. Unfortunately, what I have of those stats only covers books, not, for example, serials or music, so I am making a guess here, but these fields seem to be used in less than 80% of the relevant records:

013 - Patent Control Information (R)
017 - Copyright or Legal Deposit Number (R)
024 - Other Standard Identifier (R)
025 - Overseas Acquisition Number (R)
026 - Fingerprint Identifier (R)
027 - Standard Technical Report Number (R)
031 - Musical Incipits Information (R)
035 - System Control Number (R)

I rather expected the GPO item number (074) to be higher, but it is not. However, I've lost access to the full set of stats, so I don't know its actual frequency. (Some files on the original site are giving me a 404.) I'll see if I can rectify this.

kc

On 12/4/12 11:45 AM, Karen Coyle wrote:
> It kind of depends on what you consider a bibliographic identifier. So
> maybe our first step should be to define that.
>
> Here are the ones that I find in the MARC21 format:
>
> 010 - Library of Congress Control Number (NR)
> 013 - Patent Control Information (R)
> 015 - National Bibliography Number (R)
> 016 - National Bibliographic Agency Control Number (R)
> 017 - Copyright or Legal Deposit Number (R)
> 020 - International Standard Book Number (R)
> 022 - International Standard Serial Number (R)
> 024 - Other Standard Identifier (R)
> 025 - Overseas Acquisition Number (R)
> 026 - Fingerprint Identifier (R)
> 027 - Standard Technical Report Number (R)
> 028 - Publisher Number (R)
> 030 - CODEN Designation (R)
> 031 - Musical Incipits Information (R)
> 032 - Postal Registration Number (R)
> 035 - System Control Number (R)
> ?036 - Original Study Number for Computer Data Files (NR)
> 074 - GPO Item Number (R)
>
> I think this is all of them.... Then we go on to the classification codes:
>
> 050 - Library of Congress Call Number (R)
> 052 - Geographic Classification (R)
> 055 - Classification Numbers Assigned in Canada (R)
> 060 - National Library of Medicine Call Number (R)
> 070 - National Agricultural Library Call Number (R)
> ?072 - Subject Category Code (R)
>
> And that doesn't cover thesauri. However, we may want to ignore any
> thesauri that cannot provide URIs?
>
> kc
>
> On 12/4/12 11:28 AM, Ross Singer wrote:
>>
>> On Dec 4, 2012, at 2:23 PM, Ed Summers <ehs@pobox.com> wrote:
>>
>>> Call me naive, but I contend that most bibliographic identifiers are
>>> expressible as URIs (URNs, info-uris, URLs) and that as such they can
>>> use microdata's itemid [1]. Is there really a problem here?
>>
>> +1
>>
>> I was hoping to suggest something along these lines, but had lacked the
>> cycles to actually do the research to back it up.
>>
>> -Ross.
>>>
>>> //Ed
>>>
>>> [1]
>>> http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#global-identifiers-for-items
>>>
>>> On Tue, Dec 4, 2012 at 9:00 AM, Karen Coyle <kcoyle@kcoyle.net> wrote:
>>>
>>> On 12/4/12 5:01 AM, Shlomo Sanders wrote:
>>>
>>> For what it is worth, I prefer:
>>>
>>> ISBN-10<span property="identifier" typeof="ISBN">0316769487</span>
>>>
>>> I don't think this is correct -- unless you have a property that
>>> is "ISBN". The "typeof" takes a property, not a value.
>>>
>>> Any values have to be outside of the <> unless you use a meta tag.
>>> see:
>>> http://schema.org/docs/gs.html#advanced_missing
>>>
>>> Maybe that's how we'll have to go - with meta.
>>>
>>> kc
>>>
>>> Or
>>> ISBN-10: <span itemprop="isbn">0316769487</span>
>>>
>>> These are short and clean.
>>> The itemprop="isbn" is not generic since the valid values for
>>> itemprop are enumerated?
>>> Is that the same issue for typeof?
>>>
>>> -----Original Message-----
>>> From: Karen Coyle [mailto:kcoyle@kcoyle.net]
>>> Sent: Tuesday, December 04, 2012 14:58
>>> To: public-schemabibex@w3.org
>>> Subject: Re: Missing Schema.Org properties
>>>
>>> Do we need to consider how this might be displayed, since
>>> schema.org generally wraps around a display?
>>> These two options would result in different displays:
>>>
>>> On 12/4/12 3:33 AM, Shlomo Sanders wrote:
>>>
>>> How is this as a schema.org "friendly" version of the ONIX structure:
>>>
>>> <div typeof="identifier">
>>>   <span property="identifierValue">0316769487</span>
>>>   <span property="identifierType">ISBN</span>
>>> </div>
>>>
>>> 0316769487 ISBN
>>>
>>> Seems too long to me, perhaps:
>>> <span property="identifier" typeof="ISBN">0316769487</span>
>>>
>>> 0316769487
>>>
>>> The schema.org documentation shows a similar example to this latter
>>> approach using price:
>>>
>>> Price: <span itemprop="price">$6.99</span>
>>> <meta itemprop="priceCurrency" content="USD" />
>>>
>>> This gets the "$6.99" display for the human reader, plus the
>>> currency type for processing.
>>>
>>> The current use of ISBN is illustrated as:
>>>
>>> ISBN-10: <span itemprop="isbn">0316769487</span>
>>>
>>> If we go with id type and value, then display is limited by
>>> the defined types, unless we leave type very loose. To get the
>>> same display as the ISBN immediately above, we'd need:
>>>
>>> <div itemprop="identifier"
>>>      itemscope itemtype="http://schema.org/Identifier">
>>>   <span itemprop="idType">ISBN-10: </span>
>>>   <span itemprop="idValue">0316769487</span>
>>> </div>
>>>
>>> Does identifier type do what we want if it's not a controlled
>>> value? Or would we need a <meta> with a controlled value?
>>>
>>> kc
>>>
>>> -----Original Message-----
>>> From: Karen Coyle [mailto:kcoyle@kcoyle.net]
>>> Sent: Monday, December 03, 2012 20:28
>>> To: Graham Bell
>>> Cc: public-schemabibex@w3.org
>>> Subject: Re: Missing Schema.Org properties
>>>
>>> I do, however, see a significant difference between
>>> schema.org and the XML structure of ONIX (or any other
>>> XML-based metadata): schema.org allows the data to be
>>> flattened to a single horizon of data. This is for the sake of
>>> simplicity, if I understand correctly. There seems to be a
>>> philosophy in schema.org that avoids a strict division of
>>> descriptions into "right" and "wrong." XML, instead, is really
>>> an enforcement mechanism.
>>>
>>> I'm leery of adding much structure to schema.org. Or at least,
>>> of either requiring it or relying on it. That makes the
>>> identifier "problem" particularly difficult. It is for this
>>> reason that I asked, in response to Shlomo's post, whether one
>>> can make use of the self-identifying nature of URIs. That
>>> doesn't help us with non-URI identifiers, but it seems that we
>>> are moving increasingly in the direction of "fully formed"
>>> identifiers.
>>>
>>> kc
>>>
>>> On 12/3/12 8:41 AM, Graham Bell wrote:
>>>
>>> Worth saying at this point that this is EXACTLY how
>>> ONIX is structured:
>>>
>>> <entityIdentifier>
>>>   <entityIDType>
>>>   <IDTypeName>
>>>   <IDValue>
>>> </entityIdentifier>
>>>
>>> where 'entity' might be 'product', 'work', 'name', or
>>> whatever. There is a controlled vocabulary for common IDTypes,
>>> and if you have some proprietary identifier not in the list,
>>> you must include a 'likely to be unique' name for it in
>>> <IDTypeName> instead.
>>>
>>> A point of history -- ONIX started (in 1999) with a property per
>>> identifier type: there were tags called <ISBN> and <UPC>, but as
>>> pointed out below, that isn't really practical, so the above XML
>>> structure is used extensively now. It's easy to add to the controlled
>>> vocabulary when a new identifier comes along, without having to
>>> change the schema. In UML, it looks like the attached, and I leave
>>> the RDF as an exercise for the reader...
>>>
>>> Graham
>>>
>>> Graham Bell
>>> EDItEUR
>>>
>>> Tel: +44 20 7503 6418
>>> Mob: +44 7887 754958
>>>
>>> EDItEUR Limited is a company limited by guarantee, registered in
>>> England no 2994705. Registered Office: United House, North Road,
>>> London N7 9DP, UK. Website: http://www.editeur.org
>>>
>>> On 3 Dec 2012, at 16:18, Laura Dawson wrote:
>>>
>>> That might work, actually.
>>>
>>> Sent from my iPhone
>>>
>>> On Dec 3, 2012, at 4:05 PM, Karen Coyle <kcoyle@kcoyle.net> wrote:
>>>
>>> On 12/3/12 7:19 AM, Richard Wallis wrote:
>>>
>>> Hi Shlomo,
>>>
>>> Couple of points.
>>>
>>> *Identifiers:* This is a particular concern of mine.
>>>
>>> Me, too!
>>>
>>> The approach of having a named property for each possible
>>> identifier that a CreativeWork or a Person could have just
>>> does not scale. However, to handle this you will always be
>>> disenfranchising some identifier backing group. ISBN seems to
>>> have got in because it is known by everyone; oclcnum is obvious
>>> from where I sit (but that does not make it right).
>>> I think we (in all of Schema, not just the bib domain) need
>>> an identifier Type with properties of 'identifierValue' and
>>> 'identifierType' - which could handle either an enumerated
>>> list or at least well known identifier names.
>>>
>>> I believe that this means that "Identifier" becomes a "schema" in
>>> schema.org.
>>>
>>> kc
>>>
>>> ~Richard.
>>>
>>> --
>>> Karen Coyle
>>> kcoyle@kcoyle.net http://kcoyle.net
>>> ph: 1-510-540-7596
>>> m: 1-510-435-8234
>>> skype: kcoylenet
>>>
>>
>

--
Karen Coyle
kcoyle@kcoyle.net http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet
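
To illustrate Ed's suggestion quoted above, that most bibliographic identifiers can be expressed as URIs and carried in microdata's itemid, here is a minimal Python sketch. The prefix table, the function names identifier_to_uri and book_markup, and the choice of http://schema.org/Book as the item type are assumptions made for illustration, not anything proposed on the list. The urn:isbn:, urn:issn:, info:lccn/ and info:oclcnum/ forms are, as far as I know, registered namespaces, but real identifiers may need normalization that this sketch skips.

```python
from html import escape

# Hypothetical table: identifier type name -> URI prefix.
# Only types with a known URI form are listed.
URI_PREFIXES = {
    "ISBN": "urn:isbn:",
    "ISSN": "urn:issn:",
    "LCCN": "info:lccn/",
    "OCLC": "info:oclcnum/",
}


def identifier_to_uri(id_type, id_value):
    """Return a URI for a (type, value) pair, or None if the type has no
    entry in the table. Only surrounding whitespace is removed."""
    prefix = URI_PREFIXES.get(id_type.strip().upper())
    if prefix is None:
        return None
    return prefix + id_value.strip()


def book_markup(isbn):
    """Render a microdata snippet that keeps the human-readable ISBN display
    and also exposes the identifier as a URI through itemid."""
    uri = identifier_to_uri("ISBN", isbn)
    return (
        f'<div itemscope itemtype="http://schema.org/Book" itemid="{escape(uri)}">'
        f'ISBN-10: <span itemprop="isbn">{escape(isbn)}</span>'
        "</div>"
    )


if __name__ == "__main__":
    print(identifier_to_uri("ISBN", "0316769487"))
    print(book_markup("0316769487"))
```

Any identifier type missing from the table comes back as None, which is the case where a type/value pair of the kind discussed in the thread would still be needed.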
Received on Wednesday, 5 December 2012 01:42:51 UTC