- From: Young,Jeff (OR) <jyoung@oclc.org>
- Date: Wed, 5 Dec 2012 10:31:16 -0500
- To: "Ed Summers" <ehs@pobox.com>, "Karen Coyle" <kcoyle@kcoyle.net>
- Cc: <public-schemabibex@w3.org>
If a thing has multiple URIs, then it would make sense to use owl:sameAs to tie those together, even in Microdata.

Jeff

> -----Original Message-----
> From: ed.summers@gmail.com [mailto:ed.summers@gmail.com] On Behalf Of Ed Summers
> Sent: Wednesday, December 05, 2012 9:59 AM
> To: Karen Coyle
> Cc: public-schemabibex@w3.org
> Subject: Re: Missing Schema.Org properties
>
> Offlist Alf Eaton kindly reminded me that the HTML Microdata spec does
> not appear to allow you to encode multiple identifiers for a given item
> using itemid. I don't think I'm going out on a limb here by saying that
> this is problematic, for example in use cases like ScholarlyArticle
> where it would be useful to encode a PubMedID and a DOI.
>
> So I emailed the WHATWG mailing list to make sure that this is actually
> the case, and to propose that the Microdata spec allow for it [1]. As
> you can see from Ian Hickson's response, itemid doesn't allow for
> multiple identifiers by design. He also had some suggestions for
> workarounds using meta and link with a generic 'id' itemprop [2].
>
> So I think this leaves us with two options:
>
> 1) document itemprops in Book, ScholarlyArticle, etc. for all the
>    identifier types that we think are relevant for the bibliographic
>    universe: doi, oclcnum, pmid, etc.
> 2) document a pattern for expressing identifiers of different types:
>    using meta, link (as Ian suggested) or some other mechanism.
>
> I'm not sure I have a preference at this point, but I just wanted to
> point out that relying entirely on itemid for expressing identifiers is
> not going to work. Perhaps it would be useful to document some of the
> design choices on the wiki for further discussion?
>
> //Ed
>
> [1] http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-December/038256.html
> [2] http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-December/038257.html
>
> PS.
> Sorry for sending this to you twice Karen :-)
>
> On Tue, Dec 4, 2012 at 8:42 PM, Karen Coyle <kcoyle@kcoyle.net> wrote:
> > I did check these fields on what I can find of the Moen statistics (a
> > large study of MARC field frequency), so there may be some we can defer.
> > Unfortunately, what I have of those stats only covers books, not, for
> > example, serials or music, so I am making a guess here, but these
> > fields seem to be used in less than 80% of the relevant records:
> >
> > 013 - Patent Control Information (R)
> > 017 - Copyright or Legal Deposit Number (R)
> > 024 - Other Standard Identifier (R)
> > 025 - Overseas Acquisition Number (R)
> > 026 - Fingerprint Identifier (R)
> > 027 - Standard Technical Report Number (R)
> > 031 - Musical Incipits Information (R)
> > 035 - System Control Number (R)
> >
> > I rather expected the GPO item number (074) to be higher, but it is not.
> > However, I've lost access to the full set of stats so I don't know its
> > actual frequency. (Some files on the original site are giving me
> > 404.) I'll see if I can rectify this.
> >
> > kc
> >
> > On 12/4/12 11:45 AM, Karen Coyle wrote:
> >>
> >> It kind of depends on what you consider a bibliographic identifier.
> >> So maybe our first step should be to define that.
> >>
> >> Here are the ones that I find in the MARC21 format:
> >>
> >> 010 - Library of Congress Control Number (NR)
> >> 013 - Patent Control Information (R)
> >> 015 - National Bibliography Number (R)
> >> 016 - National Bibliographic Agency Control Number (R)
> >> 017 - Copyright or Legal Deposit Number (R)
> >> 020 - International Standard Book Number (R)
> >> 022 - International Standard Serial Number (R)
> >> 024 - Other Standard Identifier (R)
> >> 025 - Overseas Acquisition Number (R)
> >> 026 - Fingerprint Identifier (R)
> >> 027 - Standard Technical Report Number (R)
> >> 028 - Publisher Number (R)
> >> 030 - CODEN Designation (R)
> >> 031 - Musical Incipits Information (R)
> >> 032 - Postal Registration Number (R)
> >> 035 - System Control Number (R)
> >> ?036 - Original Study Number for Computer Data Files (NR)
> >> 074 - GPO Item Number (R)
> >>
> >> I think this is all of them.... Then we go on to the classification codes:
> >>
> >> 050 - Library of Congress Call Number (R)
> >> 052 - Geographic Classification (R)
> >> 055 - Classification Numbers Assigned in Canada (R)
> >> 060 - National Library of Medicine Call Number (R)
> >> 070 - National Agricultural Library Call Number (R)
> >> ?072 - Subject Category Code (R)
> >>
> >> And that doesn't cover thesauri. However, we may want to ignore any
> >> thesauri that cannot provide URIs?
> >>
> >> kc
> >>
> >> On 12/4/12 11:28 AM, Ross Singer wrote:
> >>>
> >>> On Dec 4, 2012, at 2:23 PM, Ed Summers <ehs@pobox.com> wrote:
> >>>
> >>>> Call me naive, but I contend that most bibliographic identifiers
> >>>> are expressible as URIs (URNs, info-uris, URLs) and that as such
> >>>> they can use microdata's itemid [1]. Is there really a problem here?
> >>>
> >>> +1
> >>>
> >>> I was hoping to suggest something along these lines, but had lacked
> >>> the cycles to actually do the research to back it up.
> >>>
> >>> -Ross.
> >>>>
> >>>> //Ed
> >>>>
> >>>> [1] http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#global-identifiers-for-items
> >>>>
> >>>> On Tue, Dec 4, 2012 at 9:00 AM, Karen Coyle <kcoyle@kcoyle.net> wrote:
> >>>>
> >>>> On 12/4/12 5:01 AM, Shlomo Sanders wrote:
> >>>>
> >>>> For what it is worth, I prefer:
> >>>>
> >>>> ISBN-10<span property="identifier" typeof="ISBN">0316769487</span>
> >>>>
> >>>> I don't think this is correct -- unless you have a property that
> >>>> is "ISBN". The "typeof" takes a property, not a value.
> >>>>
> >>>> Any values have to be outside of the <> unless you use a meta tag.
> >>>> see:
> >>>> http://schema.org/docs/gs.html#advanced_missing
> >>>>
> >>>> Maybe that's how we'll have to go - with meta.
> >>>>
> >>>> kc
> >>>>
> >>>> Or
> >>>> ISBN-10: <span itemprop="isbn">0316769487</span>
> >>>>
> >>>> These are short and clean.
> >>>> The itemprop="isbn" is not generic since the valid values for
> >>>> itemprop are enumerated?
> >>>> Is that the same issue for typeof?
> >>>>
> >>>> -----Original Message-----
> >>>> From: Karen Coyle [mailto:kcoyle@kcoyle.net]
> >>>> Sent: Tuesday, December 04, 2012 14:58
> >>>> To: public-schemabibex@w3.org
> >>>> Subject: Re: Missing Schema.Org properties
> >>>>
> >>>> Do we need to consider how this might be displayed, since
> >>>> schema.org generally wraps around a display? These two options
> >>>> would result in different displays:
> >>>>
> >>>> On 12/4/12 3:33 AM, Shlomo Sanders wrote:
> >>>>
> >>>> How is this as a schema.org "friendly" version of the ONIX structure:
> >>>>
> >>>> <div typeof="identifier">
> >>>>   <span property="identifierValue">0316769487</span>
> >>>>   <span property="identifierType">ISBN</span>
> >>>> </div>
> >>>>
> >>>> 0316769487 ISBN
> >>>>
> >>>> Seems too long to me, perhaps:
> >>>> <span property="identifier" typeof="ISBN">0316769487</span>
> >>>>
> >>>> 0316769487
> >>>>
> >>>> The schema.org documentation shows a similar example to this latter
> >>>> approach using price:
> >>>>
> >>>> Price: <span itemprop="price">$6.99</span>
> >>>> <meta itemprop="priceCurrency" content="USD" />
> >>>>
> >>>> This gets the "$6.99" display for the human reader, plus the
> >>>> currency type for processing.
> >>>>
> >>>> The current use of ISBN is illustrated as:
> >>>>
> >>>> ISBN-10: <span itemprop="isbn">0316769487</span>
> >>>>
> >>>> If we go with id type and value, then display is limited by
> >>>> the defined types, unless we leave type very loose.
> >>>> To get the same display as the ISBN immediately above, we'd need:
> >>>>
> >>>> <div itemprop="identifier" itemscope itemtype="http://schema.org/Identifier">
> >>>>   <span itemprop="idType">ISBN-10: </span>
> >>>>   <span itemprop="idValue">0316769487</span>
> >>>> </div>
> >>>>
> >>>> Does identifier type do what we want if it's not a controlled
> >>>> value? Or would we need a <meta> with a controlled value?
> >>>>
> >>>> kc
> >>>>
> >>>> -----Original Message-----
> >>>> From: Karen Coyle [mailto:kcoyle@kcoyle.net]
> >>>> Sent: Monday, December 03, 2012 20:28
> >>>> To: Graham Bell
> >>>> Cc: public-schemabibex@w3.org
> >>>> Subject: Re: Missing Schema.Org properties
> >>>>
> >>>> I do, however, see a significant difference between schema.org and
> >>>> the XML structure of ONIX (or any other XML-based metadata):
> >>>> schema.org allows the data to be flattened to a single horizon of
> >>>> data. This is for the sake of simplicity, if I understand correctly.
> >>>> There seems to be a philosophy in schema.org that avoids a strict
> >>>> division of descriptions into "right" and "wrong." XML, instead, is
> >>>> really an enforcement mechanism.
> >>>>
> >>>> I'm leery of adding much structure to schema.org. Or at least, of
> >>>> either requiring it or relying on it. That makes the identifier
> >>>> "problem" particularly difficult. It is for this reason that I
> >>>> asked, in response to Shlomo's post, whether one can make use of
> >>>> the self-identifying nature of URIs. That doesn't help us with
> >>>> non-URI identifiers, but it seems that we are moving increasingly
> >>>> in the direction of "fully formed" identifiers.
> >>>>
> >>>> kc
> >>>>
> >>>> On 12/3/12 8:41 AM, Graham Bell wrote:
> >>>>
> >>>> Worth saying at this point that this is EXACTLY how ONIX is structured:
> >>>>
> >>>> <entityIdentifier>
> >>>>   <entityIDType>
> >>>>   <IDTypeName>
> >>>>   <IDValue>
> >>>> </entityIdentifier>
> >>>>
> >>>> where 'entity' might be 'product', 'work', 'name', or whatever. There
> >>>> is a controlled vocabulary for common IDTypes, and if you have some
> >>>> proprietary identifier not in the list, you must include a 'likely to
> >>>> be unique' name for it in <IDTypeName> instead.
> >>>>
> >>>> A point of history -- ONIX started (in 1999) with a property per
> >>>> identifier type: there were tags called <ISBN> and <UPC>, but as
> >>>> pointed out below, that isn't really practical, so the above XML
> >>>> structure is used extensively now. It's easy to add to the controlled
> >>>> vocabulary when a new identifier comes along, without having to
> >>>> change the schema. In UML, it looks like the attached, and I leave
> >>>> the RDF as an exercise for the reader...
> >>>>
> >>>> Graham
> >>>>
> >>>> Graham Bell
> >>>> EDItEUR
> >>>>
> >>>> Tel: +44 20 7503 6418
> >>>> Mob: +44 7887 754958
> >>>>
> >>>> EDItEUR Limited is a company limited by guarantee, registered in
> >>>> England no 2994705. Registered Office: United House, North Road,
> >>>> London N7 9DP, UK. Website: http://www.editeur.org
> >>>>
> >>>> On 3 Dec 2012, at 16:18, Laura Dawson wrote:
> >>>>
> >>>> That might work, actually.
> >>>>
> >>>> Sent from my iPhone
> >>>>
> >>>> On Dec 3, 2012, at 4:05 PM, Karen Coyle <kcoyle@kcoyle.net> wrote:
> >>>>
> >>>> On 12/3/12 7:19 AM, Richard Wallis wrote:
> >>>>
> >>>> Hi Shlomo,
> >>>>
> >>>> Couple of points.
> >>>>
> >>>> *Identifiers:* This is a particular concern of mine.
> >>>>
> >>>> Me, too!
> >>>>
> >>>> The approach of having a named property for each possible
> >>>> identifier that a CreativeWork or a Person could have, just does
> >>>> not scale. However to handle this you will always be
> >>>> disenfranchising some identifier backing group. ISBN seems to have
> >>>> got in because it is known by everyone; oclcnum is obvious from
> >>>> where I sit (but that does not make it right). I think we (in all
> >>>> of Schema, not just the bib domain) need an identifier Type with
> >>>> properties of 'identifierValue' and 'identifierType' - which could
> >>>> handle either an enumerated list or at least well known identifier
> >>>> names.
> >>>>
> >>>> I believe that this means that "Identifier" becomes a "schema" in
> >>>> schema.org.
> >>>>
> >>>> kc
> >>>>
> >>>> ~Richard.
> >>>>
> >>>> --
> >>>> Karen Coyle
> >>>> kcoyle@kcoyle.net http://kcoyle.net
> >>>> ph: 1-510-540-7596
> >>>> m: 1-510-435-8234
> >>>> skype: kcoylenet
> >>>>
> >>>> --
> >>>> Karen Coyle
> >>>> kcoyle@kcoyle.net http://kcoyle.net
> >>>> ph: 1-510-540-7596
> >>>> m: 1-510-435-8234
> >>>> skype: kcoylenet
> >>>>
> >>>> --
> >>>> Karen Coyle
> >>>> kcoyle@kcoyle.net http://kcoyle.net
> >>>> ph: 1-510-540-7596
> >>>> m: 1-510-435-8234
> >>>> skype: kcoylenet
> >>>>
> >>>
> >>
> >
> > --
> > Karen Coyle
> > kcoyle@kcoyle.net http://kcoyle.net
> > ph: 1-510-540-7596
> > m: 1-510-435-8234
> > skype: kcoylenet
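
For illustration only, here is a minimal sketch of what the owl:sameAs suggestion at the top of this message might look like in Microdata. It is not taken from the thread or from the schema.org documentation. It assumes that a consumer accepts itemid on a schema.org item and understands a full-URL itemprop from another vocabulary. The title, the DOI-style URL and the info:pmid value are all made-up placeholders.

```html
<!-- Hypothetical article: the title and both identifiers are placeholders. -->
<div itemscope itemtype="http://schema.org/ScholarlyArticle"
     itemid="http://dx.doi.org/10.5555/placeholder">
  <span itemprop="name">Placeholder article title</span>
  <!-- itemid carries one URI; a second URI for the same thing is attached
       as a property whose name is the full owl:sameAs URL. -->
  <link itemprop="http://www.w3.org/2002/07/owl#sameAs"
        href="info:pmid/00000000">
</div>
```

Microdata allows a property name to be an absolute URL, which is what lets owl:sameAs appear next to schema.org terms here. Whether any given consumer acts on it is a separate question, and identifiers that cannot be written as URIs would still need one of the two options Ed lists.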
Received on Wednesday, 5 December 2012 15:32:45 UTC