- From: Richard Wallis <richard.wallis@dataliberate.com>
- Date: Thu, 13 Aug 2015 17:25:55 +0100
- To: "public-schemabibex@w3.org" <public-schemabibex@w3.org>
- Message-ID: <CAD47Kz56U29nEGAspV8U0J1QCk+CV5V815um0mT9MqFPtbySNQ@mail.gmail.com>
So, the majority consensus (if that is not an oxymoron) appears to be that we will not push forward the proposal for an Agent type, as a super-type for Person and Organization, in the bib.schema.org extension.

The recommendation is that, when publishing an agent (i.e. a Person or an Organization, where it is not clear which), schema:Thing should be used. If the data publisher wishes to indicate that the thing is identified within their data as an agent, they should use http://bibliograph.net/Agent or http://purl.org/dc/terms/Agent in addition to schema:Thing.

If people are happy with that position, I will forward it as a response to the discussion in the main Schema.org group.

~Richard.

Richard Wallis
Founder, Data Liberate
http://dataliberate.com
Linkedin: http://www.linkedin.com/in/richardwallis
Twitter: @rjw

On 11 August 2015 at 12:43, Heuvelmann, Reinhold <R.Heuvelmann@dnb.de> wrote:

> Some background to MARC field 720 [0]: It was defined almost 20 years ago; there is a first discussion paper at [1], a second, narrower one at [2], and the proposal at [3].
>
> Minutes of the MARBI meetings are at [4] and [5].
>
> MARC Usage (in WorldCat) statistics are at [6].
>
> Best wishes
>
> Reinhold
>
> [0]
> http://www.loc.gov/marc/bibliographic/bd720.html
>
> [1]
> “Mapping the Dublin Core Metadata Elements to USMARC”
> http://www.loc.gov/marc/marbi/dp/dp86.html
>
> [2]
> “Defining a Generic Author Field in USMARC”
> http://www.loc.gov/marc/marbi/dp/dp88.html
>
> [3]
> “Define a Generic Author Field in the Bibliographic, Authority, Classification, and Community Information Formats”
> (Date given is “December 1, 1996”, correct date is December 1, 1995.)
> http://www.loc.gov/marc/marbi/1996/96-02.html
>
> [4] http://www.loc.gov/marc/marbi/minutes/an-95.html
>
> [5] http://www.loc.gov/marc/marbi/1996/96-02.html
>
> [6] http://experimental.worldcat.org/marcusage/720.html
>
> *From:* Young,Jeff (OR) [mailto:jyoung@oclc.org]
> *Sent:* Monday, 10 August 2015 21:28
> *To:* corey.harper@nyu.edu; Dan Scott
> *Cc:* LeVan,Ralph; Richard Wallis; public-schemabibex@w3.org
> *Subject:* RE: The Agent proposal in bib.schema.org is controversial
>
> MARC 720 is the main culprit. The analysis Richard referred to is below:
>
> Jeff
>
> ---
>
> Many of the Agents coming out of the transform process are produced from the MARC 720 field where the 1st indicator is blank (“Not specified”) or 2 (“Other”):
>
> http://www.loc.gov/marc/bibliographic/bd720.html
>
> Note that the MARC 720 field is used when crosswalking Dublin Core to MARC:
>
> http://www.loc.gov/marc/dccross.html
>
> Here’s a quote:
>
> “Note: there is no way to specify whether the Contributor is a person or organization because it is not in the Dublin Core data. If it can reasonably be determined that the contributor is a person or organization, fields 700 1#$a (Added Entry--Personal Name) or 710 2#$a (Added Entry--Corporate Name) may be used.”
>
> In other words, anyone who wants to upgrade their Dublin Core data to Schema.org will have this same problem when trying to map dc:creator, dc:contributor, dct:mediator, and dct:rightsHolder, not to mention dc:subject.
>
> In some cases, our transformer is able to sniff the name’s structure to choose Person or Organization. (BTW, there is still room for some improvement there.) Without such clues, though, the process assigns bgn:Agent as a default, based on the assumption that reconciliation with typed entities will be done downstream.
>
> I can give plenty of examples, but here’s one to illustrate:
>
> (The highlighted rdf:value statements are used for debugging and contain the source data used in the mapping.)
>
> <http://www.worldcat.org/oclc/881301071> # "我的動物小百科"
>     schema:contributor <http://experiment.worldcat.org/entity/work/data/1913968092#Agent/ke_xue_guan> ; # "科學館"
>     .
>
> <http://experiment.worldcat.org/entity/work/data/1913968092#Agent/ke_xue_guan> # "科學館"
>     a bgn:Agent ;
>     rdfs:label "Ke xue guan" ;
>     schema:name "科學館"@zh ;
>     rdf:value "<v720 altscript=\".//v880[16]\" i1=\" \" i2=\" \"><s6><d>880-16</d></s6><sa><d>Ke xue guan.</d></sa></v720>"^^rdf:XMLLiteral ; # idiomatic diagnostic
>     rdf:value "<v880 i1=\" \" i2=\" \" lnkfrom=\"720\" script=\"$1\" xlink=\".//v720[1]\"><s6><d>720-16/$1</d></s6><sa><d>科學館</d></sa></v880>"^^rdf:XMLLiteral ; # idiomatic diagnostic
>     .
>
> Jeff
>
> *From:* Corey A Harper [mailto:corey.harper@nyu.edu]
> *Sent:* Monday, August 10, 2015 3:11 PM
> *To:* Dan Scott
> *Cc:* Young,Jeff (OR); LeVan,Ralph; Richard Wallis; public-schemabibex@w3.org
> *Subject:* Re: The Agent proposal in bib.schema.org is controversial
>
> Dear all,
>
> My $0.02: I also think that schema:Thing is the best option at this time, and don't think we should push too much on Agent given what I consider relatively limited usefulness. I understand Jeff's point about the dangers of "not sorting these out", but I also think that we can store and manage data with whatever specificity we want, and I'm not sure those dangers apply to data as published downstream to consumers on the Web.
>
> I'm also _very_ interested in knowing more about the 70 million+ "mystery agents" Richard and Jeff have been referencing. Are these just 1xx and 7xx data points that are type unknown because they haven't matched a known entity with a known type? Can't we at least infer more about their type from their MARC field? Can we see some example instance (bib) data where these show up?
>
> Best,
> -Corey
>
> On Mon, Aug 10, 2015 at 1:55 PM, Dan Scott <denials@gmail.com> wrote:
>
> FWIW, the Bibliographic Ontology (bibo) also uses foaf:Agent.
>
> But I concur with the developing dissenting opinion on the GitHub issue that, if we have nothing specific to say about the nature of the entity because we lack the information, it's better to simply avoid the compromise of Agent. We might make ourselves feel a bit better about the dismal state of our bibliographic data through an abstract class like Agent, but in the end it doesn't really add any data to the data we're trying to express.
>
> Using schema:Thing seems like an acceptable fallback in the meantime, and allows the data expressed by the target links to be refined to either Person or Organization at some point in the future when the effort occurs.
>
> On Mon, 10 Aug 2015 at 11:29 Young,Jeff (OR) <jyoung@oclc.org> wrote:
>
> I made an argument that the problem is broader than bib records:
>
> https://github.com/schemaorg/schemaorg/issues/700#issuecomment-129078302
>
> Limiting to our situation, though, Richard cites the count from WorldCat at 72 million “agents” (people and organizations excluded):
>
> https://github.com/schemaorg/schemaorg/issues/700#issuecomment-129227478
>
> These all have Linked Data identifiers, but they are only mechanized placeholders in need of exposure, reconciliation, and enrichment.
>
> The danger of not sorting these out is that naïve automated “entity matching” processes resort to string matching on name as an “else condition” and the resulting mix-up manifests itself in the Linked Data.
>
> I suggested Google Custom Search as a possible tool to help with discovery and possibly lead to an interface where they could be reconciled:
>
> https://github.com/schemaorg/schemaorg/issues/700#issuecomment-129239474
>
> Jeff
>
> *From:* LeVan,Ralph
> *Sent:* Monday, August 10, 2015 10:33 AM
> *To:* Young,Jeff (OR); Richard Wallis; public-schemabibex@w3.org
> *Subject:* RE: The Agent proposal in bib.schema.org is controversial
>
> One of the arguments against Agent was that if you didn’t know what kind of object a thing was, then you just shouldn’t say. All the properties of Agent seem to come from Thing. I’d propose that we just use Thing.
>
> My guess is that the need for Agent comes mostly from our need to convert existing bib records into RDF and some of our crappy old bib records don’t reliably distinguish the type of agent involved. Rather than be caught out in a lie about whether the agent is a Person or Organization, we’d rather say less. This is a problem peculiar to our situation and not a broad problem of the internet community. It’s also a short-term problem. Selling ‘Agent’ to a community that doesn’t need it is going to be an uphill battle.
>
> What’s wrong with dropping all the way back to Thing when we don’t know the type of the agent?
>
> Ralph
>
> *From:* Young,Jeff (OR) [mailto:jyoung@oclc.org]
> *Sent:* Monday, August 10, 2015 10:04 AM
> *To:* Richard Wallis; public-schemabibex@w3.org
> *Subject:* RE: The Agent proposal in bib.schema.org is controversial
>
> One option would be for us to use foaf:Agent. Presumably search engines would ignore it, but that’s their prerogative.
>
> Another option would be to preserve http://bibliograph.net/Agent, with a comment that it wasn’t accepted by the broader community, but remains useful in our limited domain. (Terms that have been adopted should be deprecated.)
>
> Jeff
>
> *From:* Richard Wallis [mailto:richard.wallis@dataliberate.com]
> *Sent:* Monday, August 10, 2015 8:18 AM
> *To:* public-schemabibex@w3.org
> *Subject:* The Agent proposal in bib.schema.org is controversial
>
> You may have noticed, if you followed the recent announcement of Schema.org v2.1 <https://lists.w3.org/Archives/Public/public-schemabibex/2015Aug/0000.html>, which includes bib.schema.org, that one of our proposals did not make it in. That proposal was the Agent type that we proposed as a super-type for Person and Organization.
>
> Agent has been a theme of discussion in the community since well before we approached the issue. You can follow the recent debate in the related schemaorg GitHub issue comment trail: https://github.com/schemaorg/schemaorg/issues/700
>
> In the bibliographic world Agent is a well understood, some would say obvious, approach. When applied to the wider domains that Schema.org embraces, however, it raises many concerns and issues, especially because, as proposed, it would introduce a new direct sub-type of Thing with ramifications that could cascade across many areas of the vocabulary.
>
> In my personal opinion the gap between the two opposing views on this is significant, and the best way forward would be to consider possible pragmatic approaches to how we represent our data in Schema.org without losing the ability to describe our resources effectively to the wider world.
>
> In simple terms, if we identify an author, creator, publisher, or even copyright holder as a Person or an Organization there is not a problem.
> The difficulty occurs when we know from the relationships in the data that they are either a Person or an Organization but cannot identify which.
>
> One suggested way forward for such a circumstance would be to define them as a schema:Thing. To me this feels a little too vague. A follow-on option was to suggest a 'personOrOrganization' boolean property to indicate this circumstance. This is a little more appealing, but I think it still needs some work.
>
> What are others' thoughts on this?
>
> Do we believe that the proposed Agent type is the *only* way forward? Are there potential pragmatic options like the one I describe above that we could shape, that would be acceptable? Is this requirement to specifically describe agents too detailed, and something we can pass over and move on to other things?
>
> ~Richard.
>
> Richard Wallis
> Founder, Data Liberate
> http://dataliberate.com
> Linkedin: http://www.linkedin.com/in/richardwallis
> Twitter: @rjw
>
> --
> Corey A Harper
> Metadata Services Librarian
> New York University Libraries
> 20 Cooper Square, 3rd Floor
> New York, NY 10003-7112
> 212.998.2479
> corey.harper@nyu.edu
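
A minimal Python sketch of the fallback typing discussed in this thread, assuming a transform that already knows which MARC field (and first indicator) a name came from. The function names, the JSON-LD output shape, and the choice of http://purl.org/dc/terms/Agent rather than http://bibliograph.net/Agent are assumptions for illustration only; this is not how the WorldCat transformer is implemented.

```python
import json

SCHEMA = "http://schema.org/"
DCT_AGENT = "http://purl.org/dc/terms/Agent"


def agent_types(marc_tag, ind1=" "):
    """Return RDF type IRIs for a name taken from a MARC added-entry field.

    Hypothetical helper: 700 is a personal name, 710 a corporate name,
    and 720 is typed only when its first indicator is 1 (personal).
    """
    if marc_tag == "700" or (marc_tag == "720" and ind1 == "1"):
        return [SCHEMA + "Person"]
    if marc_tag == "710":
        return [SCHEMA + "Organization"]
    # Type unknown (e.g. 720 with first indicator blank or 2):
    # fall back to schema:Thing and add an external Agent class.
    return [SCHEMA + "Thing", DCT_AGENT]


def contributor_jsonld(name, marc_tag, ind1=" "):
    """Build a small JSON-LD node for a contributor (illustrative shape)."""
    return {"@type": agent_types(marc_tag, ind1), SCHEMA + "name": name}


if __name__ == "__main__":
    # A 720 field with a blank first indicator, as in Jeff's example.
    print(json.dumps(contributor_jsonld("Ke xue guan", "720"), indent=2))
```

Because the untyped case still carries schema:Thing, the node can later be refined to Person or Organization once reconciliation has been done, without changing its identifier.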
Received on Thursday, 13 August 2015 16:26:25 UTC