Re: The Agent proposal in bib.schema.org is controversial

So, the majority consensus (if that is not an oxymoron) appears to be that
we will not push forward the proposal for an Agent type, as a super-type
for Person & Organization, in the bib.schema.org extension.

The recommendation is that, when publishing an agent (i.e. something known
to be a Person or an Organization, but not which), schema:Thing should be
used.  If the data publisher wishes to flag that thing as an agent within
their data, they should use http://bibliograph.net/Agent or
http://purl.org/dc/terms/Agent in addition to schema:Thing.
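
As a rough illustration of what that multi-typing might look like, here is
a minimal Python sketch that builds a JSON-LD node typed as schema:Thing
plus one of the Agent classes.  The function name and the example.org
identifier are made up for illustration; only the three type IRIs come
from the discussion.

```python
import json

SCHEMA_THING = "http://schema.org/Thing"
BGN_AGENT = "http://bibliograph.net/Agent"
DCT_AGENT = "http://purl.org/dc/terms/Agent"


def describe_untyped_agent(identifier, name, agent_type=BGN_AGENT):
    """Build a JSON-LD node typed as schema:Thing plus an optional Agent class.

    Hypothetical helper: pass agent_type=None to publish schema:Thing only,
    or DCT_AGENT to use the Dublin Core class instead of the bgn one.
    """
    types = [SCHEMA_THING]
    if agent_type is not None:
        types.append(agent_type)
    return {
        "@id": identifier,
        "@type": types,
        # Full IRI as the key, so no @context is needed for this sketch.
        "http://schema.org/name": name,
    }


# Placeholder identifier, not a real WorldCat entity URI.
node = describe_untyped_agent("http://example.org/agent/ke_xue_guan", "Ke xue guan")
print(json.dumps(node, indent=2))
```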

If people are happy with that position I will forward it as a response to
the discussion in the main Schema.org group.

~Richard.



Richard Wallis
Founder, Data Liberate
http://dataliberate.com
Linkedin: http://www.linkedin.com/in/richardwallis
Twitter: @rjw

On 11 August 2015 at 12:43, Heuvelmann, Reinhold <R.Heuvelmann@dnb.de>
wrote:

> Some background to MARC field 720 [0]:  It was defined almost 20 years
> ago, there is a first discussion paper at [1], a second narrow one at [2],
> and the proposal at [3].
>
>
>
> Minutes of the MARBI meetings are at [4] and [5].
>
>
>
> MARC Usage (in WorldCat) statistics are at [6].
>
>
>
> Best wishes
>
>
>
> Reinhold
>
>
>
> [0]
>
> http://www.loc.gov/marc/bibliographic/bd720.html
>
>
>
> [1]
>
> “Mapping the Dublin Core Metadata Elements to USMARC”
>
> http://www.loc.gov/marc/marbi/dp/dp86.html
>
>
>
> [2]
>
> “Defining a Generic Author Field in USMARC”
>
> http://www.loc.gov/marc/marbi/dp/dp88.html
>
>
>
> [3]
>
> “Define a Generic Author Field in the Bibliographic, Authority,
> Classification, and Community Information Formats”
>
> (Date given is “December 1, 1996”, correct date is December 1, 1995.)
>
> http://www.loc.gov/marc/marbi/1996/96-02.html
>
>
>
> [4] http://www.loc.gov/marc/marbi/minutes/an-95.html
>
>
>
> [5] http://www.loc.gov/marc/marbi/1996/96-02.html
>
>
>
> [6] http://experimental.worldcat.org/marcusage/720.html
>
>
>
>
>
> *From:* Young,Jeff (OR) [mailto:jyoung@oclc.org]
> *Sent:* Monday, 10 August 2015 21:28
> *To:* corey.harper@nyu.edu; Dan Scott
> *Cc:* LeVan,Ralph; Richard Wallis; public-schemabibex@w3.org
> *Subject:* RE: The Agent proposal in bib.schema.org is controversial
>
>
>
> MARC 720 is the main culprit. The analysis Richard referred to is
> below:
>
>
>
> Jeff
>
>
>
> ---
>
>
>
> Many of the Agents coming out of the transform process are produced from
> the MARC 720 field where the 1st indicator is blank (“Not specified”) or
> 2 (“Other”):
>
>
>
> http://www.loc.gov/marc/bibliographic/bd720.html
>
>
>
> Note that the MARC 720 field is used when crosswalking Dublin Core to MARC:
>
>
>
> http://www.loc.gov/marc/dccross.html
>
>
>
> Here’s a quote:
>
> “Note: there is no way to specify whether the Contributor is a person or
> organization because it is not in the Dublin Core data. If it can
> reasonably be determined that the contributor is a person or organization,
> fields 700 1#$a (Added Entry--Personal Name) or 710 2#$a (Added
> Entry--Corporate Name) may be used.”
>
> In other words, anyone who wants to upgrade their Dublin Core data to
> Schema.org will have this same problem when trying to map dc:creator,
> dc:contributor, dct:mediator, and dct:rightsHolder, not to mention
> dc:subject.
>
>
>
> In some cases, our transformer is able to sniff the name’s structure to
> choose Person or Organization. (BTW, there is still room for some
> improvement there.) Without such clues, though, the process assigns
> bgn:Agent as a default, based on the assumption that reconciliation with
> typed entities will be done downstream.
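
The fallback described above might be sketched roughly as follows. This is
not the OCLC transformer: the keyword list, the inverted-name pattern, and
the function name are all assumptions chosen only to show the shape of the
"sniff the name, else default to bgn:Agent" logic.

```python
import re

# Hypothetical keyword list; a real transformer would need a far richer one.
ORG_HINTS = re.compile(
    r"\b(inc|ltd|corp|company|university|library|society|association|institute|press)\b",
    re.IGNORECASE,
)
# Inverted "Surname, Forename" form, common for personal names in catalogue data.
PERSON_PATTERN = re.compile(r"^[^,]+,\s*[^,]+$")


def guess_type(name):
    """Return a type for a name string, defaulting to bgn:Agent without clues."""
    if ORG_HINTS.search(name):
        return "schema:Organization"
    if PERSON_PATTERN.match(name):
        return "schema:Person"
    # No structural clue: leave reconciliation to a downstream process.
    return "bgn:Agent"


for sample in ["Smith, John", "Oxford University Press", "Ke xue guan."]:
    print(sample, "->", guess_type(sample))
```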
>
>
>
> I can give plenty of examples, but here’s one to illustrate:
>
> (The highlighted rdf:value statements are used for debugging and contain
> the source data used in the mapping.)
>
>
>
> <http://www.worldcat.org/oclc/881301071> # "我的動物小百科"
>     schema:contributor <http://experiment.worldcat.org/entity/work/data/1913968092#Agent/ke_xue_guan> ; # "科學館"
>     .
>
> <http://experiment.worldcat.org/entity/work/data/1913968092#Agent/ke_xue_guan> # "科學館"
>     a bgn:Agent ;
>     rdfs:label "Ke xue guan" ;
>     schema:name "科學館"@zh ;
>     rdf:value "<v720 altscript=\".//v880[16]\" i1=\" \" i2=\" \"><s6><d>880-16</d></s6><sa><d>Ke xue guan.</d></sa></v720>"^^rdf:XMLLiteral ; # idiomatic diagnostic
>     rdf:value "<v880 i1=\" \" i2=\" \" lnkfrom=\"720\" script=\"$1\" xlink=\".//v720[1]\"><s6><d>720-16/$1</d></s6><sa><d>科學館</d></sa></v880>"^^rdf:XMLLiteral ; # idiomatic diagnostic
>     .
>
>
> Jeff
>
>
>
> *From:* Corey A Harper [mailto:corey.harper@nyu.edu]
>
> *Sent:* Monday, August 10, 2015 3:11 PM
> *To:* Dan Scott
> *Cc:* Young,Jeff (OR); LeVan,Ralph; Richard Wallis;
> public-schemabibex@w3.org
> *Subject:* Re: The Agent proposal in bib.schema.org is controversial
>
>
>
> Dear all,
>
>
>
> My $0.02: I also think that schema:Thing is the best option at this time,
> and don't think we should push too much on Agent given what I consider
> relatively limited usefulness. I understand Jeff's point about the dangers
> of "not sorting these out", but I also think that we can store and manage
> data with whatever specificity we want, and I'm not sure those dangers apply
> to data as published downstream to consumers on the Web.
>
>
>
> I'm also _very_ interested in knowing more about the 70 Million + "mystery
> agents" Richard and Jeff have been referencing. Are these just 1xx and 7xx
> data points that are type unknown because they haven't matched a known
> entity with a known type? Can't we at least infer more about their type by
> their MARC field? Can we see some example instance (bib) data where these
> show up?
>
>
>
> Best,
>
> -Corey
>
>
>
> On Mon, Aug 10, 2015 at 1:55 PM, Dan Scott <denials@gmail.com> wrote:
>
> FWIW, the Bibliographic Ontology (bibo) also uses foaf:Agent.
>
>
>
> But I concur with the developing dissenting opinion on the github issue
> that, if we have nothing specific to say about the nature of the entity
> because we lack the information, it's better to simply avoid the compromise
> of Agent. We might make ourselves feel a bit better about the dismal state
> of our bibliographic data through an abstract class like Agent, but in the
> end it doesn't really add any data to the data we're trying to express.
>
>
>
> Using schema:Thing seems like an acceptable fallback in the meantime, and
> allows the data expressed by the target links to be refined to either
> Person or Organization at some point in the future when the effort occurs.
>
>
>
>
>
> On Mon, 10 Aug 2015 at 11:29 Young,Jeff (OR) <jyoung@oclc.org> wrote:
>
> I made an argument that the problem is broader than bib records:
>
>
>
> https://github.com/schemaorg/schemaorg/issues/700#issuecomment-129078302
>
>
>
> Limiting to our situation, though, Richard cites the count from WorldCat
> at 72 million “agents” (people and organizations excluded):
>
>
>
> https://github.com/schemaorg/schemaorg/issues/700#issuecomment-129227478
>
>
>
> These all have Linked Data identifiers, but they are only mechanized
> placeholders in need of exposure, reconciliation, and enrichment.
>
>
>
> The danger of not sorting these out is that naïve automated “entity
> matching” processes resort to string matching on name as an “else
> condition” and the resulting mix up manifests itself in the Linked Data.
>
>
>
> I suggested Google Custom Search as a possible tool to help with discovery
> and possibly lead to an interface where they could be reconciled:
>
>
>
> https://github.com/schemaorg/schemaorg/issues/700#issuecomment-129239474
>
>
>
> Jeff
>
>
>
> *From:* LeVan,Ralph
> *Sent:* Monday, August 10, 2015 10:33 AM
> *To:* Young,Jeff (OR); Richard Wallis; public-schemabibex@w3.org
>
>
> *Subject:* RE: The Agent proposal in bib.schema.org is controversial
>
>
>
> One of the arguments against Agent was that if you didn’t know what kind
> of object a thing was, then you just shouldn’t say.   All the properties of
> Agent seem to come from Thing.  I’d propose that we just use Thing.
>
>
>
> My guess is that the need for Agent comes mostly from our need to convert
> existing bib records into RDF and some of our crappy old bib records don’t
> reliably distinguish the type of agent involved.  Rather than be caught out
> in a lie about whether the agent is a Person or Organization, we’d rather
> say less.  This is a problem peculiar to our situation and not a broad
> problem of the internet community.  It’s also a short-term problem.
> Selling ‘Agent’ to a community that doesn’t need it is going to be an
> uphill battle.
>
>
>
> What’s wrong with dropping all the way back to Thing when we don’t know
> the type of the agent?
>
>
>
> Ralph
>
>
>
> *From:* Young,Jeff (OR) [mailto:jyoung@oclc.org]
> *Sent:* Monday, August 10, 2015 10:04 AM
> *To:* Richard Wallis; public-schemabibex@w3.org
> *Subject:* RE: The Agent proposal in bib.schema.org is controversial
>
>
>
> One option would be for us to use foaf:Agent. Presumably search engines
> would ignore it, but that’s their prerogative.
>
>
>
> Another option would be to preserve http://bibliograph.net/Agent, with a
> comment that it wasn’t accepted by the broader community, but remains
> useful in our limited domain. (Terms that have been adopted should be
> deprecated.)
>
>
>
> Jeff
>
>
>
>
>
> *From:* Richard Wallis [mailto:richard.wallis@dataliberate.com]
> *Sent:* Monday, August 10, 2015 8:18 AM
> *To:* public-schemabibex@w3.org
> *Subject:* The Agent proposal in bib.schema.org is controversial
>
>
>
> You may have noticed if you followed the recent announcement of Schema.org
> v2.1
> <https://lists.w3.org/Archives/Public/public-schemabibex/2015Aug/0000.html>,
> which includes bib.schema.org, that one of our proposals did not make it
> in.  That proposal being the Agent type that we proposed as a super-type
> for Person and Organization.
>
>
>
> Agent has been a theme of discussion in the community well before we
> approached the issue.  You can follow the recent debate in the related
> schemaorg git issue comment trail:
> https://github.com/schemaorg/schemaorg/issues/700
>
>
>
> In the bibliographic world Agent is a well understood, some would say
> obvious, approach.  When applied to the wider domains that Schema.org
> embraces however, it raises many concerns and issues. Especially because,
> as proposed, it would introduce a new direct sub-type of Thing with
> ramifications that could cascade across many areas of the  vocabulary.
>
>
>
> In my personal opinion the gap between the two opposing views on this is
> significant and the best way forward would be to consider possible
> pragmatic approaches to how we represent our data in Schema.org without
> losing the ability to describe our resources effectively to the wider
> world.
>
>
>
> In simple terms, if we identify an author, creator, publisher, or even
> copyright holder as a Person or an Organization there is not a problem.
> The difficulty occurs when we know from the relationships in the data that
> they are either a Person or an Organization but cannot identify which.
>
>
>
> One suggested way forward for such a circumstance would be to define them
> as a schema:Thing.  To me this feels a little too vague.  A follow-on
> option was to suggest a 'personOrOrganization' boolean property to indicate
> this circumstance.  This is a little more appealing, but I think it still
> needs some work.
>
>
>
> What are others' thoughts on this?
>
>
>
> Do we believe that the proposed Agent type is the *only* way forward?
> Are there potential pragmatic options like the one I describe above that we
> could shape, that would be acceptable? Is this requirement to specifically
> describe agents too detailed, something we can pass over so that we can
> move on to other things?
>
>
>
> ~Richard.
>
>
>
>
>
>
> Richard Wallis
>
> Founder, Data Liberate
>
> http://dataliberate.com
>
> Linkedin: http://www.linkedin.com/in/richardwallis
>
> Twitter: @rjw
>
>
>
>
>
> --
>
> Corey A Harper
> Metadata Services Librarian
> New York University Libraries
> 20 Cooper Square, 3rd Floor
> New York, NY 10003-7112
> 212.998.2479
> corey.harper@nyu.edu
>

Received on Thursday, 13 August 2015 16:26:25 UTC