Re: The Agent proposal in bib.schema.org is controversial

On 13 August 2015 at 17:34, Young,Jeff (OR) <jyoung@oclc.org> wrote:

> +1
>
>
>
> foaf:Agent would be another viable option to suggest.
>

Yes - we should also get an example into the Recipes and Guidelines
<https://www.w3.org/community/schemabibex/wiki/Recipes_and_Guidelines> area
of the wiki.
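
As a starting point for such an example, here is a minimal sketch of the position summarised below: type the entity as schema:Thing and, optionally, add one of the Agent classes mentioned in this thread. It assumes JSON-LD with full IRIs as the output; the helper names (untyped_agent, work_with_contributor) are illustrative only and not part of any existing tool.

```python
import json

# Type IRIs mentioned in the thread. Which Agent class to add alongside
# schema:Thing (if any) is left to the data publisher.
SCHEMA_THING = "http://schema.org/Thing"
BGN_AGENT = "http://bibliograph.net/Agent"
DCT_AGENT = "http://purl.org/dc/terms/Agent"
FOAF_AGENT = "http://xmlns.com/foaf/0.1/Agent"


def untyped_agent(iri, name, extra_type=BGN_AGENT):
    """Describe an agent whose Person/Organization type is unknown.

    Hypothetical helper: always emits schema:Thing, plus one optional
    Agent class from another vocabulary.
    """
    types = [SCHEMA_THING]
    if extra_type:
        types.append(extra_type)
    return {"@id": iri, "@type": types, "http://schema.org/name": name}


def work_with_contributor(work_iri, agent):
    """Attach the agent description to a work via schema:contributor."""
    return {"@id": work_iri, "http://schema.org/contributor": agent}


if __name__ == "__main__":
    # IRIs and name taken from Jeff's WorldCat example further down.
    agent = untyped_agent(
        "http://experiment.worldcat.org/entity/work/data/1913968092#Agent/ke_xue_guan",
        "科學館",
    )
    work = work_with_contributor("http://www.worldcat.org/oclc/881301071", agent)
    print(json.dumps(work, ensure_ascii=False, indent=2))
```

Passing extra_type=FOAF_AGENT or DCT_AGENT swaps the additional class, and extra_type=None leaves only schema:Thing, which can later be refined to Person or Organization.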

~Richard

>
>
> *From:* Richard Wallis [mailto:richard.wallis@dataliberate.com]
> *Sent:* Thursday, August 13, 2015 12:26 PM
> *To:* public-schemabibex@w3.org
>
> *Subject:* Re: The Agent proposal in bib.schema.org is controversial
>
>
>
> So, the majority consensus (if that is not an oxymoron) appears to be that
> we will not push forward the proposal for an Agent type, as a super-type
> for Person & Organization, in the bib.schema.org extension.
>
>
>
> The recommendation is that, when publishing an agent (i.e. a Person or an
> Organization, where it is not clear which), schema:Thing should be used.  If
> the data publisher wishes to flag that thing as an agent within their data,
> they should use http://bibliograph.net/Agent or
> http://purl.org/dc/terms/Agent in addition to schema:Thing.
>
>
>
> If people are happy with that position I will forward it as a response to
> the discussion in the main Schema.org group.
>
>
>
> ~Richard.
>
>
>
>
>
>
> Richard Wallis
>
> Founder, Data Liberate
>
> http://dataliberate.com
>
> Linkedin: http://www.linkedin.com/in/richardwallis
>
> Twitter: @rjw
>
>
>
> On 11 August 2015 at 12:43, Heuvelmann, Reinhold <R.Heuvelmann@dnb.de>
> wrote:
>
> Some background to MARC field 720 [0]:  It was defined almost 20 years
> ago, there is a first discussion paper at [1], a second narrow one at [2],
> and the proposal at [3].
>
>
>
> Minutes of the MARBI meetings are at [4] and [5].
>
>
>
> MARC Usage (in WorldCat) statistics are at [6].
>
>
>
> Best wishes
>
>
>
> Reinhold
>
>
>
> [0]
>
> http://www.loc.gov/marc/bibliographic/bd720.html
>
>
>
> [1]
>
> “Mapping the Dublin Core Metadata Elements to USMARC”
>
> http://www.loc.gov/marc/marbi/dp/dp86.html
>
>
>
> [2]
>
> “Defining a Generic Author Field in USMARC”
>
> http://www.loc.gov/marc/marbi/dp/dp88.html
>
>
>
> [3]
>
> “Define a Generic Author Field in the Bibliographic, Authority,
> Classification, and Community Information Formats”
>
> (Date given is “December 1, 1996”, correct date is December 1, 1995.)
>
> http://www.loc.gov/marc/marbi/1996/96-02.html
>
>
>
> [4] http://www.loc.gov/marc/marbi/minutes/an-95.html
>
>
>
> [5] http://www.loc.gov/marc/marbi/1996/96-02.html
>
>
>
> [6] http://experimental.worldcat.org/marcusage/720.html
>
>
>
>
>
> *Von:* Young,Jeff (OR) [mailto:jyoung@oclc.org]
> *Gesendet:* Montag, 10. August 2015 21:28
> *An:* corey.harper@nyu.edu; Dan Scott
> *Cc:* LeVan,Ralph; Richard Wallis; public-schemabibex@w3.org
> *Betreff:* RE: The Agent proposal in bib.schema.org is controversial
>
>
>
> MARC 720 is the main culprit. The analysis Richard referred to is below:
>
>
>
> Jeff
>
>
>
> ---
>
>
>
> Many of the Agents coming out of the transform process are produced from
> the MARC 720 field where the 1st indicator is blank (“Not specified”) or
> 2 (“Other”):
>
>
>
> http://www.loc.gov/marc/bibliographic/bd720.html
>
>
>
> Note that the MARC 720 field is used when crosswalking Dublin Core to MARC:
>
>
>
> http://www.loc.gov/marc/dccross.html
>
>
>
> Here’s a quote:
>
> “Note: there is no way to specify whether the Contributor is a person or
> organization because it is not in the Dublin Core data. If it can
> reasonably be determined that the contributor is a person or organization,
> fields 700 1#$a (Added Entry--Personal Name) or 710 2#$a (Added
> Entry--Corporate Name) may be used.”
>
> In other words, anyone who wants to upgrade their Dublin Core data to
> Schema.org will have this same problem when trying to map dc:creator,
> dc:contributor, dct:mediator, and dct:rightsHolder, not to mention
> dc:subject.
>
>
>
> In some cases, our transformer is able to sniff the name’s structure to
> choose Person or Organization. (BTW, there is still room for some
> improvement there.) Without such clues, though, the process assigns
> bgn:Agent as a default, based on the assumption that reconciliation with
> typed entities will be done downstream.
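
To make the fallback logic described above concrete, here is a rough sketch. It is not the actual transformer: the keyword list and the "Surname, Forename" test are made-up heuristics for illustration, and the function name guess_agent_type is hypothetical. The only part taken from the MARC documentation is the meaning of the 720 first indicator (blank = not specified, 1 = personal, 2 = other).

```python
import re

# Hypothetical keyword hints suggesting a corporate name; a real
# transformer would need a far richer, multilingual set of clues.
ORG_HINTS = {
    "university", "library", "libraries", "museum", "society",
    "institute", "association", "inc", "ltd", "press",
}


def guess_agent_type(name, indicator1=" "):
    """Guess a type for a MARC 720 name, defaulting to bgn:Agent.

    indicator1 is the 720 first indicator: " " (not specified),
    "1" (personal) or "2" (other).
    """
    if indicator1 == "1":
        return "schema:Person"
    # Compare whole words so that e.g. "Pressman" does not match "press".
    words = set(re.findall(r"[a-z]+", name.lower()))
    if words & ORG_HINTS:
        return "schema:Organization"
    # Inverted "Surname, Forename" form is a weak hint of a personal name.
    if re.match(r"^[^,]+,\s*\S+", name):
        return "schema:Person"
    # No usable clue: leave reconciliation to a downstream process.
    return "bgn:Agent"


if __name__ == "__main__":
    for sample in ("Ke xue guan.", "Smith, John", "New York University Libraries"):
        print(sample, "->", guess_agent_type(sample))
```

Under the consensus reached later in the thread, the final return value would become schema:Thing, optionally paired with one of the Agent classes.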
>
>
>
> I can give plenty of examples, but here’s one to illustrate:
>
> (The highlighted rdf:value statements are used for debugging and contain
> the source data used in the mapping.)
>
>
>
> <http://www.worldcat.org/oclc/881301071> # "我的動物小百科"
>     schema:contributor <http://experiment.worldcat.org/entity/work/data/1913968092#Agent/ke_xue_guan> ; # "科學館"
>     .
>
> <http://experiment.worldcat.org/entity/work/data/1913968092#Agent/ke_xue_guan> # "科學館"
>     a bgn:Agent ;
>     rdfs:label "Ke xue guan" ;
>     schema:name "科學館"@zh ;
>     rdf:value "<v720 altscript=\".//v880[16]\" i1=\" \" i2=\" \"><s6><d>880-16</d></s6><sa><d>Ke xue guan.</d></sa></v720>"^^rdf:XMLLiteral ; # idiomatic diagnostic
>     rdf:value "<v880 i1=\" \" i2=\" \" lnkfrom=\"720\" script=\"$1\" xlink=\".//v720[1]\"><s6><d>720-16/$1</d></s6><sa><d>科學館</d></sa></v880>"^^rdf:XMLLiteral ; # idiomatic diagnostic
>     .
>
> Jeff
>
>
>
> *From:* Corey A Harper [mailto:corey.harper@nyu.edu]
>
> *Sent:* Monday, August 10, 2015 3:11 PM
> *To:* Dan Scott
> *Cc:* Young,Jeff (OR); LeVan,Ralph; Richard Wallis;
> public-schemabibex@w3.org
> *Subject:* Re: The Agent proposal in bib.schema.org is controversial
>
>
>
> Dear all,
>
>
>
> My $0.02: I also think that schema:Thing is the best option at this time,
> and don't think we should push too much on Agent given what I consider
> relatively limited usefulness. I understand Jeff's point about the dangers
> of "not sorting these out", but I also think that we can store and manage
> data with whatever specificity we want, and I'm not sure those dangers apply
> to data as published downstream to consumers on the Web.
>
>
>
> I'm also _very_ interested in knowing more about the 70 Million + "mystery
> agents" Richard and Jeff have been referencing. Are these just 1xx and 7xx
> data points that are type unknown because they haven't matched a known
> entity with a known type? Can't we at least infer more about their type by
> their MARC field? Can we see some example instance (bib) data where these
> show up?
>
>
>
> Best,
>
> -Corey
>
>
>
> On Mon, Aug 10, 2015 at 1:55 PM, Dan Scott <denials@gmail.com> wrote:
>
> FWIW, the Bibliographic Ontology (bibo) also uses foaf:Agent.
>
>
>
> But I concur with the developing dissenting opinion on the github issue
> that, if we have nothing specific to say about the nature of the entity
> because we lack the information, it's better to simply avoid the compromise
> of Agent. We might make ourselves feel a bit better about the dismal state
> of our bibliographic data through an abstract class like Agent, but in the
> end it doesn't really add any data to the data we're trying to express.
>
>
>
> Using schema:Thing seems like an acceptable fallback in the meantime, and
> allows the data expressed by the target links to be refined to either
> Person or Organization at some point in the future when the effort occurs.
>
>
>
>
>
> On Mon, 10 Aug 2015 at 11:29 Young,Jeff (OR) <jyoung@oclc.org> wrote:
>
> I made an argument that the problem is broader than bib records:
>
>
>
> https://github.com/schemaorg/schemaorg/issues/700#issuecomment-129078302
>
>
>
> Limiting to our situation, though, Richard cites the count from WorldCat
> at 72 million “agents” (people and organizations excluded):
>
>
>
> https://github.com/schemaorg/schemaorg/issues/700#issuecomment-129227478
>
>
>
> These all have Linked Data identifiers, but they are only mechanized
> placeholders in need of exposure, reconciliation, and enrichment.
>
>
>
> The danger of not sorting these out is that naïve automated “entity
> matching” processes resort to string matching on name as an “else
> condition” and the resulting mix up manifests itself in the Linked Data.
>
>
>
> I suggested Google Custom Search as a possible tool to help with discovery
> and possibly lead to an interface where they could be reconciled:
>
>
>
> https://github.com/schemaorg/schemaorg/issues/700#issuecomment-129239474
>
>
>
> Jeff
>
>
>
> *From:* LeVan,Ralph
> *Sent:* Monday, August 10, 2015 10:33 AM
> *To:* Young,Jeff (OR); Richard Wallis; public-schemabibex@w3.org
>
>
> *Subject:* RE: The Agent proposal in bib.schema.org is controversial
>
>
>
> One of the arguments against Agent was that if you didn’t know what kind
> of object a thing was, then you just shouldn’t say.   All the properties of
> Agent seem to come from Thing.  I’d propose that we just use Thing.
>
>
>
> My guess is that the need for Agent comes mostly from our need to convert
> existing bib records into RDF and some of our crappy old bib records don’t
> reliably distinguish the type of agent involved.  Rather than be caught out
> in a lie about whether the agent is a Person or Organization, we’d rather
> say less.  This is a problem peculiar to our situation and not a broad
> problem of the internet community.  It’s also a short-term problem.
> Selling ‘Agent’ to a community that doesn’t need it is going to be an
> uphill battle.
>
>
>
> What’s wrong with dropping all the way back to Thing when we don’t know
> the type of the agent?
>
>
>
> Ralph
>
>
>
> *From:* Young,Jeff (OR) [mailto:jyoung@oclc.org]
> *Sent:* Monday, August 10, 2015 10:04 AM
> *To:* Richard Wallis; public-schemabibex@w3.org
> *Subject:* RE: The Agent proposal in bib.schema.org is controversial
>
>
>
> One option would be for us to use foaf:Agent. Presumably search engines
> would ignore it, but that’s their prerogative.
>
>
>
> Another option would be to preserve http://bibliograph.net/Agent, with a
> comment that it wasn’t accepted by the broader community, but remains
> useful in our limited domain. (Terms that have been adopted should be
> deprecated.)
>
>
>
> Jeff
>
>
>
>
>
> *From:* Richard Wallis [mailto:richard.wallis@dataliberate.com]
> *Sent:* Monday, August 10, 2015 8:18 AM
> *To:* public-schemabibex@w3.org
> *Subject:* The Agent proposal in bib.schema.org is controversial
>
>
>
> You may have noticed, if you followed the recent announcement of Schema.org
> v2.1
> <https://lists.w3.org/Archives/Public/public-schemabibex/2015Aug/0000.html>,
> which includes bib.schema.org, that one of our proposals did not make it
> in.  That proposal was the Agent type that we proposed as a super-type
> for Person and Organization.
>
>
>
> Agent has been a theme of discussion in the community well before we
> approached the issue.  You can follow the recent debate in the related
> schemaorg git issue comment trail:
> https://github.com/schemaorg/schemaorg/issues/700
>
>
>
> In the bibliographic world Agent is a well understood, some would say
> obvious, approach.  When applied to the wider domains that Schema.org
> embraces however, it raises many concerns and issues. Especially because,
> as proposed, it would introduce a new direct sub-type of Thing with
> ramifications that could cascade across many areas of the  vocabulary.
>
>
>
> In my personal opinion the gap between the two opposing views on this is
> significant and the best way forward would be to consider possible
> pragmatic approaches to how we represent our data in Schema.org without
> losing the ability to describe our resources effectively to the wider
> world.
>
>
>
> In simple terms, if we identify an author, creator, publisher, or even
> copyright holder as a Person or an Organization there is not a problem.
> The difficulty occurs when we know from the relationships in the data that
> they are either a Person or an Organization but cannot identify which.
>
>
>
> One suggested way forward for such a circumstance would be to define them
> as a schema:Thing.  To me this feels a little too vague.  A follow-on
> option was to suggest a 'personOrOrganization' boolean property to indicate
> this circumstance.  This is a little more appealing, but I think it still
> needs some work.
>
>
>
> What are others' thoughts on this?
>
>
>
> Do we believe that the proposed Agent type is the *only* way forward?
> Are there potential pragmatic options like the one I describe above that we
> could shape, that would be acceptable? Is this requirement to specifically
> describe agents too detailed, and something we can pass over so that we can
> move on to other things?
>
>
>
> ~Richard.
>
>
>
>
>
>
> Richard Wallis
>
> Founder, Data Liberate
>
> http://dataliberate.com
>
> Linkedin: http://www.linkedin.com/in/richardwallis
>
> Twitter: @rjw
>
>
>
>
>
> --
>
> Corey A Harper
> Metadata Services Librarian
> New York University Libraries
> 20 Cooper Square, 3rd Floor
> New York, NY 10003-7112
> 212.998.2479
> corey.harper@nyu.edu
>
>
>

Received on Thursday, 13 August 2015 16:40:13 UTC