Re: Why we want to have separate Periodical and (Periodical)Issu(e|ance) types from Karen Coyle on 2013-11-22 (public-schemabibex@w3.org from November 2013)

From: Karen Coyle <kcoyle@kcoyle.net>
Date: Fri, 22 Nov 2013 10:16:44 -0800
To: "public-schemabibex@w3.org" <public-schemabibex@w3.org>
Message-ID: <528F9F8C.4030202@kcoyle.net>
I have replied privately to Dan on this, but to save everyone else time, 
you do not have to explain linked data or RDF to me. I already know.

Thanks,
kc

On 11/22/13 8:43 AM, Dan Scott wrote:
> On Fri, Nov 22, 2013 at 9:53 AM, Karen Coyle <kcoyle@kcoyle.net> wrote:
>>
>>
>> On 11/21/13 5:42 PM, Dan Scott wrote:
>>
>>>
>>> Yes, you have mentioned this a number of times now. As I said on the
>>> call, we're working with structured data. One benefit lies in being
>>> able to define Periodical as an entity in and of itself, then refer to
>>> it from the separate issues, instead of repeating the core Periodical
>>> information in each instance of an issue (and worse, in each instance
>>> of an article in each instance of a Periodical). If you refer to two
>>> separate issues of the same periodical on the same page, and you
>>> haven't broken Periodical out separately from Issue, then you have to
>>> repeat all that core Periodical information with slightly different
>>> volume / number / date information. You could determine that they're
>>> the same Periodical by comparing their ISSN and name, I suppose, but
>>> that seems like a very twisted and artificial way to achieve what
>>> should be a very basic operation.
>>
>>
>> I honestly don't see this as a mark-up use case. So I would like to see an
>> example (preferably of a real web page) where this type of structuring would
>> be used in the mark-up.
>
> I have included such an example in the Periodical proposal from the
> very start; see "Example 1: A list of the issues of a given
> periodical, and the articles that were published in each issue." The
> example uses an ellipsis to indicate that further articles would
> follow. Perhaps I need to make that clear.
>
> For another example, on November 1 I wrote: "One use case is "here's
> everything we know about <Time magazine> or <Laurentian University
> student newspaper>", listing all of the issues and articles contained
> in those issues and linking to the articles where feasible. I can
> imagine search engines and discovery layers greedily gobbling that
> up." (http://lists.w3.org/Archives/Public/public-schemabibex/2013Nov/0005.html)
>
> And for another example that points at a real web page, in a separate
> email on November 1, I wrote: "The citation for "International Journal
> of Sustainable Development & World Ecology" just lists 2007, for
> example, but as we can see at
> http://www.tandfonline.com/loi/tsdw20?open=14&repitition=0#vol_14
> there were six issues published in 2007. [...] In many cases, the
> publishers themselves know a lot (see the Taylor & Francis link
> above), and could augment the mechanical issues/articles lists that
> they already publish with structured data fairly easily _if_ we help
> them with a reasonable set of types & properties and provide a
> reasonable pattern to follow. (Do we have any periodical publishers on
> this list? That would be fantastic!) The obvious motivation for
> publishers would be to drive more traffic to their "Pay us $$ to
> access a copy of this article..." business model. Heh."
> (http://lists.w3.org/Archives/Public/public-schemabibex/2013Nov/0006.html)
>
>> I do not see a problem with having some repetition in marking up a page
>> like:
>
> It's not a problem if someone chooses to repeat the information. It's
> a problem if we _force_ them to repeat the information because we
> failed to give them a way to cleanly and rationally provide structure
> for their data.
>
>>
>> Le Boeuf, P. (2012). Foreword. Cataloging & Classification Quarterly,
>> 50(5-7), 355–359. doi:10.1080/01639374.2012.682001
>>
>> MADISON, Olivia M.A. The origins of the IFLA study on Functional
>> Requirements for Bibliographic Records. In: LE BŒUF, Patrick. Ed. Functional
>> Requirements for Bibliographic Records (FRBR): Hype, or Cure-All?
>> Binghamton, NY: The Haworth Press, 2005.
>>
>> Le Boeuf, P. (2005). Musical Works in the FRBR Model or "Quasi la Stessa
>> Cosa": Variations on a Theme by Umberto Eco. Cataloging & Classification
>> Quarterly, 39(3-4), 103-124. doi:10.1080/01639374.2012.682001
>>
>> Schmidt, R. (2012). Composing in Real Time: Jazz Performances as “Works” in
>> the FRBR Model. Cataloging & Classification Quarterly, 50(5-7), 653–669.
>> doi:10.1080/01639374.2012.68160
>>
>> Are you saying that you feel a need to have "Cataloging & Classification
>> Quarterly, 50(5-7)" coded in only a single entry on that page? I am assuming
>> that each entry stands alone, and it needs to be marked up something like
>> (and, yes, this is very pseudo-codey):
>>
>> Article
>>    author "Le Boeuf, P."
>>    name "Foreword"
>>    Periodical
>>      name "Cataloging..."
>>      volume "50"
>>      issue "5-7"
>>    pages "355-359"
>>    id "doi:10.1080/01639374.2012.682001"
>
> Note that you have inlined "author" here as a text value, where a more
> structured approach would inline or link to a Person or Organization.
> That's because schema.org supports structured data. But processors do
> accept that humans will sometimes be lazy or confused and will do
> their best to deal with non-structured data.
>
> The use case for separating issue/volume out of Periodical is
> absolutely parallel. I want to support non-lazy, clear-thinking
> development of systems. So in actual RDFa Lite markup, that would look
> something like:
>
> <div vocab="http://schema.org/" typeof="Article">
>    <span property="author" typeof="Person"><link property="url"
> href="http://viaf.org/viaf/22193216" /><span property="name">Le Boeuf,
> P.</span></span>
>    (<span property="datePublished">2005</span>).
>    <span property="name">Musical Works in the FRBR Model or "Quasi la
> Stessa Cosa": Variations on a Theme by Umberto Eco</span>.
>    <span property="partOfIssue" typeof="Issuance">
>      <link property="url" href="http://www.tandfonline.com/toc/wccq20/39/3-4" />
>      <span property="partOfPeriodical" typeof="Periodical">
>        <link property="url" href="http://www.tandfonline.com/loi/wccq20" />
>        <span property="name">Cataloging &amp; Classification Quarterly</span>,
>      </span>
>      <span property="issueVolume">39</span>(<span
> property="issueNumber">3-4</span>), <span
> property="pagination">103-124</span>.
>    </span>
>    <a property="url"
> href="http://dx.doi.org/10.1080/01639374.2012.682001">doi:10.1080/01639374.2012.682001</a>
> </div>
>
> Notice:
>
> * The "author" property is a full-fledged Person with a link to an
> authoritative URL. Linked data, win!
> * The "partOfPeriodical" property is a full-fledged Periodical with a
> link to a URL (not going to say it is authoritative, but it certainly
> works as an identifier). Linked data, win!
> * The "partOfIssue" property is a full-fledged Issuance that is, in
> this case, described inline and links to
> http://www.tandfonline.com/toc/wccq20/39/3-4 for a URL. (Oh, hey look,
> publishers feel that it's important to separate out and describe their
> issues onto separate web pages!). Linked data win! We could also link
> to http://catalogingandclassificationquarterly.com/ccq39nr3-4.html via
> sameAs but that just ends up being a table of contents (aside: seems
> like schema.org could use a property and type(s) for ToCs) so it's not
> a very satisfying place to link (maybe for "description"...)
> * We've turned that DOI into a clickable link and authoritative URL
> for the article.
>
> And the markup works. Toss it into Google's Structured Data Testing
> Tool or http://rdfa.info/play. Now that I've taken the time to mark it
> up and test it, I'll add that to the examples in the proposal so that
> we can knock off the "enhanced citation" use case.
>
>> And in the case where the page represents an issue with, say, its table of
>> contents, then:
>>
>> Periodical
>>    name
>>    volume
>>    issue
>>    date
>>      Article1
>>        author...
>>      Article2
>>        author
>>
>> Which tells me that we don't have a hierarchical structure between
>> Periodical and Article, but have two things that can be used together in
>> various ways.
>
> Except of course "issue" can simply be a single Issu(e|ance) type that
> wraps all of the Article types. Bump issue and everything over to the
> right one level and it works quite nicely.
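>
> A minimal sketch of that shape in RDFa Lite, reusing the property
> names from the markup above ("partOfIssue", "partOfPeriodical",
> "issueVolume", "issueNumber", "pagination"). Those names come from
> the draft proposal and may change. The "#issue" fragment is a
> hypothetical ID chosen only for this illustration, and the ellipsis
> stands for the remaining articles. Nesting each Article inside the
> issue element would need a property pointing from issue to article,
> which this thread does not name, so the sketch has each Article
> point back at the issue instead:
>
> <div vocab="http://schema.org/">
>   <div typeof="Issuance" resource="#issue">
>     <span property="partOfPeriodical" typeof="Periodical">
>       <span property="name">Cataloging &amp; Classification
>         Quarterly</span>,
>     </span>
>     <span property="issueVolume">50</span>
>     (<span property="issueNumber">5-7</span>)
>   </div>
>   <div typeof="Article">
>     <link property="partOfIssue" href="#issue" />
>     <span property="author" typeof="Person">
>       <span property="name">Le Boeuf, P.</span>
>     </span>
>     <span property="name">Foreword</span>,
>     <span property="pagination">355-359</span>
>   </div>
>   <div typeof="Article">
>     <link property="partOfIssue" href="#issue" />
>     <span property="author" typeof="Person">
>       <span property="name">Schmidt, R.</span>
>     </span>
>     <span property="name">Composing in Real Time: Jazz Performances
>       as "Works" in the FRBR Model</span>,
>     <span property="pagination">653-669</span>
>   </div>
>   ...
> </div>
>
> The issue and its periodical are described once, and every article
> on the page refers to that single description.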
>
> I agree that we have things (but not just two) that can be used
> together in various ways. As meaningless as that statement is, I'll
> try to make it concrete: even before yesterday's call began, the
> Periodical proposal already let an Article link directly to its
> containing Periodical, for those arxiv.org use cases that require
> only two types. The proposal also supports the more traditional
> periodical relationships.
>
>> (This also helps the case where there is an article that is
>> not associated with a periodical. One of the examples above is an article
>> reprinted in a book.)
>
> Okay, then, let's think that example through. By adopting the
> "Collection" proposal [1], as we agreed to do on the call, Article can
> use its new "isPartOf" relationship that it will inherit from
> CreativeWork to point at the Book for that exact use case.
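>
> As a rough sketch only, and assuming "isPartOf" does become
> available on Article through CreativeWork as described, the
> reprinted article from the list above might be marked up like this
> ("editor", "publisher" and "datePublished" are existing schema.org
> properties; the place of publication is left out for brevity):
>
> <div vocab="http://schema.org/" typeof="Article">
>   <span property="author" typeof="Person">
>     <span property="name">Madison, Olivia M.A.</span>
>   </span>
>   <span property="name">The origins of the IFLA study on Functional
>     Requirements for Bibliographic Records</span>.
>   In: <span property="isPartOf" typeof="Book">
>     <span property="editor" typeof="Person">
>       <span property="name">Le Bœuf, Patrick</span>
>     </span> (Ed.),
>     <span property="name">Functional Requirements for Bibliographic
>       Records (FRBR): Hype, or Cure-All?</span>
>     <span property="publisher" typeof="Organization">
>       <span property="name">The Haworth Press</span>
>     </span>,
>     <span property="datePublished">2005</span>.
>   </span>
> </div>
>
> No Periodical or Issuance is involved: the Article simply points at
> the Book that contains it.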
>
>> The structure we need to address is that of the web
>> page, which very well may be repetitive.
>
> I disagree. We need to add structured data to the web page that
> enables search engines and other schema.org processors to do
> intelligent things with the data on the page. The data might repeat on
> a given web page, but being able to give it a @resource ID in RDFa and
> refer to that thereafter, or use microdata's @itemref / @itemid
> mechanisms, or point to authoritative URLs, will enable the processors
> to say "Ah, okay, so this article and this article belong to the same
> issue in the same periodical, and I know from crawling these other
> pages that these are all the other issues for this same periodical,
> and now I can do intelligent things with this data like generate my
> own much more usable table of contents with links to the open-access
> versions of these articles that I know about from crawling
> institutional repositories..." etc.
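>
> A small sketch of the @resource approach, under the assumption that
> the draft property names used earlier in this message stay as they
> are. Two citations from different issues share one Periodical: the
> first gives it an identifier via @resource (here the Taylor &
> Francis URL already used above), and the second refers to that
> identifier instead of describing the periodical again. Author and
> title markup is elided with "..." to keep the sketch short:
>
> <div vocab="http://schema.org/">
>   <div typeof="Article">
>     ...
>     <span property="partOfIssue" typeof="Issuance">
>       <span property="partOfPeriodical" typeof="Periodical"
>             resource="http://www.tandfonline.com/loi/wccq20">
>         <span property="name">Cataloging &amp; Classification
>           Quarterly</span>,
>       </span>
>       <span property="issueVolume">39</span>
>       (<span property="issueNumber">3-4</span>)
>     </span>
>   </div>
>   <div typeof="Article">
>     ...
>     <span property="partOfIssue" typeof="Issuance">
>       <link property="partOfPeriodical"
>             href="http://www.tandfonline.com/loi/wccq20" />
>       Cataloging &amp; Classification Quarterly,
>       <span property="issueVolume">50</span>
>       (<span property="issueNumber">5-7</span>)
>     </span>
>   </div>
> </div>
>
> The human-readable text still repeats the journal name in each
> citation, but a processor sees two issues attached to the same
> Periodical without having to compare names or ISSNs.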
>
> 1. http://www.w3.org/community/schemabibex/wiki/Collection
>

-- 
Karen Coyle
kcoyle@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet
Received on Friday, 22 November 2013 18:17:12 UTC