Re: Comics and periodicals in schema.org (was Re: journal article for next call?)

On Sat, Dec 7, 2013 at 2:16 AM, Henry Andrews <hha1@cornell.edu> wrote:
> Hi folks- for those who don't know me, I'm the former tech lead for the GCD (comics.org), and in particular I was the tech lead when the current iteration of the database was set up.  Comments below:
>
>> From: Dan Scott <denials@gmail.com>
>>Subject: Re: Comics and periodicals in schema.org (was Re: journal article for next call?)
>>On Thu, Dec 5, 2013 at 11:57 PM, Olson, Peter <polson@marvel.com> wrote:
>>> Hi Dan -
>>Okay, I was following the basic tutorial at comics.org
>>(http://docs.comics.org/wiki/OI_Tutorial#How_do_I_create_my_first_index.3F)
>>where it mentions searching for "Muppet Show" by Series Name, and
>>three of the first four results are series for the same "The Muppet
>>Show" comic published by Boom! in 2009, with the first series having
>>four issues, and the next two series having one issue each. As I
>>warned in my initial email, I worried that I might be drawing too much
>>from that example!
>
> This illustrates a point of ongoing discussion at the GCD, which is the idea of treating "periodicals" (i.e. things like U.S. monthly-ish comics, or UK weeklies) vs "albums" (the common European format) vs "books" (collected editions, longer single publications, etc.) in separate ways.  Right now we treat them all the same, primarily based on the U.S. monthly periodical.  Of course, defining what is and isn't a "book" is incredibly contentious, up to and including the question of whether the categories are distinct or overlap.  So that one has been going around in circles for years.

Hah! It's good to know that there's no simple solution that we've been
missing, at least :)

One (albeit slightly complex) option that schema.org / RDFa offer is
the ability to mix multiple types, so that (for example) a collected
edition of comic issues could have a Comic type as its primary type
and Book as a secondary type, using properties from both. The current
proposal has a specific "GraphicNovel" type that inherits properties
from Book and adds in the Comic properties that come whole cloth from
the original Comics & Periodicals proposal, but if that section of the
proposal were rejected by the schema.org partners, the "mix Comic +
Book types" approach would still let you express what you need, I
think.
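
To make the "mix Comic + Book types" idea concrete, here is a minimal
sketch in Python. It assumes the proposed "Comic" type exists (it is
only a proposal at this point), and the helper names are hypothetical.
It builds a JSON-LD style description with two types and renders the
same list as the space-separated value that an RDFa typeof attribute
takes.

```python
import json


def mixed_type_item(name, types):
    """Build a JSON-LD style description that carries several types."""
    if not types:
        raise ValueError("at least one type is required")
    return {
        "@context": "http://schema.org/",
        # Listing "Comic" first marks it as primary by convention only;
        # the data model itself does not rank the entries.
        "@type": list(types),
        "name": name,
    }


def rdfa_typeof(types):
    """Render a type list as the value of an RDFa typeof attribute."""
    return " ".join(types)


# "Comic" is the proposed type from this thread, not an accepted one.
collected = mixed_type_item("The Muppet Show", ["Comic", "Book"])
print(json.dumps(collected, indent=2))
print('typeof="%s"' % rdfa_typeof(collected["@type"]))
```

A consumer that only understands Book can still read the Book
properties and ignore the rest, which is the appeal of this approach.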

> Several of the "Muppet Show" series matched there are collections or otherwise book-like.

Ah, I see - so the particular examples I was looking at were 1.
serialized across four issues then 2. collected in a trade paperback
format and 3. collected in a hardcover edition. Got it. And the latter
two are where the Book type properties would come in handy; looking at
http://www.amazon.ca/Muppet-Show-Comic-Book-Muppets/dp/1934506850, for
example, "isbn" and "bookFormat" and "numberOfPages" from Book might
be useful for some, along with the Comic-specific artist / penciler /
etc. properties.
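
As a rough illustration of which type supplies which property on such
a mixed item, the following Python sketch groups only the properties
named in this message. The grouping under "Comic" is an assumption
about the proposal, the function name is hypothetical, and the values
are placeholders rather than real catalogue data.

```python
# Properties mentioned in this thread, grouped by the type defining them.
# The "Comic" entries assume the proposal is accepted as written.
PROPERTIES_BY_TYPE = {
    "Book": {"isbn", "bookFormat", "numberOfPages"},
    "Comic": {"artist", "penciler"},
}


def property_sources(item):
    """Map each property of item to the declared type that supplies it.

    Properties that none of the declared types define map to None.
    """
    sources = {}
    for prop in item:
        if prop.startswith("@") or prop == "name":
            continue  # keywords and the generic name are not type-specific
        owners = [
            t for t in item["@type"]
            if prop in PROPERTIES_BY_TYPE.get(t, set())
        ]
        sources[prop] = owners[0] if owners else None
    return sources


hardcover = {
    "@type": ["Comic", "Book"],
    "name": "The Muppet Show",
    "bookFormat": "Hardcover",
    "numberOfPages": None,        # placeholder, not a real page count
    "penciler": "(placeholder)",  # placeholder, not a real credit
}

print(property_sources(hardcover))
```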

> [quoted out of order -henry]
>>>We have two distinct Amazing Spider-Man series, one which started
>>>in 1963 and one which started in 1999
>>>(http://marvel.com/comics/series/1987/amazing_spider-man_1963_-_1998
>>>and http://marvel.com/comics/series/454/amazing_spider-man_1999_-_2013).
>
>>
>>(For what it's worth, I had looked up "The Amazing
>>Spider-Man" at the time and saw that it had one huge series starting
>>in 1963, so I was confused!)
>
> Not sure what happened there, GCD has the series same as Marvel:
> http://www.comics.org/series/1570/ 1963-1998
> http://www.comics.org/series/11288/ 1999-2013

I apologize for explaining myself poorly; what I meant was that the
ComicSeries proposal includes the note: "At Marvel we use the start
year as the volume number". So I had expected to see a separate
series listed for each year.

>>I was worrying that perhaps there was no need for a Comic /
>>ComicSeries split after all. Do cases like "7 Brothers"
>>(http://www.comics.org/series/name/7%20brothers/sort/alpha/) where a
>>set of 5 issues was published in 2007, and another set of 5 issues was
>>published in 2008 justify continuing to have ComicSeries match with
>>PeriodicalVolume, and to have a separate Comic as a peer of
>>Periodical? Maybe.
>
> No, that's not something you can rely on.  Volume numbers vary widely in comics.  Early Golden Age U.S. comics would have a volume per year and reset the issue number each year.  For decades, DC would increment the volume number each year *without* resetting the issue number.  European series do something involving calendar years (I'm not sure if that's a formal volume or just the European GCD indexers' notational convention- sadly a fair chunk of what ought to be schema is still done through notation due to not enough tech volunteers to migrate the more complex notation).

Right, in the world of periodicals (both academic and comics) I think
we have all learned that we cannot rely on anything. However, the core
part of my question is: does it make sense, as I've laid out in the
current synthesized proposal at
http://www.w3.org/community/schemabibex/wiki/Periodicals_and_Comics_synthesis#Comic_Schemata,
to have a "Comic" type that is separate from the "ComicSeries" type,
so that we can handle those cases where we have the same title (Comic
level) with a different volume number (ComicSeries level) that then
collects one or more issues (ComicIssue level)?

To handle those comics that don't have a volume number, there is a
direct Comic -> ComicIssue relationship via "hasComicIssue" that
supports that structure.
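
The two shapes described above can be sketched as plain Python data
classes. Only "hasComicIssue" comes from the proposal; every other
field name here is hypothetical, and the issue numbers and the
one-shot title are placeholders. The Amazing Spider-Man volumes follow
the "start year as the volume number" note quoted earlier.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ComicIssue:
    issue_number: str


@dataclass
class ComicSeries:
    volume_number: str
    issues: List[ComicIssue] = field(default_factory=list)


@dataclass
class Comic:
    name: str
    # Comic -> ComicSeries -> ComicIssue, for titles with volumes.
    series: List[ComicSeries] = field(default_factory=list)
    # Comic -> ComicIssue, mirroring the proposal's "hasComicIssue".
    has_comic_issue: List[ComicIssue] = field(default_factory=list)

    def all_issues(self):
        """Collect issues from both shapes into one flat list."""
        via_series = [i for s in self.series for i in s.issues]
        return via_series + list(self.has_comic_issue)


# Same title, two volumes, each collecting its own issues.
spider_man = Comic(
    name="The Amazing Spider-Man",
    series=[
        ComicSeries("1963", [ComicIssue("1")]),
        ComicSeries("1999", [ComicIssue("1")]),
    ],
)

# A title with no volume number links straight to its issues.
one_shot = Comic(name="(placeholder title)",
                 has_comic_issue=[ComicIssue("1")])

print(len(spider_man.all_issues()), len(one_shot.all_issues()))
```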

> Also, note that comic book publishers, particularly early on, often did weird things with both volume and issue numbers, sometimes as a postal regulation dodge, and sometimes just because nobody cared.  My favorite pathological example is Cat-Man comics:  http://www.comics.org/series/61787/

I love the note in the Holyoke series about the "notoriously
complicated numbering scheme"!

<snip Peter's talk which I have cued up in another browser window>

>>> Definition of Comic - There's some potential for ambiguity here so I wanted to dig down on some specific examples.  Often several comic series are published simultaneously with very similar names.  For example, we currently publish the following:
>>> X-Men
>>> Uncanny X-Men
>>> Ultimate X-Men
>>> X-Men Legacy
>>>
>>> If I'm reading the proposal right, each of those would be distinct comics (each containing one or more distinct series).
>>
>>Yes, that's what I was thinking.
>>
>>> Another example - over the years we published a series of Comic Series in which the titles changed but the numbering was continuous: X-Men -> New X-Men  -> X-Men -> X-Men Legacy -> X-Men (again see the talk, which lists out a few more examples).  Under the definition in the proposal each distinct title would be a distinct Comic, correct?
>>
>>Fascinating! Yes, I think each title would be a distinct Comic in that
>>case. Maybe we'll need some sort of relatedWork mechanism sooner
>>rather than later after all. From http://docs.comics.org/wiki/Tracking
>>it looks like "Continues from" / "Continues in" covers the
>>relationships that comics.org cares about, although it carries series
>>name, publisher, and date with each relationship.
>
> I had a more fully-featured notion of tracking links that I've never had time to get through policy and implement.  Tracking links are one of those things that are still notation-based.  There's the "numbering continues from/in" concept, which is well-established (and btw, may not just link two series at each end- see: http://www.comics.org/series/177/ for a double continuation, and http://www.comics.org/series/236/ for a rather infamous numbering shell game).
>
> An important corollary of the renumbering thing is that the same name can show up again as part of a *different* comic (for lack of a better term).  The first "New X-Men" was very much a retitling of "X-Men", but the second was a book focused on the junior team, with unrelated numbering (it re-launched out of "New Mutants", itself a re-used name).
>
> The GCD defines a series more-or-less as a set of sequentially published things having the same indicia (formal/legal) title and the same "master publisher".  Where "master publisher" is another source of endless research and debate (look at the publisher for Cat-Man Comics for an example).  Particularly before about 1960 it's extremely difficult to tell whether certain publishing companies are "the same" and by what measure.  Publishing was a dodgy business.

And corners of it are still dodgy, I'm sure!  I suspect in the short
term that we'll hold off on trying to mint related work relationships
as part of the periodicals & comics proposal and bring that in as a
separate proposal. And I hope that you'll be part of that
conversation, too :)

>>How would that have been handled in the original proposal: separate
>>ComicSeries for each title change, I guess?
>>
>>> Comic Stories - because stories can be and are reprinted, the original comic issue in which they appeared should probably be identified in the schema.  For example, the Spider-Man origin story has been reprinted hundreds of times, but it's always "from" Amazing Fantasy #15.
>>
>>That sounds very reasonable; so something like an
>>"originallyPublishedIn" property that should only be used if there are
>>more than one "partOfComicIssue" / "partOfPeriodicalIssue" properties,
>>to identify the ur-comic (or periodical, as that could be useful for
>>non-comic articles as well)?

Henry - do you feel strongly (either way) about an
"originallyPublishedIn" property for ComicStory?

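To show how the "originallyPublishedIn" idea might behave, here is a
minimal Python sketch. The check function is hypothetical and encodes
only the rule floated above (use the property only when a story
belongs to more than one issue); the reprint entry is a placeholder.

```python
def check_story(story):
    """Return a list of problems with how originallyPublishedIn is used."""
    issues = story.get("partOfComicIssue", [])
    original = story.get("originallyPublishedIn")
    problems = []
    # Rule from the discussion: the property is meant for reprinted
    # stories, that is, stories that are part of more than one issue.
    if original is not None and len(issues) < 2:
        problems.append(
            "originallyPublishedIn used on a story with fewer than "
            "two partOfComicIssue values"
        )
    return problems


origin_story = {
    "name": "Spider-Man origin story",
    "partOfComicIssue": [
        "Amazing Fantasy #15",
        "(a later reprint issue, placeholder)",
    ],
    "originallyPublishedIn": "Amazing Fantasy #15",
}

never_reprinted = {
    "name": "(placeholder story)",
    "partOfComicIssue": ["(placeholder issue)"],
    "originallyPublishedIn": "(placeholder issue)",
}

print(check_story(origin_story))
print(check_story(never_reprinted))
```
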
>>> It might be worthwhile looking at the comics.org schema as well: http://docs.comics.org/wiki/Current_Schema
>>
>>As someone who cut his first-career teeth developing a relational
>>database for 8 years, *yes*, it's always worthwhile looking at
>>database schemas (I will pretend that I'm not seeing the "recalculated
>>by code on data updates" statements) :)
>>
>>Hey, there is a "Story" table in the schema. That makes me feel better
>>about having a ComicStory type, then!
>
>
> A story type is very important- for decades anthologies were the norm, not single-story issues, and in some places they still are.  One of the biggest online databases, I.N.D.U.C.K.S. (specializing in Disney comics) is centered around the story rather than the GCD's issue-centric model.

This is very good to know, thank you.

> I should also warn that that schema on the GCD wiki is out of date, although not drastically misleading.  But for instance the current tech lead just implemented a two-layer publisher's branding scheme to deal with the many difficulties we had with a single-layer system (minor logo changes caused a lot of confusion).
>
> Anyway, I apologize for rambling, hope some of this helps someone- I'm happy to answer any questions about the GCD's data model.  I'm not actively writing code for them right now but I still keep an eye on developments and may get back to it.

Please don't apologize; rather, thank you so much for your help and
your patience! Based on what you and Peter have said, I _think_ the
current proposal at
http://www.w3.org/community/schemabibex/wiki/Periodicals_and_Comics_synthesis#Comic_Schemata
can handle most of the core use cases for comics, and does not remove
any of the capabilities that were offered by the original proposal at
http://www.w3.org/wiki/WebSchemas/PeriodicalsComics.

For me, the most pressing concern is whether you and Peter, as the
comic experts of the group, support the proposed "Comic -> ComicSeries
-> ComicIssue -> ComicStory" structure, with the direct "Comic ->
ComicIssue -> ComicStory" for those comics that have no need for a
separate series, or even "Comic -> ComicStory" for single-story
original graphic novels (that might also get the Book type mixed in).

Received on Sunday, 8 December 2013 16:32:58 UTC