- From: Dan Scott <denials@gmail.com>
- Date: Mon, 9 Dec 2013 07:35:15 -0500
- To: Henry Andrews <hha1@cornell.edu>, Peter Olson <polson@marvel.com>
- Cc: "public-schemabibex@w3c.org" <public-schemabibex@w3c.org>
Hi Henry:

These are good questions to be asking; see inline, below, for my attempted responses :)

On Mon, Dec 9, 2013 at 2:17 AM, Henry Andrews <hha1@cornell.edu> wrote:
> Hi folks,
> I took a look at the
> http://www.w3.org/community/schemabibex/wiki/Periodicals_and_Comics_synthesis
> as suggested and have a few questions and comments. Some of these are
> really basic questions about the schema goals and process. I did go through
> the comics-related emails in the archive for the past few months to catch up
> a bit, but I haven't read the entire rather large volume of emails that
> didn't specifically say "comics" in the subject line. So feel free to tell
> me to go read some other documentation, or to send some answers off-list.
>
> This definitely does a good job of covering the essentials, which I gather
> is the goal. So now I'll nitpick at details :-P
>
> One really basic question is how much precision are you going for here? I
> am guessing less than the GCD, which wants all the precision :-) Do you
> have a feel for the point at which it's fine to stuff things in a "notes"
> field? In practice, this generally boils down to "should people be able to
> search for this thing?"

We're actually aiming for a fair bit of precision, although we can work towards refining the precision over time. schema.org is capable of modeling complex relationships between different types, as well as between different instances of the same type. I'll try to provide some examples below.

> Some general concerns about the definition:
> ================================
> The description of "Comics" given at
> http://www.w3.org/community/schemabibex/wiki/Periodicals_and_Comics_synthesis#Comics
> , if read literally, is extremely specific to typical modern U.S. periodical
> comics.
>
> The restrictions on binding and size, while helpful to give the general
> idea, will break down pretty rapidly looking over the entire history of US
> comics.
> It also doesn't fit typical European (nor, I imagine, Asian)
> formats all that well, although perhaps some of those fit better under
> GraphicNovel? Is the goal here to handle published sequential art in
> general, or just the US market and things that are similar enough, with
> other schemas for bande-dessinée, manga, etc.?

Hmm. Most of the comics-related portions of the proposal you're looking at are either based on the original January 2012 proposal that was floated to schema.org (http://www.w3.org/wiki/WebSchemas/PeriodicalsComics) or taken from Peter's earlier reply on this list. The particular example that I suspect is bothering you ("short form, saddle-stitched, usually comes in pamphlet form") was also used in the introductory material to the original proposal.

However (and this is hopefully good news), that was meant only to serve as an introduction for the proposal. The actual content that users of schema.org would see if the proposal were adopted is the description under the actual definition of the type; so, for ComicIssue, down at
http://www.w3.org/community/schemabibex/wiki/Periodicals_and_Comics_synthesis#Thing_.3E_CreativeWork_.3E_PeriodicalIssue_.3E_ComicIssue
the description is:

"Individual comic issues are serially published as part of a larger series (for the sake of consistency, even one-shot issues belong to a series comprised of a single issue). All comic issues can be uniquely identified by the combination of the name and volume number of the series to which the issue belongs; the issue number; and the variant description of the issue (if it exists)."

> Hierarchy:
> =======
> I see that each level (Comic, ComicSeries, ComicIssue, ComicStory) can link
> to all of the levels above or below it. Is this just to support the full
> range of possible "joins" (to borrow from SQL) more easily? Or do you
> expect that some levels will be omitted?
> Would a comic published as a
> one-shot (per the indicia) with only one story in it just have a Comic and
> a ComicIssue and no ComicSeries or ComicStory?

Peter had made it clear earlier that many comics do not follow a strict Comic / ComicSeries / ComicIssue hierarchy. In the original Comics proposal, there was only ComicSeries and ComicIssue, but in digging into the possible permutations it seemed to me as though there was a need to break out ComicStory as its own thing (to support the description of multiple stories published in a single issue, as well as to support individual stories that get republished elsewhere). I have been less sure about the need for a separate Comic vs. ComicSeries type, but wanted to start with that distinction and then collapse it if it does not hold up under scrutiny.

I expect that an automated publishing system like comics.org would simply mark up that one-shot example on the series page (http://www.comics.org/series/76838/) using the full Comic / ComicSeries / ComicIssue / ComicStory hierarchy, something like:

<div vocab="http://schema.org/" typeof="Comic">
  <h1 property="name">All-New X-Men Special</h1>
  <div property="hasComicSeries" typeof="ComicSeries">
    <span property="volumeNumber">2013</span> Series
    <div property="hasComicIssue" typeof="ComicIssue">
      Issue #<span property="issueNumber">1</span>
      <div property="hasComicStory" typeof="ComicStory">
        <span property="description">The X-Men are in the arms of the Octopus...</span>
      </div>
    </div>
  </div>
</div>

Although that could also be flattened out as follows:

<div vocab="http://schema.org/" typeof="Comic">
  <h1 property="name">All-New X-Men Special</h1>
  <p property="hasComicSeries" typeof="ComicSeries">
    <span property="volumeNumber">2013</span> Series
  </p>
  <p property="hasComicIssue" typeof="ComicIssue">
    Issue #<span property="issueNumber">1</span>
  </p>
  <p property="hasComicStory" typeof="ComicStory">
    <span property="description">The X-Men are in the arms of the Octopus...</span>
  </p>
</div>

That said,
the existing proposal would let you mark it up as just a Comic and ComicIssue. A publishing system like comics.org would then have to use a separate template for one-shots, and it might be a little more difficult for search engines to retrieve the intended semantics ('If the ComicIssue has no described ComicStory, then assume that the descriptive properties like "description" and "name" actually describe an implicit ComicStory...').

> Is this why ComicIssue and
> ComicStory have many duplicate fields?

That's one of the reasons, yes. Note that the original Comics proposal also duplicated most of those fields across ComicSeries and ComicIssue, so I was just attempting to maintain the status quo there.

> How are searches expected to handle creator data being at either of two
> possible levels? (again, apologies if this is obvious to folks who have
> been working on this stuff for a while).

Implementations will differ from one search engine to the next, depending on how each one crawls the pages marked up with schema.org. In Google Custom Search, for example, you can filter on properties of different types, so if you wanted to search for ComicIssues or ComicStories where the creator was "Joss Whedon", that would avoid pulling up results where the Comic was created by Joss Whedon but the issues and stories were actually written by someone else. Alternately, I believe you can search for the same property across a number of types.

> I see in the examples that these things are shown nesting in XML. Does this
> mean that none of the connections are many-to-many?

You can actually repeat the "has*" and "partOf*" properties, so you can (for example) link a given story to many different issues, series, and graphic novels. And you can link a given Comic to many different series and issues and stories. I _think_ this offers the flexibility you're looking for.

> With issues and stories
> that can be useful for variants (although that's not how the GCD does it on
> the back-end).
> There are examples of issues as part of multiple series (the
> GCD has never implemented that, although there are intentions). Of course
> duplicating data is also an option- if that's the plan for variants then
> you're probably fine with it for the series case- it's fairly rare. I can
> pull up examples if anyone wants some, though.
>
> I've already commented that I think the "Comic" concept as stated is a bit
> problematic, although I'm still contemplating that. Probably worth its own
> thread, maybe tomorrow.

Agreed (both that it is a bit problematic, and that it deserves a separate thread).

> ComicIssue and ComicStory:
> =====================
> I noticed discussion of an Article type. Is there a particular reason why
> ComicStory does not correlate to Article?

I think I originally made ComicStory a subclass of Article, then opted against that for the sake of simplicity. The only property I was really interested in for ComicStory that would have come from Article was "pagination", so it seemed easier to just define it as a direct descendant of CreativeWork. However, that could easily be changed!

> At the GCD we have found it much easier to model the cover as a type of
> ComicStory. They need the entire set of credits (especially when you get to
> cases like http://www.comics.org/issue/85/cover/4/ where the cover is a
> complete story, or ones where the cover is the first page of a story that
> continues inside (I can't recall an example off the top of my head)).

Hah! The cover as a complete story is an awesome example. You may be interested in my proposal for enabling better markup of cover art at http://lists.w3.org/Archives/Public/public-schemabibex/2013Nov/0091.html (in short, this would add a coverArt property to all CreativeWorks, pointing at a full type so that you can distinguish between variants, for example; in the cover-as-story case you could apply the alternate type "ComicStory" to it as well).
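
Going back to the many-to-many question for a moment, here is a minimal sketch of how repeating a property could link one story to two issues. It assumes the proposed (not adopted) ComicIssue and ComicStory types and the hasComicStory property from the synthesis page; the "#issue-1", "#issue-2", and "#story-a" identifiers, the issue numbers, and the story name are placeholders I made up for illustration:

```html
<div vocab="http://schema.org/">
  <!-- Two hypothetical issues, each pointing at the same story resource -->
  <div typeof="ComicIssue" resource="#issue-1">
    Issue #<span property="issueNumber">1</span>
    <link property="hasComicStory" href="#story-a">
  </div>
  <div typeof="ComicIssue" resource="#issue-2">
    Issue #<span property="issueNumber">2</span>
    <link property="hasComicStory" href="#story-a">
  </div>

  <!-- The story is described once and referenced from both issues -->
  <div typeof="ComicStory" resource="#story-a">
    <span property="name">Placeholder story title</span>
  </div>
</div>
```

Because the story carries its own identifier, it does not have to be nested inside either issue, which is what would let the same approach cover reprints and (if we go that way) variants.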
> Looks like multiple contributors of the same role should work fine here.
> What about pen names? Is this intended to record it as credited, by an
> authoritative name, or both? I'm assuming "Person" handles some notion of
> name changes or nicknames/sobriquets. I should probably go find the
> definition of the Person type...

Yeah... in an ideal world, you link to a Person which can in turn link to something like http://viaf.org/viaf/34481281/ or the ISNI equivalent, so that you (or more accurately the search engine) can see that Alan Moore also published works as Curt Vile and Jill de Ray and do intelligent things with that.

> Is there any interest in capturing information about editors or other roles?

Good news: because all of the Comic types are subclasses of http://schema.org/CreativeWork, we get all the roles defined there (such as "editor") for free! Of course there is always room for more roles.

> I think it would be a good idea to allow job codes as a local ID space on
> stories similar to distributor codes on issues.
> Here's an illustration of INDUCKS' prominent use of job/story codes:
> http://coa.inducks.org/index.php
> AtlasTales http://atlastales.com/search and the GCD also allow searching by
> job codes.

There is a way of defining an external enumeration that would be more restrictive, but that external enumeration needs to exist and be openly available to point at. Alternately, you can define an enumeration within schema.org, but that causes more churn for schema.org and seems not to be working out all that well thus far. Are the comic-related job codes openly available and reasonably broadly used?

> Is the "format" field free-form? That's given the GCD a lot of headache
> over the years, although it depends on the attempted resolution. If you're
> just going for comic vs book vs album vs (I don't know what the Asian formats
> are) you'll probably get reasonable data.
Yes, at this point (following the original Comics proposal), format is free-form "Text". Again, if there's a good external enumeration we could point at that already exists and covers most use cases, that would be a great enhancement!

> =====
> I had another section on "imprint" but decided that it could use its own
> thread. I'll post that shortly. I'll also write separately on GraphicNovel
> at some point.

Great, thanks Henry!
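
P.S. On pen names and editors, a rough sketch of what I have in mind, assuming the proposed ComicStory type plus the existing schema.org "author", "editor", "alternateName", and "sameAs" properties. The story title and the editor name are placeholders; only the VIAF URL and the Alan Moore / Curt Vile pairing come from the discussion above:

```html
<div vocab="http://schema.org/" typeof="ComicStory">
  <!-- Placeholder title, not a real story -->
  <span property="name">Placeholder story title</span>

  <!-- Credited pen name plus authoritative name, tied to an authority record -->
  <div property="author" typeof="Person">
    Credited as <span property="alternateName">Curt Vile</span>
    (<span property="name">Alan Moore</span>)
    <link property="sameAs" href="http://viaf.org/viaf/34481281/">
  </div>

  <!-- "editor" is inherited from CreativeWork; the name is a placeholder -->
  <div property="editor" typeof="Person">
    Editor: <span property="name">Placeholder editor name</span>
  </div>
</div>
```

Whether the as-credited name or the authoritative name should sit in "name" is an open question; this sketch just shows that both can be recorded on the same Person.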
Received on Monday, 9 December 2013 12:35:44 UTC