- From: Jarno van Driel <jarno@quantumspork.nl>
- Date: Thu, 17 Apr 2014 20:44:26 +0200
- To: Jason Douglas <jasondouglas@google.com>
- Cc: Public Vocabs <public-vocabs@w3.org>
- Message-ID: <CAFQgrbZWEtwJhNuaY04kHK5wKZ7QDZfMC958aqkOfiWJKCB0rA@mail.gmail.com>
"Let's fix it!"
You said it, hehe.

The simplest way I can come up with is by having a 'mainType' (or mainContent, mainEntity, mainItem, etc.) property with an expected type of Thing, with which one can express:

WebPage > mainType > Product
CollectionPage > mainType > ItemList > itemListElement > ImageObject

The question then becomes what to do with a type like WebPageElement (and its subclasses). Do we connect the entities it contains with @about, @mentions, or a new property like @hasType or @hasPart? And how do we connect (subclasses of) WebPageElement to WebPage?

On Thu, Apr 17, 2014 at 8:25 PM, Jason Douglas <jasondouglas@google.com> wrote:

> Yup, that's messed up. Let's fix it!
>
>
> On Thu Apr 17 2014 at 11:15:43 AM, Jarno van Driel <jarno@quantumspork.nl>
> wrote:
>
>> "...if a relation is declared without an explicit subject, then the
>> subject will be assumed to be the current WebPage."
>> Got it.
>>
>> "It is legal for there to be multiple top-level entities." + "Current
>> clients make up their own heuristics for this..."
>> Brainfreeze!
>> How am I, as a developer, to deal with this? Does this mean I have to
>> somehow figure out which heuristics every parser/search engine uses to be
>> able to have control, or do I need to try to chain everything together such
>> that only one top-level entity is left?
>>
>> And how would I do this for a category page on, for example, an eCommerce
>> site, which shows a range of Product entities on a CollectionPage that
>> together form the main content, and where the CollectionPage, for lack of a
>> better term, only functions as a 'wrapper' for the list of products?
>>
>> "we probably do need a mechanism for indicating the "primary entity" of
>> a webpage when there is one..."
>> One of the reasons why I asked my questions is because I encounter quite a
>> lot of markup on websites where people use @mainContentOfPage on entities
>> like Product.
>> Now @mainContentOfPage has the expected type WebPageElement,
>> but many aren't aware of this. And since there is no property to indicate
>> which entity is the primary one, I can actually completely understand why
>> they try to resolve it like this. And frankly, I'm confused as well.
>>
>>
>>
>> On Thu, Apr 17, 2014 at 7:51 PM, Jason Douglas <jasondouglas@google.com> wrote:
>>
>>> It is legal for there to be multiple top-level entities. That
>>> description of WebPage is not meant to imply anything about the
>>> relationship of those top-level objects... all that is saying is that if a
>>> relation is declared without an explicit subject, then the subject will be
>>> assumed to be the current WebPage.
>>>
>>> That said, we probably do need a mechanism for indicating the "primary
>>> entity" of a webpage when there is one. Current clients make up their own
>>> heuristics for this, but I think it would be better to have an explicit way
>>> of stating that.
>>>
>>> -jason
>>>
>>>
>>> On Thu Apr 17 2014 at 10:41:47 AM, Jarno van Driel <
>>> jarno@quantumspork.nl> wrote:
>>>
>>>> I'm trying to understand semantic mechanisms better but am a bit
>>>> confused about schema.org/WebPage and I'd like to know how it works.
>>>>
>>>> Now it could well be that I misunderstand certain terminology, so
>>>> please bear with me and be so kind as to correct me when needed.
>>>>
>>>> 1] The description of http://schema.org/WebPage says:
>>>> "Every web page is implicitly assumed to be declared to be of type
>>>> WebPage, so the various properties about that webpage, such as breadcrumb
>>>> may be used. We recommend explicit declaration if these properties are
>>>> specified, but if they are found outside of an itemscope, they will be
>>>> assumed to be about the page."
>>>>
>>>> code example:
>>>> <body itemscope itemtype="http://schema.org/WebPage">
>>>>   <!-- Content -->
>>>> </body>
>>>>
>>>> Now if the WebPage is the only entity, is it then considered to be the
>>>> 'Subject', the 'Object' or both?
>>>>
>>>> 2] If the WebPage contains an entity, let's say a Product, without
>>>> specifying a property on the Product and I check this with Google's SDTT, I
>>>> see 2 'root' entities, since there is no property to chain the two
>>>> together. Yet I get the impression the Product gets treated as the
>>>> 'Object', since it's the Product that gets used for Rich snippet
>>>> extraction, and that therefore the WebPage is the 'Subject':
>>>>
>>>> code example:
>>>> <body itemscope itemtype="http://schema.org/WebPage">
>>>>   <span itemprop="name">Page title</span>
>>>>
>>>>   <div itemscope itemtype="http://schema.org/Product">
>>>>     <span itemprop="name">Product name</span>
>>>>     <!-- Product properties -->
>>>>   </div>
>>>> </body>
>>>>
>>>> Now since "Every web page is implicitly assumed to be declared to be of
>>>> type WebPage" I was wondering if there also is a property that is
>>>> 'implicitly assumed to be declared' (something like @contains) on the first
>>>> entity that comes after it, like Product in this case, which indicates that
>>>> the Product is the 'Object'?
>>>>
>>>> And if not, then how does a parser 'know' which of the entities is the
>>>> 'Subject' and which is the 'Object'? Shouldn't there be a predicate for
>>>> this?
>>>>
>>>> 3] When a WebPage contains a bunch of 'root' entities, how does a
>>>> parser make sense of this? Does the DOM have anything to do with this?
>>>>
>>>> <body itemscope itemtype="http://schema.org/WebPage">
>>>>   <span itemprop="name">Page title</span>
>>>>
>>>>   <div itemscope itemtype="http://schema.org/Product">
>>>>     <span itemprop="name">Product 1 name</span>
>>>>     <!-- Product properties -->
>>>>   </div>
>>>>
>>>>   <div itemscope itemtype="http://schema.org/Product">
>>>>     <span itemprop="name">Product 2 name</span>
>>>>     <!-- Product properties -->
>>>>   </div>
>>>>
>>>>   <div itemscope itemtype="http://schema.org/LocalBusiness">
>>>>     <span itemprop="name">Business name</span>
>>>>     <!-- LocalBusiness properties -->
>>>>   </div>
>>>> </body>
>>>>
>>>> Now the above could be full of misunderstandings because I still lack
>>>> theoretical knowledge, but that's exactly the thing I'm hoping to
>>>> change. Who can enlighten me?
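
As a rough illustration of the 'mainType' idea from the top of this message, the markup below chains a Product to its WebPage so that only one top-level entity remains. Note that 'mainType' is a hypothetical property name used only for this sketch; it is not defined by schema.org, so parsers should not be expected to act on it.

```html
<body itemscope itemtype="http://schema.org/WebPage">
  <span itemprop="name">Page title</span>

  <!-- 'mainType' is hypothetical: a proposed property with expected type Thing -->
  <div itemprop="mainType" itemscope itemtype="http://schema.org/Product">
    <span itemprop="name">Product name</span>
    <!-- Product properties -->
  </div>
</body>
```

Because the Product now carries an itemprop, it becomes a property value of the WebPage instead of a second 'root' entity, which would express WebPage > mainType > Product explicitly rather than leaving it to each client's heuristics.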
Received on Thursday, 17 April 2014 18:44:54 UTC