- From: Thad Guidry <thadguidry@gmail.com>
- Date: Sun, 20 Apr 2014 10:10:09 -0500
- To: "Wallis,Richard" <Richard.Wallis@oclc.org>
- Cc: Jarno van Driel <jarno@quantumspork.nl>, Jocelyn Fournier <jocelyn.fournier@gmail.com>, Jason Douglas <jasondouglas@google.com>, "<public-vocabs@w3.org>" <public-vocabs@w3.org>
- Message-ID: <CAChbWaOKY+g-rcMnASXU8UP15CT0AA7KuBK4Wij1gqkvVtJOdQ@mail.gmail.com>
Lest we forget... A webpage IS an HTML document. It can contain multiple
sections about multiple subjects. Reminder FYI.

Thanks for asking these kinds of questions, Jarno. Keep 'em coming!

On Sun, Apr 20, 2014 at 4:13 AM, Wallis,Richard <Richard.Wallis@oclc.org> wrote:

> I would suggest that it should read “… then the subject will be assumed
> to be the current *webpage*”.
>
> The subject of that webpage can be of any Type - Person, Place,
> Organization, WebPage or one of its more focused subtypes - that is being
> described.
>
> By defining the type, you are doing the right thing; otherwise some
> process or person parsing the data would just have to assume that it
> is ‘just a WebPage’.
>
> ~Richard
>
> On 20 Apr 2014, at 02:05, Jarno van Driel <jarno@quantumspork.nl> wrote:
>
> "all that is saying is that if a relation is declared without an
> explicit subject, then the subject will be assumed to be the current
> WebPage"
>
> So if the WebPage is only there to function as a sort of 'catch all',
> then why does it even have subClasses?
>
> I actually use WebPage and its subClasses in an attempt to do the right
> thing, but is there a point in doing this? Does it actually matter if I
> declare the page type?
>
>
> On Thu, Apr 17, 2014 at 9:03 PM, Jocelyn Fournier
> <jocelyn.fournier@gmail.com> wrote:
>
>> On 17/04/2014 20:25, Jason Douglas wrote:
>>
>>> Yup, that's messed up. Let's fix it!
>>
>> Hi,
>>
>> Note that the examples regarding mainContentOfPage on schema.org are
>> also misleading.
>> E.g.: http://schema.org/Table
>> => <meta itemprop="mainContentOfPage" content="true"/>
>>
>> I would have expected
>> <div itemprop="mainContentOfPage" itemscope itemtype="http://schema.org/Table">
>>
>> But I fully agree it would be much more useful to have mainContentOfPage
>> as a 'Thing' rather than 'WebPageElement'
>>
>> Jocelyn
>>
>>> On Thu Apr 17 2014 at 11:15:43 AM, Jarno van Driel
>>> <jarno@quantumspork.nl> wrote:
>>>
>>> "...if a relation is declared without an explicit subject, then the
>>> subject will be assumed to be the current WebPage."
>>> Got it.
>>>
>>> "It is legal for there to be multiple top-level entities." +
>>> "Current clients make up their own heuristics for this..."
>>> Brainfreeze!
>>> How am I, as a developer, to deal with this? Does this mean I have
>>> to somehow figure out which heuristics every parser/search engine
>>> uses to be able to have control, or do I need to try to chain
>>> everything together such that only one top-level entity is left?
>>>
>>> And how would I do this for a category page on, for example, an
>>> eCommerce site, which shows a range of Product entities on a
>>> CollectionPage that together form the main content, and where the
>>> CollectionPage, for lack of a better term, only functions as a
>>> 'wrapper' for the list of products?
>>>
>>> "we probably do need a mechanism for indicating the "primary entity"
>>> of a webpage when there is one..."
>>> One of the reasons why I asked my questions is that I encounter
>>> quite a lot of markup on websites where people use @mainContentOfPage
>>> on entities like Product. Now @mainContentOfPage has the expected
>>> type WebPageElement, but many aren't aware of this. And since there
>>> is no property to indicate which entity is the primary one, I
>>> can actually completely understand that they try to resolve it like
>>> this. And frankly, I'm confused as well.
>>>
>>> On Thu, Apr 17, 2014 at 7:51 PM, Jason Douglas
>>> <jasondouglas@google.com> wrote:
>>>
>>>     It is legal for there to be multiple top-level entities. That
>>>     description of WebPage is not meant to imply anything about the
>>>     relationship of those top-level objects... all that is saying is
>>>     that if a relation is declared without an explicit subject, then
>>>     the subject will be assumed to be the current WebPage.
>>>
>>>     That said, we probably do need a mechanism for indicating the
>>>     "primary entity" of a webpage when there is one. Current
>>>     clients make up their own heuristics for this, but I think it
>>>     would be better to have an explicit way of stating that.
>>>
>>>     -jason
>>>
>>>
>>>     On Thu Apr 17 2014 at 10:41:47 AM, Jarno van Driel
>>>     <jarno@quantumspork.nl> wrote:
>>>
>>>         I'm trying to understand semantic mechanisms better but am a
>>>         bit confused about schema.org/WebPage
>>>         <http://schema.org/WebPage> and I'd like to know how it
>>>         works.
>>>
>>>         Now it could well be that I understand certain terminology
>>>         wrongly, so please bear with me and be so kind as to correct
>>>         me when needed.
>>>
>>>         1] The description of http://schema.org/WebPage says:
>>>         "Every web page is implicitly assumed to be declared to be
>>>         of type WebPage, so the various properties about that
>>>         webpage, such as breadcrumb may be used. We recommend
>>>         explicit declaration if these properties are specified, but
>>>         if they are found outside of an itemscope, they will be
>>>         assumed to be about the page."
>>>
>>>         code example:
>>>         <body itemscope itemtype="http://schema.org/WebPage">
>>>           <!-- Content -->
>>>         </body>
>>>
>>>         Now if the WebPage is the only entity, is it then considered
>>>         to be the 'Subject', the 'Object' or both?
>>>
>>>         2] If the WebPage contains an entity, let's say a Product,
>>>         without specifying a property on the Product, and I check
>>>         this with Google's SDTT, I see 2 'root' entities, since
>>>         there is no property to chain the two together. Yet I get
>>>         the impression the Product gets treated as the 'Object',
>>>         since it's the Product that gets used for Rich snippet
>>>         extraction, and that therefore the WebPage is the 'Subject':
>>>
>>>         code example:
>>>         <body itemscope itemtype="http://schema.org/WebPage">
>>>           <span itemprop="name">Page title</span>
>>>
>>>           <div itemscope itemtype="http://schema.org/Product">
>>>             <span itemprop="name">Product name</span>
>>>             <!-- Product properties -->
>>>           </div>
>>>         </body>
>>>
>>>         Now since "Every web page is implicitly assumed to be
>>>         declared to be of type WebPage" I was wondering if there
>>>         also is a property that is 'implicitly assumed to be
>>>         declared' (something like @contains) on the first entity
>>>         that comes after it, like Product in this case, which
>>>         indicates that the Product is the 'Object'?
>>>
>>>         And if not, then how does a parser 'know' which of the
>>>         entities is the 'Subject' and which is the 'Object'?
>>>         Shouldn't there be a predicate for this?
>>>
>>>         3] When a WebPage contains a bunch of 'root' entities, how
>>>         does a parser make sense of this? Does the DOM have anything
>>>         to do with this?
>>>
>>>         <body itemscope itemtype="http://schema.org/WebPage">
>>>           <span itemprop="name">Page title</span>
>>>
>>>           <div itemscope itemtype="http://schema.org/Product">
>>>             <span itemprop="name">Product 1 name</span>
>>>             <!-- Product properties -->
>>>           </div>
>>>
>>>           <div itemscope itemtype="http://schema.org/Product">
>>>             <span itemprop="name">Product 2 name</span>
>>>             <!-- Product properties -->
>>>           </div>
>>>
>>>           <div itemscope itemtype="http://schema.org/LocalBusiness">
>>>             <span itemprop="name">Business name</span>
>>>             <!-- LocalBusiness properties -->
>>>           </div>
>>>         </body>
>>>
>>>         Now the above could be full of misunderstandings because I
>>>         still lack theoretical knowledge, but that's exactly the
>>>         thing I'm hoping to change. Who can enlighten me?
>>>

-- 
-Thad
+ThadGuidry <https://www.google.com/+ThadGuidry>
Thad on LinkedIn <http://www.linkedin.com/in/thadguidry/>
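
For reference, here is a minimal sketch of the "chain everything together such that only one top-level entity is left" option raised in the quoted thread. It assumes that the `about` property (defined on CreativeWork, which WebPage inherits from) is an acceptable link between the page and the thing it describes. The thread itself does not settle on a property for marking a primary entity, so treat this as an illustration of chaining, not as a recommended pattern.

```html
<body itemscope itemtype="http://schema.org/WebPage">
  <span itemprop="name">Page title</span>

  <!-- Assumption: "about" is used as the linking property.
       Because this item carries an itemprop, it becomes a property
       value of the enclosing WebPage instead of a second top-level item. -->
  <div itemprop="about" itemscope itemtype="http://schema.org/Product">
    <span itemprop="name">Product name</span>
    <!-- Product properties -->
  </div>
</body>
```

With this markup a microdata parser should report a single top-level item (the WebPage) whose `about` value is the Product, where the unchained version yields two unrelated 'root' entities.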
Received on Sunday, 20 April 2014 15:10:37 UTC