- From: Wallis,Richard <Richard.Wallis@oclc.org>
- Date: Sun, 20 Apr 2014 09:13:39 +0000
- To: Jarno van Driel <jarno@quantumspork.nl>
- CC: Jocelyn Fournier <jocelyn.fournier@gmail.com>, Jason Douglas <jasondouglas@google.com>, "<public-vocabs@w3.org>" <public-vocabs@w3.org>
- Message-ID: <808A0125-4942-4413-872A-2963A0B56E05@oclc.org>
I would suggest that it should read “… then the subject will be assumed to be the current webpage”. The subject of that webpage, the thing being described, can be of any type: Person, Place, Organization, WebPage or one of its more focused subtypes. By defining the type you are doing the right thing; otherwise some process or person parsing the data would just have to assume that it is ‘just a WebPage’.

~Richard

On 20 Apr 2014, at 02:05, Jarno van Driel <jarno@quantumspork.nl> wrote:

"all that is saying is that if a relation is declared without an explicit subject, then the subject will be assumed to be the current WebPage"

So if the WebPage is only there to function as a sort of 'catch all', then why does it even have subclasses? I actually use WebPage and its subclasses in an attempt to do the right thing, but is there a point in doing this? Does it actually matter if I declare the page type?

On Thu, Apr 17, 2014 at 9:03 PM, Jocelyn Fournier <jocelyn.fournier@gmail.com> wrote:

On 17/04/2014 20:25, Jason Douglas wrote:

Yup, that's messed up. Let's fix it!

Hi,

Note that the examples regarding mainContentOfPage on schema.org are also misleading. E.g. http://schema.org/Table shows

    <meta itemprop="mainContentOfPage" content="true"/>

whereas I would have expected

    <div itemprop="mainContentOfPage" itemscope itemtype="http://schema.org/Table">

But I fully agree it would be much more useful to have mainContentOfPage as a 'Thing' rather than 'WebPageElement'.

Jocelyn

On Thu Apr 17 2014 at 11:15:43 AM, Jarno van Driel <jarno@quantumspork.nl> wrote:

"...if a relation is declared without an explicit subject, then the subject will be assumed to be the current WebPage."
Got it.

"It is legal for there to be multiple top-level entities." + "Current clients make up their own heuristics for this..."
Brainfreeze!
How am I, as a developer, to deal with this? Does this mean I have to somehow figure out which heuristics every parser/search engine uses in order to have control, or do I need to try to chain everything together such that only one top-level entity is left? And how would I do this for a category page on, for example, an eCommerce site? Such a page shows a range of Product entities on a CollectionPage; together they form the main content, and the CollectionPage, for lack of a better term, only functions as a 'wrapper' for the list of products.

"we probably do need a mechanism for indicating the "primary entity" of a webpage when there is one..."

One of the reasons why I asked my questions is that I encounter quite a lot of markup on websites where people use @mainContentOfPage on entities like Product. Now @mainContentOfPage has the expected type WebPageElement, but many aren't aware of this. And since there is no property to indicate which entity is the primary one, I can completely understand why they try to resolve it like this. And frankly, I'm confused as well.

On Thu, Apr 17, 2014 at 7:51 PM, Jason Douglas <jasondouglas@google.com> wrote:

It is legal for there to be multiple top-level entities. That description of WebPage is not meant to imply anything about the relationship of those top-level objects... all that is saying is that if a relation is declared without an explicit subject, then the subject will be assumed to be the current WebPage.

That said, we probably do need a mechanism for indicating the "primary entity" of a webpage when there is one. Current clients make up their own heuristics for this, but I think it would be better to have an explicit way of stating that.
-jason

On Thu Apr 17 2014 at 10:41:47 AM, Jarno van Driel <jarno@quantumspork.nl> wrote:

I'm trying to understand semantic mechanisms better but am a bit confused about schema.org/WebPage, and I'd like to know how it works. It could well be that I misunderstand certain terminology, so please bear with me and be so kind as to correct me when needed.

1] The description of http://schema.org/WebPage says: "Every web page is implicitly assumed to be declared to be of type WebPage, so the various properties about that webpage, such as breadcrumb may be used. We recommend explicit declaration if these properties are specified, but if they are found outside of an itemscope, they will be assumed to be about the page."

code example:

    <body itemscope itemtype="http://schema.org/WebPage">
      <!-- Content -->
    </body>

Now if the WebPage is the only entity, is it then considered to be the 'Subject', the 'Object' or both?

2] If the WebPage contains an entity, let's say a Product, without specifying a property on the Product, and I check this with Google's SDTT, I see 2 'root' entities, since there is no property to chain the two together.
Yet I get the impression the Product gets treated as the 'Object', since it's the Product that gets used for rich snippet extraction, and that therefore the WebPage is the 'Subject':

code example:

    <body itemscope itemtype="http://schema.org/WebPage">
      <span itemprop="name">Page title</span>
      <div itemscope itemtype="http://schema.org/Product">
        <span itemprop="name">Product name</span>
        <!-- Product properties -->
      </div>
    </body>

Now since "Every web page is implicitly assumed to be declared to be of type WebPage", I was wondering if there is also a property that is 'implicitly assumed to be declared' (something like @contains) on the first entity that comes after it, like Product in this case, which indicates that the Product is the 'Object'? And if not, then how does a parser 'know' which of the entities is the 'Subject' and which is the 'Object'? Shouldn't there be a predicate for this?

3] When a WebPage contains a bunch of 'root' entities, how does a parser make sense of this? Does the DOM have anything to do with this?

    <body itemscope itemtype="http://schema.org/WebPage">
      <span itemprop="name">Page title</span>
      <div itemscope itemtype="http://schema.org/Product">
        <span itemprop="name">Product 1 name</span>
        <!-- Product properties -->
      </div>
      <div itemscope itemtype="http://schema.org/Product">
        <span itemprop="name">Product 2 name</span>
        <!-- Product properties -->
      </div>
      <div itemscope itemtype="http://schema.org/LocalBusiness">
        <span itemprop="name">Business name</span>
        <!-- LocalBusiness properties -->
      </div>
    </body>

Now the above could be full of misunderstandings because I still lack theoretical knowledge, but that's exactly the thing I'm hoping to change. Who can enlighten me?
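The 'root' entities discussed in questions 2] and 3] can be made concrete with a small sketch. It assumes the usual microdata reading that an item is top-level when its element carries itemscope but no itemprop; the class and function names below are hypothetical, and this is not how any particular search engine or the SDTT works, only a way to see which items are left unchained.

```python
from html.parser import HTMLParser


class TopLevelItemFinder(HTMLParser):
    """Hypothetical helper: records the itemtype of every element that has
    itemscope but no itemprop, i.e. an item not chained to a parent item."""

    def __init__(self):
        super().__init__()
        self.top_level_types = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; a valueless attribute
        # such as itemscope arrives with value None.
        attr_map = dict(attrs)
        if "itemscope" in attr_map and "itemprop" not in attr_map:
            self.top_level_types.append(attr_map.get("itemtype"))


def top_level_types(html):
    """Return the itemtype values of all top-level microdata items, in document order."""
    finder = TopLevelItemFinder()
    finder.feed(html)
    finder.close()
    return finder.top_level_types


# Markup from question 2]: the Product has no itemprop, so it is not chained.
UNCHAINED = """
<body itemscope itemtype="http://schema.org/WebPage">
  <span itemprop="name">Page title</span>
  <div itemscope itemtype="http://schema.org/Product">
    <span itemprop="name">Product name</span>
  </div>
</body>
"""

# Jocelyn's expected Table markup, placed inside a WebPage for illustration.
CHAINED = """
<body itemscope itemtype="http://schema.org/WebPage">
  <div itemprop="mainContentOfPage" itemscope itemtype="http://schema.org/Table">
  </div>
</body>
"""

print(top_level_types(UNCHAINED))  # ['http://schema.org/WebPage', 'http://schema.org/Product']
print(top_level_types(CHAINED))    # ['http://schema.org/WebPage']
```

Under that assumption, the first page yields two top-level items and the second only one, which matches the 2 'root' entities reported above. Which of several top-level items a consumer treats as the primary one is exactly the heuristic the thread says is left to each client.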
Received on Sunday, 20 April 2014 09:14:13 UTC