Re: working of schema.org/WebPage

From: Wallis,Richard <Richard.Wallis@oclc.org>
Date: Sun, 20 Apr 2014 09:13:39 +0000
To: Jarno van Driel <jarno@quantumspork.nl>
CC: Jocelyn Fournier <jocelyn.fournier@gmail.com>, Jason Douglas <jasondouglas@google.com>, "<public-vocabs@w3.org>" <public-vocabs@w3.org>
Message-ID: <808A0125-4942-4413-872A-2963A0B56E05@oclc.org>
I would suggest that it should read “… then the subject will be assumed to be the current webpage”.

The subject of that webpage can be of any type that is being described: Person, Place, Organization, WebPage, or one of its more focused subtypes.

By defining the type, you are doing the right thing; otherwise, some process or person parsing the data would just have to assume that it is ‘just a WebPage’.

~Richard

On 20 Apr 2014, at 02:05, Jarno van Driel <jarno@quantumspork.nl> wrote:

"all that is saying is that if a relation is declared without an explicit subject, then the subject will be assumed to be the current WebPage"

So if the WebPage is only there to function as a sort of 'catch all', then why does it even have subClasses?

I actually use WebPage and its subClasses in an attempt to do the right thing, but is there a point in doing this? Does it actually matter if I declare the page type?


On Thu, Apr 17, 2014 at 9:03 PM, Jocelyn Fournier <jocelyn.fournier@gmail.com> wrote:
On 17/04/2014 20:25, Jason Douglas wrote:

Yup, that's messed up.  Let's fix it!

Hi,

Note that the examples regarding mainContentOfPage on schema.org are also misleading.
E.g.: http://schema.org/Table
=>  <meta itemprop="mainContentOfPage" content="true"/>

I would have expected
<div itemprop="mainContentOfPage" itemscope itemtype="http://schema.org/Table">


But I fully agree it would be much more useful to have mainContentOfPage as a 'Thing' rather than 'WebPageElement'.

  Jocelyn


On Thu Apr 17 2014 at 11:15:43 AM, Jarno van Driel
<jarno@quantumspork.nl> wrote:

    "...if a relation is declared without an explicit subject, then the
    subject will be assumed to be the current WebPage."
    Got it.

    "It is legal for there to be multiple top-level entities." +
    "Current clients make up their own heuristics for this..."
    Brainfreeze!
    How am I, as a developer, to deal with this? Does this mean I have
    to somehow figure out which heuristics every parser/search engine
    uses to be able to have control, or do I need to try to chain
    everything together such that only one top-level entity is left?

    And how would I do this for a category page on, for example, an
    eCommerce site, which shows a range of Product entities on a
    CollectionPage? Together the Products form the main content, and
    the CollectionPage, for lack of a better term, only functions as a
    'wrapper' for the list of products.

    "we probably do need a mechanism for indicating the "primary entity"
    of a webpage when there is one..."
    One of the reasons why I asked my questions is because I encounter
    quite a lot of markup on websites where people use
    @mainContentOfPage on entities like Product. Now @mainContentOfPage
    has the expected type WebPageElement, but many aren't aware of
    this. And since there is no property to indicate which entity is
    the primary one, I can completely understand why they try to
    resolve it like this. And frankly, I'm confused as well.



    On Thu, Apr 17, 2014 at 7:51 PM, Jason Douglas
    <jasondouglas@google.com> wrote:

        It is legal for there to be multiple top-level entities.  That
        description of WebPage is not meant to imply anything about the
        relationship of those top-level objects... all that is saying is
        that if a relation is declared without an explicit subject, then
        the subject will be assumed to be the current WebPage.

        That said, we probably do need a mechanism for indicating the
        "primary entity" of a webpage when there is one.  Current
        clients make up their own heuristics for this, but I think it
        would be better to have an explicit way of stating that.

        -jason


        On Thu Apr 17 2014 at 10:41:47 AM, Jarno van Driel
        <jarno@quantumspork.nl> wrote:

            I'm trying to understand semantic mechanisms better but am a
            bit confused about schema.org/WebPage
            <http://schema.org/WebPage> and I'd like to know how it works.


            Now it could well be that I understand certain terminology
            wrong, so please bear with me and be so kind as to correct
            me when needed.

            1] The description of http://schema.org/WebPage says:
            "Every web page is implicitly assumed to be declared to be
            of type WebPage, so the various properties about that
            webpage, such as breadcrumb may be used. We recommend
            explicit declaration if these properties are specified, but
            if they are found outside of an itemscope, they will be
            assumed to be about the page."

            code example:
            <body itemscope itemtype="http://schema.org/WebPage">
               <!-- Content -->
            </body>

            Now if the WebPage is the only entity, is it then considered
            to be the 'Subject', the 'Object' or both?

            2] If the WebPage contains an entity, let's say a Product,
            without specifying a property on the Product and I check
            this with Google's SDTT, I see 2 'root' entities, since
            there is no property to chain the two together. Yet I get
            the impression the Product gets treated as the 'Object',
            since it's the Product that gets used for Rich snippet
            extraction, and that therefore the WebPage is the 'Subject':

            code example:
            <body itemscope itemtype="http://schema.org/WebPage">
               <span itemprop="name">Page title</span>

               <div itemscope itemtype="http://schema.org/Product">
                 <span itemprop="name">Product name</span>
                 <!-- Product properties -->
               </div>
            </body>

            Now since "Every web page is implicitly assumed to be
            declared to be of type WebPage" I was wondering if there
            also is a property that is 'implicitly assumed to be
            declared' (something like @contains) on the first entity
            that comes after it, like Product in this case, which
            indicates that the Product is the 'Object'?

            And if not, then how does a parser 'know' which of the
            entities is the 'Subject' and which is the 'Object'?
            Shouldn't there be a predicate for this?

            3] When a WebPage contains a bunch of 'root' entities, how
            does a parser make sense of this? Does the DOM have anything
            to do with this?

            <body itemscope itemtype="http://schema.org/WebPage">
               <span itemprop="name">Page title</span>

               <div itemscope itemtype="http://schema.org/Product">
                 <span itemprop="name">Product 1 name</span>
                 <!-- Product properties -->
               </div>

               <div itemscope itemtype="http://schema.org/Product">
                 <span itemprop="name">Product 2 name</span>
                 <!-- Product properties -->
               </div>

               <div itemscope itemtype="http://schema.org/LocalBusiness">
                 <span itemprop="name">Business name</span>
                 <!-- Business properties -->
               </div>
            </body>

            Now the above could be full of misunderstandings because I
            still lack theoretical knowledge, but that's exactly the
            thing I'm hoping to change. Who can enlighten me?
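
The "multiple top-level entities" behaviour discussed in this thread can be illustrated with a minimal sketch. It assumes the HTML microdata rule that an item is top-level when its element carries itemscope but no itemprop, and it follows the quoted WebPage description in treating an itemprop found outside any itemscope as being about the page. The names TopLevelItemFinder, find_top_level_items and PAGE are hypothetical, the markup is assumed to be well nested, and this is not a description of how any particular search engine or testing tool works.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class TopLevelItemFinder(HTMLParser):
    """Collect microdata items that are not the value of any itemprop."""

    def __init__(self):
        super().__init__()
        self.stack = []        # one bool per open element: did it open an itemscope?
        self.scope_depth = 0   # number of itemscopes currently open
        self.top_level = []    # itemtype of each top-level item, in document order
        self.page_props = []   # itemprop names found outside any itemscope

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        opens_scope = "itemscope" in attrs
        has_prop = "itemprop" in attrs
        if has_prop and self.scope_depth == 0:
            # No enclosing item: assumed to describe the implicit WebPage.
            self.page_props.append(attrs["itemprop"])
        if opens_scope and not has_prop:
            # Nothing chains this item to a parent, so it is a 'root' entity.
            self.top_level.append(attrs.get("itemtype"))
        if tag in VOID_TAGS:
            return
        self.stack.append(opens_scope)
        if opens_scope:
            self.scope_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.stack:
            return
        if self.stack.pop():
            self.scope_depth -= 1


def find_top_level_items(html):
    """Return (top-level item types, page-level property names)."""
    finder = TopLevelItemFinder()
    finder.feed(html)
    finder.close()
    return finder.top_level, finder.page_props


# The markup from point 3 of the original question.
PAGE = """
<body itemscope itemtype="http://schema.org/WebPage">
  <span itemprop="name">Page title</span>
  <div itemscope itemtype="http://schema.org/Product">
    <span itemprop="name">Product 1 name</span>
  </div>
  <div itemscope itemtype="http://schema.org/Product">
    <span itemprop="name">Product 2 name</span>
  </div>
  <div itemscope itemtype="http://schema.org/LocalBusiness">
    <span itemprop="name">Business name</span>
  </div>
</body>
"""

items, page_props = find_top_level_items(PAGE)
for itemtype in items:
    print(itemtype)
```

Under these assumptions the sketch reports four separate top-level items for that markup (the WebPage, both Products and the LocalBusiness), because none of the nested elements carries an itemprop that would chain it to the WebPage. DOM nesting alone does not create a relation between them.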
Received on Sunday, 20 April 2014 09:14:13 UTC