- From: <martin.hepp@ebusiness-unibw.org>
- Date: Wed, 9 Apr 2014 09:23:13 +0200
- To: Laura Dawson <Laura.Dawson@bowker.com>
- Cc: Vicki Tardif Holland <vtardif@google.com>, Jarno van Driel <jarno@quantumspork.nl>, W3C Web Schemas Task Force <public-vocabs@w3.org>
On 08 Apr 2014, at 18:44, Laura Dawson <Laura.Dawson@bowker.com> wrote:

> That’s the job of any given ontology. In the book industry, we have a number of relationships defined already. I’m pretty sure in other media types those also exist, so as to communicate to vendors “this thing replaces that thing; this thing is derived from that thing; this thing has nothing to do with that thing”. That’s essential information that vendors (e.g. Amazon and Apple) need to have.

Well, I think the question in a *Web* ontology like schema.org is to find a good balance between the specificity of the property/relationship types (or type/class alike), and the ability of publishers to employ the distinctions properly, which is constrained by at least

1. their understanding of the definition of a property (or type/class),
2. their existing data structures (in particular local schemas), and
3. the fitness of the distinction to the context of the user (e.g. what is a common distinction in the library context - like book copy vs. book title - can be harder to understand and apply for users in other contexts - like car model vs. actual car, service template vs. service instance etc.).

So simply taking a set of relationship types from an existing standard is not always the best choice. It really depends on

1. whether the degree of specificity is compatible with the local schemas of existing databases (i.e. that site owners can populate them with ease), and
2. whether the consumers of data (e.g. search engines) can reconstruct the distinction from contextual information or other signals.

If e.g. search engines can reconstruct a distinction from contextual information or other signals, it is not necessary to encourage publishers of data to spend resources on declaring the respective meta-data in mark-up.

In a Web vocabulary like schema.org, we should thus focus on those classes/types and properties that

1. are easy to apply reliably from existing data sources by typical site owners and
2. cannot be reconstructed with ease by a consumer of the data (*).

Since adding type and structural information to Web data has costs, we should further center our efforts on such meta-data that

1. has a high information entropy (approximately: is useful, additional information) and
2. can be reliably provided by a large number of sites.

RDF and its proponents made, IMO, the mistake of also expecting explicit meta-data in the data representation for things that a client can often reliably reconstruct from the data alone. For example, data types for literals, like the information that the literal "ABC123" is a string or that "2014-04-09" is likely a date. (The former has been fixed in RDF 1.1, I know ;-) )

Many Web ontologies, on the contrary (and maybe even GoodRelations in some of its branches), provide conceptual distinctions that cannot be reliably applied by site owners, either due to the fact that the distinctions require a deep philosophical understanding, or because the existing databases available for powering a Web site simply do not support the distinction.

As for schema.org, I would say that as long as we are defining properties for a single schema.org type or a small set of types, this problem is a lesser issue, since the type provides context and people will more easily understand even fine-grained distinctions (like author vs. editor for a book).

But when we speak about properties at a more generic level, or even for Thing, we need to be super-careful.

Best wishes

Martin

(*) Of course, this depends on the power of the data consumer - Google et al. can likely do way more with messy, partly structured data than a smartphone app or browser extension. One untested and undeclared working assumption of the Semantic Web advocates is that data on the Web can ever be consumed in its raw form by relatively dumb clients. I have argued elsewhere [1] that structured data in Web content is rather "proto-data" that can be turned into usable data by heavy post-processing.
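The point about literal data types can be made concrete from the consumer's side. What follows is only a minimal sketch, assuming a consumer that receives untyped string literals and wants a coarse guess at their type. The function name `guess_literal_type` and the four type labels are hypothetical; they are not part of RDF, schema.org, or any library.

```python
import re
from datetime import date

# Patterns for the few literal shapes this sketch tries to recognize.
ISO_DATE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")
INTEGER = re.compile(r"[+-]?[0-9]+")
DECIMAL = re.compile(r"[+-]?[0-9]+\.[0-9]+")


def guess_literal_type(value: str) -> str:
    """Guess a coarse type for an untyped literal (hypothetical helper)."""
    text = value.strip()
    if ISO_DATE.fullmatch(text):
        try:
            # Reject strings that only look like dates, e.g. month 13.
            date.fromisoformat(text)
            return "date"
        except ValueError:
            return "string"
    if INTEGER.fullmatch(text):
        return "integer"
    if DECIMAL.fullmatch(text):
        return "decimal"
    # Anything unrecognized falls back to a plain string.
    return "string"


for literal in ["ABC123", "2014-04-09", "2014-13-45", "42", "3.14"]:
    print(literal, "->", guess_literal_type(literal))
```

With these rules, "ABC123" comes out as a string and "2014-04-09" as a date, without the publisher declaring either. A guess like this is heuristic: a product code that happens to consist of digits would be reported as an integer, which is the kind of case where explicit meta-data still earns its cost.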
[1] http://lists.w3.org/Archives/Public/public-vocabs/2013Oct/0293.html

-------------------------------------------------------
martin hepp
e-business & web science research group
universitaet der bundeswehr muenchen

e-mail: martin.hepp@unibw.de
phone: +49-(0)89-6004-4217
fax: +49-(0)89-6004-4620
www: http://www.unibw.de/ebusiness/ (group)
http://www.heppnetz.de/ (personal)
skype: mfhepp
twitter: mfhepp

Check out GoodRelations for E-Commerce on the Web of Linked Data!
=================================================================
* Project Main Page: http://purl.org/goodrelations/

On 08 Apr 2014, at 18:44, Laura Dawson <Laura.Dawson@bowker.com> wrote:

> That’s the job of any given ontology. In the book industry, we have a number of relationships defined already. I’m pretty sure in other media types those also exist, so as to communicate to vendors “this thing replaces that thing; this thing is derived from that thing; this thing has nothing to do with that thing”. That’s essential information that vendors (e.g. Amazon and Apple) need to have.
>
> From: Vicki Tardif Holland <vtardif@google.com>
> Date: Tuesday, April 8, 2014 at 11:52 AM
> To: Jarno van Driel <jarno@quantumspork.nl>
> Cc: "martin.hepp@ebusiness-unibw.org" <martin.hepp@ebusiness-unibw.org>, W3C Web Schemas Task Force <public-vocabs@w3.org>
> Subject: Re: Why is the video property bound to creative work?
> Resent-From: <public-vocabs@w3.org>
> Resent-Date: Tuesday, April 8, 2014 at 11:52 AM
>
> My fear is that a "related" property would lead to confusion between authors and consumers. For example, if we had a VideoObject related to Barack Obama, does he appear in the video? Does the video discuss him? Is it about a book he wrote?
>
> While we can know there is a relationship, it is difficult to understand what that relationship is.
>
> - Vicki
>
> Vicki Tardif Holland | Ontologist | vtardif@google.com
>
> On Tue, Apr 8, 2014 at 11:42 AM, Jarno van Driel <jarno@quantumspork.nl> wrote:
>> "The type of the object of this statement would then indicate the nature of the relatedness, e.g. a VideoObject."
>> Says it all for me. In my mind this makes perfect sense. Does anybody have any extra input on this from a data-consumer perspective, maybe?
>>
>> On Tue, Apr 8, 2014 at 5:19 PM, martin.hepp@ebusiness-unibw.org <martin.hepp@ebusiness-unibw.org> wrote:
>>> Thanks! The "related" property could also be used to link related products in shop applications, btw.
>>>
>>> Of course, the exact semantics of the property is pretty broad, but we can leave it up to the consumers of the data to interpret it, imo.
>>>
>>> Martin
>>>
>>> On 08 Apr 2014, at 17:06, Jarno van Driel <jarno@quantumspork.nl> wrote:
>>>
>>> > In this particular case having a 'related' property would already suffice for what I'm looking to do. My issue isn't so much with having multiple root entities relate to each other - which indeed adds additional complexity and size of vocabulary - but more with the fact I can't have a single Product (or MedicalProcedure for that matter) express it has a video that adds additional info about the entity.
>>> >
>>> > But coming back to your idea: adding 'related' as a more generic property of Thing for exactly this type of use, amongst others, seems like a good idea to me. So I'm all for it.
>>> >
>>> > On Tue, Apr 8, 2014 at 4:46 PM, martin.hepp@ebusiness-unibw.org <martin.hepp@ebusiness-unibw.org> wrote:
>>> > I understand your point, but personally, I strongly discourage having inverse properties, except for very few cases. Being able to model the same fact from both sides using different properties adds confusion and increases the size of the vocabulary.
>>> >
>>> > Martin
>>> >
>>> > On 08 Apr 2014, at 16:35, Jarno van Driel <jarno@quantumspork.nl> wrote:
>>> >
>>> > > Thanks Martin, that helped a lot.
>>> > >
>>> > > Putting aside for a moment the discussion about how multiple 'root' entities are handled by search engines and other data consumers (although it might be a nice topic for a new thread), I do want to re-use your code to illustrate what's missing from my point of view, and multiple 'root' entities serve quite nicely for this.
>>> > >
>>> > > Imagine a page has two 'root' entities which aren't linked to the WebPage by means of a property. Then I would use @itemid to have both entities link to each other, like this:
>>> > >
>>> > > <div itemid="video-object" itemscope itemtype="http://schema.org/VideoObject">
>>> > >   <link itemprop="about" href="product">
>>> > >
>>> > >   <h2>Video: <span itemprop="name">Video of the Personal SCSI controller in use</span></h2>
>>> > >   <meta itemprop="duration" content="PT1M33S" />
>>> > >   <meta itemprop="thumbnail" content="personal-scsi-thumb.jpg" />
>>> > >   <object ...>
>>> > >     <param ...>
>>> > >     <embed type="application/x-shockwave-flash" ...>
>>> > >   </object>
>>> > >
>>> > >   <span itemprop="description">In this short video, we show how to use the controller in a typical setting.</span>
>>> > > </div>
>>> > >
>>> > > <div itemid="product" itemscope itemtype="http://schema.org/Product">
>>> > >   <link itemprop="video" href="video-object">
>>> > >
>>> > >   <span itemprop="name">The Personal SCSI Controller by ACME Technology</span>
>>> > >   <!-- other product properties go here -->
>>> > > </div>
>>> > >
>>> > > In this case both entities have a global identifier, which should make it possible to have both items link to each other. Now the VideoObject points to the Product by means of <link itemprop="about" href="product">, but I can't achieve this the other way around.
>>> > > In an ideal world <link itemprop="video" href="video-object"> would achieve the same relation, only inverted, but unfortunately Product doesn't have a 'video' property.
>>> > >
>>> > > This could be resolved by either having 'video' be part of Thing or having a completely new property like 'related' as you proposed. Either way, there's something missing right now to provide this type of relationship.
>>> > >
>>> > > On Tue, Apr 8, 2014 at 3:42 PM, martin.hepp@ebusiness-unibw.org <martin.hepp@ebusiness-unibw.org> wrote:
>>> > > Hi Jarno:
>>> > >
>>> > > Below is how I would model a product video with the current set of elements.
>>> > > In general I would suggest that if a use-case can be sufficiently covered with existing elements, we encourage search engines to implement support for the respective markup rather than adding redundant conceptual elements that are there just because search engines prefer a particular direction of a relationship.
>>> > >
>>> > > Example: Product with video:
>>> > >
>>> > > <div itemprop="video" itemscope itemtype="http://schema.org/VideoObject" itemref="product">
>>> > >   <h2>Video: <span itemprop="name">Video of the Personal SCSI controller in use</span></h2>
>>> > >   <meta itemprop="duration" content="PT1M33S" />
>>> > >   <meta itemprop="thumbnail" content="personal-scsi-thumb.jpg" />
>>> > >
>>> > >   <object ...>
>>> > >     <param ...>
>>> > >     <embed type="application/x-shockwave-flash" ...>
>>> > >   </object>
>>> > >   <span itemprop="description">In this short video, we show how to use the controller in a typical setting.</span>
>>> > > </div>
>>> > >
>>> > > <div id="product">
>>> > >   <div itemprop="about" itemscope itemtype="http://schema.org/ProductModel">
>>> > >     <span itemprop="name">The Personal SCSI Controller by ACME Technology</span>
>>> > >     <!-- other product properties go here -->
>>> > >   </div>
>>> > > </div>
>>> > >
>>> > > Best wishes
>>> > >
>>> > > Martin Hepp
>>> > >
>>> > > On 08 Apr 2014, at 15:10, Jarno van Driel <jarno@quantumspork.nl> wrote:
>>> > >
>>> > > > "Conceptually, this is not true, since you can use itemref in Microdata..."
>>> > > >
>>> > > > Would you be so kind as to provide a small markup example that illustrates this? I think I understand what you mean, but unfortunately without an example I'm not sure if I understand you correctly.
>>> > > >
>>> > > > On 8 Apr 2014 at 14:20, "martin.hepp@ebusiness-unibw.org" <martin.hepp@ebusiness-unibw.org> wrote:
>>> > > > Conceptually, this is not true, since you can use itemref in Microdata or a unique identifier in RDFa to make the video the outer entity in the nesting.
>>> > > > However, search engines have, in practice, two problems with this:
>>> > > >
>>> > > > 1. Rich snippets and similar techniques often depend on finding one main entity type, and use the outermost entities (root entities) in the syntax for that task. So a Web page with a VideoObject and an Offer nested therein may not trigger a product snippet because the search engine thinks it is mainly a page about a video.
>>> > > >
>>> > > > 2. The linkage between entities on the basis of identifiers in RDFa is, in my experience, not properly supported by major search engines, so in reality, my proposed pattern will only work in Microdata.
>>> > > >
>>> > > > Martin
>>> > > >
>>> > > > On 08 Apr 2014, at 13:01, Jarno van Driel <jarno@quantumspork.nl> wrote:
>>> > > >
>>> > > > > But of course you can also model it the other way round...
>>> > > > >
>>> > > > > True, but only in cases where VideoObject is the main object. When the main object is something else, which isn't part of the CreativeWork branch, then there is no way to link a video by means of a 'video' property.
>>> > > > >
>>> > > > > On Tue, Apr 8, 2014 at 10:33 AM, martin.hepp@ebusiness-unibw.org <martin.hepp@ebusiness-unibw.org> wrote:
>>> > > > > In general, I am supportive of this, since any entity could "have" a video.
>>> > > > >
>>> > > > > But of course you can also model it the other way round:
>>> > > > >
>>> > > > > http://schema.org/VideoObject
>>> > > > > ---> about --> Thing
>>> > > > >
>>> > > > > This works as of now. The main problem with the current solution is that search engines seem to have a hard time honoring information in that structure. And since we have the property "image" at the level of http://schema.org/Thing, why not promote video thereto, too?
>>> > > > >
>>> > > > > Martin
>>> > > > >
>>> > > > > On 08 Apr 2014, at 04:11, Jarno van Driel <jarno@quantumspork.nl> wrote:
>>> > > > >
>>> > > > > > When working on markup for a MedicalProcedure I ran into the issue of not having the 'video' property available to link an embedded video, explaining the MedicalProcedure, to the entity.
>>> > > > > >
>>> > > > > > But while looking for a solution in the full list of types at schema.org I started to wonder: wouldn't the 'video' property be useful on plenty more types than just CreativeWork?
>>> > > > > > For example, a 'video' about a person, organization, product, service or MedicalProcedure is quite common, yet there's no way to link a video to any of those types.
>>> > > > > >
>>> > > > > > Of course the workaround for this would be a multi-type entity as in "Product CreativeWork", but somehow that just feels wrong. Looking at how much embedded video is used, wouldn't it be better if the 'video' property moved up the chain and became part of 'Thing'?
Received on Wednesday, 9 April 2014 07:23:41 UTC