- From: Martin Hepp <mfhepp@gmail.com>
- Date: Fri, 15 Jun 2018 11:43:45 +0200
- To: Anthony Moretti <anthony.moretti@gmail.com>
- Cc: Simon Cox <Simon.Cox@csiro.au>, Dan Brickley <danbri@google.com>, elf Pavlik <elf-pavlik@hackers4peace.net>, "schema.org Mailing List" <public-schemaorg@w3.org>, Thad Guidry <thadguidry@gmail.com>
Hi Anthony:

A product is essentially any thing that can potentially be the object of a promise to transfer rights thereon for some kind of compensation. So "product" does not overload or modify the meaning of any other object type. Yes, product has a commercial bias, but it is among the most frequently used types in schema.org.

GoodRelations, and schema.org derived therefrom, do not prevent you from modeling anything as a potential object of an offering. You can make "love" a schema.org product, a commercial killer can model "murder" as a schema.org product (actually, it is more likely a service, but schema:Product actually includes services, too), etc. You can also make "good karma" and even the Sun and the Moon a schema:Product, even without ever modeling a matching schema:Offer.

There are cases where people feel confused by attaching this commercially biased word to something they feel is not a product. But we have to balance that confusion against the huge amount of guidance and confirmation it brings to the zillions of e-commerce Web developers who are looking for the best schema.org type for their data.

It is natural, in particular in academia, to look for inconsistencies in a model, in particular a data model. But schema.org is a whole ecosystem, so consistency within the model is only one dimension of performance of this system (and not even a very important one when dealing with data at Web scale). The terminological fit to the language and mental models of people working on mainstream development tasks is very important.

As a side note: I think that in general, we need way more "modeling challenges" (how would you model the following scenario with schema.org) rather than vocabulary modifications as issues and proposals. The existing set of elements will reach much further than many people assume, and lots of additional cases could be better addressed by adding an example as markup rather than adding a new property or type.
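To illustrate the point above that anything can be typed as a schema:Product, with or without a matching schema:Offer, here is a minimal sketch in Python that builds JSON-LD descriptions as plain dictionaries. It is not taken from the original message: the helper name `make_thing`, the example names, and the price are hypothetical, and only the schema.org terms (`Product`, `Car`, `Offer`, `offers`, `price`, `priceCurrency`) come from the vocabulary itself.

```python
import json


def make_thing(types, name, offer_price=None, currency="EUR"):
    """Build a minimal schema.org JSON-LD node as a dict (hypothetical helper)."""
    node = {
        "@context": "https://schema.org",
        "@type": types,
        "name": name,
    }
    # An Offer is optional: typing something as a Product does not require one.
    if offer_price is not None:
        node["offers"] = {
            "@type": "Offer",
            "price": str(offer_price),
            "priceCurrency": currency,
        }
    return node


# A home-built car that will never be sold: typed as Car, no Offer attached.
home_built_car = make_thing("Car", "Home-built custom car")

# Even the Moon can be typed as a Product without any Offer.
moon = make_thing("Product", "The Moon")

# A car that is actually for sale carries an Offer (assumed price).
listed_car = make_thing("Car", "Used hatchback", offer_price=4500)

for node in (home_built_car, moon, listed_car):
    print(json.dumps(node, indent=2))
```

The same node shape serves all three cases; whether an Offer exists is a matter of the data, not of the type.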
If you want to do science and improve schema.org, it would be very promising to do actual A/B tests with a representative sample of schema.org developers and measure the learning and coding effort and quality of command of alternative modeling decisions. But you need real developers for this, not students. We tried this in https://link.springer.com/chapter/10.1007%2F978-3-319-69459-7_28

Best wishes
Martin
-----------------------------------
martin hepp http://www.heppnetz.de
mhepp@computer.org @mfhepp

> On 15 Jun 2018, at 10:36, Anthony Moretti <anthony.moretti@gmail.com> wrote:
>
> Simon, I understand domainIncludes means the union of the types, but is what you mean that any property can be used on any type?
>
> And yeah I understand things can be members of more than one class. I'll give an example of when Product is confusing - Say I build a custom car at home that I don't plan on selling ever, is it a Product? According to Schema all Cars are Products. This then makes you ask what the precise definition of Product is, but you don't even need to ask that question if you say the custom car is simply a Car and a Car is a type of Thing, and Things can have Offers (past, present, or future). You don't need to continually ask "Is this a product?", "Is this a creative work?", "Is this intangible?", you simply apply properties as required and use more clearly defined types like Car or Book as appropriate.
>
> And Martin, I completely agree with you about models and their domains. I do think a design can be criticized if it shows potential brittleness though; flexibility was the whole point Dan was making when he mentioned the volcanos-can-have-fax-numbers approach, which I like.
>
> Anthony
>
> On Fri, Jun 15, 2018 at 12:18 AM Martin Hepp <mfhepp@gmail.com> wrote:
> Hi Anthony:
>
> The main thing to see in here is that types in schema.org are mostly used for grouping entities for which the same type of processing by major consumers of such data is appropriate.
> We are not trying to develop a fully application-agnostic system of types.
>
> Many of the contributors of schema.org have been in ontology engineering since the beginning of that discipline, and over time, we have learned that the pure ideal of fully detaching conceptual data models, and namely relationship types and entity types, from any notion of the processing task expected on the data, will not work.
>
> I think there is a nice quote by R.V. Guha on this topic somewhere in the list archive, but I don't find it right now.
>
> Historically, data structures and algorithms have always been considered a duality in Computer Science. The community that reused the term "ontology" from Philosophy to CS in the 1990s and redefined it as a word for shared conceptual data models that try to represent the "real structures of the world" wanted to decouple data structures from algorithms. While this aim was well intended, it turned out to be a dead end, because you can endlessly debate about what these "real structures of the world" are, as long as you do not have a metric for measuring your achievement.
>
> All models, and data models are no exception, are purpose-bound simplifications of a domain of interest. You can only assess the quality of a model with regard to a purpose. It is invalid to criticize a model for being too granular, too coarse, or otherwise deficient, unless this deficiency is observable in the area of application for which the model is intended.
>
> Best wishes
> Martin
> -----------------------------------
> martin hepp http://www.heppnetz.de
> mhepp@computer.org @mfhepp
>
>
> > On 15 Jun 2018, at 08:59, <Simon.Cox@csiro.au> <Simon.Cox@csiro.au> wrote:
> >
> > 1. domainIncludes and rangeIncludes are not exhaustive. Multiple values are linked by an open OR, not exclusive AND (major difference to RDFS).
> > 2. It's OK to be a member of more than one class. It's OK for something to be both a Product and CreativeWork.
> >
> > From: Anthony Moretti [mailto:anthony.moretti@gmail.com]
> > Sent: Friday, 15 June, 2018 16:30
> > To: Dan Brickley <danbri@google.com>
> > Cc: Martin Hepp <mfhepp@gmail.com>; elf Pavlik <elf-pavlik@hackers4peace.net>; public-schemaorg@w3.org; Thad Guidry <thadguidry@gmail.com>
> > Subject: Re: Schema.org and OWL
> >
> > Thanks for the links guys.
> >
> > I'm definitely not trying to make Schema into "one true logical model of the world", I do always think it's worthy to strive for simplicity and consistency though, something maybe similar in intention to code refactoring.
> >
> > Here is a problem that exists now though because of overly specific domains - if I want to describe the height of the Eiffel Tower, a Place, I'd want to use the "height" property, but the only types "height" can be used on are MediaObject, Person, Product, and VisualArtwork. I completely get the volcano-with-fax-number approach, and I'm actually a big fan of it, that's why I propose moving properties such as "height" to Thing. A guideline that Schema might be able to apply here could take inspiration from the rule of three - whenever a property is used on more than two types move it to the parent type. Using this guideline "height" would be on Thing, and could then be used to describe the Eiffel Tower.
> >
> > I'll end now with one final suggestion, I realize it probably has no chance of going anywhere, but I'll put it out there for consideration anyway. After moving those properties to Thing I realized that because CreativeWork, Product, and Intangible don't have clear definitions all they do is add complexity (how many times is it asked whether products are also creative works and vice versa). It would arguably be simpler to have all their properties on Thing and ThingType. This is in line with the volcano-with-fax-number approach, and would give great flexibility.
> >
> > Thanks for all the discussion!
> >
> > Anthony
> >
> > On Thu, Jun 14, 2018 at 4:09 PM Dan Brickley <danbri@google.com> wrote:
> > On Thu, 14 Jun 2018 at 15:19, Anthony Moretti <anthony.moretti@gmail.com> wrote:
> > I think Martin's point about passing information from product types to product instances can be addressed higher in the hierarchy than Product actually. I sense people are opposed to shifting properties from more specific types to Thing though (maybe I don't understand something, can someone please explain that to me?) My view is that using overly specific domains for properties causes strange entailment, e.g. in its current form the "height" property entails the subject is either a MediaObject, Person, Product, or VisualArtwork, which doesn't seem right.
> >
> > On this point - "e.g. in its current form the "height" property entails the subject is either a MediaObject, Person, Product, or VisualArtwork, which doesn't seem right." -- we don't really say that anywhere, and in fact we created looser variants of rdfs domain/range for documentation, to avoid saying more than we wanted to. On the contrary, in http://schema.org/docs/datamodel.html -
> >
> > "When we list the expected types associated with a property (or vice-versa) we aim to indicate the main ways in which these terms will be combined in practice. This aspect of schema.org is naturally imperfect. For example the schemas for Volcano suggest that since volcanoes are places, they may have fax numbers. Similarly, we list the unlikely (but not infeasible) possibility of a Country having "opening hours". We do not attempt to perfect this aspect of schema.org's structure, and instead rely heavily on an extensive collection of illustrative examples that capture common and useful combinations of schema.org terms. The type/properties associations of schema.org are closer to "guidelines" than to formal rules, and improvements to the guidelines are always welcome."
> >
> > In this regard, you might view this aspect of Schema.org as being closer to the "The Code is more what you call guidelines, than actual rules" tradition of the Pirates of the Caribbean than the expectations you might bring from the OWL world, even if we target much the same underlying data model.
> >
> > If this might seem less than helpful, I'd suggest a possible middle-ground would be to explore the RDF validation languages - SHACL and ShEx - which suggest ways of layering certain kinds of discipline over messy RDF data. It doesn't address all the modeling concerns raised here, but does offer another layer of expressivity which needn't happen in the core project. You could look at https://www.topquadrant.com/technology/shacl/tutorial/ or http://book.validatingrdf.com/ -- e.g. http://datashapes.org/schema attempts to capture some of schema.org itself in SHACL, whereas https://github.com/SEMICeu/dcat-ap_shacl/ (in SHACL) and https://github.com/SEMICeu/dcat-ap_shacl/issues/32 (in ShEx) try to capture specific useful community-specific patterns for describing datasets. These languages let people say things about Schema.org data structures, beyond what the project itself chooses to say. For example by constructing and documenting more tidy-minded subsets/profiles, or mixing it with longer tail vocabularies (like Wikidata's e.g. see Thad and friends' mappings) or richer domain models e.g. from the sciences, and explaining sensible patterns for these combinations. You could look at what the Blue Brain project are doing there, for example - https://github.com/BlueBrain/nexus-kg/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+shacl or the ShEx efforts around HL7/FHIR, https://www.hl7.org/fhir/medication.shex.html
> >
> > That kind of perspective I think makes two points.
> > One is that Schema.org's modeling style and hierarchical structure is not the only place where discipline can be exercised usefully; and the second is that more "knowledge graphy" use cases (beyond simple Web markup) are likely to engage with other vocabularies and systems (e.g. scientific domains or general like Wikidata), in which case we're unlikely to see a unified modeling style across it all, and will likely end up focussing - again - on documenting usefully re-usable patterns that address particular situations.
> >
> > Dan
>
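
The difference discussed in this thread between a strict RDFS reading of property domains and schema.org's looser domainIncludes can be sketched in a few lines of Python. This is an illustrative simplification, not code from any participant: the function names are hypothetical, the subclass hierarchy is ignored, and the list of expected types for "height" is the one quoted in the thread.

```python
# Expected types for "height" as listed in the thread (may differ from current schema.org).
HEIGHT_EXPECTED_TYPES = frozenset({"MediaObject", "Person", "Product", "VisualArtwork"})


def types_under_rdfs_domain(asserted_types, domain_classes):
    """Strict RDFS reading: using the property entails membership in EVERY
    declared rdfs:domain class, so the inferred type set grows."""
    return set(asserted_types) | set(domain_classes)


def types_under_domain_includes(asserted_types, expected_types):
    """schema.org reading: nothing is entailed. The expected types are only a
    guideline, so we merely report whether the usage matches it."""
    asserted = set(asserted_types)
    within_guideline = bool(asserted & set(expected_types))
    return asserted, within_guideline


# The Eiffel Tower is described as a Place and given a "height".
eiffel_tower_types = {"Place"}

strict = types_under_rdfs_domain(eiffel_tower_types, HEIGHT_EXPECTED_TYPES)
loose, within_guideline = types_under_domain_includes(
    eiffel_tower_types, HEIGHT_EXPECTED_TYPES
)

print(sorted(strict))                   # Place plus all four domain classes
print(sorted(loose), within_guideline)  # still just Place; flagged as outside the guideline
```

Under the loose reading the markup stays valid and the tower remains only a Place; a stricter profile, if wanted, could be layered on separately with SHACL or ShEx as Dan suggests.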
Received on Friday, 15 June 2018 09:44:12 UTC