Re: Generic Property-Value Proposal for Schema.org from Francois-Paul Servant on 2014-05-04 (public-vocabs@w3.org from May 2014)

From: Francois-Paul Servant <francoispaulservant@gmail.com>
Date: Sun, 4 May 2014 16:31:06 +0200
To: Niklas Lindström <lindstream@gmail.com>
Cc: "martin.hepp@ebusiness-unibw.org Hepp" <martin.hepp@ebusiness-unibw.org>, W3C Web Schemas Task Force <public-vocabs@w3.org>
Message-Id: <8EB3ADE3-B178-41B1-99DE-7E67EE2E7EF5@gmail.com>
Niklas, all,

Thank you, Niklas, this is a very good post!

On 4 May 2014, at 05:35, Niklas Lindström <lindstream@gmail.com> wrote:

> Martin, all,
> 
> It still seems to be unclear what this mechanism really represents. What is the nature of these named composite values? Do they represent a conflation of a property and a value, or just a textually typed structured value? And with that answered, what is the (necessary) meaning of 'additionalProperty'?
> 
> The name 'additionalProperty' certainly seems problematic. This mechanism does not appear equivalent to the 'additionalType' workaround, which is required to overcome a limitation in microdata. It does, however, appear to be about representing undefined properties in general. It does not have to be so.
> 
> I think it would be troubling to attempt the creation of a generic solution that could catch all cases of "unknown" properties. I've seen enough hand-waving in XML (and other formats) to know how hard such data can be to deal with (also for producers). It makes for an artificial mechanism, hard to understand and thus being inconsistently used (and soon enough, ad-hoc conventions and sketchy microsyntaxes tend to appear). Also, since values would seem like semi-reified statements, with a conflated composite of property and value, it'd start to blur the notion of what a property is (an effective device for capturing precise data in a direct, flat and simple manner).
> 
> However, and most importantly, the proposal really only deals with capturing certain kinds of values. What these values have in common is that they, in themselves, represent a nameable characteristic – an *aspect* – of a thing, particularly a product. These structured values can be composite (minValue, maxValue, unit and so forth). The proposal doesn't alter this existing StructuredValue. It just adds a name (text label) indicating the nature of this aspect a little. If this nature is *intrinsic* to the value (and not just one of many roles it could have), then we're on reasonably solid ground. And then this seems like a valuable mechanism that shouldn't threaten the integrity of properties, nor that of the structured values it captures.
> 
> (Martin and Jarno do make a strong case of this being the limit of what many publishers today are willing to mark up. That state of things is rather unimpressive from a consumer's perspective in principle (and lamentable when striving for unification in general). But there may not be enough direct incentive yet to convince everybody to go look for, or mint, properties for all of these details. This does not have to be an all-or-nothing. Like Martin, I am sure we can find ways to bridge this gap.)
> 
> As Francois-Paul points out, the name of these values seems to represent the *type* of the value just as much as it indicates a relation to it. This is an important observation. I think Richard Cyganiak stated something similar a while ago: that if the value is distinct enough, the relation doesn't need to be very specific. (This is why Dublin Core terms like hasPart/isPartOf are so effective and relatively popular.)
> 
> Consider the example values in the proposal:
> 
> - 450 grams of approximated weight
> - 100-250 volts of Operating Voltage
> - 30 ft. Wifi range
> - 100-6400, 12,800 ISO Sensitivity
> - 500 LTR of Luggage Capacity
> - Ethernet and USB Interfaces
> - 5 liter of fuel consumption per 100 km
> - A sensor size of 23.3 mm x 15.4 mm
> 
> Of course there is potential value in having dedicated properties for them. But they are already structured (composite) values. And none of them play multiple roles. They are all aspects of a product, and the name of this aspect reasonably belongs to them (it's a type hint, if you will). They are not reusable as other things (the same 30 ft. would not also be the approximated distance you can throw the device – that would be *another* aspect with an approximated throwing distance value).
> 
> Thus, it isn't necessary to define a specific property in schema.org (or at all), to get some kind of usage from these values. As long as these named structured values do represent themselves (and not a mix of a property (in the structured data/RDF sense) and its value), you can further specify the relation (and the type) using external vocabularies if you want to, without changing the shape of the data.
> 
> This is my main point. I think a design is possible that maintains the integrity of the commonly uncontrolled but structured values, so that additional precision can be added if desired, without requiring a change in shape. It may be that instances of named StructuredValues are odd creatures in an ideal world. (I am certainly aware of how I've shifted my mind to write this.) But so are SKOS concepts ("strings masquerading as things", as Jeff Young put it). They do have their uses though, since there is a limit to how much semantic engineering and ontological normalization we can collectively muster. (And since this is more about useful communication than a perfect rational reality, I think that is acceptable.)
> 
> Therefore, I maintain my recommendation of renaming 'additionalProperty'. Perhaps to 'aspect'? A dictionary defines that as "a particular part or feature of something". Sounds generally applicable. For the range of that, I can imagine Intangible, since many of its subclasses could be quite useful as values (not just a named StructuredValue). I also maintain that NamedValue is better than PropertyValue, since the whole point is to add a name to a StructuredValue. (Martin, I noticed you wrote NamedProperty in the wiki. Did you really mean NamedValue?)

I like "aspect" (and "Intangible > Aspect"). I was not convinced by "NamedValue", because one can get confused about what this name means (it's the name of the property, not the name of the value).

Maybe schema.org could also create a "ProductFeature" subclass of Aspect. Nothing special in it, just something that people may easily find when searching how to describe a product (which will be one of the main use cases of "aspect")

> 
> I still find propertyID a bit problematic. If you have an external property URI, just use it as a property (alongside additionalProperty/aspect). Same thing if you have a class URI (use it as an additional type). Francois-Paul, wouldn't that work for Renault's case?

Yes. Actually, we have a paper currently under review in which we basically suggest just adding a "ProductFeature" class and a "productFeature" property to schema.org (and considering our ConfigurationOntology for customizable products).

In our description of Renault's range of new cars, we do not use properties such as "fuelType" or "gearboxType", just types ("FuelType", "Gearbox"), and a few properties of the same nature as "aspect" (properties such as fuelType etc. don't work well to describe "configurations", that is, partially defined products).

There is still one possible use of propertyID, I think. (I don't really like it, but I don't see it as really problematic either: we are ready to attach a property name to the Aspect, so why not the property itself?) Let me try to explain why it can be useful. It is about enhancing data that has been published "in a not-ideal way", and about well-established vocabularies that define many properties. Such vocabularies make it difficult to produce compliant data.

Consider a site selling used cars. The developers of the site have looked at schema.org and they have found nothing about the description of cars. They may use "aspect", define their own class of features and publish data such as:

x:OneCar :aspect x:Diesel.
x:Diesel a :Aspect;
	:propertyName "FuelType";
	:value "Diesel".

and that's OK. To list diesel cars, in SPARQL:
select ?car where {
	?car :aspect x:Diesel.
}

Now there is one vocabulary to describe cars, VSO (the Vehicle Sales Ontology). VSO defines a "fuelType" property (with range "FuelTypeValue"), so VSO expects that people write things such as:
x:OneCar vso:fuelType vso:Diesel.

There's no way to enhance the initial data to get there, while it would have been easy to get to
x:OneCar :aspect vso:Diesel.
(just state x:Diesel :sameAs vso:Diesel)

There's no way, unless we use propertyID: 
x:Diesel :propertyID vso:fuelType.

Then a program with ad-hoc handling of propertyID could infer x:OneCar vso:fuelType vso:Diesel.
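
As an illustration only, here is a minimal Python sketch of what such ad-hoc handling might look like. It is not taken from the proposal or from any existing tool: triples are modelled as plain (subject, predicate, object) tuples of prefixed-name strings, the function name infer_direct_properties is hypothetical, and the sketch assumes that both the :sameAs and the :propertyID statements above have been added to the data.

```python
# Triples as (subject, predicate, object) tuples; prefixed names are plain strings.
# Assumption: the data contains both the :sameAs and the :propertyID statements.
triples = {
    ("x:OneCar", ":aspect", "x:Diesel"),
    ("x:Diesel", "rdf:type", ":Aspect"),
    ("x:Diesel", ":propertyName", "FuelType"),
    ("x:Diesel", ":value", "Diesel"),
    ("x:Diesel", ":sameAs", "vso:Diesel"),
    ("x:Diesel", ":propertyID", "vso:fuelType"),
}


def infer_direct_properties(graph):
    """Derive direct property statements from aspects that carry a :propertyID.

    For every `?thing :aspect ?a` where `?a :propertyID ?p`, emit
    `?thing ?p ?v`, where ?v is the :sameAs target of ?a if one is
    stated, and ?a itself otherwise.
    """
    same_as = {s: o for s, p, o in graph if p == ":sameAs"}
    property_ids = {s: o for s, p, o in graph if p == ":propertyID"}

    inferred = set()
    for s, p, o in graph:
        if p == ":aspect" and o in property_ids:
            inferred.add((s, property_ids[o], same_as.get(o, o)))
    return inferred


if __name__ == "__main__":
    for triple in sorted(infer_direct_properties(triples)):
        print(" ".join(triple) + " .")
```

The original :aspect statements are left untouched; the inferred triples are returned separately, so a consumer can decide whether to merge them into the published data.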

(But OK, I would prefer that reference vocabularies avoid using too many properties)

> But I do see why some kind of property might be needed, when you have a code from a system which is neither a URI nor a plain label. Maybe <http://schema.org/codingSystem> can be modified to be usable here too (e.g. domainIncludes StructuredValue)? That might be more for the proposed SKOS integration work though (which could play nice together with this).
> 
> Cheers,
> Niklas

Cheers,

fps
Received on Sunday, 4 May 2014 14:31:35 UTC