- From: Jarno van Driel <jarno@quantumspork.nl>
- Date: Sun, 4 May 2014 01:21:55 +0200
- To: Francois-Paul Servant <francoispaulservant@gmail.com>
- Cc: Martin Hepp <martin.hepp@ebusiness-unibw.org>, Niklas Lindström <lindstream@gmail.com>, W3C Web Schemas Task Force <public-vocabs@w3.org>
- Message-ID: <CAFQgrbbPONctu6wxCnr35xzKLr4T8APgkiv7KyLgE62+Ok7fQw@mail.gmail.com>
I'm not quite sure, by the way, whether "Property http://acme.org/vocab/#Voltage" is considered to be a multi-type entity in RDFa or not. If so, I guess one could use @sameAs in RDFa as well:

<div vocab="http://schema.org/" typeof="Product">
  ...
  <div property="additionalProperty" typeof="PropertyValue" id="http://ex.com/ov_100_250">
    <link property="sameAs" href="http://acme.org/vocab/#Voltage">
    ...
  </div>
</div>

On Sun, May 4, 2014 at 12:05 AM, Jarno van Driel <jarno@quantumspork.nl> wrote:

> Sorry, should have been:
>
> <div vocab="http://schema.org/" typeof="Product">
>   ...
>   <div property="additionalProperty" typeof="PropertyValue http://acme.org/vocab/#Voltage" id="http://ex.com/ov_100_250">
>     ...
>   </div>
> </div>
>
> as opposed to:
>
> <div itemscope itemtype="http://schema.org/Product">
>   ...
>   <div itemprop="additionalProperty" itemscope itemtype="http://schema.org/PropertyValue" itemid="http://ex.com/ov_100_250">
>     <link itemprop="sameAs" href="http://acme.org/vocab/#Voltage">
>     ...
>   </div>
> </div>
>
> And would this serve your purpose?
>
>
> On Sat, May 3, 2014 at 11:29 PM, Jarno van Driel <jarno@quantumspork.nl> wrote:
>
>> Forgive me if I misunderstand your point, but doesn't:
>>
>> <div vocab="http://schema.org/" typeof="Product">
>>   ...
>>   <div property="additionalProperty" typeof="PropertyValue http://ex.com/ov_100_250" id="http://ex.com/ov_100_250">
>>     ...
>>   </div>
>> </div>
>>
>> get the same result as:
>>
>> <div itemscope itemtype="http://schema.org/Product">
>>   ...
>>   <div itemprop="additionalProperty" itemscope itemtype="http://schema.org/PropertyValue" itemid="http://ex.com/ov_100_250">
>>     <link itemprop="sameAs" href="http://acme.org/vocab/#Voltage">
>>     ...
>>   </div>
>> </div>
>>
>> Would the @propertyID still be needed then?
>>
>>
>> On Sat, May 3, 2014 at 2:21 PM, Francois-Paul Servant <francoispaulservant@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> what does it take to improve data published using PropertyValue, and to share the enhancements?
>>>
>>> On 2 May 2014 at 22:37, martin.hepp@ebusiness-unibw.org wrote:
>>> <snip>
>>>
>>> Ideal Version: External Property with Qualitative Value
>>>
>>> <div itemscope itemtype="http://schema.org/Product">
>>>   <span itemprop="name">ACME Electric Anvil</span>
>>>   ...
>>>   Operating Voltage: <div itemprop="http://acme.org/vocab/#voltage" itemscope itemtype="http://schema.org/QuantitativeValue">
>>>     <span itemprop="minValue">100</span>-
>>>     <span itemprop="maxValue">220</span>
>>>     <meta itemprop="unitCode" content="VLT"> V
>>>   </div>
>>> </div>
>>>
>>> with this
>>>
>>> Variant 1: Property name instead of URI
>>>
>>> <div itemscope itemtype="http://schema.org/Product">
>>>   <span itemprop="name">ACME Electric Anvil</span>
>>>   <div itemprop="additionalProperty" itemscope itemtype="http://schema.org/PropertyValue">
>>>     <span itemprop="name">Operating Voltage</span>
>>>     <span itemprop="minValue">100</span>-
>>>     <span itemprop="maxValue">250</span>
>>>     <meta itemprop="unitCode" content="VLT"> V
>>>   </div>
>>> </div>
>>>
>>> or this
>>>
>>> Variant 2: Unit as text instead of UN/CEFACT Common Code and range as a single field
>>>
>>> <div itemscope itemtype="http://schema.org/Product">
>>>   <span itemprop="name">ACME Electric Anvil</span>
>>>   <div itemprop="additionalProperty" itemscope itemtype="http://schema.org/PropertyValue">
>>>     <span itemprop="name">Operating Voltage</span>
>>>     <span itemprop="value">100-250</span>
>>>     <span itemprop="unitText">V</span>
>>>   </div>
>>> </div>
>>>
>>> or in worst case this:
>>>
>>> Variant 3: Range and Unit in a joint field
>>>
>>> <div itemscope itemtype="http://schema.org/Product">
>>>   <span itemprop="name">ACME Electric Anvil</span>
>>>   <div itemprop="additionalProperty" itemscope itemtype="http://schema.org/PropertyValue">
>>>     <span itemprop="name">Operating Voltage</span>
>>>     <span itemprop="value">100-250 V</span>
>>>   </div>
>>> </div>
>>>
>>>
>>> It is obvious that the version with a dedicated property URI and a proper http://schema.org/QuantitativeValue node is easier to process.
>>>
>>> But from a data provider's perspective, who typically has the product properties in very light-weight property-value structures, with often proprietary properties, even the step to Variant 1 makes data publication much, much simpler, because he does not have to map the local property name to a standard property URI nor determine the type of the value (quantitative, qualitative, or Boolean). That is VERY difficult from typical Web applications, even if the back-end systems (PDM/PIM) had this additional data.
>>>
>>>
>>>
>>> One interesting exercise is to try to take data published in the non-ideal variants, and to see what it requires to get to the ideal one. With one constraint: we must imagine that there is already a lot of data published in the non-ideal variants, and that we want to lift them without republishing them all. This corresponds to the real situation of a client or a third party who wants to make use of these data and share its results. Or even of the publishing corporation, which may not be able, without a lot of work, to change the whole publishing process as it is (nor, of course, to change anything in what has already been published). Is it possible to publish some extra statements (in an independent, supplementary process) to improve the non-ideal published data?
>>> (In an ideal situation, we publish the data, and we can improve it afterwards).
>>>
>>> Note that a player such as a search engine can quite easily handle the situation: from
>>>   <span itemprop="name">Operating Voltage</span>
>>> it can easily recognize the corresponding http://acme.org/vocab/#voltage property in its "knowledge graph of known entities and properties" and then correctly index the product in question.
>>>
>>> What about the rest of us?
>>>
>>> In the 3 variants that you describe, as they are, I think that there is no way to efficiently publish improved data. One can use NLP techniques to effectively use the data, but he/she cannot easily publish the results.
>>>
>>> The first reason is that the PropertyValue is not identified: in RDF terms, it is a blank node. No way to say something about it (no way to lift it therefore).
>>> So, if I have, for instance, a small program that knows that a unitText of "V" is equivalent to the unitCode "VLT", I can't simply publish something that would lift data published in variant 2 to the level of variant 1.
>>>
>>> On the other hand, if the data had been published using an identifier for the PropertyValues, it would have been possible: if we had for instance published in the first place:
>>>
>>> <div itemscope itemtype="http://schema.org/Product">
>>>   <span itemprop="name">ACME Electric Anvil</span>
>>>   <div itemprop="additionalProperty" itemscope itemtype="http://schema.org/PropertyValue" itemid="http://ex.com/ov_100_250">
>>>     <span itemprop="name">Operating Voltage</span>
>>>     <span itemprop="value">100-250</span>
>>>     <span itemprop="unitText">V</span>
>>>   </div>
>>> </div>
>>>
>>> one could simply state somewhere
>>>   http://ex.com/ov_100_250 schema:unitCode "VLT".
>>>
>>> to improve *all* the descriptions of products published by ex.com that have an operating voltage of 100-250.
>>>
>>> With that, variants 2, 3 4 are basically equivalent: one can use any ML / heuristic technique to do the work, and easily share the results.
>>> The publisher of the "non-ideal" data can keep its systems running as they are, and just publish a small set of triples to improve all the already published and the to-be-published data.
>>>
>>> Now, can we reach the "ideal version" state as easily?
>>>
>>> Yes, but it requires the use of the propertyID property:
>>>   <http://ex.com/ov_100_250> schema:propertyID <http://acme.org/vocab/#voltage>
>>> and to consider that, if the propertyID is the URI of a property, then if
>>>   s additionalProperty pv.
>>>   pv propertyID p.
>>> then
>>>   s p pv.
>>> which is not completely in line with Martin's proposal.
>>>
>>> If this is a problem, there is a variant 0, which is an almost ideal version
>>>
>>> Variant 0: additionalProperty with External Type
>>>
>>> <div itemscope itemtype="http://schema.org/Product">
>>>   <span itemprop="name">ACME Electric Anvil</span>
>>>   ...
>>>   Operating Voltage: <div itemprop="additionalProperty" itemscope itemtype="http://acme.org/vocab/#Voltage" itemid="http://ex.com/ov_100_250">
>>>     <span itemprop="minValue">100</span>-
>>>     <span itemprop="maxValue">220</span>
>>>     <meta itemprop="unitCode" content="VLT"> V
>>>   </div>
>>> </div>
>>> (possibly, add the propertyID to this markup)
>>>
>>> Note BTW that I do not consider the external property pattern as the "ideal version":
>>> - there will never be enough properties in a vocab: we need an "additionalProperty" anyway
>>> - it's sufficient to just define types of features in practical uses: if you say that your product has (="additionalProperty") a given "Voltage", do you really have to say that it "has voltage" the Voltage in question?
>>> - it doesn't work well for "configurations" (partially defined products), cf http://events.linkeddata.org/ldow2013/papers/ldow2013-paper-11.pdf
>>>
>>> But this is another story.
>>>
>>> To summarize:
>>> data published in "non-ideal" versions can be easily enhanced, and the results shared, if and (I think) only if they include URIs for the PropertyValue in the first place. In this case, publishing some statements, independently of the original publishing, can improve a lot of data at once.
>>> The use of URIs for PropertyValues - local ones are fine - should therefore be encouraged.
>>>
>>> (this assumes, of course, that users of the data make use of URIs and conflate statements published about the same URI in two different places. But without that, it's the whole idea of a web of data which is defeated. This may seem obvious, but last time I checked Google's structured data testing tool, it didn't do it even for statements in the same page.)
>>>
>>> fps
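
The lifting step discussed in the quoted message (merge statements published separately about the same PropertyValue URI, then apply the propertyID rule) can be sketched roughly as below. This is only an illustration under stated assumptions, not anyone's actual implementation: triples are plain Python tuples rather than a real RDF store, http://ex.com/anvil is a made-up product URI, and the function name lift is hypothetical.

```python
# Minimal sketch: triples are (subject, predicate, object) tuples of strings.
SCHEMA = "http://schema.org/"
PRODUCT = "http://ex.com/anvil"            # hypothetical product URI (assumption)
PV = "http://ex.com/ov_100_250"            # PropertyValue URI used in the thread
VOLTAGE = "http://acme.org/vocab/#voltage"  # external property URI from the thread

# Statements as originally published (Variant 2 plus an itemid).
published = {
    (PRODUCT, SCHEMA + "additionalProperty", PV),
    (PV, SCHEMA + "name", "Operating Voltage"),
    (PV, SCHEMA + "value", "100-250"),
    (PV, SCHEMA + "unitText", "V"),
}

# Statements published later, independently, about the same URI.
supplement = {
    (PV, SCHEMA + "unitCode", "VLT"),
    (PV, SCHEMA + "propertyID", VOLTAGE),
}


def lift(triples):
    """Apply: s additionalProperty pv . pv propertyID p  =>  s p pv ."""
    # Map each PropertyValue to its propertyID (this sketch keeps one per node).
    property_ids = {s: o for s, p, o in triples if p == SCHEMA + "propertyID"}
    inferred = {
        (s, property_ids[o], o)
        for s, p, o in triples
        if p == SCHEMA + "additionalProperty" and o in property_ids
    }
    return triples | inferred


# Conflating the two sources is a set union, because both use the same URI.
merged = lift(published | supplement)
```

With a blank node instead of PV, the supplement would have nothing to attach to, so neither the unitCode statement nor the inferred product-to-voltage triple could be added from outside.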
Received on Saturday, 3 May 2014 23:22:25 UTC