Re: Sustainable Codes vs Volatile URIs Re: URIs / Ontology for Physical Units and Quantities

From: Phil Archer <phila@w3.org>
Date: Thu, 07 May 2015 12:00:25 +0100
Message-ID: <554B45C9.8080102@w3.org>
To: Bernard Vatant <bernard.vatant@mondeca.com>, Mark Harrison <mark.harrison@cantab.net>
CC: W3C Web Schemas Task Force <public-vocabs@w3.org>
Let me begin by taking issue with the claim that URIs are volatile.

Some are, yes.

Some are not.

http://www.w3.org/1999/02/22-rdf-syntax-ns, for example, is not volatile.

If you set up a Web site or service with the specific aim of making it
persistent, it will be. The difference between temp.com and purl.org
lies only in the intention behind them, not in the architecture.

W3C would love to host a system where vocabularies could be developed 
GitHub-style, complete with guarantees of persistence. It's only money 
that stops us doing it. You want to build that on w3.org? Please let me 
know - and we can talk about and publish clear statements about what 
happens when the money runs out and we host a static copy.

Meanwhile, anyone can use our Community group system now and develop and 
maintain vocabularies for which you can have a w3.org/ns namespace if 
desired.

And if you have a vocabulary you'd like us to host, again, please talk 
to me.

A few extra comments inline below.

On 07/05/2015 11:17, Bernard Vatant wrote:
> Hi Mark
>
> 2015-05-07 11:10 GMT+02:00 Mark Harrison <mark.harrison@cantab.net>:
>
>> Dear Bernard,
>>
>> Just to respond to your example, that is probably an acceptable approach
>> provided that both resources sharing that string code do so via a
>> well-defined property that has an inverse-functional relationship (i.e.
>> only one Subject is allowed to have that Value), much like a social
>> security number.
>
>
> Yes and no :)
> Yes for the use of a shared property in a shared stable vocabulary, or even
> equivalent properties in separate vocabularies.
> But definitely no for the inverse-functional relationship. The weak
> semantics of a code implies that it does not commit to any ontological
> assumption of whatever the code denotes, and in particular if it denotes a
> single entity. In the case of a city code, one can consider the city as a
> geographical entity, a surface delimited by a polygon, a minimal and
> maximal altitude etc, and another as a populated place with a population at
> date X, and yet another one as an administrative subdivision with its
> parent territory etc. Those three representations will have different URIs
> and different descriptions,

Yes, and there might be significant differences in any of them over 
time. City names change, boundaries change and so on. I spent time 
recently with someone who had lived in 5 different countries, even 
though he'd lived in the same place all his life (Belgrade).


> and inferring they are the same based on an
> inverse functional property is likely to entail inconsistent
> representations.

True.
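
To make that risk concrete, here is a minimal sketch in plain Python (no
RDF library). It assumes three descriptions sharing INSEE code "21231".
The first two URIs are the ones Bernard uses in his example quoted
further down; the example.org URI and all three "kind" labels are
illustrative assumptions, not statements about what those resources
actually say.

```python
from collections import defaultdict

# Three descriptions sharing one code. The first two URIs come from the
# thread; the third URI and all "kind" labels are illustrative assumptions.
DESCRIPTIONS = {
    "http://id.insee.fr/geo/commune/21231": {
        "inseeCode": "21231", "kind": "administrative subdivision"},
    "http://dbpedia.org/resource/Dijon": {
        "inseeCode": "21231", "kind": "populated place"},
    "http://example.org/geo/dijon-polygon": {
        "inseeCode": "21231", "kind": "geographical surface"},
}


def group_by_code(descriptions):
    """Loose coupling: index URIs by code, leaving each description intact."""
    index = defaultdict(list)
    for uri, props in descriptions.items():
        index[props["inseeCode"]].append(uri)
    return dict(index)


def merge_as_inverse_functional(descriptions):
    """Treat the code as inverse functional: same code means same resource.

    All properties of the merged resources are pooled, so a property that
    should be single-valued may end up with several conflicting values.
    """
    merged = defaultdict(lambda: defaultdict(set))
    for props in descriptions.values():
        for key, value in props.items():
            merged[props["inseeCode"]][key].add(value)
    return {code: dict(pooled) for code, pooled in merged.items()}


linked = group_by_code(DESCRIPTIONS)
merged = merge_as_inverse_functional(DESCRIPTIONS)
conflicts = {k: v for k, v in merged["21231"].items() if len(v) > 1}
print(len(linked["21231"]), "descriptions share the code")
print("conflicting properties after merge:", sorted(conflicts))
```

Grouping by the code keeps three separate descriptions that can still be
found together; merging them under an inverse-functional reading leaves
one resource with three competing values for the same property.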


> The bottom line of this, and I'm aware that I am in vehement disagreement
> with many people around here, is that a URI does not identify an entity,
> but a representation.

Please let's not get into httpRange-14.

> And a shared code is just a shared key, agnostic on the
> ontological status of its referent.

So is a URI. It's a dumb string that has the property that you can look 
it up and find out what it identifies, unlike codes that have no such 
functionality.
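
As a rough illustration of that "look it up" property, the sketch below
prepares an HTTP request with content negotiation using only Python's
standard library. The helper names and the Accept header value are my
own choices, and whether a given server actually answers with Turtle or
RDF/XML is an assumption about that server. Nothing is sent over the
network unless dereference() is called.

```python
import urllib.request

# The namespace URI cited earlier in this message.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns"


def build_lookup(uri, accept="text/turtle, application/rdf+xml;q=0.9"):
    """Prepare (but do not send) a GET request asking for an RDF description."""
    return urllib.request.Request(uri, headers={"Accept": accept}, method="GET")


def dereference(uri, timeout=10):
    """Send the request and return (content type, body bytes). Needs network."""
    with urllib.request.urlopen(build_lookup(uri), timeout=timeout) as response:
        return response.headers.get("Content-Type"), response.read()


request = build_lookup(RDF_NS)
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

A bare code string offers nothing to put in that request: you first need
to know, out of band, which authority publishes the code list.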

>
>
>> In that case, it's reasonable to infer that the two resources are the
>> same.
>
>
> Which is leading you dangerously closer to a semantic black hole horizon.
>
>
>> However, there are several 5-character codes in circulation, whether CAGE
>> / NCAGE codes, US 5-digit zip codes or INSEE codes - so it's essential to
>> unambiguously specify explicitly what the code represents

I see you have a list of codes a bit like mine; we should align our
systems! (By which I mean, you should delete yours and use mine.)

Ain't going to happen.

That'll do for now.

Phil.

>
>
> This is simply an impossible task. You share a code, but views on what this
> code denotes, implemented as different URIs, can be different. And that
> should not be an issue.
> If you ask me, the whole semantic enterprise will fail as long as this
> point has not been widely understood. I seem to be very abrupt here, but
> this is my conclusion after about 15 years munching on those issues, in
> theory and in practice ...
>
>
>> and whether the relationship is inverse functional.  If that is not
>> specified in a machine-interpretable manner, we all lose efficiency because
>> each responsible developer must verify that relationship manually before
>> making that assumption.
>>
>> The major downside of bare code strings vs URIs is that it's not
>> immediately obvious where to go to find information - you can't simply make
>> a web request and reasonably hope to find a definition or other
>> relationships.  Of course, as Martin points out, we need a stable
>> foundation, which for Linked Data means stable URIs and a commitment to
>> maintain resources and web vocabularies for the common good, within a
>> framework that does not allow them to collapse or wither if one committed
>> individual leaves or is run over by a bus.
>>
>> Best wishes,
>>
>> - Mark
>>
>>
>> On 7 May 2015, at 09:36, Bernard Vatant <bernard.vatant@mondeca.com>
>> wrote:
>>
>>> Dear all
>>>
>>> This issue has been surfacing again and again lately, and I would like
>> to support Martin. I've already pushed this viewpoint here and there, I
>> understand the reaction of "orthodox" linked data supporters for whom
>> "things must be identified by URIs", period. But to put it bluntly, in many
>> cases, well-maintained codes for standardized identities (languages,
>> countries, towns, units ...) are more sustainable ways to share identities
>> than URIs, for the obvious reasons given by Martin (URIs are volatile) plus
>> three other ones at least.
>>>
>>> - Codes are not tied to any technical architecture, they can be used and
>> exchanged across any information system, not only the Web (semantic or
>> not). They allow one to "weave beyond the Web" [1] any kind of data using them.
>>>
>>> - Codes have minimal semantics (if any), they just carry shared
>> identities, and that's great. Different data publishers can propose
>> different representations, identified by different URIs, and sharing the
>> same standard code. The sharing of a code via a common property/value pair
>> is the best way to provide loose coupling between those entities without
>> engaging into the neverending ontological and technical debate of knowing
>> if those representations represent the same/similar/equivalent thing(s),
>> and catastrophic chaining triggered by such hazardous equivalences.
>>>
>>> Let me take just one example. Is it not safer to tie
>> http://id.insee.fr/geo/commune/21231 to http://dbpedia.org/resource/Dijon
>> by the common value of INSEE code "21231" (standardized by INSEE) than to
>> rely on cascading sameAs leading to the stupid semantic black hole at
>>> http://sameas.org/html?uri=http%3A%2F%2Fdbpedia.org%2Fresource%2FDijon
>> which is the patent proof of the failure of a dogmatic and positivist use
>> of URIs?
>>>
>>> [1] http://bvatant.blogspot.fr/2015/04/weaving-beyond-web.html
>>>
>>>
>>> 2015-05-07 0:31 GMT+02:00 martin.hepp@ebusiness-unibw.org <
>> martin.hepp@ebusiness-unibw.org>:
>>> The problem is not the one time generation. The problems are as follows:
>>>
>>> 1. Copyright - Are you allowed to republish the code set as RDF?
>>> 2. Sustainability - Are you committed to keeping the URIs dereferenceable, or
>> will some domain grabber take the domain name once the creator has
>> completed his/her PhD and lost interest?
>>> 3. Updates - Will you keep the RDF version in sync whenever the standard
>> changes?
>>>
>>> Unless there is a clear "yes" to all three questions, it is better to
>> use the official codes than derived URIs.
>>>
>>> Martin
>>>
>>>
>>>
>>>> On 06 May 2015, at 23:56, Wes Turner <wes.turner@gmail.com> wrote:
>>>>
>>>> How much time do you think it would take to generate RDF (and
>> namespaced URIs) from the linked spreadsheet?
>>>>
>>>> Mappings to/from UN/CEFACT codes (as owl:sameAs mappings to strings)
>> could certainly be useful.
>>>>
>>>> On May 6, 2015 4:31 PM, "martin.hepp@ebusiness-unibw.org" <
>> martin.hepp@ebusiness-unibw.org> wrote:
>>>> I think a validator should simply use the list of valid codes from the
>> most recent UN/CEFACT document (available as MS Excel from
>> http://www.unece.org/cefact/codesfortrade/codes_index.html).
>>>>
>>>> There might be unit of measurement ontologies out there that hold the
>> UN/CEFACT Common Code string for a subset of all units as a literal value.
>> But for validation, one should use the authoritative list from the Excel
>> files (since they are updated from time to time).
>>>>
>>>> URIs are not better than strings for validation, because URIs are
>> strings.
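
A validator along those lines can be sketched in a few lines of Python.
It assumes the code list has already been exported from the UN/CEFACT
spreadsheet to CSV with a column named CommonCode; that column name, the
export step and the two-row sample are assumptions for illustration, not
the layout of the official file. SEC and LTR are the two codes mentioned
elsewhere in this thread.

```python
import csv
import io

# Stand-in for a CSV export of the official spreadsheet. The column names
# and this two-row sample are assumptions, not the real file layout.
SAMPLE_EXPORT = """CommonCode,Name
SEC,second
LTR,litre
"""


def load_codes(handle, column="CommonCode"):
    """Read the set of valid codes from an open CSV file or file-like object."""
    codes = set()
    for row in csv.DictReader(handle):
        code = row[column].strip()
        if code:
            codes.add(code)
    return codes


def is_valid_unit(code, valid_codes):
    """Plain, case-sensitive string membership test; no URI machinery needed."""
    return code in valid_codes


codes = load_codes(io.StringIO(SAMPLE_EXPORT))
for candidate in ("LTR", "ltr", "XYZ123"):
    print(candidate, is_valid_unit(candidate, codes))
```

Re-running load_codes against a fresh export is all it takes to follow
an update of the list; the check itself never changes.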
>>>>
>>>> Best wishes / Mit freundlichen Grüßen
>>>>
>>>> Martin Hepp
>>>>
>>>> -------------------------------------------------------
>>>> martin hepp
>>>> e-business & web science research group
>>>> universitaet der bundeswehr muenchen
>>>>
>>>> e-mail:  martin.hepp@unibw.de
>>>> phone:   +49-(0)89-6004-4217
>>>> fax:     +49-(0)89-6004-4620
>>>> www:     http://www.unibw.de/ebusiness/ (group)
>>>>           http://www.heppnetz.de/ (personal)
>>>> skype:   mfhepp
>>>> twitter: mfhepp
>>>>
>>>> Check out GoodRelations for E-Commerce on the Web of Linked Data!
>>>> =================================================================
>>>> * Project Main Page: http://purl.org/goodrelations/
>>>>
>>>>
>>>>
>>>>
>>>>> On 06 May 2015, at 20:34, Wes Turner <wes.turner@gmail.com> wrote:
>>>>>
>>>>> Thanks!
>>>>>
>>>>> I notice that with QUDT there are SI conversion factors and complete
>> URIs for each unit.
>>>>>
>>>>> Is there a schema for validation of "schema:QuantitativeValue
>> supports all UN/CEFACT Common Codes"?
>>>>>
>>>>> (A similar quandary as with MedicalCode; where URI namespaces (like
>> icd10:) would be more helpful for terminological validation and
>> disambiguation than plain string keys)
>>>>>
>>>>> On May 6, 2015 4:26 AM, "martin.hepp@ebusiness-unibw.org" <
>> martin.hepp@ebusiness-unibw.org> wrote:
>>>>>>
>>>>>> Hi Wes,
>>>>>> sorry for a very late reply:
>>>>>>
>>>>>> Actually you could easily use schema:QuantitativeValue for both
>> time and volume, with SEC as the unit code for t and LTR as the unit code
>> for liters, and link both via schema:valueReference, or better, a
>> sub-property thereof (declared with rdfs:subPropertyOf).
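
For the table in Wes's question, that suggestion might look like the
following sketch, which builds JSON-LD with Python's json module.
QuantitativeValue, value, unitCode and valueReference are schema.org
terms; attaching the time to each volume through valueReference is one
reading of the advice above, and the helper names are hypothetical.

```python
import json

# Rows (t, volume in liters) copied from the table quoted in this thread.
ROWS = [(1, 1), (2, 0.7), (3, 0.5), (4, 0.3), (5, 0.1)]


def quantitative_value(value, unit_code, reference=None):
    """Build one schema:QuantitativeValue node as a plain dict."""
    node = {"@type": "QuantitativeValue", "value": value, "unitCode": unit_code}
    if reference is not None:
        node["valueReference"] = reference
    return node


def row_to_node(t, volume):
    """Volume in liters (LTR), qualified by the time in seconds (SEC)."""
    return quantitative_value(volume, "LTR", quantitative_value(t, "SEC"))


document = {
    "@context": "http://schema.org/",
    "@graph": [row_to_node(t, volume) for t, volume in ROWS],
}
print(json.dumps(document, indent=2))
```

The unit travels as a plain UN/CEFACT code string in unitCode, so no
unit URI has to stay dereferenceable for the data to remain usable.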
>>>>>>
>>>>>> For the principle, see
>>>>>>
>>>>>>
>> http://wiki.goodrelations-vocabulary.org/Documentation/Structured_values_and_value_references
>>>>>>
>>>>>>
>>>>>> schema:QuantitativeValue supports all UN/CEFACT Common Codes for
>> units, which should cover all you need:
>>>>>>
>>>>>>
>>>>>>
>> http://wiki.goodrelations-vocabulary.org/Documentation/UN/CEFACT_Common_Codes
>>>>>>
>>>>>> (Mind the full list in the public Excel files, the page just
>> highlights a small subset.)
>>>>>>
>>>>>> Best wishes / Mit freundlichen Grüßen
>>>>>>
>>>>>> Martin Hepp
>>>>>>
>>>>>>
>>>>>>> On 01 May 2015, at 13:45, ☮ elf Pavlik ☮ <
>> perpetual-tripper@wwelves.org> wrote:
>>>>>>>
>>>>>>> Hi Wes,
>>>>>>>
>>>>>>> On 01/26/2014 07:20 AM, Wes Turner wrote:
>>>>>>>> Say I am trying to share a tabular dataset. [1] There's
>> metadata for
>>>>>>>> the Dataset, and there's metadata for the particular columns
>> (which
>>>>>>>> applies to the particular data items).
>>>>>>>>
>>>>>>>> For example:
>>>>>>>>
>>>>>>>> t   volume (liters)
>>>>>>>> -----------------
>>>>>>>> 1  1
>>>>>>>> 2  0.7
>>>>>>>> 3  0.5
>>>>>>>> 4  0.3
>>>>>>>> 5  0.1
>>>>>>>>
>>>>>>>> Questions
>>>>>>>> ===========
>>>>>>>> # Is there (a good) way to specify these units and quantities
>> (in
>>>>>>>> addition to XSD datatypes)?
>>>>>>> You might like to check out
>>>>>>> * https://iotdb.org/pub/iot-unit.html
>>>>>>>
>>>>>>> Cheers!
>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>
>>>
>>>
>>>
>>>
>>> --
>>> Bernard Vatant
>>> Vocabularies & Data Engineering
>>> Tel :  + 33 (0)9 71 48 84 59
>>> Skype : bernard.vatant
>>> http://google.com/+BernardVatant
>>> --------------------------------------------------------
>>> Mondeca
>>> 35 boulevard de Strasbourg 75010 Paris
>>> www.mondeca.com
>>> Follow us on Twitter : @mondecanews
>>> ----------------------------------------------------------
>>
>>
>
>

-- 


Phil Archer
W3C Data Activity Lead
http://www.w3.org/2013/data/

http://philarcher.org
+44 (0)7887 767755
@philarcher1
Received on Thursday, 7 May 2015 11:00:32 UTC
