Re: URIs and Unique IDs

On Wed, Nov 26, 2008 at 5:17 PM, John Graybeal <graybeal@mbari.org> wrote:

> In our research community, on a regular basis you hear how important it is
> to know, as precisely as possible, the meaning of the parameters in a
> historical data collection. Whether or not "sea surface temperature" meant
> the temperature of water "collected somewhere near the surface and brought
> back on board", "measured in situ 1 meter below the surface", or "measured
> by a satellite" can significantly impact the temperature trend of a global
> ocean temperature analysis.
>
> Before you jump:  I appreciate that fundamentally these can be 3 different
> concepts. My observation is that the people defining the terms don't always
> appreciate that; and simply letting a concept evolve, without tracking or
> versioning the evolution, will obviously produce analyses in the future that
> say "We don't know which version of the concept they had in mind when they
> labeled this data value."   Tracking the necessary information to answer
> questions like that is a minimal requirement for supporting historical data
> analyses for environmental science.  For me, that's a decisive argument for
> versioning.

I see this as an argument for better modeling, not versioning. But
first let me see if I understand the scenario.

You want to define sea surface temperature. There are a number of
methods for doing so. You are proposing to have a single class
(relation?) "sea surface temperature" that is versioned as follows:

"sea surface temperature"
  v1: temperature of water "collected somewhere near the surface and
brought back on board"
  v2: measured in situ 1 meter below the surface
  v3: measured by a satellite

Your presumption is that for a while people will use v1, then v2,
then v3, and that you will therefore know what they mean in each
case.

Do I understand this correctly?
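
To make the scenario concrete, here is a rough sketch of how I read it.
Everything in it is hypothetical: the names TermVersion, SST_VERSIONS and
presumed_version are invented for illustration, and the years are
placeholders rather than real adoption dates. It only shows the
presumption that the time of labeling tells you which version was meant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TermVersion:
    """One version of a single term (hypothetical structure)."""
    version: int
    definition: str
    in_use_from: int  # placeholder year, invented for this sketch


# The three definitions come from the message above; the years do not.
SST_VERSIONS = [
    TermVersion(1, "collected somewhere near the surface and brought back on board", 1900),
    TermVersion(2, "measured in situ 1 meter below the surface", 1970),
    TermVersion(3, "measured by a satellite", 1990),
]


def presumed_version(versions, year):
    """Return the latest version already in use in the given year."""
    in_use = [v for v in versions if v.in_use_from <= year]
    if not in_use:
        raise ValueError(f"no version in use in {year}")
    return max(in_use, key=lambda v: v.in_use_from)


if __name__ == "__main__":
    # A value labeled "sea surface temperature" in a (placeholder) year
    v = presumed_version(SST_VERSIONS, 1985)
    print(f"v{v.version}: {v.definition}")
```

The lookup only works if everyone switched versions at the same moment,
which is exactly the presumption I am asking about.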

-Alan

>
> John
>
> On Nov 9, 2008, at 10:41 PM, Alan Ruttenberg wrote:
>
>> On Sun, Nov 9, 2008 at 9:18 PM, Peter Ansell <ansell.peter@gmail.com>
>> wrote:
>>>
>>> ----- "Alan Ruttenberg" <alanruttenberg@gmail.com> wrote:
>>>>
>>>> The OBO ontologies are moving towards *all* URI being numeric id based
>>>> for this reason (until recently it had only been classes that were
>>>> named that way).
>>>
>>> How will people using OBO ever be sure that a term they use, believing
>>> it has no far-reaching consequences (like the broader->broaderTransitive
>>> difference), won't turn out later to have changed and influenced their
>>> results, when someone could reasonably have determined that the nature
>>> of the term had changed and that it needed a new number/name/URI/UID?
>>> I do recognise that whenever any property attached to a term changes,
>>> technically there could be a difference in the results of some
>>> application utilising the data, but falling back on saying that things
>>> always just migrate on the spot isn't a suitable solution either, IMO.
>>
>> Nobody can be sure of anything. However their policy has been arrived
>> at over many years of practice of arguably the most successful
>> collaboratively built ontology in history. If I had to make a wager, I
>> wouldn't bet against the solution they've come up with without a
>> really good case for it.
>>
>> <snip>
>> Bottom line is that there is a decent amount of experience that leads
>> to a conclusion of being very hesitant before changing ids. If you
>> have some experience to share that demonstrates otherwise I'm very
>> interested in hearing the specifics. I think we could do with more
>> case studies and fewer first principles here.
>>
>> Regards,
>>
>> -Alan
>>
>
>
>

Received on Wednesday, 26 November 2008 22:51:51 UTC