Re: Question about identifiers

IMHO there is a basic principle that neatly resolves this: identifiers are
generated by a registration process. If you accept that something is an
identifier, you are essentially assuming that its minting process is a
registration process, i.e. you are subscribing to that governance.
Registry practices are quite well established, and we should point people
to these. They include things like not reusing identifiers, version
handling, etc.

If a dataset does not conform to the principles of registration then it is
not suitable as a source of concept identifiers. For example, a spatial
dataset whose object ids change with every version may be used as a
resource, and its features may have URLs, but such URLs must not be used as
URIs (i.e. as persistent identifiers). It is necessary to put a redirect
from a more stable identifier set to the resource du jour.
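
A minimal sketch of such a redirect layer, assuming a hypothetical lookup
table and made-up example.org URIs (none of these names come from a real
service). The stable identifier stays fixed and only its target is
repointed when a new dataset version is published:

```python
# Hypothetical example data: each stable identifier maps to whatever URL
# the current dataset version exposes for the same feature.
CURRENT_LOCATION = {
    "https://example.org/id/feature/river-42":
        "https://example.org/data/v7/objects/9913",
}


def resolve(stable_uri):
    """Return an HTTP-style (status, location) pair for a stable identifier."""
    target = CURRENT_LOCATION.get(stable_uri)
    if target is None:
        return (404, None)
    # 303 See Other: the identifier names the thing, the target describes it.
    return (303, target)


def publish_new_version(stable_uri, new_url):
    """Repoint a stable identifier after a release; the URI itself never changes."""
    if stable_uri not in CURRENT_LOCATION:
        raise KeyError(f"unknown identifier: {stable_uri}")
    CURRENT_LOCATION[stable_uri] = new_url
```

Consumers only ever store the /id/ URI; the versioned /data/ URL is free to
change with each release.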

Within a registration paradigm, the URI pattern is a simple registry
delegation model: an item lives within a register, whose URI is the part of
the item's URI to the left of the last /. Registers may be nested, in the
same way that subregisters may be items in a register. Register URIs should
be dereferenceable to get metadata about the governance process and the
type of object in the register.
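
As an illustration of the delegation model, here is a small sketch that
derives the chain of containing registers from an item URI by repeatedly
dropping the last path segment. The function name and the example.org URI
are assumptions made for this example only:

```python
from urllib.parse import urlsplit, urlunsplit


def register_chain(item_uri):
    """List the registers that contain item_uri, innermost first."""
    parts = urlsplit(item_uri)
    segments = [s for s in parts.path.split("/") if s]
    chain = []
    # Each shorter path prefix is treated as a containing register.
    for depth in range(len(segments) - 1, -1, -1):
        path = "/" + "/".join(segments[:depth])
        chain.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return chain


for register in register_chain("https://example.org/def/region/north/site-7"):
    print(register)
```

Under the paradigm described above, each URI printed should itself be
dereferenceable and return the governance metadata for that register.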

Thus, hierarchies made this way are stable. If governance of the set of
items changes, then new identifiers must be minted and a reference to the
old identifiers should be included.
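
A toy sketch of those two registry rules (no reuse of identifiers, and a
recorded link from new to old). The Register class, its field names and the
status values are hypothetical, not taken from any particular registry
implementation:

```python
class Register:
    """Toy register: identifiers are never reused and supersession is recorded."""

    def __init__(self, base_uri):
        self.base_uri = base_uri.rstrip("/")
        self.items = {}  # local id -> status and supersession links

    def mint(self, local_id, supersedes=None):
        """Issue a new identifier, optionally replacing an existing one."""
        if local_id in self.items:
            # Superseded ids stay in the register, so this also blocks reuse.
            raise ValueError(f"{local_id} was already issued and cannot be reused")
        if supersedes is not None:
            old = self.items[supersedes]  # KeyError if the old id is unknown
            old["status"] = "superseded"
            old["superseded_by"] = local_id
        self.items[local_id] = {
            "status": "valid",
            "supersedes": supersedes,
            "superseded_by": None,
        }
        return f"{self.base_uri}/{local_id}"
```

The old entry is kept rather than deleted, so a lookup of the old
identifier can still report what replaced it.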

The UK examples conform to this pattern, although they seem to have
converged on it rather than having started from a registration perspective.
WMO practice at codes.wmo.int formalises this more explicitly.

Rob


On Sat, 20 Aug 2016 at 00:09 Joshua Lieberman <jlieberman@tumblingwalls.com>
wrote:

> I’m sorry — or not — to have kicked off this identifier structure debate,
> but it’s an important one to have. It’s easy in a way just to say that URL
> identifiers should carry no meaning for maximum flexibility, but in almost
> all practice they are used in meaningful ways. I am also part of the
> specifying minority (but the URL minting and parsing majority) that feels it
> is done anyway and carries undeniable advantages, so let’s figure out ways
> and means for it.
>
> There are several reasons why it is useful to have agreed URL structures.
> We should note first of all that the host domain name is an important part
> of the meaning context and authenticity for a URL. The value of using HTTPS
> is not just encryption but also having the identity of the URL resolver
> confirmed by a PKI certificate. Both the domain name hierarchy and the URL
> path hierarchy can also support meaningful uniqueness of identifiers. They
> address a problem with UUIDs, namely that it is easy to make too many unique
> identifiers, rather than not enough. The hierarchical structures help a lot
> with figuring out what identifiers may actually have been minted to refer
> to the same things. They also help with determining the authority for
> making and maintaining identifiers, on both inter-organizational and
> intra-organizational levels.
>
> On a level of taxonomic meaning, the stability and/or uniqueness of a
> classification may indeed be questionable, as Frans and others have pointed
> out. Certainly many taxonomic identifier systems have gone from hierarchy
> to sequential primary identifiers as classifications have evolved with
> on-going research. I’m still in the smaller minority that thinks
> dereferencing every URL to see what its classification might be is more
> work than necessary. Many classification schemes are quite stable or only
> slowly changing. I feel it is also acceptable, where needed, to have
> redirections from URLs representing a current, historical or even
> alternative classification to the same normative or informative material
> that an authoritative URL links to.
>
> On the other hand, we worry about the “semantics” of URLs versus other
> means, but the formal semantics of an entity are expressed as logical
> relationships to other entities (at least in predicate logic). If a
> substantial portion of those relationships form a hierarchical structure of
> like entities, then a hierarchical URL can be a real form of identity, not
> just a convenience.
>
> So my recommendation is to support a practice of identifiers being
> structured minimally at least for purposes of authority and uniqueness. I
> also recommend considering taxonomic meaning for primary or secondary
> identifiers where the taxonomy is relatively stable and/or an integral part
> of the definition of the identified entity. I’ll have time next week to
> contribute something to this effect to the BP.
>
> —Josh
>
>
>
> On Aug 19, 2016, at 8:44 AM, Ed Parsons <eparsons@google.com> wrote:
>
> Currently a human-readable pattern would not help in terms of crawling...
> however I still maintain my (minority) view that, as a method of expressing
> current and past geographic hierarchies, such URI schemes could be useful.
>
> ed
>
>
> On Fri, 19 Aug 2016 at 13:05 Frans Knibbe <frans.knibbe@geodan.nl> wrote:
>
>> On 19 August 2016 at 12:10, Ed Parsons <eparsons@google.com> wrote:
>>
>>> So perhaps best practice is to update the resource at the old URI to
>>> point to the new one ?
>>>
>>
>> That is a possibility, but it would be messy. For individual resources
>> redirection would have to be set up. That means high maintenance costs and
>> a high risk of mistakes. And still there would be the risk of
>> misinterpretation. A human consumer could interpret the first URI
>> encountered without following it to an alternative URI, still leading to
>> false data.
>>
>> But what would be the point anyway? If a path in the URI like
>> /{municipality}/{quarter}/{neighbourhood} is for human consumption only
>> it is not that valuable, I think, assuming that most people don't read URIs.
>>
>> The only reason I can think of to want a hierarchical path in a URI
>> is if web crawlers are known to parse the URI strings themselves (in
>> addition to the URI payload). That could in theory lead to improved
>> discoverability of resources. I wonder if that actually happens... Perhaps
>> Ed knows how the Google crawlers behave in that respect? Or would that be
>> sharing trade secrets?
>>
>> Regards,
>> Frans
>>
>>
>>
>>> Ed
>>>
>>>
>>> On Fri, 19 Aug 2016 at 11:03 Frans Knibbe <frans.knibbe@geodan.nl>
>>> wrote:
>>>
>>>> On 19 August 2016 at 11:11, Linda van den Brink <
>>>> l.vandenbrink@geonovum.nl> wrote:
>>>>
>>>>> Yes…  it is generally easier to make meaningless IDs persistent. But
>>>>> it is nice to have URIs that are human readable. In the Dutch URI strategy
>>>>> we do advise having human-readable parts in the URI scheme, but say that
>>>>> officially these mean nothing i.e. we say it is extremely ill-advised to
>>>>> ascribe any meaning to {concept} **for the machine**. URIs are opaque
>>>>> in a technical sense. Meanwhile, however, they do give hints to human
>>>>> readers.
>>>>>
>>>>
>>>> Then how can you tell humans that they can interpret the URI and tell
>>>> machines that they should not? Is there a mechanism for doing that?
>>>>
>>>> Greetings,
>>>> Frans
>>>>
>>>>
>>>>>
>>>>>
>>>>> *From:* Ed Parsons [mailto:eparsons@google.com]
>>>>> *Sent:* Friday, 19 August 2016 11:02
>>>>> *To:* Frans Knibbe; SDW WG (public-sdw-wg@w3.org)
>>>>> *CC:* Linda van den Brink; Joshua Lieberman (
>>>>> jlieberman@tumblingwalls.com); Byron Cochrane
>>>>> *Subject:* Re: Question about identifiers
>>>>>
>>>>>
>>>>>
>>>>> While I accept the current view that URI schemes have no explicit
>>>>> meaning, I do see great value in the
>>>>> /{municipality}/{quarter}/{neighbourhood} pattern as a simple way of
>>>>> expressing geographical hierarchy independent of geometry... What's the
>>>>> worst that could happen?
>>>>>
>>>>>
>>>>>
>>>>> Ed
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Fri, 19 Aug 2016 at 09:30 Frans Knibbe <frans.knibbe@geodan.nl>
>>>>> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>>
>>>>>
>>>>> A prime requirement of good URI minting is to not put any meaning in
>>>>> the URI, at least no meaning that is somehow intended for consumers.
>>>>> Everything that needs to be said about a resource, like its membership of
>>>>> data collections or its versioning, can be said in the data that is
>>>>> returned when the URI is dereferenced.
>>>>>
>>>>>
>>>>>
>>>>> URI schemes like /{municipality}/{quarter}/{neighbourhood} could be
>>>>> dangerous, because consumers could inadvertently try to derive meaning from
>>>>> such a URI. The usefulness of such a scheme in URI minting is also
>>>>> doubtful, because administrative structure can change over time. That
>>>>> could complicate the URI minting procedures.
>>>>>
>>>>>
>>>>>
>>>>> I do wonder to what extent common web crawlers try to parse URIs and
>>>>> attach meaning to URI parts.
>>>>>
>>>>>
>>>>>
>>>>> Regards,
>>>>>
>>>>> Frans
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 18 August 2016 at 22:55, Byron Cochrane <bcochrane@linz.govt.nz>
>>>>> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I generally like the guidance in the URI-Strategy under Hierarchical
>>>>> URIs, but have some reservations about this intelligent identifiers
>>>>> approach.
>>>>> For metadata access I think it is a good thing.  Most metadata for an
>>>>> individual feature will usually reside at the dataset or collection
>>>>> (better term) level.  This hierarchical approach makes this metadata easy
>>>>> to access.
>>>>>
>>>>> But this built-in intelligence makes the permanence of the URIs more
>>>>> difficult.  For example, administrative boundaries change through mergers
>>>>> and annexations.  A spatial thing that was in one collection is now in
>>>>> another.  The URIs for these things then confuse more than help.  URI
>>>>> redirects are one way to deal with this, but perhaps tracking these
>>>>> relationships through ontology properties such as skos:broader and
>>>>> skos:narrower is the better practice?
>>>>>
>>>>> No answers from me here, just questions.
>>>>>
>>>>> Cheers,
>>>>> Byron
>>>>>
>>>>> ________________________________________
>>>>> From: Linda van den Brink [l.vandenbrink@geonovum.nl]
>>>>> Sent: Thursday, August 18, 2016 8:28 PM
>>>>> To: Joshua Lieberman (jlieberman@tumblingwalls.com)
>>>>> Cc: SDW WG (public-sdw-wg@w3.org)
>>>>> Subject: Question about identifiers
>>>>>
>>>>> Hi Josh,
>>>>>
>>>>> Coming back to the telecon yesterday:
>>>>>
>>>>>
>>>>> <joshlieberman> Should identifiers be part of a system for the
>>>>> features of interest?
>>>>>
>>>>> joshlieberman: making identifiers part of a system, where the features
>>>>> are part of the system?
>>>>> ... for example corresponding to paths in a taxonomy
>>>>>
>>>>> Linda: no answer right now, will have to think about it
>>>>>
>>>>> Were you talking about recommending some system for creating HTTP URI
>>>>> identifiers, i.e. some sort of URI strategy or pattern? Specifically where
>>>>> the features can be organised into some system like a hierarchy, as with
>>>>> administrative regions? There are some examples from Geonovum's testbed at
>>>>> https://github.com/geo4web-testbed/topic3/wiki/URI-Strategy under
>>>>> Hierarchical URIs.
>>>>>
>>>>> Just trying to understand what you mean… we could add some guidance to
>>>>> the BP about this. I think that would be helpful.
>>>>>
>>>>> Linda
>>>>>
>>>>> ______________________________________
>>>>> Geonovum
>>>>> Linda van den Brink
>>>>> Adviseur Geo-standaarden
>>>>>
>>>>> a: Barchman Wuytierslaan 10, 3818 LH Amersfoort
>>>>> p: Postbus 508, 3800 AM Amersfoort
>>>>> t:  + 31 (0)33 46041 00
>>>>> m: + 31 (0)6 1355 57 92
>>>>> e:  l.vandenbrink@geonovum.nl
>>>>> i:  www.geonovum.nl
>>>>> tw: @brinkwoman
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> *Ed Parsons *FRGS
>>>>> Geospatial Technologist, Google
>>>>>
>>>>> Google Voice +44 (0)20 7881 4501
>>>>> www.edparsons.com @edparsons
>>>>>

Received on Saturday, 20 August 2016 04:32:47 UTC