Re: Question about identifiers from Rob Atkinson on 2016-08-22 (public-sdw-wg@w3.org from August 2016)

From: Rob Atkinson <rob@metalinkage.com.au>
Date: Mon, 22 Aug 2016 00:45:10 +0000
To: Simon.Cox@csiro.au, rob@metalinkage.com.au, jlieberman@tumblingwalls.com, eparsons@google.com
Cc: frans.knibbe@geodan.nl, public-sdw-wg@w3.org
Message-ID: <CACfF9Lxb3b_OmczeR+rZvjyBbMaY6N=nNRoH8N3nrkNuKNVo_w@mail.gmail.com>
And hence governance arrangements are the key thing to work through, as
the UK did in establishing a policy framework. This in turn provides a driver for
leveraging the vendor- and jurisdiction-neutral processes of SDOs
(standards development organisations) wherever possible. Other governance
arrangements are "buyer beware" - not a problem as long as your
implementation strategy accepts and accounts for the potentially limited
lifespan of whatever you choose.

Rob



On Mon, 22 Aug 2016 at 10:10 <Simon.Cox@csiro.au> wrote:

> Just saw that Linda made similar point in a branch to this thread.
>
>
>
> *From:* Simon.Cox@csiro.au [mailto:Simon.Cox@csiro.au]
> *Sent:* Monday, 22 August 2016 9:40 AM
> *To:* rob@metalinkage.com.au; jlieberman@tumblingwalls.com;
> eparsons@google.com
> *Cc:* frans.knibbe@geodan.nl; public-sdw-wg@w3.org
> *Subject:* [ExternalEmail] RE: Question about identifiers
>
>
>
> And joining these thoughts together,
>
> -          if URIs are assigned by a registration process, and
>
> -          if the registrar uses a hierarchical path to manage governance
> (including maintaining uniqueness)
>
> -          then the URI will reflect the governance arrangement at the
> time of registration.
>
> This might mean that the original identifier for a thing does not reflect
> some future governance arrangement.
>
> At which time there are two options:
>
> (i)                 keep the original identifier
>
> (ii)               make a new registration and mark the original
> identifier ‘superseded’ by the new one.
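
As an illustration of option (ii), here is a minimal sketch assuming a simple in-memory register. The `Register` and `Entry` classes, their method names and the example.org URIs are all hypothetical and are not taken from any existing registry software.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Entry:
    uri: str
    status: str = "valid"                 # "valid" or "superseded"
    superseded_by: Optional[str] = None   # URI of the replacement, if any


@dataclass
class Register:
    entries: Dict[str, Entry] = field(default_factory=dict)

    def register(self, uri: str) -> Entry:
        # Registry practice: an identifier is never reused once minted.
        if uri in self.entries:
            raise ValueError(f"{uri} is already registered")
        entry = Entry(uri)
        self.entries[uri] = entry
        return entry

    def supersede(self, old_uri: str, new_uri: str) -> Entry:
        # Option (ii): make a new registration and mark the original
        # identifier as superseded. The original entry is kept, not deleted.
        old = self.entries[old_uri]
        new_entry = self.register(new_uri)
        old.status = "superseded"
        old.superseded_by = new_uri
        return new_entry

    def current(self, uri: str) -> str:
        # Follow the supersession chain to the identifier in force now.
        while self.entries[uri].superseded_by is not None:
            uri = self.entries[uri].superseded_by
        return uri
```

Under option (i) nothing changes in the register; under option (ii) a consumer holding the old identifier can still look it up and be led to its successor.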
>
>
>
> Simon
>
>
>
>
>
> *From:* Rob Atkinson [mailto:rob@metalinkage.com.au]
> *Sent:* Saturday, 20 August 2016 2:32 PM
> *To:* Joshua Lieberman <jlieberman@tumblingwalls.com>; Ed Parsons <
> eparsons@google.com>
> *Cc:* Frans Knibbe <frans.knibbe@geodan.nl>; SDW WG (public-sdw-wg@w3.org)
> <public-sdw-wg@w3.org>
> *Subject:* Re: Question about identifiers
>
>
>
> IMHO there is a basic principle that neatly resolves this - identifiers
> are generated by a registration process (i.e. if you accept something is an
> identifier you are essentially assuming its minting process is a
> registration process, i.e. you are subscribing to that governance).
> Registry practices are quite well established - and we should point people
> to these.  These include things like not reusing identifiers, version
> handling etc.
>
>
>
> If a dataset does not conform to the principles of registration then it is
> not suitable as a source of concept identifiers - e.g. a spatial dataset
> whose object ids change every version may be used as a resource, and
> features may have URLs, but such URLs must not be used as URIs - it is
> necessary to put a redirect from a more stable identifier set to the
> resource du jour.
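
The redirect idea can be sketched roughly as follows, assuming a lookup table from stable identifiers to the URL of the feature in whichever dataset version is current. The function names, the example.org URIs and the choice of a 303 status are assumptions of this sketch; the message only says "redirect".

```python
from typing import Dict, Optional, Tuple

# Stable identifier -> URL of the feature in the current dataset release.
# All URIs here are made-up examples.
REDIRECTS: Dict[str, str] = {
    "http://example.org/id/feature/42": "http://example.org/data/v3/objects/9001",
}


def resolve(stable_uri: str) -> Tuple[int, Optional[str]]:
    """Return (HTTP status, Location) for a request on a stable identifier."""
    target = REDIRECTS.get(stable_uri)
    if target is None:
        return 404, None
    # Assumption: answer with 303 See Other, pointing at the current resource.
    return 303, target


def publish_new_version(stable_uri: str, new_url: str) -> None:
    """Repoint a stable identifier when object ids change in a new release."""
    REDIRECTS[stable_uri] = new_url
```

Only the table changes when a new dataset version is published; the stable identifier that consumers hold stays the same.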
>
>
>
> Within a registration paradigm, the URI pattern is a simple registry
> delegation model - an item lives within a register (its base URI left of
> the /).  These may be nested, in the same way subregisters may be items in
> a register. Register URIs should be dereferenceable to get metadata about
> the governance process and the type of object in the register.
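
A minimal sketch of the delegation pattern, assuming that each path segment left of the final "/" names a register and that registers nest all the way up to the host. The function names and the example.org URI are hypothetical.

```python
from typing import List
from urllib.parse import urlsplit, urlunsplit


def containing_register(uri: str) -> str:
    """Return the base URI left of the last '/' in the path."""
    parts = urlsplit(uri)
    path = parts.path.rstrip("/")
    parent, _, _ = path.rpartition("/")
    return urlunsplit((parts.scheme, parts.netloc, parent, "", ""))


def register_chain(uri: str) -> List[str]:
    """List the enclosing registers, nearest first, up to the host."""
    chain = []
    current = uri
    while urlsplit(current).path.strip("/"):
        current = containing_register(current)
        chain.append(current)
    return chain


if __name__ == "__main__":
    for register_uri in register_chain("http://example.org/def/units/length/metre"):
        print(register_uri)
```

Each URI in the chain is a candidate for dereferencing to obtain metadata about the governance process of that register; whether the bare host counts as a register is an assumption of the sketch.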
>
>
>
> Thus, hierarchies made this way are stable. If governance of the set of
> items changes, then new identifiers must be minted and a reference to the old
> identifiers should be included.
>
>
>
> The UK examples conform to this pattern, although they seem to have
> converged on it rather than starting from a registration perspective. WMO
> practice at codes.wmo.int formalises this more explicitly.
>
>
>
> Rob
>
>
>
>
>
> On Sat, 20 Aug 2016 at 00:09 Joshua Lieberman <
> jlieberman@tumblingwalls.com> wrote:
>
> I’m sorry — or not — to have kicked off this identifier structure debate,
> but it’s an important one to have. It’s easy in a way just to say that URL
> identifiers should carry no meaning for maximum flexibility, but in almost
> all practice they are used in meaningful ways. I am also part of the
> specifying minority (but the URL minting and parsing majority) that feel it
> is done anyway and carries undeniable advantages, so let’s figure out ways
> and means for it.
>
>
>
> There are several reasons why it is useful to have agreed URL structures.
> We should note first of all that the host domain name is an important part
> of the meaning context and authenticity for a URL. The value of using HTTPS
> is not just encryption but also having the identity of the URL resolver
> confirmed by a PKI certificate. Both the domain name hierarchy and the URL
> path hierarchy can also support meaningful uniqueness of identifiers. They
> address a problem with UUIDs, namely that it is easy to make too many unique
> identifiers, rather than not enough. The hierarchical structures help a lot
> with figuring out what identifiers may actually have been minted to refer
> to the same things. They also help with determining the authority for
> making and maintaining identifiers, on both inter-organizational and
> intra-organizational levels.
>
>
>
> On a level of taxonomic meaning, the stability and/or uniqueness of a
> classification may indeed be questionable, as Frans and others have pointed
> out. Certainly many taxonomic identifier systems have gone from hierarchy
> to sequential primary identifiers as classifications have evolved with
> on-going research. I’m still in the smaller minority that feels dereferencing
> every URL to see what its classification might be is more work than
> necessary. Many classification schemes are quite stable or only slowly
> changing. I feel it is also acceptable as needed to have redirections from
> URL’s representing a current or historical or even alternative
> classification to the same normative or informative material that an
> authoritative URL links to.
>
>
>
> On the other hand, we worry about the “semantics” of URL’s versus other
> means, but the formal semantics of an entity are expressed as logical
> relationships to other entities (at least in predicate logic). If a
> substantial portion of those relationships form a hierarchical structure of
> like entities, then a hierarchical URL can be a real form of identity, not
> just a convenience.
>
>
>
> So my recommendation is to support a practice of identifiers being
> structured minimally at least for purposes of authority and uniqueness. I
> also recommend considering taxonomic meaning for primary or secondary
> identifiers where the taxonomy is relatively stable and/or an integral part
> of the definition of the identified entity. I’ll have time next week to
> contribute something to this effect to the BP.
>
>
>
> —Josh
>
>
>
>
>
> On Aug 19, 2016, at 8:44 AM, Ed Parsons <eparsons@google.com> wrote:
>
>
>
> Currently a human readable pattern would not help in terms of crawling...
> however I still maintain my (minority) view that as a method of expressing
> current and past geographic hierarchies such uri schemes could be useful.
>
>
>
> ed
>
>
>
>
>
> On Fri, 19 Aug 2016 at 13:05 Frans Knibbe <frans.knibbe@geodan.nl> wrote:
>
> On 19 August 2016 at 12:10, Ed Parsons <eparsons@google.com> wrote:
>
> So perhaps best practice is to update the resource at the old URI to point
> to the new one ?
>
>
>
> That is a possibility, but it would be messy. For individual resources
> redirection would have to be set up. That means high maintenance costs and
> a high risk of mistakes. And still there would be the risk of
> misinterpretation. A human consumer could interpret the first URI
> encountered without following it to an alternative URI, still leading to
> false data.
>
>
>
> But what would be the point anyway? If a path in the URI like
> /{municipality}/{quarter}/{neighbourhood} is for human consumption only
> it is not that valuable, I think, assuming that most people don't read URIs.
>
>
>
> The only reason I can think of to want to have a hierarchical path in a URI
> is if web crawlers are known to parse the URI strings themselves (in addition to
> the URI payload). That could in theory lead to improved discoverability of
> resources. I wonder if that actually happens... Perhaps Ed knows how the
> Google crawlers behave in that respect? Or would that be sharing trade
> secrets?
>
>
>
> Regards,
>
> Frans
>
>
>
>
>
>
>
> Ed
>
>
>
>
>
> On Fri, 19 Aug 2016 at 11:03 Frans Knibbe <frans.knibbe@geodan.nl> wrote:
>
> On 19 August 2016 at 11:11, Linda van den Brink <l.vandenbrink@geonovum.nl>
> wrote:
>
> Yes…  it is generally easier to make meaningless IDs persistent. But it is
> nice to have URIs that are human readable. In the Dutch URI strategy we do
> advise having human-readable parts in the URI scheme, but say that
> officially these mean nothing i.e. we say it is extremely ill-advised to
> ascribe any meaning to {concept} **for the machine**. URIs are opaque in
> a technical sense. Meanwhile, however, they do give hints to human readers.
>
>
>
> Then how can you tell humans that they can interpret the URI and tell
> machines that they should not? Is there a mechanism for doing that?
>
>
>
> Greetings,
>
> Frans
>
>
>
>
>
> *From:* Ed Parsons [mailto:eparsons@google.com]
> *Sent:* Friday 19 August 2016 11:02
> *To:* Frans Knibbe; SDW WG (public-sdw-wg@w3.org)
> *Cc:* Linda van den Brink; Joshua Lieberman (jlieberman@tumblingwalls.com);
> Byron Cochrane
> *Subject:* Re: Question about identifiers
>
>
>
> While I accept the current view that URI schemes have no explicit
> meaning, I do see great value in the /{municipality}/{quarter}/{neighbourhood} pattern as
> a simple way of expressing geographical hierarchy independent of
> geometry... What's the worst that could happen ?
>
>
>
> Ed
>
>
>
>
>
> On Fri, 19 Aug 2016 at 09:30 Frans Knibbe <frans.knibbe@geodan.nl> wrote:
>
> Hi,
>
>
>
> A prime requirement of good URI minting is to not put any meaning in the
> URI, at least no meaning that is somehow intended for consumers. Everything
> that needs to be said about a resource, like its membership of data
> collections or its versioning, can be said in the data that is returned
> when the URI is dereferenced.
>
>
>
> URI schemes like /{municipality}/{quarter}/{neighbourhood} could be
> dangerous, because consumers could inadvertently try to derive meaning from
> such a URI. The usefulness of such a scheme in URI minting is also
> doubtful, because administrative structure can change in time. That could
> complicate the URI minting procedures over time.
>
>
>
> I do wonder to what extent common web crawlers try to parse URIs and
> attach meaning to URI parts.
>
>
>
> Regards,
>
> Frans
>
>
>
>
>
>
>
> On 18 August 2016 at 22:55, Byron Cochrane <bcochrane@linz.govt.nz> wrote:
>
> Hi,
>
> I like the guidance in the URI-Strategy under Hierarchical URIs
> generally, but have some reservations about this intelligent-identifiers
> approach.
> For metadata access I think it is a good thing.  Most metadata for
> individual features will usually reside at the dataset or collection
> (better term) level.  This hierarchical approach makes this metadata easy
> to access.
>
> But this built-in intelligence makes the permanence of the URIs more
> difficult.  For example, administrative boundaries change through mergers
> and annexations.  A spatial thing that was in one collection is now in
> another.  The URIs for these things then confuse more than help.  URI
> redirects are one way to deal with this, but perhaps tracking these
> relationships through applied ontologies such as skos:broader and
> skos:narrower is the better practice?
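
One way to picture this alternative is the sketch below, which keeps identifiers opaque and records the hierarchy as dated broader/narrower statements. The `ex:` identifiers, the `Broader` record and the validity dates are all assumptions of this sketch; SKOS itself does not attach dates to skos:broader.

```python
from datetime import date
from typing import List, NamedTuple, Optional


class Broader(NamedTuple):
    narrower: str              # opaque identifier of the contained thing
    broader: str               # opaque identifier of the containing thing
    valid_from: date
    valid_to: Optional[date]   # None means the relation still holds


# Made-up example: neighbourhood ex:n17 moves from municipality ex:m1
# to ex:m2 after a merger. Its identifier does not change.
RELATIONS: List[Broader] = [
    Broader("ex:n17", "ex:m1", date(2000, 1, 1), date(2015, 12, 31)),
    Broader("ex:n17", "ex:m2", date(2016, 1, 1), None),
]


def broader_on(concept: str, when: date) -> List[str]:
    """Return the broader concepts that applied to `concept` on a given day."""
    return [
        r.broader
        for r in RELATIONS
        if r.narrower == concept
        and r.valid_from <= when
        and (r.valid_to is None or when <= r.valid_to)
    ]
```

A boundary change then becomes a new relation rather than a new URI, and both the past and the present hierarchy remain queryable.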
>
> No answers from me here, just questions.
>
> Cheers,
> Byron
>
> ________________________________________
> From: Linda van den Brink [l.vandenbrink@geonovum.nl]
> Sent: Thursday, August 18, 2016 8:28 PM
> To: Joshua Lieberman (jlieberman@tumblingwalls.com)
> Cc: SDW WG (public-sdw-wg@w3.org)
> Subject: Question about identifiers
>
> Hi Josh,
>
> Coming back to the telecon yesterday:
>
>
> <joshlieberman> Should identifiers be part of a system for the features of
> interest?
>
> joshlieberman: making identifiers part of a system, where the features are
> part of the system?
> ... for example corresponding to paths in a taxonomy
>
> Linda: no answer right now, will have to think about it
>
> Were you talking about recommending some system for creating HTTP URI
> identifiers, i.e. some sort of URI strategy or pattern? Specifically where
> the features can be organised into some system like a hierarchy, as with
> administrative regions? There are some examples from Geonovum's testbed here
> https://github.com/geo4web-testbed/topic3/wiki/URI-Strategy under
> Hierarchical URIs.
>
> Just trying to understand what you mean… we could add some guidance to the
> BP about this. I think that would be helpful.
>
> Linda
>
> ______________________________________
> Geonovum
> Linda van den Brink
> Geo-standards Advisor
>
> a: Barchman Wuytierslaan 10, 3818 LH Amersfoort
> p: Postbus 508, 3800 AM Amersfoort
> t:  + 31 (0)33 46041 00
> m: + 31 (0)6 1355 57 92
> e:  l.vandenbrink@geonovum.nl
> i:  www.geonovum.nl
> tw: @brinkwoman
>
>
>
>
> --
>
> *Ed Parsons *FRGS
> Geospatial Technologist, Google
>
> Google Voice +44 (0)20 7881 4501
> www.edparsons.com @edparsons
>
>
>
>
>
Received on Monday, 22 August 2016 00:46:01 UTC