Re: RDF for molecules, using InChI from Alan Ruttenberg on 2007-08-05 (public-semweb-lifesci@w3.org from August 2007)

From: Alan Ruttenberg <alanruttenberg@gmail.com>
Date: Sun, 5 Aug 2007 01:25:19 -0400
To: Egon Willighagen <egon.willighagen@gmail.com>
Cc: public-semweb-lifesci hcls <public-semweb-lifesci@w3.org>, Michel_Dumontier <Michel_Dumontier@carleton.ca>, Jonathan A Rees <jar@mumble.net>
Message-Id: <1E2A966E-4732-4BCB-9B11-C6CCF95223E5@gmail.com>
I don't think it is likely that the HCLS recommendations will suggest
using INFO URIs. They haven't been championed by anyone, URN schemes
are generally discouraged by the W3C TAG, and in our discussions
thus far we haven't seen any advantages to using them, while we have
noted difficulties. Too many URN schemes lead to difficulties on the part
of clients, which is why there is still a lot of discord about LSIDs,
which are certainly in line before INFOs. Finally, there are better
alternatives.

Just a heads up.

-Alan

On Aug 3, 2007, at 10:14 AM, Michel_Dumontier wrote:

>
>> info:inchi/InChI=1/C10H8/c1-2-6-10-8-4-3-7-9(10)5-1/h1-8H
>>
>> The owl:sameAs can make the link of this URI to the one I suggested.
>>
>> Egon
>
> Egon,
>  Excellent! This is exactly what I'm looking for. In addition, the info
> registry [1] contains several other namespaces, such as PMIDs and RefSeq
> identifiers.
>
> [1] http://info-uri.info/registry/OAIHandler?verb=ListRecords&metadataPrefix=oai_dc
>
>
> Also, take note of their excellent FAQ, which addresses the rationale for
> using the INFO URI (http://info-uri.info/registry/docs/misc/faq.html).
>
> <<<
> # Why was it necessary to develop the INFO URI scheme?
>
> The INFO URI scheme was developed from within the library and publishing
> communities to expedite the referencing by URIs of information assets
> that have identifiers in public namespaces but have no representation
> within the URI allocation.
>
> For various reasons (both cultural and technical) the creation and
> registration of a new URI scheme or URN namespace to support a given
> public namespace under the URI allocation may not have been attempted by
> the authority for that namespace. It is precisely to facilitate the
> representation of these public namespaces within the URI allocation that
> the INFO URI scheme was developed.
>
> # What was the motivation behind the INFO URI scheme?
>
> The motivation behind developing the INFO URI scheme was to allow legacy
> identification systems to become part of the World Wide Web global
> information architecture so that the information assets they identify
> can be referenced by Web-based description technologies such as XLink,
> RDF or Topic Maps. Note that we are concerned with "information assets",
> not "digital assets" per se - the information assets may be variously
> digital, physical or conceptual.
>>>>
>
> <<<
> # Why not just use HTTP URIs?
>
> HTTP URIs (RFC 2616) are Internet protocol elements for referencing
> hypertext documents which can be retrieved from a network authority
> using the HTTP transfer protocol. There is a common expectation that
> HTTP URIs can be dereferenced.
>
> The following considerations hold in respect of HTTP URIs:
>
>     * HTTP URIs are inappropriate for INFO namespaces because HTTP URIs
> provide:
>           o network transport
>           o network path (discovery obstacle)
>           o strong dereference expectation
>           o poor branding (network path overhead)
>     * A transport mechanism adds meaningless semantic overhead to
> nondereferencable URIs.
>     * Absolute HTTP URIs include a network path (comprised of an
> authority component and a hierarchical path component). INFO namespaces
> may not have (or may not make) any network authority available. A
> central network authority would also be inappropriate as this would
> introduce a dependency between a third party namespace and a central
> network authority.
>
>       Further, were INFO namespaces to make a network authority
> available they would each have to publish the particular hierarchical
> path syntax employed by that network authority. A central network
> authority would mitigate this requirement by providing a single path
> syntax, although it would still need to publish that path syntax.
>     * Use of HTTP URIs might only encourage the provisioning of resource
> representations (e.g. metadata descriptions) which could conflict with
> representations provided under any possible future URI registration on
> the part of the Namespace Authority. Further, if HTTP URIs were used to
> provide resource representations, it must be recognized that managing
> the namespace and infrastructure is a costly enterprise that may not be
> appropriate or cost effective in a given business context.
>     * The network path of HTTP URIs adds unnecessary string overhead and
> consequent loss of branding for legacy identifiers.
>
> # Well then, why not just use URN URIs?
>
> URN URIs (RFC 2141) are Internet protocol elements for referencing
> resources using persistent and location-independent identifiers,
> representations of which may be retrieved using various resolution
> mechanisms. There is a common expectation that URN URIs can be
> dereferenced, once suitable resolution mechanisms are defined (e.g. DDDS
> or other proprietary mechanisms). Indeed, RFC 1737 goes so far as to
> make a strong recommendation that "there be a mapping between the names
> generated by each naming authority and URLs".
>
> Use of URN URIs requires a URN namespace registration. An informal URN
> namespace is of limited utility because its numerical nature obliterates
> any branding or name recognition and effectively renders the namespace
> anonymous. A formal URN namespace, on the other hand, would require a
> more substantial review than a corresponding registration under the INFO
> Registry. Based on experience with the initial INFO namespace target
> group, it is unlikely that many Namespace Authorities will proceed with
> independent applications as the burden of registering a URN namespace is
> high, especially in the case of organizations that are not strongly
> steeped in technology.
>
> One particular impediment in applying for a URN namespace for INFO is
> that this would compromise any possible future URN namespace
> registration that a Namespace Authority might seek to make in respect of
> considerations of persistence, location independence and/or dereference
> to resource representations.
>
> The following considerations hold in respect of URN URIs:
>
>     * URN URIs are inappropriate for INFO namespaces because URN URIs
> provide:
>           o claims of persistence of resource identifiers
>           o dereference expectation
>           o no delegated naming responsibility
>           o restricted syntax (no hierarchy)
>           o no support for fragment identifiers
>           o poor branding and extra semantic layer (additional namespace
> tier)
>     * INFO URIs make no claims on persistence. INFO URIs may be location
> independent and in consequence may enjoy some degree of persistence, but
> INFO does not make these assertions. Instead INFO is neutral with
> respect to identifier persistence.
>     * Use of URN URIs might only encourage the provisioning of resource
> representations (e.g. metadata descriptions) which could conflict with
> representations provided under any possible future URI registration on
> the part of the Namespace Authority. Further, if URN URIs were used to
> provide resource representations, it must be recognized that managing
> the namespace and infrastructure is a costly enterprise that may not be
> appropriate or cost effective in a given business context.
>     * For INFO to operate as a URN namespace would require that INFO be
> constituted as a delegated naming authority. It is not clear that a URN
> namespace would be an appropriate choice for such naming authority
> delegation.
>     * Syntactically, URN URIs do not support hierarchy (in URI syntax
> hierarchy proceeds through the "/" character) and are thus more
> difficult to use with legacy identifiers because of their restricted
> character set. Other characters reserved by URN URIs, but allowed by
> INFO URIs are "&" and "~".
>
>       For a demonstration of the difficulty of mapping legacy
> identifiers the reader is referred to RFC 3151 which provides a set of
> complex transcriptions for mapping SGML formal public identifiers onto
> the URN URI syntax. Formal public identifiers would have been more
> readily presented under the more expressive INFO syntax.
>     * Additionally, URN URIs do not support fragment identifiers thus
> not allowing the identification of secondary resources with respect to a
> primary resource. This is a practical requirement that INFO supports.
>     * With INFO as a URN namespace, the INFO namespaces would then
> become sub-sub-namespaces, with a consequent loss of branding. This
> would also introduce three tiers of semantic layers for an
> implementation to navigate.
>
>>>>
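
To make the syntax points in the FAQ excerpt above concrete, here is a
minimal Python sketch of how an INFO URI breaks down into a namespace, an
identifier and an optional fragment. It assumes the general shape
"info:" namespace "/" identifier ["#" fragment] from RFC 4452; the function
name split_info_uri is hypothetical and the sketch does not validate the
full grammar. Note that the "/" characters inside the InChI survive as part
of the identifier, which is the hierarchy the FAQ says URN syntax lacks.

```python
from typing import Optional, Tuple


def split_info_uri(uri: str) -> Tuple[str, str, Optional[str]]:
    """Split an INFO URI into (namespace, identifier, fragment).

    Assumes the shape info:<namespace>/<identifier>[#<fragment>].
    This is an illustrative helper, not a complete RFC 4452 parser.
    """
    scheme, colon, rest = uri.partition(":")
    if not colon or scheme.lower() != "info":
        raise ValueError(f"not an INFO URI: {uri!r}")

    # Anything after the first "#" is treated as the fragment.
    rest, _, fragment = rest.partition("#")

    # The namespace ends at the first "/"; later slashes stay in the identifier.
    namespace, slash, identifier = rest.partition("/")
    if not slash or not namespace or not identifier:
        raise ValueError(f"missing namespace or identifier: {uri!r}")

    # The namespace is lowercased here for comparison purposes.
    return namespace.lower(), identifier, fragment or None


if __name__ == "__main__":
    uri = "info:inchi/InChI=1/C10H8/c1-2-6-10-8-4-3-7-9(10)5-1/h1-8H"
    print(split_info_uri(uri))
```
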
>
>
> I think this is really interesting, and its merits for knowledge
> communities are well worth investigating further.
>
>
> -=Michel=-
>
> Michel Dumontier
> Assistant Professor of Bioinformatics
>
> Department of Biology, School of Computer Science, Institute of
> Biochemistry
> Carleton University
>
> Member of the Ottawa Institute of Systems Biology
> Member of the Ottawa-Carleton Institute for Biomedical Engineering
>
> Office: 4610 Carleton Technology and Training Center
> Mailing: 209 Nesbitt, 1125 Colonel By Drive, Ottawa, ON K1S5B6
> Tel:  +1 (613) 520-2600 x4194
> Fax:  +1 (613) 520-3539
> Web:  http://dumontierlab.com
> Skype: micheldumontier
>
>> -----Original Message-----
>> From: Egon Willighagen [mailto:egon.willighagen@gmail.com]
>> Sent: Friday, August 03, 2007 6:15 AM
>> To: Michel_Dumontier
>> Subject: Re: RDF for molecules, using InChI
>>
>> Michel,
>>
>> On 8/2/07, Michel_Dumontier <Michel_Dumontier@carleton.ca> wrote:
>>> I support the use of InChI as URI. Of course, the use of such a URI will
>>> annoy those that want URL resolvable URIs... another reason to relate the
>>> URI and the resolvable URL with an owl:sameAs predicate.
>>
>> FYI, I was just informed by Tony Hammond about this blog post:
>>
>> http://www.crossref.org/CrossTech/2007/02/at_last_uris_for_inchi.html
>>
>> in which this is suggested:
>>
>> info:inchi/InChI=1/C10H8/c1-2-6-10-8-4-3-7-9(10)5-1/h1-8H
>>
>> The owl:sameAs can make the link of this URI to the one I suggested.
>>
>> Egon
>
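
For readers following the quoted thread, the owl:sameAs link between an
info:inchi URI and a resolvable HTTP URI can be sketched as a single
N-Triples statement. This is only an illustration under stated assumptions:
the HTTP URI below (http://example.org/molecule/C10H8) is a made-up
placeholder, since the URI Egon originally suggested does not appear in
this message, and the function names and the choice of which characters to
leave unescaped are my own, not something the info:inchi registration is
known to prescribe.

```python
from urllib.parse import quote

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def inchi_to_info_uri(inchi: str) -> str:
    """Build an info:inchi URI from an InChI string.

    Characters common in InChI strings ("/", "=", parentheses, "+", ",", ";")
    are left as-is; anything else outside the unreserved set is
    percent-encoded. The exact escaping rules are an assumption.
    """
    return "info:inchi/" + quote(inchi, safe="/=()+,;")


def same_as_triple(subject_uri: str, object_uri: str) -> str:
    """Return one N-Triples line asserting owl:sameAs between two URIs."""
    return f"<{subject_uri}> <{OWL_SAME_AS}> <{object_uri}> ."


if __name__ == "__main__":
    inchi = "InChI=1/C10H8/c1-2-6-10-8-4-3-7-9(10)5-1/h1-8H"
    # Hypothetical resolvable URI, used only for illustration.
    http_uri = "http://example.org/molecule/C10H8"
    print(same_as_triple(inchi_to_info_uri(inchi), http_uri))
```

Whether the info: URI or the HTTP URI ends up as the preferred identifier is
the open question in this thread; the triple itself is symmetric in meaning
either way.
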
Received on Sunday, 5 August 2007 05:25:27 UTC