Re: CAS, DUNS and LOD (was Re: Cost/Benefit Anyone? Re: Vote for my Semantic Web presentation at SXSW)

Either "Linked Data ecosystem" or "linked data Ecosystem" is a dangerously flawed paradigm, IMHO.  You don't "improve" MeSH by flattening it, for example; it is what it is. Since CAS numbers are not a directed graph, an algorithmic transform to a URI (which *is* a directed graph) risks the creation of a "new" irreconcilable taxonomy.  For example, Nitrogen is ok to breathe, while liquid Nitrogen is a not very practical way to chill wine.
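
To make "algorithmic transform" concrete, here is a minimal Python sketch of the kind of mapping I mean. The base namespace (example.org) and the function names are hypothetical, made up for illustration only; the check-digit rule is the standard CAS Registry Number one (weight the digits 1, 2, 3, ... from the right, sum, take mod 10).

```python
import re

# Hypothetical namespace, for illustration only (not a real registry).
BASE = "http://example.org/cas/"

# A CAS Registry Number looks like NNNNNNN-NN-N: 2 to 7 digits,
# then 2 digits, then a single check digit.
CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def is_valid_cas(cas: str) -> bool:
    """Return True if the string is well formed and its check digit matches."""
    m = CAS_PATTERN.match(cas)
    if not m:
        return False
    digits = m.group(1) + m.group(2)
    check = int(m.group(3))
    # Weight digits 1, 2, 3, ... starting from the rightmost one.
    total = sum(i * int(d) for i, d in enumerate(reversed(digits), start=1))
    return total % 10 == check


def cas_to_uri(cas: str) -> str:
    """Mechanically map a valid CAS number to a URI under BASE."""
    if not is_valid_cas(cas):
        raise ValueError(f"not a valid CAS number: {cas!r}")
    return BASE + cas


print(cas_to_uri("7727-37-9"))  # the CAS number for Nitrogen
```

Note that the transform is purely syntactic: the resulting URI carries nothing about physical state, context, or where the substance sits in any hierarchy. Whatever taxonomy gets layered on top of such URIs is new.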

Just my 2 cents.

--- On Tue, 8/23/11, John Erickson <olyerickson@gmail.com> wrote:

> From: John Erickson <olyerickson@gmail.com>
> Subject: Re: CAS, DUNS and LOD (was Re: Cost/Benefit Anyone? Re: Vote for my  Semantic Web presentation at SXSW)
> To: public-lod@w3.org
> Date: Tuesday, August 23, 2011, 8:05 AM
> This is an important discussion that (I believe) foreshadows how
> canonical identifiers are managed moving forward.
> 
> Both CAS and DUNS numbers are good examples. Consider the challenge
> of linking EPA data; it's easy to create a list of toxic chemicals
> that are common across many EPA datasets. Based on those chemical
> names, it's possible to further find (in most cases) references in
> DBPedia and other sources, such as PubChem:
> 
> * ACETALDEHYDE
> * http://dbpedia.org/page/Acetaldehyde
> * http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=177
> * etc...
> 
> Now, add to this a sensible agency-rooted URI design and a
> DBPedia-like infrastructure and one has a very powerful hub that
> strengthens the Linked Data ecosystem. It would arguably be stronger
> if CAS identifiers were also (somehow) included, but even the bits of
> linking shown above change the value proposition of traditional
> proprietary naming schemes...
> 
> John
> PS: At TWC we are about to go live with a registry called "Instance
> Hub" that will demonstrate the association of agency-based URI schemes
> --- think EPA, HHS, DOE, USDA, etc. --- with instance data over which
> the agency has some authority or interest... More very soon!
> 
> On Tue, Aug 23, 2011 at 8:31 AM, Patrick Durusau <patrick@durusau.net> wrote:
> > David,
> >
> > On 8/22/2011 9:55 PM, David Booth wrote:
> >
> > On Mon, 2011-08-22 at 20:27 -0400, Patrick Durusau wrote:
> > [ . . . ]
> >
> > The use of CAS identifiers supports searching across vast domains of
> > *existing* literature. Not all, but most of it for the last 60 or so
> > years.
> >
> > That is non-trivial and should not be lightly discarded.
> >
> > BTW, your objection is that "non-licensed systems" cannot use CAS
> > identifiers? Are these commercial systems that are charging their
> > customers? Why would you think such systems should be able to take
> > information created by others?
> >
> > Using the information associated with an identifier is one thing;
> > using the identifier itself is another.  I'm sure the CAS numbers
> > have added non-trivial value that should not be ignored.  But their
> > business model needs to change.  It is ludicrous in this web era to
> > prohibit the use of the identifiers themselves.
> >
> > If there is one principle we have learned from the web, it is the
> > enormous value and importance of freely usable universal
> > identifiers.  URIs rule!
> > http://urisrule.org/
> >
> > :)
> >
> > Well, I won't take the bait on URIs, ;-), but will note that re-use
> > of identifiers of a sort was addressed quite a few years ago.
> >
> > See: Feist Publications, Inc., v. Rural Telephone Service Co.,
> > 499 U.S. 340 (1991) or follow this link:
> >
> > http://en.wikipedia.org/wiki/Feist_v._Rural
> >
> > The circumstances with CAS numbers are slightly different because
> > to get access to the full set of CAS numbers I suspect you have to
> > sign a licensing agreement on re-use, which makes it a matter of
> > *contract* law and not copyright.
> >
> > Perhaps they should increase the limits beyond 10,000 identifiers
> > but the only people who want the whole monty as it were are
> > potential commercial competitors.
> >
> > The people who publish the periodical "Brain" for example at
> > $10,000 a year. Why should I want the complete set of identifiers
> > to be freely available to help them?
> >
> > Personally I think given the head start that the CAS maintainers
> > have on the literature, etc., that different models for use of the
> > identifiers might suit their purposes just as well. Universal
> > identifiers change over time and my concern is with the least
> > semantic friction and not as much with how we get there.
> >
> > Hope you are having a great day!
> >
> > Patrick
> >
> >
> >
> >
> > --
> > Patrick Durusau
> > patrick@durusau.net
> > Chair, V1 - US TAG to JTC 1/SC 34
> > Convener, JTC 1/SC 34/WG 3 (Topic Maps)
> > Editor, OpenDocument Format TC (OASIS), Project Editor ISO/IEC 26300
> > Co-Editor, ISO/IEC 13250-1, 13250-5 (Topic Maps)
> >
> > Another Word For It (blog): http://tm.durusau.net
> > Homepage: http://www.durusau.net
> > Twitter: patrickDurusau
> >
> 
> 
> 
> -- 
> John S. Erickson, Ph.D.
> http://bitwacker.com
> olyerickson@gmail.com
> Twitter: @olyerickson
> Skype: @olyerickson
> 
> 

Received on Tuesday, 23 August 2011 14:18:10 UTC