- From: Sandro Hawke <sandro@w3.org>
- Date: Tue, 26 Apr 2016 15:10:25 -0400
- To: Phil Archer <phila@w3.org>, "Haag, Jason" <jason.haag.ctr@adlnet.gov>
- Cc: Monica Omodei <monica.omodei@gmail.com>, "public-perma-id@w3.org" <public-perma-id@w3.org>
On 04/26/2016 01:31 PM, Phil Archer wrote:
> Hi Jason, pls see inline below.
>
> On 26/04/2016 16:33, Haag, Jason wrote:
>> For what it's worth we also stay away from DOIs and recently moved to
>> using w3id.org instead of purl.org, but the problems/challenges of
>> decentralization and a single point of failure are still there. If
>> something happened with w3id.org similar to purl.org (lack of
>> resources, etc.) then how will we provide readability/resolvability of
>> our vocabularies?
>
> This can be minimised by having a domain just designed to take care of
> your own persistent URIs. Once you set up a service that resolves
> other people's as well as your own, sooner or later an accountant will
> ask why you're subsidising other people's needs. OCLC did a fantastic
> job supporting purl.org for many years and is still providing the
> power and connectivity for its server.
>
>>
>> Potential Epiphany: Is there a semantic web property that provides a
>> secondary option for vocabulary redundancy / fail-over? rdfs:seeAlso
>> provides additional information, but I'm wondering if RDF publishing
>> practices should be updated to allow for some form of redundancy in
>> situations where persistent IRIs don't live up to their purpose and
>> become unsteady. Hmmm... is this impractical and a lot of extra work
>> though?
>
> My colleague Sandro Hawke has proposed something along these lines
> although it's no more than an informal idea over a beer. The idea is
> that we'd have a property like definitionText, the value of which
> would (obviously) be the definition of that term. I could take the
> definitionText from one of your terms, copy and paste it into my vocab
> and because the definition text was identical, it would be the *same*
> term, even though it had a different URI. That gives us redundancy in
> that multiple copies would be made without multiplying the actual
> number of terms. Lots of issues there but it might be worth pursuing
> one day.
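As a rough illustration of the definitionText idea quoted above, here is a minimal Python sketch. It assumes a vocabulary can be treated as a plain mapping from term URI to definition text; the example.org and example.net URIs, the definition string and the function name group_by_definition are all made up for the illustration and are not taken from any published vocabulary or ontology.

```python
from collections import defaultdict


def group_by_definition(*vocabularies):
    """Group term URIs by their exact definition text.

    Each vocabulary is assumed (for this sketch only) to be a dict of
    {term URI: definition text}. URIs whose definition text is identical
    end up in the same group, i.e. they count as one term.
    """
    groups = defaultdict(set)
    for vocab in vocabularies:
        for uri, definition_text in vocab.items():
            groups[definition_text].add(uri)
    return dict(groups)


# Hypothetical data: the same definition published under two different URIs.
definition = "A sum of money awarded to fund a research project."
original = {"http://example.org/vocab#Grant": definition}
mirror = {"http://example.net/copy#Grant": definition}

terms = group_by_definition(original, mirror)
for text, uris in terms.items():
    print(len(uris), "URI(s) share the definition:", text)
```

The sketch compares the text exactly, as in the description above; whether whitespace or language tags should be normalised first is one of the open issues it does not try to settle.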
FWIW, I defined an ontology for this behavior: http://www.w3.org/ns/mics
but I don't know of anyone using it. If I were doing it again, I might
do it more simply.

I also wrote a blog post about using this idea with JSON:
https://decentralyze.com/2014/06/30/growjson/

    -- Sandro

>>
>> Phil Archer's suggestion of setting up your own domain and resolution
>> service for your community probably is the best approach, but for
>> communities that don't have the resources or have disparate/evolving
>> publishing practices this isn't always an option (at least initially).
>> Persistence is a choice, but sometimes it's an unavoidable one.
>>
>> Have there ever been any talks of the W3C providing a long-term
>> persistence service/solution other than what the W3id community group
>> has supported? Just curious. If not, perhaps a stand-alone resolution
>> service that could be run or managed independently by each community
>> within their own servers & domain might be useful.
>
> Yes, but we always come down to money. Our own material is persistent.
> See https://www.w3.org/Consortium/Persistence.html and
> http://philarcher.org/diary/2011/20yearsofmlarchives/. It is no
> exaggeration to say that docs in w3.org/TR and /ns at least will
> almost certainly outlive anyone reading this.
>
> But... we don't have the resources to make an open ended commitment to
> manage a service for everyone indefinitely.
>
> Cheers
>
> Phil
>
>>
>> On Fri, Apr 22, 2016 at 7:14 AM, Phil Archer <phila@w3.org> wrote:
>>> Hi Monica, pls see inline below - although in fairness, I must warn
>>> you, I have a big hobby horse to ride here.
>>>
>>> On 22/04/2016 00:26, Monica Omodei wrote:
>>>>
>>>> Thanks for that input Phil which reinforced our own thinking. The
>>>> whole research infrastructure environment is under review and we are
>>>> in a holding pattern organisationally for another 12 months.
>>>> We are confident the outcome will be more sustained support into the
>>>> future but it may look funny if we set up a new persistence service
>>>> at this point in time no matter what neutral domain name we use. We
>>>> do have many services which we know will continue though no matter
>>>> under what organisational structure - eg Research Data Australia,
>>>> Research Vocabularies Australia, our Research Grants and Projects
>>>> portal, our DataCite DOI minting service .....
>>>
>>> Understood.
>>>
>>>>
>>>> One option I didn't add was minting DOIs for vocab terms. Not an
>>>> option I would have considered before but I note that Content
>>>> negotiation is being implemented by DOI Registration Agencies for
>>>> their DOI names as I read here
>>>>
>>>> https://www.doi.org/doi_handbook/5_Applications.html#5.4.1
>>>>
>>>> What do you think ?
>>>
>>> Sorry but to be blunt I think it's a dreadful idea that should be
>>> squashed at the earliest opportunity.
>>>
>>> I can rant about this for hours but will desist (I've deleted several
>>> versions of this e-mail, complete with logs of requests to the
>>> examples in that DOI handbook showing what happens when you
>>> dereference them with different accept headers and counting the
>>> external dependencies along the way).
>>>
>>> DOIs are not a magic solution, they are just another redirection
>>> service, like purl.org. You are looking for an alternative to
>>> purl.org as the future of that service is currently uncertain. Why
>>> jump from one centralised redirection service to another? There is
>>> nothing fundamentally different about a DOI that makes it any more
>>> stable or more decentralised than purl.org. The organisational
>>> commitments behind it are stronger, yes, but that's the only
>>> difference.
>>>
>>> What's that? They don't depend on any technology? OK, let's see.
>>> What does doi:10.1103/PhysRevD.89.032002 identify?
>>> Well, it might be the paper that describes the discovery of the Higgs
>>> boson or it might be something else entirely, depending on your
>>> choice of resolver http://philarcher.org/10.1103/PhysRevD.89.032002
>>>
>>> More seriously, DOIs are shared around as identifiers for articles and
>>> datasets etc. Except they very often dereference to a landing page
>>> *about* that thing: one identifier, two resources. At which point the
>>> Web is well and truly broken. They work well for people following
>>> links and tracking citations, but they're not good for machines. DOIs
>>> allow people to run Web sites and not bother about managing for
>>> persistence. But they are a terrible fit for vocabulary look up by
>>> machines.
>>>
>>> OK, soap box going back in the cupboard.
>>>
>>> Cheers
>>>
>>> Phil.
>>>
>>>
>>> There are some fundamental problems with DOIs:
>>>
>>> - they are centralised and are no more stable
>>>
>>>
>>> I did some digging into the link you sent. Taking the example they
>>> give:
>>>
>>> Dependency 1: You're minded not to use purl.org because its future is
>>> uncertain. But you are happy to be dependent on another centralised
>>> redirection service? That seems odd for a start but let's follow it
>>> up and dereference the example in the DOI handbook:
>>>
>>> curl -IiH "Accept: application/rdf+xml;q=0.5, application/vnd.citationstyles.csl+json;q=1.0" http://dx.doi.org/10.1126/science.169.3946.635
>>> HTTP/1.1 303 See Other
>>> Server: Apache-Coyote/1.1
>>> Vary: Accept
>>> Location: http://data.crossref.org/10.1126%2Fscience.169.3946.635
>>> Expires: Fri, 22 Apr 2016 12:00:26 GMT
>>> Content-Type: text/html;charset=utf-8
>>> Content-Length: 195
>>> Date: Fri, 22 Apr 2016 11:15:31 GMT
>>>
>>> So I'm redirected from doi.org to crossref.org - chalk up external
>>> centralised dependency no. 2.
>>> If I deref that I get
>>>
>>> curl -H "Accept: application/rdf+xml;q=0.5, application/vnd.citationstyles.csl+json;q=1.0" http://data.crossref.org/10.1126%2Fscience.169.3946.635
>>> {
>>> "indexed":{
>>> "date-parts":[[2015,12,26]],
>>> "date-time":"2015-12-26T11:19:00Z",
>>> "time stamp":1451128740588},
>>> "reference-count":0,
>>> "publisher":"American Association for the Advancement of Science (AAAS)",
>>> "issue":"3946",
>>> "published-print":{
>>> "date-parts":[[1970,8,14]]
>>>
>>> ... blah blah. So there's some machine readable data.
>>>
>>> What happens if I deref that original DOI with a different accept
>>> header?
>>>
>>> curl -I http://dx.doi.org/10.1126/science.169.3946.635
>>> HTTP/1.1 303 See Other
>>> Server: Apache-Coyote/1.1
>>> Vary: Accept
>>> Location: http://www.sciencemag.org/cgi/doi/10.1126/science.169.3946.635
>>> Expires: Fri, 22 Apr 2016 12:18:37 GMT
>>> Content-Type: text/html;charset=utf-8
>>> Content-Length: 209
>>> Date: Fri, 22 Apr 2016 11:30:22 GMT
>>>
>>> A different domain again.
>>>
>>> Is the info I get back from that third service the same? It's
>>> consistent, but it's clearly not the same, showing that it's managed
>>> separately. Want to bet how long it will be before the two get out of
>>> sync?
>>>
>>> This is a system that by design has different people managing the
>>> data and the metadata, all with centralised identifiers that are
>>> technically no more robust than the one you're minded not to use. When
>>> I hear Andrew Treloar, Jan Brasse et al talking about DOIs I want to
>>> scream - what actually does a DOI identify? Is it the dataset? 'Cos if
>>> I deref a DOI I normally get a landing page, which is not the same
>>> thing.
>>>
>>> I am aware of the greater institutional support for DOIs but that's
>>> all it is.
>>>
>>> Persistence is a choice, not a technical thing
>>>
>>> DOIs are a solution to a problem people choose to give themselves,
>>> i.e.
>>> the problem of not managing their own Web space properly. I know they
>>> are beloved of the publishing industry and are seen as the answer to
>>> all sorts of problems but this example proves why they are
>>> antithetical to the architecture of the Web.
>>>
>>>>
>>>> Monica
>>>>
>>>>
>>>> The broken link on our persistence awareness guide is embarrassing -
>>>> apparently all the necessary redirects were requested and ostensibly
>>>> implemented by the company who did our new web site but there was a
>>>> problem with url file type extensions (don't ask me why that would be
>>>> a problem). Still waiting for the full solution but meanwhile our
>>>> content person has done some manual fixes.
>>>>
>>>>
>>>> On Tue, Apr 19, 2016 at 8:43 PM, Phil Archer <phila@w3.org> wrote:
>>>>
>>>>> In my view, running your own domain name is the best option. ANDS
>>>>> already has persistence as a core idea although, irony of ironies, I
>>>>> notice that links I have in some of my work to your guidance on
>>>>> persistent identifiers lead to, yes, 404s :-( (see
>>>>> http://philarcher.org/diary/2013/uripersistence/#ands and the links
>>>>> to things like
>>>>> http://ands.org.au/guides/persistent-identifiers-awareness.html)
>>>>>
>>>>> Set up a domain that doesn't mention any organisational name, set it
>>>>> up for one job only - to provide permanent URIs - and make it
>>>>> transferable. id.org.au seems to be available at the moment, for
>>>>> example.
>>>>>
>>>>> purl.org was set up as *the* solution, thus creating a single point
>>>>> of failure - which is now close to failure. Relying on someone
>>>>> else's centralised system, be it purl or DOI, leaves you susceptible
>>>>> to their future failures. Stepping in would mean taking on
>>>>> responsibility for other people's redirections as well as your own.
>>>>> So be decentralised, use the Web, look after your own needs.
>>>>>
>>>>> My 2 cents.
>>>>>
>>>>> Phil.
>>>>>
>>>>>
>>>>> On 19/04/2016 05:42, Monica Omodei wrote:
>>>>>
>>>>>> I hope this is an appropriate forum to ask advice.
>>>>>>
>>>>>> We have been using the purl.org resolver service for some years to
>>>>>> provide a globally unique, persistent, resolvable identifier for
>>>>>> Australian research grants so they can be used in metadata
>>>>>> describing research outputs like publications, data, software etc.
>>>>>> They resolve currently to a view page in Research Data Australia -
>>>>>> researchdata.ands.org.au. There is an API but it returns only JSON
>>>>>> currently, not XML or RDF
>>>>>>
>>>>>> We decided not to run our own resolver service because we are not an
>>>>>> ongoing organisation with a guaranteed persistent domain and felt
>>>>>> that a public resolver service was more suitable. At the time
>>>>>> purl.org seemed the right choice. We are now struggling because we
>>>>>> cannot make any changes. As has been noted in this forum the admin
>>>>>> UI is not available at the moment. We also were concerned when the
>>>>>> ability to create our own sub-domain was removed as we envisage
>>>>>> wanting to hand over some sub-domains from our root, *au-research*,
>>>>>> to different parties for maintenance.
>>>>>>
>>>>>> We now also support the Vocabularies Australia Service
>>>>>> http://ands.org.au/online-services/research-vocabularies-australia
>>>>>> using the SISSVoc software which was established to assist with the
>>>>>> publication and widespread use of scientific vocabularies.
>>>>>>
>>>>>> The ANZSRC Field of Research Vocabulary (ABS 1297.0) which is used
>>>>>> to classify research and its outputs in Australia and New Zealand
>>>>>> is also published through this service and we maintain purls for
>>>>>> these vocabulary terms eg
>>>>>>
>>>>>> http://purl.org/au-research/vocabulary/anzsrc-for/2008/
>>>>>>
>>>>>> We want to extend this provision of PURLs to other vocabularies but
>>>>>> we cannot use the same purl domain because of the current problems
>>>>>> with the OCLC supported purl.org service
>>>>>>
>>>>>> We need to decide whether to -
>>>>>>
>>>>>> - switch to w3id.org as the domain for the other vocabulary purls
>>>>>> - run our own resolver service under a domain name we think can be
>>>>>>   transferred to another organisation if/when necessary
>>>>>> - look for another public resolver service
>>>>>> - be patient and wait for the situation with purl.org to resolve
>>>>>>   itself (no pun intended)
>>>>>>
>>>>>> Comments welcome,
>>>>>>
>>>>>> Monica Omodei
>>>>>> Project Manager
>>>>>> Australian National Data Service
>>>>>>
>>>>> --
>>>>>
>>>>> Phil Archer
>>>>> W3C Data Activity Lead
>>>>> http://www.w3.org/2013/data/
>>>>>
>>>>> http://philarcher.org
>>>>> +44 (0)7887 767755
>>>>> @philarcher1
>>>>>
>>> --
>>>
>>> Phil Archer
>>> W3C Data Activity Lead
>>> http://www.w3.org/2013/data/
>>>
>>> http://philarcher.org
>>> +44 (0)7887 767755
>>> @philarcher1
>>>
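For anyone wanting to repeat the content-negotiation check from the quoted curl logs without curl, the following is a minimal Python sketch using only the standard library. It sends a HEAD request, refuses to follow the redirect, and reports the status code and Location header so the redirect targets for different Accept headers can be compared. The names NoRedirect, build_head_request and redirect_target are invented for this sketch, and the 10 second timeout is an arbitrary assumption.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Decline every redirect so the 3xx response itself can be inspected."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None tells urllib not to follow the redirect; the 3xx
        # response then surfaces as an HTTPError in the caller.
        return None


def build_head_request(url, accept=None):
    """Build a HEAD request, optionally carrying an Accept header."""
    headers = {"Accept": accept} if accept else {}
    return urllib.request.Request(url, headers=headers, method="HEAD")


def redirect_target(url, accept=None, timeout=10):
    """Return (status code, Location header) for a single, unfollowed request."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(build_head_request(url, accept), timeout=timeout) as resp:
            return resp.status, resp.headers.get("Location")
    except urllib.error.HTTPError as err:
        # An unfollowed 303 See Other arrives here.
        return err.code, err.headers.get("Location")
```

Calling redirect_target("http://dx.doi.org/10.1126/science.169.3946.635") once without an Accept value and once with the Accept string from the logs above would show whether the two requests are still redirected to different hosts; the responses today may well differ from the 2016 ones quoted in the thread.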
Received on Tuesday, 26 April 2016 19:10:29 UTC