- From: Phil Archer <phila@w3.org>
- Date: Tue, 14 Feb 2012 23:07:35 +0000
- To: Richard Cyganiak <richard@cyganiak.de>
- CC: Michael Hausenblas <michael.hausenblas@deri.org>, Government Linked Data Working Group WG <public-gld-wg@w3.org>
I did forget one didn't I - the PRONOM work that Dave Reynolds pointed us to. That looks like the kind of thing we need but I'll look at it in more detail.

On 14/02/2012 17:51, Phil Archer wrote:
> Pls see below.
>
> On 14/02/2012 17:04, Richard Cyganiak wrote:
>> On 10 Feb 2012, at 16:27, Phil Archer wrote:
>>> I think I'm happiest with, again, pointing to the "this is what we
>>> mean by a stable URI scheme" in the best practices doc and then maybe
>>> giving one or two examples and finally saying that if they can't find
>>> one then "foo/bar" is a reasonable fall-back.
>>
>> Can't we do better than this?
>
> Let's try,
>
>>
>> The goal is interoperability. If four catalogs use these four
>> different ways of denoting RDF/XML:
>>
>> <http://dbpedia.org/resource/RDF/XML>
>> <http://www.w3.org/ns/formats/RDF_XML>
>> "RDF/XML"
>> "application/rdf+xml"
>> "rdf"
>>
>> … then we have failed to reach the goal of interoperability.
>
> True.
>
>>
>> IMO the requirements for our recommended representation of formats are:
>>
>> 1. Easy to convert from the de facto standard identifiers (IETF media
>> types) to our chosen representation (ideally incl. possibility to
>> validate)
>
> Yes. From your list I'd say that leaves us with any of
>
> <http://dbpedia.org/resource/RDF/XML>
> <http://www.w3.org/ns/formats/RDF_XML>
> "application/rdf+xml"
>
> i.e. remove rdf as it's way too ambiguous. Also RDF/XML. If we just
> suggest people use a string it's not controlled enough and we'll get
> junk (the Excel example being an excellent case in point).
>
>>
>> 2. Reasonably complete coverage of file formats
>
> Yes. That knocks out /ns/formats which is currently very small.
>
> <http://dbpedia.org/resource/RDF/XML>
> "application/rdf+xml"
>
>>
>> 3. Ability to handle existing data that doesn't use a controlled
>> vocabulary (e.g., what if you have all of “XLS”, “Excel”, “Excel 95”,
>> “MS Office Spreadsheet” in the input data?)
>
> And here's where we hit the reason why Ivan created /ns/formats. MIME
> types don't provide a 1-1 mapping to actual file formats. So I'm going
> to put it back in and knock out the plain MIME type.
>
> <http://dbpedia.org/resource/RDF/XML>
> <http://www.w3.org/ns/formats/RDF_XML>
>
>>
>> 4. Some recommendation for how to deal with file formats that are not
>> registered anywhere, e.g., shapefiles
>
> Wikipedia doesn't distinguish between the different versions of Excel
> (and presumably the same is true for Word, PPT etc. I didn't bother to
> check). Also, is there a wiki/DBpedia entry for every file format?
>
> Which drives me, with all due respect and acknowledgement to the person
> I'm about to say this to, to knock out DBpedia to leave us with:
>
> <http://www.w3.org/ns/formats/RDF_XML>
>
> Another issue is the push-back from governments on using DBpedia. It may
> not be considered stable enough (I know, I know...) and if we propose
> something that people don't like, well, the outcome is obvious.
>
> The /ns/formats solution has come up in the ADMS work as being
> "something it would be really good to have extended." It's not a huge
> job to do this, but it is a human task. And that means it needs
> maintenance.
>
> We would need to set up a system that made it easy to add new entries
> (at the moment you have to write a few files, update a .htaccess file and
> what have you). So, OK, let's take this out again which leaves us with:
>
> <empty />
>
>
> Have we overlooked anything?
>
> Well, there's Ed Summers' work that Michael pointed us to
> http://mediatypes.appspot.com/ which gets around most of the problems
> but still leaves us with the lack of 1-1 mapping. But that might be the
> best we can do.
>
> To be usable, we'd need to bring it into w3.org or some other
> über-stable domain. And that means sys team support.
> It's not impossible, especially as the code exists, and if there's a
> community willing to maintain it, OK, but I wonder if this is the time
> to test out Sandro's idea of a single domain for a single purpose.
>
> That would mean setting up, say, fileformats.org (it's for sale) and
> then managing it as part of the W3C 'estate'. That's probably an even
> higher hurdle than getting the necessary permissions to run Ed's code on
> w3.org.
>
> I really hope that others can blow a hole in my thinking and point out
> the easy answer!
>
> Phil.
>
>>
>>>
>>> On 10/02/2012 15:51, Michael Hausenblas wrote:
>>>>
>>>> Or, why not re-deploy Ed's excellent http://mediatypes.appspot.com/
>>>> under a W3C domain? :)
>>>>
>>>> Cheers,
>>>> Michael
>>>> --
>>>> Dr. Michael Hausenblas, Research Fellow
>>>> LiDRC - Linked Data Research Centre
>>>> DERI - Digital Enterprise Research Institute
>>>> NUIG - National University of Ireland, Galway
>>>> Ireland, Europe
>>>> Tel. +353 91 495730
>>>> http://linkeddata.deri.ie/
>>>> http://sw-app.org/about.html
>>>>
>>>> On 10 Feb 2012, at 15:19, Phil Archer wrote:
>>>>
>>>>> I'm getting some push-back from gov data publishers on using DBpedia
>>>>> sadly (it's third party, it's not real, it's not stable, not like all
>>>>> our wonderful government department Web sites that sometimes stay on
>>>>> line for whole months!). The PRONOM effort that Dave has highlighted
>>>>> looks like the kind of thing they'd like more - government agency to
>>>>> government agency - as long as there's no ".uk" anywhere in the URIs I
>>>>> guess.
>>>>>
>>>>> How about "use a stable URI scheme for file formats if available,
>>>>> falling back to the MIME type if not available" ?
>>>>>
>>>>> Phil.
>>>>>
>>>>>
>>>>>
>>>>> On 10/02/2012 15:06, John Erickson wrote:
>>>>>>>> The Right Thing to do would be to get IETF to mint URIs for all
>>>>>>>> media types, and get ESRI to register a media type for their file
>>>>>>>> format, etc.
>>>>>>>> This may not be feasible.
>>>>>>
>>>>>> ...or maybe we could just follow the same, de facto convention we've
>>>>>> been following of using URIs from A Certain Third party:
>>>>>>
>>>>>> http://dbpedia.org/resource/TIFF
>>>>>> http://dbpedia.org/resource/JPEG
>>>>>> http://dbpedia.org/resource/GZIP
>>>>>>
>>>>>> ...etc. ;)
>>>>>>
>>>>>
>>>>> --
>>>>>
>>>>>
>>>>> Phil Archer
>>>>> W3C eGovernment
>>>>> http://www.w3.org/egov/
>>>>>
>>>>> http://philarcher.org
>>>>> +44 (0)7887 767755
>>>>> @philarcher1
>>>>>
>>>>
>>>>
>>>
>>> --
>>>
>>>
>>> Phil Archer
>>> W3C eGovernment
>>> http://www.w3.org/egov/
>>>
>>> http://philarcher.org
>>> +44 (0)7887 767755
>>> @philarcher1
>>>
>>
>>
>
--
Phil Archer
W3C eGovernment
http://www.w3.org/egov/

http://philarcher.org
+44 (0)7887 767755
@philarcher1
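
A rough sketch of how requirements 1 and 3 from the thread might fit together with the proposed fall-back ("use a stable URI scheme for file formats if available, falling back to the MIME type if not available"). This is only an illustration under stated assumptions, not anything the working group agreed on. Both lookup tables and the function name resolve_format are hypothetical; the only format URI taken from the thread is the /ns/formats one for RDF/XML, and the example.org URI is a placeholder.

```python
import re

# Hypothetical media type -> format URI table. Only the RDF/XML entry
# comes from the thread; the example.org URI is a made-up placeholder.
MEDIA_TYPE_TO_URI = {
    "application/rdf+xml": "http://www.w3.org/ns/formats/RDF_XML",
    "application/vnd.ms-excel": "http://example.org/formats/XLS",
}

# Hypothetical alias table for uncontrolled labels found in existing data
# (requirement 3). Keys are lower-cased with whitespace collapsed.
LABEL_TO_MEDIA_TYPE = {
    "rdf/xml": "application/rdf+xml",
    "xls": "application/vnd.ms-excel",
    "excel": "application/vnd.ms-excel",
    "excel 95": "application/vnd.ms-excel",
    "ms office spreadsheet": "application/vnd.ms-excel",
}

# Loose syntactic check for "type/subtype"; it does not consult the
# IANA registry, so it validates shape only.
MEDIA_TYPE_PATTERN = re.compile(
    r"^[a-z0-9][a-z0-9!#$&^_.+-]*/[a-z0-9][a-z0-9!#$&^_.+-]*$"
)


def resolve_format(label):
    """Return a format URI, else a plain media type string, else None."""
    key = " ".join(label.strip().lower().split())
    media_type = LABEL_TO_MEDIA_TYPE.get(key, key)
    if media_type in MEDIA_TYPE_TO_URI:
        return MEDIA_TYPE_TO_URI[media_type]
    if MEDIA_TYPE_PATTERN.match(media_type):
        # Fall back to the media type itself when no URI is known.
        return media_type
    # Too ambiguous to resolve (e.g. a bare "rdf").
    return None


for label in ["application/rdf+xml", "Excel 95", "text/csv", "rdf"]:
    print(label, "->", resolve_format(label))
```

Note that the alias table collapses every Excel variant onto one media type, which is exactly the lack of a 1-1 mapping between MIME types and file formats raised in the thread; a sketch like this does not solve that problem, it only makes it visible.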
Received on Tuesday, 14 February 2012 23:08:03 UTC