- From: Phil Archer <phila@w3.org>
- Date: Tue, 14 Feb 2012 17:51:47 +0000
- To: Richard Cyganiak <richard@cyganiak.de>
- CC: Michael Hausenblas <michael.hausenblas@deri.org>, Government Linked Data Working Group WG <public-gld-wg@w3.org>
Pls see below.

On 14/02/2012 17:04, Richard Cyganiak wrote:
> On 10 Feb 2012, at 16:27, Phil Archer wrote:
>> I think I'm happiest with, again, pointing to the "this is what we mean by a stable URI scheme" in the best practices doc and then maybe giving one or two examples and finally saying that if they can't find one then "foo/bar" is a reasonable fall-back.
>
> Can't we do better than this? Let's try.
>
> The goal is interoperability. If five catalogs use these five different ways of denoting RDF/XML:
>
> <http://dbpedia.org/resource/RDF/XML>
> <http://www.w3.org/ns/formats/RDF_XML>
> "RDF/XML"
> "application/rdf+xml"
> "rdf"
>
> … then we have failed to reach the goal of interoperability.

True.

>
> IMO the requirements for our recommended representation of formats are:
>
> 1. Easy to convert from the de facto standard identifiers (IETF media types) to our chosen representation (ideally incl. possibility to validate)

Yes. From your list I'd say that leaves us with any of

<http://dbpedia.org/resource/RDF/XML>
<http://www.w3.org/ns/formats/RDF_XML>
"application/rdf+xml"

i.e. remove "rdf" as it's way too ambiguous. Also "RDF/XML". If we just suggest people use a string it's not controlled enough and we'll get junk (the Excel example being an excellent case in point).

>
> 2. Reasonably complete coverage of file formats

Yes. That knocks out /ns/formats, which is currently very small.

<http://dbpedia.org/resource/RDF/XML>
"application/rdf+xml"

>
> 3. Ability to handle existing data that doesn't use a controlled vocabulary (e.g., what if you have all of “XLS”, “Excel”, “Excel 95”, “MS Office Spreadsheet” in the input data?)

And here's where we hit the reason why Ivan created /ns/formats. MIME types don't provide a 1-1 mapping to actual file formats. So I'm going to put it back in and knock out the plain MIME type.

<http://dbpedia.org/resource/RDF/XML>
<http://www.w3.org/ns/formats/RDF_XML>

>
> 4. Some recommendation for how to deal with file formats that are not registered anywhere, e.g., shapefiles

Wikipedia doesn't distinguish between the different versions of Excel (and presumably the same is true for Word, PPT etc. I didn't bother to check). Also, is there a wiki/DBpedia entry for every file format? Which drives me, with all due respect and acknowledgement to the person I'm about to say this to, to knock out DBpedia to leave us with:

<http://www.w3.org/ns/formats/RDF_XML>

Another issue is the push-back from governments on using DBpedia. It may not be considered stable enough (I know, I know...) and if we propose something that people don't like, well, the outcome is obvious.

The /ns/formats solution has come up in the ADMS work as being "something it would be really good to have extended." It's not a huge job to do this, but it is a human task. And that means it needs maintenance. We would need to set up a system that made it easy to add new entries (at the moment you have to write a few files, update a .htaccess file and what have you). So, OK, let's take this out again, which leaves us with:

<empty />

Have we overlooked anything? Well, there's Ed Summers' work that Michael pointed us to

http://mediatypes.appspot.com/

which gets around most of the problems but still leaves us with the lack of 1-1 mapping. But that might be the best we can do.

To be usable, we'd need to bring it into w3.org or some other über-stable domain. And that means sys team support. It's not impossible, especially as the code exists, and if there's a community willing to maintain it, OK, but I wonder if this is the time to test out Sandro's idea of a single domain for a single purpose. That would mean setting up, say, fileformats.org (it's for sale) and then managing it as part of the W3C 'estate'. That's probably an even higher hurdle than getting the necessary permissions to run Ed's code on w3.org.
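
For what it's worth, the fall-back rule floated earlier in the thread ("use a stable URI scheme for file formats if available, falling back to the MIME type if not available") could be sketched roughly as below. This is only an illustration in Python, not anything agreed in the WG: the two lookup tables, the function name describe_format and the choice of application/vnd.ms-excel for the Excel labels are all assumptions made for the example. Only the RDF/XML URI comes from this discussion.

```python
import re

# Hypothetical table of stable format URIs. Only the RDF/XML entry is taken
# from this thread; the namespace is small, so most lookups will miss.
FORMAT_URIS = {
    "application/rdf+xml": "http://www.w3.org/ns/formats/RDF_XML",
}

# Hypothetical cleanup table for uncontrolled labels found in catalog data.
# It collapses "Excel 95" into a generic media type, which is exactly the
# missing 1-1 mapping between MIME types and file formats.
LABEL_TO_MEDIA_TYPE = {
    "rdf/xml": "application/rdf+xml",
    "xls": "application/vnd.ms-excel",
    "excel": "application/vnd.ms-excel",
    "excel 95": "application/vnd.ms-excel",
    "ms office spreadsheet": "application/vnd.ms-excel",
}

# Loose syntactic check for "type/subtype"; it does not prove registration.
MEDIA_TYPE_RE = re.compile(
    r"^[a-z0-9][a-z0-9!#$&^_.+-]*/[a-z0-9][a-z0-9!#$&^_.+-]*$"
)


def describe_format(raw):
    """Return (kind, value); kind is 'uri', 'media-type' or 'unknown'."""
    label = raw.strip().lower()
    # Step 1: map known free-text labels onto a media type.
    media_type = LABEL_TO_MEDIA_TYPE.get(label, label)
    # Step 2: prefer a stable URI when one is known.
    if media_type in FORMAT_URIS:
        return ("uri", FORMAT_URIS[media_type])
    # Step 3: otherwise fall back to the media type string itself.
    if MEDIA_TYPE_RE.match(media_type):
        return ("media-type", media_type)
    # Step 4: anything else (e.g. a bare "rdf") is too ambiguous to use.
    return ("unknown", raw)
```

With these made-up tables, "RDF/XML" and "application/rdf+xml" both resolve to the same URI, the four Excel labels all fall back to one media type, and "rdf" is rejected as ambiguous. Unregistered formats such as shapefiles would still come back as unknown, so requirement 4 is not addressed by this.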

I really hope that others can blow a hole in my thinking and point out the easy answer!

Phil.

>
>>
>> On 10/02/2012 15:51, Michael Hausenblas wrote:
>>>
>>> Or, why not re-deploy Ed's excellent http://mediatypes.appspot.com/
>>> under a W3C domain? :)
>>>
>>> Cheers,
>>> Michael
>>> --
>>> Dr. Michael Hausenblas, Research Fellow
>>> LiDRC - Linked Data Research Centre
>>> DERI - Digital Enterprise Research Institute
>>> NUIG - National University of Ireland, Galway
>>> Ireland, Europe
>>> Tel. +353 91 495730
>>> http://linkeddata.deri.ie/
>>> http://sw-app.org/about.html
>>>
>>> On 10 Feb 2012, at 15:19, Phil Archer wrote:
>>>
>>>> I'm getting some push-back from gov data publishers on using DBpedia
>>>> sadly (it's third party, it's not real, it's not stable, not like all
>>>> our wonderful government department Web sites that sometimes stay on
>>>> line for whole months!). The PRONOM effort that Dave has highlighted
>>>> looks like the kind of thing they'd like more - government agency to
>>>> government agency - as long as there's no ".uk" anywhere in the URIs I
>>>> guess.
>>>>
>>>> How about "use a stable URI scheme for file formats if available,
>>>> falling back to the MIME type if not available" ?
>>>>
>>>> Phil.
>>>>
>>>>
>>>>
>>>> On 10/02/2012 15:06, John Erickson wrote:
>>>>>>> The Right Thing to do would be to get IETF to mint URIs for all media
>>>>>>> types, and get ESRI to register a media type for their file format,
>>>>>>> etc.
>>>>>>> This may not be feasible.
>>>>>
>>>>> ...or maybe we could just follow the same, de facto convention we've
>>>>> been following of using URIs from A Certain Third party:
>>>>>
>>>>> http://dbpedia.org/resource/TIFF
>>>>> http://dbpedia.org/resource/JPEG
>>>>> http://dbpedia.org/resource/GZIP
>>>>>
>>>>> ...etc. ;)
>>>>>
>>>>
>>>> --
>>>>
>>>>
>>>> Phil Archer
>>>> W3C eGovernment
>>>> http://www.w3.org/egov/
>>>>
>>>> http://philarcher.org
>>>> +44 (0)7887 767755
>>>> @philarcher1
>>>>
>>>
>>>
>>
>> --
>>
>>
>> Phil Archer
>> W3C eGovernment
>> http://www.w3.org/egov/
>>
>> http://philarcher.org
>> +44 (0)7887 767755
>> @philarcher1
>>
>
>

--

Phil Archer
W3C eGovernment
http://www.w3.org/egov/

http://philarcher.org
+44 (0)7887 767755
@philarcher1

Received on Tuesday, 14 February 2012 17:52:19 UTC