- From: Matthias Samwald <samwald@gmx.at>
- Date: Sat, 11 Sep 2010 12:04:07 +0200
- To: "Lee Feigenbaum" <lee@thefigtrees.net>, "Michel_Dumontier" <Michel_Dumontier@carleton.ca>
- Cc: "Eric Prud'hommeaux" <eric@w3.org>, "Chimezie Ogbuji" <ogbujic@ccf.org>, <public-semweb-lifesci@w3.org>
I guess we should keep in mind that this discussion was (at least originally) not about how units are represented on the Semantic Web, but how they should be represented for a specific project: the TMO. Different people, projects and communities will have different needs, and we will not be able to achieve a consensus that will make everyone happy. Therefore, it might be reasonable to focus on the specific case of TMO -- and maybe some of the consensus we reach there can be generalized to other areas.

David wrote:
> the Mars Climate Orbiter was famously lost because one team assumed Metric
> units and another team assumed English units

It is silly not to include explicit information about units, but it might be equally silly not to use SI units in a science or technology environment. I guess it might be easy to say this as a continental European, but non-SI units should be eradicated from sci/tech data. That might have more impact on interoperability than any standardized vocabularies, mapping algorithms etc., and it might be simpler to implement in the long run.

However, I see one problem with requiring data providers to convert their units to standard units (besides the extra effort involved): in some settings it might be important to capture the _original_ value and unit of the measurement, just for the sake of knowing the original datum. This might even be a legal requirement in some clinical settings. In my understanding, the goal of TMO is to be used in translational research, not clinical practice, and therefore this will probably not be an issue.

Mark wrote:
> It speaks to a conversation that I had with my review committee this
> morning about how The Web was built by simply being completely open.
> Anyone could (can) publish anything in any way they want, so long as they
> adhere to the simple rules of HTML. I am very concerned that the Semantic
> Web is not learning its lessons from the WWW. We are trying to
> institutionalize everything, and that simply doesn't work (it doesn't
> scale!).

I guess the classic web and its tremendous global success is a good inspiration, but I am not sure about how easily the principles of the web can be translated into principles of the web of data. The 'anything goes' approach might just shift the problem from the data publishing phase to the data consumption phase, which could result in the temporary belief of having solved the problem.

Let me make a bold statement: there is no lack of biomedical RDF data anymore. In fact, we are now in a situation where the same open dataset is often RDFized several times by different groups. This growing number of duplicated efforts is an interesting new development, and I might try to document and analyze this trend when I find the time. Still, it is far from trivial to actually query these datasets, because of their heterogeneity.

The answer is not to institutionalize everything, but to simply make RDF publishers better aware of concerns about overabundant heterogeneity and lack of transparency. And it could be a good reason to reduce sources of heterogeneity in a project that is under our control, such as the TMO.

Cheers,
Matthias Samwald

// DERI Galway, Ireland
// Konrad Lorenz Institute for Evolution and Cognition Research, Austria
// http://samwald.info

Received on Saturday, 11 September 2010 10:04:47 UTC