- From: William Bug <William.Bug@DrexelMed.edu>
- Date: Sat, 8 Jul 2006 11:51:42 -0400
- To: Chimezie Ogbuji <ogbujic@bio.ri.ccf.org>
- Cc: w3c semweb hcls <public-semweb-lifesci@w3.org>, Phillip Lord <phillip.lord@newcastle.ac.uk>
- Message-Id: <870BFC80-8E54-475A-A51E-B57A3673F5BA@DrexelMed.edu>
Hi Chimezie,

I would say we vigorously agree, and any appearance to the contrary was my inability to describe the issue at hand more clearly. To be honest, OS X Mail crashed on me as I tried to send out the first version of this post, which was much more nuanced. I should have taken that as a sign from the almighty Leibniz that this was an email better left unsent. ;-)

In particular, I'm a very heavy user of XML-only technologies - couldn't live without them - for all the reasons you state below. My concern is that, given the nature of the problem under discussion in the debate to which I refer - semantic representation, semantic-based integration, and supporting semantic queries on federated neuroscientific data repositories - the problem is just the opposite: RDF++ technologies are not yet being properly vetted in the larger neuroinformatics community against the requirements at hand, and some are inclined to go with XML-only technologies because it's what they know and what they are invested in. There are many counterexamples to this statement - projects being worked on by folks on this list and others - but they still have limited visibility in the larger community of neuroinformatics researchers.

I should also state I didn't mean to indicate I interpreted what you said as implying XSD/XSLT --> RDF was the preferred way of keeping the XML-only space in sync with the RDF++ space - when such a specific need arises - but merely a potential route when confronted with the specific task of moving XML-only based data representations into the RDF++ space. I think my concern was that this simple answer - without stating the caveats added by Philip & Chris M. - can give some the impression this was what you were saying.

When it comes to performing KE on existing data sources, tools such as GRDDL are - and will continue to be - invaluable.
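As an illustration only, here is a minimal Python sketch of what moving an XML-only record into the RDF++ space involves. It is not GRDDL or XSLT itself, just the same idea in miniature: an explicit mapping from element names to predicates drives the extraction. The XML record, the `http://example.org/` base URI, the `MAPPING` table and the function names are all hypothetical; only the Dublin Core `title` URI is a real vocabulary term.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML record; the element names are invented for illustration.
XML_DOC = """
<experiment id="exp42">
  <title>Cortical imaging study</title>
  <species>Mus musculus</species>
  <note>free text with no agreed meaning</note>
</experiment>
"""

# Hypothetical base URI and element-to-predicate mapping (assumptions).
BASE = "http://example.org/"
MAPPING = {
    "title": "http://purl.org/dc/elements/1.1/title",
    "species": BASE + "vocab/species",
}


def xml_to_triples(xml_text, mapping, base=BASE):
    """Return (triples, unmapped_tags) for one flat XML record."""
    root = ET.fromstring(xml_text)
    subject = base + root.get("id")
    triples, unmapped = [], []
    for child in root:
        predicate = mapping.get(child.tag)
        if predicate is None:
            # No agreed meaning for this element: it cannot be converted.
            unmapped.append(child.tag)
            continue
        triples.append((subject, predicate, (child.text or "").strip()))
    return triples, unmapped


def to_ntriples(triples):
    """Serialize (subject, predicate, literal) tuples as N-Triples lines."""
    lines = []
    for s, p, o in triples:
        escaped = o.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{s}> <{p}> "{escaped}" .')
    return "\n".join(lines)


if __name__ == "__main__":
    triples, unmapped = xml_to_triples(XML_DOC, MAPPING)
    print(to_ntriples(triples))
    print("unmapped elements:", unmapped)
```

The mechanics are trivial once the mapping table exists. The real work is deciding what each element means, and anything left out of the mapping (the `note` element here) simply does not make it into the RDF output.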
XSLT-based translation will also be required, but as Philip has indicated in his response, this can be fraught with problems and is not really an invertible process. Yes - you can certainly perform the translation in the opposite direction, but if you are seeking to move the semantic information into the XML-only space - as opposed to merely moving data back to an XML-only representation to lean on the uniform, explicit syntax provided by some constellation of XSDs & XSLTs used to interoperate amongst them - you probably shouldn't even be going back to XML-only space at all.

Part of the confusion in the debate, I believe, results from the fact that the work being done needs to support both KE from existing sources and the provision of tools and "best practices" for how we'd want to see researchers encapsulate semantic information going forward. Such broad requirements will profit from using BOTH XML-only and RDF++ specific technologies. As you say, it all comes down to the Use Cases and user requirements you seek to support.

Please see additional, brief inline comments below.

Cheers,
Bill

On Jul 8, 2006, at 8:23 AM, Chimezie Ogbuji wrote:

> On Sat, 8 Jul 2006, William Bug wrote:
>
>> Dear Philip,
>>
>> Many thanks for this concise and accessible qualification to Chimezie's explanation. I was a little crest-fallen when I saw his original answer to Trish, and thought I really had misunderstood an issue that is becoming of very significant importance to several projects with which I'm involved.
>>
>> There have been several debates recently in the neuroinformatics community as to whether an XML-only approach (XML, XSD, XSLT, XLink) will suffice when creating sub-domain knowledge resources - especially if you are just collecting terminologies, as opposed to creating a full-blown, well-founded ontology.
>> Whether it really isn't necessary to go to Semantic Web tech - i.e., the constellation of RDF-associated specs (RDF++ - sorry to add to the acronym soup - this is just a shorthand for this email) and the growing number of utilities for manipulating RDF/OWL and all the other RDF-related formalisms.
>
> Here is the crux of the issue. I think there is a misunderstanding of my original response. In suggesting that XSLT makes such a transformation relatively painless (from an established XML format to one or more RDF representations), I wasn't suggesting this as an argument *for* XML-only representation but as a consideration that shouldn't be disregarded. I think one of the biggest misconceptions among people who debate whether to go for XML-only solutions versus RDF++ (as you put it) is that the two technologies are mutually exclusive - which the ability to write such XSLT transforms shows is not the case at all. After all, XML *is* in the semantic web stack and for good reason as well.
>
> I think too much time is often invested in comparison and contrast of two representation languages that each address a different set of problems rather than in focusing on asking the more important question of what the requirements for representation are:
>
> 1) Is the data you wish to represent subject to lots of interpretation?
> 2) Is uniform syntax more important than semantics?
> 3) Is the domain being modelled subject to expansion in a semantic way?
> 4) What is the nature of systems with which interoperability is important?
>
> etc..
I should have explained more fully at the outset that the answers to all of these requirements questions - for the specific issue in which this debate has arisen - are:

1) yes
2) no
3) yes
4) semantic integration is what's under question

> I think a handful of your points below fall more along the line of direct comparison and contrast that I don't think is as useful for answering the questions the neuroinformatic community may be grappling with than focusing on what are the specific problems being solved and what are the short and long term requirements / goals for representation.

Admittedly, I should have been clearer about the focus of the debate, as I've done above.

> Cross-technological debates with well established trenches often do very little to answer the original questions but only further misconceptions - which is why the subject of this post (XML vs RDF) concerns me.

Sorry - I intended it to catch folks' attention. I agree stating the topic so broadly doesn't help provide guidance on how to implement specific solutions to well defined requirements. The problem I'm having is that if you go to Google and enter that very string, "XML vs RDF", you get a myriad of answers all coming from different directions. In this particular context - semantic representation, integration, manipulation for the life science space - it would be very helpful to have this group present the pros AND the cons for the rest of the community to use as a set of "best practices". One might then ask how it helps to achieve that goal to just post an email to this list with that generic title. My answer would be that it gets the attention of the people who care - and have cogitated over this issue - thereby bootstrapping the process of creating such a resource.
>
> Both representation languages bring with them a set of well established tools that become readily available once you express your content with them and you have more to gain in leveraging dual-representation between both (where it's feasible - I agree with the qualification of the use of XSLT that emphasizes that it's contingent on having a well defined mapping in the first place) via XSLT.

The word I'd focus on in this comment is BOTH. I agree with you completely. My concern - as some of the points in my previous email point to - is that some don't feel the RDF++ space has the required tools.

> Consider for instance XForms (which we are using quite heavily for instance data entry). XForms is an XML dialect that addresses specific and well known pitfalls with legacy browser-based user interface dialects and does so in a *very* powerful and promising way. If a dynamic, expressive means of data entry is an important requirement for your data (as it is in our case) then you already have a good argument for having representation in XML for which there is no equivalent alternative in an RDF++ only approach. The main difficulty is that with forms-based user interfaces uniform syntax and declarative structure are of more concern than semantics. I've chatted about this before, see this thread:
>
> http://www.dehora.net/journal/2005/08/automated_mapping_between_rdf_and_forms_part_i.html
>
> Of course, you don't get your lunch for free and the price for leveraging uniform syntactical representation in order to simplify your use of forms for data entry is the effort up front in devising a mapping that provides the level of semantic grounding (if you will) sufficient for your needs and expressing such a mapping in an XSLT transform.

Yes - absolutely - HTTP based data entry technologies have advanced considerably.
XForms, XHTML, AJAX - not that they are mutually exclusive - can all add considerable flexibility and responsiveness to a data entry environment. I totally agree. As these continue to mature, it would be wonderful if specific extensions were designed to interface with the RDF++ space, where representation of semantic specificity and relations is an important part of the data entry process. This is critical to what we are trying to do in BIRN with BIRNLex, its use for semantic annotation of data, and the use of the resulting annotation by the BIRN query mediator. Daniel Rubin has indicated there are projects underway at NCBO that will be specifically valuable in the area of semantically-based data entry. There are many tools already developed in the context of working with the Gene Ontology that also support many of these requirements.

>> driver behind the creation of RDF++. You'll have a lot more code to write and maintain, if you don't take advantage of Semantic Web tech.
>
> This depends more on what it is you are trying to achieve with representation than on the technologies by themselves, so I don't agree with this very broad assessment.

Again - sorry - I meant this statement to be limited to the still general requirements laid out in my answer to your 4 questions above and should have stated this more clearly.

>> 6) We can leave it to others to create XSLT converters to move the XML-only resources into the RDF++ space
>> Philip & Chris M. have both given clear answers to this ill-advised use of XSLT.
>
> I don't see how use of XSLT in this way can be considered 'ill-advised' and I don't think that was the point. The issue is that a necessary prerequisite for using XSLT in this way is a well defined mapping (if such a mapping exists) to begin with.
> Once you have a well established mapping, XSLT *does* render the remaining mechanics a non-issue and it's for this particular reason that I think disregarding such a possibility is more ill-advised, especially if there is already a large and valuable body of existing XML content - this is precisely one of the main motivations for technologies such as GRDDL.

See above. I don't feel it's ALWAYS ill-advised, I just think - as you are clearly stating throughout - one shouldn't rely on this solution to be appropriate for all movement of data between the XML-only space and RDF++ space. Where it's specifically ill-advised is in assuming the existence of this option precludes your having to even think about direct use of RDF++ to meet your formal semantic information representation needs now. As I believe you state below, it's not that this is an "ill-advised use of XSLT", but rather it's the assumption that the availability of this option precludes having to consider RDF++ technologies in designing your specific solution. That's what's ill-advised.

>> The other issue Eric N. has described clearly is the N**2 problem - the combinatorial proliferation of XSLTs as more XSDs are added to the mix.
>
> Once again, a misunderstanding of what I was suggesting. The ability to use XSLT in such a fashion isn't an endorsement of XML-only representation solutions but an effective way to leverage dual representation where there is value to do so.

Agreed.

>> 9) Proponents of RDF++ argue that XML has limited semantic expressivity, but that's just not true.
>> I think this argument is completely inverted. The problem is XML has nearly unlimited expressivity, but any semantic meaning you want to imbue your XML with must be made explicit in the parsers you write.
>
> An XML parser interprets at the syntactic level (not at the semantic level).
> Semantic mapping from XML dialects typically occurs directly via XSLT (written perhaps by those familiar with the XML schema) to RDF or by other more novel means. See:
>
> http://copia.ogbuji.net/blog/2006-04-03/_Semantic_
>
> Of course, such mappings will not be sufficient if your original needs for representation go above and beyond what XML provides (with regards to semantic expressiveness), but it's worth noting that there *is* a spectrum of opportunity between both technologies.

Again - agreed. Sorry for the generality that makes this statement ambiguous. When I said parser above, I was including all the code you write - or make use of - which includes the low-level syntactic parser such as one gets from Xerces, the XSLT mapping from the XSD into the specific semantic representation you require, and any other code you need to write to fully realize your semantic extraction/representation requirements.

>> I) if you try to perform semantically-based KE/KR/KD with XML-only, you will have a lot more code to write & maintain YOURSELF - and much of it will reproduce what you'd get automatically using RDF++.
>
> XML was never meant to address Knowledge Representation and attempts to use it in such a fashion are the fault of the author not the technology being misused.

My point exactly. Sorry I didn't state this more clearly.

>> II) You just can't provide the flexibility, guaranteed resolvability of resources, and efficient expression required when representing semantic relations in the rigid, strictly hierarchical document-oriented world of XML-only, so you'll likely fall short on a lot of your requirements.
>
> Only with those requirements that have more to do with KR and ubiquitous semantics than uniform, interoperable syntax. Once again, the more constructive questions are about the nature of the requirements not the two technologies by themselves - there *is* always a context with their use.
>
> Ask yourself why message protocols such as REST / POX and Web Services are expressed in XML and not in RDF. Ask yourself why the same is true of user interface dialects (such as XHTML and its derivatives - XForms), syndication formats, etc.. and perhaps the value of context and the nature of the problem being solved becomes more evident.
>
> Polarizing comparison and contrast of both ends of the representation strata does more harm than good to both technologies and the more constructive questions should *first* be about what the requirements for representation are.
>
>> I'd really appreciate hearing the views both pro & con on these issues from others on this list.
>>
>> Thanks again, Philip, for your lucid and concise explanation.
>>
>> Cheers,
>> Bill
>>
>> On Jul 7, 2006, at 6:35 AM, Phillip Lord wrote:
>>
>>> >>>>> "TW" == Trish Whetzel <whetzel@pcbi.upenn.edu> writes:
>>>
>>> TW> Hi all,
>>> TW> As a terribly simple question, is it possible to take the actual
>>> TW> FuGE-ML that is generated on a per instance reporting of an
>>> TW> experiment/study/investigation and then convert that to RDF for
>>> TW> use with semantic web technologies?
>>>
>>> Converting between one syntax and another is fairly simple, and there are some reasonable tools for it. XSLT would work for converting XML into RDF. I wouldn't like to use it for converting the other way (actually I wouldn't like to use it at all, but this is personal prejudice!).
>>>
>>> This is assuming, however, that the semantics of the two representations are compatible. To give an example, syntactically it is possible to convert between the GO DAG and an OWL representation of GO. However, the GO part-of relationship doesn't distinguish universal and existential, while OWL forces you to make this distinction; you can't sit on the fence.
>>>
>>> So, the simple answer to a simple question is: it depends.
>>> I wouldn't assume that FuGE-ML will be convertible into a given ontology or representation in RDF, unless a reasonable amount of care is taken in the design of FuGE-ML or the ontology to ensure that it can happen.
>>>
>>> Course, you could always hack it with some rules and a bit of human intervention. That works as well.
>>>
>>> Cheers
>>> Phil
>
> Chimezie Ogbuji
> Lead Systems Analyst
> Thoracic and Cardiovascular Surgery
> Cleveland Clinic Foundation
> 9500 Euclid Avenue/ W26
> Cleveland, Ohio 44195
> Office: (216)444-8593
> ogbujic@ccf.org

Bill Bug
Senior Analyst/Ontological Engineer

Laboratory for Bioimaging & Anatomical Informatics
www.neuroterrain.org
Department of Neurobiology & Anatomy
Drexel University College of Medicine
2900 Queen Lane
Philadelphia, PA 19129
215 991 8430 (ph)
610 457 0443 (mobile)
215 843 9367 (fax)

Please Note: I now have a new email - William.Bug@DrexelMed.edu

This email and any accompanying attachments are confidential. This information is intended solely for the use of the individual to whom it is addressed. Any review, disclosure, copying, distribution, or use of this email communication by others is strictly prohibited. If you are not the intended recipient please notify us immediately by returning this message to the sender and delete all copies. Thank you for your cooperation.
Received on Saturday, 8 July 2006 15:52:03 UTC