- From: Eric Jain <Eric.Jain@isb-sib.ch>
- Date: Thu, 12 Feb 2004 12:22:21 +0100
- To: "rdf-interest" <www-rdf-interest@w3.org>
- Cc: "Jeremy Carroll" <jjc@hplb.hpl.hp.com>
> http://www-uk.hpl.hp.com/people/jjc/tmp/trix.pdf

Interesting approach. I definitely agree there is a problem with the current syntax. However, the main issue I see is that too many strategies are mixed into a single syntax. It is bound to be used by everyone in quite different ways (see Perl, TMTOWTDI).

A consequence of the flexible syntax is that current parsers are far too slow to be of any use when dealing with large amounts of data. By restricting myself to a subset of the complete syntax, I was able to write a parser that is a full order of magnitude faster than ARP (distributed with Jena). Others may be able to do even better, provided they don't reconsider and decide that the technology is neither suitable nor worth the effort.

In our case it is important that the files we distribute can also be used by people who are familiar with XML but completely clueless about RDF (the majority, today). Therefore, I'd rather not introduce terms such as 'graph', 'triple' and 'literal' into the syntax.

Grouping statements into what you call 'graphs', on the other hand, is very useful for people trying to map the data to objects. This task can, however, also be simplified by requiring logical sets of statements to occur in sequence, rather than being scattered throughout the file.

Interestingly, when presented with the choice of working with an XML or an RDF/XML representation of the same data, our developers (somewhat familiar with XML, not RDF) chose to use the RDF version (to my great relief :-). The data is relatively complex, with lots of cross-referencing, which the RDF/XML syntax can handle in a simple and consistent way. See below.

Another issue is size. The RDF/XML data is currently not more than 20% larger than plain XML. Using a syntax such as TriX, on the other hand, would I fear increase the size by a factor of at least two, which is more than is acceptable.
In conclusion, what we need, I believe, is not a new syntax, but rather something along the lines of Simon St. Laurent's 'Common XML' [http://www.simonstl.com/articles/cxmlspec.txt]; let's call it 'Common RDF'...

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns="http://expasy.org/rdf-syntax-ns#"
>
  <rdf:Description rdf:about="urn:lsid:expasy.org:uniref:C50-Q10466">
    <rdf:type rdf:resource="&exp;Cluster"/>
    <rdfs:label>Titin, heart isoform N2-B related cluster</rdfs:label>
    <similarity>0.5</similarity>
    <gene rdf:ID="_2" rdf:resource="#_1"/>
    <member rdf:resource="urn:lsid:expasy.org:uniprot:Q10466"/>
    <member rdf:resource="urn:lsid:expasy.org:uniprot:Q8TCG8"/>
    <member rdf:resource="urn:lsid:expasy.org:uniprot:Q15598"/>
    ...
  </rdf:Description>
  <rdf:Description rdf:about="#_1">
    <rdf:type rdf:resource="&exp;Gene"/>
    <rdfs:label>BRCA</rdfs:label>
    ...
  </rdf:Description>
  <rdf:Description rdf:about="#_2">
    <rdf:type rdf:resource="&exp;ExtendedStatement"/>
    <updated>2004-02-01</updated>
    ...
  </rdf:Description>
  <rdf:Description rdf:about="urn:lsid:expasy.org:uniref:C50-Q10467">
    <rdf:type rdf:resource="&exp;Cluster"/>
    ...
  </rdf:Description>
  ...
</rdf:RDF>
Received on Thursday, 12 February 2004 06:24:18 UTC