- From: Patrick Stickler <patrick.stickler@nokia.com>
- Date: Thu, 12 Feb 2004 13:54:05 +0200
- To: "ext Eric Jain" <Eric.Jain@isb-sib.ch>
- Cc: "Jeremy Carroll" <jjc@hplb.hpl.hp.com>, "rdf-interest" <www-rdf-interest@w3.org>
On Feb 12, 2004, at 13:22, ext Eric Jain wrote:

>> http://www-uk.hpl.hp.com/people/jjc/tmp/trix.pdf
>
> Interesting approach. I definitely agree there is a problem with the
> current syntax. However, the main issue I see is that there are too many
> strategies mixed into a single syntax. This is bound to be used by
> everyone in quite different ways (see Perl, TMTOWTDI).
>
> A consequence of the flexible syntax is that current parsers are way too
> slow to be of any use when dealing with large amounts of data. By
> restricting myself to a subset of the complete syntax, I was able to
> write a parser that is a full order of magnitude faster than ARP
> (distributed with Jena). Others may be able to do even better, provided
> they don't reconsider and decide that the technology is neither suitable
> nor worth the effort.
>
> In our case it is important that the files we distribute can also be
> used by people who are familiar with XML, but completely clueless about
> RDF (the majority, today).

Certainly one of the target groups that TriX is meant for.

> Therefore, I'd rather not introduce terms
> such as 'graph', 'triple' and 'literal' into the syntax.

The vocabulary of TriX was specifically intended to reflect the
official terminology used to describe the RDF graph syntax.

One of the challenges to folks learning RDF is that the vocabulary
used for RDF/XML does not sync with the abstract model of the graph.

> Grouping of
> statements into what you call 'graphs' on the other hand is very useful
> for people trying to map the data to objects.

They are called 'graphs' because that's precisely what they are:
RDF graphs.

TriX reflects the underlying graph model of RDF in as true a fashion
as possible, just as N-Triples does. Hence "TriX" -> Triples in XML.

> This task can however also
> be simplified by requiring logical sets of statements to occur in
> sequence, rather than being scattered throughout the file.
>
> Interestingly, when presented with the choice of working with an XML or
> an RDF/XML representation of the same data, our developers (somewhat
> familiar with XML, not RDF) choose to use the RDF version (to my great
> relief :-). The data is relatively complex, with lots of
> cross-referencing, which the RDF/XML syntax can handle in a simple and
> consistent way.

See below.

> Another issue is size. The RDF/XML data is currently not more than 20%
> larger than plain XML. Using a syntax such as TriX on the other hand I
> fear would increase the size by a factor of at least two, more than
> acceptable.

I think you will find that introducing mechanisms for compression
of the expression of statements will result in either (a) variability
in representation, reducing the utility for tools such as XQuery,
and/or (b) complexity in parsing/output, reducing the utility for
tools such as XSLT, SAX, etc.

TriX is not intended to be used by humans.

It is also not necessary to explicitly serialize graphs as TriX,
but simply to provide a virtual interface to a knowledge base that
allows generic XML tools to view/search/manipulate the graph in
terms of the TriX syntax.

In this way, the same XQuery could be executed against a knowledge
base and/or an actual TriX instance.

One could then think of TriX as a means of integrating RDF and XQuery.

Patrick

--

Patrick Stickler
Nokia, Finland
patrick.stickler@nokia.com
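
The point about generic XML tools working against the TriX syntax can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the TriX proposal: the namespace URI, the example.org data, and the helper names `triples` and `objects_of` are all hypothetical, and Python's standard `xml.etree.ElementTree` stands in for XQuery.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI of a later published TriX version; the draft
# discussed in this thread may have used a different one.
TRIX_NS = "http://www.w3.org/2004/03/trix/trix-1/"

# Hypothetical example data: one graph holding two triples.
DOC = f"""<TriX xmlns="{TRIX_NS}">
  <graph>
    <uri>http://example.org/graph1</uri>
    <triple>
      <uri>http://example.org/Bob</uri>
      <uri>http://example.org/wife</uri>
      <uri>http://example.org/Mary</uri>
    </triple>
    <triple>
      <uri>http://example.org/Bob</uri>
      <uri>http://example.org/name</uri>
      <plainLiteral>Bob</plainLiteral>
    </triple>
  </graph>
</TriX>"""


def triples(xml_text):
    """Return every triple in the document as a (subject, predicate, object) tuple."""
    root = ET.fromstring(xml_text)
    ns = {"t": TRIX_NS}
    result = []
    for triple in root.findall("t:graph/t:triple", ns):
        # Each <triple> has exactly three children: subject, predicate, object.
        s, p, o = (child.text for child in triple)
        result.append((s, p, o))
    return result


def objects_of(xml_text, predicate):
    """Return the objects of all triples whose predicate matches."""
    return [o for _, p, o in triples(xml_text) if p == predicate]


if __name__ == "__main__":
    for s, p, o in triples(DOC):
        print(s, p, o)
```

Because every statement has the same fixed shape, the lookup is a single path expression with no special cases, which is the uniformity argued for above. The same query would work whether the XML comes from a file or is generated on the fly from a knowledge base.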
Received on Thursday, 12 February 2004 06:54:01 UTC