- From: Jason Diamond <jason@injektilo.org>
- Date: Sat, 11 Nov 2000 14:45:56 -0800
- To: "www-rdf-interest" <www-rdf-interest@w3.org>
> This was something I thought about when implementing Redland, and in
> a language without decent, portable threading such as C [or Java,
> threading in Java is awful], you can't have an active thread of
> control in more than one place, so you have to compromise. For
> managing streams of statements generated by de/serialising models, I
> created the stream abstraction which handled the data flow
> interaction, pulled by the reader. This seems to work OK.

That's exactly the kind of abstraction that I was talking about. And you
even did it in C!

> For RDF/XML parsers, I can see why pushing and pulling interfaces
> would be useful, e.g. today just for fun I used Repat and
> Redland to parse 1/6 of the 600M of dmoz RDF data before it hit a
> mis-aligned tag and stopped. It consistently used a small amount of
> memory, since it was not storing anything in memory, just what you
> need for that size of data. You wouldn't want a DOM like way (ROM?,
> W[eb]OM?) where it was all stored in memory and then made available.

How did you convert the dumps to real RDF? Does anyone know why they
haven't converted over yet?

> However for small data (say standalone RDF/XML docs) you might
> want a DOM-like view, since it would be more convenient to work with.

I agree. Manipulating an RSS model, for example, would be much more
convenient if you could load it into something similar to a DOM. Of
course, we're not talking about _the_ DOM, but an in-memory RDF graph
similar to Redland's RDF Model Class.

Populating the graph with statements from serialized XML should ideally
bypass loading the XML into a DOM and use either a simpler push or pull
based parser. Loading the data into one object model just to convert it
into another simply wastes cycles. If you already had a DOM, however,
you should be able to "read" your statements from it using abstractions
like your RDF Statement Stream Class and my RDFReader class.
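As a rough illustration of the pull style discussed above, here is a minimal Python sketch. It is not Redland's stream API or RDFReader; the names `read_statements`, `count_statements` and `load_graph` are hypothetical, and the input is a toy one-statement-per-line format rather than RDF/XML. The point is only that one pull-based reader can feed either a consumer that keeps nothing in memory or one that builds a small DOM-like graph.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Statement = Tuple[str, str, str]  # (subject, predicate, object)


def read_statements(lines: Iterable[str]) -> Iterator[Statement]:
    """Pull-style reader: produces one statement each time the caller asks."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Toy format (an assumption): "subject predicate object", where the
        # object is everything after the second space.
        subject, predicate, obj = line.split(" ", 2)
        yield (subject, predicate, obj)


def count_statements(stream: Iterable[Statement]) -> int:
    """Streaming consumer: keeps a counter, never the statements themselves."""
    return sum(1 for _ in stream)


def load_graph(stream: Iterable[Statement]) -> Dict[str, Dict[str, List[str]]]:
    """In-memory consumer: subject -> predicate -> list of objects."""
    graph: Dict[str, Dict[str, List[str]]] = defaultdict(lambda: defaultdict(list))
    for subject, predicate, obj in stream:
        graph[subject][predicate].append(obj)
    return graph


SAMPLE = [
    "# toy data, not real RSS",
    "ex:channel rss:title Example Channel",
    "ex:channel rss:item ex:item1",
    "ex:item1 rss:title First item",
]

print(count_statements(read_statements(SAMPLE)))
graph = load_graph(read_statements(SAMPLE))
print(graph["ex:channel"]["rss:title"])
```

Because the reader is a generator, nothing is parsed until a consumer pulls from it, so the choice between streaming and loading is made by the caller rather than by the parser.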
I haven't quite been able to wrap my head around how one would use an
RDF model like the DMOZ dumps without loading it into memory or
importing it into a database that you could query against. The question
I have is this: Can an API be devised that abstracts away whether or not
the model is loaded into memory or persisted in a database and still be
useful to us as developers?

> > I currently favor the resource-centric view. I think most
> > developers today who are used to OO programming would find it more
> > familiar as well. But the statement-centric model is more
> > appropriate for logic and inferencing.
>
> However the formal model is defined in terms of statements - fun
> isn't it! I think it is easy to write the resource-centric API
> around statements which makes practical sense since all the proposed
> storage systems for RDF are also based on statements.

I remember reading (but can't recall where) that the RDF model is
actually extremely close to the relational model. The author pointed out
that columns are like properties and fields are like objects, where the
primary key in each row is the subject. This obviously doesn't take into
consideration repeated properties and a number of other issues, but it
did open my mind up to start thinking about storing RDF in a more
resource-centric manner. One of my goals for RDF.NET is to explore this
approach and see where it ends up.

> An RDF InfoSet - funny you should say that, some of us were
> discussing what that would mean recently. The contention is that the
> processing of RDF/XML syntax generally loses information (namespace
> prefixes, aboutEachPrefix, xml:lang, ...) which is bad and the
> output should be defined in terms of the Information Items expected
> with no information loss.

Supposedly, the RDF model is so wonderfully simple that it doesn't need
an Infoset. It's all just triples! We all know, however, that that just
isn't true.
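To make the "not just triples" point concrete, here is a minimal Python sketch of a statement record that carries node types and a language alongside the three usual values. The class, field names and type labels ("uri", "literal") are assumptions for illustration, not taken from repat, RDFReader or Redland; the sample data is made up.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Statement:
    subject_type: str                # e.g. "uri" (label is an assumption)
    subject: str
    predicate: str
    object_type: str                 # e.g. "uri" or "literal"
    obj: str
    language: Optional[str] = None   # xml:lang in scope, if any

    def as_triple(self) -> Tuple[str, str, str]:
        """Drop everything except subject, predicate and object."""
        return (self.subject, self.predicate, self.obj)


# Two statements that differ only in the language of the literal.
english = Statement("uri", "ex:doc", "dc:title", "literal", "Chat", "en")
french = Statement("uri", "ex:doc", "dc:title", "literal", "Chat", "fr")

print(english == french)                          # False: the records differ
print(english.as_triple() == french.as_triple())  # True: the triples do not
```

Reduced to bare triples, the two statements become indistinguishable, which is the kind of information loss the quoted Infoset discussion is worried about.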
I thought that David Megginson did a good job of identifying the
components of a statement (as implied by RDF M&S 1.0): SubjectType,
Subject, Predicate, ObjectType, Object, and Language. Both repat and
RDFReader have taken that approach and I think it works well, though I
would have preferred the simpler model that was advertised.

Jason.
Received on Saturday, 11 November 2000 17:48:54 UTC