- From: Nicholas J Humfrey <njh@aelius.com>
- Date: Tue, 29 Jan 2013 20:42:38 -0000
- To: denny.vrandecic@wikimedia.de
- Cc: scorlosquet@gmail.com, semantic-web@w3.org
Hello Denny,

Sorry, I am not on the semantic-web mailing list - too many emails for me to take in and not enough time. Stéphane kindly forwarded your email to me.

No, it is not currently possible to serialise a triple stream with EasyRdf. I took this decision for a number of reasons:

1) EasyRdf was designed with the BBC's web platform in mind. This typically uses Java (and others) as a 'heavy lifting' service layer and PHP as a lightweight presentation layer. As such, PHP should only have a single page's worth of data to process at a time - thus streaming was not an important requirement.

2) At the core of EasyRdf is a graph model object. EasyRdf started off as an object model layer on top of ARC2 (and others). Since ARC2 has had less development work done on it, I have been expanding the number of native parsers and serialisers in EasyRdf. I want to avoid making it overly complex with multiple APIs for doing similar things (!)

3) The HTTP client API that I have been using (based on Zend_HTTP_Client, which is again what the BBC uses) doesn't support streaming - it loads the full response into memory. Therefore there are fewer benefits in EasyRdf being able to stream triples.

4) I have worked hard to try and make the RDF/XML and Turtle serialisations as pretty as possible - this involves collecting/sorting all the same resources and properties together, so that the document reads well. Otherwise you just end up with a triple-oriented document that reads like N-Triples or TriX. Some implementations (such as Redland) do this within the serialiser itself, but that seemed like an extra overhead when I already had the data organised like that inside the EasyRdf graph object.

Having said all of that, some of the serialisers would be fairly easy to convert, and I would be willing to look at changing the API in order to help you with your requirements (I am a big fan of Wikidata!).
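To make the trade-off in point 4 concrete, here is a minimal sketch contrasting the two approaches. It is written in Python rather than PHP and is not EasyRdf's API: the function names `stream_ntriples` and `grouped_turtle`, and the example.org data, are hypothetical and only illustrate the idea. Terms are assumed to be already formatted as N-Triples/Turtle tokens.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# (subject, predicate, object), each already a formatted RDF term
Triple = Tuple[str, str, str]


def stream_ntriples(triples: Iterable[Triple]) -> Iterator[str]:
    """Emit one line per triple as it arrives; nothing is buffered."""
    for s, p, o in triples:
        yield f"{s} {p} {o} ."


def grouped_turtle(triples: Iterable[Triple]) -> str:
    """Collect all triples first, then write each subject exactly once.

    The whole input must be held in memory before any output is
    produced, which is why this style cannot stream.
    """
    graph: Dict[str, Dict[str, List[str]]] = defaultdict(lambda: defaultdict(list))
    for s, p, o in triples:
        graph[s][p].append(o)

    blocks = []
    for s in sorted(graph):
        # One line per predicate, with its objects separated by commas.
        props = [f"    {p} {', '.join(graph[s][p])}" for p in sorted(graph[s])]
        blocks.append(s + "\n" + " ;\n".join(props) + " .")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    # Hypothetical data: subject <a> appears in two non-adjacent triples.
    sample = [
        ("<http://example.org/a>", "<http://example.org/name>", '"A"'),
        ("<http://example.org/b>", "<http://example.org/name>", '"B"'),
        ("<http://example.org/a>", "<http://example.org/knows>", "<http://example.org/b>"),
    ]
    print("\n".join(stream_ntriples(sample)))
    print()
    print(grouped_turtle(sample))
```

If the source can deliver triples already sorted by subject and property, the grouping step only needs to buffer one subject at a time, which is what makes the pre-sorting question relevant.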
It would also make sense to not have multiple PHP libraries for serialising RDF, with varying quality and features - I think this is one of the reasons why the semantic web hasn't taken off faster.

What is your streaming source of triples? Are you serialising direct from the database? Can the database pre-sort subjects and properties, so they are ready to be serialised? Is this for a bulk export or individual API queries?

nick.

> ---------- Forwarded message ----------
> From: Denny Vrandečić <denny.vrandecic@wikimedia.de>
> Date: Tue, Jan 29, 2013 at 11:54 AM
> Subject: Light-weight streaming PHP library for RDF serialization?
> To: SW-forum <semantic-web@w3.org>
>
>
> Hi,
>
> is there an actively maintained open source pure PHP library that can be
> used to create RDF serialization from a model?
>
> It should be able to stream a big number of triples.
>
> Pluspoints if there it has no Parser or SPARQL processing library as a
> dependency, in order to decrease the size of the library (smaller library =
> happier code reviewer, less maintenance costs).
>
> Cheers,
> Denny
>
> --
> Project director Wikidata
> Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
> Tel. +49-30-219 158 26-0 | http://wikimedia.de
>
> Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V.
> Registered in the register of associations of the Amtsgericht
> Berlin-Charlottenburg under number 23855 B. Recognised as charitable by the
> Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.
>
>
>
> --
> Steph.
>
Received on Tuesday, 29 January 2013 20:43:01 UTC