- From: Pieter Colpaert <pieter.colpaert@ugent.be>
- Date: Mon, 24 Apr 2017 16:54:40 +0200
- To: Nicholas Humfrey <nicholas.humfrey@bbc.co.uk>
- Cc: "public-lod@w3.org" <public-lod@w3.org>
Hi Nick,

Thank you! I agree fully that having more choices for processing RDF in PHP is a good thing, and I remain a fan of your work! :)

I think comparing our two libraries as DOM vs. SAX would be far-fetched. Hardf also supports just parsing a small file in memory, and does that much faster as well.

I think we could say that EasyRDF is a high-level library while hardf provides low-level functionality. The only functional features it offers over EasyRDF are Quads and N3 parsing, though. The low-level interface of hardf is pretty limited compared to the rich and developer-friendly interface of EasyRDF. Furthermore, it does not parse JSON-LD or RDF/XML and does not yet provide in-memory querying functionality.

Its performance comes from the simple representation of triples in an array. I think EasyRDF could become much faster if it adopted the same triple representation:

https://github.com/pietercolpaert/hardf#triple-representation

Hardf would then be useful for providing parts of the EasyRDF functionality. It would, however, require a big EasyRDF refactor. Would you be interested in that?

Kind regards,

Pieter

On 24-04-17 16:26, Nicholas Humfrey wrote:
> Hello,
>
> Congratulations on Hardf, it is great to have more choices for processing
> RDF in PHP.
>
> It is worth noting that EasyRDF was never intended to handle large
> numbers of triples. EasyRDF loads everything into memory, so the data
> structures can be walked, queried and manipulated.
>
> Could EasyRDF v. Hardf be likened to DOM v. SAX?
>
> nick.
>
> On 18/04/2017, 17:50, "Pieter Colpaert" <pieter.colpaert@ugent.be> wrote:
>
>> Thank you Masahide!
>>
>> I have created an issue about this and added a failing test to the
>> development branch: https://github.com/pietercolpaert/hardf/issues/7
>>
>> There is a regular expression in which I cannot get the right Unicode
>> code points to work. Somehow, PHP even tells me they are disallowed:
>> https://travis-ci.org/pietercolpaert/hardf/jobs/223208399#L195
>>
>> I will try to find a solution.
>>
>> Kind regards,
>>
>> Pieter
>>
>> On 18-04-17 15:31, KANZAKI Masahide wrote:
>>> Hello, thank you for the useful PHP tool. It's really appreciated.
>>>
>>> I found that TriGParser fails to handle non-ASCII prefixed names, e.g.
>>>
>>> @prefix c: <http://example.org/>.
>>> c:test a c:テスト .
>>>
>>> while it's OK to parse the full IRI:
>>>
>>> @prefix c: <http://example.org/>.
>>> c:test a <http://example.org/テスト> .
>>>
>>> (Note N3.js can parse both properly.)
>>>
>>> I guess you need to use mb_str..functions in N3Lexer::tokenizeToEnd,
>>> but simple replacement didn't work...
>>>
>>> hope this helps.
>>> cheers,
>>>
>>> 2017-04-18 15:24 GMT+09:00 Pieter Colpaert <pieter.colpaert@ugent.be>:
>>>> Dear all,
>>>>
>>>> In PHP, there used to be no library supporting parsing/writing N-Quads
>>>> and TriG. Libraries that could handle Turtle or N-Triples, however, did
>>>> not have streaming support and were always limited to files of a size a
>>>> machine could keep in memory. Today, that changed:
>>>>
>>>> https://github.com/pietercolpaert/hardf
>>>>
>>>> I have spent some time porting Ruben Verborgh’s great N3.js library to
>>>> PHP. By doing so, we also achieved a parsing speed of 200 times that of
>>>> the current most popular Turtle parsing library in PHP [1]. We hope this
>>>> library is a contribution to all websites that already serve RDF today
>>>> using PHP (Drupal, The DataTank, WordPress, ...) and saves some servers
>>>> from spending too many CPU cycles on RDF handling ;)
>>>>
>>>> Kind regards,
>>>>
>>>> Pieter Colpaert
>>>>
>>>> [1] https://github.com/pietercolpaert/hardf#performance
Received on Monday, 24 April 2017 14:55:08 UTC