Re: State of the art tools for rdf stream processing

Thank you all, @Wetz, @Rinne.
After reviewing your links I think I can state my requirements more
precisely. I list them below and add a brief analysis (please correct me
if I'm wrong).

Requirements:

1. Data-stream style processing: windows and simple aggregate
functions are enough (C-SPARQL, CQELS, SPARQLstream).
2. Access to background RDF (C-SPARQL, CQELS).
3. Ability to cross-link or layer streams (C-SPARQL, CQELS, SPARQLstream).
4. Ontology querying using SSN (SPARQLstream).
5. Spatial filtering: just bounding box or named location, nothing fancy
(so regular SPARQL might suffice).
6. Ability to integrate with our current SCADA (via RDBMS).
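
To make requirements 1 and 5 concrete, here is a minimal Python sketch of
the behaviour I have in mind: a bounding-box filter followed by simple
aggregates over fixed, non-overlapping time windows. It is not code for any
of the engines above. The names Reading, in_bbox and tumbling_window_stats
are hypothetical, and I am assuming timestamps in seconds and a bounding box
given as (min_lon, min_lat, max_lon, max_lat).

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Reading:
    """One sensor observation (hypothetical shape, for illustration only)."""
    sensor: str
    t: float      # timestamp in seconds
    lon: float
    lat: float
    value: float


def in_bbox(reading, bbox):
    """Return True if the reading lies inside (min_lon, min_lat, max_lon, max_lat)."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= reading.lon <= max_lon and min_lat <= reading.lat <= max_lat


def tumbling_window_stats(readings, width, bbox):
    """Aggregate readings inside bbox over non-overlapping windows of `width` seconds.

    Returns {window_start: (count, min, max, average)}.
    """
    windows = defaultdict(list)
    for r in readings:
        if in_bbox(r, bbox):
            # Every reading belongs to exactly one window, so no duplicates arise.
            window_start = (r.t // width) * width
            windows[window_start].append(r.value)
    return {
        start: (len(values), min(values), max(values), mean(values))
        for start, values in sorted(windows.items())
    }


if __name__ == "__main__":
    sample = [
        Reading("s1", 0, 1.0, 1.0, 10),
        Reading("s1", 5, 1.0, 1.0, 20),
        Reading("s2", 12, 1.5, 1.5, 30),
        Reading("s3", 3, 50.0, 50.0, 99),   # outside the bounding box
    ]
    for start, stats in tumbling_window_stats(sample, 10, (0, 0, 2, 2)).items():
        print(start, stats)
```

In a real engine the window and the aggregates would be declared in the
query language itself; the sketch only pins down the semantics I am after.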

Analysis:

=> None of the approaches covers all of requirements 1-6.
=> It seems all approaches rely on an existing DSMS to execute the
queries, so functionality is limited by the underlying DSMS. morph-streams
perhaps provides an easier path to integrating new sources (RDBMS), but I
still have to look at the code to see whether this is feasible or would be
very costly.
=> SPARQLstream + morph-streams does not support access to background RDF,
which is essential for our purpose of demonstrating data integration.
=> It is not clear which approach will be the foundation of RSP-QL.

Do you think that ESWC2015 will change this situation substantially?

2015-04-18 7:40 GMT+02:00 Rinne Mikko <mikko.rinne@aalto.fi>:

>
>  Dear Javier,
>
>  Continuing the excellent summary from Peter, an important part of the
> tool selection is deciding what kind of stream processing you want to do:
>
>  1) Data stream processing characterized by the extraction of windows
> from the input stream using a stream-to-relation operator, and running
> queries over those windows. A typical application is the calculation of
> aggregate statistics (min, max, count, sum, average) over periods of time.
>
>  2) Event processing characterized by layered processing of potentially
> heterogeneous events. Examples in literature include stock trading,
> logistics (supply chain management) and computer network monitoring.
>
>  C-SPARQL, CQELS, SPARQLstream/morph-streams and Sparkwave focus on data
> stream processing with special extensions for window extraction. INSTANS
> focuses on event processing by supporting events in TriG, asynchronously
> interconnected query networks and intermediate storage of query results in
> graphs. EP-SPARQL/ETALIS implements sequence and time interval operators,
> but I'm unsure about layered event processing.
>
>  Data stream processing with INSTANS can be done, but you will need to
> write a lot more SPARQL than with the tools having built-in extensions for
> that purpose. On the other hand, layered event processing tasks tend to be
> either very awkward or altogether impossible with data stream processing
> tools, because window extraction limits the achievable delay on all levels,
> and efforts to decrease detection delay by increasing window density force
> extra computations that produce multiple duplicate answers, which then need
> to be filtered out.
>
>  On the specific use case of GIS, I'm not aware of any of these tools
> currently offering special support for geographical computations. I have
> tested SERVICE queries to factforge (Fig. 7
> <http://www.cs.hut.fi/~mjrinne/papers/odbase2014/Constructing%20Event%20Processing%20Systems%20of%20Layered%20and%20Heterogeneous%20Events%20with%20SPARQL%20(annotated%20author%20copy).pdf>),
> which supports e.g. omgeo:nearby into their database. INSTANS supports
> square root as an extension function
> <https://github.com/aaltodsg/instans/wiki/Extension-functions> if that
> helps with distance calculations. :-)
>
>  All the best to your project!
>
>  Mikko
>
>  On 17. Apr 2015, at 11:59, Wetz Peter <peter.wetz@tuwien.ac.at> wrote:
>
>   Dear Javier,
>
>  I’ll try to come up with a concise and (of course) subjective answer :)
>
>  First of all, it’s great to hear that you want to explore rdf streaming
> implementations combined with a GIS use case. I think the combination with
> GIS is really interesting and relevant.
>
>  To answer your question, I can give you some hints on what is my
> subjective impression:
>  C-SPARQL seems quite mature to me in terms of rdf stream processing.
> It is also backed by many publications, which discuss its real-world
> application in different scenarios (social media monitoring, city sensing,
> etc.). Have a look at the webpage for more details [1]. I also got the
> impression that Emanuele Della Valle (initiator of C-SPARQL) is always
> willing to discuss issues and the like.
>
>  CQELS [2] is somewhat similar to C-SPARQL, yet, it does some things
> differently. It is also backed by several publications and real-world
> applications. I would recommend taking a look at it. Word on the street
> is that there will be a new version soon-ish, which I am looking forward
> to.
>
>  Then there is EP-SPARQL/ETALIS which takes a more Complex Event
> Processing-like approach. However, I am not sure if it’s still
> maintained/updated. Source code [3] and several publications [4, 5] are
> available.
>
>  To do some more name-dropping, I’d like to mention a few more approaches.
> However, I have not had time to get my hands dirty with them yet, so I
> cannot provide you with more detailed information:
>  SPARQLstream/morph-streams [6, 7], INSTANS [8], Sparkwave [9].
>
>  Another good source of information on practical aspects is the set of
> tutorials given at ESWC/ISWC conferences. Luckily you can access their
> contents and slides [10]. I think it’s really helpful to look at the slides
> and get an impression of the engines’ capabilities before getting
> hands-on. Another good place to get information is the wiki of this very
> group. We collected many things there. Even though it may still appear a
> bit unorganized, I’d recommend taking a look: [11].
>
>  One open question of yours is still the integration with OGC standards.
> I do not know what you mean precisely, but I think this is still a topic
> that has not really been addressed by the RSP community. I am not sure how
> tight an integration with OGC standards you imagine, but things like
> spatial queries are definitely doable right now.
>
>  Hope that helps!
>
>  Best regards,
>  Peter
>
>  [1] http://streamreasoning.org/
>  [2] https://code.google.com/p/cqels/
>  [3] https://code.google.com/p/etalis/
>  [4]
> http://iospress.metapress.com/content/t7284477156m77j1/?issue=4&genre=article&spage=397&issn=1570-0844&volume=3
>  [5] http://aifb.kit.edu/images/c/c0/Www29-anicic.pdf
>  [6] https://github.com/jpcik/morph-streams
>  [7] http://oa.upm.es/16330/1/corcho_enabling.pdf
>  [8]
> http://cse.aalto.fi/en/research/groups/distributed_systems/software/instans/
>  [9] http://sparkwave.sti2.at/index.html
>  [10] http://streamreasoning.org/events/sr4ld2014
>  [11] http://www.w3.org/community/rsp/wiki/Main_Page
>
>
>  --
>  DI (FH) Peter Wetz
> PhD Candidate
>  Doctoral College Environmental Informatics
>  Vienna University of Technology
>  Favoritenstraße 9-11
>  1040 Vienna
>  Austria
>
> M: +43-650-7954890
>  E: peter.wetz@tuwien.ac.at
>
>
>
>
>   *From:* belitre@gmail.com [mailto:belitre@gmail.com <belitre@gmail.com>] *On
> Behalf Of *Javier Ruiz Aranguren
> *Sent:* Thursday, 16 April 2015 15:30
> *To:* public-rsp@w3.org
> *Subject:* State of the art tools for rdf stream processing
>
>  Hi, all:
>
>  In the GeoSmartCity project <http://www.geosmartcity.eu/> we aim at
> developing a framework in which Geo Open Data can be exploited towards
> the Smart City paradigm. One of the scenarios planned for our pilots is
> underground network management involving water and sewage network management
> <https://www.w3.org/community/rsp/wiki/Use_cases#Water_Supply_and_Sewage_Network_Management>
> . This includes GIS access to sensor data from water management SCADAs
> and use of GIS and sensed data to improve modeling and planning of water
> networks.
>
>  We would like to explore an rdf streaming implementation in order to:
>  - be able to define continuous and advanced queries.
>  - integrate sources, dynamic (weather) or static (type of sensors,
> geospatial features, etc.).
>  - integrate with OGC standards without friction.
>
>  Unfortunately, the number of different query languages and discontinued
> tools makes it somewhat discouraging to go in this direction.
>
>  I would like to ask you which of the tools that could accomplish this goal
> are under active development and have some traction.
>
>  Thanks.
>
>  P.S. (Will all of these previous efforts go in the bin when RSP-QL
> becomes the single standard?)
>
>
>

Received on Tuesday, 21 April 2015 08:32:28 UTC