- From: Javier Ruiz Aranguren <jruizaranguren@gmail.com>
- Date: Tue, 21 Apr 2015 10:32:00 +0200
- To: Rinne Mikko <mikko.rinne@aalto.fi>
- Cc: "public-rsp@w3.org" <public-rsp@w3.org>
- Message-ID: <CAG_3m4LpdujUuoQrhovHwfXsN_dKCz8W3YE0z0NhWfrJAZMFww@mail.gmail.com>
Thank you all, @Wetz, @Rinne.

After reviewing your links I think I can specify my requirements better. I write them down below and add a brief analysis (please correct me if I'm wrong).

Requirements:
1. Data stream processing: I'm OK with windows and simple aggregate functions. (C-SPARQL, CQELS, SPARQLstream)
2. Background RDF access. (C-SPARQL, CQELS)
3. Be able to cross-link or layer streams. (C-SPARQL, CQELS, SPARQLstream)
4. Ontology querying using SSN. (SPARQLstream)
5. Spatial filtering: just bounding box or named location, nothing fancy (so regular SPARQL might suffice).
6. Able to integrate with the current SCADA (via RDBMS).

Analysis:
=> None of the approaches covers all of requirements 1-6.
=> It seems all approaches rely on an existing DSMS in order to execute the queries. Functionality is limited by the underlying DSMS. morph-streams perhaps provides a happier path to integrate with new sources (RDBMS), but I still have to look at the code to see whether this is feasible or would be very costly.
=> SPARQLstream + morph-streams does not support access to background RDF, which is essential to our purpose of demoing data integration.
=> It is not clear which approach will be the foundation of RSP-QL. Do you think that ESWC2015 will change this situation substantially?

2015-04-18 7:40 GMT+02:00 Rinne Mikko <mikko.rinne@aalto.fi>:
>
> Dear Javier,
>
> Continuing the excellent summary from Peter, an important part of the
> tool selection is deciding what kind of stream processing you want to do:
>
> 1) Data stream processing characterized by the extraction of windows
> from the input stream using a stream-to-relation operator, and running
> queries over those windows. A typical application is the calculation of
> aggregate statistics (min, max, count, sum, average) over periods of time.
>
> 2) Event processing characterized by layered processing of potentially
> heterogeneous events.
> Examples in literature include stock trading,
> logistics (supply chain management) and computer network monitoring.
>
> C-SPARQL, CQELS, SPARQLstream/morph-streams and Sparkwave focus on data
> stream processing with special extensions for window extraction. INSTANS
> focuses on event processing by supporting events in TriG, asynchronously
> interconnected query networks and intermediate storage of query results in
> graphs. EP-SPARQL/ETALIS implements sequence and time interval operators,
> but I'm unsure about layered event processing.
>
> Data stream processing with INSTANS can be done, but you will need to
> write a lot more SPARQL than with the tools having built-in extensions for
> that purpose. On the other hand, layered event processing tasks tend to be
> either very awkward or altogether impossible with data stream processing
> tools, because window extraction limits delay performance on all levels and
> efforts to decrease detection delay by increasing window density force
> extra computations producing multiple duplicate answers which need to be
> filtered out.
>
> On the specific use case of GIS, I'm not aware of any of these tools
> currently offering special support for geographical computations. I have
> tested SERVICE queries to factforge (Fig. 7
> <http://www.cs.hut.fi/~mjrinne/papers/odbase2014/Constructing%20Event%20Processing%20Systems%20of%20Layered%20and%20Heterogeneous%20Events%20with%20SPARQL%20(annotated%20author%20copy).pdf>),
> which supports e.g. omgeo:nearby into their database. INSTANS supports
> square root as an extension function
> <https://github.com/aaltodsg/instans/wiki/Extension-functions> if that
> helps with distance calculations. :-)
>
> All the best to your project!
>
> Mikko
>
> On 17.
> Apr 2015, at 11:59, Wetz Peter <peter.wetz@tuwien.ac.at> wrote:
>
> Dear Javier,
>
> I’ll try to come up with a concise and (of course) subjective answer :)
>
> First of all, it’s great to hear that you want to explore rdf streaming
> implementations combined with a GIS use case. I think the combination with
> GIS is really interesting and relevant.
>
> To answer your question, I can give you some hints on what is my
> subjective impression:
> C-SPARQL seems to me as quite mature in terms of rdf stream processing.
> It is also backed by many publications, which discuss its real-world
> application in different scenarios (social media monitoring, city sensing,
> etc.). Have a look at the webpage for more details [1]. I also got the
> impression that Emanuele Della Valle (initiator of C-SPARQL) is always
> willing to discuss issues and the like.
>
> CQELS [2] is somewhat similar to C-SPARQL, yet, it does some things
> differently. It is also backed by several publications and real-world
> applications. I would recommend to take a look at it. Word on the street
> is, that there will be a new version soon-ish, which I am looking forward
> to.
>
> Then there is EP-SPARQL/ETALIS which takes a more Complex Event
> Processing-like approach. However, I am not sure if it’s still
> maintained/updated. Source code [3] and several publications [4, 5] are
> available.
>
> To do more namedropping, I’d like to mention some more approaches.
> However, I did not have any time to get my hands dirty on them, yet, so I
> cannot provide you with more detailed information:
> SPARQLstream/morph-streams [6, 7], INSTANS [8], Sparkwave [9].
>
> Another good place to get information on practical aspects are the
> tutorials given at ESWC/ISWC conferences. Luckily you can access their
> contents and slides [10]. I think it’s really helpful to look at the slides
> and get an impression of the engines’ capabilities before getting your
> hands on.
> Another good place to get information is the wiki of this very
> group. We collected many things there. Even though it may still appear a
> bit unorganized I’d recommend to take a look: [11].
>
> One open question of yours is still the integration with OGC standards.
> I do not know what you mean precisely, but I think this is still a topic,
> which has not been quite addressed by the RSP community. I am not sure how
> tight of an integration with OGC standards you imagine, but things like
> spatial queries are definitely doable right now.
>
> Hope that helps!
>
> Best regards,
> Peter
>
> [1] http://streamreasoning.org/
> [2] https://code.google.com/p/cqels/
> [3] https://code.google.com/p/etalis/
> [4] http://iospress.metapress.com/content/t7284477156m77j1/?issue=4&genre=article&spage=397&issn=1570-0844&volume=3
> [5] http://aifb.kit.edu/images/c/c0/Www29-anicic.pdf
> [6] https://github.com/jpcik/morph-streams
> [7] http://oa.upm.es/16330/1/corcho_enabling.pdf
> [8] http://cse.aalto.fi/en/research/groups/distributed_systems/software/instans/
> [9] http://sparkwave.sti2.at/index.html
> [10] http://streamreasoning.org/events/sr4ld2014
> [11] http://www.w3.org/community/rsp/wiki/Main_Page
>
> --
> DI (FH) Peter Wetz
> PhD Candidate
> Doctoral College Environmental Informatics
> Vienna University of Technology
> Favoritenstraße 9-11
> 1040 Vienna
> Austria
>
> M: +43-650-7954890
> E: peter.wetz@tuwien.ac.at
>
> *From:* belitre@gmail.com [mailto:belitre@gmail.com <belitre@gmail.com>] *On behalf of* Javier Ruiz Aranguren
> *Sent:* Thursday, 16 April 2015 15:30
> *To:* public-rsp@w3.org
> *Subject:* State of the art tools for rdf stream processing
>
> Hi, all:
>
> In the GeoSmartCity project <http://www.geosmartcity.eu/> we aim at
> developing a framework in which Geo Open Data can be exploited towards
> Smart City paradigm.
> One of the scenarios planned for our pilots is
> underground network management involving water and sewage network management
> <https://www.w3.org/community/rsp/wiki/Use_cases#Water_Supply_and_Sewage_Network_Management>.
> This includes GIS access to sensor data from water management SCADAs
> and use of GIS and sensed data to improve modeling and planning of water
> networks.
>
> We would like to explore an rdf streaming implementation in order to:
> - be able to define continuous and advanced queries.
> - integrate sources, dynamic (weather) or static (type of sensors,
> geospatial features, etc.).
> - integrate frictionlessly with OGC standards.
>
> Unfortunately, the number of different query languages and discontinued
> tools discourages us a bit from following this direction.
>
> I would like to ask you which tools that could accomplish this goal have
> ongoing development and have some traction.
>
> Thanks.
>
> P.S. (Will all of these previous efforts go to the bin when RSP-QL
> becomes the unique standard?)
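
To make the two simplest needs discussed in this thread concrete (aggregate statistics over time windows, and plain bounding-box filtering), here is a minimal Python sketch. It is only an illustration under stated assumptions: the `Reading` and `BBox` types and the `windowed_average` function are hypothetical names invented for this example, the windows are fixed-size and non-overlapping, and the input is a finite list processed in batch. It does not use or mimic the API of C-SPARQL, CQELS, SPARQLstream or any other engine mentioned here.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class Reading(NamedTuple):
    """Hypothetical sensor reading (not taken from any RSP engine)."""
    timestamp: float  # seconds since an arbitrary epoch
    lon: float
    lat: float
    value: float


class BBox(NamedTuple):
    """Hypothetical axis-aligned bounding box in lon/lat."""
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float

    def contains(self, r: Reading) -> bool:
        # Plain coordinate comparison, no geodesic computation.
        return (self.min_lon <= r.lon <= self.max_lon
                and self.min_lat <= r.lat <= self.max_lat)


def windowed_average(readings: Iterable[Reading], bbox: BBox,
                     window_s: float) -> Dict[float, float]:
    """Average value per fixed, non-overlapping window of window_s seconds,
    keeping only readings located inside bbox.

    Returns a dict mapping each window start time to the mean value.
    """
    acc = defaultdict(lambda: [0.0, 0])  # window start -> [sum, count]
    for r in readings:
        if not bbox.contains(r):
            continue  # spatial filter: drop readings outside the box
        start = (r.timestamp // window_s) * window_s
        acc[start][0] += r.value
        acc[start][1] += 1
    return {start: total / count
            for start, (total, count) in sorted(acc.items())}


if __name__ == "__main__":
    sample = [
        Reading(0.0, 1.0, 1.0, 10.0),
        Reading(5.0, 1.0, 1.0, 20.0),
        Reading(12.0, 1.0, 1.0, 30.0),
        Reading(13.0, 50.0, 50.0, 100.0),  # outside the box, ignored
    ]
    print(windowed_average(sample, BBox(0.0, 0.0, 2.0, 2.0), 10.0))
```

A real RDF stream processor would evaluate such a window continuously as data arrives and could join it with background RDF; this sketch only shows the filter-then-aggregate logic itself.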
Received on Tuesday, 21 April 2015 08:32:28 UTC