Re: swap-scala -- an exploration of RDF and N3 in scala

[resent as there seemed to be an issue with the attachments]

Hi Eric,

That looks very promising, and I wonder how scalable you can make
this using Scala, which actually stands for "scalable language" :-)

So suppose you get from your database

:ind001 rdf:type :N0.

and you want the answer for

SELECT ?WHO WHERE {?WHO rdf:type :N10}

using the following CONSTRUCTs

{?X rdf:type :N0} => {?X rdf:type :N1}.
{?X rdf:type :N1} => {?X rdf:type :N2}.
{?X rdf:type :N2} => {?X rdf:type :N3}.
{?X rdf:type :N3} => {?X rdf:type :N4}.
{?X rdf:type :N4} => {?X rdf:type :N5}.
{?X rdf:type :N5} => {?X rdf:type :N6}.
{?X rdf:type :N6} => {?X rdf:type :N7}.
{?X rdf:type :N7} => {?X rdf:type :N8}.
{?X rdf:type :N8} => {?X rdf:type :N9}.
{?X rdf:type :N9} => {?X rdf:type :N10}.


How long does it take to get the answer?
How long for the case up to :N100?
How long for the case up to :N1000?
How long for the case up to :N10000?
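
One rough way to try this chain outside any RDF store is the Scala
sketch below. It is only an illustration under stated assumptions, not
how swap-scala or any particular reasoner works: the names ChainBench,
rules, select and answer are made up here, classes are plain strings,
and each class is assumed to be the head of at most one rule (true for
this chain). Under those assumptions the backward walk from :N<n> to
:N0 touches each rule once, so the work grows linearly with n.

```scala
import scala.annotation.tailrec

object ChainBench {
  // Rule i is {?X rdf:type :N(i)} => {?X rdf:type :N(i+1)}, for i in 0 until n.
  // Index rules by head class -> body class so each lookup is a hash access.
  def rules(n: Int): Map[String, String] =
    (0 until n).map(i => s"N${i + 1}" -> s"N$i").toMap

  // The single fact from the database: :ind001 rdf:type :N0.
  val facts: Map[String, Set[String]] = Map("N0" -> Set("ind001"))

  // Backward chaining: collect individuals asserted for the goal class,
  // then continue with the body of the (assumed unique) rule whose head is the goal.
  @tailrec
  def select(goal: String,
             facts: Map[String, Set[String]],
             byHead: Map[String, String],
             acc: Set[String] = Set.empty): Set[String] = {
    val found = acc ++ facts.getOrElse(goal, Set.empty[String])
    byHead.get(goal) match {
      case Some(body) => select(body, facts, byHead, found)
      case None       => found
    }
  }

  // SELECT ?WHO WHERE {?WHO rdf:type :N<n>}
  def answer(n: Int): Set[String] = select(s"N$n", facts, rules(n))

  def main(args: Array[String]): Unit =
    for (n <- Seq(10, 100, 1000, 10000)) {
      val t0  = System.nanoTime()
      val who = answer(n)
      val ms  = (System.nanoTime() - t0) / 1e6
      println(s":N$n -> ${who.mkString(", ")} in $ms ms")
    }
}
```

The timings printed by main include building the rule index and are
subject to JVM warm-up, so they are only indicative.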


I really hope you can get this (sub)linear, and we'll keep our fingers crossed!


Kind regards,

Jos De Roo | Agfa HealthCare
Senior Researcher | HE/Advanced Clinical Applications Research
T  +32 3444 7618
http://www.agfa.com/w3c/jdroo/

Quadrat NV, Kortrijksesteenweg 157, 9830 Sint-Martens-Latem, Belgium
http://www.agfa.com/healthcare



Eric Prud'hommeaux <eric@w3.org> 
Sent by: "Eric Prud'hommeaux" <ericw3c@gmail.com>
02/13/2010 07:52 PM

To
Jos De Roo/AMDUS/AGFA@AGFA
cc
connolly@w3.org, Alexandre Bertails <bertails@w3.org>, 
public-cwm-talk@w3.org, public-cwm-talk-request@w3.org
Subject
Re: swap-scala -- an exploration of RDF and N3 in scala







* jos.deroo@agfa.com <jos.deroo@agfa.com> [2010-01-22 22:59+0100]
> Please continue Dan :-)
> This is so nice and so leading to N3's full potential!

heya Jos, I think I got Cc'd 'cause I've been working on Scala code to
express SPARQL queries as SQL (attached) and, further up the pipeline,
to map SPARQL queries backwards through CONSTRUCTs so they work on
other ontologies.

SparqlToSparql is more challenging; it feels a bit like a back-chaining
engine. It has no access to the a-box, so it has to invent possible
antecedent patterns. For example, in the SparqlToSparqlTest test
"trans head", with the rule
   { ?emp  empP:lastName   ?wname .
     ?pair task:drone      ?emp .
     ?pair task:manager    ?man .
     ?man  empP:lastName   ?mname } => { ?emp  foaf:last_name  ?wname .
                                         ?emp  foaf:knows      ?man .
                                         ?man  foaf:last_name  ?mname }

a query: SELECT ?lname { ?who  foaf:last_name    ?lname .
                         ?who  foaf:knows        ?whom  .
                         ?whom foaf:knows        ?whom2 .
                         ?whom2 foaf:last_name   "Smith" }

needs to traverse multiple instantiations of the rule:

SELECT ?lname { ?who     empP:lastName   ?lname .
                ?_0_pair task:drone      ?who .
                ?_0_pair task:manager    ?whom .
                ?_1_pair task:drone      ?whom .
                ?_1_pair task:manager    ?whom2 .
                ?whom2   empP:lastName   "Smith" }
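
A minimal Scala sketch of that per-instantiation renaming follows. It
is not the attached SparqlToSparql code. As a simplifying assumption,
the rule has been split by hand into one mapping per head predicate
(foaf:last_name comes from empP:lastName; foaf:knows comes from the
task:drone/task:manager pair), and all names (Rewrite, Mapping,
rewrite, exampleQuery) are hypothetical. Body variables that the head
does not bind get a fresh _<n>_ prefix for each instantiation.

```scala
object Rewrite {
  sealed trait Term
  final case class Var(name: String) extends Term
  final case class Const(value: String) extends Term // IRI or literal, kept as text

  final case class Pattern(s: Term, p: String, o: Term)

  // Hand-split view of the rule: one head triple and the body patterns behind it.
  final case class Mapping(head: Pattern, body: List[Pattern])

  val emp = Var("emp"); val man = Var("man"); val pair = Var("pair"); val name = Var("name")

  val mappings: Map[String, Mapping] = List(
    Mapping(Pattern(emp, "foaf:last_name", name),
            List(Pattern(emp, "empP:lastName", name))),
    Mapping(Pattern(emp, "foaf:knows", man),
            List(Pattern(pair, "task:drone", emp), Pattern(pair, "task:manager", man)))
  ).map(m => m.head.p -> m).toMap

  def rewrite(query: List[Pattern]): List[Pattern] = {
    var counter = 0 // numbers the instantiations that need fresh variables
    query.flatMap { q =>
      mappings.get(q.p) match {
        case None => List(q) // predicate not covered by the rule: keep as-is
        case Some(m) =>
          // Bind the head's variables to the terms of the query pattern.
          val bound: Map[Term, Term] = Map(m.head.s -> q.s, m.head.o -> q.o)
          val needsFresh = m.body.flatMap(b => List(b.s, b.o)).exists {
            case v: Var => !bound.contains(v)
            case _      => false
          }
          val n = counter
          if (needsFresh) counter += 1
          def subst(t: Term): Term = t match {
            case v: Var => bound.getOrElse(v, Var(s"_${n}_${v.name}"))
            case c      => c
          }
          m.body.map(b => Pattern(subst(b.s), b.p, subst(b.o)))
      }
    }
  }

  // The foaf query from above, as patterns.
  val exampleQuery: List[Pattern] = List(
    Pattern(Var("who"), "foaf:last_name", Var("lname")),
    Pattern(Var("who"), "foaf:knows", Var("whom")),
    Pattern(Var("whom"), "foaf:knows", Var("whom2")),
    Pattern(Var("whom2"), "foaf:last_name", Const("\"Smith\""))
  )
}
```

Calling Rewrite.rewrite(Rewrite.exampleQuery) yields the six empP/task
patterns of the rewritten query, with ?_0_pair and ?_1_pair as the two
fresh variables. The sketch sidesteps the hard part, which is choosing
which head pattern a query pattern should be matched against.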

I'm playing with magic-sets algorithms, but it's easy for the resolver
to get confused and start the substitution on the wrong foaf:last_name
predicate. (In SQL land, there are never two Horn rules with the same
head.)

I thought I'd show you this stuff in case you have thoughts on, or an
interest in, bridging relational data to the SemWeb, or just find the
problem interesting.

> It ran very smoothly:
> sbt test
> [info] Building project RDF Semantics 0.1 against Scala 2.8.0.Beta1-RC5
> [info]    using TestingProject with sbt 0.6.10 and Scala 2.7.7
> [info] 
> [info] == compile ==
> [info]   Source analysis: 0 new/modified, 0 indirectly invalidated, 0 removed.
> [info] Compiling main sources...
> [info] Nothing to compile.
> [info]   Post-analysis: 226 classes.
> [info] == compile ==
> [info] 
> [info] == copy-test-resources ==
> [info] == copy-test-resources ==
> [info] 
> [info] == copy-resources ==
> [info] == copy-resources ==
> [info] 
> [info] == test-compile ==
> [info]   Source analysis: 0 new/modified, 0 indirectly invalidated, 0 removed.
> [info] Compiling test sources...
> [info] Nothing to compile.
> [info]   Post-analysis: 170 classes.
> [info] == test-compile ==
> [info] 
> [info] == test-start ==
> [info] == test-start ==
> [info] 
> [info] == org.w3.swap.test.n3parsing ==
> [info] + N3 Parsing.empty document parses to the true formula, (and): OK, passed 100 tests.
> [info] + N3 Parsing.simple statements of 3 URI ref terms work: OK, passed 100 tests.
> [info] + N3 Parsing.comments work like whitespace: OK, passed 100 tests.
> [info] + N3 Parsing.integer, string literals work as objects, subjects: OK, passed 100 tests.
> [info] + N3 Parsing.is/of inverts sense of properties: OK, passed 100 tests.
> [info] + N3 Parsing.empty prefix decl: OK, passed 100 tests.
> [info] + N3 Parsing.document with 2 statements works: OK, passed 100 tests.
> [info] == org.w3.swap.test.n3parsing ==
> [info] 
> [info] == org.w3.swap.SExpTest ==
> [info] SExp
> [info] Test Starting: SExp should convert a simple s-exp to a string
> [info] Test Passed: SExp should convert a simple s-exp to a string
> [info] == org.w3.swap.SExpTest ==
> [info] 
> [info] == org.w3.swap.test.RDFSyntax ==
> [info] triples as atomic formulas
> [info] Test Starting: triples as atomic formulas should convert RDF triple Atoms to strings reasonably
> [info] Test Passed: triples as atomic formulas should convert RDF triple Atoms to strings reasonably
> [info] Test Starting: triples as atomic formulas should convert to S-Expression reasonably
> [info] Test Passed: triples as atomic formulas should convert to S-Expression reasonably
> [info] graph building
> [info] Test Starting: graph building should make a graph of 2 triples
> [info] Test Passed: graph building should make a graph of 2 triples
> [info] Test Starting: graph building should convert to S-Expression reasonably
> [info] Test Passed: graph building should convert to S-Expression reasonably
> [info] Test Starting: graph building should handle a bit larger graph
> [info] Test Passed: graph building should handle a bit larger graph
> [info] == org.w3.swap.test.RDFSyntax ==
> [info] 
> [info] == org.w3.swap.test.strutil ==
> [info] + String Utilities.escaping backslash: OK, passed 100 tests.
> [info] + String Utilities.quote distributes over +: OK, passed 100 tests.
> [info] + String Utilities.d(q(q(x))) == q(x): OK, passed 100 tests.
> [info] + String Utilities.dequote(quote(s)) == s for genQuotEsc: OK, passed 100 tests.
> [info] + String Utilities.dequote(quote(s)) == s for arbitrary s: OK, passed 100 tests.
> [info] == org.w3.swap.test.strutil ==
> [info] 
> [info] == org.w3.swap.test.numberLex ==
> [info] + N3 tokenization.numerals tokenize correctly: OK, passed 100 tests.
> [info] + N3 tokenization.other tokens tokenize correctly: OK, passed 100 tests.
> [info] == org.w3.swap.test.numberLex ==
> [info] 
> [info] == org.w3.swap.test.ntp ==
> [info] + NTriples parsing.gives well formed formula on good parse: OK, passed 100 tests.
> [info] == org.w3.swap.test.ntp ==
> [info] 
> [info] == org.w3.swap.test.NTriplesMisc ==
> [info] NTriples blank nodes
> [info] Test Starting: NTriples blank nodes should match by name
> [info] Test Passed: NTriples blank nodes should match by name
> [info] Formula.variables()
> [info] Test Starting: Formula.variables() should expect caller to remove dups
> [info] Test Passed: Formula.variables() should expect caller to remove dups
> [info] NTriples parser
> [info] Test Starting: NTriples parser should grok simple n-triples
> [info] Test Passed: NTriples parser should grok simple n-triples
> [info] Test Starting: NTriples parser should have a decent API
> [info] Test Passed: NTriples parser should have a decent API
> [info] == org.w3.swap.test.NTriplesMisc ==
> [info] 
> [info] == org.w3.swap.test.RDFSemantics ==
> [info] Unification
> [info] Semantics: Conjunction (aka merge)
> [info] Test Starting: Semantics: Conjunction (aka merge) should result in a conjuction of 3 atoms
> [info] Test Passed: Semantics: Conjunction (aka merge) should result in a conjuction of 3 atoms
> [info] Test Starting: Semantics: Conjunction (aka merge) should work on this formula
> [info] Test Passed: Semantics: Conjunction (aka merge) should work on this formula
> [info] Test Starting: Semantics: Conjunction (aka merge) should do renaming when necessary
> [info] Test Passed: Semantics: Conjunction (aka merge) should do renaming when necessary
> [info] Entailment
> [info] Test Starting: Entailment should handle X |= X for atomic, ground X
> [info] Test Passed: Entailment should handle X |= X for atomic, ground X
> [info] Test Starting: Entailment should handle A |= Ex x A/x
> [info] Test Passed: Entailment should handle A |= Ex x A/x
> [info] Test Starting: Entailment should *not* think that A |= B for distinct ground A, B
> [info] Test Passed: Entailment should *not* think that A |= B for distinct ground A, B
> [info] Test Starting: Entailment should handle A^B |= Ex v (A^B)/v
> [info] Test Passed: Entailment should handle A^B |= Ex v (A^B)/v
> [info] Test Starting: Entailment should *not* think that A^B |= Ex v (A^C)/v
> [info] Test Passed: Entailment should *not* think that A^B |= Ex v (A^C)/v
> [info] Test Starting: Entailment should handle 2 bindings for v1, 1 for v2
> [info] Test Passed: Entailment should handle 2 bindings for v1, 1 for v2
> [info] Test Starting: Entailment should not bind the same var to 2 terms
> [info] Test Passed: Entailment should not bind the same var to 2 terms
> [info] Test Starting: Entailment should handle variable loops, out-of-order triples
> [info] Test Passed: Entailment should handle variable loops, out-of-order triples
> [info] Test Starting: Entailment should notice one extra character
> [info] Test Passed: Entailment should notice one extra character
> [info] Test Starting: Entailment should not loop endlessly
> [info] Test Passed: Entailment should not loop endlessly
> [info] == org.w3.swap.test.RDFSemantics ==
> [info] 
> [info] == org.w3.swap.test.LogicSyntax ==
> [info] logical formulas
> [info] Test Starting: logical formulas should represent formulas
> [info] Test Passed: logical formulas should represent formulas
> [info] Test Starting: logical formulas should find variables
> [info] Test Passed: logical formulas should find variables
> [info] == org.w3.swap.test.LogicSyntax ==
> [info] 
> [info] == org.w3.swap.test.ent ==
> [info] + RDF 2004 Entailment.add() preserves well-formedness: OK, passed 100 tests.
> [info] + RDF 2004 Entailment.conjunction preserves well-formedness: OK, passed 100 tests.
> [info] + RDF 2004 Entailment.f ^ g |= f: OK, passed 100 tests.
> [info] + RDF 2004 Entailment.f ^ g |= g: OK, passed 100 tests.
> [info] + RDF 2004 Entailment.skolemize(f) |= f: OK, passed 100 tests.
> [info] + RDF 2004 Entailment.not f |= skolemize(f) when f has variables: OK, passed 100 tests.
> [info] ! RDF 2004 Entailment.entailment is transitive: Gave up after only 0 passed tests. 500 tests were discarded.
> [info] == org.w3.swap.test.ent ==
> [info] 
> [info] == org.w3.swap.test.URIPathTest ==
> [info] Combining base URI with URI reference
> [info] Test Starting: Combining base URI with URI reference should handle ..
> [info] Test Passed: Combining base URI with URI reference should handle ..
> [info] Test Starting: Combining base URI with URI reference should handle the empty ref
> [info] Test Passed: Combining base URI with URI reference should handle the empty ref
> [info] Test Starting: Combining base URI with URI reference should handle data: as a base URI
> [info] Test Passed: Combining base URI with URI reference should handle data: as a base URI
> [info] == org.w3.swap.test.URIPathTest ==
> [info] 
> [info] == test-complete ==
> [info] == test-complete ==
> [info] 
> [info] == test-finish ==
> [info] Passed: : Total 50, Failed 0, Errors 0, Passed 49, Skipped 1
> [info] 
> [info] All tests PASSED.
> [info] == test-finish ==
> [info] 
> [info] == test-cleanup ==
> [info] == test-cleanup ==
> [info] 
> [info] == test ==
> [info] == test ==
> [success] Successful.
> [info] 
> [info] Total time: 5 s, completed Jan 22, 2010 10:53:25 PM
> [info] 
> [info] Total session time: 5 s, completed Jan 22, 2010 10:53:25 PM
> [success] Build completed successfully.
> 
> 
> Kind regards,
> 
> Jos De Roo | Agfa HealthCare
> Senior Researcher | HE/Advanced Clinical Applications Research
> T  +32 3444 7618
> http://www.agfa.com/w3c/jdroo/
> 
> Quadrat NV, Kortrijksesteenweg 157, 9830 Sint-Martens-Latem, Belgium
> http://www.agfa.com/healthcare
> 
> 
> 
> Dan Connolly <connolly@w3.org> 
> Sent by: public-cwm-talk-request@w3.org
> 01/19/2010 01:20 AM
> 
> To
> public-cwm-talk@w3.org
> cc
> Alexandre Bertails <bertails@w3.org>, Eric Prud'hommeaux <eric@w3.org>
> Subject
> swap-scala -- an exploration of RDF and N3 in scala
> 
> 
> 
> 
> 
> 
> 
> I had an idea about the semantics of N3 graph literals a couple
> weeks ago, and I wanted to explore it in scala.
> 
> I ended up doing more work around the bottom of the stack...
> e.g. very careful n-triples parsing.
> 
> I have basically run out of time to work on this for now.
> I wrote up the scala experience...
> 
> Fun and Frustration with scala
> http://www.advogato.org/person/connolly/diary/71.html
> 
> But I haven't written up the logic bits. I'm just sharing
> them in raw form here.
> 
> 
> I'm exploring code hosting options... DVCS makes that so
> much easier...
> 
> http://bitbucket.org/DanC/swap-scala/
> http://code.google.com/p/swap-scala/
> 101:882f04140bf2 2010-01-18 RDFXMLParser now passes most interesting
> tests (except maybe xml:base)
> 
> 
> 
> from the README...
> 
> Goals
>  * Implement N3Logic proof checking independent of cwm
>  * Influence EricP's scala SQL/SPARQL integration work
>  * Influence the Datagraph.org scala RDF work
>  * Influence Sandro's RDF2 thinking
>  * Influence Pat Hayes's RDF semantics advocacy around "named graphs"
>    and such
> 
> Testing Plan
>  * standard RDF entailment tests, which need
>    * RDF/XML parser (working well enough to do RDF (not RDFS) entailment
>      tests)
>    * n-triples parser (done)
>    * RDF proof generator, or
>      * N3 proof reader and use cwm to generate proofs
>    * RDF proof checker (entailment method is done)
>      Currently, we have an RDF entailment method, though some questions
>      about variable handling remain.
>  * standard RDF positive/negative syntax tests (done except xml:base;
>    there are other missing features, but they can come another day)
>  * Dave's turtle syntax tests (optional), which needs
>    * turtle parser or perhaps just
>      * N3 parser (working; feature complete for turtle high level
>        structures, but not low-level details such as string escaping)
>    * RDF entailment method
>  * standard RDFS tests, which needs
>    * RDF/XML parser
>    * n-triples parser
>    * RDFS rules in N3Rules
>    * N3Rules proof generator, or
>      * N3 proof reader
>    * N3Rules proof checker
>  * standard RIF BLD entailment tests (http://www.w3.org/TR/rif-test/),
>    which needs
>    * RIF BLD XML reader
>  * N3 syntax tests (http://www.w3.org/2000/10/swap/test/n3parser.tests),
>    (opt) which needs
>    * N3 parser (working; not feature-complete)
>  * some sort of N3Logic proof testing, which needs
>    * N3Logic proof generator and/or N3 proof reader (and use cwm)
>    * N3Logic proof checker
> 
> -- 
> Dan Connolly, W3C http://www.w3.org/People/Connolly/
> gpg D3C2 887B 0F92 6005 C541  0875 0F91 96DE 6E52 C29E
> 
> 
> 
> 

Received on Monday, 15 February 2010 09:26:51 UTC