Re: Algae Evaluation

On Mon, Jun 28, 2004 at 07:43:42PM +0100, Dave Beckett wrote:
> 
> An Evaluation of Algae (or Algae2)
> 
> [algae+] supports a requirement or design objective
> [algae-] does not support it
> [algae?] unknown or undeterminable
> 
> 
> Algae is described at http://www.w3.org/2004/05/06-Algae/
> and is used in the W3C Annotea Server and implemented in Perl.
> 
> In general Algae looks like:
> 
>   ns rss=<http://purl.org/rss/1.0/>
>   ns prose=<http://example.org/proseInfo#>
> 
>   read <http://example.org/some-data> ()
>   read <file:mydata.rdf> (inputlang=\"rdf\")

Note: the escaped '"'s are only present in shell script files
(because otherwise the shell would end the argument at the quote).
In the actual language, it's this:
   read <file:mydata.rdf> (inputlang="rdf")

>   ask (
>       ?article <http://purl.org/rss/1.0/title> ?title .
>       ?article <http://purl.org/rss/1.0/link> ?link .
>       ?article <http://purl.org/rss/1.0/description> ?description .
>       ?article prose:wordCount ?words { ?words < 1000 }.
>   )
>   collect (?title ?link ?description ?words)
> 
> The execution is done in the order specified in the file so that
> multiple read(...) ask(...) etc. can be performed, extending the
> query graph as well as supplementing the results.

this is something that really excites me about this language.
i find it very useful for query federation.

> There are other keywords not described here, such as attach for
> connecting to a store (in a SQL database).
> 
> 
> Requirements
> ------------
> 
> [algae+] 3.1 RDF Graph Pattern Matching
> 
>   ?article <http://purl.org/rss/1.0/title> ?title  
> 
> [algae+] 3.2 Variable Binding Results
> 
>   collect (?title ?link ?description ?words)
> 
> [algae-] 3.3 Extensible Value Testing
> 
>   an extensibility declaration "requires" exists but not for value testing.

Actually, require was intended to provide a safe mechanism for all
extensions, including the grammar and the set of functions. To test this,
I implemented a simple hostname() function call and created a test file [1]:

ns <http://purl.org/rss/1.0/>
ns da=<http://www.w3.org/2001/sw/DataAccess/>
ns ex=<http://example.org/foo/bar/>

# Get some data to play with.
require <http://www.w3.org/2004/06/20-rules/#assert>
assert (
 da:UseCases title "RDF Data Access Use Cases and Requirements" ;
             link <http://www.w3.org/2001/sw/DataAccess/> ;
             description "specifies use cases, requirements, ..." .

 ex:someDoc title "Title of Some Document" ;
            link <http://example.org/foo/bar/> ;
            description "A more verbose description of some document."
)

# Play with it.
ns prov=<http://example.org/provenanceStuff#>
require prov:profile
ask (?article title ?title ;
              link ?link {hostname(?link) == "www.w3.org"} ;
              description ?description)

collect (?title ?link ?description)


and got back the expected document at W3:
+-------...-------+--------------------...-------+---------...-------------+
|       ...  title|                    ...   link|         ...  description|
|-------...-------|--------------------...-------|---------...-------------|
|"RDF Da...ements"|<http://www.w3.org/2...ccess/>|"specifie...rements, ..."|
+-------...-------+--------------------...-------+---------...-------------+
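
For readers who don't speak Algae, here is a rough Python sketch of what
that value test does. It is only an illustration, not the Perl
implementation: the hostname() helper below is a hypothetical stand-in
built on urllib, and the rows are simply the two asserted documents
written out by hand.

```python
from urllib.parse import urlparse


def hostname(uri):
    """Hypothetical stand-in for the hostname() extension function."""
    return urlparse(uri).hostname


# The two asserted documents as (title, link, description) rows.
rows = [
    ("RDF Data Access Use Cases and Requirements",
     "http://www.w3.org/2001/sw/DataAccess/",
     "specifies use cases, requirements, ..."),
    ("Title of Some Document",
     "http://example.org/foo/bar/",
     "A more verbose description of some document."),
]

# Rough equivalent of the constraint {hostname(?link) == "www.w3.org"}.
matches = [row for row in rows if hostname(row[1]) == "www.w3.org"]

for title, link, description in matches:
    print(title, link, description)
```

Only the first row survives the filter, which matches the table above.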


> [algae-] 3.4 Subgraph Results

This is something I think algae is especially good at. You can get back
results like:

+-------------------------+
|                        n|
|-------------------------|
|<http://example.org/n#A3>|
|<http://example.org/n#A1>|
|<http://example.org/n#A2>|
+-------------------------+

or

-------------------------+----------------------------------------------------+
                        n|                                                    |
-------------------------|----------------------------------------------------|
<http://example.org/n#A3>|                                                    |
<http://example.org/n#A3> <http://example.org/n#p2> <http://example.org/n#C> .|
<http://example.org/n#A3> <http://example.org/n#p3> <http://example.org/n#D> .|
------------------------------------------------------------------------------|
<http://example.org/n#A1>|                                                    |
<http://example.org/n#A1> <http://example.org/n#p2> <http://example.org/n#C> .|
------------------------------------------------------------------------------|
<http://example.org/n#A2>|                                                    |
<http://example.org/n#A2> <http://example.org/n#p3> <http://example.org/n#D> .|
-------------------------+----------------------------------------------------+

cf [3].
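
A rough Python sketch of the idea behind the second form: each binding
of ?n carries the triples that supported it. This is an illustration
only. The pattern used here (?n p2 C, or ?n p3 D) is my assumption about
a query that would give the table above, and the data is just the four
triples visible in it.

```python
N = "http://example.org/n#"

# The four triples visible in the result table above.
triples = [
    (N + "A3", N + "p2", N + "C"),
    (N + "A3", N + "p3", N + "D"),
    (N + "A1", N + "p2", N + "C"),
    (N + "A2", N + "p3", N + "D"),
]

# Assumed pattern: ?n p2 C, or ?n p3 D.
patterns = {(N + "p2", N + "C"), (N + "p3", N + "D")}

# Map each binding of ?n to the subgraph (list of triples) that matched.
results = {}
for s, p, o in triples:
    if (p, o) in patterns:
        results.setdefault(s, []).append((s, p, o))

for n, subgraph in results.items():
    print("<%s>" % n)
    for s, p, o in subgraph:
        print("  <%s> <%s> <%s> ." % (s, p, o))
```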

> [algae+] 3.5 Local Queries
> 
>   the default is local/in-memory and can also read from file URIs, databases.
> 
> [algae+] 3.6 Optional Match
> 
>   The ~ term inside ask(..) allows optional matches such as:
>   
>   ~?article foaf:mbox ?mbox .
> 
>   which are declared to be analogous to SQL outer joins, and
>   implemented as such.
> 
> [algae+] 3.7 Limited Datatype Support
> 
>   Integers, floating point literals, strings, datatyped strings and a
>   variety of common numeric, boolean and comparison operators.
> 
>       ?article prose:wordCount ?words { ?words < 1000 }.
> 
> [algae?] 3.8 Bookmarkable Queries
> 
> [algae-] 3.10 Result Limits
> 
>   The document claims limits are implemented in the core profile but
>   the language does not seem to have a way to use them.
> 
>   [Typo?  "required features" or "requires features" the doc has both]
> 
> [algae-] 3.11 Iterative Query
> 
> [algae+] 3.12 Streaming Results
> 
> 
> Design Objectives
> -----------------
> 
> [algae-] 4.1 Human-friendly Syntax
> 
> IMHO of course:
>  - terse keywords ('ns', 'slurp') [although 'namespace' is also legal]
>  - lispy brackets in an XML world :)
>  - '.' triple separator versus more typical ',' and ';'
>  - many operators
>  - unfamiliar graph operators || |& |-
> 
> [algae+] 4.2 Provenance
> 
>   After reading the explanation in
>     http://lists.w3.org/Archives/Public/public-rdf-dawg/2004AprJun/0519
> 
>   algae provides this by %ATTRIB and can be constrained or assigned
>   to variables in a clause after a triple:
> 
>   ?an a:body ?body { %ATTRIB == <http://example.com/a1.rdf> } 
> 
>   This is also used for constraining literal datatypes, languages
>   and XML encodings (%DATATYPE, %LANG and %ENCODING)
> 
>   [Typo?  document says %ATTRIB, email says %PROV]
> 
> [algae-] 4.3 Non-existent Triples

Hmm, the doc is missing some text for the QTerm [4] production.
Non-existent triples are what the '!' is for in:
[17]  QTerm  ::=  decl
               |  '(' graphPattern '.'? ')'
               |  '~' QTerm
               |  '!' decl

It looks basically like:
ask (?who foaf:knows ?whom.
     ! ?who foaf2:livesWith ?whom)
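
In Python terms, the intended meaning is roughly the sketch below: keep
a (?who, ?whom) binding only when the negated triple is absent from the
graph. The names and data are made up for illustration, and this says
nothing about how the Perl implementation actually evaluates '!'.

```python
# Made-up sample graph as a set of (subject, predicate, object) triples.
graph = {
    ("alice", "foaf:knows", "bob"),
    ("alice", "foaf2:livesWith", "bob"),
    ("alice", "foaf:knows", "carol"),
}


def knows_but_not_lives_with(triples):
    """Bind ?who/?whom from foaf:knows, then drop any binding for
    which the negated foaf2:livesWith triple exists."""
    results = []
    for who, predicate, whom in sorted(triples):
        if predicate != "foaf:knows":
            continue
        if (who, "foaf2:livesWith", whom) in triples:
            continue  # the '!' term fails: the triple does exist
        results.append({"who": who, "whom": whom})
    return results


print(knows_but_not_lives_with(graph))
```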

> [algae-] 4.4 User-specifiable Serialization

Yah, it was sort of a toss-up whether to include that in the base language.
Instead I bundled it in with the rules extension [2]:

ask (?who foaf:first_name ?given.
     ?who foaf:lastName ?family.
     (?who foaf2:continent "Europe" || 
      ?who foaf2:continent "North America" || 
      ?who foaf2:continent "South America"))
require <http://www.w3.org/2004/06/20-rules/#assert>
assert (?who pim:given ?given.
        ?who pim:family ?family)
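
The ask/assert pair acts like a construct step: each solution of the
ask is fed into the assert template to produce new triples. A rough
Python sketch of that idea follows, with made-up sample data and the
disjunction read as "any one of the three continents is enough". It is
not the rules extension itself.

```python
CONTINENTS = {"Europe", "North America", "South America"}

# Made-up sample data.
data = [
    ("p1", "foaf:first_name", "Ada"),
    ("p1", "foaf:lastName", "Lovelace"),
    ("p1", "foaf2:continent", "Europe"),
    ("p2", "foaf:first_name", "Hideki"),
    ("p2", "foaf:lastName", "Yukawa"),
    ("p2", "foaf2:continent", "Asia"),
]


def construct(triples):
    """Match the ask pattern, then instantiate the assert template."""
    def values(subject, predicate):
        return [o for s, p, o in triples if s == subject and p == predicate]

    out = []
    for who in sorted({s for s, _, _ in triples}):
        # Disjunction: any one listed continent satisfies the pattern.
        if not any(c in CONTINENTS for c in values(who, "foaf2:continent")):
            continue
        for given in values(who, "foaf:first_name"):
            for family in values(who, "foaf:lastName"):
                out.append((who, "pim:given", given))
                out.append((who, "pim:family", family))
    return out


for triple in construct(data):
    print(triple)
```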

> [algae+] 4.5 Aggregate Query
> 
>   Conjunction, shortcut disjunction, union disjunction and merged
>   union disjunction are present.
> 
> [algae-] 4.6 Additional Semantic Information
> 
> [algae-] 4.7 Bandwidth-efficient Protocol
> 
> [algae-] 4.8 Literal Search
> 
> [algae-] 4.9 Boolean Query
> 
>   But I find this design objective difficult to understand, as written.

[1] http://dev.w3.org/cvsweb/perl/modules/W3C/Rdf/test/Hostname0-alg.sh?rev=HEAD&content-type=text/x-cvsweb-markup
[2] http://www.w3.org/2004/06/20-rules/
[3] http://lists.w3.org/Archives/Public/public-rdf-dawg/2004AprJun/0615.html
[4] http://www.w3.org/2004/05/06-Algae/#doc-algae-QTerm
-- 
-eric

office: +81.466.49.1170 W3C, Keio Research Institute at SFC,
                        Shonan Fujisawa Campus, Keio University,
                        5322 Endo, Fujisawa, Kanagawa 252-8520
                        JAPAN
        +1.617.258.5741 NE43-344, MIT, Cambridge, MA 02144 USA
cell:   +1.857.222.5741 (does not work in Asia)

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.

Received on Tuesday, 29 June 2004 05:20:24 UTC