Re: Proposed conventions: System Triplestore, turtle Command, Text Embedded Turtle

Nice idea; however, I'm not overjoyed about the use of #! as a start/stop comment structure in TET.

My well-embedded visual parser interprets it as the implicit system shell invocation.

A useful addition to /usr/bin/turtle might be an option to clear the graph (optionally before inserting the new data).
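
For what it's worth, SPARQL 1.1 Update has CLEAR statements that such an option could map onto. A rough sketch in Python, assuming the System Triplestore accepts SPARQL Update; the helper name clear_statement is made up for illustration and is not part of the proposal:

```python
from typing import Optional


def clear_statement(graph_uri: Optional[str] = None) -> str:
    """Return a SPARQL 1.1 Update statement that empties one graph.

    Hypothetical helper: with no URI it targets the default graph,
    otherwise the named graph (e.g. the one given to the proposed -G option).
    """
    if graph_uri is None:
        return "CLEAR DEFAULT"
    # Reject characters that cannot appear inside <...> in an update request.
    if any(ch in graph_uri for ch in '<>" \n\r\t'):
        raise ValueError(f"not a usable IRI: {graph_uri!r}")
    return f"CLEAR GRAPH <{graph_uri}>"
```

The command would send this statement before the insert when the (hypothetical) clear option is given, and skip it otherwise.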

- Steve

On 2012-02-03, at 12:22, Danny Ayers wrote:

> Playing around with utility ideas, the following seem like conventions
> I could do with:
> 
> * System Triplestore - an RDF store exposed locally via shell utils
> and as http://localhost/sparql
> * turtle Command - primarily for the above (probably implemented as a
> wrapper around existing utils, e.g. rapper, Fuseki scripts)
> * Text Embedded Turtle - a minimal convention for interpreting Turtle
> data embedded in plain text files, useful with the above
> 
> I roughed these out as below, possibly more legible at:
> http://hyperdata.org/docs/manuel/index.html#Proposed
> 
> Anyone already using anything like these? Any suggestions?
> 
> -----------------------
> 
> SYSTEM TRIPLESTORE
> 
> An RDF store hosted on the local machine with a SPARQL endpoint at
> http://localhost/sparql
> 
> (If there is already an HTTP server running on port 80, that URL should
> transparently proxy to whichever port the SPARQL server is running
> on.)
> 
> It will support a default graph and named graphs.
> 
> In practice within the store, global URIs should be the norm, i.e.
> avoiding http://localhost and file:///
> 
> TURTLE COMMAND
> 
> On *nix systems, the turtle command should be available at /usr/bin/turtle
> 
> Its primary function will be to read RDF data from standard input
> (stdin) and insert this into the Default Graph in the System
> Triplestore.
> 
> It should support a minimum of the following:
> 
> turtle [OPTIONS]
> 
> -h, --help show a summary of available options
> 
> -G, --named URI insert any subsequent data (from stdin) into the named
> graph in the System Store
> 
> -i, --input FORMAT set the input format to one of turtle (Turtle,
> default), rdfxml (RDF/XML), tet (Text Embedded Turtle, see below)
> 
> -x, --extract set the input format to Lax Text Embedded Turtle
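
The option list above could be wired up roughly as follows. This is only a sketch of the command-line surface using Python's argparse; the function name build_parser and the help wording are my own, and nothing here talks to a store:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the options listed in the proposal."""
    parser = argparse.ArgumentParser(
        prog="turtle",
        description="Read RDF from stdin and insert it into the System Triplestore.",
    )
    # -h/--help is added by argparse automatically.
    parser.add_argument("-G", "--named", metavar="URI", default=None,
                        help="insert data into this named graph "
                             "(default: the Default Graph)")
    parser.add_argument("-i", "--input", metavar="FORMAT", default="turtle",
                        choices=["turtle", "rdfxml", "tet"],
                        help="input format: turtle (default), rdfxml or tet")
    parser.add_argument("-x", "--extract", action="store_true",
                        help="treat the input as Lax Text Embedded Turtle")
    return parser
```

For example, build_parser().parse_args(["-G", "http://example.org/g", "-i", "tet"]) yields named set to the graph URI and input set to "tet".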
> 
> TEXT EMBEDDED TURTLE
> 
> A simple way of including chunks of Turtle in text documents.
> 
> Example
> 
> If the following were a Text Embedded Turtle (TET) document at
> http://example.org/example.txt :
> 
> #!/usr/bin/turtle
> 
> # This is Turtle
> @prefix dc: <http://purl.org/dc/elements/1.1/> . #!
> But now here some ordinary text, it may
> appear on several lines as usual.
> #!
> # now back to Turtle
> @prefix foaf: <http://xmlns.com/foaf/0.1/> .
> 
> <> a foaf:Document ;
> 
> dc:title "Example" ;
> #!
> this is now text.
> #!
> # Turtle again.
> dc:description "a little example" .
> 
> It should be interpreted as the RDF (Turtle):
> 
> @prefix dc: <http://purl.org/dc/elements/1.1/> .
> @prefix foaf: <http://xmlns.com/foaf/0.1/> .
> 
> <http://example.org/example.txt> a foaf:Document ;
> dc:title "Example" ;
> dc:description "a little example" .
> 
> - with the additional text ignored. However, systems may extract the
> non-Turtle text to create an additional triple, e.g. from the above:
> 
> @prefix sioc: <http://rdfs.org/sioc/ns#> .
> 
> <http://example.org/example.txt> sioc:content """But now here some
> ordinary text, it may
> appear on several lines as usual.
> this is now text.
> """ .
> 
> Note that relative URIs are interpreted from the source TET document
> (<>) and that Turtle statements may be interrupted by blocks of
> non-Turtle text, although in practice this is probably best avoided.
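
To illustrate the relative URI point, here is a small sketch using Python's standard urljoin, assuming the document was retrieved from http://example.org/example.txt as in the example. The names BASE and resolve are mine:

```python
from urllib.parse import urljoin

# Base URI: where the TET document was retrieved from (taken from the example).
BASE = "http://example.org/example.txt"


def resolve(reference: str, base: str = BASE) -> str:
    """Resolve a relative URI reference against the document's base URI."""
    return urljoin(base, reference)
```

Here resolve("") gives the document's own URI, which is what <> stands for, and resolve("#me") gives a fragment within it.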
> 
> All valid Turtle documents are syntactically valid TET documents
> (though the media type differs).
> 
> Definition
> 
> The Text Embedded Turtle (TET) format is defined as Turtle with the
> following differences:
> 
> the media type is "text/plain"
> @@todo check/resolve definition, charset might mess up newlines
> 
> TET has a delimiter string, defined as:
> 
> tetSwitch : '#!' ('\n' | '\r')
> 
> Two states are defined for a TET parser (in addition to any defined elsewhere) :
> 
> IN_TURTLE = true | false
> 
> Interpretation of a TET document begins in the state IN_TURTLE = true
> 
> Every time a tetSwitch token is encountered, the state of IN_TURTLE
> should be inverted.
> 
> A TET document should begin with the line:
> 
> #!/usr/bin/turtle
> 
> (Note that the #! in this line isn't interpreted as a tetSwitch)
> 
> If this line isn't included, the document may be considered Lax Text
> Embedded Turtle in which case the document begins in the state
> IN_TURTLE = false
> 
> Interpretation should follow the procedure -
> 
> if IN_TURTLE == true : what follows is Turtle
> 
> else what follows should be ignored
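
The procedure above is small enough to sketch directly. The following Python is a rough reading of the definition, not a reference implementation: the names split_tet, TET_SWITCH and HEADER are mine, treating "\r\n" as a single line ending is my assumption (the definition only lists '\n' and '\r'), and the output still needs to go through a real Turtle parser:

```python
import re
from typing import Tuple

# tetSwitch from the proposal: '#!' followed by a line ending.
# Assumption: "\r\n" is consumed as one line ending.
TET_SWITCH = re.compile(r"#!(?:\r\n|\n|\r)")
HEADER = "#!/usr/bin/turtle"


def split_tet(document: str) -> Tuple[str, str]:
    """Split a TET document into (turtle, text) by toggling IN_TURTLE."""
    # Strict TET starts in Turtle; without the header line it is Lax TET.
    in_turtle = document.startswith(HEADER)
    turtle_parts, text_parts = [], []
    for chunk in TET_SWITCH.split(document):
        (turtle_parts if in_turtle else text_parts).append(chunk)
        in_turtle = not in_turtle  # every tetSwitch inverts the state
    # Join Turtle chunks with a newline so a chunk cut off mid-line by a
    # switch cannot run into the next chunk.
    return "\n".join(turtle_parts), "".join(text_parts)
```

The header line is not matched because its #! is followed by "/" rather than a line ending, and it survives in the Turtle output as an ordinary comment. The second return value is the text a system could put into a sioc:content triple.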
> 
> --------------------------
> 
> Cheers,
> Danny.
> 
> 
> -- 
> http://dannyayers.com
> 
> http://webbeep.it  - text to tones and back again
> 

-- 
Steve Harris, CTO, Garlik Limited
1-3 Halford Road, Richmond, TW10 6AW, UK
+44 20 8439 8203  http://www.garlik.com/
Registered in England and Wales 0535 7233 VAT # 849 0517 11
Registered office: Landmark House, Experian Way, Nottingham, Notts, NG80 1ZZ

Received on Friday, 3 February 2012 13:34:22 UTC