
Proposed conventions: System Triplestore, turtle Command, Text Embedded Turtle

From: Danny Ayers <danny.ayers@gmail.com>
Date: Fri, 3 Feb 2012 13:22:09 +0100
Message-ID: <CAM=Pv=Q_yGDbbXajDONY17Y-Qy6Ta8NX_J76r9B6R+sBu_+PpQ@mail.gmail.com>
To: Semantic Web <semantic-web@w3.org>
Playing around with utility ideas, the following seem like conventions
I could do with:

* System Triplestore - an RDF store exposed locally via shell utils
and as http://localhost/sparql
* turtle Command - primarily for the above (probably implemented as a
wrapper around existing utils, e.g. rapper, Fuseki scripts)
* Text Embedded Turtle - a minimal convention for interpreting Turtle
data embedded in plain text files, useful with the above

I roughed these out as below, possibly more legible at:
http://hyperdata.org/docs/manuel/index.html#Proposed

Anyone already using anything like these? Any suggestions?

-----------------------

SYSTEM TRIPLESTORE

An RDF store hosted on the local machine with a SPARQL endpoint at
http://localhost/sparql

(If there is already an HTTP server running on port 80, that URL should
transparently proxy to whichever port the SPARQL server is running
on.)

It will support a default graph and named graphs.

In practice within the store, global URIs should be the norm, i.e.
avoiding http://localhost and file:///
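
As a rough illustration of how a client might address this endpoint, here
is a minimal sketch. It assumes the store follows the SPARQL 1.1 Protocol
(query passed as a "query" parameter on GET, named graphs via
"named-graph-uri"). The function name and the JSON results media type are
my own choices for the example, not part of the proposal. The request is
only built, never sent.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

# Endpoint URL taken from the proposal above.
ENDPOINT = "http://localhost/sparql"


def build_query_request(query: str, named_graph: Optional[str] = None) -> Request:
    """Build (but do not send) a SPARQL 1.1 Protocol GET request.

    build_query_request is a hypothetical helper name for this sketch.
    """
    params = [("query", query)]
    if named_graph is not None:
        # Restrict the dataset to one named graph, per the SPARQL Protocol.
        params.append(("named-graph-uri", named_graph))
    return Request(
        ENDPOINT + "?" + urlencode(params),
        headers={"Accept": "application/sparql-results+json"},
    )
```

Passing the result to urllib.request.urlopen() would actually run the
query, provided something is listening at that URL.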

TURTLE COMMAND

On *nix systems, the turtle command should be available at /usr/bin/turtle

Its primary function will be to read RDF data from standard input
(stdin) and insert this into the Default Graph in the System
Triplestore.

It should support a minimum of the following:

turtle [OPTIONS]

-h, --help show a summary of available options

-G, --named URI insert any subsequent data (from stdin) into the named
graph in the System Triplestore

-i, --input FORMAT set the input format to one of: turtle (Turtle,
default), rdfxml (RDF/XML), tet (Text Embedded Turtle, see below)

-x, --extract set the input format to Lax Text Embedded Turtle
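
The option list above could be sketched with Python's argparse roughly as
follows. This is only an illustration of the command-line surface, not a
proposed implementation: the function name, the help strings and the
attribute name input_format are assumptions, and nothing here touches a
triplestore.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the proposed options (-h/--help is added by argparse)."""
    parser = argparse.ArgumentParser(
        prog="turtle",
        description="Read RDF from stdin and insert it into the System Triplestore.",
    )
    parser.add_argument(
        "-G", "--named", metavar="URI", default=None,
        help="insert data into this named graph instead of the default graph",
    )
    parser.add_argument(
        "-i", "--input", metavar="FORMAT", dest="input_format",
        choices=["turtle", "rdfxml", "tet"], default="turtle",
        help="input format: turtle (default), rdfxml or tet",
    )
    parser.add_argument(
        "-x", "--extract", action="store_true",
        help="treat input as Lax Text Embedded Turtle",
    )
    return parser
```

For example, build_parser().parse_args(["-G", "http://example.org/g"])
yields a namespace whose named attribute holds the graph URI and whose
input_format is left at "turtle".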

TEXT EMBEDDED TURTLE

A simple way of including chunks of Turtle in text documents.

Example

If the following were a Text Embedded Turtle (TET) document at
http://example.org/example.txt :

#!/usr/bin/turtle

# This is Turtle
@prefix dc: <http://purl.org/dc/elements/1.1/> . #!
But now here some ordinary text, it may
appear on several lines as usual.
#!
# now back to Turtle
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

<> a foaf:Document ;

dc:title "Example" ;
#!
this is now text.
#!
# Turtle again.
dc:description "a little example" .

It should be interpreted as the RDF (Turtle):

@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

<http://example.org/example.txt> a foaf:Document ;
dc:title "Example" ;
dc:description "a little example" .

with the additional text ignored. However, systems may extract the
non-Turtle text to create an additional triple, e.g. from the above:

@prefix sioc: <http://rdfs.org/sioc/ns#> .

<http://example.org/example.txt> sioc:content """But now here some ordinary text, it may
appear on several lines as usual.
this is now text.
""" .

Note that relative URIs are resolved against the URI of the source TET
document (<>) and that Turtle statements may be interrupted by blocks of
non-Turtle text, although in practice this is probably best avoided.
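
The optional sioc:content triple mentioned above could be produced along
these lines. This is a sketch under assumptions: the function name is made
up, the document URI is assumed to be a valid absolute IRI already, and the
text is written as a Turtle long string with backslashes and double quotes
escaped.

```python
SIOC_NS = "http://rdfs.org/sioc/ns#"


def content_triple(doc_uri: str, text: str) -> str:
    """Return Turtle asserting that doc_uri has the given text as sioc:content.

    content_triple is a hypothetical helper; doc_uri is not validated here.
    """
    # Escape backslashes first, then double quotes, so that the text cannot
    # terminate the triple-quoted Turtle string early.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return (
        f"@prefix sioc: <{SIOC_NS}> .\n"
        "\n"
        f'<{doc_uri}> sioc:content """{escaped}""" .\n'
    )
```

Newlines are kept as they are, since Turtle long strings may span lines.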

All valid Turtle documents are syntactically valid TET documents
(though the media type differs).

Definition

The Text Embedded Turtle (TET) format is defined as Turtle with the
following differences:

the media type is "text/plain"
@@todo check/resolve definition, charset might mess up newlines

TET has a delimiter string, defined as:

tetSwitch : '#!' ('\n' | '\r')

One boolean state is defined for a TET parser (in addition to any defined elsewhere):

IN_TURTLE = true | false

Interpretation of a TET document begins in the state IN_TURTLE = true

Every time a tetSwitch token is encountered, the state of IN_TURTLE
should be inverted.

A TET document should begin with the line:

#!/usr/bin/turtle

(Note that the #! in this line isn't interpreted as a tetSwitch, since
it is not immediately followed by a newline)

If this line isn't included, the document may be considered Lax Text
Embedded Turtle in which case the document begins in the state
IN_TURTLE = false

Interpretation should follow this procedure:

if IN_TURTLE == true : what follows is Turtle

else what follows should be ignored
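
The state machine above is small enough to sketch in a few lines of Python.
This is one possible reading of the definition, not a reference
implementation: the function name is hypothetical, "\r\n" is treated as a
single line ending, and a #! at the end of a line inside a Turtle string
literal or comment would also be taken as a tetSwitch, which the
definition as written doesn't rule out.

```python
import re
from typing import Tuple

# tetSwitch: '#!' immediately followed by a line ending.
TET_SWITCH = re.compile(r"#!(?:\r\n|\r|\n)")
SHEBANG = "#!/usr/bin/turtle"


def split_tet(text: str) -> Tuple[str, str]:
    """Split a TET document into (turtle, other_text).

    Starts with IN_TURTLE = true if the document opens with the shebang
    line, otherwise treats it as Lax TET (IN_TURTLE = false).
    """
    in_turtle = text.startswith(SHEBANG)
    turtle_parts, text_parts = [], []
    for chunk in TET_SWITCH.split(text):
        (turtle_parts if in_turtle else text_parts).append(chunk)
        in_turtle = not in_turtle  # every tetSwitch inverts the state
    return "".join(turtle_parts), "".join(text_parts)
```

The shebang line stays in the Turtle output, which is harmless because
it starts with # and so is an ordinary Turtle comment. The first value
returned can be handed to any Turtle parser, with the document's own URI
as the base for resolving <>.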

--------------------------

Cheers,
Danny.


-- 
http://dannyayers.com

http://webbeep.it  - text to tones and back again
Received on Friday, 3 February 2012 12:22:41 UTC
