Re: [Turtle] Two formats (was: Re: Turtle, Qurtle, Super-Turtle, N-Triple, N-Quads, Trig - BC and Scope)

On 3 Mar 2011, at 07:51, Steve Harris wrote:
> For one thing, some triplestores have different default behaviours when parsing triples formats than quads formats.
[snip]

This seems like an issue that calls for user education, tool documentation, and/or configuration options in those stores.

I don't think you can argue that users have one firm expectation for the handling of N-Triples and a different firm expectation for N-Quads.

> There's also the question of what to do if you find an N-Triples file in the wild, say as part of a web crawl. Currently it's safe to import any N-Triples file, and it will only affect triples within the graph of the file itself, but someone could deliberately create malicious N-Quads files designed to add data to well known graph URIs, or to deliberately corrupt provenance data in related graphs:

This is a concern I share, and a reason why I'm opposed to multigraph/quad support in “small-scale” formats like TriG, Turtle, RDF/XML or RDF/JSON.

I managed to talk myself into believing that N-Quads are for dumps, and that you should never just load them when crawling the Web.
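
To make that rule of thumb concrete, here is a minimal Python sketch of a load policy. It is only an illustration: the names `Source`, `QUAD_FORMATS` and `may_load` are hypothetical, the format labels are arbitrary strings, and no existing store is claimed to work this way.

```python
from enum import Enum


class Source(Enum):
    """Where a file came from (hypothetical categories)."""
    CRAWL = "crawl"                # fetched from an arbitrary place on the Web
    TRUSTED_DUMP = "trusted_dump"  # a dump the operator chose to import


# Assumption: only N-Quads is treated as a quad format in this sketch.
QUAD_FORMATS = {"nquads"}


def may_load(fmt: str, source: Source) -> bool:
    """Return True if a file in format `fmt` may be loaded from `source`.

    Quad formats can name arbitrary graphs, so they are accepted only
    from trusted dumps. Triple formats are accepted from any source.
    """
    if fmt in QUAD_FORMATS:
        return source is Source.TRUSTED_DUMP
    return True
```

With this policy a crawler would import `.nt` files as before, while an N-Quads file is only accepted when an operator has explicitly marked it as a trusted dump.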

> Consequently there are several cases where the user would like to have different behaviours depending on whether the file you're parsing has 3 or 4 columns, so let's make it easy to find out without pre-parsing the whole file.

Not really an answer, but worthy of note: N-Triples files are currently valid Turtle *and* valid N-Quads, distinguishable by file extension and (perhaps) media type.
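
A rough Python sketch of picking the format from those two signals, without reading the file body, might look like this. The function name `guess_format` and the format labels are hypothetical. The media types in the table are an assumption as well: they are the ones registered for RDF 1.1, so substitute whatever your servers actually send.

```python
import os
from typing import Optional

# Conventional file extensions for the three formats.
BY_EXTENSION = {".nt": "ntriples", ".nq": "nquads", ".ttl": "turtle"}

# Assumption: RDF 1.1 media types; older deployments may send others.
BY_MEDIA_TYPE = {
    "application/n-triples": "ntriples",
    "application/n-quads": "nquads",
    "text/turtle": "turtle",
}


def guess_format(path: str, content_type: Optional[str] = None) -> Optional[str]:
    """Guess the serialization from the media type, then the extension.

    Returns None when neither signal is recognised.
    """
    if content_type:
        # Drop parameters such as "; charset=utf-8" before the lookup.
        media_type = content_type.split(";", 1)[0].strip().lower()
        if media_type in BY_MEDIA_TYPE:
            return BY_MEDIA_TYPE[media_type]
    extension = os.path.splitext(path)[1].lower()
    return BY_EXTENSION.get(extension)
```

Note that the media type wins over the extension here. That is a design choice of the sketch, not something either format requires.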

Best,
Richard


Received on Thursday, 3 March 2011 14:24:44 UTC