
Re: Possible test case for turtle?

From: Ivan Herman <ivan@w3.org>
Date: Thu, 22 Nov 2012 09:38:22 -0500
Cc: W3C RDF WG <public-rdf-wg@w3.org>, Andy Seaborne <andy.seaborne@epimorphics.com>
Message-Id: <9655B109-15CE-4228-ACBC-B0A50549D7E9@w3.org>
To: Gavin Carothers <gavin@carothers.name>, Eric Prud'hommeaux <eric@w3.org>
Hm. Actually, it turns out that rdflib is wrong: a '\' is necessary before the final quote, so it should be

> <a> a:b """blablaba bla bla "something here\""""@en .


which raises the issue of whether we need test cases for the production of turtle files; here rdflib got it wrong when serializing!
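
To make the serializer side concrete, here is a minimal sketch of one way a writer could emit a long string safely. It is not rdflib's code; the helper name turtle_long_literal is hypothetical. It assumes the reading above: a quote may stay unescaped inside a long string unless it is the last character of the text or would complete a run of three quotes, and every backslash has to be escaped.

```python
def turtle_long_literal(text, lang=None):
    """Serialize text as a Turtle long-quoted literal (hypothetical helper)."""
    out = []
    run = 0                    # consecutive unescaped quotes emitted so far
    last = len(text) - 1
    for i, ch in enumerate(text):
        if ch == "\\":
            out.append("\\\\")     # a backslash is always escaped
            run = 0
        elif ch == '"':
            # Escape a quote that would complete a run of three, or that
            # would sit directly in front of the closing delimiter.
            if run == 2 or i == last:
                out.append('\\"')
                run = 0
            else:
                out.append('"')
                run += 1
        else:
            out.append(ch)         # newlines may stay raw in long strings
            run = 0
    literal = '"""' + "".join(out) + '"""'
    return literal + ("@" + lang if lang else "")


print(turtle_long_literal('blablaba bla bla "something here"', "en"))
# prints: """blablaba bla bla "something here\""""@en
```

Escaping every quote unconditionally would also be valid and simpler; the sketch only escapes where it seems required, so the output stays close to the input.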

Ivan

On Nov 22, 2012, at 09:34 , Ivan Herman wrote:

> Gavin, Eric,
> 
> this is an issue that came up on the RDFLib mailing list. Essentially, what it says is that the following turtle string
> 
> <a> a:b """blablaba bla bla "something here""""@en .
> 
> seems to be correct per turtle spec (which I think is true) but jena, sesame, or redland fail on it (rdflib does not).
> 
> Maybe worth adding a test case for this then...
> 
> Ivan
> 
> 
> Begin forwarded message:
> 
>> From: Osma Suominen <osma.suominen@aalto.fi>
>> Subject: Quotes not escaped in Turtle long strings
>> Date: November 22, 2012 08:20:28 EST
>> To: rdflib-dev@googlegroups.com
>> Reply-To: rdflib-dev@googlegroups.com
>> List-Id: <rdflib-dev.googlegroups.com>
>> 
>> Hi,
>> 
>> I stumbled on yet another Turtle incompatibility problem. If I use the latest rdflib version to parse this Turtle document (this is a fragment from the NYT People dataset - yes, the text is a bit nonsensical but it's what they've published):
>> 
>> --cut--
>> @prefix skos:    <http://www.w3.org/2004/02/skos/core#> .
>> 
>> <http://data.nytimes.com/13007053909111007903>
>>    skos:definition """'s president from 1983 to 1989. His presidency symbolized the return of democracy in Argentina and other Latin American nations after an era of military dictatorships. He died on March 31, 2009. Argentina Raúl Alfonsín served as
>> Mr. Alfonsín's time in office was one of upheaval that included three failed military coup attempts, hyperinflation and food riots. He won wide praise for prosecuting the military dictators who had preceded him in office.
>> Raúl Ricardo Alfonsín Foulkes was born to a family of shopkeepers in Chascomús on March 12, 1927. His father, an immigrant from Spain, was a passionate supporter of the Loyalists in the Spanish Civil War and a foe of Franco. He graduated from a military academy with a bachelor's degree and the rank of second lieutenant, but he said he \"became fed up with the military.\""""@en .
>> --cut--
>> 
>> and then serialize back to Turtle, the escaped quotation marks near the end of the string become non-escaped, i.e. I get this:
>> 
>> --cut--
>> @prefix skos: <http://www.w3.org/2004/02/skos/core#> .
>> 
>> <http://data.nytimes.com/13007053909111007903> skos:definition """'s president from 1983 to 1989. His presidency symbolized the return of democracy in Argentina and other Latin American nations after an era of military dictatorships. He died on March 31, 2009. Argentina Raúl Alfonsín served as
>> Mr. Alfonsín's time in office was one of upheaval that included three failed military coup attempts, hyperinflation and food riots. He won wide praise for prosecuting the military dictators who had preceded him in office.
>> Raúl Ricardo Alfonsín Foulkes was born to a family of shopkeepers in Chascomús on March 12, 1927. His father, an immigrant from Spain, was a passionate supporter of the Loyalists in the Spanish Civil War and a foe of Franco. He graduated from a military academy with a bachelor's degree and the rank of second lieutenant, but he said he "became fed up with the military.""""@en .
>> --cut--
>> 
>> The latter version cannot be parsed at least by Redland, Jena or Sesame though rdflib itself appears to be able to parse it.
>> 
>> My reading of the Turtle spec is that unescaped quotation marks inside a longString *are* allowed, so rdflib is not strictly wrong here (though I wonder what would happen if you put three of them together inside a literal). But I think it's still pretty bad that many other toolkits cannot parse what rdflib produces. Maybe the quotes should be escaped anyway?
>> 
>> -Osma
>> 
>> -- 
>> Osma Suominen | Osma.Suominen@aalto.fi | +358 40 5255 882
>> Aalto University, Department of Media Technology, Semantic Computing Research Group
>> Room 2541, Otaniementie 17, Espoo, Finland; P.O. Box 15500, FI-00076 Aalto, Finland
>> 
>> 
>> 
> 
> ----
> Ivan Herman, W3C Semantic Web Activity Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> FOAF: http://www.ivan-herman.net/foaf.rdf
> 
> 
> 
> 
> 


----
Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
FOAF: http://www.ivan-herman.net/foaf.rdf
Received on Thursday, 22 November 2012 14:38:53 GMT
