Re: 2 RDFa SPARQL Test Harness Issues

Seaborne, Andy wrote:
>> We could remove it - but it's valid[1][2] UTF-8, isn't it? Technically,
>> we should be able to feed that to SPARQL and the engine should deal with
>> it, right?
> 
> I am not an expert on Unicode - but not by my reading of the Unicode 
> - it's in the middle of the URL string.

I found some exact wording in RFC3629 to support your interpretation:

"It is important to understand that the character U+FEFF appearing at
any position other than the beginning of a stream MUST be interpreted
with the semantics for the zero-width non-breaking space, and MUST
NOT be interpreted as a signature."[1]

So, it is valid Unicode, but it's prepended to ASK, which makes the
query illegal SPARQL per your implementation, since you don't treat the
"zero-width non-breaking space" as valid whitespace.
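
For anyone wanting to check a query string for this, here is a minimal
Python sketch. The helper name find_bom_offsets and the sample query are
made up for illustration and are not part of the test harness. It
locates U+FEFF in a string and shows that Python, at least, does not
classify the character as whitespace either:

```python
BOM = "\ufeff"  # U+FEFF: byte order mark / zero-width non-breaking space

def find_bom_offsets(text):
    """Return the character offsets at which U+FEFF occurs in text."""
    return [i for i, ch in enumerate(text) if ch == BOM]

# Hypothetical query with U+FEFF sitting directly in front of ASK.
query = BOM + "ASK { ?s ?p ?o }"

print(find_bom_offsets(query))   # offsets of U+FEFF in the query
print(BOM.encode("utf-8"))       # the three-byte UTF-8 form, EF BB BF
print(BOM.isspace())             # False: not whitespace in Python
print(query.strip() == query)    # True: strip() leaves U+FEFF in place
```

The last two lines are the interesting part: a generic "trim whitespace"
step would not remove the character, so it has to be looked for
explicitly.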

> So, to the parser it looks much like: "xASK ..." for some 
> character x and xASK is not legal at this point.

Right. Thanks Andy - we'll change TCs #60 and #108 to remove the BOM.
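
As a rough sketch of the cleanup, assuming the query text is available
as UTF-8 bytes (the load_query helper and the sample bytes below are
hypothetical, not the harness code), Python's "utf-8-sig" codec drops a
leading BOM while decoding, so the character never reaches the URL:

```python
from urllib.parse import quote

def load_query(raw_bytes):
    """Decode a query, dropping one leading UTF-8 BOM if present."""
    # "utf-8-sig" skips a leading EF BB BF signature when decoding.
    return raw_bytes.decode("utf-8-sig")

# Hypothetical test-case contents that start with a BOM.
raw = b"\xef\xbb\xbfASK { ?s ?p ?o }"

with_bom = raw.decode("utf-8")   # plain UTF-8 decoding keeps U+FEFF
without_bom = load_query(raw)    # signature-aware decoding drops it

# Percent-encode both, as they would appear inside a query URL.
print(quote(with_bom))      # starts with %EF%BB%BF
print(quote(without_bom))   # starts with ASK
```

Decoding with plain "utf-8" keeps the character, which is how it can end
up percent-encoded in the middle of a request URL.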

-- manu

[1] http://www.rfc-editor.org/rfc/rfc3629.txt

-- 
Manu Sporny
President/CEO - Digital Bazaar, Inc.

Received on Sunday, 18 May 2008 17:56:31 UTC