- From: Seaborne, Andy <andy.seaborne@hp.com>
- Date: Fri, 17 Dec 2004 21:21:56 +0000
- To: "Seaborne, Andy" <andy.seaborne@hp.com>
- Cc: Dan Connolly <connolly@w3.org>, RDF Data Access Working Group <public-rdf-dawg@w3.org>
Seaborne, Andy wrote:
>
> Dan Connolly wrote:
>
>> On Fri, 2004-12-17 at 17:56 +0000, Seaborne, Andy wrote:
>>
>>> 1/ Character sets
>>>
>>> I propose SPARQL queries use UTF-8
>>
>> SPARQL queries are sequences of characters; how they're
>> encoded is a protocol issue, right?
>
> Yes, noting that Content-Type does not apply to the request URI. And
> RFC 2396 is a bit vague on the matter as to the charset of what is being
> encoded.
>
> Literals can contain any character from UTF and there are no
> distinguishing markers. I think this means we have to choose one.
>
> As currently stated, the SPARQL query language syntax uses XML 1.1
> qnames which includes a wide range of characters in UTF.
>
> What does IRI say? Any suggestions from that direction?

More mundanely, queries might be written into files on disk (like in the test
suite!). A single convention would be less confusing.

Nearby:

N3 is defined for files over UTF-8:
"N3 files are encoded in UTF-8"
http://www.w3.org/DesignIssues/Notation3.html

N-triples uses US-ASCII

The Turtle grammar is over UNICODE. It doesn't specify an encoding for a file
explicitly but does say:
"""
the content encoding of Turtle content is always UTF-8.
"""

I will follow this trend and put in rq23/ that the grammar is over the UNICODE
character set, and say that the files are encoded in UTF-8.

    Andy

>
>>
>> i.e. under the "chair expects editor to respond to each
>> proposal to change that editor's spec; others in the
>> WG are welcome to advise; chair steps in if consensus
>> does not emerge" sort of game, I'm watching for Kendall's response.
>>
>>> This allows multi (natural) language queries.
>>>
>>> HTTP GET will have to be encoded as usual - we do need to decide the
>>> string being encoded.
>>>
>>> In HTTP POST, Content-Type applies to the entity body.
>>> A request sent by HTTP POST may use Content-Type to change the charset.
>>>
>>> Experiences with declaring the charset in the content show
>>> this to be very error prone:
>>>
>>> a/ it may disagree with the HTTP header
>>>
>>> b/ once opened in one fashion, say the default platform charset,
>>> it can be hard to reopen in another fashion: the underlying
>>> stream may be buffered.
>>>
>>> Aside: as the syntax currently stands (a keyword must be first), it
>>> is possible to snoop and tell the difference between UTF-8 and UTF-16.
>>>
>>>
>>> 2/ We will need a URI for SPARQL
>>
>> I'm not so sure.
>>
>> My implementation experience suggests we choose
>> a URI for the relationship between
>> a KB and a SPARQL query for that KB.
>
> In Joseki, there is a URI for the language and this is associated with a
> KB/service by a property. This fits with the "query-lang=" parameter
> but if you wish to define that as the relationship between SPARQL and
> any KB then fine.
>
> Whichever, "SPARQL" is a concept so we should give it a URI so people
> can reference it anyway.
>
>>
>>> Suggestions:
>>> http://www.w3.org/2001/sw/DataAccess/SPARQL
>>>
>>> (We might want to allow for future revisions but I assume a new WG
>>> would have a new URI itself so versioning isn't needed here).
>>>
>>>
>>> 3/ Relative URIs
>>>
>>> Queries would need a base URI to resolve any relative URIs.
>>
>> would... subjunctive...
>> is this an issue in the current draft?
>>
>> I can't tell from the grammar...
>> http://www.w3.org/2001/sw/DataAccess/rq23/#term-sparql-URI
>> $Revision: 1.160 $ of $Date: 2004/12/17 18:16:17 $
>>
>> I suggest uriRef as the terminal name, if relative URI references
>> are, by intent, allowed.
>>
>> Hmm... we don't currently specify how the syntactic productions
>> relate to the formal definitions, do we?
>>
>>> We can either say "no relative URIs" (that might make the tests
>>> harder if we follow the style of the manifests in using relative URIs).
>>>
>>> For the protocol, "query-uri=" is a natural default base but there
>>> isn't a natural one in all situations like local queries from a
>>> program or one sent as plain "query="
>>>
>>> I suggest a BASE clause in the QL that must be before PREFIXes. It
>>> takes a single, <> quoted URI. It is not required in every query.
>>
>> Seems reasonable.
>>
>>> Andy
>>
>
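To make the encoding points in this thread concrete, here is a minimal Python sketch, not anything taken from the drafts. It assumes the string being percent-encoded for HTTP GET is the UTF-8 form of the query, and it assumes a query always begins with an ASCII keyword and carries no byte order mark, which is what makes the UTF-8 versus UTF-16 snooping possible. The query text, the example.org URI, and the helper name sniff_encoding are all hypothetical, chosen only for illustration.

```python
from urllib.parse import quote, unquote

# Hypothetical query containing a non-ASCII literal.
query = 'SELECT ?x WHERE { ?x <http://example.org/name> "café" }'

# Percent-encode the UTF-8 bytes of the query for use in a request URI.
encoded = quote(query, safe="", encoding="utf-8")

# The receiver reverses the step, again assuming UTF-8.
decoded = unquote(encoded, encoding="utf-8")


def sniff_encoding(data: bytes) -> str:
    """Guess UTF-8 vs UTF-16 from the first two bytes.

    Assumes the data starts with an ASCII keyword and has no BOM:
    in UTF-16 one byte of the first code unit is then zero.
    """
    if len(data) >= 2:
        if data[0] == 0 and data[1] != 0:
            return "utf-16-be"
        if data[0] != 0 and data[1] == 0:
            return "utf-16-le"
    return "utf-8"
```

If files are always UTF-8, as proposed above, the sniffing step is unnecessary; it only illustrates why the keyword-first syntax makes the two encodings distinguishable.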
Received on Friday, 17 December 2004 21:22:25 UTC