- From: Gavin Carothers <gavin@carothers.name>
- Date: Tue, 15 May 2012 08:28:01 -0700
- To: Andy Seaborne <andy.seaborne@epimorphics.com>
- Cc: public-rdf-wg@w3.org
On Tue, May 15, 2012 at 8:17 AM, Andy Seaborne
<andy.seaborne@epimorphics.com> wrote:
>
> On 15/05/12 16:01, Gavin Carothers wrote:
>>
>> On Tue, May 15, 2012 at 7:46 AM, Peter F. Patel-Schneider
>> <pfpschneider@gmail.com> wrote:
>>>
>>> The Turtle editor's draft says that WS is needed to prevent
>>> mis-recognition of tokens, but doesn't explicitly define a token.
>>
>> token, terminal, whatever ;)
>>
>>>
>>> If the parsing Turtle depends on tokenizing, then there needs to be
>>> an explicit definition of what a token is, and, further, what
>>> mis-recognizing a token means.
>>
>> Proposed new language:
>>
>> White space (production WS) is used to separate two terminals which
>> would otherwise be (mis-)recognized as one terminal. Rule names
>> below in capitals indicate where white space is significant; these
>> form a possible choice of terminals for constructing a Turtle
>> parser.
>>
>> White space is significant in terminal IRIREF and the production
>> String.
>>
>> ---
>>
>> Also, split that grammar table and identify all terminals with more
>> than just all caps.
>>
>>>
>>> For example, is
>>>
>>> @prefixprefix:<foo>.
>>>
>>> a valid Turtle statement?
>>
>> Yes.
>
> Err - somewhat tricky to integrate in with common tokenizer/parser
> approaches given the wonders of language tags. Please ban it ("it" = a
> lack of WS after @prefix).
>
> [disclosure - my parsers don't care but, for speed, they are handwritten
> and that means content sensitive tokenizing or messing with pushback
> onto the input stream are doable - using tokenizer/parser toolkits may
> make this messy]
>
> I note that the Turtle submission bans it.
>
> [4] prefixID ::= '@prefix' ws+ prefixName? ':' uriref
>
>>
>>>
>>> Things would get even worse if the @ was allowed to be dropped,
>>> which is a good reason to vote against allowing dropping of @.
>
> Not true! The presence of the ":" is enough to distinguish bareword
> keywords and prefixes.
>
> If @ were not also used for language tags, a single leading char might
> make a parser writer's life easier. But we have language tags starting
> with @ and @prefix is a legal language tag (just unregistered).
>
> Pragmatically, WS after @prefix and @base. Then can have a token type
> that is "@alpha-alphanumericsanddash".

What you don't like:

[19] LANGTAG ::= (BASE | PREFIX | '@' ([a-zA-Z])+ ('-' ([a-zA-Z0-9])+) ?

;)

Yes, I think requiring white space between @prefix and the prefix name
is a very good idea. More human readable.

So much for last call review this week... sigh...

--Gavin

>
> Andy
>
>
>> Yes, while it is possible to create a grammar that allows this in
>> general removing @ is likely to reduce human readability.
>
>>
>>>
>>> peter
>>>
>>
>
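
For illustration, a rough sketch of the single "@alpha-alphanumericsanddash" token type discussed above, assuming a longest-match tokenizer. The names AT_WORD, DIRECTIVES and classify_at_token are made up for this example and do not come from the Turtle draft or any existing parser. It classifies by lexeme only; a real parser would also use position (a language tag follows a string literal).

```python
import re

# Hypothetical token rule: '@', letters, then optional '-alphanumeric' parts.
# This mirrors the "@alpha-alphanumericsanddash" idea; it is not the draft grammar.
AT_WORD = re.compile(r"@[A-Za-z]+(?:-[A-Za-z0-9]+)*")

# Lexemes treated as directives when they appear as a whole '@' token.
DIRECTIVES = {"@prefix", "@base"}


def classify_at_token(text, pos=0):
    """Return (kind, lexeme) for the '@' token starting at text[pos]."""
    match = AT_WORD.match(text, pos)
    if match is None:
        raise ValueError("no '@' token at position %d" % pos)
    lexeme = match.group(0)
    # Longest match: '@prefixprefix' is one lexeme, so it is NOT seen as '@prefix'.
    kind = "DIRECTIVE" if lexeme in DIRECTIVES else "LANGTAG"
    return kind, lexeme
```

With white space after the keyword, classify_at_token("@prefix ex: <foo> .") yields the directive. Without it, the input "@prefixprefix:<foo>." produces the single lexeme "@prefixprefix", which is why a toolkit-generated tokenizer would need extra work to accept that form.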
Received on Tuesday, 15 May 2012 15:28:32 UTC