- From: Sandro Hawke <sandro@w3.org>
- Date: Wed, 12 Nov 2008 12:22:56 -0500
- To: Chris Welty <cawelty@gmail.com>
- cc: "Public-Rif-Wg (E-mail)" <public-rif-wg@w3.org>
> Sorry for the continued delay getting this started - but let's go. First
> presentation syntax task force meeting will be this Friday (Nov. 14) at 11AM
> EST (that's the usual RIF telecon time, but a different day).
>
> Please let me know if you plan to attend so I can reserve enough zakim ports.
+1 (in the sense of rsvp += Sandro)
- s
> I'd like to start by reviewing Hassan's comments below.
>
> -Chris
>
>
>
> Hassan Ait-Kaci wrote:
> > Hello,
> >
> > This is an update on my on-going efforts to produce a working parser and
> > XML serializer for the BLD Presentation Syntax: Action 564, due on
> > October 31, 2008 (http://www.w3.org/2005/rules/wg/track/actions/564).
> >
> > I had already produced such a thing for the original specs - i.e., before
> > several changes were made that have had the effect of introducing several
> > rather nasty ambiguities and context sensitivity, both at the lexical and
> > syntactic levels - even just for the canonical PS (i.e., even w/o the DTB
> > shortcuts and Adrian's Abridged PS).
> >
> > I have been struggling trying to find workarounds to whatever snags have
> > popped up whenever I could figure any. However, there still remain some
> > tricky situations that require our attention (at least so that we produce
> > specs that are not so uselessly complicated to implement without ad hoc
> > hacks).
> >
> > It would be good for the PS Task Force to convene sometime soon to discuss
> > these issues and how to resolve them.
> >
> > Here are some examples of what I have puzzled over (this is non-exhaustive):
> >
> > 1) Tokenizing the argument of the Prefix and Base directives is made
> > uselessly complex by not enclosing the IRI in double quotes. It
> > forces a lexer to *parse* IRIs, as opposed to just reading them off,
> > for no purpose whatsoever, and it makes the lexical nature of some
> > characters context-sensitive (for example, ':' is a delimiter in CURIEs
> > but not within IRIs, and '#' denotes class membership but not
> > within IRIs).
> >
> > A possible workaround is simply to double-quote them in the directives.
> >
> > 2) The minus sign ('-') now appears in some identifiers (e.g., ?diffdays =
> > External(func:days-from-duration(?diffduration))). This would be no
> > problem if we just considered '-' to be part of identifiers like '_',
> > but it must also be seen as a literal character in order to recognize
> > tokens such as "->" and ":-". While this is not a major hitch, it is
> > unnecessary. (Not to mention the fact that '-' is the subtraction
> > operator in the APS.)
> >
> > A possible workaround is simply to disallow '-' in identifiers (say,
> > using '_' instead) - as is the case in most programming languages.
> >
> > 3) The ANGLEBRACKIRI notation can be dealt with by declaring '<' and '>' as
> > quote chars, but this precludes them from being used as operators or
> > punctuation.
> >
> > 4) UNITERM's are defined to be either positional or attributed, but not
> > both:
> >
> > UNITERM ::= Const '(' (TERM* | (Name '->' TERM)*) ')'
> >
> > This creates an inherently unliftable reduce/reduce syntactic ambiguity:
> >
> > =============================
> > STATE NUMBER: 54
> > =============================
> > This state has conflicts:
> >
> > Unresolved R/R conflict: choosing R82 over R84, on input 'IDENTIFIER'
> > Unresolved R/R conflict: choosing R82 over R84, on input 'CLOSEPAR'
> > -----------------------------
> > [45] UniTerm --> Const 'OPENPAR' . UniTermBody 'CLOSEPAR'
> > Preceding states: {22, 51, 95, 120, 127, 148}
> > Follow set: {'CLOSEPAR'}
> > [66] UniTermBody --> . Term_star
> > Preceding states: {54}
> > [67] UniTermBody --> . TermAttribute_star
> > Preceding states: {54}
> > [82] Term_star --> .
> > Preceding states: {54}
> > Lookahead set: {'EXTERNAL', 'NUMBER', 'LOCALNAME', 'VARIABLE',
> > 'STRING', 'IDENTIFIER', 'ANGLEBRACKIRI', 'CLOSEPAR', 'OPENMETA', 'COLON'}
> > [83] Term_star --> . Term_star Term
> > Preceding states: {54}
> > [84] TermAttribute_star --> .
> > Preceding states: {54}
> > Lookahead set: {'IDENTIFIER', 'CLOSEPAR'}
> > [85] TermAttribute_star --> . TermAttribute_star TermAttribute
> > Preceding states: {54}
> > -----------------------------
> > With UniTermBody, go to state 55
> > With Term_star, go to state 56
> > With TermAttribute_star, go to state 57
> >
> > A possible workaround is to modify the rule to:
> >
> > UNITERM ::= Const '(' (TERM | (Name '->' TERM))* ')'
> >
> > (i.e., accepting mixed positional and attributed term bodies), and
> > perform a check ex post facto.
> >
> > 5) According to http://www.w3.org/TR/rif-bld/#sec-ebnf-condition-language:
> >
> > An IRICONST is the special case of a Const with the symbol
> > space rif:iri, again permitting the shortcut forms defined in
> > http://www.w3.org/TR/rif-bld/#ref-rif-dtb. One such specialization
> > is '"' IRI '"^^' 'rif:iri' from the Const production, where IRI is a
> > sequence of Unicode characters that forms an internationalized
> > resource identifier as defined by
> > http://www.w3.org/TR/rif-bld/#ref-rfc-3987.
> >
> > However, this definition complicates tokenizing as it becomes
> > impossible to distinguish the special case from the general one.
> >
> > A possible workaround is to see an IRICONST as just a fully qualified
> > constant; i.e., accepting symbol spaces other than "rif:iri" as well and
> > performing the check ex post facto.
> >
> > Again, this is not an exhaustive list of issues. Be those as they may, I
> > will continue working on trying to produce a working [A]PS parser as my
> > time permits while on the road (I have been traveling and will be until
> > Nov. 11).
> >
> > It would be good for the PS Task Force to discuss and find resolutions to
> > all such issues.
> >
> > Regards,
> >
> > -hak
> > --
> > Hassan Aït-Kaci * ILOG, Inc. - Product Division R&D
> > http://koala.ilog.fr/wiki/bin/view/Main/HassanAitKaci
> >
> >
>
> --
> Dr. Christopher A. Welty IBM Watson Research Center
> +1.914.784.7055 19 Skyline Dr.
> cawelty@gmail.com Hawthorne, NY 10532
> http://www.research.ibm.com/people/w/welty
Received on Wednesday, 12 November 2008 17:23:19 UTC
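
The workaround Hassan proposes for item 4 (accept mixed positional and
named arguments in the grammar, then reject mixed bodies afterwards) can
be sketched roughly as below. This is only an illustration, not Hassan's
parser: the function names parse_uniterm_body and check_not_mixed are
hypothetical, the input is assumed to be already tokenized, and each TERM
is assumed to be a single opaque token (no nested terms).

```python
from typing import List, Tuple, Union

Positional = str            # a TERM, kept as one opaque token (assumption)
Named = Tuple[str, str]     # (Name, TERM) for the form: Name '->' TERM
Arg = Union[Positional, Named]


def parse_uniterm_body(tokens: List[str]) -> List[Arg]:
    """Parse (TERM | Name '->' TERM)* from a flat list of tokens."""
    args: List[Arg] = []
    i = 0
    while i < len(tokens):
        # One token of lookahead picks the alternative, so the parser
        # never has to guess the kind of body before seeing an argument.
        if i + 1 < len(tokens) and tokens[i + 1] == "->":
            if i + 2 >= len(tokens) or tokens[i + 2] == "->":
                raise SyntaxError("missing TERM after '->'")
            args.append((tokens[i], tokens[i + 2]))
            i += 3
        else:
            if tokens[i] == "->":
                raise SyntaxError("'->' without a preceding Name")
            args.append(tokens[i])
            i += 1
    return args


def check_not_mixed(args: List[Arg]) -> str:
    """Ex post facto check: reject bodies that mix both argument styles."""
    kinds = {"named" if isinstance(a, tuple) else "positional" for a in args}
    if len(kinds) > 1:
        raise ValueError("UNITERM mixes positional and named arguments")
    return kinds.pop() if kinds else "empty"
```

With this split, a body such as ["?x", "b", "->", "?y"] parses without
trouble and is only rejected by check_not_mixed, which is the point of
moving the restriction out of the grammar.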