Re: First PS TF meeting from Leora Morgenstern on 2008-11-12 (public-rif-wg@w3.org from November 2008)

From: Leora Morgenstern <leora@us.ibm.com>
Date: Wed, 12 Nov 2008 12:55:54 -0500
To: Chris Welty <cawelty@gmail.com>
Cc: "Public-Rif-Wg (E-mail)" <public-rif-wg@w3.org>, public-rif-wg-request@w3.org
Message-ID: <OF1FB4F101.3E4AA35E-ON852574FF.00626E22-852574FF.00628070@us.ibm.com>
Chris, I am interested in attending. I already have a meeting at 11 AM, 
but I'll do my best to switch it, and to attend this.

Leora



Chris Welty <cawelty@gmail.com>
Sent by: public-rif-wg-request@w3.org
11/12/2008 10:49 AM
Please respond to: Chris Welty

To: "Public-Rif-Wg (E-mail)" <public-rif-wg@w3.org>
Subject: First PS TF meeting

Sorry for the continued delay getting this started - but let's go.  The first
presentation syntax task force meeting will be this Friday (Nov. 14) at
11 AM EST (that's the usual RIF telecon time, but a different day).

Please let me know if you plan to attend so I can reserve enough zakim 
ports.

I'd like to start by reviewing Hassan's comments below.

-Chris



Hassan Ait-Kaci wrote:
 > Hello,
 >
 > This is an update on my on-going efforts to produce a working parser and
 > XML serializer for the BLD Presentation Syntax: Action 564, due on
 > October 31, 2008 (http://www.w3.org/2005/rules/wg/track/actions/564).
 >
 > I had already produced such a thing for the original specs - i.e., before
 > several changes were made that have had the effect of introducing several
 > rather nasty ambiguities and context sensitivity, both at the lexical and
 > syntactic levels - even just for the canonical PS (i.e., even w/o the DTB
 > shortcuts and Adrian's Abridged PS).
 >
 > I have been struggling trying to find workarounds to whatever snags have
 > popped up whenever I could figure any. However, there still remain some
 > tricky situations that require our attention (at least so that we produce
 > specs that are not so uselessly complicated to implement without ad hoc
 > hacks).
 >
 > It would be good that the PS Task Force convene sometime soon to discuss
 > these issues and how to resolve them.
 >
 > Here are some examples of what I have puzzled over (this is
 > non-exhaustive):
 >
 > 1) Tokenizing the argument of the Prefix and Base directives is made
 >    uselessly complex by not enclosing the IRI in double quotes (viz., it
 >    forces a lexer to *parse* IRI's - as opposed to just read them off -
 >    for no purpose whatsoever, making the lexical nature of some characters
 >    context-sensitive; for example, ':' is used as a delimiter for CURIE's
 >    but not within IRI's; or, '#' is used as class membership, but not
 >    within IRI's; etc.).
 >
 >    A possible workaround is simply to double-quote them in the directives.
 >
 > 2) The minus sign ('-') now appears in some identifiers (e.g., ?diffdays =
 >    External(func:days-from-duration(?diffduration))). This would be no
 >    problem if we just considered '-' to be part of identifiers like '_',
 >    but it must also be seen as a literal character in order to recognize
 >    tokens such as "->" and ":-". While this is not a major hitch, it is
 >    unnecessary. (Not to mention the fact that '-' is the subtraction
 >    operator in the APS.)
 >
 >    A possible workaround is simply to disallow '-' in identifiers (say,
 >    using '_' instead) - as is the case in most programming languages.
 >
 > 3) The ANGLEBRACKIRI notation can be dealt with by declaring '<' and '>' as
 >    quote chars, but this precludes them from being used as operators or
 >    punctuation.
 >
 > 4) UNITERM's are defined to be either positional or attributed, but not
 >    both:
 >
 >        UNITERM ::= Const '(' (TERM* | (Name '->' TERM)*) ')'
 >
 >    This creates an inherently unliftable reduce/reduce syntactic ambiguity:
 >
 >        =============================
 >        STATE NUMBER: 54
 >        =============================
 >        This state has conflicts:
 >
 >        Unresolved R/R conflict: choosing R82          over R84,  on input 'IDENTIFIER'
 >        Unresolved R/R conflict: choosing R82          over R84,  on input 'CLOSEPAR'
 >        -----------------------------
 >        [45] UniTerm --> Const 'OPENPAR' . UniTermBody 'CLOSEPAR'
 >                      Preceding states: {22, 51, 95, 120, 127, 148}
 >                      Follow set: {'CLOSEPAR'}
 >        [66] UniTermBody --> . Term_star
 >                      Preceding states: {54}
 >        [67] UniTermBody --> . TermAttribute_star
 >                      Preceding states: {54}
 >        [82] Term_star --> .
 >                      Preceding states: {54}
 >                      Lookahead set: {'EXTERNAL', 'NUMBER', 'LOCALNAME', 'VARIABLE',
 >                                      'STRING', 'IDENTIFIER', 'ANGLEBRACKIRI', 'CLOSEPAR', 'OPENMETA', 'COLON'}
 >        [83] Term_star --> . Term_star Term
 >                      Preceding states: {54}
 >        [84] TermAttribute_star --> .
 >                      Preceding states: {54}
 >                      Lookahead set: {'IDENTIFIER', 'CLOSEPAR'}
 >        [85] TermAttribute_star --> . TermAttribute_star TermAttribute
 >                      Preceding states: {54}
 >        -----------------------------
 >        With UniTermBody, go to state 55
 >        With Term_star, go to state 56
 >        With TermAttribute_star, go to state 57
 >
 >     A possible workaround is to modify the rule to:
 >
 >        UNITERM ::= Const '(' (TERM | (Name '->' TERM))* ')'
 >
 >     (i.e., accepting mixed positional and attributed term bodies), and
 >     perform a check ex post facto.
 >
 > 5) According to http://www.w3.org/TR/rif-bld/#sec-ebnf-condition-language:
 >
 >       An IRICONST is the special case of a Const with the symbol
 >       space rif:iri, again permitting the shortcut forms defined in
 >       http://www.w3.org/TR/rif-bld/#ref-rif-dtb. One such specialization
 >       is '"' IRI '"^^' 'rif:iri' from the Const production, where IRI is a
 >       sequence of Unicode characters that forms an internationalized
 >       resource identifier as defined by
 >       http://www.w3.org/TR/rif-bld/#ref-rfc-3987.
 >
 >     However, this definition complicates tokenizing as it becomes
 >     impossible to distinguish the special case from the general one.
 >
 >     A possible workaround is to see an IRICONST as just a fully qualified
 >     constant; i.e., accepting even non-"rif:iri" symbol spaces and
 >     performing the check ex post facto.
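
To make item 2 concrete, here is a minimal Python sketch of a lexer that
allows '-' inside identifiers while still recognizing "->" and ":-". The
token class names and patterns are hypothetical simplifications chosen for
illustration; they are not the BLD lexical grammar. The identifier rule
needs a lookahead so that it stops before "->", which is the kind of
special-casing the item describes.

```python
import re

# Hypothetical token classes for illustration only.
# Multi-character operators come first so they win over single characters.
TOKEN = re.compile(r"""
    (?P<ARROW>->)
  | (?P<IMPLIED_BY>:-)
  | (?P<VARIABLE>\?[A-Za-z_][A-Za-z0-9_]*)
  | (?P<IDENTIFIER>[A-Za-z_](?:[A-Za-z0-9_]|-(?!>))*)
  | (?P<PUNCT>[():=])
  | (?P<SKIP>\s+)
""", re.VERBOSE)


def tokenize(text):
    """Return (kind, lexeme) pairs, dropping whitespace."""
    pos, out = 0, []
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "SKIP":
            out.append((m.lastgroup, m.group()))
        pos = m.end()
    return out


# '-' stays inside the identifier unless it begins '->'.
print(tokenize("func:days-from-duration(?d)"))
print(tokenize("a->b"))
```

Under these assumed rules, "x-y" lexes as a single identifier, so a
subtraction operator as in the APS would need whitespace or a different
identifier rule.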
 >
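
The workaround proposed in item 4 (accept a mixed body, then check
afterwards) can be sketched in Python as follows. This is an illustration
under stated assumptions, not Hassan's parser: the token pattern is a
hypothetical simplification, and every TERM is taken to be a single atomic
token (no nested terms).

```python
import re

# Hypothetical token pattern, not the BLD lexical grammar: '->', a
# parenthesis, or a run of characters that stops before whitespace,
# a parenthesis, or the start of '->'.
TOKEN = re.compile(r"->|[()]|(?:(?!->)[^\s()])+")


def parse_uniterm(text):
    """Parse Const '(' (TERM | Name '->' TERM)* ')' with atomic TERMs."""
    toks = TOKEN.findall(text)
    if len(toks) < 3 or toks[1] != "(" or toks[-1] != ")":
        raise SyntaxError("expected Const '(' ... ')'")
    head, body = toks[0], toks[2:-1]
    if "(" in body or ")" in body:
        raise SyntaxError("nested terms are outside this sketch")
    args = []  # (name, term) pairs; name is None for a positional argument
    i = 0
    while i < len(body):
        if body[i] == "->":
            raise SyntaxError("'->' without a preceding name")
        if i + 1 < len(body) and body[i + 1] == "->":
            if i + 2 >= len(body) or body[i + 2] == "->":
                raise SyntaxError("missing term after '->'")
            args.append((body[i], body[i + 2]))
            i += 3
        else:
            args.append((None, body[i]))
            i += 1
    # Ex post facto check: the body must be all positional or all named.
    named = [name is not None for name, _ in args]
    if any(named) and not all(named):
        raise ValueError("UNITERM mixes positional and named arguments")
    return head, args


print(parse_uniterm("p(?x ?y)"))
print(parse_uniterm("p(a->?x b->?y)"))
print(parse_uniterm("p()"))
```

With a single loop over the body, the empty case "p()" no longer forces a
choice between an empty positional list and an empty named list; the
distinction is only made in the check after parsing.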
 > Again, this is not an exhaustive list of issues. Be those as they may, I
 > will continue working on trying to produce a working [A]PS parser as my
 > time permits while on the road (I have been traveling and will be until
 > Nov. 11).
 >
 > It will be good that the PS Task Force discuss and find resolutions to all
 > such issues.
 >
 > Regards,
 >
 > -hak
 > --
 > Hassan Aït-Kaci  *  ILOG, Inc. - Product Division R&D
 > http://koala.ilog.fr/wiki/bin/view/Main/HassanAitKaci
 >
 >

-- 
Dr. Christopher A. Welty                    IBM Watson Research Center
+1.914.784.7055                             19 Skyline Dr.
cawelty@gmail.com                           Hawthorne, NY 10532
http://www.research.ibm.com/people/w/welty
Received on Wednesday, 12 November 2008 17:57:55 UTC