W3C

- DRAFT -

RIF Telecon 14 Nov 2008

14 Nov 2008

See also: IRC log

Attendees

Present
Chris Welty, Harold Boley, Hassan Ait-Kaci, Leora Morgenstern, Michael Kifer, Sandro Hawke, Stella Mitchell
Regrets
Chair
Chris Welty
Scribe
Stella Mitchell


Hassan's email of Oct 30, 2008

ChrisW: There are 2 kinds of issues with PS
... 1. presentability/readability
... 2. ambiguity, which makes parsing difficult

... We'll start with Hassan's email about issues he encountered with parsing

Issue 1 - prefix and base directives

Hassan: This has been discussed in previous emails also.
... I propose to put IRIs in quotes in Prefix and Base directives

Sandro: You could use whitespace as a delimiter

<ChrisW> Prefix ::= 'Prefix' '(' Name IRI ')'

Hassan: It's better to make a context-free tokenizer

<Hassan> Prefix ::= 'Prefix' '(' Name STRING ')'

<sandro> <http:...>

MichaelK: Why can't we use an IRI constant here, with angle brackets?

Hassan: Ok, can use angle brackets instead of quotes as delimiters

Sandro, ChrisW: We will use either quotes or angle brackets, and will decide which later
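
To illustrate Hassan's point about a context-free tokenizer, here is a minimal sketch. The token set is hypothetical and much smaller than the RIF PS grammar; it only shows that once the IRI is delimited (by angle brackets or quotes), every token can be recognized without knowing that the lexer is inside a Prefix directive. A bare IRI is harder to delimit, since ")" is itself a legal IRI character.

```python
import re

# Hypothetical token classes for illustration only; not the RIF PS grammar.
TOKEN_RE = re.compile(r"""
    (?P<IRI>    <[^<>\s]*> )               # angle-bracket-delimited IRI
  | (?P<STRING> "[^"]*" )                  # quote-delimited string (no escapes)
  | (?P<NAME>   [A-Za-z_][A-Za-z0-9_]* )
  | (?P<LPAREN> \( )
  | (?P<RPAREN> \) )
  | (?P<WS>     \s+ )
""", re.VERBOSE)


def tokenize(text):
    """Return (kind, lexeme) pairs; no parser state is consulted."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if m.lastgroup != "WS":          # whitespace only separates tokens
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


angle = tokenize("Prefix(ex <http://example.org/ns#>)")
quoted = tokenize('Prefix(ex "http://example.org/ns#")')
```

With either delimiter the directive lexes to five tokens; the undelimited form `Prefix(ex http://example.org/ns#)` is rejected by this sketch at the colon.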

Issue 2 - minus sign in identifiers

<sandro> sandro: let's disallow hyphen in identifiers

<sandro> switch to: camelcase or underscore

<Hassan> In the RIF specification of the EBNF for the Rule Language, it is specified that:

<Hassan> IRIMETA ::= '(*' IRICONST? (Frame | 'And' '(' Frame* ')')? '*)'

<Hassan> Frame ::= TERM '[' (TERM '->' TERM)* ']'

<Hassan> TERM ::= IRIMETA? (Const | Var | Expr | 'External' '(' Expr ')')

<Hassan> Const ::= '"' UNICODESTRING '"^^' SYMSPACE | CONSTSHORT

<Hassan> SYMSPACE ::= ANGLEBRACKIRI | CURIE

<Hassan> where CONSTSHORT, ANGLEBRACKIRI, and CURIE are defined (in the DTB shorthand notation for RIF constants) by:

<Hassan> CURIE ::= PNAME_LN | PNAME_NS

<Hassan> CONSTSHORT ::= ANGLEBRACKIRI // shorthand for "..."^^rif:iri

<Hassan> | CURIE // shorthand for "..."^^rif:iri

<Hassan> | '"' UNICODESTRING '"' // shorthand for "..."^^xs:string

<Hassan> | NumericLiteral // shorthand for "..."^^xs:integer,xs:decimal,xs:double

<Hassan> | '_' LocalName // shorthand for "..."^^rif:local

<Hassan> where:

<sandro> I suggest using http://www.w3.org/TeamSubmission/turtle/ wherever reasonable.

<Hassan> ANGLEBRACKIRI ::= '<' ([^<>"{}|^`\]-[#x00-#x20])* '>'

<Hassan> PNAME_LN ::= PNAME_NS PN_LOCAL

<Hassan> PNAME_NS ::= PN_PREFIX? ':'

<Hassan> PN_LOCAL ::= (PN_CHARS_U | [0-9]) ((PN_CHARS|'.')* PN_CHARS)?

<Hassan> PN_PREFIX ::= PN_CHARS_BASE ((PN_CHARS|'.')* PN_CHARS)?

<Hassan> PN_CHARS_U ::= PN_CHARS_BASE | '_'

<Hassan> PN_CHARS ::= PN_CHARS_U

<Hassan> | '-'

<Hassan> | [0-9]

<Hassan> | #x00B7

<Hassan> | [#x0300-#x036F]

<Hassan> | [#x203F-#x2040]

<Hassan> PN_CHARS_BASE ::= [A-Z]

<Hassan> | [a-z]

<Hassan> | [#x00C0-#x00D6]

<Hassan> | [#x00D8-#x00F6]

<Hassan> | [#x00F8-#x02FF]

<Hassan> | [#x0370-#x037D]

<Hassan> | [#x037F-#x1FFF]

<Hassan> | [#x200C-#x200D]

<Hassan> | [#x2070-#x218F]

<Hassan> | [#x2C00-#x2FEF]

<Hassan> | [#x3001-#xD7FF]

<Hassan> | [#xF900-#xFDCF]

<Hassan> | [#xFDF0-#xFFFD]

<Hassan> | [#x10000-#xEFFFF]

Hassan: DTB points to other specs; parts of the above grammar come from those referenced specs

ChrisW: The - is allowed in the curie notation?
... we are inheriting dash inside identifiers from curie syntax
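
As a rough check of ChrisW's observation, the CURIE rules quoted above can be transcribed into a regular expression. This is a sketch under two assumptions: only the ASCII parts of the character classes are kept, and the repetition operators are taken from the SPARQL grammar that these rules originate from. The names `CURIE` and `is_curie` are made up for the example.

```python
import re

# ASCII-only subset of the quoted character classes (assumption: the
# non-ASCII ranges are dropped to keep the sketch short).
PN_CHARS_BASE = r"[A-Za-z]"
PN_CHARS_U = r"[A-Za-z_]"
PN_CHARS = r"[A-Za-z_\-0-9]"       # note the '-' inherited from the grammar

# Repetition operators as in the SPARQL grammar these rules come from.
PN_PREFIX = rf"{PN_CHARS_BASE}(?:(?:{PN_CHARS}|\.)*{PN_CHARS})?"
PN_LOCAL = rf"(?:{PN_CHARS_U}|[0-9])(?:(?:{PN_CHARS}|\.)*{PN_CHARS})?"
PNAME_NS = rf"(?:{PN_PREFIX})?:"

# CURIE ::= PNAME_LN | PNAME_NS, i.e. PNAME_NS with an optional local part.
CURIE = re.compile(rf"{PNAME_NS}(?:{PN_LOCAL})?")


def is_curie(text):
    """True if the whole string is a CURIE under the ASCII subset above."""
    return CURIE.fullmatch(text) is not None


print(is_curie("func:numeric-subtract"))   # hyphen inside the local name
print(is_curie("a-b:c"))                   # hyphen inside the prefix
print(is_curie("ex:-"))                    # local name may not start with '-'
```

Under these assumptions the first two calls print True and the third prints False: the dash is allowed inside a prefix or local name but cannot start a local name.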

Sandro: We will inherit additional difficult things from curie grammar also

Hassan: I'm not aiming for my prototype to cover all possible valid cases, just most of them

Sandro: We could give a restricted definition of identifiers

<sandro> http://www.w3.org/TeamSubmission/turtle/#name

ChrisW: Then we have to write our own syntax for these, and not use curie

Sandro: Could possibly use the one I pasted above in the IRC

<sandro> nameStartChar | '-' | [0-9] | #x00B7 | [#x0300-#x036F] | [#x203F-#x2040]

<sandro> (turtle)

AxelP points to SPARQL grammar

ChrisW: What if we remove dash from other places, outside of identifiers?

Sandro: I can't imagine readable rules without infix subtraction

<ChrisW> ex:-

<ChrisW> -

<Hassan> foo-bar

<sandro> External(func:numeric-subtract(2 1)) 2 - 1 "+" "-" "*" "/" as in programming languages

you could expand all "shortcuts" before parsing

<sandro> (quoting http://www.w3.org/TR/rif-ucr/)

<Harold> The "-" in the abridged 17-4 would not be part of an identifier.

but without the infix operator which is not currently part of the grammar, you can preprocess

<Harold> So omitting "-" from identifiers helps introducing abridged PS.

Sandro: So can we remove - from identifiers?

ChrisW: Then we have to reproduce syntax for curies

Hassan: If minus signs are allowed in identifiers in Turtle, etc., and they also allow infix operators, how do they handle that?

ChrisW: They don't have - as a separate operator
... I think it's more important to be in line with the other standards than it is to have pretty subtraction syntax

<Harold> External(func:subtract-dateTimes(?deliverydate ?scheduledate)) OR External(?deliverydate-?scheduledate)

ChrisW: Let's think about this one, and come back to it

<Harold> External(func:subtract-dateTimes(?deliverydate ?scheduledate)) OR External(?deliverydate - ?scheduledate)

Sandro: Another option (that some others use) is to require spaces around the - when it is used as subtraction
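
The whitespace option can be sketched with a longest-match lexer. The token classes below are hypothetical, and the sketch assumes that both names and variables may contain hyphens; infix subtraction is not part of the current grammar. It shows why the unspaced form of Harold's example does not lex as a subtraction while the spaced form does.

```python
import re

# Hypothetical token classes; names and variables are assumed to allow hyphens.
TOKEN_RE = re.compile(r"""
    (?P<VAR>   \?[A-Za-z_][A-Za-z0-9_-]* )
  | (?P<NAME>  [A-Za-z_][A-Za-z0-9_-]* )
  | (?P<NUM>   [0-9]+ )
  | (?P<MINUS> - )
  | (?P<WS>    \s+ )
""", re.VERBOSE)


def lex(text):
    """Greedy (longest-match) lexing; whitespace separates tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


print(lex("foo-bar"))     # [('NAME', 'foo-bar')]
print(lex("foo - bar"))   # [('NAME', 'foo'), ('MINUS', '-'), ('NAME', 'bar')]
print(lex("17-4"))        # [('NUM', '17'), ('MINUS', '-'), ('NUM', '4')]
print(lex("?deliverydate-?scheduledate"))
# [('VAR', '?deliverydate-'), ('VAR', '?scheduledate')]
```

In the last call the hyphen is absorbed into the first variable name, so no MINUS token appears; with spaces around it, the same input lexes as VAR, MINUS, VAR.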

Issue 3 - ANGLEBRACKIRI

Issue 4 - UNITERMS, either positional or named but not both

Hassan: This is a problem for a bottom-up parser

Hassan: Could we allow mixing at the EBNF level, and then check for the additional syntax error (mixing) later?
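
A minimal sketch of that two-pass idea, assuming the argument list has already been split into string tokens. The `Arg` class and both function names are hypothetical. The first pass accepts any mix of positional and named arguments using one token of look-ahead; the second pass rejects the mix.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Arg:
    value: str
    name: Optional[str] = None   # None means the argument is positional


def parse_args(tokens):
    """Permissive pass: accept any mix of `value` and `name -> value`."""
    args = []
    i = 0
    while i < len(tokens):
        if tokens[i] == "->":
            raise SyntaxError("'->' without a preceding name")
        if i + 1 < len(tokens) and tokens[i + 1] == "->":
            # One token of look-ahead tells us this is a named argument.
            if i + 2 >= len(tokens):
                raise SyntaxError("missing value after '->'")
            args.append(Arg(value=tokens[i + 2], name=tokens[i]))
            i += 3
        else:
            args.append(Arg(value=tokens[i]))
            i += 1
    return args


def check_not_mixed(args):
    """Second pass: a uniterm is either all positional or all named."""
    kinds = {arg.name is not None for arg in args}
    if len(kinds) > 1:
        raise SyntaxError("uniterm mixes positional and named arguments")
    return args


named = check_not_mixed(parse_args(["color", "->", "red", "size", "->", "big"]))
positional = check_not_mixed(parse_args(["a", "b"]))
```

An argument list such as `bar baz -> foo boo -> goo` passes the first function and is rejected by the second.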

Is it the prefix notation that causes this ambiguity? Doesn't a "Name" look different from a "TERM"?

ChrisW: It's confusing that syntax for named arguments in uniterms is the same as syntax for slots in frames

<sandro> foo( [color]red ) would work

foo[ color::red size::big ]

<sandro> foo( color : red ) would work

<sandro> (solving Chris's problem, not Hassan.)

Sandro: We are discussing 2 different issues about named args
... Hassan's re parsing, and ChrisW's re readability

<ChrisW> foo{ color::red size::big }

<ChrisW> foo! color::red size::big !

<sandro> foo((a->b))

<sandro> f((a)->b)

<sandro> foo( named color->red)

<sandro> foo(NAMED color->red)

Sandro, why do you think color can be confused as positional argument??

<sandro> StellaMitchell, it's not truly ambiguous, it just requires more look-ahead.

why does it require more lookahead?

For a uniterm with no arguments, it is truly ambiguous in terms of the grammar, though presumably that shouldn't matter in practice, since the parser will just pick one reading and either is fine. If there are arguments, I'm not clear on where the ambiguity comes in

<mkifer> foo(! color-> red !)

<Harold> We implemented this in ANTLR.

<sandro> foo(-> color -> red ->) :-) :-)

ChrisW: Useful for readability to distinguish named arg predicates from frames

<Hassan> foo ( bar baz -> foo boo -> goo )

<sandro> foo ( color -> red name -> rover ) ?

<Harold> PS is Human-Oriented Syntax.

<Harold> So it's ok if parsers have some harder time.

<Harold> After all, parsing is done only once, before the machine has to start its real (deductive) work.

<Harold> It's not on the XML level.

Issue 5 - IRICONST

Hassan: Cannot tell the difference between special case and general case

Hassan: The ambiguity is that you have a grammar that gives general rules and then within that grammar there is another grammar that gives more specific rules

MichaelK: Where is the difficulty? is it the ^^?

Hassan: We are talking about the rif:iri suffix
... already past the "..."^^curie
... I am talking about specifically rif:iri that complicates things at the lexical level

Sandro: I think a RIF parser should never be looking inside an IRI?

ChrisW: An IRI or a string?

Sandro: Either

Hassan: At the lexical level we require that it be "rif:iri", and requiring that for an IRI const complicates lexical parsing
... it would be cleaner to accept any curie there and check that it's rif:iri later
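
A minimal sketch of that suggestion: lex every typed constant the same way and test for rif:iri in a separate step. The regular expression is a simplification (ASCII-style CURIEs only, no escapes inside the quoted part), and the function names are hypothetical.

```python
import re

# Simplified "..."^^curie pattern; escapes inside the quotes are not handled.
TYPED_CONST = re.compile(
    r'"(?P<lexical>[^"]*)"\^\^(?P<symspace>[A-Za-z_][\w.-]*:[\w.-]+)'
)


def lex_const(text):
    """Lexical step: accept any symbol space, return (lexical, symspace)."""
    m = TYPED_CONST.fullmatch(text)
    if m is None:
        raise SyntaxError(f"not a typed constant: {text!r}")
    return m.group("lexical"), m.group("symspace")


def require_iri_const(const):
    """Later check: only here do we insist on the rif:iri symbol space."""
    lexical, symspace = const
    if symspace != "rif:iri":
        raise SyntaxError(f"expected rif:iri, got {symspace}")
    return lexical


iri = require_iri_const(lex_const('"http://example.org/a"^^rif:iri'))
other = lex_const('"abc"^^xs:string')   # lexes fine; rejected only by the check
```

The lexer stays uniform for all constants, and the rif:iri restriction becomes an ordinary post-parse error rather than a special lexical case.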

I will send an email example to make the problem more clear

AOB

ChrisW: Another meeting same time next week?
... works for everyone

MichaelK: We need to discuss the escape symbols, e.g. for double quotes within double-quoted strings

Hassan: And we should discuss the APS

Summary of Action Items

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.133 (CVS log)
$Date: 2008/11/14 17:12:14 $