Re: CURIE proposal ... (was: Re: [DTB] ACTIONs 428 and 292 completed) from Axel Polleres on 2008-04-22 (public-rif-wg@w3.org from April 2008)

From: Axel Polleres <axel.polleres@deri.org>
Date: Tue, 22 Apr 2008 09:20:42 +0100
To: Sandro Hawke <sandro@w3.org>, "Public-Rif-Wg (E-mail)" <public-rif-wg@w3.org>
CC: Michael Kifer <kifer@cs.sunysb.edu>, Chris Welty <cawelty@gmail.com>
Message-ID: <480D9FDA.6030403@deri.org>
Sandro Hawke wrote:
> I'm such a flip-flopper; maybe I should run for President.
> 
> While I liked Axel's proposal, I think I also get Michael's point, that
> (to paraphrase) the Presentation Syntax is not a real rule language.  If
> we want a real (usable) rule language, we should carefully design one,
> not just add random patches to the PS.

OK, does that mean "let's forget about presentation syntax, and write 
all examples in XML"? Indeed, I am getting a bit bored of carrying 
around a non-endorsed pseudo-language which is hardly more readable 
than the XML syntax... It is a pity, because I thought we could get 
something readable, close enough to F-logic and Turtle, and we were only 
one step away, but well.

If I may recognize a "trend", then let's see what happened with Turtle/N3.
For years and years it had unofficial status while RDF/XML was normative.
But now SPARQL has picked up Turtle syntax and there is a team submission 
for giving Turtle and N3 a more official status. For me, this seems to 
indicate that there is a need for some human-parseable syntax, but well.

> In practice, I think it would work to have test cases contain
> non-normative "original" input documents, written in some real rule
> language, and also normative RIF XML translations of those documents
> (hopefully, but not necessarily, generated by machine).  And if some WG
> members want to hack up something like a presentation syntax that lines
> up nicely with the XML syntax but is a usable real language, 

Does this imply that we agree to call the presentation syntax, which we 
have been discussing for ... (how long?), a "hack by some WG members"? :-)

Quite honestly,

a) yes, I want a syntax to write down examples, at least for BLD, in a 
human-parseable form, and yes, it should be close to Turtle/N3, and it 
is perfectly fine with me to have some F-Logic mixed with it. We 
shouldn't throw everything away now, but fix what is there.

b) erasing the presentation syntax now from all the documents and 
replacing everything by XML would cause more pain than fixing the 
presentation syntax into something usable


some more comments inline below...


> and it's
> NOT used in the BLD spec, that's a little weird, but... it's okay.
> 
>       -- Sandro
> 
>> Axel Polleres wrote:
>>> Chris Welty wrote:
>>> [...]
>>>>  > Michael Kifer wrote:
>>>>  >> One more comment. All names of the builtins must belong to some symbol
>>>>  >> space. So, the proper naming would be something like
>>>>  >>
>>>>  >> "pred:isLong"^^rif:iri
>>>>  >>
>>>>  >> rather than pred:isLong.
>>>>  >
>>>>  > The first sentence in the  namespace section explains that this is
>>>>  > exactly how I use prefixed abbreviations here. The long version is
>>>>  > highly unreadable.
>>>>
>>>> While I think we all agree that they are unreadable, keep in mind that 
>>>> people
>>>> will very likely be cutting&pasting from your document, so a comment 
>>>> buried in
>>>> the introduction will easily be missed.  I suggest using the full syntax and
>>>> we'll try to fix the problem universally.
>>>>
>>>> -Chris
>>> I was starting to expand the IRIs as advised... so far only up to
>>> Section 4.2 of http://www.w3.org/2005/rules/wiki/DTB
>>>
>>> Before I waste more time on something which we probably need to revise, 
>>> let me first try to make the universal proposal you seem to ask for:
>>>
>>>
>>>
>>> Let me remark, that we have an ambiguity/problem already in BLD with our 
>>> sloppy definitions and "informal" use of CURIEs:
>>>
>>> "For brevity, we use the compact URI notation [CURIE], prefix:suffix, 
>>> which should be understood as a macro that expands into a concatenation 
>>> of the prefix definition and suffix. Thus, if bks is a prefix that 
>>> expands into http://example.com/books# then bks:LeRif should be 
>>> understood merely as an abbreviation for http://example.com/books#LeRif. 
>>> The compact URI notation is not part of the RIF-BLD syntax."
>>>
>>> I think that this is ambiguous, given that CURIEs are not syntactically 
>>> different from IRIRefs here. Let us illustrate this by the following 
>>> simple example...
>>>
>>>    "mailto:chris"^^rif:iri
>>>
>>> Now, is mailto:chris a CURIE or an IRI?
>> If you say that mailto is a prefix then such an example would be
>> ambiguous. But we never do that in our examples.

Exactly; but just because we never do it in the docs doesn't mean that it is 
unambiguous. I don't like that.
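
A minimal Python sketch of the ambiguity, assuming the purely macro-style 
reading of the BLD paragraph quoted above. The function name naive_expand, 
the prefix tables, and the http://example.org/mail# namespace are 
hypothetical and only serve the illustration:

```python
def naive_expand(lexical, prefixes):
    """Expand prefix:suffix as a plain macro, as the quoted BLD text describes.

    Nothing in the lexical form says whether it is a CURIE or a full IRI,
    so the result depends entirely on which prefixes happen to be declared.
    """
    prefix, sep, suffix = lexical.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + suffix
    return lexical


# Hypothetical prefix tables: the second one declares "mailto" as a prefix.
no_mailto = {"bks": "http://example.com/books#"}
with_mailto = {
    "bks": "http://example.com/books#",
    "mailto": "http://example.org/mail#",
}

# Same lexical form, two different readings.
print(naive_expand("mailto:chris", no_mailto))
print(naive_expand("mailto:chris", with_mailto))
```

Under the first table the string is left alone as an IRI; under the second 
it is silently rewritten, although the text of the constant is identical.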

>> This is why we disclaim any attempt to treat curies as part of the syntax.

Which is a pity, because it would be easy to fix by my proposal.

>> Personally, I would not use curies at all. They are more trouble than they
>> are worth in a formal spec.

I disagree. Turtle and SPARQL are counterexamples: both work 
fine and have experienced considerable take-up.

>> Instead, I would define a *real* (separate)
>> language, which would be fairly close to the presentation syntax but more
>> restricted, and use that for the test cases. In that language, I would
>> define curies precisely.

You mean, yet another language *plus* keeping a pseudo presentation syntax??

>>> I would rather suggest the following:
>>>
>>> 1) We use UNQUOTED prefix:ncname to denote CURIEs which expand to QUOTED IRIs
>>> 2) Anything in quotes can NOT be expanded.
>> Since URL strings are going to be quite common not only for IRIs but also
>> for anyURI, xsd:string values, etc., I would prefer to have a more general
>> macro facility in which everything is expanded, even if quoted. 
>>
>> One possibility would be to escape : if we do not want it to be treated
>> as a macro.

All I wanted was something close to Turtle, which people have already become 
a bit familiar with; it was not my intention to invent another new 
syntax. If you want a general macro definition, XML already has that with 
entity definitions... In that case, let's indeed forget about CURIEs 
completely and use entity definitions, i.e.

  "http://www.example.org"^^"&rif;iri"


>> Another is to use a different style of quoting. For instance, '....'^^....
>> Inside '...' things would expand and inside "...." they will not. But
>> otherwise '....'^^... and "...."^^... would mean the same.
>>
>>> 3) For "IRI"^^rif:iri we also allow to write <IRI>
>> no prob.
>>
>>> 4) For symbol space IRIs (i.e. IRIs after the ^^) we only allow either 
>>> the unquoted prefix:ncname writing or the angle bracketed full IRI.

Condition 4) is there to keep the definition from becoming recursive, BTW (we don't want 
to end up with 
"http://xyz.example.org"^^"http://rif#iri"^^"^^"http://rif#iri""^^"http://rif#iri"...)
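
Rules 1) to 4) can be sketched roughly as follows in Python. This is only 
an illustration under simplifying assumptions, not a parser for the 
presentation syntax: the names PREFIXES, expand_curie, symbol_space and 
normalize are hypothetical, the prefix table is hard-coded instead of being 
read from prefix declarations, and escaping inside quotes is ignored.

```python
RIF_IRI = "http://www.w3.org/2007/rif#iri"

# Hypothetical prefix table; a real document would declare these prefixes.
PREFIXES = {
    "rif": "http://www.w3.org/2007/rif#",
    "bks": "http://example.com/books#",
}


def expand_curie(curie, prefixes=PREFIXES):
    """Rule 1: an unquoted prefix:ncname expands to prefix definition + ncname."""
    prefix, _, local = curie.partition(":")
    if prefix not in prefixes:
        raise ValueError(f"undeclared prefix: {prefix!r}")
    return prefixes[prefix] + local


def symbol_space(text, prefixes=PREFIXES):
    """Rule 4: a symbol space is an unquoted CURIE or <full IRI>, never quoted."""
    if text.startswith("<") and text.endswith(">"):
        return text[1:-1]
    if text.startswith('"'):
        raise ValueError("a symbol space must not be written in quotes")
    return expand_curie(text, prefixes)


def normalize(constant, prefixes=PREFIXES):
    """Return (lexical form, symbol space IRI) for a single constant."""
    if constant.startswith('"'):
        # Rule 2: whatever is inside the quotes is never expanded.
        lexical, sep, space = constant[1:].rpartition('"^^')
        if not sep:
            raise ValueError("expected a quoted lexical form followed by ^^")
        return lexical, symbol_space(space, prefixes)
    if constant.startswith("<") and constant.endswith(">"):
        # Rule 3: <IRI> is shorthand for "IRI"^^rif:iri.
        return constant[1:-1], RIF_IRI
    # Rule 1: a bare CURIE denotes a constant in the rif:iri symbol space.
    return expand_curie(constant, prefixes), RIF_IRI


print(normalize("bks:LeRif"))
print(normalize('"mailto:chris"^^rif:iri'))
print(normalize("<mailto:chris>"))
```

Because symbol_space rejects quoted input, a symbol space can never itself 
carry a ^^ suffix, which is what stops the nesting shown above.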

>>> ( 5) optionally, in the presentation syntax we drop rif:iri completely 
>>> and ONLY allow the < > notation for iris.)
>> That does not make sense to me. First, there is a uniform syntax for all
>> constants. Shortcuts (like the above, another one for numbers, and yet
>> another one for strings) are fine, but mutilating the language in the name
>> of these shortcuts is a no-no.

Fine, I can live without 5), which is why I put it in parentheses.

>> Second, as I said, it is desirable to have a
>> more general macro facility.
>>
>> I think a solution lies in defining a general macro facility carefully and
>> in the introduction of well-chosen shortcuts. But I think the document
>> itself should not be muddled with all these. We should define a real
>> parsable language.

Parseable for machines? Then we only need XML.
Parseable for humans, for readable examples? Then I think my suggested fix 
will do.

Let's see whether I get Sandro to flip-flop again now ;-)

Axel


>> 	--michael  
>>
>>
>>> That would bring us fairly close to Turtle:
>>>
>>>
>>> "mailto:chris"^^rif:iri
>>> vs
>>> mailto:chris^^rif:iri
>>> vs.
>>> <mailto:chris>
>>> vs.
>>> mailto:chris^^<rif:iri>
>>>
>>> are 3 different constants where the 2nd and the 3rd are syntactic sugar.
>>>
>>> On the whole, I suggest replacing the above paragraph with the following:
>>>
>>> ------------------------------------------------------------------------
>>> For brevity, we use the compact URI notation [CURIE], 
>>> <tt>prefix:suffix</tt>, which should be understood as a macro that 
>>> expands into a concatenation of the prefix definition and suffix 
>>> whenever not appearing within quotes or angle brackets.
>>>   We use angle brackets to denote full IRIs. Note that for symbol spaces 
>>> only IRIs are allowed.
>>>
>>>   Thus, if bks is a prefix that expands into http://example.com/books# 
>>> then bks:LeRif should be understood merely as an abbreviation for 
>>> "http://example.com/books#LeRif"^^rif:iri, which - in turn -
>>> is an abbreviation for
>>>
>>> "http://example.com/books#LeRif"^^<http://www.w3.org/2007/rif#iri>
>>>
>>> As syntactic sugar, we allow writing simply
>>> <IRI> for constants in the rif:iri symbol space, i.e. for our example we
>>> could write even more briefly:
>>>
>>> <http://example.com/books#LeRif>
>>>
>>>
>>> The compact URI notation is part of the RIF presentation syntax.
>>> Namespace prefixes are defined following the syntax of
>>> [SPARQL http://www.w3.org/TR/rdf-sparql-query/#rPrefixDecl],
>>> i.e.
>>>
>>> prefix bks: <http://example.com/books#>
>>>
>>> The EBNF grammar for prefix declarations is:
>>>
>>>   PrefixDecl ::=  'PREFIX' PN_PREFIX? ':' IRI_REF
>>>   IRI_REF   ::=  '<' ([^<>"{}|^`\]-[#x00-#x20])* '>'
>>>   PN_PREFIX  ::=  PN_CHARS_BASE ((PN_CHARS|'.')* PN_CHARS)?
>>>   PN_CHARS   ::=  PN_CHARS_U | '-' | [0-9] | #x00B7 | [#x0300-#x036F] | 
>>> [#x203F-#x2040]
>>>   PN_CHARS_U ::=  PN_CHARS_BASE | '_'
>>>   PN_CHARS_BASE ::=  [A-Z] | [a-z] | [#x00C0-#x00D6] | [#x00D8-#x00F6] | 
>>> [#x00F8-#x02FF] | [#x0370-#x037D] | [#x037F-#x1FFF] | [#x200C-#x200D] | 
>>> [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | 
>>> [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]
>>> ------------------------------------------------------------------------
>>>
>>>
>>> This proposal solves several problems we have now, I think the 
>>> definition is no longer recursive, and I'd hope that it is acceptable...
>>>
>>> Axel
>>>
>>>>  >
>>>>  >> See, for example, Example 2 in
>>>>  >> http://www.w3.org/2005/rules/wiki/BLD#EBNF_for_the_RIF-BLD_Rule_Language
>>>>  >
>>>>  >
>>>>  >
>>>>  >> Alternatively, we could allocate a separate symbol space to the
>>>>  >> builtins. In this case, they would look something like
>>>>  >
>>>>  > I did suggest separate symbol spaces some time ago in an earlier mail,
>>>>  > resp. it was like that in the syntax proposals... with the addition that
>>>>  > - if we have a separate symbol space - why then do we need a keyword
>>>>  > "External"?
>>>>  >  If we had separate symbol spaces, I think this would sufficiently
>>>>  > designate builtins syntactically.
>>>>  >
>>>>  >> "isLong"^^rif:predicate
>>>>  >>
>>>>  >>     --michael
>>>>  >>> DTB is ready for review, slightly delayed:
>>>>  >>>
>>>>  >>> http://www.w3.org/2005/rules/wiki/DTB
>>>>  >>>
>>>>  >>> At least, it should - although not officially part of the
>>>>  >>> published working drafts in this round - be stabilized to comply
>>>>  >>> mostly with the definitions in FLD and BLD concerning external schemas.
>>>>  >>>
>>>>  >>> This also completes ACTION-292
>>>>  >>> "Put links for each builtin to Xquery source URI"
>>>>  >>>
>>>>  >>> best,
>>>>  >>> Axel
>>>>  >
>>>>  >
>>>>
>>>> --
>>>> Dr. Christopher A. Welty                    IBM Watson Research Center
>>>> +1.914.784.7055                             19 Skyline Dr.
>>>> cawelty@gmail.com                           Hawthorne, NY 10532
>>>> http://www.research.ibm.com/people/w/welty
>>>>
>>>
>>> -- 
>>> Dr. Axel Polleres, Digital Enterprise Research Institute (DERI)
>>> email: axel.polleres@deri.org  url: http://www.polleres.net/
>>>
>>> rdfs:Resource owl:differentFrom xsd:anyURI .
>>>
>>>
>>


-- 
Dr. Axel Polleres, Digital Enterprise Research Institute (DERI)
email: axel.polleres@deri.org  url: http://www.polleres.net/

rdfs:Resource owl:differentFrom xsd:anyURI .
Received on Tuesday, 22 April 2008 08:21:26 UTC