Re: CURIE proposal ... (was: Re: [DTB] ACTIONs 428 and 292 completed)

Axel Polleres wrote:
> 
> Chris Welty wrote:
> [...]
> >  > Michael Kifer wrote:
> >  >> One more comment. All names of the builtins must belong to some symbol
> >  >> space. So, the proper naming would be something like
> >  >>
> >  >> "pred:isLong"^^rif:iri
> >  >>
> >  >> rather than pred:isLong.
> >  >
> >  > The first sentence in the  namespace section explains that this is
> >  > exactly how I use prefixed abbreviations here. The long version is
> >  > highly unreadable.
> > 
> > While I think we all agree that they are unreadable, keep in mind that
> > people will very likely be cutting&pasting from your document, so a
> > comment buried in the introduction will easily be missed.  I suggest
> > using the full syntax and we'll try to fix the problem universally.
> > 
> > -Chris
> 
> I was starting to expand the IRIs as advised... so far only up to
> Section 4.2 of http://www.w3.org/2005/rules/wiki/DTB
> 
> Before I waste more time on something which I will probably need to revise, 
> let me first try to make the universal proposal you seem to ask for:
> 
> 
> 
> Let me remark that we already have an ambiguity/problem in BLD with our 
> sloppy definitions and "informal" use of CURIEs:
> 
> "For brevity, we use the compact URI notation [CURIE], prefix:suffix, 
> which should be understood as a macro that expands into a concatenation 
> of the prefix definition and suffix. Thus, if bks is a prefix that 
> expands into http://example.com/books# then bks:LeRif should be 
> understood merely as an abbreviation for http://example.com/books#LeRif. 
> The compact URI notation is not part of the RIF-BLD syntax."
> 
> I think that this is ambiguous, given that CURIEs are not syntactically 
> different from IRIRefs here. Let us illustrate this by the following 
> simple example...
> 
>    "mailto:chris"^^rif:iri
> 
> Now, is mailto:chris a CURIE or an IRI?

If you say that mailto is a prefix then such an example would be
ambiguous. But we never do that in our examples. This is why we disclaim
any attempt to treat curies as part of the syntax.
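
To make the ambiguity concrete, here is a minimal Python sketch. It is
only an illustration: the function name readings, the prefix table, and
the expansion http://example.com/mail# for mailto are all hypothetical
and appear in none of the drafts.

```python
# Hypothetical prefix table. Declaring "mailto" as a prefix is exactly the
# problematic case discussed above; the drafts never do this.
PREFIXES = {
    "bks": "http://example.com/books#",
    "mailto": "http://example.com/mail#",  # assumed expansion, for illustration
}


def readings(lexical, prefixes):
    """Return every way to read the lexical part of "..."^^rif:iri."""
    found = [("iri", lexical)]  # reading 1: the text is already a full IRI
    head, sep, tail = lexical.partition(":")
    if sep and head in prefixes:
        # reading 2: a CURIE, expanded by concatenating prefix definition and suffix
        found.append(("curie", prefixes[head] + tail))
    return found


print(readings("mailto:chris", PREFIXES))
print(readings("mailto:chris", {"bks": "http://example.com/books#"}))
```

With mailto in the prefix table the function reports two readings; with
only bks declared it reports one, which is the situation in our examples.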

Personally, I would not use curies at all. They are more trouble than they
are worth in a formal spec. Instead, I would define a *real* (separate)
language, which would be fairly close to the presentation syntax but more
restricted, and use that for the test cases. In that language, I would
define curies precisely.


> I would rather suggest the following:
> 
> 1) We use UNQUOTED prefix:ncname to denote CURIEs which expand to QUOTED IRIs
> 2) Anything in quotes can NOT be expanded.

Since URL strings are going to be quite common not only for IRIs but also
for anyURI, xsd:string values, etc., I would prefer to have a more general
macro facility in which everything is expanded, even if quoted. One
possibility would be to escape : if we do not want it to be treated as a macro.
Another is to use a different style of quoting. For instance, '....'^^....
Inside '...' things would expand and inside "...." they will not. But
otherwise '....'^^... and "...."^^... would mean the same.
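
A rough Python sketch of the second option (two quoting styles) might
look as follows. Everything in it is an assumption made for illustration:
the name expand_constant, the regular expression, and the choice to leave
the symbol space after ^^ untouched. It is not a proposal for the actual
grammar.

```python
import re

PREFIXES = {"bks": "http://example.com/books#"}

# A constant is <quote>lexical<quote>^^symspace, where the quote is ' or ".
CONSTANT = re.compile(r"""^(?P<q>['"])(?P<lex>.*)(?P=q)\^\^(?P<space>\S+)$""")


def expand_constant(text, prefixes):
    """Expand prefix:suffix inside '...' only; "..." is left untouched."""
    m = CONSTANT.match(text)
    if m is None:
        raise ValueError(f"not a constant: {text!r}")
    lex = m.group("lex")
    if m.group("q") == "'":
        head, sep, tail = lex.partition(":")
        if sep and head in prefixes:
            lex = prefixes[head] + tail  # macro expansion by concatenation
    # Both quoting styles denote the same kind of constant, so the result
    # is always written back in the double-quoted form.
    return f'"{lex}"^^{m.group("space")}'


print(expand_constant("'bks:LeRif'^^rif:iri", PREFIXES))
print(expand_constant('"bks:LeRif"^^rif:iri', PREFIXES))
```

The single-quoted input is expanded to the full books IRI, while the
double-quoted one is returned as written.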

> 3) For "IRI"^^rif:iri we also allow to write <IRI>

no prob.

> 4) For symbol space IRIs (i.e. IRIs after the ^^) we only allow either 
> the unquoted prefix:ncname writing or the angle bracketed name.
> ( 5) optionally, in the presentation syntax we drop rif:iri completely 
> and ONLY allow the < > notation for iris.)

That does not make sense to me. First, there is a uniform syntax for all
constants. Shortcuts (like the above, another one for numbers, and yet
another one for strings) are fine, but mutilating the language in the name
of these shortcuts is a no-no. Second, as I said, it is desirable to have a
more general macro facility.

I think a solution lies in defining a general macro facility carefully and
in the introduction of well-chosen shortcuts. But I think the document
itself should not be muddled with all these. We should define a real
parsable language.


	--michael  


> That would bring us fairly close to Turtle:
> 
> 
> "mailto:chris"^^rif:iri
> vs
> mailto:chris^^rif:iri
> vs.
> <mailto:chris>
> vs.
> mailto:chris^^<rif:iri>
> 
> are 3 different constants where the 2nd and the 3rd are syntactic sugar.
> 
> On the whole, I suggest replacing the above paragraph with the following:
> 
> ------------------------------------------------------------------------
> For brevity, we use the compact URI notation [CURIE], 
> <tt>prefix:suffix</tt>, which should be understood as a macro that 
> expands into a concatenation of the prefix definition and suffix 
> whenever not appearing within quotes or angle brackets.
>   We use angle brackets to denote full IRIs. Note that for symbol spaces 
> only IRIs are allowed.
> 
>   Thus, if bks is a prefix that expands into http://example.com/books# 
> then bks:LeRif should be understood merely as an abbreviation for 
> "http://example.com/books#LeRif"^^rif:iri, which - in turn -
> is an abbreviation for
> 
> "http://example.com/books#LeRif"^^<http://www.w3.org/2007/rif#iri>
> 
> As syntactic sugar, we allow writing simply
> <IRI> for constants in the rif:iri symbol space, i.e. for our example we
> could write even more briefly:
> 
> <http://example.com/books#LeRif>
> 
> 
> The compact URI notation is part of the RIF presentation syntax,
> where we allow namespace prefixes to be defined following the syntax of
> [SPARQL http://www.w3.org/TR/rdf-sparql-query/#rPrefixDecl],
> i.e.
> 
> prefix bks: <http://example.com/books#>
> 
> The EBNF grammar for prefix declarations is:
> 
>   PrefixDecl ::=  'PREFIX' PN_PREFIX? ':' IRI_REF
>   IRI_REF   ::=  '<' ([^<>"{}|^`\]-[#x00-#x20])* '>'
>   PN_PREFIX  ::=  PN_CHARS_BASE ((PN_CHARS|'.')* PN_CHARS)?
>   PN_CHARS   ::=  PN_CHARS_U | '-' | [0-9] | #x00B7 | [#x0300-#x036F] |
>                   [#x203F-#x2040]
>   PN_CHARS_U ::=  PN_CHARS_BASE | '_'
>   PN_CHARS_BASE ::=  [A-Z] | [a-z] | [#x00C0-#x00D6] | [#x00D8-#x00F6] |
>                   [#x00F8-#x02FF] | [#x0370-#x037D] | [#x037F-#x1FFF] |
>                   [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] |
>                   [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] |
>                   [#x10000-#xEFFFF]
> ------------------------------------------------------------------------
> 
> 
> This proposal solves several problems we have now. I think the 
> definition is no longer recursive, and I'd hope that it is acceptable...
> 
> Axel
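
The expansion chain in the proposed paragraph above can be sketched in
Python roughly as follows. This is a simplified illustration under stated
assumptions: parse_prefix and to_full_constant are hypothetical names,
prefix names are restricted to ASCII rather than the full PN_PREFIX
ranges, and the keyword is matched case-insensitively as in SPARQL.

```python
import re

RIF_IRI = "<http://www.w3.org/2007/rif#iri>"

# Simplified PrefixDecl: ASCII-only prefix names; the full PN_PREFIX
# character ranges from the grammar are deliberately omitted.
PREFIX_DECL = re.compile(
    r"^\s*prefix\s+([A-Za-z][A-Za-z0-9_.-]*)?:\s*<([^<>\s]*)>\s*$",
    re.IGNORECASE,
)


def parse_prefix(line):
    """Parse one declaration into a (prefix, expansion) pair."""
    m = PREFIX_DECL.match(line)
    if m is None:
        raise ValueError(f"not a prefix declaration: {line!r}")
    return (m.group(1) or "", m.group(2))


def to_full_constant(token, prefixes):
    """Rewrite the shorthand forms into the fully spelled-out constant."""
    if token.startswith("<") and token.endswith(">"):
        # <IRI> is sugar for "IRI"^^rif:iri
        return f'"{token[1:-1]}"^^{RIF_IRI}'
    if token.startswith('"'):
        return token  # anything in quotes is never expanded
    head, sep, tail = token.partition(":")
    if sep and head in prefixes:
        # unquoted prefix:suffix expands to a quoted IRI constant
        return f'"{prefixes[head]}{tail}"^^{RIF_IRI}'
    raise ValueError(f"unknown prefix in {token!r}")


name, expansion = parse_prefix("prefix bks: <http://example.com/books#>")
prefixes = {name: expansion}
print(to_full_constant("bks:LeRif", prefixes))
print(to_full_constant("<http://example.com/books#LeRif>", prefixes))
```

Both calls yield the same fully expanded constant as in the paragraph,
while a double-quoted constant such as "mailto:chris"^^rif:iri would be
passed through unchanged.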
> 
> >  >
> >  >> See, for example, Example 2 in
> >  >> http://www.w3.org/2005/rules/wiki/BLD#EBNF_for_the_RIF-BLD_Rule_Language
> >  >
> >  >
> >  >
> >  >> Alternatively, we could allocate a separate symbol space to the
> >  >> builtins. In this case, they would look something like
> >  >
> >  > I did suggest separate symbol spaces some time ago in an earlier mail,
> >  > or rather, it was like that in the syntax proposals... with the addition
> >  > that - if we have a separate symbol space - why then do we need a keyword
> >  > "External"?
> >  >  If we had separate symbol spaces, I think this sufficiently designates
> >  > builtins syntactically.
> >  >
> >  >> "isLong"^^rif:predicate
> >  >>
> >  >>     --michael
> >  >>> DTB is ready for review, slightly delayed:
> >  >>>
> >  >>> http://www.w3.org/2005/rules/wiki/DTB
> >  >>>
> >  >>> At least, it should - although not officially part of the
> >  >>> published working drafts in this round - be stabilized to comply
> >  >>> mostly with the definitions in FLD and BLD concerning external schemas.
> >  >>>
> >  >>> This also completes ACTION-292
> >  >>> "Put links for each builtin to Xquery source URI"
> >  >>>
> >  >>> best,
> >  >>> Axel
> >  >
> >  >
> > 
> > --
> > Dr. Christopher A. Welty                    IBM Watson Research Center
> > +1.914.784.7055                             19 Skyline Dr.
> > cawelty@gmail.com                           Hawthorne, NY 10532
> > http://www.research.ibm.com/people/w/welty
> > 
> 
> 
> -- 
> Dr. Axel Polleres, Digital Enterprise Research Institute (DERI)
> email: axel.polleres@deri.org  url: http://www.polleres.net/
> 
> rdfs:Resource owl:differentFrom xsd:anyURI .
> 
> 

Received on Tuesday, 22 April 2008 01:01:39 UTC