Re: Action 299 - removing sorts

Michael Kifer wrote:
>> Unless I'm missing something there is nothing in that section concerning 
>> typed literal data values. There seems to be no mention of them in 
>> either the abstract syntax or the semantics, just a list of xsd types at 
>> the front.  Is that material now somewhere else or is it work in 
>> progress or am I just blind? [*]
>>
>> Similarly whilst the introduction says that IRIs are used to refer to 
>> individuals, predicates and functions the abstract syntax doesn't yet
>> support that.  [Reasonable at this stage, I'm just noting it so we don't 
>> forget it.]
> 
> 
> Dave,
> I have now updated the document to include the discussion of the abstract data
> type in the formal/concrete syntax and semantics. Will appreciate feedback.

Thanks.

I'm not completely happy with those sections yet. I'll outline my 
comments below but if you'd rather I generated some alternative draft 
text sections, to be critiqued in return, then I could do so. It just 
seemed preferable to agree the principles before getting too much into 
word-smithing.

** Summary

My reservations are:
   - treatment of rif:iri (surprise :-) )
   - request for abstract syntax modifications (relates to above)
   - need for XMLLiteral
and at a minor presentational level:
   - separation of abstract, human-readable and XML syntax discussion?
   - degree of tie in to XML Schema datatypes
   - trivial terminology quibble

** Treatment of rif:iri

I'm not convinced we should be treating rif:iri as a datatype.

Setting aside rif:iri for a moment, the set Const has subsets 
Const\sub{xsd:string} etc. and elements which are members of none of 
those subsets. In the example:

    purchase(?Buyer ?Seller book(?Author "LeRif"^^xsd:string)
                     USD("49"^^xsd:long))

then the Consts identified by 'purchase', 'book' and 'USD' are all of 
the latter category. These are simply Consts (with associated signatures 
induced by context, e.g. Const(purchase) has signature p4{} as well as i{}).

As we know from previous discussions, I would like to use IRIs as the 
lexical label for all of those Consts, whereas you would like to permit 
both simple strings and IRIs.

My preferred solution to this is to divide Const into three subsets:

ConstL  - semantically unconstrained, labelled by simple strings

ConstW  - semantically unconstrained, labelled by IRIs (w for "web")

ConstD  - typed constants, comprising the subsets Const\sub{xsd:string} 
etc. Each subset constrains I\sub{C} according to the lexical-to-value 
mapping of the associated datatype, so that each instance denotes a fixed 
semantic object.

In the abstract syntax (and thus the human and XML concrete syntaxes) 
these would be distinguished.

In terms of the semantics, ConstL and ConstW place no constraints on 
I\sub{C} just as for non-typed Consts at the moment.

The advantages of this are (a) it makes clear that IRIs are here just 
used as identifiers, (b) it makes the treatment of constants labelled by 
local-strings and those labelled by IRIs symmetric, (c) it makes it 
easier for dialects to restrict the syntax to require all Consts to be 
denoted by IRIs as might be desired in a dialect to support exchange of 
RDF rules.
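
To make the three-way split concrete, here is a minimal sketch in Python 
(illustration only; the class, field and function names are mine, not 
anything in the draft or in asn06, and the dialect check reflects my own 
reading of point (c)):

```python
from dataclasses import dataclass
from typing import Iterable, Union

XSD = "http://www.w3.org/2001/XMLSchema#"


@dataclass(frozen=True)
class ConstL:
    """Semantically unconstrained constant labelled by a simple string."""
    name: str


@dataclass(frozen=True)
class ConstW:
    """Semantically unconstrained constant labelled by an IRI."""
    iri: str


@dataclass(frozen=True)
class ConstD:
    """Typed constant: a lexical form plus the IRI of its datatype."""
    lex: str
    type: str


Const = Union[ConstL, ConstW, ConstD]


def is_semantically_constrained(c: Const) -> bool:
    # Only ConstD restricts I_C; ConstL and ConstW are free identifiers.
    return isinstance(c, ConstD)


def uses_only_iri_names(consts: Iterable[Const]) -> bool:
    # Assumed dialect restriction: no constant labelled by a local string.
    return all(not isinstance(c, ConstL) for c in consts)


example = [ConstL("purchase"), ConstL("book"), ConstL("USD"),
           ConstD("LeRif", XSD + "string"), ConstD("49", XSD + "long")]
```

With the constants of the purchase example above, only the two ConstD 
instances are semantically constrained, and the list as a whole would be 
rejected by a dialect that insists on IRI names.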

** Abstract syntax modifications

Currently the abstract syntax does not provide for datatype labelling. 
The section "Syntax for Primitive Types" demonstrates a human and XML 
concrete syntax for this, but given the WG's decision to maintain the 
asn06 abstract syntax as primary, that needs updating as well.

Given my suggestion above for the treatment of IRIs, I would suggest 
replacing:

[[[
class TERM

     subclass Const
         property name: xsd:string
]]]

by something like:

[[[
class TERM
     subclass CONST
         subclass ConstL
             property name: xsd:string [1]
         subclass ConstW
             property iri: xsd:anyURI
         subclass ConstD
             property lex: xsd:string
             property type: xsd:anyURI
]]]

In practice it would be nice to identify the subclasses by context 
rather than explicit class labels so, abusing asn06 notation, an 
alternative might be:

[[[
class TERM
     subclass Const
         subclass CONSTL
             property name: xsd:string
         subclass CONSTW
             property iri: xsd:anyURI
         subclass CONSTD
             property lex: xsd:string
             property type: xsd:anyURI
]]]

For the fully striped XML syntax this would imply:

     <Const><name>USD</name></Const>
     <Const><iri>http://example.com/ISO4217/usd</iri></Const>
     <Const><lex>42</lex>
            <type>http://www.w3.org/2001/XMLSchema#int</type></Const>

For a prettified XML syntax this might imply something like:

     <Const rif:name="USD" />
     <Const rif:iri="http://example.com/ISO4217/usd" />
     <Const xsi:type="xsd:int">42</Const>

For the human readable syntax this might be:

     USD
     <http://example.com/ISO4217/usd>
     "42"^^xsd:int or 42

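As a rough sketch of how the fully striped XML and the human readable 
forms above could both be generated from the same abstract constants 
(Python for illustration only; the function names are hypothetical, and 
the "^^<iri>" fallback for non-xsd datatypes is my assumption, borrowed 
from N3 style, not something the draft defines):

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, fields

XSD = "http://www.w3.org/2001/XMLSchema#"


@dataclass(frozen=True)
class ConstL:
    name: str


@dataclass(frozen=True)
class ConstW:
    iri: str


@dataclass(frozen=True)
class ConstD:
    lex: str
    type: str


def to_striped_xml(c) -> str:
    """One child element per property, all inside a common <Const> element."""
    elem = ET.Element("Const")
    for f in fields(c):
        ET.SubElement(elem, f.name).text = getattr(c, f.name)
    return ET.tostring(elem, encoding="unicode")


def to_human_readable(c) -> str:
    if isinstance(c, ConstL):
        return c.name
    if isinstance(c, ConstW):
        return f"<{c.iri}>"
    # ConstD: abbreviate XML Schema datatype IRIs with the xsd: prefix.
    if c.type.startswith(XSD):
        return f'"{c.lex}"^^xsd:{c.type[len(XSD):]}'
    return f'"{c.lex}"^^<{c.type}>'  # assumed fallback form


for c in (ConstL("USD"),
          ConstW("http://example.com/ISO4217/usd"),
          ConstD("42", XSD + "int")):
    print(to_striped_xml(c), "  ", to_human_readable(c))
```

The subclass is recovered from which property elements are present, which 
is the "identify by context" idea behind the second asn06 variant.
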
** XMLLiteral

To improve RDF compatibility I think we also need the datatype 
rdf:XMLLiteral to be added to the list of datatypes for this dialect 
(and the core).

** Presentational: separation of abstract, human-readable and XML syntax

Currently the sub-section "Syntax for Primitive Types" comes as part of 
the "Concrete syntax" section but it covers more than that including 
subtype relationships.

I'd be inclined to split that section up. Put the discussion on the 
value spaces and subtype relationships in the earlier section "Primitive 
data types" and put the discussion on human readable and XML concrete 
syntax in their respective sub-sections.

If you would rather keep the material together then perhaps at least 
raise it one level up the section hierarchy.

** Presentational: degree of tie in to XML Schema datatypes

The semantics section currently says:

[[[
Interpretation of primitive data types. We now explain how primitive 
data types are integrated into the semantics of the basic RIF logic.

The XML Schema Part 2: Datatypes specification defines the value space 
for each XML data type, including the data types such as xsd:decimal, 
which are of interest to RIF. The value space is different from the 
so-called lexical space. Lexical space refers to the syntax of the 
constant symbols that belong to a particular primitive data type. For 
instance, "1.2"^^xsd:decimal and "1.20"^^xsd:decimal are two different 
constants in RIF and in the lexical space of the XML data types. 
However, these two constants are interpreted by the same element of the 
value space of the xsd:decimal type.

Formally, each of the XML data types supported by RIF comes with a value 
space, denoted by D\sub{type} (for instance, D\sub{xsd:decimal}), and a 
mapping, I\sub{C}: Const\sub{type} → D\sub{type}. These value spaces and 
the corresponding mappings are defined by the XML Schema Part 2: 
Datatypes specification.
]]]

Whilst that's OK, the phrasing ties the discussion of the nature of 
datatypes specifically to XML Schema, whereas we want dialects to be able 
to define other datatypes. Even for this dialect I think we also need 
rdf:XMLLiteral.

How about phrasing it like this:

[[[
Interpretation of primitive data types. We now explain how primitive 
data types are integrated into the semantics of the basic RIF logic.

Each primitive datatype _type_ is defined by four components:
   1. a non-empty set of character strings called the lexical space of
      _type_;
   2. a non-empty set called the value space of _type_, denoted by D_type_;
   3. a mapping from the lexical space of _type_ to the value space of
      _type_, I\sub{C}: Const_type_ → D_type_;
   4. an IRI which identifies the datatype.

For the datatypes xsd:long, xsd:string, xsd:decimal, xsd:time and 
xsd:dateTime, the lexical space, value space, lexical-to-value mapping 
and identifying IRI of each are defined by the XML Schema Part 2: 
Datatypes specification [XSD].

Note that lexical space refers to the syntax of the constant symbols 
that belong to a particular primitive data type. For instance, 
"1.2"^^xsd:decimal and "1.20"^^xsd:decimal are two different constants 
in RIF and in the lexical space of the XML data types. However, these 
two constants are interpreted by the same element of the value space of 
the xsd:decimal type.

For the datatype rdf:XMLLiteral, the lexical space, value space, 
lexical-to-value mapping and identifying IRI are all defined by the RDF 
Concepts specification [XML-LITERAL].

[XSD] http://www.w3.org/TR/xmlschema-2/

[XML-LITERAL] http://www.w3.org/TR/rdf-concepts/#dfn-rdf-XMLLiteral
]]]
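
To illustrate the four-component view, here is a small sketch in Python 
(illustration only; the Datatype class and interpret function are 
hypothetical names of mine, the regular expression is a simplified 
approximation of the xsd:decimal lexical space, and Python's Decimal 
stands in for the value space):

```python
import re
from dataclasses import dataclass
from decimal import Decimal
from typing import Any, Callable


@dataclass(frozen=True)
class Datatype:
    iri: str                                 # 4. identifying IRI
    in_lexical_space: Callable[[str], bool]  # 1. lexical space (membership test)
    lexical_to_value: Callable[[str], Any]   # 3. mapping into the value space (2.)


# Simplified approximation: optional sign, digits, optional fraction.
_DECIMAL_LEX = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")

XSD_DECIMAL = Datatype(
    iri="http://www.w3.org/2001/XMLSchema#decimal",
    in_lexical_space=lambda s: _DECIMAL_LEX.fullmatch(s) is not None,
    lexical_to_value=Decimal,
)


def interpret(dt: Datatype, lex: str) -> Any:
    """Apply the lexical-to-value mapping, rejecting ill-formed lexical forms."""
    if not dt.in_lexical_space(lex):
        raise ValueError(f"{lex!r} is not in the lexical space of <{dt.iri}>")
    return dt.lexical_to_value(lex)


# Two distinct lexical forms, one element of the value space.
same_value = interpret(XSD_DECIMAL, "1.2") == interpret(XSD_DECIMAL, "1.20")
```

A dialect that adds a further datatype would supply another such 
four-part definition without touching the rest of the semantics.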

** Presentational: trivial quibble

In the subsection "Syntax for Primitive Types" it says:

[[[
The part of such symbols that occurs inside the double quotes is called 
a _literal_ of the symbol.
]]]

Could we use _lexical form_ (or something similar) instead of _literal_ 
there? That would tie in with the later semantics section and would 
avoid further overloading of the name literal.


Hope this helps somewhat.

Dave

[1] Minor footnote: for the "name" property of ConstL I've left it as an 
xsd:string. However, the current concrete human readable syntax will not 
work with unconstrained strings as const names (it implicitly expects 
names to have no spaces and no punctuation like '(' and ')'). We should 
either constrain the abstract syntax, extend the concrete syntax, or 
point out that the concrete human-readable syntax cannot carry all legal 
instance documents.

At some point we should review the human readable syntax, since it is not 
yet fully specified. However, it seems like a low priority (compared to 
the abstract syntax, semantics and concrete XML syntax), which is why 
this comment is relegated to a footnote.
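
For what it's worth, the mismatch is easy to demonstrate with a sketch 
like the following (Python, illustration only; the token rule is purely 
my assumption of what the human readable syntax implicitly expects, since 
the draft does not specify one):

```python
import re

# Assumed rule (not from the draft): a bare name starts with a letter or
# underscore and continues with letters, digits, '_' or '-'.
_BARE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_-]*")


def expressible_as_bare_name(name: str) -> bool:
    """True if a ConstL name could be written unquoted under the assumed rule."""
    return _BARE_NAME.fullmatch(name) is not None


for name in ("USD", "US dollar", "f(x)"):
    print(repr(name), expressible_as_bare_name(name))
```

Every one of those names is a legal xsd:string, but under the assumed 
rule only the first could be carried by the human readable syntax.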

-- 
Hewlett-Packard Limited
Registered Office: Cain Road, Bracknell, Berks RG12 1HN
Registered No: 690597 England

Received on Friday, 20 July 2007 15:53:35 UTC