Re: Action 299 - removing sorts from Michael Kifer on 2007-07-23 (public-rif-wg@w3.org from July 2007)

From: Michael Kifer <kifer@cs.sunysb.edu>
Date: Mon, 23 Jul 2007 05:31:45 -0400
To: Dave Reynolds <der@hplb.hpl.hp.com>
Cc: RIF WG <public-rif-wg@w3.org>
Message-ID: <25423.1185183105@cs.sunysb.edu>

Hi Dave,

Thanks for your comments. I agree with most of them, but have some questions.
See inline.

> Michael Kifer wrote:
> >> Unless I'm missing something there is nothing in that section concerning 
> >> typed literal data values. There seems to be no mention of them in 
> >> either the abstract syntax or the semantics, just a list of xsd types at 
> >> the front.  Is that material now somewhere else or is it work in 
> >> progress or am I just blind? [*]
> >>
> >> Similarly whilst the introduction says that IRIs are used to refer to 
> >> individuals, predicates and functions the abstract syntax doesn't yet
> >> support that.  [Reasonable at this stage, I'm just noting it so we don't 
> >> forget it.]
> > 
> > 
> > Dave,
> > I have now updated the document to include the discussion of the abstract data
> > type in the formal/concrete syntax and semantics. Will appreciate feedback.
> 
> Thanks.
> 
> I'm not completely happy with those sections yet. I'll outline my 
> comments below but if you'd rather I generated some alternative draft 
> text sections, to be critiqued in return, then I could do so. It just 
> seemed preferable to agree the principles before getting too much into 
> word-smithing.
> 
> ** Summary
> 
> My reservations are:
>    - treatment of rif:iri (surprise :-) )
>    - request for abstract syntax modifications (relates to above)
>    - need for XMLLiteral
> and at a minor presentational level:
>    - separation of abstract, human-readable and XML syntax discussion?
>    - degree of tie in to XML Schema datatypes
>    - trivial terminology quibble
> 
> ** Treatment of rif:iri
> 
> I'm not convinced we should be treating rif:iri as a datatype.
> 
> Setting aside rif:iri for a moment, then the set Const has subsets 
> Const\sub{xsd:string} etc and elements which are members of none of 
> those subsets. In the example:
> 
>     purchase(?Buyer ?Seller book(?Author "LeRif"^^xsd:string)
>                      USD("49"^^xsd:long))
> 
> then the Consts identified by 'purchase', 'book' and 'USD' are all of the 
> latter category. These are simply Consts (with associated signatures 
> induced by context, e.g. Const(purchase) has signature p4{} as well as i{}).
> 
> As we know from previous discussions, I would like to use IRIs as the 
> lexical label for all of those Consts, you would like to permit both 
> simple strings and IRIs.
> 
> My preferred solution to this is to divide Const into three subsets:
> 
> ConstL  - semantically unconstrained, labelled by simple strings
> 
> ConstW  - semantically unconstrained, labelled by IRIs (w for "web")

I do not understand the difference between what is now in the document
and what you are suggesting. In the document, the IRIs have a separate
lexical space and their interpretation is unconstrained. What exactly you
are proposing here is not clear to me. How will the URIs look
syntactically in the language?

> ConstD  - typed constants comprising subsets Const\sub{xsd:string} etc, 
> each subset constrains I\sub{C} according to the lexical-to-value 
> mapping of the associated datatype so that each instance denotes a fixed 
> semantic object.
> 
> In the abstract syntax (and thus the human and XML concrete syntaxes) 
> these would be distinguished.
> 
> In terms of the semantics, ConstL and ConstW place no constraints on 
> I\sub{C} just as for non-typed Consts at the moment.
> 
> The advantages of this are (a) it makes clear that IRIs are here just 
> used as identifiers, (b) it makes the treatment of constants labelled by 
> local-strings and those labelled by IRIs symmetric, (c) it makes it 
> easier for dialects to restrict the syntax to require all Consts to be 
> denoted by IRIs as might be desired in a dialect to support exchange of 
> RDF rules.
> 
> ** Abstract syntax modifications
> 
> Currently the abstract syntax does not provide for datatype labelling. 
> The section "Syntax for Primitive Types" demonstrates a human and XML 
> concrete syntax for this but given the WG's decision to maintain the 
> asn06 abstract syntax as primary then that needs updating as well.
> 
> Given my above suggestion for treatment of IRIs then I would suggest 
> replacing:
> 
> [[[
> class TERM
> 
>      subclass Const
>          property name: xsd:string
> ]]]
> 
> by something like:
> 
> [[[
> class TERM
>      subclass CONST
>          subclass ConstL
>              property name: xsd:string [1]
>          subclass ConstW
>              property iri: xsd:anyURI
>          subclass ConstD
>              property lex: xsd:string
>              property type: xsd:anyURI
> ]]]
> 
> In practice it would be nice to identify the subclasses by context 
> rather than explicit class labels so, abusing asn06 notation, an 
> alternative might be:
> 
> [[[
> class TERM
>      subclass Const
>          subclass CONSTL
>              property name: xsd:string
>          subclass CONSTW
>              property iri: xsd:anyURI
>          subclass CONSTD
>              property lex: xsd:string
>              property type: xsd:anyURI
> ]]]
> 
> For the fully striped XML syntax this would imply:
> 
>      <Const><name>USD</name></Const>
>      <Const><iri>http://example.com/ISO4217/usd</iri></Const>
>      <Const><lex>42</lex><type>http://www.w3.org/2001/XMLSchema#int
>                    </type></Const>
> 
> For a prettified XML syntax this might imply something like:
> 
>      <Const rif:name="USD" />
>      <Const rif:iri="http://example.com/ISO4217/usd" />
>      <Const xsi:type="xsd:int">42</Const>
> 
> For the human readable syntax this might be:
> 
>      USD
>      <http://example.com/ISO4217/usd>
>      "42"^^xsd:int or 42


I agree that the abstract syntax should be updated. But I prefer to deal
with BNF first, since this is more familiar and makes examples clear.
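
To make the three proposed kinds of constant concrete, here is a minimal
Python sketch. It is not part of any RIF draft: the class names mirror the
proposed ConstL/ConstW/ConstD, but the function names to_human and
to_striped_xml, and the rule for abbreviating datatype IRIs to an xsd:
prefix, are assumptions made only for illustration.

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape

XSD = "http://www.w3.org/2001/XMLSchema#"


@dataclass(frozen=True)
class ConstL:
    """Semantically unconstrained constant labelled by a simple string."""
    name: str


@dataclass(frozen=True)
class ConstW:
    """Semantically unconstrained constant labelled by an IRI."""
    iri: str


@dataclass(frozen=True)
class ConstD:
    """Typed constant: a lexical form plus the IRI of its datatype."""
    lex: str
    type: str


def to_human(const) -> str:
    """Render a constant in the human-readable forms suggested in the thread."""
    if isinstance(const, ConstL):
        return const.name
    if isinstance(const, ConstW):
        return f"<{const.iri}>"
    if isinstance(const, ConstD):
        # Assumption: XML Schema datatypes are abbreviated with an xsd: prefix;
        # any other datatype IRI is written out in angle brackets.
        if const.type.startswith(XSD):
            datatype = "xsd:" + const.type[len(XSD):]
        else:
            datatype = f"<{const.type}>"
        return f'"{const.lex}"^^{datatype}'
    raise TypeError(f"not a constant: {const!r}")


def to_striped_xml(const) -> str:
    """Render a constant in the fully striped XML form, one element per property."""
    if isinstance(const, ConstL):
        body = f"<name>{escape(const.name)}</name>"
    elif isinstance(const, ConstW):
        body = f"<iri>{escape(const.iri)}</iri>"
    elif isinstance(const, ConstD):
        body = f"<lex>{escape(const.lex)}</lex><type>{escape(const.type)}</type>"
    else:
        raise TypeError(f"not a constant: {const!r}")
    return f"<Const>{body}</Const>"


if __name__ == "__main__":
    for c in (ConstL("USD"),
              ConstW("http://example.com/ISO4217/usd"),
              ConstD("42", XSD + "int")):
        print(to_human(c), "|", to_striped_xml(c))
```

In this sketch the subclass is recovered from which properties are present,
which is the "identify the subclasses by context" reading of the second
asn06 fragment above.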


> ** XMLLiteral
> 
> To improve RDF compatibility I think we also need the datatype 
> rdf:XMLLiteral to be added to the list of datatypes for this dialect 
> (and the core).

No problem.

> ** Presentational: separation of abstract, human-readable and XML syntax
> 
> Currently the sub-section "Syntax for Primitive Types" comes as part of 
> the "Concrete syntax" section but it covers more than that including 
> subtype relationships.
> 
> I'd be inclined to split that section up. Put the discussion on the 
> value spaces and subtype relationships in the earlier section "Primitive 
> data types" and put the discussion on human readable and XML concrete 
> syntax in their respective sub-sections.
> 
> If you would rather keep the material together then perhaps at least 
> raise it one level up the section hierarchy.

OK, I'll see how to rearrange it.

 
> ** Presentational: degree of tie in to XML Schema datatypes
> 
> The semantics section currently says:
> 
> [[[
> Interpretation of primitive data types. We now explain how primitive 
> data types are integrated into the semantics of the basic RIF logic.
> 
> The XML Schema Part 2: Datatypes specification defines the value space 
> for each XML data type, including the data types such as xsd:decimal, 
> which are of interest to RIF. The value space is different from the 
> so-called lexical space. Lexical space refers to the syntax of the 
> constant symbols that belong to a particular primitive data type. For 
> instance, "1.2"^^xsd:decimal and "1.20"^^xsd:decimal are two different 
> constants in RIF and in the lexical space of the XML data types. 
> However, these two constants are interpreted by the same element of the 
> value space of the xsd:decimal type.
> 
> Formally, each of the XML data types supported by RIF comes with a value 
> space, denoted by D\sub{type} (for instance, D\sub{xsd:decimal}), and a 
> mapping, I\sub{C}: Const\sub{type} → D\sub{type}. These value spaces and 
> the corresponding mappings are defined by the XML Schema Part 2: 
> Datatypes specification.
> ]]]
> 
> Whilst that's OK the phrasing ties the discussion of the nature of 
> datatypes specifically to XML Schema whereas we want dialects to be able 
> to define other datatypes and even for this dialect I think we also need 
> rdf:XMLLiteral.
> 
> How about phrasing it like this:
> 
> [[[
> Interpretation of primitive data types. We now explain how primitive 
> data types are integrated into the semantics of the basic RIF logic.
> 
> Each primitive datatype _type_ is defined by four components:
>    1. a non-empty set of character strings called the lexical space of
>       _type_;
>    2. a non-empty set called the value space of _type_, denoted by
>       D_type_;
>    3. a mapping from the lexical space of _type_ to the value space of
>       _type_, I\sub{C}: Const_type_ → D_type_;
>    4. an IRI which identifies the datatype.


Yes, I agree that this is a better formulation. Will put it in.


> In the case of the datatypes xsd:long, xsd:string, xsd:decimal, 
> xsd:time, xsd:dateTime then the lexical space, value space, lexical to 
> value mapping and identifying IRI of each are defined by the XML Schema 
> Part 2: Datatypes specification [XSD].
> 
> Note that lexical space refers to the syntax of the constant symbols 
> that belong to a particular primitive data type. For instance, 
> "1.2"^^xsd:decimal and "1.20"^^xsd:decimal are two different constants 
> in RIF and in the lexical space of the XML data types. However, these 
> two constants are interpreted by the same element of the value space of 
> the xsd:decimal type.
> 
> In the case of the datatype rdf:XMLLiteral then the lexical space, value 
> space, lexical to value mapping and identifying IRI are all defined by 
> the RDF Concepts specification [XML-LITERAL].
> 
> [XSD] http://www.w3.org/TR/xmlschema-2/
> 
> [XML-LITERAL] http://www.w3.org/TR/rdf-concepts/#dfn-rdf-XMLLiteral
> ]]]
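
The four-component definition of a datatype can be sketched in Python as
follows. This is only an illustration under stated assumptions: the
Datatype class and its field names are hypothetical, the value space is
represented implicitly by Python's Decimal type, and the regular expression
is a simplified approximation of the xsd:decimal lexical space rather than
the normative definition.

```python
import re
from dataclasses import dataclass
from decimal import Decimal
from typing import Callable


@dataclass(frozen=True)
class Datatype:
    iri: str                                   # component 4: identifying IRI
    in_lexical_space: Callable[[str], bool]    # component 1: lexical space
    lexical_to_value: Callable[[str], object]  # component 3: mapping into the
                                               # value space (component 2)

    def value_of(self, lex: str) -> object:
        """Map a lexical form to its element of the value space."""
        if not self.in_lexical_space(lex):
            raise ValueError(f"{lex!r} is not in the lexical space of {self.iri}")
        return self.lexical_to_value(lex)


# Simplified approximation of the xsd:decimal lexical space (an assumption).
_DECIMAL_LEXICAL = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")

XSD_DECIMAL = Datatype(
    iri="http://www.w3.org/2001/XMLSchema#decimal",
    in_lexical_space=lambda s: _DECIMAL_LEXICAL.fullmatch(s) is not None,
    lexical_to_value=Decimal,
)

# Two different constants (different lexical forms, same datatype) ...
c1 = ("1.2", XSD_DECIMAL.iri)
c2 = ("1.20", XSD_DECIMAL.iri)

if __name__ == "__main__":
    print(c1 != c2)  # the symbols differ
    # ... that are interpreted by the same element of the value space.
    print(XSD_DECIMAL.value_of(c1[0]) == XSD_DECIMAL.value_of(c2[0]))
```

The point of the sketch is the separation the text draws: equality of
constants is decided on the lexical side, while the interpretation is fixed
by the datatype's lexical-to-value mapping.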
> 
> ** Presentational: trivial quibble
> 
> In the subsection "Syntax for Primitive Types" it says:
> 
> [[[
> The part of such symbols that occurs inside the double quotes is called 
> a _literal_ of the symbol.
> ]]]
> 
> could we use _lexical form_ (or something similar) instead of _literal_ 
> there? That would tie in with the later semantics section and would 
> avoid further overloading of the name literal.

OK


> Hope this helps somewhat.

Sure, thanks.


> Dave
> 
> [1] Minor footnote: for the "name" property of ConstL I've left it as an 
> xsd:string. However, the current concrete human-readable syntax will not 
> work with unconstrained strings as const names (it implicitly expects 
> names to have no spaces and no punctuation like '(' and ')'). We should 
> either constrain the abstract syntax, extend the concrete syntax, or 
> point out that the concrete human-readable syntax cannot carry all legal 
> instance documents.

Yes. Initially I was proposing that all such unclassified constants
should be enclosed within '...' (the quotes can be omitted if the content
is alphanumeric). But later '...' was taken up as a shorthand notation for
strings. I think we should take it back and use '...' for unclassified
symbols.
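
As a rough illustration of that quoting rule, a Python sketch might look
like this. The function name render_unclassified is hypothetical, "ASCII
letters and digits" is an assumed reading of "alphanumeric", and since the
thread does not say how a quote character inside a name would be escaped,
the sketch simply rejects such names.

```python
import re

_ALPHANUMERIC = re.compile(r"[A-Za-z0-9]+")


def render_unclassified(name: str) -> str:
    """Write an unclassified constant, quoting it only when necessary."""
    if _ALPHANUMERIC.fullmatch(name):
        # Alphanumeric content: the surrounding quotes can be omitted.
        return name
    if "'" in name:
        # Escaping is not addressed in the thread, so this sketch gives up.
        raise ValueError("names containing a quote character are not handled")
    return f"'{name}'"


if __name__ == "__main__":
    for n in ("USD", "US dollar", "f(x)"):
        print(render_unclassified(n))
```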

> At some point we should review the human readable syntax, it is not yet 
> fully specified. However, it seems like a low priority (compared to the 
> abstract syntax, semantics and concrete XML syntax) which is why this 
> comment is relegated to a footnote.

ok.


	--michael  


Received on Monday, 23 July 2007 09:35:37 UTC