Re: RDF M&S: General comments (1) from Eric J. Miller on 1998-11-25 (www-rdf-comments@w3.org from October to December 1998)

From: Eric J. Miller <emiller@oclc.org>
Date: Wed, 25 Nov 1998 14:40:22 -0500
To: duerst@w3.org, www-rdf-comments@w3.org
Message-Id: <365C5D25.A2B029AA@oclc.org>
Martin,

Thank you very much for your detailed analysis and overall comments
regarding the RDF Model and Syntax specification.  The following is
a response to the comments raised in your attached message.

General comment 2.2.1 #1

Good point. We have discussed this and reworded this. "If a fragment
identifier is included in the URI-reference then the resource
identifier refers only to the subcomponent of the containing resource
that is identified by the corresponding anchor id internal to that
containing resource..."  now uses "... the corresponding fragment id
internal to ...".  In the referenced [Dexter94] there is specific
discussion of the difficulty in choosing a name for the identifier
within the target that represents the subcomponent, and defines what I
mean by anchor: "The anchor value is an arbitrary value that specifies
some location, region, item, or substructure within a component."  It
would be nice to have more widely understood language; perhaps we will
get that from XLink in the future.
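
As a rough illustration of the reworded sentence, the sketch below splits
a URI-reference into the containing resource and the fragment id. It
assumes Python's standard urllib.parse module; the example.org addresses
and the helper name split_reference are made up for illustration and do
not come from the specification.

```python
from urllib.parse import urldefrag


def split_reference(uri_reference):
    """Split a URI-reference into (containing resource, fragment id).

    The fragment id is None when the reference names the whole resource.
    """
    base, fragment = urldefrag(uri_reference)
    return base, (fragment or None)


# Hypothetical references: one to a subcomponent, one to the whole resource.
print(split_reference("http://example.org/doc.html#section2"))
print(split_reference("http://example.org/doc.html"))
```

How the fragment id is then resolved inside the containing resource
depends on that resource's format, which is the point of the comment.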

General comment 2.2.1 #2

Oops, and as Martin's comment (and subsequent discussion) reminded
us, XML requires that "<" be escaped even within such strings.  The
addition of parseType="Literal" resolves the basic issues, and the
editors replaced the final sentence of the offending paragraph.
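
A minimal sketch of the escaping point, assuming Python's standard XML
modules; the element name, attribute name, and sample string are invented
for illustration. It shows that a bare "<" breaks well-formedness in
element content and in an attribute value alike, and that the escaped
forms parse in both positions.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError
from xml.sax.saxutils import escape, quoteattr


def is_well_formed(xml_text):
    """Return True if xml_text parses as a well-formed XML document."""
    try:
        minidom.parseString(xml_text)
        return True
    except ExpatError:
        return False


raw = "a < b"  # made-up literal value containing a bare "<"

# Unescaped, the value breaks well-formedness in both positions.
bad_element = "<title>" + raw + "</title>"
bad_attribute = '<item title="' + raw + '"/>'

# Escaped, both forms parse and yield the original string again.
good_element = "<title>" + escape(raw) + "</title>"
good_attribute = "<item title=" + quoteattr(raw) + "/>"

for text in (bad_element, bad_attribute, good_element, good_attribute):
    print(is_well_formed(text), text)
```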

General comment 2.2.2 -

Like the other abbreviations, this abbreviation is intended to make
RDF syntax more palatable to some constituencies.  It permits more
DTDs to be interpreted as RDF data models.

General comment 3.1 -

"Alternative requiring at least one value is problematic"
/+Do you define a distinction between the statement "no alternative" and
the statement "no value"?+/

/! This comment, combined with Jim Davis' comment above, makes me willing
to rethink whether requiring Alt collections to be non-empty is truly
necessary, or whether it is in fact a gratuitous constraint of the
core model.  Perhaps there's a combined answer here: if you want the
(weak) Alt semantics you have to accept that the collection must be
non-empty.  Otherwise, you can fall back to just Bag semantics. !/

General comment 3.2 -

A new kind of container is just a special case of a typedNode.  So, in
fact, one way to infer that new typedNodes are containers is by
observing the presence of rdf:_n properties.  In general, to
*validate* a particular data instance you will need access to the
schema.
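
The inference described above can be sketched as follows. This is only an
illustration under stated assumptions: statements are held as plain
(subject, predicate, object) tuples, property names are written in
prefixed form such as "rdf:_1", and the "ex:" resources are invented.

```python
import re

# Matches the ordinal membership properties rdf:_1, rdf:_2, ...
MEMBERSHIP = re.compile(r"^rdf:_[1-9][0-9]*$")


def container_like_nodes(triples):
    """Return the subjects that carry at least one rdf:_n property."""
    return {s for s, p, o in triples if MEMBERSHIP.match(p)}


# Hypothetical statements: ex:committee is a typed node of an
# application-defined type that also happens to have rdf:_n members.
triples = [
    ("ex:committee", "rdf:type", "ex:Committee"),
    ("ex:committee", "rdf:_1", "ex:alice"),
    ("ex:committee", "rdf:_2", "ex:bob"),
    ("ex:alice", "ex:name", "Alice"),
]

print(container_like_nodes(triples))
```

This only recognizes container-like structure in the data; validating
that ex:Committee really is declared as a container type would still
need the schema.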

General comment 3.4 -

After general discussion no additional prose has been added; the
editors could not decide what was worth saying that was not already
implicit.  It may be infeasible to enumerate all the resources that
*could* be members of the aboutEachPrefix Bag, but once you have the
Bag you can enumerate its members and then the model is no less
specific.  Just as it is not possible to enumerate all the resources
on the Web that have <DC:Creator>Ora</DC:Creator> -- you can only
iterate over a known part of the Web.
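
A small sketch of that point, assuming the prefix test is a plain string
prefix match over whatever set of resources an application already knows
about; the list of URIs and the helper name are hypothetical. It also
illustrates the remark about the trailing slash in the attached message.

```python
def members_with_prefix(known_resources, prefix):
    """Enumerate the known resources whose identifier starts with prefix."""
    return [uri for uri in known_resources if uri.startswith(prefix)]


# Hypothetical "known part of the Web".
known = [
    "http://foo.org/doc/a.html",
    "http://foo.org/doc/b.html",
    "http://foo.org/doctor",
    "http://bar.org/doc/a.html",
]

# Without the trailing slash, http://foo.org/doctor is also a member.
print(members_with_prefix(known, "http://foo.org/doc"))

# With the trailing slash, only resources under /doc/ are members.
print(members_with_prefix(known, "http://foo.org/doc/"))
```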

Editorial comments -
Our response: "We are grateful to you for the thorough reading and the
many suggestions as to how to make this document more accessible,
referenceable, and precise.  We will use as many as we can."

/+
I18N Working Group Comments:
http://www.w3.org/International/Group/1998/10/NOTE-i18n-rev-rdfms-19981023

1. Issue: Escaping of characters in literals

Response: To remove the confusion between markup and content we have
added the 'parseType="Literal"' syntax.  This eliminates the need to
escape markup.

2. Issue: Use of XML markup in literals

Response: We have added the 'parseType="Literal"' syntax to indicate
to the parser that markup in the content of the element should be
treated by RDF as part of the literal.  We look forward to future
discussions that would lead to clear semantics for declaring alternate
roles for namespaces within a document.  We do not wish to add such a
role facility in haste without discussion with other XML applications.

3. Issue: Future use of additional internationalization attributes

Response: It is true that we cannot anticipate the semantics of
future attributes whose names start with 'xml', but if we exclude them
from the model now we guarantee that no RDF-based search/query/decision
tool will be able to act on them.  Whereas, if we (implicitly) allow
such attributes to be represented as properties then we have the
possibility that some operations can be performed even though the
specific semantics are not represented in RDF. An author using RDF 1
semantics will know that such attributes are represented as properties
and can use this knowledge appropriately.  In a future version
of RDF after the semantics of xml:script or xml:bidi are defined it
would be reasonable to change their representation in the model.

4. Issue: The term literal

Response: We hope the addition of syntax to permit markup in RDF
Literals and the new glossary entry address your concern about the
connotation of the word 'literal'.

5. Issue: Normalization of character representations

 "The phrase "using a direct encoding of ISO/IEC 10646 or an encoding
  that can be mapped to ISO/IEC 10646" should be removed. It is unclear
  which encodings would be called "direct" and which would be called
  "mapped"."

Response: This wording was originally offered by Misha Wolf; see
http://lists.w3.org/Archives/Member/w3c-rdf-syntax-wg/1997OctDec/0616.
The intent as we understand it is to indicate that the application
has some flexibility about the storage encoding it uses but that
ultimately this encoding should be transformable to ISO/IEC 10646.
We have retained this sentence, believing that the details of
defining "direct" versus "mapped" should not confuse readers.

The proposed added text noting future W3C I18N WG efforts to define
string identity has been incorporated.

6. Issue: Basing on XML for basic internationalization

Response: The recommendation to remove the word 'superficially' has
been incorporated.

7. Issue: Use of xml:lang

Response: It is out of scope for the RDF work to register the
"unknown language" identifier with IANA.  It is out of scope for
the RDF Model and Syntax specification to recommend APIs for
comparing language-tagged strings.  We welcome further discussion
on the most appropriate forum in which to address these tasks.

8. Issue: Encoding of URIs

Response: We do not think that handling non-ASCII characters in URIs
is in scope for RDF to define.  We would welcome a W3C NOTE from the
I18N Working Group or the XML Syntax Working Group which could be
referenced by the RDF Specifications and by other specifications that
are also based on XML.

9. Issue: Description of equivalence in meaning

Response: We trust that implementors will read all of Section 6
wherein string identity is discussed and will further understand
the issues of evaluating identity between literals.  The intent
of the words in Section 2.1 is to provide motivation for the
subsequent presentation of the syntax-neutral data model.

10. Issue: Applicable version of ISO 10646/Unicode

Response: The wording in the [Unicode] reference is changed slightly
to accommodate the distinction we think you are making.  The link we
use for the XML specification is the "newest version" link, so any
updates to that specification, including a corrigendum, will be
incorporated by reference.

11. Issue: Preferred Alternative

Response: Alt was not intended to express ordering of preferences;
such semantics could be added to a subclass of Alt by an application
schema.  The connotation of "default value" is meant in the general
sense -- if the application (or author) has no other basis on which
to make a choice, then it may choose this value.

/! I am, however, thinking that it would be appropriate to remove the
phrase "and that is the default value for the Alternatives resource"
from the definition in the end of Section 5.  It is sufficient for
the Model that Alternatives be non-empty.  Any other semantics can
be added by application schemas. !/



> Message-Id: <199810220837.RAA13345@sh.w3.mag.keio.ac.jp>
> Date: Thu, 22 Oct 1998 17:41:37 +0900
> To: w3c-rdf-syntax-wg@w3.org
> From: "Martin J. Duerst" <duerst@w3.org>
> Subject: RDF M&S: General comments (1)
>
> Dear RDF M&S WG,
>
> This mail contains general and editorial comments to
> http://www.w3.org/1998/10/WD-rdf-syntax-19981008, which
> you sent out for last call. I'm making these comments as a member
> of the W3C team.
>
> This is the first batch of comments. I thought I would be earlier,
> but writing this up takes time. I hope I can send in the rest
> of my comments tomorrow.
>
> Comments regarding internationalization will reach you separately
> from the I18N WG.
>
> First, I have to say that I'm very impressed with your work, and that
> in particular most of the changes from the August draft to the October
> draft are very helpful.
> I haven't looked at earlier drafts, and therefore can't comment on them.
> I'm still fighting understanding some of the issues in the specification;
> this may be due to the fact that I'm not familiar with all the areas
> from which RDF draws influence. I hope that my comments help to make
> it easier for future readers to understand the specification.
>
>
> General comments
> ================
>
> 2.2.1
>
> Fragment identifiers are set equivalent to anchor ids. This is not
> appropriate. Each format may have its own fragment identifier syntax
> (the W3C is discussing requesting the MIME Content-Type registration
> form to be amended by an entry describing the appropriate fragment
> identifier syntax). For HTML, ID and Name are equivalent. For XML,
> we may soon have XLL. For audio, video,..., ID-based fragment
> identifiers are not very widely used. For resources that
> are RDF statements, it may make sense to only use preassigned IDs
> and not allow other reference mechanisms such as XLL. The reason
> for this might be that this makes it easier to find out whether
> the reference to two RDF statements refer to the same statement or
> not. If this is indeed so, then it should be expressed explicitly.
> For arbitrary resources, however, a restriction to anchor ids
> as the only case of fragment identifiers, as done in 2.2.1, is
> clearly inappropriate.
>
>
> "or writing the property using the attribute form shown in Section 2.2.2":
> It is unclear to me how this could work. The same character escaping
> mechanisms apply to element content and attribute values. See
> http://www.w3.org/TR/REC-xml#NT-AttValue. [The comments in production
> [16] and production [6.10] should therefore be removed.]
>
>
> 2.2.2
>
> The alternating of properties and types in the third abbreviation
> form seems like a clever idea, but because there is no syntactic
> distinction between properties and types, this is probably rather
> error-prone. It is also rather against how the average computer
> person would interpret things on first sight. As an alternative,
> it may be better to allow rdf:type as an attribute even on elements
> that otherwise take the element form (according to the grammar,
> this is currently not allowed), and to nest only elements.
>
>
> 3.1
>
> Alternative needs at least one value. The reason for this is
> never explained. It introduces undesired asymmetries into the
> list of collections. Cardinality constraints of all kinds may
> be desirable, but they should be introduced all together, just
> introducing a single one is questionable. For a single Alternative,
> it may seem to make sense to request that there is at least one
> choice, but if the Alternative is used in a regular fashion as
> part of some data structure, it may be extremely desirable at
> some point to say for some of the data items "sorry, no alternative".
>
>
> 3.2
>
> It is unclear how with a "mechanism to declare additional subclasses"
> production [18] can be extended. How will the grammar be able to
> distinguish between "typedNode" and "container"? We cannot assume
> that we have access to the schema. It may be possible to treat
> containers and typedNodes as one and the same thing, but then
> the grammar should be changed to reflect this equivalence, e.g.
> to allow rdf:li to stand as an abbreviation in any kind of type
> element.
>
>
> 3.4
>
> This is a new addition. Its utility is
> quite obvious. However, it should be noted that this introduces
> quantification and the impossibility to list all statements
> that are actually implied by a piece of XML. This is a very
> serious change with very serious implications, and these should
> be mentioned in the introduction and discussed here. For example,
> it is impossible to draw a graph model of this stuff without
> some background knowledge, which somehow makes the graph model
> a bit useless.
> Also, for the example, please use aboutEachPrefix="http://foo.org/doc/",
> with a trailing slash; otherwise resources such as
> http://foo.org/doctor are also included, which may be appropriate
> in some cases, but is probably not the intent of the example.
>
>
>
> Editorial comments
> ==================
>
> 1. Introduction
>
> - "only superficially addresses many": This gives the impression that RDF
>   doesn't care to deal with these issues, and so they are left uncovered.
>   This is not true; by basing on XML, XML is taking care of these issues
>   and RDF doesn't have to do it anymore. The wording should be changed
>   to make this clear; in particular the word "superficially" should not
>   appear anymore.
>
> - "KR community": The work of figuring out what an abbreviation means
>   should not be left to the reader, even if the solution is only a few
>   lines away. The abbreviation should be introduced at the place where
>   the term first appears.
>
>
> 2. Basic RDF
>
> - "ER diagram": See "KR community" above.
>
> - "equivalence in meaning": Section 6 adds some complications to
>   this; at least a forward pointer in a note is necessary.
>
> 2.1.1
>
> - Splitting a chapter into two parts by only giving the second
>   part a subtitle looks bad.
>
> - The term "URI reference" should be used from the start in
>   connection with the term "resource identifier".
>
> - I really like the Subject-Predicate-Object terminology. But saying
>   "This sentence has the following parts" creates confusion, in particular
>   for somebody with a bit of a linguistic background (or some high
>   school education :-). Neither in the example sentence (where
>   Ora is the subject, and "is" is the predicate) nor in the most
>   obvious formulation of the sentence with "create" as a predicate
>   (i.e. "Ora Lassila create(d/s) http://...") is Ora the object.
>   I guess finding a better example might be a lot of work, but at
>   least a comment that "the terms in the grammatical sense and the
>   terms as used here may or may not coincide depending on the wording
>   of the sentence" would help.
>
> - "D": These letters near the figures are rather confusing. Changing
>   it to "(Textual Description)" and putting it after the figure title
>   seems more appropriate.
>
> - "Section 4, Statements about Statements":
>   - The fragment identifier on the link is empty.
>   - There should be a paragraph break after the first sentence in this
>     paragraph, because the second sentence refers to something different
>     (not whether there is an URI in the oval, but whether everything is
>     made in one statement or not).
>   - The forward reference here, and even more the note in 2.2.2
>     ("The observant reader...") are highly confusing, because the
>     reader has to figure out all the details by herself. To give
>     problems to the reader to solve is good style in textbooks
>     (where the solutions are usually provided), but not in specifications.
>     Section 4 should explicitly discuss in which cases additional
>     statements are created and in which cases not, and refer to
>     the earlier examples.
>
> - "This specification defines two XML syntaxes": As it turns out,
>   it's many more than two. It would be better to say that first,
>   the basic syntax is defined, and then abbreviated forms.
>
> 2.2
>
> - Various sections all start their title with "basic". Whereas
>   "Basic Model" is okay for the reader, a title such as
>   "Basic Abbreviated Syntax" is confusing. Either the syntax
>   is basic, or it's abbreviated, isn't it? Either change the
>   title, or explain where the "real" abbreviated syntax will
>   be discussed (where actually is this discussed?), or both.
>
> - '"rdf" is used to represent ... "'<' NSprefix ':...'"':
>   This is confusing, as rdf only stands for "NSprefix", and not
>   for the whole thing in quotes.
>
> 2.2.1
>
> - "The interpretation of this graph is defined by the schema
>   designer.": This sentence clearly should appear in 2.1, and
>   not here (or maybe also be mentioned in 2.2.3).
>
> - The same thing applies to the whole paragraph two paragraphs
>   later. The association of a property name with a schema should
>   be an issue of the RDF model, not of the syntax. If it is not,
>   then either the description of the model is incomplete, or the
>   model itself is incomplete. It would probably be best to add
>   a figure 4 to 2.1.1, where "Creator", "Name", and "Email" are
>   expanded with their respective full URIs.
>
> - It would probably help a lot if the XML syntax examples were
>   numbered similar to figures or BNF productions.
>
> - The examples in section 2.2.1 only discuss namespaces, which are
>   not actually relevant (except for the fact that property names
>   never appear in full; they are always abbreviated using namespaces).
>   On the other hand, for example the basic syntax for figure 3 is not
>   given in 2.2.1, it is just assumed later (2.2.2) that the reader
>   somehow figured it out. I would propose to separate the namespace
>   syntax discussion in a subsection of its own. It could be between
>   the current 2.2.1 and 2.2.2, or it could be combined with or near
>   2.2.3. This would help to concentrate all the material about
>   namespaces at a single place and to make the topic explicit to
>   the user.
>
> - Given the above changes, the "concatenation rule" for namespaces
>   should also be discussed at the same place. It should be explicitly
>   mentioned that a namespace ending in "#" implies that all the things
>   defined in the namespace are described in the same resource, and an
>   ending in "/" implies that separate resources are used, that this
>   difference may be relevant because of the size of certain namespaces,
>   but that it is irrelevant to RDF.
>
> 2.2.2
>
> - It would be good if the chapter had subtitles for each abbreviation
>   form, e.g. "First abbreviation: Attributes instead of Elements".
>   The subtitles don't have to be numbered. However, it might be
>   better to reorganize the hierarchy, e.g. by making each abbreviation
>   an individual chapter (2.2.2/3/4), or by separating the basic
>   model and the syntax for the basic model in two chapters, or so.
>
> - "Allows documents obeying certain well-structured XML DTDs to be
>    directly interpreted as RDF models.": This is not exactly true,
>   in that without an <rdf:RDF> wrapper, they won't conform to the
>   syntax given in the spec, and if the wrapper is present, then
>   the statement is rather tautological.
>
> - "Here is another example of the use of the same abbreviation form":
>   After this sentence, I expected the abbreviated form, not the
>   long form. I suggest to either exchange the two forms or change
>   the introductory sentence.
>
> - In the second abbreviation, there are actually two different
>   abbreviations, which should be separated:
>   - The abbreviation that allows that descriptions can be nested
>     (if that's not considered an abbreviation, then it should
>      not be introduced in this chapter), and the abbreviation
>      that allows to change
>        <A>
>          <rdf:Description about=X Y=Z B=C .../>
>        </A>
>      to
>        <A rdf:resource=X Y=Z B=C .../>
>      [I think it would be helpful to explain this starting from
>       a description with attributes instead of a description
>       with elements, i.e. after abbreviation 1 already has been
>       applied.]
>
> - For the second abbreviation form, the word "string" is used
>   in one place instead of literal.
>
> - For the fifth code example in 2.2.2, with the structure
>   RDF-Desc-Creator-Desc-Name/Email, would it be possible
>   to move the "about" value on the inner Desc to a "resource"
>   value on the Creator? Or have them both? Why, or why not?
>   What if both are present, but different?
>
> - The third abbreviation starts with the "type" element, which
>   is mentioned as a "common case". It is very difficult for
>   the reader to understand this, because this is the first
>   time type is mentioned. Even later, although mentioned,
>   it is not all that common; I had difficulties finding an
>   example that really uses it.
>
> - For the third abbreviation form, there should be a longer
>   example to explain how to read this form (i.e. properties
>   and types are alternating).
>
>
> 3.1
>
> - Duplicate values are permitted for sequence and bag. The
>   specification should say explicitly whether they are allowed
>   or not for alternative.
>
> - "A common use of containers is as the value of a property."
>   When reading, at this point, I thought "Well, what else could
>   there possibly be?". It would make sense here to say
>   "containers can be used both as values of properties as well
>   as ...".
>
> - "An Alt container must have at least one member. This member...":
>   If there is *exactly* one member, it is possible to say "this member".
>   If there is *at least* one member, then it is not possible to
>   say "this member".
>
> - In the note starting with "The RDF Schema", the two sentences
>   are not related to each other and should therefore be made into
>   separate notes.
>
> - The first part of the note should make clear that the current
>   RDF Schema spec is under development, and that the availability
>   of other container types is not guaranteed.
>
> - The second (part of the) note should say exactly in which part of
>   Section 6 this is discussed. [It looks like it may be easier to
>   give the details here; this may also make it clear that the
>   explicit properties _1, _2, _3,... cannot be used as elements,
>   but can be used as attributes (did I get this right?).]
>
> - 3.2.1: Again a single subsection in a section. It's probably better
>   to make this "3.3 Syntax Examples".
>
>
> 3.3
>
> - "to what objects is the statement is referring": Superfluous "is".
>
> - "or is the statement describing the members of the container?"
>   -> "or is the statement describing each of the members of the container?"
>
>
> 3.5
>
> - Please mention here that the attribute shorthand cannot be used
>   if there are more than two properties with the same name.
>
>
> Regards,   Martin.
>
>
Received on Wednesday, 25 November 1998 14:41:39 UTC