Re: Comments on http://www.w3.org/TR/2001/WD-CCPP-struct-vocab-20010315/

At 10:38 PM 3/18/01 -0600, Aaron Swartz wrote:
>First, I want to let you know you have a really nice protocol here. I'd also
>like to congratulate you on choosing RDF for your spec, I think it is a
>great fit. Oh, one more thing: thanks for inviting SWAG[2] to comment on
>your draft.
>
>Here are some comments/corrections to your Last Call Working Draft[1]:
>
> > other xml document types.
>
>I believe XML should be capitalized here.

OK.

> > this document is created from merge of
>
>I believe you mean "this document is created from the merge of" (notice the
>added "the").

OK.  Actually, I think this will probably be removed from the final version.

> > HTML <alt> tags
>
>There is no HTML <alt> tag (that I know of). I believe you mean the alt
>attribute.

I think you're right.

> > any new attribute vocabularies defined MUST conform to the RDF schema in
> > appendices B and C.
>
>I don't believe RDF schema supplies any definition of conformance, so you
>will have to explain what you mean by conformance here. How do I make a
>vocabulary that conforms to your schema?

Good catch!  I think it should say:

   CC/PP applications are not required to support features described in the
   appendices, but any new attribute vocabularies defined MUST be based on
   RDF classes and properties defined by the RDF schema in appendix B (new
   CC/PP attributes sub-properties of ccpp:Attribute, new client components
   based on ccpp:Component, etc.).

> > Section 2 provides [...]
>
>It would be nice if these section references were links.

OK.

> > The term "CC/PP attribute" [...]
>
>It may be a little too late to change this, but the term attribute conflicts
>with the XML syntactical element of the same name. FWIW, Dublin Core and
>SWAG use the term "term" to refer to properties and classes, etc. Since
>attributes appear to be just RDF properties, you may just want to call them
>properties also.

That's why it's always qualified as "CC/PP attribute".

(There are only so many English words with this general meaning, and they 
always seem to get overloaded in technical specifications.  In this case, I 
think that "CC/PP attribute" is more descriptive than something vague like 
"term" -- it means an attribute or feature of the client.)

> > another RDF resource names 'Object-resource'.
>
>I believe you mean "named" not "names".

OK.

>You also use rdf:about attributes with relative URIs that reference other
>documents. (like rdf:about="xxx") I believe this is a clear mistake. These
>should be replaced with either rdf:ID or rdf:about="#xxx".

I think this is one of the difficulties of documenting an RDF-based 
format.  If the examples are constructed as perfectly usable RDF I believe 
they become difficult to read.  So I was trying to use a name to stand in 
for a full URI, which I felt to be easier to read;  also it corresponds 
directly with the "graph" example it purports to represent.

Also, I think using the '#fragment' syntax is confusing for a different 
reason:  when used in a protocol element (as is the intent for CC/PP), it 
is not obviously contained within a document having a base URI against 
which to resolve the relative form.  Typically these would be full absolute 
URIs.
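
To make the base-URI point concrete, here is a minimal Python sketch using the
standard library's urllib.parse.urljoin.  The base URI and the name "xxx" are
placeholders assumed for illustration; neither comes from the draft.

```python
from urllib.parse import urljoin

# Hypothetical base URI of a profile document (an assumption for illustration).
BASE = "http://example.org/profiles/device"

def resolve(base, reference):
    """Resolve an rdf:about value against a base URI."""
    return urljoin(base, reference)

print(resolve(BASE, "#xxx"))  # fragment inside the base document
print(resolve(BASE, "xxx"))   # a sibling document next to the base
print(resolve("", "#xxx"))    # no base available: the reference stays relative
```

With no base URI to hand, as in a bare protocol element, the last call has
nothing to resolve against, which is the concern described above.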

(There's also a note from Ralph Swick about the fact that it's not always 
obvious how fragment addressing should work -- something I hope we'll clear 
up in the RDF review WG.)

Maybe I should italicize these names, to signal that they're not straight URIs?

>In Figure 2-1b, you omit the rdf: prefix on the about attribute of the third
>ccpp:component.

OK.

>In Figure 2-2b (and other examples), you show an rdf:type property with a
>value of "BrowserUA" (and other similar values). This seems like a mistake,
>since such a value would mean that the type of the property would change
>every time the document was parsed by an RDF parser. Surely this is not what
>you want. It seems like it would be more effective if you used the typedElt
>syntax, like:
>
><ccpp:component>
>   <BrowserUA rdf:about="#xxx">
>      <!-- ... -->
>   </BrowserUA>
></ccpp:component>
>
>Of course, you provide no default namespace, so all of your unprefixed
>attributes have no way to live. I'm not quite sure what happens when you
>don't provide this default namespace, but I do know it's not a good idea.

This is a similar situation to the one above.  These are placeholders for
URIs rather than actual URIs.

I've added a subsection under document conventions explaining that names 
are used in examples as placeholders for full URIs.

>For section 2.1.3 you again use unprefixed references and do not explain how
>a client would find these defaults and connect them to the references in the
>profile. ... Oh, wait. You explain this later. Perhaps this explanation
>should come sooner.

Well, _an_ explanation should come sooner, just not that one.  See above.

>In section 2.1.4, you describe how proxies can provide their own profiles,
>but do not explain how this should be integrated with the client profile.

I've placed a forward reference to the more detailed explanation in sect 3.2.

(Nit: it's not the place of this spec to describe how to integrate the 
profiles;  rather it needs to explain the meaning of the combined profile.)

>Furthermore, the example shows the proxy passing along its own OS (Linux)
>and other information that seems irrelevant to the server. Is this a mistake
>or is more explanation needed to make this use case clearer?

Well, I agree the particular use case isn't obvious, but it's not really 
important.  Most of the examples were based on existing UAPROF 
vocabulary.  Maybe we can find something more convincing.

(In some cases, the server might need to know about a proxy platform that 
processes the data for similar reasons that it might need to know about the 
client platform;  in a perfect world, that information would all be 
irrelevant.)

> > refer to the RDF Schema specification [4].
>
>It seems that some references are linked, but some (like to [4] in section
>2.3.1 as quoted above) are not. Is this a normative/non-normative reference
>distinction? If so, it should be clearer.

No significance.  Just overlooked in the editing.  (Sometimes, I find the 
HTML tools available for this kind of document editing are really primitive.)

> > see Section 6.2.1.
>
>In the same section, this reference is also unlinked.

That whole sentence was a dangling reference.  Deleted.

>Figure 2-12 is a broken image (404).

Hmmm... it worked for me.

> > RDF Model and Syntax specification [3] defines two ways to name RDF
> > resources, namely "ID" and "about". RDF resources named by "about" are
> > fully identified, whereas those named by "ID" can not be referenced from
> > outside the containing document, unless some additional information is
> > available that allows the full (absolute) base URI to be determined. The
> > RDF specification is not currently clear about how a base URI should be
> > determined [34].
>
>Actually, I think you misunderstood Ralph's letter. RDF clearly defines how
>to determine the absolute URI, and how a base URI should be determined is
>clear. The problem is that there is no mapping from a fragment identifier to
>the ID attribute in RDF. However, most people assume this mapping, although
>it is not officially specified. Especially since the spec refers to the
>value of ID attributes using fragment identifiers itself! So, while you may
>continue to require that the about attribute is used, please correct your
>statements about the RDF spec.

I quote from Ralph's note:

[[[
In fact, the RDF Model and Syntax specification does not tell you
how to construct a full URI for the resource named by an ID or bagID.
The resources are addressable at best only locally within the
same RDF/XML expression.
]]]

He goes on to say:

[[[
...  I admit that this is likely to be a surprising
conclusion to some.
]]]

I think this will need to be resolved in the working group, if at all.


> > the namespace identifier <http://www.w3.org/2000/07/04-ccpp-proxy#>.
>
>You don't really need to wrap that in angle brackets if it's surrounded by
>tags. It'll probably just confuse people.

OK.

>With the proxy chaining described in 3.2.1, how would an implementation find
>the outmost layer in the chain? How should implementations deal with
>nextProxy cycles, dead-ends etc. Perhaps you could provide an algorithm to
>make this easier.

The first in the chain is what you are given (either directly or by URI 
reference).

The last is detected either by its rdf:type (i.e. NOT a "Request-profile"), 
or by the fact that it has no rdf:nextProfile property.

I think there's a slippery slope here:  the purpose of this spec is to say 
what the format is and what it means;  not how to implement it.  There is a 
plan for a companion implementers guide:  I think that would be a more 
appropriate place for the kind of advice you mention.

Possibly, we should say that a proxy chain that loops or does not end with 
a client profile is an error.
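
As a rough illustration only (not text for the spec, and not the working
group's algorithm), the chain walk described above might be sketched in
Python as follows.  The dictionary layout, the key names "type" and
"nextProfile", and the type name "Client-profile" are assumptions made for
this example.

```python
REQUEST_PROFILE = "Request-profile"  # placeholder type name, as in the examples

def find_last_profile(first, profiles):
    """Follow nextProfile links from `first` and return the last profile's name.

    `profiles` maps a profile name to a dict holding a "type" entry and an
    optional "nextProfile" entry.  A loop or a dangling reference raises
    ValueError.
    """
    seen = set()
    current = first
    while True:
        if current in seen:
            raise ValueError(f"proxy chain loops back to {current!r}")
        seen.add(current)
        node = profiles.get(current)
        if node is None:
            raise ValueError(f"dangling reference to {current!r}")
        next_name = node.get("nextProfile")
        # The chain ends at anything that is not a request profile,
        # or at a profile with no nextProfile link.
        if node.get("type") != REQUEST_PROFILE or next_name is None:
            return current
        current = next_name

chain = {
    "req2": {"type": REQUEST_PROFILE, "nextProfile": "req1"},
    "req1": {"type": REQUEST_PROFILE, "nextProfile": "client"},
    "client": {"type": "Client-profile"},
}
print(find_last_profile("req2", chain))
```

A stricter variant could also reject a chain whose last element is still a
request profile, if the spec ends up calling that an error.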

>Also, why do you make a separation between request profiles and proxy
>profiles? It seems this just adds unnecessary bulk. It would make more sense
>just to have chains of proxyProfiles pointing down towards the client.

We went round this in the working group...  chaining the proxy profiles 
directly (remembering that the same profile may, in principle, describe 
several different proxies) runs the risk of aliasing causing a loop or 
indeterminacy in the proxy chain.

This structure also helps to separate the dynamic, per-request part of the 
profile structure from the static description of a proxy (though, as we 
found, that benefit is slightly illusory).

>In Figure 3-13b:
> > <rdf:li>text/xml</rdf:li>
>
>I remember some discussion about this, so maybe it's been decided, but why
>don't you just use the content-type URIs and stick them in the schemas
>section. It seems much simpler. The official URIs are:
>     ftp://ftp.isi.edu/in-notes/iana/assignments/media-types/
>as defined in ftp://ftp.isi.edu/in-notes/rfc2048.txt
>
>(I keep track of this at: http://logicerror.com/contentType )

Ah, this can-of-worms ;-)

(FYI:  the main MIME technical specifications are RFC2045 and 2046.  2047 
covers some extensions, and 2048 is the registration procedure for MIME 
content-types.)

As things stand, I do not believe that the assignment of URIs and hosting 
of these specs by ISI has the necessary expectation of stability to use 
these things as protocol elements.  However, there is a work in progress 
(or two) in the IETF to create a stable URI for these and other 
IANA-registered values.  So for now I think we're stuck with the 
text.  Later, I'd hope we can define the appropriate URI equivalences for 
this property.

> > <rdf:li>http://example.org/example/XHTML-1.0</rdf:li>
>
>Are you sure you don't mean:
>
><rdf:li rdf:resource="http://example.org/example/XHTML-1.0" />
>
>These mean two totally different things. One is a string of characters, the
>other is a URI. I'm pretty sure you mean to use a URI here.

You may be right, but there was an issue I picked up that suggested that 
some URIs should be treated as literals rather than resources.  I'm not 
sure if it applies here.

I think this will need to be picked up later and revisited in the WG, or at 
least separately.
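
For reference while this is open, the two forms can be told apart
mechanically.  The following Python sketch uses only the standard library's
xml.etree.ElementTree; the rdf:Bag wrapper and the function name are
assumptions for illustration, and it looks only at the XML syntax rather than
doing real RDF parsing.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

SAMPLE = """<rdf:Bag xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:li>http://example.org/example/XHTML-1.0</rdf:li>
  <rdf:li rdf:resource="http://example.org/example/XHTML-1.0"/>
</rdf:Bag>"""

def li_values(xml_text):
    """Return (kind, value) pairs, where kind is 'literal' or 'resource'."""
    root = ET.fromstring(xml_text)
    values = []
    for li in root.iter(f"{{{RDF_NS}}}li"):
        resource = li.get(f"{{{RDF_NS}}}resource")
        if resource is not None:
            values.append(("resource", resource))
        else:
            values.append(("literal", (li.text or "").strip()))
    return values

for kind, value in li_values(SAMPLE):
    print(kind, value)
```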


>With an example, like:
> >           +--type---------> { "text/xml", "application/xml"}
> >           +--type---------> { "text/html", "application/html"}
> >           +--schema-------> { "http://example.org/example/XHTML-1.0" }
> >           +--uaprof:HTMLVersion--> { "3.2", "4.0" }
>
>I'm not clear about the semantics of a container here.
>     - Why are there two separate lists of types?

I think that's a hang-over from an earlier version of the spec, and should 
be just one.

>     - Are these terms anded or ored together (i.e. text/xml [and|or] HTML
>3.2)?

The collection represents a SET of attribute values that are accepted by 
the client (section 4.1.2.1).  So, yes, they are 'or'ed together.

>     - Does the support of application/xml mean it can read any XML? How is
>it to be interpreted?

This feature is described (appendix C) in terms of HTTP Accept.  In the 
absence of any other constraining information (e.g. schema) I would say it 
means that _any_ application/xml MIME type can be accepted.

(NOTE:  in general, I don't think a finite profile can ever provide a 100% 
accurate description of what data is acceptable;  I say I can accept HTML, 
but memory or other limitations may actually mean I cannot actually render 
any HTML document.  Etc.  So the goal here is to try and cover the 
important cases.  I think the combination of XML + schema identifier plus 
some common device constraint parameters is a fair start in this 
direction.  The framework is extensible precisely so it can be adapted to 
meet real needs.)
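
The set semantics described above (the values listed for one attribute are
alternatives) can be sketched in a few lines of Python.  The dictionary
representation and the function name are assumptions for illustration; the
values are those from the quoted example, with the two type lists merged
into one.

```python
# One set of accepted values per CC/PP attribute.
profile = {
    "type": {"text/xml", "application/xml", "text/html", "application/html"},
    "schema": {"http://example.org/example/XHTML-1.0"},
    "uaprof:HTMLVersion": {"3.2", "4.0"},
}

def accepts(profile, attribute, value):
    """True if `value` is one of the alternatives listed for `attribute`."""
    return value in profile.get(attribute, set())

print(accepts(profile, "type", "text/html"))
print(accepts(profile, "uaprof:HTMLVersion", "2.0"))
```

This says nothing about how values of different attributes combine; that is
left to the origin server's knowledge of the attribute semantics.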

>In 4.1.1.1:
> > A URI is represented as a text string, but is subject to comparison
> > rules set out in RFC 2396 [28], which may require 'absolutization' of
> > the URI as described there.
>
>You should be careful there, RDF recognizes URIs as something special, not
>just strings.

This is the same issue as before... I'll pick this up separately.


>WRT 4.1.1: How come there's no decimal? It'd be really nice to say
>HTMLVersion > 2.1 or some such.

That kind of consideration introduces a whole new layer of complexity, so 
we stopped just short of that.  The group's position is that the origin 
server can interpret the value according to its knowledge of the value 
semantics.  The data types here are placing a stake in the ground for
possible future development of the kind of expressive capability you describe.

(BTW, I took this line for the IETF content negotiation work I did:  see 
RFC 2533.)

>In A.1 the term Anonymization isn't bolded like the rest.

I think I've fixed that.

> > Some communication process that provides definite and tamper-proof
> > information about the identity of a communicating party.
>
>Why is there a line break there?

Dumb editing.  Fixed.

> > This term has been the subject of much dispute. Broadly speaking, it is
> > a process that prevents a party to a communication from subsequently
> > denying that the communication took place, or from denying
>
>more line breaks...

more dumb editing...

> > accessed, the parties with
> > whom communication occurs, etc.
>
>yet another!

Yup.

> > rdfs:Literal
> >   ccpp:URI                  {A URI value of a CC/PP attribute}
>
>A URI is not a literal, but a resource!

Actually, a URI is not a resource.  It identifies a resource.

It is quite legitimate to define a literal value that happens to have URI 
syntax, which is what is happening here.

Whether or not this is sensible is open to debate, and I'll pick this up 
separately.

>With Figure B-3, the RDF spec recommends not using entities as you do for
>your namespaces since they might be removed in a future version of XML.

I've never spotted any such recommendation.  Can you give an exact 
citation, please?

>Also, why do you declare your own ccpp:Resource? It doesn't seem you get any
>benefit from that, but simply pollute the namespaces.

The intent is to allow CC/PP profile structure to be distinguishable (by a 
schema-aware RDF processor) from other RDF it might be embedded in, or 
embedded in it.  For example, if future CC/PP attributes have arbitrary 
resource values, this may be needed to determine where the CC/PP profile 
ends and the attribute value begins.

I'm not sure what you mean by "pollute the namespace".

>Also, you may not want to use rdf:ID in these schemas, unless you specify
>the base URI since you'll effectively be defining all these terms with the
>namespace of the specification itself!

Er, no:  they'll be under the URI of the schema document, which is the 
intent here.

> >       A proxy profile has an arbitrary number of ccpp:proxy-behavior
> >       properties, each of which indicates an individual
> >       ccpp:Proxy-behavior value.
>
>Actually, there is no proxy-behavior property. I believe you mean
>proxyBehavior. BTW, why the inconsistent naming of that one property? --
>you're right to be confused.

You're right, it should be proxyBehavior.  I thought it was quite 
consistent:  initial lowercase with "interCaps" for properties;  initial 
capital with hyphens for class names.

(The capitalization convention is as recommended by RDFM&S, appendix C.1)
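
A quick way to check names against this convention is a pair of regular
expressions.  This Python sketch is only one reading of the convention
(initial lowercase with interCaps for properties, initial capital with
hyphens for classes); the patterns and function name are assumptions, not
part of the spec.

```python
import re

# Properties: initial lowercase, interCaps, no hyphens (e.g. proxyBehavior).
PROPERTY_RE = re.compile(r"^[a-z][a-z0-9]*(?:[A-Z][a-z0-9]*)*$")
# Classes: initial capital, hyphen-separated words (e.g. Proxy-behavior).
CLASS_RE = re.compile(r"^[A-Z][a-z0-9]*(?:-[A-Za-z][a-z0-9]*)*$")

def classify_name(local_name):
    """Return 'property', 'class', or 'other' for a vocabulary local name."""
    if PROPERTY_RE.match(local_name):
        return "property"
    if CLASS_RE.match(local_name):
        return "class"
    return "other"

for name in ("proxyBehavior", "Proxy-behavior", "proxy-behavior"):
    print(name, classify_name(name))
```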

> >       When this type is
> >       used, the value of the CC/PP attribute is the URI rather than the
> >       resource identified by the URI.
>
>I'm a little confused -- when would you ever use this? I can't expect that
>you'll need to talk about URIs.

Same old issue... I'll take this separately.

> >       This class is used to represent any CC/PP attribute value that
> >       is arbitrary text.
>
>How is this different from literals?

Of course, all literals are text.  The distinction here is that some are
intended to be used to represent particular kinds of value, and as such
are expected to conform to a constrained syntax.

Arbitrary text has no such constraints.

>You may also want to declare that the Defaults property is deprecated, or
>some such.

I think not:  the defaults property is fundamental to the design of CC/PP.


> > This is one of three properties to describe a proxy behavior.
>
>This is unnecessary, and sort of limits extensibility. You can keep it in
>the text, but you don't need it in the schema.

I don't agree.  And why do you think it limits extensibility?


> >       If this property is present, the behavior associated with the
> >       corresponding ccpp:Proxy-behavior resource is applied only if
> >       the outbound request profile indicates capabilities that match
> >       all those of the Component that is the object of this property.
>
>This is a bit confusing. Isn't a request profile inbound, not outbound?
>Second, if this is true, why even bother to specify applicability. The proxy
>should just look at the profile and provide the information that's
>appropriate. I think you really mean something different (i.e. that for this
>type of data, this is what's done) and this should be specified.

The terms are (intended to be) used in the sense they are defined for HTTP 
(as indicated here in section 1.3.1).  The "outbound" request profile means 
the profile arriving from the proxy or client on the outbound side of the 
receiving proxy -- the direction of travel is inbound as you say.  How 
about this:

  ....  This is used to
  add proxy behavior descriptions to a request profile received
  from a proxy or client system on the outbound side.

etc.?

I've made a series of changes following this pattern, replacing 
"outbound..." with something like
"...from the outbound side".

> > URIs and optional fragment identifiers
>
>According to the URI spec, a "URI reference" includes a fragment identifier,
>so I don't think you need to be so explicit about this.

Ah, but... the intent was to avoid the namespace problem:  I say URI, not 
URI reference, with optional fragment identifier.  The intent was to avoid
relative URIs, which are also allowed by URI reference.

Maybe I should be more direct and say "no relative URIs".
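
A "no relative URIs" rule is easy to test for.  As a minimal Python sketch,
assuming that "absolute" simply means "has a scheme" (the function name is
made up for this example):

```python
from urllib.parse import urlsplit

def is_absolute_uri(text):
    """True if `text` has a scheme, i.e. is not a relative URI reference."""
    return bool(urlsplit(text).scheme)

print(is_absolute_uri("http://www.w3.org/2000/07/04-ccpp-proxy#"))  # has a scheme
print(is_absolute_uri("#xxx"))        # fragment-only reference
print(is_absolute_uri("../schema"))   # relative path
```

An absolute URI may still carry a fragment identifier, so this matches the
intended "URI with optional fragment identifier" wording.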


> > All properties used as CC/PP attributes must be instances of the class
> > ccpp:Attribute, which itself is a subclass of rdf:Property.
>
>How should this be defined? Should the subclass declaration be included in
>every CC/PP request? Should it be at the namespace? Should it be mailed in
>to the W3C? Please elaborate. I wouldn't complain, but this is a MUST
>requirement, which it seems is effectively useless. ("Yeah, it's a
>subclass." "Where is that defined?" "Well, I wrote it down on this stickie
>note, you see!") Also note that some clarifications of this could prevent
>the use of terminology created for another purpose (and thus, wasn't
>specified as a subClassOf ccpp:Attribute) which would be a bad thing.

The intent was that it is defined in the schema.  I'll make that 
clearer.  (Note this is in a section providing guidance for the definition 
of new vocabulary.)

Here's the proposed text:

   (That is, the schema defining CC/PP attribute properties should define
   them as instances of <tt>ccpp:Attribute</tt>.  Thus, a schema-aware
   processor can distinguish between properties that are part of a CC/PP
   profile, and properties which may be part of an attribute value.)


> > NOTE: the proxy vocabulary described later [...]
>
>Actually, I believe it was defined "above".

So it was!


> >[...] of attribute names in a profile.NOTE: if there a [...]
>
>You're missing a paragraph break and a capital at the beginning of the
>second sentence.

OK.

> > We recommend [interCap style] be used for CC/PP attribute names
>
>Then how come all your attributes are hyphenated?

I think it's the *class names* that are hyphenated.

> > An attribute defined very broadly might be subject to different privacy or
> > security concerns when applied in different circumstances. For example,
> > having a text-to-voice capability on a mobile phone type of device might
> > be a generally useful feature, but a similar feature in a PC might be
> > indicative of a personal disability.
>
>This doesn't make sense. It seems if anything, a specific attribute would be
>more of a privacy concern. supportsScreenReader is a disability giveaway,
>where as the broader textToSpeech is less revealing. Having well-defined
>attributes is good practice, but the reasons provided should be sound.

Hmmm... yes, sort of.  The intent here was that a broadly defined attribute
was more likely to be widely used and subject to different privacy concerns
under different circumstances, and therefore more likely to be used in a way
that makes an unintended disclosure when used in conjunction with other
features.

I've added this:

   Thus a combination of text-to-voice capability and using a PC-type
   platform might indicate private information not necessarily associated with
   any of the features in isolation.

>In Appendix E, you mention a large number of formats without providing URIs,
>or really any information applicable to CC/PP. Why? If there's no good
>reason, this section should be removed. Even so, you may want to say it's
>non-normative.

It is flagged as non-normative (but up at the front of the document).  I've 
added this to the introductory paragraph.

   It is not normative, and is included to give
   an idea of some kinds of client feature that CC/PP might be used to convey.

>Also, why don't you talk about CCPP in HTTP, or at least point to the note?

The main reason is that protocol was clearly out-of-scope of the charter 
for this work.

A secondary reason, for me, is that I believe the current CC/PP-in-HTTP 
proposal is seriously flawed.  My comments are on record on the CCPPEXS 
discussion list.


>The use of a special textual syntax makes the RDF graph more clear, but it
>might be better to use an established format like Notation3 (which has
>software to convert it to RDF XML) rather than inventing yet another syntax.
>It may also be smart to eliminate the textual version in some examples, and
>just go with RDF document fragments (not full documents).

If N3 had existed when we started this stuff, I surely would have preferred 
it instead of inventing a new format.  As it happens, I don't believe the 
current N3 documentation is up to strength for use in a normative 
document.  I hope that will change.

>The use of examples is nice, but there is no need to illustrate every point
>with both a textual and XML representation. It quickly gets repetitive. I
>suggest that you explain as clearly as you can in prose and possibly refer
>to an appendix with more complicated examples that demonstrate many
>features.

Personally, I tend to agree, but the people who are going to develop this 
stuff were quite clear that they wanted full examples all the way.  If N3 
gains currency, that could change affairs because it can be run through an 
XML/RDF generator.

I suppose that, in a sense, we're guinea-pigs on this, as it's the first 
recommendation-track specification to be "pure" RDF.  Based on my 
experience, I think it would be great if N3 went to recommendation track so 
we could use it for normative specification work.

>Furthermore, the seemingly arbitrary separation of "architecture" and
>"structure" makes the spec confusing and hard to follow, as well as very
>repetitive. Actually, it might be best to rethink the whole structure of the
>document. You should probably start with some simple RDF explanations, then
>demonstrate extensibility, and _then_ demonstrate the specific feature of
>CC/PP. I feel this would significantly simplify and shorten the document
>while making it more understandable.

This is a tricky one.  I agree that there is overlap here, and the 
organization doesn't work as well as it might.  But I also feel quite 
strongly that there was too much material here to take at one bite;  I'm a 
strong believer that a relatively broad architectural overview followed by 
detailed specification is usually more approachable.  Maybe coming at it 
fresh will suggest some restructuring options.

>Phew, all done. Does this mean I get to go in the acknowledgements section?
>;-)

Well, I think definitely so.

Thanks for the effort you've put in.

#g

Received on Tuesday, 3 April 2001 16:02:56 UTC