Re: ISSUE-81 (resource vs representation)

On Mon, 28 Sep 2009, Nikunj R. Mehta wrote:
> 
> Here are a few examples of the confusion caused by fungible use of the 
> terms resource, file, and object/document.
> 
> A file cannot have infinite length - a resource can -
> http://dev.w3.org/html5/spec/Overview.html#concept-fetch-total

A file can have infinite length; consider /dev/urandom, for instance.


> A resource cannot be both a bag of bits and be an active object, e.g., make a
> call or request another resource -
> http://dev.w3.org/html5/spec/Overview.html#fetching-resources

Could you quote the exact text you think is problematic here?


> A resource is a bag of bits, but fetching a resource brings back a set of
> "out-of-band" headers as well. How does this make sense? Where does the "bag"
> start or end? What exactly is "out-of-band"?

It makes sense in the same way that a file in a filesystem has out of band 
information, forked streams, ACLs, a filename, etc.
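
As an illustrative sketch only (not spec text), the analogy can be modelled in Python. The `Resource` class and its field names are hypothetical; the point is that the body is the "bag of bits" and the headers travel beside it without being part of it.

```python
from dataclasses import dataclass, field

@dataclass
class Resource:
    """Hypothetical model: the bits plus the metadata that travels beside them."""
    body: bytes                                   # the "bag of bits"
    headers: dict = field(default_factory=dict)   # out-of-band metadata, e.g. HTTP headers

page = Resource(body=b"<!DOCTYPE html><title>x</title>",
                headers={"Content-Type": "text/html"})

# The headers describe the body but are not contained in it.
headers_are_out_of_band = b"Content-Type" not in page.body
```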


> A resource may not be external - e.g., a javascript: URL? What bits does 
> an "internal" resource imply?

Depends on the scheme, but for example a javascript: URL, when 
dereferenced, returns a Unicode string as its resource.


> A resource may be a sub resource. However, what exactly is one - does it have
> a URI without a fragment identifier?

Not sure what you mean here.


> Can I tell based on looking only at the URI whether something is a 
> subresource?

No; in fact a resource could be a resource in its own right as well as 
being a subresource, in the way I've used the term in the HTML5 spec.


> If not, which parts of an HTML user agent would be affected by the 
> semantics of subresources?

I don't understand the question.


> Each entry in the list of pending master entries of an application cache
> consists of a resource and a Document object. What is the relation between the
> resource, the master resource, and the resource's Document?

The resource is the master resource in this context; the resource's 
Document is the one that was parsed from the resource by the parser.


> Multiple application caches can contain the same resource. Does this 
> mean they each have their own bags of bits or just the URL of that 
> resource?

They each have their own copy of the data associated with the given URL. 
(Each might have a different resource for each URL, in fact.)
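
A minimal sketch of that situation, assuming a hypothetical model in which each application cache is simply a mapping from URL to stored bytes (the URL and contents below are made up for illustration):

```python
# Hypothetical model: each application cache maps a URL to its own stored copy.
url = "http://example.com/logo.png"
cache_a = {url: b"bytes captured when cache A was updated"}
cache_b = {url: b"bytes captured when cache B was updated"}

# Both caches hold an entry for the same URL...
same_url = url in cache_a and url in cache_b
# ...but nothing forces the stored data to be identical.
same_data = cache_a[url] == cache_b[url]
```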


> When selecting an application cache, what is it that user agents look 
> for: a bag of bits or the URI of the resource being searched?

Not sure what you're asking here.


> When I think of cookies, they are per host and path, and not per resource.
> What then is the meaning of a resource's cookies as suggested in 3.1.3?

A resource's cookies are the cookies that will be sent when requesting 
that resource using its URL.
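
This reconciles the two views: cookies are stored per host and path, and a resource's cookies are whichever of those match its URL. The sketch below is a simplified illustration, not the spec's algorithm; `cookies_for` and the tuple layout of `jar` are assumptions, and it ignores expiry, the secure flag, and other real-world rules.

```python
from urllib.parse import urlsplit

def cookies_for(url, jar):
    """Return names of cookies in `jar` that would accompany a request for `url`.

    `jar` is a list of (name, domain, path) tuples; matching is simplified.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    path = parts.path or "/"
    selected = []
    for name, domain, cookie_path in jar:
        # Host matches exactly or is a subdomain of the cookie's domain.
        host_ok = host == domain or host.endswith("." + domain)
        # Path matches exactly or lies beneath the cookie's path.
        prefix = cookie_path if cookie_path.endswith("/") else cookie_path + "/"
        path_ok = path == cookie_path or path.startswith(prefix)
        if host_ok and path_ok:
            selected.append(name)
    return selected

jar = [
    ("sid", "example.com", "/"),
    ("cart", "example.com", "/shop"),
    ("other", "example.org", "/"),
]
shop_cookies = cookies_for("http://example.com/shop/item", jar)
about_cookies = cookies_for("http://example.com/about", jar)
```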


> Can "resource" be all of the following at the same time?
> A resource can be either binary or text (per section 2.6.3)
> A resource can be fetched (per section 2.6)
> A resource has a format or type (per section 2.1.1)
> A resource has a host (per section 2.6.2)
> A resource has an identifier (per section 2.5.1)
> A resource has cookies (per section 3.1.3)
> A resource has semantics (per section 2.1.1)
> A resource may be cached (per section 2.6)
> A resource may be external or not (per section 2.1.1)
> A resource may be incrementally processed (per section 2.6)
> A resource may be type sniffed (per section 2.6)
> A resource may have metadata (per section 2.5.1)
> A resource may or may not be available (per section 2.6)
> Something may be a subresource (per section 4.2.4)

All of the above are true of resources, yes. (I re-sorted the above list 
alphabetically for convenience.)


> A resource may generate Request-URIs (per section 2.1.1)

Not sure what this refers to.


On Mon, 28 Sep 2009, Nikunj R. Mehta wrote:
> 
> There is no normative text defining a resource, subresource, or external 
> resource

The word "resource" in the spec is just used in its dictionary sense of 
"digital asset".


> or a resource's Document.

As far as I can tell, it's always unambiguous what this refers to -- e.g. 
in the appcache mechanism, you only ever go into the algorithm after 
having generated a Document clearly in response to having obtained the 
resource in question.


> There is also no normative text around the substitution of file, 
> document, and object for resource.

The term "document" refers to Document objects; I have an outstanding 
issue to go through and try to use Document more consistently. Unless I've 
made a mistake, "object" is used consistently to refer to an instance of 
an IDL interface. "file" and "resource" are more or less interchangeable.


> I have to guess what is the Document for a resource (i.e., bag of bits).

You really shouldn't have to guess very hard.


> However, I don't know what is the value of the document's address given 
> just the resource.

Not sure what you mean here; if the spec talks about a document's address 
in the context of a resource and not a Document, that's a bug.


> What will I get when I create a Document given a URL with a
> fragment identifier?

You create a document given a resource, not a URL. With a URL you get a 
resource, and the fragment identifier is ignored except when navigating.
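
As a small illustration of the fragment being set aside before fetching, Python's standard `urllib.parse.urldefrag` splits a URL into the part used to obtain the resource and the fragment. The example URL is made up.

```python
from urllib.parse import urldefrag

url = "http://example.com/doc.html#section-2"

# fetch_url is what identifies the resource to fetch;
# the fragment only matters later, when navigating within the Document.
fetch_url, fragment = urldefrag(url)
```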


> Take another look at the following two sentences and you will see why I 
> am complaining.
> 
> [[
> When the user agent is required (by other parts of this specification) to
> start the application cache update process for an absolute URL purported to
> identify a manifest, or for an application cache group, potentially given a
> particular cache host, and potentially given a new master resource
> ]]
> 
> [[
> If these steps were invoked with a new master resource, then add the resource,
> along with the resource's Document, to cache group's list of pending master
> entries.
> ]]
> 
> Can you tell what is meant by the master resource in this statement? Is it a
> URL or a bag of bits or both?

It actually doesn't matter in the context of what you've quoted -- the 
spec could as easily say "and potentially given a new frobinator" and "If 
these steps were invoked with a new frobinator, then add the frobinator, 
along with the frobinator's Document, to...", and it would still make 
sense. When the algorithm is invoked, the spec says:

# Invoke the application cache update process for /manifest URL/, with 
# /document/ as the cache host and with the resource from which /document/
# was parsed as the new master resource.

...which unambiguously identifies both the document and the master 
resource.
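
A rough sketch of the pairing described above, under stated assumptions: `Document`, `CacheGroup`, and `start_update` are hypothetical names, and the function stands in for only the single quoted step of the update process, not the whole algorithm.

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    address: str  # hypothetical stand-in for a parsed Document

@dataclass
class CacheGroup:
    manifest_url: str
    pending_master_entries: list = field(default_factory=list)

def start_update(group, master_resource=None, document=None):
    """Hypothetical entry point: record the pair if a master resource was passed."""
    if master_resource is not None:
        group.pending_master_entries.append((master_resource, document))

group = CacheGroup("http://example.com/app.manifest")
doc = Document("http://example.com/index.html")
resource = b"<html manifest='app.manifest'>"  # the bytes doc was parsed from

# The caller already holds both the resource and the Document parsed from it.
start_update(group, master_resource=resource, document=doc)
```

Whatever the master resource is called, the caller supplies it together with its Document, so the pairing is fixed at the point of invocation.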


> Another one from the same section:
> [[
> One or more resources (including their out-of-band metadata, such as HTTP
> headers, if any), identified by URLs, each falling into one (or more) of the
> following categories:
> 
> Master entries - Documents that were added to the cache because a browsing
> context was navigated to that document and the document indicated that this
> was its cache, using the manifest attribute.
> ]]
> 
> [[
> Let explicit URLs be an initially empty list of explicit entries.
> ]]
> 
> Is an entry a URL or a bag of bits or both? When did I have to context switch?

That could be written better. Fixed.


> I have done more than enough to prove this point. I think my issue 
> requires a sincere response. The alternative is to wait for another 
> editor to come around some years from now and start yet another blame 
> game about how messed up the terminology is and so how it needs to be 
> totally redone regardless of what came before it.

If there are specific things you think the spec says that are 
contradictory (like the parsing manifests section did), then please do 
raise those points. However, I don't see anything wrong with using the 
term "resource" in a manner consistent with how it is used throughout the 
industry. I continue to think that HTTP, URI, and the handful of related 
specs that insist on considering the term "resource" as excluding files 
are the exception here, not the rule.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Tuesday, 6 October 2009 05:16:27 UTC