More digging. from Anselm Baird_Smith on 1997-04-03 (www-jigsaw@w3.org from March to April 1997)

From: Anselm Baird_Smith <abaird@www43.inria.fr>
Date: Thu, 3 Apr 1997 10:14:22 +0200 (MET DST)
To: Alexandre Rafalovitch <arafalov@socs.uts.EDU.AU>
Cc: "'www-jigsaw@w3.org'" <www-jigsaw@w3.org>
Message-Id: <199704030814.KAA17256@www43.inria.fr>
Alexandre Rafalovitch writes:
 > 
 > Some more interesting, confusing and inconsistent things that come up
 > while browsing through the Jigsaw source code. I know that Jigsaw's code is
 > relatively neat; I really don't want to know what the Netscape Enterprise
 > Server or Microsoft web server source code looks like. :-}
 > 
 > First easy bits: duplication of code:
 > 1) Client.java duplicates the code in ChunkedOutputStream for (surprise)
 > chunked output. The only difference is that Client counts the number of
 > bytes sent (but see later).

That's correct. Client should use the ChunkedOutputStream, which was
indeed coded for that purpose but never used. I was probably
interrupted in the middle of some hack :-(
[entered todo list]

 > 2) The unescape functions in LookupState and Request look very similar to
 > me. The one in Request is even static, for external use. Shouldn't
 > this code be moved into some separate class along with escape (which does
 > not exist, I think, but may well be needed in view of some items below)?

Again, correct. What's even worse is that, as far as I remember, this
code is probably duplicated again in URLDecoder
(w3c.jigsaw.forms). The solution might be to move all that into
w3c.www.http.
[entered todo list]
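
A minimal sketch of what such a shared helper could look like. The class
name UriEscaper, the set of characters left unescaped, and the use of UTF-8
are all assumptions made for illustration; none of this is taken from the
Jigsaw sources. Note that '+' is deliberately not decoded to a space, since
that convention belongs to form data rather than to URL paths.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical shared helper (name and placement are assumptions).
final class UriEscaper {
    // Characters copied through unchanged by escape(); '/' is kept so that
    // whole paths can be escaped in one call.
    private static final String SAFE =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_.~/";

    private UriEscaper() {}

    /** Percent-encodes every byte that is not in the SAFE set. */
    static String escape(String path) {
        StringBuilder out = new StringBuilder();
        for (byte b : path.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF;
            if (v < 0x80 && SAFE.indexOf(v) >= 0) {
                out.append((char) v);
            } else {
                out.append('%')
                   .append(Character.toUpperCase(Character.forDigit(v >> 4, 16)))
                   .append(Character.toUpperCase(Character.forDigit(v & 0xF, 16)));
            }
        }
        return out.toString();
    }

    /** Decodes %XX sequences; malformed sequences are copied through as-is. */
    static String unescape(String s) {
        byte[] in = s.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream(in.length);
        for (int i = 0; i < in.length; i++) {
            if (in[i] == '%' && i + 2 < in.length) {
                int hi = Character.digit((char) (in[i + 1] & 0xFF), 16);
                int lo = Character.digit((char) (in[i + 2] & 0xFF), 16);
                if (hi >= 0 && lo >= 0) {
                    out.write((hi << 4) | lo);
                    i += 2;
                    continue;
                }
            }
            out.write(in[i]);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

With a helper like this, LookupState, Request and the forms code could all
call the same pair of methods, and a proxy resource could re-escape a path
before forwarding it.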

 > Another problem with LookupState:
 > 3) It unescapes the path but never escapes it back when getRemainingPath
 > is requested. That would create some subtle problems when somebody tries
 > to write a proxy/redirector resource and uses getRemainingPath to append
 > to an existing URL. I think the solution for this (and the problem in my
 > previous mail) would be to keep the original URI around (which it already
 > does) along with a pointer to where the last segment finished, and just
 > return a substring of the original URI. That would spare the
 > unescaping/escaping, etc.

Correct. This requires (?) the following (compatible) changes in API:

public String getRemainingPath(boolean escaped) {
    ...
}

public String getRemainingPath() {
    return getRemainingPath(true);
}

And also:

public String getNextComponent(boolean escape) {
    ...
}

public String getNextComponent() {
    return getNextComponent(true);
}

[entered todo, will implement if I don't hear complaints]
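
The "keep the original URI and a pointer" idea can be sketched as follows.
PathCursor is a hypothetical stand-in for LookupState, not the real class.
The message leaves the exact meaning of the boolean open; this sketch
assumes that true means "return the unescaped form", so that the
no-argument overloads keep their current behaviour. The private decode
helper is likewise only illustrative.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for LookupState: it keeps the raw (still escaped)
// path plus an index, and hands out substrings of the original.
final class PathCursor {
    private final String rawPath; // original request path, never modified
    private int pos;              // index just past the last consumed component

    PathCursor(String rawPath) {
        this.rawPath = rawPath;
    }

    /** Returns the next path component, or null when none is left. */
    String getNextComponent(boolean unescape) {
        while (pos < rawPath.length() && rawPath.charAt(pos) == '/') {
            pos++; // skip separators
        }
        if (pos >= rawPath.length()) {
            return null;
        }
        int end = rawPath.indexOf('/', pos);
        if (end < 0) {
            end = rawPath.length();
        }
        String raw = rawPath.substring(pos, end);
        pos = end;
        return unescape ? decode(raw) : raw;
    }

    String getNextComponent() {
        return getNextComponent(true);
    }

    /** Returns everything not yet consumed, as sent or unescaped. */
    String getRemainingPath(boolean unescape) {
        String raw = rawPath.substring(pos);
        return unescape ? decode(raw) : raw;
    }

    String getRemainingPath() {
        return getRemainingPath(true);
    }

    // Decodes %XX sequences (assumed UTF-8); malformed ones are left alone.
    private static String decode(String s) {
        byte[] in = s.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream(in.length);
        for (int i = 0; i < in.length; i++) {
            if (in[i] == '%' && i + 2 < in.length) {
                int hi = Character.digit((char) (in[i + 1] & 0xFF), 16);
                int lo = Character.digit((char) (in[i + 2] & 0xFF), 16);
                if (hi >= 0 && lo >= 0) {
                    out.write((hi << 4) | lo);
                    i += 2;
                    continue;
                }
            }
            out.write(in[i]);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

A proxy or redirector resource would then call getRemainingPath(false) and
append the result to its target URL without any re-escaping step.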

 > Another thing that shows the problem with escaping/unescaping:
 > 4) Create a resource with a space in its name (e.g. a directory "test
 > directory"). Try to access it and the familiar "document contains no data"
 > pops up. Try to modify MirrorDirectory so that it uses getRemainingPath,
 > and it will produce some very interesting results for _some_ of the URLs.
 > A good example is the Add+Extension link in the Extensions editor. It is
 > unescaped when it arrives, but is not escaped again before the proxy makes
 > the request. The only reason MirrorDirectory does not hit this problem is
 > that it uses the original URI string.
 > 

I guess the above API changes would overcome that problem? If I am
missing something, let me know...

 > Problem with writing new Fields:
 > 5) Variables are package-protected (default access) rather than protected.
 > The problem arises when I try to subclass a Field (e.g. TextField) and
 > set some of its variables. I can't. The only two ways left for me are to
 > do what I want by overriding only getValue() and setValue() (a bit hard),
 > or to copy the code and modify it instead of reusing it (which might just
 > give the same problem with variables from _its_ parent). Maybe it is by
 > design, but I would like to hear the reason.

No reason. I'll make sure the variables there are all protected (this
should solve the problem, unless I missed something?)

 > Also, setValue and getValue are not
 > complementary; they are called by different classes for different
 > purposes. That might be an entry for the FAQ.

Could you clarify that point ?

 > Confusion with log parameters. This is mostly a request for an explanation
 > of the names/values.
 > 
 > 6) Duration parameter. I am just not sure what it is. Jigsaw seems to
 > define it as the time taken to do lookup and perform, but not the time
 > taken to send the data over to the client.

Yes, the intent here is to measure performance (or to have the
possibility to measure performance).

 > 7) Bytes sent. Another mystery. Looking at Client.java, I realised that
 > the byte count does not include reply headers, but does include the chunk
 > output headers if chunking is supported. That would mean that the same
 > output produces different byte counts for different clients. But on the
 > other hand, no other headers are included(?).
 > 
 > I am very confused as to what these numbers can be used for and whether
 > statistics generated from them are accurate. :-{

That's correct. There should be two byte counts (control/header bytes,
and data bytes). This will require an (incompatible) change in the
Logger API though, but is required for implementing the HTTP SNMP MIB.
[I may have to return the number of bytes in HttpMessage::emit, but
this would be a compatible API change]
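
One way to picture the two counters is a chunking stream that tallies
payload bytes and framing bytes separately. This is only a sketch:
CountingChunkedOutputStream, its accessor names and the finish() method are
hypothetical, and it is not the existing ChunkedOutputStream. Reply headers
would still have to be added to the control count elsewhere.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical chunked encoder keeping separate data and control counts.
final class CountingChunkedOutputStream extends FilterOutputStream {
    private static final byte[] CRLF = {'\r', '\n'};
    private long dataBytes;    // payload bytes handed in by the caller
    private long controlBytes; // chunk-size lines, CRLFs and the terminator
    private boolean finished;

    CountingChunkedOutputStream(OutputStream out) {
        super(out);
    }

    @Override
    public void write(int b) throws IOException {
        write(new byte[] {(byte) b}, 0, 1);
    }

    @Override
    public void write(byte[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return; // a zero-length chunk would terminate the body early
        }
        byte[] head = (Integer.toHexString(len) + "\r\n")
            .getBytes(StandardCharsets.US_ASCII);
        out.write(head);
        out.write(buf, off, len);
        out.write(CRLF);
        controlBytes += head.length + CRLF.length;
        dataBytes += len;
    }

    /** Writes the terminating zero-length chunk (no trailers) once. */
    void finish() throws IOException {
        if (finished) {
            return;
        }
        finished = true;
        byte[] last = "0\r\n\r\n".getBytes(StandardCharsets.US_ASCII);
        out.write(last);
        controlBytes += last.length;
        out.flush();
    }

    long getDataBytes() {
        return dataBytes;
    }

    long getControlBytes() {
        return controlBytes;
    }
}
```

Logged this way, the data count is the same for a given document whether or
not the client supports chunking, and only the control count varies.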

Again, thanks a lot,
Anselm.
Received on Thursday, 3 April 1997 03:14:50 UTC