- From: Wayne Myers-Education <wayne.myers@bbc.co.uk>
- Date: Fri, 26 Feb 1999 12:00:21 -0000
- To: w3c-wai-er-ig@w3.org
Hi, > You've hard-coded a lot of details into Betsie... You're right. I have. It's not good and it needs to be fixed. Thankyou for pointing this out. > It therefore needs to be able to cope with just about > anything the > web can throw at it - and if the web throws something at it that it > can't cope with, I have to fix it. While I take your point about the differences between the two gateways, I think that the need to be able to cope with a wide range of dodgy code really applies to both your gateway and Betsie. The thing is that there does have to be a limit somewhere in terms of what either program is prepared to take on, since regardless of how lenient the gateway is with dodgy HTML code, there will always be a set of pages which that gateway cannot fix up for accessibility. There will also always be a large tranche of problems that require editorial fixes rather than automated code munging, and I'm not sure what the point is of moving hell and high water to get code to make sense where the actual content of the page still doesn't and never will until it gets rewritten by a human who understands what an ALT attribute is and that one cannot assume that the user gets what the designer of the page thinks they get. ("On the left of your page you will see a large button marked 'Not If You're Blind You Won't, Mate!'") But thankyou for pointing out five things wrong with Betsie. #1 > ... a style sheet (untouched by Betsie and hence > can override > all font choices... Style sheets. Oops. Ta. #2 > spaces and > new lines at the start of an APPLET tag (Betsie processes the applets > before it remove the spaces, so it won't remove that applet) Argh! Doh! Ta, again, for pointing this out. #3 > an > old-style comment (if the string --> does not occur in the > document then > you should treat > as a comment terminator), and the < and > signs > within quotes in a tag (which may confuse your tag parsing). Ug. Ouch. #4 > It also > contains the common idiom of using /TD to terminate a link; > when Betsie > removes the /TD, the link goes too far because it isn't aware > of this. And... yes, but it's not a common idiom, it's a common error. Mind you, it would be good if Betsie could fix that too. #5 > Also, the link is to a graphics file, which Betsie is unaware of and > tries to redirect through itself. Um.. doh! Again! Hadn't thought of that. (Not that it happens often). > Not only is Perl an interpreted > language, but > the way you've done string parsing seems to introduce > countless passes > through the document. For example, you pass right through the whole > document removing surplus spaces from tags, and you pass > through again > to remove applets, and so on. I don't know much Perl but this is > probably a natural side-effect of the language. If it's given a big > document, that can mean quite a chunk of processing. I take > it the BBC > have a good high-end server then! Firstly, I cannot by any means claim to be the best Perl programmer in the world. One of the luckiest, perhaps, in that I get to work in a very supportive environment on projects that are clearly worthwhile, but I can't pretend to know enough about Perl to be sure or not of whether the way I have written things in Betsie is the best way all the time, and I am more or less certain that there are various things that could be enhanced or tweaked to be a lot smoother and sleeker. And faster. Clearly, you have already pointed out at least five or six potential bugs or issues, which I have noted with thanks and will resolve ASAP. There's another bug you missed, in fact, which is that you can point Betsie at herself and she doesn't mind, and you can keep doing that until your browser crashes with an over-long URL. This has to be fixed too. (My strong suspicion is that you've already sorted that one out in yours.) In terms of multiple passes through the page, though, they're not countless, no, but in order to implement the site specific code manipulation stuff it seemed to be necessary to have the whole page in order to make sure that whatever is supposed to be there is there (it might not be) and then to manupulate it. The idea was to do as little as possible in the 'countless passes through the document' stage and as much as possible in the 'one last line by line pass through the document' stage, although some of the 'delete all instances of this' stuff seemed to sit better in the first rather than the second of the two. Thing is, though, it turns out that Larry Wall and the Perl people are gods and geniuses and that the current Perl regular expressions functionality not merely allows but positively encourages (it seems) scripts such as Betsie to make countless passes through documents because.. you can, and it isn't too slow as long as you start sending as soon as possible.. which Betsie does. If there's a better way of doing all this stuff.. I need to learn C++ and look at your code, Silas, don't I. Meanwhile, as pages get larger the Betsie version of the page seems to actually increase in download speed, proportionately, compared to the speed it would have been anyway. This is the thing. Betsie is fast enough, though obviously, yes, C++ is way faster... true. But Perl is easier. :) > By the way, I'm not sure what you were saying about Java > security. When > it's absolutely necessary for me to get to Java applets, I > can check the > "enable applets" box in my gateway and it seems to work fine (it just > re-directs the code URL to point back to the original page). Um.. oh. It's quite possible that I'm wrong then, but I couldn't get it to work. I'll have to look at how it was you got yours to do it. How *did* you get yours to do it? I think I tried to do what you suggested above and it didn't want to work. Then I just thought sod it and deleted the lot. It would be nice to bring it back though, maybe... Hmm. But thankyou very much indeed for the analysis, the errors and the suggestions, Cheers etc., Wayne
Received on Friday, 26 February 1999 07:03:38 UTC