Re: first questions on validator.nu

On Fri, 1 Aug 2008, Henri Sivonen wrote:

> On Jul 31, 2008, at 14:54, Yves Lafon wrote:
>
>> On Thu, 31 Jul 2008, Henri Sivonen wrote:
>> 
>>> 
>>> On Jul 31, 2008, at 00:29, olivier Thereaux wrote:
>>> 
>>>> not solved as far as I know. Yves seems to think it's not a big problem - 
>>>> which is great -. All I can say is that I've unfortunately been too busy 
>>>> to work on it but still think this should happen. If you and Yves can 
>>>> work together on it that would be perfect.
>>> 
>>> 
>>> OK. I'll put some cycles into this.
>>> 
>>> (All the servlet [features] that the Validator.nu controller uses are 
>>> *very* old, except for the part that requests incoming query strings to be 
>>> decoded as UTF-8. I guess I'll find out soon enough if that feature is in 
>>> the version of servlets supported by Jigsaw.)
>> 
>> I have to look at the code, but I remember doing something for this kind 
>> of issue. Anyway, we need to check that (and patch if needed :) )
>
> Indeed, Jigsaw is missing the setCharacterEncoding() method on 
> HttpServletRequest objects. I made the servlet catch 
> java.lang.NoSuchMethodError there, so if the servlet is compiled under a more 
> current servlet API, it runs without crashing (but without non-ASCII query 
> string support) under a vintage servlet API. (Jigsaw also lacks 
> getRequestURL(), but having that method wasn't critical, so I removed the 
> call.)

OK, I will add those methods; I just need to find which level of the 
servlet spec they belong to.
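
For reference, a minimal sketch of the kind of guard described in the quoted 
paragraph: run a call that may not exist in an older API and carry on if it is 
missing. OptionalCall and tryCall are hypothetical names for illustration, not 
code from Validator.nu or Jigsaw.

```java
// Sketch only: OptionalCall is a hypothetical helper, not Validator.nu code.
final class OptionalCall {
    /**
     * Runs the given call and reports whether the method it uses was present.
     * NoSuchMethodError is raised when code compiled against a newer API
     * runs against an older library that lacks the method.
     */
    static boolean tryCall(Runnable call) {
        try {
            call.run();
            return true;
        } catch (NoSuchMethodError e) {
            // Older API: continue without the optional feature.
            return false;
        }
    }
}
```

In a servlet, the Runnable would wrap the setCharacterEncoding() call (handling 
its checked exception inside the Runnable), and a false return would mean 
running without non-ASCII query string support.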

> The other thing in the servlet that I needed to modify was treating a null 
> return value from getPathInfo() as if the return value had been "/". (With 
> longer paths, Jigsaw seems to swallow the trailing slash.)

Hm, that looks like a bug; I'll fix that.
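
Until that is fixed, the workaround from the quoted paragraph amounts to 
something like the following sketch. PathInfo and normalize are hypothetical 
names for illustration; the argument stands for whatever getPathInfo() 
returned.

```java
// Sketch only: PathInfo is a hypothetical helper, not Validator.nu code.
final class PathInfo {
    /**
     * Treats a null path info (as returned when the container swallows
     * the trailing slash) as the root path "/".
     */
    static String normalize(String pathInfo) {
        return pathInfo == null ? "/" : pathInfo;
    }
}
```

The servlet would then dispatch on PathInfo.normalize(request.getPathInfo()) 
instead of on the raw return value.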

> With these changes, the Validator.nu servlet can run inside Jigsaw. However, 
> some of the functionality of Validator.nu is implemented as servlet filters 
> (gzip in, gzip out, file upload, textarea input). These features don't work 
> under Jigsaw. (At least it seems to me that Jigsaw doesn't implement servlet 
> filters.) I have not modified the UI of Validator.nu to hide references to 
> these features when they aren't present.

Jigsaw already has its filters (for example, the CSS validator uses a 
TE/Transfer-Encoding filter on the way out, which is not a servlet filter 
but a Jigsaw one, used in the whole /css-validator/ subspace.)

> There's an additional problem that is significant considering use as a 
> Unicorn back end: When the method is POST, Jigsaw does not provide request 
> URI query string parameters via the getParameter() method on 
> HttpServletRequest. This means that the output format cannot be selected. (In 
> Jetty, getParameter() returns request URI parameters when POSTing a non-form 
> entity body.)

IIRC, Jigsaw adds the two sets of parameters (from the decoded POST if it 
applies + the URI ones), but it may be an issue if the content is an 
uploaded file, for example.
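
If getParameter() turns out not to see the URI parameters in that case, one 
possible servlet-side fallback is to parse the raw query string directly. The 
sketch below assumes UTF-8 percent-decoding and first-value-wins semantics; 
QueryParams and parse are hypothetical names, and the input stands for the 
value of getQueryString().

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: QueryParams is a hypothetical helper, not Validator.nu code.
final class QueryParams {
    /** Parses "a=1&b=2" into a map; the first value of a repeated name wins. */
    static Map<String, String> parse(String rawQuery) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        if (rawQuery == null || rawQuery.length() == 0) {
            return params;
        }
        for (String pair : rawQuery.split("&")) {
            if (pair.length() == 0) {
                continue; // skip empty segments such as "a=1&&b=2"
            }
            int eq = pair.indexOf('=');
            String name = decode(eq < 0 ? pair : pair.substring(0, eq));
            String value = eq < 0 ? "" : decode(pair.substring(eq + 1));
            if (!params.containsKey(name)) {
                params.put(name, value);
            }
        }
        return params;
    }

    // Malformed percent escapes make URLDecoder throw IllegalArgumentException.
    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }
}
```

With this, the output format could be read from the parsed map whenever 
getParameter() returns null for a POST with a non-form entity body.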

> Here's what is needed to get up and running from a clean installation of 
> Jigsaw (using Java 5 or later):
> 1) Create a directory for Validator.nu.
> 2) cd into that directory.
> 3) Run svn co http://svn.versiondude.net/whattf/build/trunk/ build
> 4) Run python build/build.py dldeps
> 5) Run python build/build.py build
> 6) cd to the Jigsaw directory.
> 7) Replace scripts/jigsaw.sh with the modified script attached to this email. 
> This script adds the jars needed by Validator.nu to the classpath and sets 
> some system properties for configuring Validator.nu. It replaces legacy SAX 
> and Xerces with current versions. (The favicon, script and style URLs point 
> to about.validator.nu, but can be changed here.)
> 8) Edit the variable VALIDATOR_NU_HOME in the script to point to the 
> directory you created in step #1.
> 9) Add the line 
> org.w3c.jigsaw.startup=org.w3c.jigsaw.servlet.ServletPropertiesReader to 
> Jigsaw/config/http-server.props
> 10) Run sh scripts/jigsaw.sh
> 11) From another shell, run scripts/jigadmin.sh and connect to the Jigsaw 
> instance.
> 12) In jigadmin, add a ServletWrapperFrame resource under /servlet with the 
> identifier validator-nu and the class nu.validator.servlet.VerifierServlet.
> 13) Restart Jigsaw.
>
> The generic facet of Validator.nu is now at 
> http://localhost:8001/servlet/validator-nu/. The HTML5 facet is at 
> http://localhost:8001/servlet/validator-nu/html5/. (The parsetree viewer 
> should be at http://localhost:8001/servlet/validator-nu/parsetree/, but it 
> doesn't appear to work.)
>
> When the servlet loads, log4j complains to System.err that it hasn't been 
> initialized. I don't know what the best practice for initializing log4j under 
> Jigsaw is.

There is a way to initialize a servlet at startup. I have never used log4j, 
but that should be the best bet.

-- 
Baroula que barouleras, au tiéu toujou t'entourneras.

         ~~Yves

Received on Friday, 1 August 2008 12:59:15 UTC