RE: VALID_MARKUP local DTD catalog

I've committed some code to this end:

(1) I've swapped out the JHOVE validation as the messages coming out
weren't very helpful

(2) I've added a local catalogue of DTDs. The directory "dtd" contains
three subdirectories: www.openmobilealliance.org, www.wapforum.org, and
www.w3.org

The paths to the DTDs we are interested in have been preserved in the
local directory structure.
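
As a rough illustration of how such a path-preserving layout can be used, here is a minimal sketch of a plain SAX EntityResolver that turns a DTD's system identifier into a file under the "dtd" directory. This is not the committed code, and it is not the Xerces CatalogResolver approach mentioned elsewhere in the thread; the class name LocalDtdResolver and the method localPathFor are hypothetical, and refusing to resolve anything that is not stored locally is an assumption based on the "unrecognised DOCTYPE" behaviour proposed in the original message.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical resolver: serves DTDs from a local directory tree, never from the network. */
public class LocalDtdResolver implements EntityResolver {
    private final Path root;

    public LocalDtdResolver(Path root) {
        this.root = root.toAbsolutePath().normalize();
    }

    /** Maps http://host/a/b.dtd to <root>/host/a/b.dtd (assumed layout). */
    Path localPathFor(String systemId) throws SAXException {
        URI uri;
        try {
            uri = URI.create(systemId);
        } catch (IllegalArgumentException e) {
            throw new SAXException("Malformed system identifier: " + systemId, e);
        }
        String host = uri.getHost();
        String path = uri.getPath();
        if (host == null || path == null || path.length() < 2 || !path.startsWith("/")) {
            throw new SAXException("Unrecognised DOCTYPE - will not try to validate: " + systemId);
        }
        Path candidate = root.resolve(host).resolve(path.substring(1)).normalize();
        // Reject identifiers that would escape the local DTD directory.
        if (!candidate.startsWith(root)) {
            throw new SAXException("Unrecognised DOCTYPE - will not try to validate: " + systemId);
        }
        return candidate;
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId)
            throws SAXException, IOException {
        if (systemId == null) {
            throw new SAXException("No system identifier - will not try to validate");
        }
        Path local = localPathFor(systemId);
        if (!Files.isRegularFile(local)) {
            // Assumed policy: report an error rather than fetch a DTD from the wild.
            throw new SAXException("Unrecognised DOCTYPE - will not try to validate: " + systemId);
        }
        InputSource source = new InputSource(Files.newInputStream(local));
        // Keep the original system ID so that relative references inside the DTD
        // are resolved against it and come back through this resolver.
        source.setSystemId(systemId);
        return source;
    }
}
```

With a standard SAX XMLReader this could be installed via reader.setEntityResolver(new LocalDtdResolver(Paths.get("dtd"))). Any modules or entity files that a DTD pulls in would also need to be present under the same directory tree.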

Should we include the Openwave XHTML MP DTD?

Are there any others?

Ruadhan


> -----Original Message-----
> From: public-mobileok-checker-request@w3.org
> [mailto:public-mobileok-checker-request@w3.org] On Behalf Of Miguel Garcia
> Sent: 29 May 2007 12:58
> To: public-mobileok-checker@w3.org
> Subject: RE: VALID_MARKUP local DTD catalog
> 
> 
> Hi,
> 
> Yes, having a catalog of the common DTDs used by the checker is the
> right solution.
> 
> As Abel and I pointed out in a study of third-party tools [1], JHOVE
> also uses a SAX parser and includes several DTDs as internal resources
> for efficiency. (None of the mobile DTDs are included.)
> 
> To do this, JHOVE uses an ad hoc DTDMapper, which we would have to
> extend in order to add new DTDs.
> 
> On the other hand, JHOVE lets you specify the SAX parser
> implementation [2], but we don't know whether the CatalogResolver can
> be set externally, without modifying the JHOVE source code (e.g. we
> don't have access to the parser object to make this method call:
>
> reader.setProperty("http://apache.org/xml/properties/internal/entity-resolver", resolver);
> )
> 
> [1] http://docs.google.com/Doc?id=dhbw7zt7_0f8w6bq
> [2] http://hul.harvard.edu/jhove/xml-hul.html
> 
> Regards,
> 
> 
> Miguel
> ________________________________________
> From: public-mobileok-checker-request@w3.org
> [mailto:public-mobileok-checker-request@w3.org] On Behalf Of Jo Rabin
> Sent: Tuesday, 29 May 2007 12:19
> To: Ruadhan O'Donoghue; public-mobileok-checker@w3.org
> Subject: RE: VALID_MARKUP local DTD catalog
> 
> Good point.
> 
> The test is "If the document is an HTML document and it fails to
> validate according to its given DOCTYPE, FAIL".
> So we need a reasonable catalogue of known HTML and XHTML DTDs. We
> don't need any non-HTML DTDs, and I agree that we should not go and
> fetch random DTDs.
> 
> Jo
> ________________________________________
> From: public-mobileok-checker-request@w3.org
> [mailto:public-mobileok-checker-request@w3.org] On Behalf Of Ruadhan
> O'Donoghue
> Sent: 29 May 2007 11:11
> To: public-mobileok-checker@w3.org
> Subject: VALID_MARKUP local DTD catalog
> 
> Hi,
> 
> I'm not sure if anyone has been looking at this, but for validating
> the original document, we are going to need a local catalog of DTDs. In
> ready.mobi we use the Xerces CatalogResolver class to map between
> DOCTYPEs and local copies of the DTDs.
> 
> 
> Any thoughts on the following?
> 
> (1) We need to validate the document against its stated DOCTYPE and
> XHTML Basic 1.1 (and maybe 1.2). So the set of DTDs that we wish to
> store locally should include
> XHTML Basic*, MP*, HTML*
> 
> Are there others? And do we store variations like the Openwave XHTML
> DTDs which turn up quite a bit? Perhaps we should compile an
> exhaustive list of the DOCTYPEs that we will recognise.
> 
> 
> (2) The behaviour when a DOCTYPE specifies an obscure DTD not in the
> catalog - fetching a DTD from the wild is not a good idea, so we
> should just report an "unrecognised DOCTYPE - will not try to validate"
> error... Is this the desired behaviour?
> 
> 
> Ruadhan

Received on Thursday, 31 May 2007 14:13:57 UTC