GRDDL.py, caching, and NoExtDtdReader [was: Atom to N3 ...]

On Tue, 2006-11-28 at 14:53 -0500, Chimezie Ogbuji wrote:
[...]
> > Chime, how do I tell 4suite to not fetch DTDs while it's parsing
> > XML documents?
> 
> I haven't verified this explicitly, but judging from 
> (http://4suite.org/docs/CoreManual.xml#id206094572):
> 
> from Ft.Xml.Domlette import NoExtDtdReader
> self.dom = NoExtDtdReader.parseString(xmlText, baseUri)

Yes, that seems to work.
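
For readers without 4Suite at hand, here is a rough stdlib analogue of
the same idea, not the 4Suite call itself. It is a minimal sketch that
assumes Python 3 and the expat-based xml.sax parser; the names
RootNameHandler and parse_without_external_dtd are made up for
illustration. It turns off external entity loading so that a DOCTYPE
pointing at a remote DTD does not trigger a fetch.

```python
import io
import xml.sax
from xml.sax.handler import (ContentHandler, feature_external_ges,
                             feature_external_pes)


class RootNameHandler(ContentHandler):
    """Hypothetical handler: remembers the name of the root element."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.root = None

    def startElement(self, name, attrs):
        if self.root is None:
            self.root = name


def parse_without_external_dtd(xml_bytes):
    """Parse xml_bytes without loading the external DTD subset."""
    parser = xml.sax.make_parser()
    # Do not load external general entities or the external DTD subset.
    parser.setFeature(feature_external_ges, False)
    # Turning external parameter entities off is accepted by the expat
    # reader, which does not read them in any case.
    parser.setFeature(feature_external_pes, False)
    handler = RootNameHandler()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler.root
```

The DTD address is still recorded in the DOCTYPE; the parser just never
dereferences it, which is the behavior wanted here.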

> That would (perhaps) explain some of the DoS blocking I've had to deal 
> with.  I'll make that change in the grddl-hg repository.

I made that change and a number of others:

http://homer.w3.org:8000/?cs=2a9860b7b9f9
date:        Tue Nov 28 17:58:34 2006 -0600
files:       GRDDL.py
description:
Factored web access out of GRDDL classes;
a proof-of-concept WebMemo cache class is provided.

NoExtDtdReader is used for XML parsing.

Moved most of the work from __init__() to a load() method per
a Modula-3 convention that __init__() raises no exceptions.

Media types and namespaces all go at the top of the module.

Added a few blank lines here and there; re-formatted XPaths to fit 80
cols.
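
The shape of the first and third items can be sketched roughly as
follows. This is not the code in GRDDL.py: the WebMemo interface shown
here (a fetch callable plus a get() method) and the SourceDocument
class are assumptions made up for illustration. The point is only that
web access lives in one object that can be swapped or cached, and that
__init__() stores its arguments while load() does the work that can
fail.

```python
class WebMemo(object):
    """Memoizing wrapper around a fetch callable (hypothetical API)."""

    def __init__(self, fetch):
        self._fetch = fetch   # callable taking an address, returning content
        self._memo = {}       # address -> content already fetched

    def get(self, addr):
        # Only go to the web the first time an address is requested.
        if addr not in self._memo:
            self._memo[addr] = self._fetch(addr)
        return self._memo[addr]


class SourceDocument(object):
    """Hypothetical stand-in for a class that needs web access."""

    def __init__(self, addr, web):
        # Just record the arguments; nothing here should raise.
        self.addr = addr
        self._web = web
        self.content = None

    def load(self):
        # All work that can fail (network, parsing) happens here.
        self.content = self._web.get(self.addr)
        return self
```

Passing the web object in also makes testing easier: a test can hand in
a fetch function that returns canned content and never touches the
network.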

http://homer.w3.org:8000/?cs=016a42475175
files:       GRDDL.py
description:
wrap lines to 79/80 columns, per python style guide


-- 
Dan Connolly, W3C http://www.w3.org/People/Connolly/
D3C2 887B 0F92 6005 C541  0875 0F91 96DE 6E52 C29E

Received on Wednesday, 29 November 2006 00:04:46 UTC