- From: Bjoern Hoehrmann <derhoermi@gmx.net>
- Date: Sun, 12 Sep 2004 13:18:20 +0200
- To: Bjoern Hoehrmann <derhoermi@gmx.net>
- Cc: public-qa-dev@w3.org
* Bjoern Hoehrmann wrote:
> Another issue that needs to be resolved in order to use S::P::O is the
>creation of temporary files. While OpenSP generally supports doing IO on
>strings, the implementation is unusable as it consumes hundreds of times
>the memory the string consumes. OpenSP also supports reading from file
>descriptors, but this is not portable, e.g. on my Win32 system Perl and
>the XS/OpenSP would use different runtimes and the file descriptor table
>is local to the runtime. Now there are two approaches, one is to rely on
>temporary file names and the other would be to optimize this on systems
>where reading from <OSFD>s is supported.

This should have been something like

  use strict;
  use warnings;
  use SGML::Parser::OpenSP qw();
  use File::Temp qw();

  # this would be a config setting or somesuch
  our $SUPPORTS_OSFD_READING = 0;

  # high security on systems that support it
  File::Temp->safe_level(File::Temp::HIGH);

  # new parser
  my $p = SGML::Parser::OpenSP->new;

  sub x::new{bless{},shift}
  sub x::start_element{use Data::Dumper; print Dumper\@_}

  # null handler
  $p->handler(x->new);

  # the html to parse
  my $html = "<!DOCTYPE html []><p>...";

  # create temp file, this would croak if it fails, so
  # there is no need for us to check the return value
  my $fh = File::Temp->new();

  # ...
  File::Temp::unlink0($fh, $fh->filename);

  # store content
  print $fh $html;

  # seek to start
  seek $fh, 0, 0;

  if ($SUPPORTS_OSFD_READING) {
    $p->parse_file("<OSFD>" . fileno($fh));
  } else {
    $p->parse_file($fh->filename);
  }

which, I am certain, still contains some minor flaws...
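The temporary-file half of this can be tried without OpenSP installed. What follows is only a rough sketch under stated assumptions: with_temp_source and slurp_by_name are hypothetical names, not part of SGML::Parser::OpenSP or File::Temp, and slurp_by_name merely stands in for $p->parse_file by reading the file back by name. The sketch leaves out the early unlink0 call, since the name-based branch needs the file to still exist when it is opened, and relies instead on File::Temp removing the file when the object goes out of scope.

```perl
use strict;
use warnings;
use File::Temp qw();

# Hypothetical config flag, mirroring $SUPPORTS_OSFD_READING in the message.
our $SUPPORTS_OSFD_READING = 0;

# Hypothetical helper: write $content to a temporary file and pass a
# system identifier for it (file name or "<OSFD>n") to $callback.
sub with_temp_source {
    my ($content, $callback) = @_;

    # File::Temp->new croaks on failure; the file is removed when
    # $fh goes out of scope (UNLINK defaults to true).
    my $fh = File::Temp->new();

    print {$fh} $content or die "write failed: $!";

    # Flush so a reader that opens the file by name sees the content.
    $fh->flush or die "flush failed: $!";
    seek($fh, 0, 0) or die "seek failed: $!";

    my $sysid = $SUPPORTS_OSFD_READING
        ? "<OSFD>" . fileno($fh)
        : $fh->filename;

    return $callback->($sysid);
}

# Stand-in for $p->parse_file in the name-based case: read the file back.
sub slurp_by_name {
    my ($name) = @_;
    open my $in, '<', $name or die "cannot open $name: $!";
    local $/;    # slurp mode
    my $data = <$in>;
    close $in;
    return $data;
}

my $html = "<!DOCTYPE html []><p>...";
my $seen = with_temp_source($html, \&slurp_by_name);
print $seen eq $html ? "round trip ok\n" : "round trip failed\n";
```

With a real parser, the callback would presumably be something like sub { $p->parse_file($_[0]) }; whether the "<OSFD>" branch works still depends on Perl and OpenSP sharing a runtime, as described above.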
Received on Sunday, 12 September 2004 11:19:14 UTC