- From: Bjoern Hoehrmann <derhoermi@gmx.net>
- Date: Sun, 12 Sep 2004 05:45:43 +0200
- To: public-qa-dev@w3.org
Hi,

Another issue that needs to be resolved in order to use S::P::O is the creation of temporary files. While OpenSP generally supports doing IO on strings, the implementation is unusable as it consumes hundreds of times the memory the string consumes. OpenSP also supports reading from file descriptors, but this is not portable; e.g., on my Win32 system Perl and the XS/OpenSP would use different runtimes, and the file descriptor table is local to the runtime.

There are two approaches: one is to rely on temporary file names, the other is to optimize this on systems where reading from <OSFD>s is supported:

    use strict;
    use warnings;
    use SGML::Parser::OpenSP qw();
    use File::Temp qw();

    our $SUPPORTS_OSFD_READING = 0;

    # high security on systems that support it
    File::Temp->safe_level(File::Temp::HIGH);

    # new parser
    my $p = SGML::Parser::OpenSP->new;

    sub x::new{bless{},shift}
    sub x::start_element{use Data::Dumper; print Dumper\@_}

    # null handler
    $p->handler(x->new);

    # the html to parse
    my $html = "<!DOCTYPE html []><p>...";

    # create temp file
    my $fh = File::Temp->new();

    # store content
    print $fh $html;

    # seek to start
    seek $fh, 0, 0;

    if ($SUPPORTS_OSFD_READING)
    {
        $p->parse_file("<OSFD>" . fileno($fh));
    }
    else
    {
        require Fcntl;

        # not all systems have F_SETFD
        eval { Fcntl::F_SETFD() };

        unless ($@)
        {
            fcntl($fh, Fcntl::F_SETFD(), 0) or
              die "Can't clear close-on-exec flag on temp fh: $!\n";
        }

        $p->parse_file($fh->filename);
    }

    undef $fh;

With $SUPPORTS_OSFD_READING false this would work on both Linux and Win32; with $SUPPORTS_OSFD_READING true it should work on Linux. Are there any major issues with this approach and/or implementation?
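For illustration, the temporary-file route could be wrapped in a small helper so callers only deal with strings. This is a minimal sketch, not part of S::P::O: the name with_temp_file and the callback interface are hypothetical, and the consumer below merely reads the file size back instead of invoking a parser.

```perl
use strict;
use warnings;
use File::Temp qw();

# Hypothetical helper (not part of SGML::Parser::OpenSP): write $content to a
# temporary file and pass the file name to $consumer while the file exists.
sub with_temp_file {
    my ($content, $consumer) = @_;

    my $fh = File::Temp->new();   # file is removed when $fh goes out of scope
    binmode $fh;                  # write the bytes unchanged, also on Win32
    print {$fh} $content or die "Can't write temp file: $!\n";
    $fh->flush or die "Can't flush temp file: $!\n";

    return $consumer->($fh->filename);
}

# Stand-in consumer that looks the file up by name and reports its size
my $size = with_temp_file("<!DOCTYPE html []><p>...", sub {
    my ($name) = @_;
    return -s $name;
});
print "consumer saw $size bytes\n";
```

With the parser from the message above, the consumer would presumably be something like sub { $p->parse_file($_[0]) }; whether the file can be opened by name while the handle is still open may depend on the platform.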
Received on Sunday, 12 September 2004 03:46:28 UTC