- From: Henry S. Thompson <ht@inf.ed.ac.uk>
- Date: Sun, 24 Feb 2008 18:41:02 +0000
- To: public-xml-core-wg <public-xml-core-wg@w3.org>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
ht writes:
> So, there are a _lot_ of ostensibly broken fragments and anchors out
> there.
>
> No, I did not check what percentage of the data was XML, I'll do that.
So, of my 10404 pages, about 27% (2766) are XHTML. Of these, only
about 1.4% (39) contain an id attribute whose value starts with a
digit, i.e. match <[a-z]+ ... id=.[0-9] ...>. Those 39 pages come
from 30 different sites.
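
As a rough sketch of how such a count could be made: the regex below is
only one assumed reading of the informal pattern above, and the names
DIGIT_ID, has_digit_id and summarize are hypothetical, not the script
actually used for these figures.

```python
import re

# Assumed concrete form of the informal pattern "<[a-z]+ ... id=.[0-9] ...>":
# a lowercase start tag with an id attribute whose value, after a single
# (quote) character, begins with a digit.
DIGIT_ID = re.compile(r"<[a-z]+\b[^>]*\sid=.[0-9][^>]*>")


def has_digit_id(page: str) -> bool:
    """Return True if the page text contains at least one matching tag."""
    return DIGIT_ID.search(page) is not None


def summarize(pages: dict[str, str]) -> tuple[int, float]:
    """Return (number of matching pages, percentage of all pages given)."""
    hits = [url for url, text in pages.items() if has_digit_id(text)]
    share = 100.0 * len(hits) / len(pages) if pages else 0.0
    return len(hits), share
```

Requiring whitespace before "id=" keeps attributes such as data-id from
being counted; like the original pattern, this is a textual heuristic,
not a parse of the markup.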
ht
- --
Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
Half-time member of W3C Team
2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk
URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)
iD8DBQFHwbo+kjnJixAXWBoRArOyAJ9x8XE/q5KnJmm2eJvy0Brb85gfRwCaA9zx
SL7FQJwc1yfYiatSjL5BHhc=
=zI/T
-----END PGP SIGNATURE-----
Received on Sunday, 24 February 2008 18:41:17 UTC