- From: Christian Schmidt <whatwg.org@chsc.dk>
- Date: Sun, 03 Dec 2006 03:17:39 +0100
Charles Iliya Krempeaux wrote:
> Sometimes web developers parse (non-XML) HTML with an XML parser
> because it's the tool they have on hand.
>
> Consider a PHP developer trying to analyse an HTML page.
>
> If a PHP developer wants to analyse an HTML page; that developer may
> try to use SimpleXML <http://php.net/simplexml> because that's what
> they have on hand and know how to use.  There's no SimpleHTML
> available in PHP.
>
> And while none of this is certainly our fault.  This is a situation
> some web developers are going to run into.  (What else are they going
> to use?)

PHP developers can parse HTML using DOMDocument::loadHTML(). If they
want, they can then convert the DOMDocument to SimpleXML:

    // Parse tag-soup HTML with the HTML parser; the missing <html> and
    // <head> elements are implied by the parser.
    $doc = new DOMDocument();
    $doc->loadHTML('<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><title>Foo</title> <body>Foo<br>bar');

    // Hand the resulting DOM tree to SimpleXML and read the title.
    $simpleXml = simplexml_import_dom($doc);
    print $simpleXml->head->title;

Christian
Received on Saturday, 2 December 2006 18:17:39 UTC