XPath tips from the web scraping trenches

An interesting article on how to extract data systematically from
existing web pages.

Scrapy is a nice tool for doing this, and here are some dos and don'ts:

http://blog.scrapinghub.com/2014/07/17/xpath-tips-from-the-web-scraping-trenches/

This also ties into the OpenLink Sponger and its cartridges, which
likewise transform existing web content into structured data.
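
As a rough illustration of turning page markup into structured records with
XPath, here is a minimal sketch. It is not taken from the article: the page
fragment, the class names and the extract_posts helper are all made up for the
example, and it uses Python's standard-library ElementTree, which only
supports a subset of XPath and needs well-formed markup. Scrapy ships its own
selectors with fuller XPath support and tolerant HTML parsing.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment used only for illustration.
PAGE = """
<html>
  <body>
    <div class="post">
      <h2><a href="/posts/1">First post</a></h2>
      <span class="date">2014-07-17</span>
    </div>
    <div class="post">
      <h2><a href="/posts/2">Second post</a></h2>
      <span class="date">2014-07-18</span>
    </div>
  </body>
</html>
"""


def extract_posts(markup):
    """Return one dict per post block found in the markup."""
    root = ET.fromstring(markup)
    records = []
    # First select each repeating container, then query relative to it,
    # so fields from different posts cannot get mixed up.
    for post in root.findall(".//div[@class='post']"):
        link = post.find("./h2/a")
        date = post.find("./span[@class='date']")
        records.append({
            "title": link.text,
            "url": link.get("href"),
            "date": date.text,
        })
    return records


posts = extract_posts(PAGE)
for record in posts:
    print(record)
```

Selecting the container first and then using relative paths inside it is the
part that carries over to real scrapers; the parser and selector library are
interchangeable.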

Received on Thursday, 17 July 2014 16:33:45 UTC