- From: Elias Kaerle <elias.kaerle@sti2.at>
- Date: Fri, 16 Sep 2016 16:54:46 +0200
- To: public-schemaorg@w3.org
Hi Hans, Phil and Thad,

thank you all for your answers.

Hans: actually, I was looking for a library or snippet that does exactly that.

Phil: I had already looked into rdflib and rdflib-jsonld, but they "only" translate JSON-LD into RDF and other formats; they do not extract structured data from a website.

Thad: right, actually something like a web crawler! A plugin for those would be great, I agree. I had also hoped that BeautifulSoup might be the right way to go, but it is still a hack to get the pure JSON-LD out of the HTML (especially when it is wrapped in CDATA).

I will keep searching and keep you updated. Thanks again for your help!

Best,
Elias

On 16.09.2016 16:26, Thad Guidry wrote:
> You're talking about a web crawler or spider.
>
> There are a few listed here:
> https://www.schemaapp.com/60-structured-data-tools-create-test-plugins-more/
>
> But none that are open source, as far as I can see.
> Ideally I'd like to see Apache Nutch and Scrapy plugins for this. Even
> BeautifulSoup.
>
> Thad
> +ThadGuidry <https://www.google.com/+ThadGuidry>

--
Elias Kärle, MSc
Semantic Technology Institute
University of Innsbruck
ICT - Technologie Park Innsbruck
2nd Floor, Room 3S02
Technikerstrasse, 21a
6020 Innsbruck
Austria
Tel.: (+43) 512 507 53738
Skype: elias.kaerle
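
For illustration only, here is a minimal sketch of the kind of extraction discussed in this message: pulling the raw JSON-LD out of `<script type="application/ld+json">` elements and unwrapping an optional CDATA section. It uses Python's standard-library `html.parser` rather than BeautifulSoup so that it runs without extra dependencies. The names `JsonLdExtractor`, `extract_jsonld` and `_strip_cdata` are hypothetical helpers made up for this sketch, not part of any existing library, and the CDATA handling assumes only the common `//<![CDATA[ ... //]]>` and `/*<![CDATA[*/ ... /*]]>*/` wrappers.

```python
import json
import re
from html.parser import HTMLParser

# Assumption: a CDATA wrapper, if present, encloses the whole script body and
# may be commented out with "//" or "/* */".
_CDATA_RE = re.compile(r"^(?://|/\*)?\s*<!\[CDATA\[(.*)\]\]>\s*(?:\*/)?$", re.DOTALL)


def _strip_cdata(text):
    """Return the script body without a surrounding CDATA wrapper."""
    text = text.strip()
    match = _CDATA_RE.match(text)
    if not match:
        return text
    body = match.group(1).strip()
    body = re.sub(r"^\*/", "", body)              # leftover from "/*<![CDATA[*/"
    body = re.sub(r"(?://|/\*)\s*$", "", body)    # leftover from "//]]>" or "/*]]>*/"
    return body


class JsonLdExtractor(HTMLParser):
    """Collects the parsed content of every JSON-LD script element."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._chunks = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").strip().lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._chunks = []

    def handle_data(self, data):
        # Script content can arrive in several chunks, so accumulate it.
        if self._in_jsonld:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            raw = _strip_cdata("".join(self._chunks))
            try:
                self.documents.append(json.loads(raw))
            except ValueError:
                pass  # skip blocks that are not valid JSON


def extract_jsonld(html):
    """Return a list of parsed JSON-LD documents found in an HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.documents
```

Calling `extract_jsonld(page_source)` on a fetched page returns plain Python dicts or lists, which could then be handed to a JSON-LD processor such as rdflib-jsonld. Blocks that fail to parse as JSON are silently skipped here; a crawler plugin would probably want to log them instead.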
Received on Friday, 16 September 2016 14:55:38 UTC