- From: carmen <_@whats-your.name>
- Date: Thu, 17 Jul 2014 18:23:15 +0000
- To: public-webize@w3.org
> http://blog.scrapinghub.com/
XPath vs. CSS selectors: a choice that's often overlooked, à la XML vs. JSON.
on the other side of the fence, there is
http://treesheets.org/ CSS selectors -> JSON
would you do a JSON-LD mapping-frame, to get RDF?
or fork treesheets to "graph sheets"
with perhaps a CSS selector for a resource's descriptive zone
and tuples of (CSSSelector, PredicateURI) to finish the triple
as some example fodder, convert this (Ruby) twitter-RDF-er to a graph-sheet (and share your graph-sheet github URI with the list? :)
[0] base = 'https://twitter.com' # base URI
nokogiri.css('div.tweet').map{|t| # resource selector
  s = base + t.css('a.details')[0].attr('href') # subject URI
  yield s, Type, R[SIOCt+'MicroblogPost']
  yield s, Type, R[SIOC+'Post']
  yield s, Creator, R(base+'/'+t.css('.username b')[0].inner_text)
  yield s, Name, t.css('.fullname')[0].inner_text
  yield s, Atom+"/link/image", R(t.css('.avatar')[0].attr('src'))
  yield s, Date, Time.at(t.css('[data-time]')[0].attr('data-time').to_i).iso8601 # epoch seconds -> ISO 8601
  content = t.css('.tweet-text')[0]
  # rewrite relative links as absolute URIs before serializing the content
  content.css('a').each{|a| a.set_attribute 'href', URI.join(base, a.attr('href')).to_s }
  yield s, Content, CleanHTML[content.inner_html]}
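For the sake of discussion, here is a rough sketch of what such a "graph sheet" might look like once the selectors are pulled out of the code and into plain data. Everything in it is an assumption made for illustration: the names `TWEET_SHEET` and `apply_graph_sheet` are hypothetical, the predicate URIs are placeholders (the `Name`, `Content`, etc. constants above are not defined in this message), and object values are taken as plain text only. It works on any document whose nodes respond to `css`, `[]` and `text`, which Nokogiri nodes do.

```ruby
require 'uri'

# Hypothetical "graph sheet": pure data describing how to read triples off a page.
# Predicate URIs are placeholders chosen for illustration only.
TWEET_SHEET = {
  base:     'https://twitter.com',
  resource: 'div.tweet',            # CSS selector of each resource's descriptive zone
  subject:  ['a.details', 'href'],  # selector + attribute yielding the subject URI
  rules: [                          # (CSSSelector, PredicateURI) tuples
    ['.fullname',   'http://xmlns.com/foaf/0.1/name'],
    ['.tweet-text', 'http://rdfs.org/sioc/ns#content']
  ]
}

# Apply a sheet to a document and return [subject, predicate, object] triples.
# The document only needs nodes that respond to #css, #[] and #text.
def apply_graph_sheet(doc, sheet)
  sel, attr = sheet[:subject]
  doc.css(sheet[:resource]).flat_map do |zone|
    link = zone.css(sel).first
    next [] unless link && link[attr]            # skip zones without a subject
    s = URI.join(sheet[:base], link[attr]).to_s  # absolutize the subject URI
    sheet[:rules].flat_map do |selector, predicate|
      zone.css(selector).map { |node| [s, predicate, node.text.strip] }
    end
  end
end
```

With Nokogiri loaded, something like `apply_graph_sheet(Nokogiri::HTML(html), TWEET_SHEET)` should produce the triples; typed objects (URIs, dates) would need an extra column in each rule.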
Received on Thursday, 17 July 2014 18:23:45 UTC