- From: Ruben Verborgh <Ruben.Verborgh@UGent.be>
- Date: Fri, 27 Jan 2017 13:50:27 +0000
- To: William Van Woensel <William.Van.Woensel@Dal.Ca>
- CC: "public-lod@w3.org" <public-lod@w3.org>, Sven Casteleyn <sven.casteleyn@uji.es>
Hi William,

> After having privately discussed this idea of "queryable websites" (as well as some other related ideas) a while ago with Ruben, mentioning my own (partial) implementation and offering to cooperate on the effort

If I recall correctly, the exchange we had was about analyzing what metadata was there, and what metadata would be sufficient (and to find the definition of “sufficient”). That's a question that's indeed not answered yet; you can see I left this as an opening:
– https://ruben.verborgh.org/articles/queryable-research-data/#open-questions-p-2
– https://ruben.verborgh.org/articles/queryable-research-data/#open-questions-p-3
– https://ruben.verborgh.org/articles/queryable-research-data/#open-questions-p-4

> I am quite surprised to see this idea reappear here now.

Your idea was about building cross-website applications; it's something that we're still very far away from. I have provided one simple way for my own website to make itself queryable, i.e., a TPF interface instead of LD documents. In fact, I've had the https://data.verborgh.org/ruben interface for months, but it just included my FOAF profile as data (https://ruben.verborgh.org/profile/). The only thing that I changed is that it now also includes my RDFa data.

> Unfortunately, this means that there are now two separate approaches and implementations, likely with a lot of shared code and duplicated work.

I'm afraid you overestimate the complexity of my solution :-) It's just one 40-line Bash script, half of it dedicated to comments and variables:
https://github.com/RubenVerborgh/WebsiteToRDF/blob/6bcbbe92/extract-website-data
All it does is get the RDFa out of my website and apply some reasoning to it, such that you don't have to mark up everything with 5 ontologies. Just straightforward execution of existing commands.

> We're currently in the process of writing a journal paper on this work.

And I just wrote an LDOW2017 paper that details what the effects of the 40-line Bash script are, and precisely how they affect querying of 1 website.

> Regardless, in the same line of research, another major issue is to what extent *useful* embedded structured data are actually present in websites for 3rd party scenarios.

Yes, that was your initial mail to me, I believe; I still think we should pursue this. I do not have any solution to that except for trying to cover as much as possible through reasoning.

> Consequently, a first useful step would be to study the scope of the available embedded structured data, and for what kind of third-party scenarios they could be useful. The Web Data Commons initiative recently released a new corpus - right on time for this kind of effort :)

+1

Best,

Ruben
Received on Friday, 27 January 2017 13:51:04 UTC