Re: Loupe - a tool for inspecting and exploring datasets

Hi Olaf,

On Fri, Oct 9, 2015 at 7:22 PM, Olaf Görlitz <olaf.goerlitz@gmail.com>
wrote:

> very nice, indeed. Are you planning to make it available as Open Source?
> That way one could also install it locally for private datasets.
>

This work was supported by a Spanish national project called 4V (Volume,
Velocity, Variety and Veracity in innovative data management), and we are
now in the process of releasing it as an open source project with proper
licensing etc., according to the project guidelines. It will take a bit of
time, but that is our goal.


> +1 for using ElasticSearch and Docker
>
> However, what is your experience with using ElasticSearch for triple
> indexing? Why did you not use a triple store?
>

Well, I do use a triple store (OpenLink Virtuoso Open Source) in the
indexing phase, but for storing the indexed information I use Elasticsearch.
Each dataset has its own index with a set of predefined document type
mappings. The main reason for using Elasticsearch was that it made
autocompletion, term searches, and page generation much easier and faster.
Also, scaling across a cluster of machines is more transparent in
Elasticsearch. Overall, my experience with Elasticsearch is very positive.
In addition, I use an in-memory cache (Ehcache), mainly to speed up paged
results and the like.
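
To give a rough idea of the "one index per dataset" layout and the
autocompletion lookups, here is a minimal sketch in Python. It is only an
illustration, not the actual Loupe code: the index naming scheme
("loupe-<dataset>"), the field name "label", and both helper functions are
assumptions made up for this example. The only real Elasticsearch feature
it relies on is the standard "prefix" query in the query DSL. Lowercasing
the prefix assumes the field was indexed in lowercase.

```python
import json


def index_name(dataset_id: str) -> str:
    """Return the index for a dataset (hypothetical naming scheme)."""
    return f"loupe-{dataset_id.lower()}"


def autocomplete_query(prefix: str, field: str = "label", size: int = 10) -> dict:
    """Build a search body that matches documents whose `field` starts with `prefix`.

    Uses the Elasticsearch `prefix` query. The field name is an assumption.
    """
    return {
        "size": size,
        "query": {"prefix": {field: {"value": prefix.lower()}}},
    }


if __name__ == "__main__":
    # The body would be sent to the _search endpoint of the dataset's index.
    print(index_name("DBpedia"))
    print(json.dumps(autocomplete_query("Pers"), indent=2))
```

Because each dataset lives in its own index, a lookup like this only ever
touches the documents of the dataset being explored.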

Best Regards,
Nandana

>
> On 08.10.2015 at 18:09, Nandana Mihindukulasooriya wrote:
>
>> Hi all,
>>
>> We are developing a tool called Loupe [ http://loupe.linkeddata.es ] for
>> inspecting and exploring datasets to understand which vocabularies
>> (classes, and properties) are used in a dataset and which are common
>> triple patterns. Loupe has some similarities to LODStats, Aether,
>> ProLOD++, etc. but it provides the ability to dig into more details. It
>> also connects the information provided directly to the data so that
>> one can see the triples that correspond to those numbers.  At the
>> moment, it indexes 2+ billion triples from datasets including DBpedia
>> (17 languages), Wikidata, LinkedBrainz, BioModels, etc.
>>
>> It's easier to describe what information Loupe provides using an
>> example. If we take the DBpedia dataset, first it provides a summary
>> with the number of triples, distinct subjects, objects, their
>> composition (IRIs, blank nodes, literals), etc. and a summary of the other
>> information that we will present below. http://tinyurl.com/loupe-dbpedia
>>
>> The class explorer provides the list of 941 classes used, the number of
>> instances of each class, the number of classes in each namespace, etc. It also
>> allows you to search for classes. http://tinyurl.com/dbpedia-classes
>>
>> If we select a concrete class such as dbo:Person, it shows the 13,128
>> distinct properties associated with instances of dbo:Person and the
>> probability that a given property is found in an instance. It also
>> provides a list of 438 other types that are declared in dbo:Person
>> instances, which can be equivalent classes, superclasses, subclasses,
>> etc. http://tinyurl.com/dbo-person
>>
>> The property explorer provides a list of 60,347 properties with the
>> number of triples, the number of properties in each namespace, etc. It also
>> allows searching. http://tinyurl.com/dbpedia-properties
>>
>> Again, if we select a concrete property such as dbprop:name, it looks at
>> all the triples that contain the given property and analyzes the subjects
>> and objects of those triples. For subjects, it looks at IRI / blank node
>> counts and also their types. For objects, it does the same but
>> additionally analyzes literals (numeric values, integers, averages, min,
>> max, etc.). http://tinyurl.com/dbp-name
>>
>> The triple pattern explorer allows you to search the 3,807,196 abstract
>> triple patterns. http://tinyurl.com/dbpedia-triple-patterns
>> Or you can select a pattern you are interested in, for instance, which
>> properties connect dbo:Politician to dbo:Criminal:
>> http://tinyurl.com/politician-criminal
>>
>> In all these cases, the numbers are directly linked to the corresponding
>> triples.
>>
>> That's a glimpse of Loupe.  We would like to know whether it is useful for
>> your use cases so that we can keep improving it. It's still in its early
>> stages, so any feedback on improvements is more than welcome. If you are
>> interested, we will be doing a demo [1] at ISWC 2015.
>>
>> Best Regards,
>> Nandana Mihindukulasooriya
>> María Poveda Villalón
>> Raúl García Castro
>> Asunción Gómez Pérez
>>
>> [1] Nandana Mihindukulasooriya, María Poveda Villalón, Raúl García
>> Castro, and Asunción Gómez Pérez. "Loupe - An Online Tool for Inspecting
>> Datasets in the Linked Data Cloud", Demo at The 14th International
>> Semantic Web Conference, Bethlehem, USA, 2015.
>>
>

Received on Saturday, 10 October 2015 08:40:59 UTC