- From: Sarven Capadisli <sarven.capadisli@deri.org>
- Date: Fri, 04 Jul 2014 16:22:02 +0200
- To: public-gld-wg@w3.org
On 2014-05-07 08:23, Sarven Capadisli wrote:
> SemStats 2014 Call for Papers
> =============================
>
> Second International Workshop on Semantic Statistics (SemStats 2014)
>
> Workshop website: http://www.datalift.org/en/event/semstats2014/cfp
> Event hashtags: #SemStats #ISWC2014
>
> in conjunction with
>
> ISWC 2014
> The 13th International Semantic Web Conference
> Riva del Garda - Trentino, Italy, October 19-23, 2014
> http://iswc2014.semanticweb.org/
>
>
> Workshop Summary
> ================
>
> The goal of this workshop is to explore and strengthen the relationship
> between the Semantic Web and statistical communities, to provide better
> access to the data held by statistical offices. It will focus on ways in
> which statisticians can use Semantic Web technologies and standards in
> order to formalize, publish, document and link their data and metadata.
> It follows the 1st Semantic Statistics workshop held at ISWC 2013
> (SemStats 2013) http://www.datalift.org/en/event/semstats2013, which was
> a big success, attracting more than 50 participants throughout the day.
>
> The statistical community shows more and more interest in the Semantic
> Web. In particular, initiatives have been launched to develop semantic
> vocabularies representing statistical classifications and discovery
> metadata. Tools are also being created by statistical organizations to
> support the publication of dimensional data conforming to the Data Cube
> W3C Recommendation. But statisticians see challenges in the Semantic
> Web: how can data and concepts be linked in a statistically rigorous
> fashion? How can we avoid fuzzy semantics leading to wrong analyses? How
> can we preserve data confidentiality?
>
> The workshop will also cover the question of how to apply statistical
> methods or treatments to linked data, and how to develop new methods and
> tools for this purpose.
> Except for visualisation techniques and tools,
> this question is relatively unexplored, but the subject will obviously
> grow in importance in the near future.
>
>
> Motivation
> ==========
>
> There is a growing interest regarding linked data and the Semantic Web
> in the statistical community. A large amount of statistical data from
> international and national agencies has already been published on the
> web of data, for example Census data from the U.S., Spain or France,
> amongst others. In most cases, though, this publication is done by
> people outside the statistical office (see also
> http://datahub.io/dataset/istat-immigration, http://270a.info/ or
> http://eurostat.linked-statistics.org/), which raises issues such as
> long-term URI persistence, institutional commitment and data maintenance.
>
> Statistical organisations are also interested in how the Semantic Web
> might make it simpler for analysts to use well described statistical data
> in conjunction with other forms of data (e.g. geospatial information,
> scientific data, "big data" from various sources) which is expressed
> semantically. The ability to bring together diverse types of data in
> this way should enable new insights on multifaceted issues.
>
> Statistical organizations also possess an important corpus of structural
> metadata such as concept schemes, thesauri, code lists and
> classifications. Some of those are already available as linked data,
> generally in SKOS format (e.g. FAO's Agrovoc or UN's COFOG). Semantic
> Web standards useful for statisticians have now reached maturity.
> The best examples are the W3C Data Cube, DCAT and ADMS vocabularies. The
> statistical community is also working on the definition of more
> specialized vocabularies, especially under the umbrella of the DDI
> Alliance. For example, XKOS extends SKOS for the representation of
> statistical classifications, and Disco defines a vocabulary for data
> documentation and discovery.
> The Visual Analytics Vocabulary is a first
> step towards semantic descriptions for user interface components
> developed to visualize Linked Statistical Data which can lead to
> increased linked data consumption and accessibility. We are now at the
> tipping point where the statistical and the Semantic Web communities
> have to formally exchange in order to share experiences and tools and
> think ahead regarding the upcoming challenges.
>
> Statisticians have a long-standing culture of data integrity, quality and
> documentation. They have developed industrialized data production and
> publication processes, and they care about data confidentiality and more
> generally how data can be used.
>
> The web of data will benefit from getting rich data published by
> professional and trustworthy data providers. It is also important that
> metadata maintained by statistical offices like concept schemes of
> economic or societal terms, statistical classifications, well-known
> codes, etc., are available as linked data, because they are of good
> quality, well-maintained, and they constitute a corpus to which a lot of
> other data can refer.
>
> It seems that after a period where the aim was to publish as many
> triples as possible, the focus of the Semantic Web community is now
> shifting to having a better quality of data and metadata, more coherent
> vocabularies (see the LOV initiative), good and documented naming
> patterns, etc. This workshop aims to contribute to these longer-term
> problems in order to have a significant impact.
>
> The statistics community sometimes faces challenges when trying to adopt
> Semantic Web technologies, in particular:
>
> * difficulty to create and publish linked data: this can be alleviated
> by providing methods, tools, lessons learned and best practices, by
> publicizing successful examples and by providing support.
> * difficulty to see the purpose of publishing linked data: we must
> develop end-user tools leveraging statistical linked data, provide
> convincing examples of real use in applications or mashups, so that the
> end-user value of statistical linked data and metadata appears more
> clearly.
> * difficulty to use external linked data in their daily activity: it is
> important to develop statistical methods and tools especially tailored
> for linked data, so that statisticians can get accustomed to using them
> and get convinced of their specific utility.
>
> To conclude, statisticians know how misleading it can be to exploit
> semantic connections without carefully considering and weighing
> information about the quality of these connections, the validity of
> inferences, etc. A challenge for them is to determine, to ensure and to
> inform consumers about the quality of semantic connections which may be
> used to support analysis in some circumstances but not others. The
> workshop will enable participants to discuss these very important issues.
>
>
> Topics
> ======
>
> The workshop will address topics related to statistics and linked data.
> This includes but is not limited to:
>
> How to publish linked statistics?
>
> * What are the relevant vocabularies for the publication of statistical
> data?
> * What are the relevant vocabularies for the publication of statistical
> metadata (code lists and classifications, descriptive metadata,
> provenance and quality information, etc.)?
> * What are the existing tools? Can the usual statistical software
> packages (e.g. R, SAS, Stata) do the job?
> * How do we include linked data production and publication in the data
> lifecycle?
> * How do we establish, document and share best practices?
>
> How to use linked data for statistics?
>
> * Where and how can we find statistics data: data catalogues, dataset
> descriptions, data discovery?
> * How do we assess data quality (collection methodology, traceability,
> etc.)?
> * How can we perform data reconciliation, ontology matching and instance
> matching with statistical data?
> * How can we apply statistical processes on linked data: data analysis,
> descriptive statistics, estimation, correction?
> * How to intuitively represent statistical linked data: visual
> analytics, results of data mining?
>
>
> Submissions
> ===========
>
> This workshop is aimed at an interdisciplinary audience of researchers
> and practitioners involved or interested in Statistics and the Semantic
> Web. All papers must represent original and unpublished work that is not
> currently under review. Papers will be evaluated according to their
> significance, originality, technical content, style, clarity, and
> relevance to the workshop. At least one author of each accepted paper is
> expected to attend the workshop.
>
> Workshop participation is available to ISWC 2014 attendees at an
> additional cost, see http://iswc2014.semanticweb.org/registration for
> details.
>
> The workshop will also feature a challenge based on Census Data
> published on the web or provided by Statistical Institutes. It is
> expected that data from Australia, France and Italy will be available.
> The challenge will consist of the realization of mashups or
> visualizations, but also of comparisons, alignment and enrichment of the
> data and concepts involved.
>
> We welcome the following types of contributions:
>
> * Full research papers (up to 12 pages)
> * Short papers (up to 6 pages)
> * Challenge papers (up to 6 pages)
>
> All submissions must be written in English and must be formatted
> according to the information for LNCS Authors (see
> http://www.springer.com/computer/lncs?SGWID=0-164-6-793341-0). Please
> note that (X)HTML(+RDFa) submissions are also welcome as long as the
> layout complies with the LNCS style. Authors can for example use the
> template provided at https://github.com/csarven/linked-research.
> Submissions are NOT anonymous.
> Please submit your contributions
> electronically in PDF format at
> http://www.easychair.org/conferences/?conf=semstats2014 before July
> 7, 2014, 23:59 Hawaii Time. All accepted papers will be archived in
> electronic proceedings published by CEUR-WS.org.
>
> See important dates and contact info on the workshop home page.
>
> If you are interested in submitting a paper but would like more
> preliminary information, please contact semstats2014@easychair.org.
>
>
> Chairs
> ======
>
> * Sarven Capadisli, University of Leipzig, Germany, and Bern University
> of Applied Sciences, Switzerland
> * Franck Cotton, INSEE, France
> * Armin Haller, CSIRO, Australia
> * Alistair Hamilton, ABS, Australia
> * Monica Scannapieco, Istat, Italy
> * Raphaël Troncy, EURECOM, France
>
>
> Program Committee
> =================
>
> * Phil Archer, W3C
> * Ghislain Auguste Atemezing, Eurecom, France
> * Jay Devlin, Statistics New Zealand, New Zealand
> * Miguel Expósito Martín, Instituto Cántabro de Estadística, Spain
> * Dan Gillman, US Bureau of Labor Statistics, USA
> * Arofan Gregory, Metadata Technology NA, USA
> * Tudor Groza, School of ITEE, The University of Queensland, Australia
> * Christophe Guéret, Data Archiving and Networked Services (DANS), The
> Netherlands
> * Andreas Harth, AIFB, Karlsruhe Institute of Technology, Germany
> * Hak Lae Kim, Samsung Electronics
> * Laurent Lefort, CSIRO ICT Centre, Australia
> * Domenico Lembo, Sapienza University of Rome, Italy
> * Vincenzo Patruno, Istat, Italy
> * Marco Pellegrino, Eurostat, Luxembourg
> * Dave Reynolds, Epimorphics, UK
> * Hideaki Takeda, National Institute of Informatics, Japan
> * Wendy Thomas, Minnesota Population Center, USA
> * Bernard Vatant, Mondeca, France
> * Boris Villazón-Terrazas, iSOCO, Intelligent Software Components, Spain
> * Joachim Wackerow, GESIS - Leibniz Institute for the Social Sciences,
> Germany
> * Stuart Williams, Epimorphics, UK

"Good news everyone",

The submission deadline is extended to 2014-07-21!

-Sarven
http://csarven.ca/#i
Received on Friday, 4 July 2014 14:22:32 UTC