RE: Publication of scientific research from Michael Hopwood on 2013-04-26 (public-lod@w3.org from April 2013)

From: Michael Hopwood <michael@editeur.org>
Date: Fri, 26 Apr 2013 09:35:10 +0100
To: 'Daniel Schwabe' <dschwabe@inf.puc-rio.br>, "'public-lod@w3.org community'" <public-lod@w3.org>
Message-ID: <631321E01C433541A0DAD1B36386BE0D04905B4440@EX27MAIL03.msghub.com>
The most generic name identifier I know of is ISNI (incorporating VIAF and ORCID): http://www.isni.org/isni_and_orcid

There is a highly generic "open" vocabulary for describing "events" as such: http://www.cidoc-crm.org/

Good point about the need to run a service to actually maintain the data, though. That is one reason why DOI is also so popular...

From: Daniel Schwabe [mailto:dschwabe@inf.puc-rio.br]
Sent: 25 April 2013 20:39
To: public-lod@w3.org community
Subject: Re: Publication of scientific research

All,
two of the major stumbling blocks faced so far in publishing metadata for the WWW series, as well as for ISWC, have been identity management for authors, events and institutions. It is amazing in how many different ways people refer to the same event... and in how many different ways they are cited as well.
A solution that doesn't somehow make it easier to solve this (e.g., ORCID?) would not be so helpful, especially if we want to link things.
In the longer run, it is also not enough to provide the software for people to download and set up servers, etc., to run a conference. It's more work and, more fundamentally, implies a long-range commitment that most institutions are not willing or able to make. One of the reasons EasyChair is so popular is that it provides the *service*, with high reliability/availability/backup/performance guarantees. So this, together with support for metadata curation, might be a good opportunity... not to mention integration with services like Mendeley (recently bought by Elsevier), ResearchGate, etc.

What may be feasible is perhaps to try to agree on some common interchange vocabulary, and build a good "business case" why the maintainers of these services should at least export the metadata in this format.

Cheers
D

On Apr 25, 2013, at 15:09, Kingsley Idehen <kidehen@openlinksw.com> wrote:


On 4/25/13 10:57 AM, Andrea Splendiani wrote:
Hi,

Ok, let's take a practical step.
Let's assume we are going to open a call for a workshop and there we ask for "structured information". Which steps do we take and what do we need?

If we want to move one step at a time, we would still need a site to handle the submission/review process (you cannot rely on online feedback for accepting/rejecting papers with no bias in a given timeframe).
Something like EasyChair accepts the upload of extra files, so that could already be used off the shelf.

Second, we need to specify where and how RDF should be used. If we are in the sw/ld area, what for? We may ask for URIs for:
Citations
Authors
Tools? Ontologies?

What else ?

URIs for:

1. provenance metadata
2. tags
3. subject matter heading / topics.



Take for example the papers here:

http://www.jbiomedsem.com/series/SWAT4LSCSHALS


What would you propose for this kind of research?

## Turtle Snippet Start ##

<http://www.jbiomedsem.com/series/SWAT4LSCSHALS>
a <#WebDocument> ;
<#title> "Semantic technologies in healthcare and life sciences" ;
<#comment> "Edited by: Prof Jonas Almeida, Dr Albert Burger, Prof Joanne Luciano, Dr Andrea Splendiani" ;
<#publicationDate> "2012-12-17"^^<http://www.w3.org/2001/XMLSchema#date> ;
<#lastModificationDate> "2013-03-13"^^<http://www.w3.org/2001/XMLSchema#date> ;
<#seeAlso> <http://www.jbiomedsem.com/content/4/1/9>, <http://www.jbiomedsem.com/content/4/1/7> .

## Turtle End ##

Just a small snippet showing what can be achieved without the overhead of seeking a perfect subject matter ontology. Ultimately, this description can be enhanced (iteratively) by all parties involved. This would include cross referencing the terms to those in existing publicly available shared ontologies [1][2].

Links:

1. http://bibliontology.com/specification -- Bibliographic Ontology
2. http://linkeddata.uriburner.com/about/html/http/bibliontology.com/bibo/bibo.php#
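
To make the cross-referencing idea a bit more concrete, here is a rough Python sketch that emits mapping statements from the local terms in the snippet to terms in shared vocabularies. The particular pairings (for example <#publicationDate> to dcterms:issued, or <#WebDocument> to bibo:Document) and the use of owl:equivalentClass / owl:equivalentProperty are only assumptions for illustration; nothing in this thread fixes them.

```python
# Namespaces of existing shared vocabularies.
OWL = "http://www.w3.org/2002/07/owl#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
DCTERMS = "http://purl.org/dc/terms/"
BIBO = "http://purl.org/ontology/bibo/"

# Hypothetical mapping from the local "scribbled" terms to shared ones.
CLASS_MAP = {
    "#WebDocument": BIBO + "Document",
}
PROPERTY_MAP = {
    "#title": DCTERMS + "title",
    "#comment": RDFS + "comment",
    "#publicationDate": DCTERMS + "issued",
    "#lastModificationDate": DCTERMS + "modified",
    "#seeAlso": RDFS + "seeAlso",
}


def cross_references():
    """Return Turtle statements linking each local term to a shared term."""
    lines = []
    for local, shared in CLASS_MAP.items():
        lines.append(f"<{local}> <{OWL}equivalentClass> <{shared}> .")
    for local, shared in PROPERTY_MAP.items():
        lines.append(f"<{local}> <{OWL}equivalentProperty> <{shared}> .")
    return "\n".join(lines)


if __name__ == "__main__":
    print(cross_references())
```

The output can simply be appended to the original description, so the first draft stays untouched while the mappings are refined over time.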


Hope that helps showcase the fact that metadata doesn't need to be perfect. It just needs to exist in some webby structured form to get this whole thing going :-)

Kingsley



Best,
Andrea




Sent from my iPad

On 25 Apr 2013, at 15:38, Kingsley Idehen <kidehen@openlinksw.com> wrote:
On 4/25/13 8:37 AM, Andrea Splendiani wrote:

Well,

I think Turtle is a very generic language to "write data".
But many people are not used to a computational language at all... the typical interface for "data" being an Excel spreadsheet.

Yes, and a spreadsheet too is an awesome tool for the "data scribbling" patterns I am referring to. No disagreement there, since that used to be my initial alternative to the Turtle approach, i.e., expressing RDF triples in a spreadsheet as 3 columns by N rows.
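
As a rough illustration of the 3-columns-by-N-rows idea, the Python sketch below turns a sheet exported as CSV into Turtle statements. The cell convention is an assumption made up for this example: a cell starting with "#", "http://" or "https://" is treated as an IRI, and any other object cell is treated as a plain string literal. The function names are likewise hypothetical.

```python
import csv
import io


def term(cell, allow_literal):
    """Render one spreadsheet cell as a Turtle term (assumed convention)."""
    cell = cell.strip()
    if cell.startswith(("#", "http://", "https://")):
        return f"<{cell}>"
    if not allow_literal:
        raise ValueError(f"expected an IRI, got {cell!r}")
    # Escape backslashes, quotes and newlines for a Turtle string literal.
    escaped = (
        cell.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    )
    return f'"{escaped}"'


def sheet_to_turtle(csv_text):
    """Convert rows of (subject, predicate, object) into Turtle statements."""
    statements = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) != 3:
            continue  # skip blank or malformed rows
        s, p, o = row
        statements.append(
            f"{term(s, False)} {term(p, False)} {term(o, True)} ."
        )
    return "\n".join(statements)


if __name__ == "__main__":
    sheet = (
        "#this,#relatesTo,#that\n"
        "#this,#title,Semantic technologies in healthcare and life sciences\n"
    )
    print(sheet_to_turtle(sheet))
```

Relative IRIs such as <#this> keep the sheet readable for people who have never seen a prefix declaration; absolute URIs can be introduced later.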

In the end, it's in good part a question of tools that meet users' typical practices.

The other good part is actually a question of incentives.
Now we could open a historical digression on how, in the life sciences, some publishers have been instrumental in the use of public repositories for data. The same mechanism could work for embedding metadata (if there is a need or incentive, tools come).

Yes, discoverability via the metadata graphs that emerge from associating out-of-band metadata with a PDF.


Yet another bit, I was just wondering: are we sure that authors embedding metadata in their papers is the best way to go?

All they need to do is add metadata references (using Linked Data URIs) to the citation sections :-)


They surely know most about their data, but may fall short on standards and even have some bias. It looks like a (modern) role for publishers could be to actually put order in the metadata provided by users.

Everyone needs to participate otherwise the "egg and chicken" conundrum stalls everything.

Kingsley


best,
Andrea


On 25 Apr 2013, at 11:57, Kingsley Idehen <kidehen@openlinksw.com> wrote:

On 4/25/13 2:05 AM, Ivan Herman wrote:
As for the metadata: I think even turtle is too complicated for many (sorry Kingsley). I am not talking about the average readers of this list; I am talking about authors in other disciplines. But, if we bite the bullet and we say that papers are submitted in PDF, we could at least require to include the metadata in the PDF file. After all, the metadata is included in PDF in XMP format, which is (a slightly ugly and restricted version of) RDF/XML. It is ugly, but we have enough tools around to turn it into Turtle, or JSON-LD, or whatever.
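
For what it is worth, pulling the XMP packet out of a PDF can be sketched in a few lines of Python using only the standard library. This assumes the metadata stream is stored uncompressed and unencrypted, UTF-8 encoded, and wrapped in an x:xmpmeta element, which is common but not guaranteed; the function names are made up for this example, and a real tool would use a proper PDF parser.

```python
import re
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"


def extract_xmp(pdf_bytes):
    """Return the first x:xmpmeta element found in the raw bytes, or None."""
    match = re.search(rb"<x:xmpmeta\b.*?</x:xmpmeta>", pdf_bytes, re.DOTALL)
    return match.group(0).decode("utf-8") if match else None


def dc_titles(xmp_xml):
    """List the dc:title values (stored as rdf:li items) in an XMP packet."""
    root = ET.fromstring(xmp_xml)
    path = f".//{{{DC_NS}}}title//{{{RDF_NS}}}li"
    return [li.text for li in root.iterfind(path)]


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as handle:
        packet = extract_xmp(handle.read())
    print(dc_titles(packet) if packet else "no XMP packet found")
```

Since the packet is RDF/XML, it can then be handed to any RDF toolkit for conversion to Turtle or JSON-LD.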
Believe me, I used to believe that Turtle was too complicated for the casual user. By that I mean a literate individual (in any natural language) that would like to use the "scribble" approach to data creation, integration, and publication.

The user profile I have in mind certainly isn't scoped to this or any list associated with Linked Data or the broader Semantic Web, etc.

Prefixes and absolute URIs are the two things that create the illusion of Turtle being complex.

I arrived at my conclusions by testing my theory against a whole range of profiles - kids, teenagers, and adults.

Once I dropped prefixes and absolute URIs from the introduction it was smooth sailing. Remember, underlying all natural languages is a form of subject-predicate-object or subject-verb-object sentence structure. Thus, <#this> <#relatesTo> <#that> etc. becomes easy to understand.

Remember the claim I make on this very day:
Turtle is the key to unleashing the full potential of RDF model based Linked Data that scales to the Web :-)

Note, HTML is too complicated [1], and that's why we don't have a fully functional read-write Web. All we need to do is get people to understand that a text editor is the ultimate starting tool for data curation. Once the basics of structured data curation -- based on the RDF data model -- are understood, this new profile of data curator will then look to tools to exploit the productivity benefits that they add to the endeavor.

Links:

1. http://bit.ly/ZJSaXP -- TimBL on the subject of HTML and its complications.

--

Regards,

Kingsley Idehen
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca handle: @kidehen
Google+ Profile: https://plus.google.com/112399767740508618350/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen

[]s
D
Received on Friday, 26 April 2013 08:40:26 UTC