Re: Publication of scientific research

I think this is a critical issue. Maintaining a service for a long time
is a pain; still, there are tools around that can help. Maintaining a
SPARQL endpoint is difficult. However, maintaining a website is less so.
Moreover, combining a PURL server with any of the existing web archives
provides an ad hoc but functional mechanism for digital preservation.
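
To make the PURL-plus-archive idea a bit more concrete, here is a minimal
Python sketch. The redirect table, the function names and the example.org
addresses are all hypothetical; the only real-world detail assumed is the
Wayback Machine address pattern https://web.archive.org/web/<timestamp>/<url>.
A real PURL server does a good deal more than this.

```python
from typing import Optional

# Hypothetical redirect table: persistent path -> (current URL, still alive?)
PURLS = {
    "/ws2013/paper1": ("http://example.org/ws2013/paper1.html", True),
    "/ws2012/paper7": ("http://example.org/ws2012/paper7.html", False),
}

# Address pattern used by the Wayback Machine for archived snapshots.
WAYBACK = "https://web.archive.org/web/{timestamp}/{url}"


def archived(url: str, timestamp: str = "20130426000000") -> str:
    """Build a Wayback Machine address for a snapshot of `url`."""
    return WAYBACK.format(timestamp=timestamp, url=url)


def resolve(path: str) -> Optional[str]:
    """Return where a persistent path should redirect to, or None if unknown."""
    entry = PURLS.get(path)
    if entry is None:
        return None
    url, alive = entry
    # Live site: redirect there. Dead site: fall back to the archived copy.
    return url if alive else archived(url)
```

The point is only that the persistent address stays the same while the
target behind it moves from the live site to the archive.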

Which all gets back to the reason for doing this. There are pain points
around, absolutely. Perhaps if the web research community suffers these,
then they will start to understand them better and fix them. If we just
ignore them, they remain.

Phil


Daniel Schwabe <dschwabe@inf.puc-rio.br> writes:
> In the longer run, it is also not enough to provide the software for people to
> download and set up servers, etc..., to run a conference. It's more work, and,
> more fundamentally, implies a long range commitment that most institutions are
> not willing or able to make. One of the reasons EasyChair is so popular is
> that it provides the *service*, with high
> reliability/availability/backup/performance guarantees. So this, together
> with support for meta-data curation might be a good opportunity... not to
> mention integration with services like Mendeley (recently bought by Elsevier),
> ResearchGate, etc...
>
> What may be feasible is perhaps to try to agree on some common interchange
> vocabulary, and build a good "business case" why the maintainers of these
> services should at least export the metadata in this format.
>
> Cheers
> D
>
> On Apr 25, 2013, at 15:09 - 25/04/13, Kingsley Idehen <kidehen@openlinksw.com>
> wrote:
>
>> On 4/25/13 10:57 AM, Andrea Splendiani wrote:
>>> Hi,
>>> 
>>> Ok, let's take a practical step.
>>> Let's assume we are going to open a call for a workshop and there we ask
>>> for "structured information". Which steps do we take and what do we need?
>>> 
>>> If we want to move one step at a time, we would still need a site to handle
>>> the submission/review process (you cannot rely on online feedback for
>>> accepting/rejecting papers with no bias in a given timeframe).
>>> Something like easychair accepts the upload of extra files, so that could
>>> be used already off the shelf.
>>> 
>>> Second, we need to specify where and how RDF should be used. If we are
>>> in the sw/ld area, what for? We may ask for URIs for:
>>> Citations
>>> Authors
>>> Tools? Ontologies?
>>> 
>>> What else ?
>> 
>> URIs for:
>> 
>> 1. provenance metadata
>> 2. tags
>> 3. subject matter heading / topics.
>> 
>>> 
>>> Take for example the papers here:
>>> 
>>> http://www.jbiomedsem.com/series/SWAT4LSCSHALS
>>> 
>>> What would you propose for this kind of research?
>> 
>> ## Turtle Snippet Start ##
>> 
>> <http://www.jbiomedsem.com/series/SWAT4LSCSHALS> 
>> a <#WebDocument> ;
>> <#title> "Semantic technologies in healthcare and life sciences" ;
>> <#comment> "Edited by: Prof Jonas Almeida, Dr Albert Burger, Prof Joanne
>> Luciano, Dr Andrea Splendiani" ;
>> <#publicationDate> "2012-12-17"^^<http://www.w3.org/2001/XMLSchema#date> ;
>> <#lastModificationDate>
>> "2013-03-13"^^<http://www.w3.org/2001/XMLSchema#date> ;
>> <#seeAlso> <http://www.jbiomedsem.com/content/4/1/9>,
>> <http://www.jbiomedsem.com/content/4/1/7> .
>> 
>> ## Turtle End ##
>> 
>> Just a small snippet showing what can be achieved without the overhead of
>> seeking a perfect subject matter ontology. Ultimately, this description can
>> be enhanced (iteratively)
>> by all parties involved. This would include cross referencing the terms to
>> those in existing publicly available shared ontologies [1][2].
>> 
>> Links:
>> 
>> 1. http://bibliontology.com/specification -- Bibliographic Ontology
>> 2.
>> http://linkeddata.uriburner.com/about/html/http/bibliontology.com/bibo/bibo.php#
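
One way the cross-referencing step mentioned above might look, as a rough
Python sketch. The mapping is my own assumption, not something proposed in
the thread: it pairs the snippet's local terms with Dublin Core Terms and
RDFS properties, and the `cross_reference` helper is hypothetical. It uses
plain string replacement, so it would also rewrite a literal that happened
to contain one of the local terms.

```python
# Namespaces of two widely used shared vocabularies.
DCTERMS = "http://purl.org/dc/terms/"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# Assumed mapping from the snippet's local terms to shared vocabulary URIs.
SHARED_TERMS = {
    "<#title>": f"<{DCTERMS}title>",
    "<#publicationDate>": f"<{DCTERMS}issued>",
    "<#lastModificationDate>": f"<{DCTERMS}modified>",
    "<#comment>": f"<{RDFS}comment>",
    "<#seeAlso>": f"<{RDFS}seeAlso>",
}


def cross_reference(turtle: str) -> str:
    """Replace each known local term with its shared-vocabulary equivalent."""
    for local, shared in SHARED_TERMS.items():
        turtle = turtle.replace(local, shared)
    return turtle


snippet = (
    "<http://www.jbiomedsem.com/series/SWAT4LSCSHALS> "
    '<#title> "Semantic technologies in healthcare and life sciences" .'
)
print(cross_reference(snippet))
```

Terms without an entry in the table, such as <#WebDocument>, are left
untouched, which fits the "enhance iteratively" approach.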
>> 
>> 
>> Hope that helps showcase the fact that metadata doesn't need to be perfect.
>> It just needs to exist in some webby structured form to get this whole thing
>> going :-)
>> 
>> Kingsley 
>>> 
>>> Best,
>>> Andrea
>>> 
>>> 
>>> 
>>> 
>>> On 25 Apr 2013, at 15:38, Kingsley Idehen <kidehen@openlinksw.com> wrote:
>>> 
>>>> On 4/25/13 8:37 AM, Andrea Splendiani wrote:
>>>>> Well,
>>>>> 
>>>>> I think Turtle is a very generic language to "write data".
>>>>> But many people are not even used to a computational language at all...
>>>>> the usual interface for "data" typically being an Excel spreadsheet.
>>>> 
>>>> Yes, and a spreadsheet too is an awesome tool for the "data scribbling"
>>>> patterns I am referring to. No disagreement there, since that used to be
>>>> my initial alternative to the Turtle approach, i.e., express RDF triples
>>>> using a spreadsheet of 3 columns by N rows.
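
The "3 columns by N rows" idea can be sketched in a few lines of Python,
assuming the spreadsheet has been exported as CSV. The function names are
hypothetical, and the sketch assumes the first two columns always hold URIs
(absolute, or relative ones like #title) while the third holds either a URI
or a plain text value.

```python
import csv
import io


def uri(cell: str) -> str:
    """Wrap a cell in angle brackets so Turtle reads it as a URI."""
    return f"<{cell.strip()}>"


def obj(cell: str) -> str:
    """Render the object column as a URI if it looks like one, else a literal."""
    cell = cell.strip()
    if cell.startswith(("http://", "https://", "#")):
        return uri(cell)
    # Escape the characters a short Turtle string literal may not contain raw.
    escaped = (
        cell.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def rows_to_turtle(csv_text: str) -> str:
    """Turn rows of (subject, predicate, object) into one statement per row."""
    lines = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) != 3:
            continue  # skip blank or malformed rows
        s, p, o = row
        lines.append(f"{uri(s)} {uri(p)} {obj(o)} .")
    return "\n".join(lines)
```

For example, a row holding #this, #relatesTo and #that comes out as the
single statement <#this> <#relatesTo> <#that> .
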
>>>>> In the end, it is in good part a question of tools that meet users'
>>>>> typical practices.
>>>>> 
>>>>> The other good part is actually a question of incentives.
>>>>> We could open a historical digression on how, in the life sciences, some
>>>>> publishers have been instrumental in the use of public repositories for data.
>>>>> The same mechanism could work for embedding metadata (if there is a need
>>>>> or incentive, tools come).
>>>> 
>>>> Yes, discoverability via the metadata graphs that emerge from associating
>>>> out-of-band metadata with a PDF.
>>>>> 
>>>>> Yet another bit, I was just wondering: are we sure that authors embedding
>>>>> metadata in their papers is the best way to go?
>>>> 
>>>> All they need to do is add metadata references (using Linked Data URIs) to
>>>> the citation sections :-)
>>>> 
>>>>> They surely know most about their data, but may fall short of standards
>>>>> and even have some bias. It looks like a (modern) role for publishers
>>>>> could be to actually put order in metadata provided by users.
>>>> 
>>>> Everyone needs to participate otherwise the "egg and chicken" conundrum
>>>> stalls everything.
>>>> 
>>>> Kingsley
>>>>> 
>>>>> best,
>>>>> Andrea
>>>>> 
>>>>> 
>>>>> On 25 Apr 2013, at 11:57, Kingsley Idehen
>>>>> <kidehen@openlinksw.com> wrote:
>>>>> 
>>>>>> On 4/25/13 2:05 AM, Ivan Herman wrote:
>>>>>>> As for the metadata: I think even turtle is too complicated for many
>>>>>>> (sorry Kingsley). I am not talking about the average readers of this
>>>>>>> list; I am talking about authors in other disciplines. But, if we bite
>>>>>>> the bullet and we say that papers are submitted in PDF, we could at
>>>>>>> least require to include the metadata in the PDF file. After all, the
>>>>>>> metadata is included in PDF in XMP format, which is (a slightly ugly
>>>>>>> and restricted version of) RDF/XML. It is ugly, but we have enough
>>>>>>> tools around to turn it into Turtle, or JSON-LD, or whatever.
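
As a rough illustration of getting at the XMP metadata in a PDF, here is a
minimal Python sketch. It relies on the fact that an XMP packet is wrapped
in <?xpacket begin=...?> and <?xpacket end=...?> markers, and it assumes the
packet is stored uncompressed in the file, which is common but not
guaranteed. The `extract_xmp` function is hypothetical, and it only returns
the raw RDF/XML text; turning that into Turtle or JSON-LD is left to an RDF
tool.

```python
import re
from typing import Optional

# An XMP packet sits between an "xpacket begin" and an "xpacket end" marker.
XPACKET = re.compile(
    rb"<\?xpacket begin=.*?\?>(.*?)<\?xpacket end=.*?\?>",
    re.DOTALL,
)


def extract_xmp(pdf_bytes: bytes) -> Optional[str]:
    """Return the body of the first XMP packet in the raw bytes, or None."""
    match = XPACKET.search(pdf_bytes)
    if match is None:
        return None
    return match.group(1).decode("utf-8", errors="replace").strip()
```

It would be called on the file contents, e.g. the bytes read from a PDF
opened in binary mode.
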
>>>>>> Believe me, I used to believe that Turtle was too complicated for the
>>>>>> casual user. By that I mean a literate individual (in any natural
>>>>>> language) that would like to use the "scribble" approach to data
>>>>>> creation, integration, and publication.
>>>>>> 
>>>>>> The user profile I have in mind certainly isn't scoped to this or any
>>>>>> list associated with Linked Data or the broader Semantic Web, etc.
>>>>>> 
>>>>>> Prefixes and absolute URIs are the two things that create the illusion
>>>>>> of Turtle being complex.
>>>>>> 
>>>>>> I arrived at my conclusions by testing my theory against a whole range
>>>>>> of profiles - kids, teenagers, and adults.
>>>>>> 
>>>>>> Once I dropped prefixes and absolute URIs from the introduction it was
>>>>>> smooth sailing. Remember, a form of subject-predicate-object or
>>>>>> subject-verb-object sentence structure underlies all natural languages.
>>>>>> Thus, <#this> <#relatesTo> <#that> etc. becomes easy to understand.
>>>>>> 
>>>>>> Remember the claim I make on this very day:
>>>>>> Turtle is the key to unleashing the full potential of RDF model based
>>>>>> Linked Data that scales to the Web :-)
>>>>>> 
>>>>>> Note, HTML is too complicated [1], and that's why we don't have a fully
>>>>>> functional read-write Web. All we need to do is get people to understand
>>>>>> that a text editor is the ultimate starting tool for data curation. Once
>>>>>> the basics of structured data curation -- based on the RDF data model --
>>>>>> are understood, this new profile of data curator will then look to tools
>>>>>> to exploit the productivity benefits that they add to the endeavor.
>>>>>> 
>>>>>> Links:
>>>>>> 
>>>>>> 1. http://bit.ly/ZJSaXP -- TimBL on the subject of HTML and its
>>>>>> complications.
>>>>>> 
>>>>>> -- 
>>>>>> 
>>>>>> Regards,
>>>>>> 
>>>>>> Kingsley Idehen    
>>>>>> Founder & CEO
>>>>>> OpenLink Software
>>>>>> Company Web: http://www.openlinksw.com
>>>>>> Personal Weblog: http://www.openlinksw.com/blog/~kidehen
>>>>>> Twitter/Identi.ca handle: @kidehen
>>>>>> Google+ Profile: https://plus.google.com/112399767740508618350/about
>>>>>> LinkedIn Profile: http://www.linkedin.com/in/kidehen
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> 
>> 
>> 
>
> []s
> D
>

-- 
Phillip Lord,                           Phone: +44 (0) 191 222 7827
Lecturer in Bioinformatics,             Email: phillip.lord@newcastle.ac.uk
School of Computing Science,            http://homepages.cs.ncl.ac.uk/phillip.lord
Room 914 Claremont Tower,               skype: russet_apples
Newcastle University,                   twitter: phillord
NE1 7RU                                 

Received on Friday, 26 April 2013 09:53:32 UTC