indexing RDWG symposium papers in scientific search engines

From: Shadi Abou-Zahra <shadi@w3.org>
Date: Thu, 25 Oct 2012 11:46:36 +0200
Message-ID: <50890A7C.2010301@w3.org>
To: RDWG <public-wai-rd@w3.org>
CC: daniel.poell@jku.at

Dear RDWG,

Daniel Pöll has done some excellent research for us on how to get our 
symposium papers better indexed in scientific search engines.

Below is a summary of the main findings. Please let us know if you have 
any further thoughts or comments on them:

#1. Search Engines

It seems that Google Scholar and Microsoft Academic Search are the 
largest scientific search engines that crawl the web themselves. There 
are several others, though most seem to be focused on particular 
domains, and some need to be manually pointed to the papers in order to 
index them. A list of search engines is here:

#2. Metadata Formats

Apparently Dublin Core is not as widely supported for this use as we'd 
initially thought. Some of the more widely supported metadata formats 
seem to be:
  - Highwire Press Tags
  - Eprints Tags
  - BE Press Tags
  - PRISM Tags

Of these, Highwire Press tags seem to be the most widely used and 
documented. It also seems that both Google Scholar and Microsoft 
Academic Search support them.

#3. Paper Requirements

The guidelines for Google Scholar (which also seem to be supported by 
Microsoft Academic Search) do not have a strong impact on our current 
paper structure. It seems we only need to add some <meta> elements to 
the HTML code to reflect at least the:
  - Title of the document
  - Year of publication
  - Name of at least one author

Some useful resources found include:
  - <http://scholar.google.com/intl/en/scholar/inclusion.html>
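
As a rough illustration of the minimum listed above, the following 
Python sketch checks whether a page carries a title, a publication date 
and at least one author. It assumes the Highwire Press tag names 
citation_title, citation_author and citation_publication_date; the 
helper names MetaCollector and missing_metadata are hypothetical and 
only serve this example.

```python
from html.parser import HTMLParser

# Assumed minimum set, using Highwire Press tag names.
REQUIRED = ("citation_title", "citation_author", "citation_publication_date")


class MetaCollector(HTMLParser):
    """Collect the content values of <meta name="..."> elements by name."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing <meta ... /> elements.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content:
            self.meta.setdefault(name.lower(), []).append(content)


def missing_metadata(html_text):
    """Return the required tag names that do not appear in html_text."""
    collector = MetaCollector()
    collector.feed(html_text)
    return [name for name in REQUIRED if name not in collector.meta]
```

An empty list from missing_metadata() would mean the page has the 
minimum metadata under these assumptions; it says nothing about whether 
a search engine will actually index the page.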

#4. Symposium Papers

We suggest adding the following <meta> elements to the current HTML of 
the symposium papers, to get them better indexed:
  - <meta name="citation_title" content="[paper title]" />
  - <meta name="citation_author" content="[author, multiple allowed]" />
  - <meta name="citation_publication_date" content="[symposium date]" />
  - <meta name="citation_online_date" content="[paper online date]" />
  - <meta name="citation_conference_title" content="[symposium name]" />
  - <meta name="citation_journal_title" content="W3C WAI Research and Development Working Group (RDWG) Notes" />
  - <meta name="citation_technical_report_institution" content="W3C Web Accessibility Initiative (WAI)" />
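
To show how the suggested elements might be produced for each paper, 
here is a minimal Python sketch. The function name citation_meta and 
all sample values are hypothetical; the two fixed strings are taken 
from the list above. The "YYYY/MM/DD" date format used in the example 
is an assumption and should be checked against the Google Scholar 
inclusion guidelines.

```python
from html import escape

JOURNAL_TITLE = "W3C WAI Research and Development Working Group (RDWG) Notes"
INSTITUTION = "W3C Web Accessibility Initiative (WAI)"


def citation_meta(title, authors, publication_date, online_date,
                  conference_title):
    """Return the suggested <meta> elements as a list of HTML strings."""
    fields = [("citation_title", title)]
    # One citation_author element per author, in the order given.
    fields += [("citation_author", author) for author in authors]
    fields += [
        ("citation_publication_date", publication_date),
        ("citation_online_date", online_date),
        ("citation_conference_title", conference_title),
        ("citation_journal_title", JOURNAL_TITLE),
        ("citation_technical_report_institution", INSTITUTION),
    ]
    # Escape the values so quotes and ampersands stay valid in attributes.
    return ['<meta name="%s" content="%s" />' % (name, escape(value, quote=True))
            for name, value in fields]


if __name__ == "__main__":
    # Placeholder values only, not taken from a real paper.
    for line in citation_meta("Example Paper Title",
                              ["Doe, Jane", "Roe, Richard"],
                              "2012/01/01", "2012/02/01",
                              "Example Symposium"):
        print(line)
```

Emitting one citation_author element per author, rather than a single 
combined value, matches the "multiple allowed" note in the list above.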


Shadi Abou-Zahra - http://www.w3.org/People/shadi/
Activity Lead, W3C/WAI International Program Office
Evaluation and Repair Tools Working Group (ERT WG)
Research and Development Working Group (RDWG)
Received on Thursday, 25 October 2012 09:47:03 UTC