- From: Emmanuelle Bermes <manue@figoblog.org>
- Date: Tue, 2 Aug 2011 19:02:35 +0200
- To: romain.wenz@bnf.fr
- Cc: public-lld@w3.org
Dear Romain,

Thank you for reviewing our draft report. Your comments are really useful.
I've just added the reference of your mail to our list of reviews [1], so
that we will be able to process your feedback when updating the report.

Best regards,
Emma

[1] http://www.w3.org/2005/Incubator/lld/wiki/DraftReportReviewerAssignments

On Fri, Jul 22, 2011 at 3:20 PM, <romain.wenz@bnf.fr> wrote:
>
> Hello,
> With colleagues, we have been reviewing the draft report at
> http://www.w3.org/2005/Incubator/lld/wiki/DraftReportWithTransclusion
>
> Please find enclosed some comments, section by section, and suggestions.
>
> All best,
> Romain Wenz
>
> Département de l'Information Bibliographique et Numérique
> Bibliothèque nationale de France
> Quai François Mauriac
> 75706 Paris cedex 13
> 33 (0)1 53 79 37 39
> ----------------------------------------------------
>
> 3. Benefits
>
> 3.2. Benefits of the Linked Data approach
> Comment: Libraries produce reliable data, especially vocabularies and
> authority data. If they open them as linked data, as soon as they use shared
> ontologies, they can help structure the Web of Data with data that can be
> trusted, with vocabularies that anyone can link to.
>
> Suggestion: The Web needs to be structured with reliable and clean data, and
> libraries can provide them.
>
> 3.2.3. Benefits to Librarians, archivists and curators
> Comment: Among the very positive aspects of “linked data” for libraries,
> there is the possibility to act at different levels, with various benefits.
>
> Suggestion: Every approach can offer specific benefits, from internal re-use
> of data and identifiers to links or services to the end-user.
>
> 3.2.4. Benefits to Developers
> Comment: The general benefit is to get rid of specific library formats,
> which are not really interoperable (e.g. various MARCs). This is very
> important, so as to break barriers between libraries and between library
> data and other types of data.
> But the transition from library-specific data to LD won't be
> straightforward.
>
> Suggestion: It will be possible to work step by step, with Web protocols.
> Suggestion: A section that could be added as “3.2.5.”:
> “Benefits to service providers, software vendors and external developers:
> These developers will work with other important players: service providers,
> software vendors and external developers.
> The consequences are:
> - Research and development could be enhanced through these players.
> They could also work with research laboratories.
> - Libraries will still work with external vendors.
> - A new market emerges for industrial players, developers and service
> providers, which can increase their financial benefits. For instance, using
> interoperable RDF formats enables other actors to re-use structured data
> provided by libraries.”
>
> 5. Relevant technologies
> Comment: We are talking about building structure in Web content, so that
> data from the Web can be used by machines, the way it would be in databases.
>
> Suggestion: Building a “Linked data” infrastructure does not imply
> creating yet another silo.
>
> 5.5 Microformats, Microdata and RDFa
> Comment: Linked Data can go one step further than the work that has been
> done, for instance for OAI sets.
>
> Suggestion: RDFa can be a step for using existing information by distilling
> it into a Web structure.
>
> 6. Implementation challenges and barriers to adoption
> The whole section is clumsy because it makes no distinction between various
> situations. We can find more or less advanced projects: as the “use case”
> section shows, libraries can be very innovative.
> http://www.w3.org/2005/Incubator/lld/wiki/UseCaseReport
>
> 6.1 Designed for stability, the library ecosystem resists change
> Comment: The library ecosystem has been changing since Zenodotus. Semantic
> Web techniques are different from traditional computer services, and budgets
> are not on a comparable basis.
> Furthermore, today libraries' data are digital data and it’s not necessary
> to program retrospective conversion of printed catalogues. Data are already
> digital data, structured with digital formats.
> The historical depth of libraries and of librarians' data is a very important
> asset in the frame of the semantic web, for which the notion of trust is
> essential. Libraries improve the quality of their data by constant
> revisions.
>
> Suggestion: Even if designed for stability, the library ecosystem moved
> early to computer systems and keeps adapting to technological changes.
>
> 6.1.2 Library Data is shareable among libraries, but not yet with the wider
> world
> Comment: Librarians often work, for instance, with the archival community.
> For instance, the XML DTD EAD (Encoded Archival Description) was jointly
> created by librarians and archivists in order to encode descriptions of
> archival collections.
>
> Suggestion: Through cooperation with Archives and Museums, libraries already
> share data and standards with a “Wider world”. Moving to Linked Data is a
> natural continuation.
>
> 6.1.3 Libraries are understaffed in the technology area
> This part is too strong and rude to libraries that actually recruit and work
> in the technology area.
> Suggestion: It is not just a matter of recruiting “IT people”, but of
> training librarians so that they are aware of and proficient in Web
> technologies, and making sure Computing departments and librarians work
> together. This is what libraries do.
>
> 6.2 Libraries do not adapt well to technological change
> Comment: Libraries will need to manage the legacy of MARC format-based data
> for a long period of time even if they manage to shift to LD strategies and
> tools for their current practices.
> This means that before enjoying all the benefits of LD (listed in the scope
> document), libraries will need to maintain parallel systems, which means an
> increase in costs and efforts in software and format development and in data
> management.
> In the short term library developers will still have to deal with these
> formats, which are renewed.
>
> Suggestion: When convincing examples are shown, Libraries adapt very well to
> technological changes.
>
> 6.2.2 Library standardization process is cumbersome
> Comment: But possible!
> Libraries are used to transforming their formats, mapping them to other
> formats, and making them evolve when they work on new projects, new
> technologies, and new types of documents.
>
> Suggestion: It takes time, so that the formats fit the need, but it is
> part of the libraries’ culture.
>
> 6.2.4 Library standards are limited to the library data
> Comment: Library data are not only bibliographic data. Library catalogues
> also contain authority records with many pieces of information about
> persons, families, corporate bodies, works, and subjects. Authority data
> provide named entities and may provide permanent identifiers for these
> entities (such as ARK identifiers in BnF catalogues).
>
> Suggestion: With reliable identifiers, Authority data are also key elements
> for the semantic web.
>
> 6.3 ROI is difficult to calculate
> Comment: Benefits are as difficult as costs to estimate precisely, but some
> can and must be underlined. Mutualisation of the creation of data reduces
> redundancies, increases staff efficiency, and allows librarians to focus on
> other tasks like research on collections or conservation.
> Linking the data of a library to cooperative metadata produced by reliable
> institutions adds value to its data. Opening library linked data may create
> economic value for a country, by allowing commercial reuses of that data
> (Open data).
> Opening library data increases user traffic and the visibility of
> collections (through reuse, SEO, etc.), and thus the possibilities of
> their ROI.
> Using richer, more flexible, more relevant data improves the accessibility
> and the services to users: in public institutions, public utility is a ROI
> by itself. Helping researchers is another one.
>
> Suggestion: It is difficult to calculate ROI precisely, but it is easy to
> see financial benefits (re-use, links, cuts of redundant tasks).
>
> 6.3.3 Vocabulary changes in library data are costly
> Comment: With an Authority File providing permanent identifiers and links,
> it is relatively easy to update any field linked with it. All changes in
> authority records can be automatically transferred into related
> bibliographic records.
>
> Suggestion: Moving to linked data implies relying on authority files and
> identifiers.
>
> 6.4.1- Some data cannot be published openly
> Comment: In some countries, there is a distinction between “public
> information” and “information that can be processed by machines”. In that
> case, information that is available for individuals needs to be justified
> and declared for massive use in computer programs.
>
> Suggestion: There can be national specificities. They have to be clearly
> stated by the publishers.
>
> 6.4.2- Rights ownership can be unmanageably complex
> Comment: Copied and extracted records are one thing. There is also a
> question about the “linked data itself”. The need to quote also means, for
> the provider, being able to report about the use. In some countries
> (including France) the use of the tax-payer’s money has to be justified. You
> have to account for the money: the only way to do it is to have metrics. This
> implies knowing who is using the data, even for free.
>
> Suggestion: Thanks for feedback and for quoting us if you use our data!
>
> 7. Recommendations
>
> 7.1.1 Identify sets of data as possible candidates for early exposure as LD
> Comment: Structured data rely on the use of identifiers. Publishing
> authority files and controlled vocabularies early as linked data will make
> further publication of bibliographic records as linked data easier, by
> allowing links to them as a backbone for bibliographical information.
> Suggestion: Authority files can be a basket for the "low hanging fruits"
> from other libraries.
>
> 7.1.2. For each set of data, determine ROI of current practices, and costs
> and ROI of exposing as LD
> Comment: Determining costs and ROI of exposing sets of data will help in
> choosing which value vocabularies and datasets could have priority.
> Therefore, determining ROI has to be done globally.
> Suggestion: Not necessarily “for each set of data”.
>
> 7.1.3. Consider migration strategies
> Comment: Using Semantic Web technologies inside the library “catalogue”
> seems very promising, because it will allow a much more flexible and
> interoperable use of data: modelling, linking, merging, querying, removing
> redundancies, integrating external data from various formats and publishing
> as various formats, etc.
> This is obviously a great aim for libraries, but it is much more difficult
> than only publishing data as linked data. It must not be an obstacle: it may
> be better for a library to publish first some sets of data as linked data
> than trying from the beginning to migrate its entire catalogue.
> Therefore, the migration of data does not need to cover all possible data.
> It can be only the useful part. This is obviously the case when commercial
> services use RDFa for SEO, with the subset of products which people will be
> looking for. In fact, when we are just putting data into RDF, it is not
> useful if there are no links.
>
> Suggestion: Libraries can “pick and choose” what is relevant and migrate it.
> Suggestion: Using RDF inside the systems themselves is another question that
> has to be advocated.
>
> 7.2.2: Identify Linked Data literacy needed for different staff roles in the
> library
> Comment: In fact, when using the current datasets so as to use them in RDF,
> we see that cataloguing still has to address the creation of links, mainly
> for reconciliation and alignments of concepts (for instance: “do those two
> books tell the same story?”). There, the data obviously still needs to be
> curated by humans.
> But by re-using links and data produced by others, we can expect the
> cataloguing work to be:
> - more centralized;
> - more about creating links (less about writing dates, names or page
> numbers…).
>
> Suggestion: These evolutions have to be clear on the business side.
>
> 7.4 Identify and Link.
> 7.4.1 “Create URIs for the items in library datasets”
> Comment: Providing identifiers is the only way to make links. For big
> libraries permanent identifiers are already being used (e.g. ARK identifiers
> for all resources at the BnF).
>
> Suggestion: This is the basis.
Received on Tuesday, 2 August 2011 17:03:13 UTC