- From: Arthur Secret <secret@www5.cern.ch>
- Date: Wed, 28 Sep 1994 19:09:38 +0100
- To: www-vlib@www0.cern.ch
Dear maintainers,

Some of you will receive mail from the www-vlib list for the first time. Welcome! This is the list dedicated to the administration of the World-Wide Web Virtual Library.

Some news...

1/ The stars

At the current stage, they are not based on anything serious. I looked over the VL quickly, and my aim was also to embellish it.

However, I was thinking of setting up an automated tool that could give an idea of the quality of a document. I intend to base this tool on many features:
- the number of links in the document
- the proportion of text per link
- the number of different icons used in it
- does it provide several types of classification?
- how many times was it accessed last month?
- the usage of HTML tags
...

When the document is split into several parts:
On the top document .../Overview.html, when the parts are not referred to individually, they will be considered as part of the document
On the summary .../Overview2.html, the document will be analyzed by itself

I'm still thinking about this, so please give me your opinion on it.

If your service doesn't have as many stars as you feel it should, forgive me, this is only a temporary situation :)

2/ The classification

Thanks to your comments, I've looked at several types of classification, such as Dewey, Library of Congress, Universal Decimal Classification... The main problem with these well-known classifications is that they are a century old, and occidental.

+ Occidental
  Take Dewey, section 2: Religion
    Sections 21 to 28 are dedicated to Christianity
    Section 29 is for other religions
  Not to speak of the geography of the United States compared to that of Rwanda :)

+ A century old
  Computers didn't exist, so they are relegated to some obscure part. As for recent technologies...
+ Based on letters and numbers
  Their goal is to provide a short sequence of figures (and letters for the Library of Congress), so that people can find their book quickly by saying: I want books on 791.430.944 (Dewey), or JK 9661-9993 (LC). To me, this looks a little bit like the configuration file of sendmail.

But we could (should?) take bits here and there. I found two rules that will guide me on this:
- Look at what people access most, and provide a short way to these documents
- Classification should occur when there is enough content to justify it (so I don't think we should start applying a whole classification with billions of sub-parts, most of them empty :)

I have discussed this with several librarians in Geneva, and will meet UN experts tomorrow.

3/ The Form

I've written a sample form at
http://info.cern.ch/hypertext/DataSources/bySubject/NewReg.html
It will send mail messages directly to you.

I'm also testing a way to analyze these mail messages and automatically add the new entries to some part of the document, between a <!-- begin_adds --> and a <!-- end_adds -->. So, at least in my documents, there will be a part like

<H3>Automatic Registration</H3>
Below, you will find new entries that are not yet included ...

You will be welcome to copy my script once it is ready.

The special place given to commercial entries comes from the fact that they will be added automatically to my document without moderation from me (I will only do post-moderation). I'll do this because I'm sure plenty of commercial companies want to get in, and I don't want to have to add them one by one.

Once we all agree on the form, I would like it to become the preferred way of adding new entries, because it collects keywords given by the authors that will allow us to build an efficient index.

4/ info.cern.ch

We have two new computers that will help info.cern.ch support its load.
I have good hopes that in the next few days info.cern.ch will suddenly become much, much faster, as we also switch from NFS to AFS!

Any comment appreciated, either on this list or directly to me.

Regards,

Arthur
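
As an illustration of the feature list in 1/, here is a minimal sketch of how some of those features might be counted from a page's HTML. It is not the tool described above, which is still being designed. The names FeatureCounter and document_features are hypothetical, an "icon" is assumed to mean a distinct IMG SRC value, and "text per link" is assumed to mean characters of text divided by the number of links. The access counts and the classification check are left out, since they cannot be read from the HTML alone, and no overall star score is computed because no weighting has been decided.

```python
from html.parser import HTMLParser


class FeatureCounter(HTMLParser):
    """Collects simple counts from one HTML document (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = 0        # number of <A HREF=...> anchors
        self.icons = set()    # distinct IMG SRC values (assumed meaning of "icons")
        self.tags = set()     # distinct tag names seen
        self.text_chars = 0   # characters of text, edge whitespace ignored

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        self.tags.add(tag)
        attributes = dict(attrs)
        if tag == "a" and "href" in attributes:
            self.links += 1
        elif tag == "img" and attributes.get("src"):
            self.icons.add(attributes["src"])

    def handle_data(self, data):
        self.text_chars += len(data.strip())


def document_features(html):
    """Return the countable features of one document as a dict."""
    counter = FeatureCounter()
    counter.feed(html)
    counter.close()
    return {
        "links": counter.links,
        # Assumption: "proportion of text per link" = text characters / links.
        "text_per_link": counter.text_chars / counter.links if counter.links else None,
        "distinct_icons": len(counter.icons),
        "distinct_tags": len(counter.tags),
    }


sample = (
    '<H1>Physics</H1>'
    '<A HREF="a.html"><IMG SRC="star.gif">Alpha</A> '
    '<A HREF="b.html"><IMG SRC="star.gif">Beta</A>'
)
features = document_features(sample)
```

How these counts would be combined into a number of stars is left open here, as it is in the message.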
Received on Wednesday, 28 September 1994 18:09:43 UTC
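
The automatic registration step in 3/ places new entries between the <!-- begin_adds --> and <!-- end_adds --> markers. The following is a rough sketch of that insertion only, not the script mentioned in the message. The function name add_entry, the format of an entry, and the choice to append each new entry just before the end marker are all assumptions; parsing the incoming mail is not shown.

```python
BEGIN = "<!-- begin_adds -->"
END = "<!-- end_adds -->"


def add_entry(document, entry_html):
    """Insert entry_html just before the end marker (hypothetical helper).

    Raises ValueError if the two markers are missing or out of order.
    """
    start = document.find(BEGIN)
    end = document.find(END, start + len(BEGIN)) if start != -1 else -1
    if start == -1 or end == -1:
        raise ValueError("begin_adds/end_adds markers not found")
    # Everything outside the marked region is left untouched.
    return document[:end] + entry_html + "\n" + document[end:]


page = (
    "<H3>Automatic Registration</H3>\n"
    "<!-- begin_adds -->\n"
    "<!-- end_adds -->\n"
)
# The entry below is made up for the example.
updated = add_entry(page, '<LI><A HREF="http://example.org/">Example entry</A>')
```

Appending before the end marker keeps entries in arrival order, which would make later post-moderation easier to follow; inserting right after the begin marker would work just as well if newest-first is preferred.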