- From: Thomas Baker <tbaker@tbaker.de>
- Date: Sat, 4 Dec 2010 19:37:52 -0500
- To: "Tillett, Barbara" <btil@loc.gov>
- Cc: Karen Coyle <kcoyle@kcoyle.net>, public-lld <public-lld@w3.org>
On Sat, Dec 04, 2010 at 06:48:09PM -0500, Barbara Tillett wrote:
> This very much ties into the notions that helped VIAF
> to evolve - IFLA had thought one string established by a
> national bibliographic agency (NBA) would be used by everyone
> in the world for the "authors" in each country and that
> one bibliographic description for everything published in a
> country would be provided by the NBA and that those bib records
> would be used everywhere. What's wrong with this picture?
> We don't all use the same languages or scripts. SO the idea
> of the linked clusters of authority records evolved for VIAF,
> where all names used for a person, corporate body/conference,
> uniform title - that is all the text strings, plus the set
> of other attributes associated with each of those entities,
> would together represent that entity (be the surrogate) and
> we could display the context appropriate form to an end user
> based on their preference/profile/ etc.
>
> However, since not all use cases have an end user with a
> preference/profile, we still have "default " values for a
> particular name/text string that libraries will display as
> their authorized access point. RDA is trying to eventually
> get us out of that mind set of the authorized access point,
> by first giving all the identifying elements needed for each
> entity (person, corporate body, etc.) so again the context
> appropriate set of elements could be displayed as needed,
> but for now with limitations of MARC and no new systems yet
> developed to make the visions real, we have to continue with
> authorized access points/aka headings. - Barbara

This is really a helpful summary! I do not see the use of
identifying elements, such as text strings, which together
represent an entity, discussed in the Use Case for VIAF [1].
This reinforces my sense that there is a gap in our use-case
coverage on this issue.

Tom

[1] http://www.w3.org/2005/Incubator/lld/wiki/Use_Case_Virtual_International_Authority_File_%28VIAF%29

> ________________________________________
> From: public-lld-request@w3.org [public-lld-request@w3.org] On Behalf Of Thomas Baker [tbaker@tbaker.de]
> Sent: Saturday, December 04, 2010 10:23 AM
> To: Karen Coyle
> Cc: public-lld
> Subject: Re: SemWeb terminology page
>
> Karen,
>
> On Fri, Dec 03, 2010 at 03:15:23PM -0800, Karen Coyle wrote:
> > In her book "The intellectual foundation of information organization"
> > Svenonius has a section on controlled and uncontrolled vocabularies.
> > Her statement about controlled vocabularies says:
> >
> > "[Controlled vocabularies] are constructs in an artificial language;
> > their purpose is to map users' vocabulary to a standardized vocabulary
> > and to bring like information together." (p.88) [1]
> >
> > Do we agree that this is the role of our #1 group? I ask because I
> > perceive this to be different from the original proposed definition:
> >
> > "These describe concepts that are used in actual metadata."
> >
> > If you look at FRAD [2] you see that the assignment of terminology to
> > the concept is of equal or greater importance than any description of
> > the concept itself. In fact, that's what I would emphasize as the role
> > of a controlled vocabulary: that it is a method to *control* *language
> > terms*. Many controlled vocabularies have minimal information about
> > the concepts, but all exist to make a selection of particular terms of
> > use.
>
> This introduces an interesting angle!
>
> My first thought was along the lines of Antoine's: Linked
> Data is about using URIs when possible, and since this group
> is specifically about Linked Data, we should explain that
> values are not just string literals.
>
> But to reuse the metaphor I suggested in [1], URIs are also the
> "words" of RDF's "language of data".
> If that is so, then I
> would argue that the goal "MAP" [2], which is essentially about
> mapping URIs, is analogous to the mapping of natural-language
> words (string literals) that Svenonius has in mind.
>
> In natural language, people coin different words, or
> variants on the same word, to talk about the same thing, and
> "controlled vocabularies" as described above are for mapping
> those diverse words to an artificial set of "controlled" words.
>
> In the Linked Data context, people are coining URIs for the
> things they need to talk about, and the "MAP" goal is about
> creating links among those URI-words. The only thing missing
> from the "MAP" goal, as defined, is the notion of mapping to
> one particular "authoritative" URI.
>
> I would argue that since we are viewing these things from
> a linked data perspective, we should maintain the emphasis
> on URIs. However, it does make me wonder whether there are
> potential uses of linked data in leveraging literal values
> that are not addressed in our LLD use cases. Two possibilities:
>
> -- Google Squared uses EAV (entity-attribute-value) "triples"
>    in their internal index -- triples composed not of URIs but
>    of strings extracted from Web searches. That's all I know
>    about it, but to me it suggests interesting possibilities
>    for getting from the analysis of unstructured text data
>    to URIs with triples.
>
> -- The other notion is that Linked Data could be used to pull
>    together a set of (natural-language) words and (string literal)
>    names -- a constellation of information which, taken together,
>    could be used to infer more information about the things
>    described, in support of the sort of disambiguation that
>    librarians engage in when they use birth and death dates,
>    occupations, and locations to disambiguate between people
>    with the same name.
>
> Tom
>
> [1] http://lists.w3.org/Archives/Public/public-lld/2010Oct/0088.html
> [2] http://www.w3.org/2005/Incubator/lld/wiki/Goals
> [3] http://bit.ly/hN76wK
>
> --
> Tom Baker <tbaker@tbaker.de>
>

--
Tom Baker <tbaker@tbaker.de>
Received on Sunday, 5 December 2010 00:38:31 UTC