- From: Tillett, Barbara <btil@loc.gov>
- Date: Mon, 6 Dec 2010 07:51:35 -0500
- To: "'Karen Coyle'" <kcoyle@kcoyle.net>, Thomas Baker <tbaker@tbaker.de>
- CC: public-lld <public-lld@w3.org>
I still see there being an "identifier" for the entity (a URI or URL or id number unique to a particular system) apart from text strings meant for displays to humans. - bt

-----Original Message-----
From: Karen Coyle [mailto:kcoyle@kcoyle.net]
Sent: Sunday, December 05, 2010 11:14 AM
To: Thomas Baker
Cc: Tillett, Barbara; public-lld
Subject: Re: SemWeb terminology page

Quoting Thomas Baker <tbaker@tbaker.de>:

> On Sat, Dec 04, 2010 at 06:48:09PM -0500, Barbara Tillett wrote:
>> SO the idea
>> of the linked clusters of authority records evolved for VIAF, where
>> all names used for a person, corporate body/conference, uniform title
>> - that is all the text strings, plus the set of other attributes
>> associated with each of those entities, would together represent that
>> entity (be the surrogate) and we could display the context
>> appropriate form to an end user based on their preference/profile/
>> etc.
>
> I do not see the use of identifying elements, such as text strings,
> which together represent an entity, discussed in the Use Case for VIAF
> [1]. This reinforces my sense that there is a gap in our use-case
> coverage on this issue.

Barbara and Tom, are you saying that the text strings taken as an aggregation are the *identifier* for the entity? If so, I'm not sure how that would work in practice. VIAF as structured assigns a VIAF identifier that I thought was used to identify the entity. If I have mis-understood and the text strings are to be considered a surrogate, then I wonder what functions that surrogate plays in the use of VIAF in applications. The other option is that each text string is a 'surrogate' or label for the entity in the context in which it is used.
kc

>
> Tom
>
> [1] http://www.w3.org/2005/Incubator/lld/wiki/Use_Case_Virtual_International_Authority_File_%28VIAF%29
>
>
>> ________________________________________
>> From: public-lld-request@w3.org [public-lld-request@w3.org] On Behalf Of Thomas Baker [tbaker@tbaker.de]
>> Sent: Saturday, December 04, 2010 10:23 AM
>> To: Karen Coyle
>> Cc: public-lld
>> Subject: Re: SemWeb terminology page
>>
>> Karen,
>>
>> On Fri, Dec 03, 2010 at 03:15:23PM -0800, Karen Coyle wrote:
>> > In her book "The intellectual foundation of information organization"
>> > Svenonius has a section on controlled and uncontrolled vocabularies.
>> > Her statement about controlled vocabularies says:
>> >
>> > "[Controlled vocabularies] are constructs in an artificial
>> > language; their purpose is to map users' vocabulary to a
>> > standardized vocabulary and to bring like information together."
>> > (p.88) [1]
>> >
>> > Do we agree that this is the role of our #1 group? I ask because I
>> > perceive this to be different from the original proposed definition:
>> >
>> > "These describe concepts that are used in actual metadata."
>> >
>> > If you look at FRAD [2] you see that the assignment of terminology
>> > to the concept is of equal or greater importance than any
>> > description of the concept itself. In fact, that's what I would
>> > emphasize as the role of a controlled vocabulary: that it is a
>> > method to *control* *language terms*. Many controlled vocabularies
>> > have minimal information about the concepts, but all exist to make
>> > a selection of particular terms of use.
>>
>> This introduces an interesting angle!
>>
>> My first thought was along the lines of Antoine's: Linked Data is
>> about using URIs when possible, and since this group is specifically
>> about Linked Data, we should explain that values are not just string
>> literals.
>>
>> But to reuse the metaphor I suggested in [1], URIs are also the
>> "words" of RDF's "language of data".
>> If that is so, then I would
>> argue that the goal "MAP" [2], which is essentially about mapping
>> URIs, is analogous to the mapping of natural-language words (string
>> literals) that Svenonius has in mind.
>>
>> In natural language, people coin different words, or variants on the
>> same word, to talk about the same thing, and "controlled
>> vocabularies" as described above are for mapping those diverse words
>> to an artificial set of "controlled" words.
>>
>> In the Linked Data context, people are coining URIs for the things
>> they need to talk about, and the "MAP" goal is about creating links
>> among those URI-words. The only thing missing from the "MAP" goal,
>> as defined, is the notion of mapping to one particular
>> "authoritative" URI.
>>
>> I would argue that since we are viewing these things from a linked
>> data perspective, we should maintain the emphasis on URIs. However,
>> it does make me wonder whether there are potential uses of linked
>> data in leveraging literal values that are not addressed in our LLD
>> use cases. Two possibilities:
>>
>> -- Google Squared uses EAV (entity-attribute-value) "triples"
>>    in their internal index -- triples composed not of URIs but
>>    of strings extracted from Web searches. That's all I know
>>    about it, but to me it suggests interesting possibilities
>>    for getting from the analysis of unstructured text data
>>    to URIs with triples.
>>
>> -- The other notion is that Linked Data could be used to pull
>>    together a set of (natural-language) words and (string literal)
>>    names -- a constellation of information which, taken together,
>>    could be used to infer more information about the things
>>    described, in support of the sort of disambiguation that
>>    librarians engage in when they use birth and death dates,
>>    occupations, and locations to disambiguate between people
>>    with the same name.
>>
>> Tom
>>
>> [1] http://lists.w3.org/Archives/Public/public-lld/2010Oct/0088.html
>> [2] http://www.w3.org/2005/Incubator/lld/wiki/Goals
>> [3] http://bit.ly/hN76wK
>>
>> --
>> Tom Baker <tbaker@tbaker.de>
>>
>>
>
> --
> Tom Baker <tbaker@tbaker.de>
>

--
Karen Coyle
kcoyle@kcoyle.net http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet
Received on Monday, 6 December 2010 12:52:56 UTC