Introduction to controlled vocabularies: Terminology for art, architecture, and other cultural works, by Patricia Harpring.
Reviewed by Elizabeth Knazook
Robert Bothmann, News Editor
Carlo Revelli on the (Non)Autonomy of Cataloging
A Research Agenda for Cataloging: The CCQ Editorial Board Responds to the Year of Cataloging Research
Richard P. Smiraglia
ABSTRACT: The cataloging and classification community was called to highlight 2010 as "The Year of Cataloging Research," and specifically was challenged to generate research ideas, conduct research, and generally promote the development of new research in cataloging. Cataloging & Classification Quarterly has become the most influential journal of research in cataloging and classification since its inception in 1981. The idea behind the research reported here was to give the CCQ editorial board an opportunity to present its point of view about research for cataloging. A Delphi study was conducted in three stages during the 2009-2010 academic year. Members were asked to define the key terms "cataloging," "evidence," and "research," and to develop a research agenda in cataloging. The results reveal a basic core definition of cataloging perceived as a dynamic, active process at the core of information retrieval. An eight point research agenda emerges that is forward-looking and embraces change, along with top-ranked calls for new empirical evidence about catalogs, cataloging, and catalog users.
Hidden in Plain Sight? Records for On-Demand Academic Public Lectures in OCLC WorldCat: A Survey
ABSTRACT: This study examines record creation in OCLC WorldCat for the public and other non-curricular lectures that many colleges and universities in the United States post on the Internet as streaming or downloadable video or audio. It presents research indicating that few libraries catalog this lecture material in WorldCat but also that some libraries catalog heavily in this area. It suggests reasons for the relative neglect of this material by libraries, discusses the merits of making these lectures accessible through library catalogs, and concludes by identifying areas for future research.
KEYWORDS: On-demand academic public lectures, online resources, cataloging, collection development, OCLC WorldCat
Transgender Subject Access: History and Current Practice
ABSTRACT: This article evaluates representation of transgender people and experiences in Library of Congress Subject Headings (LCSH). It compares LCSH treatment of transgender topics to that of controlled vocabularies developed to describe GLBT collections, as well as their treatment by scholarly GLBT encyclopedias. The appraisal of these knowledge domains demonstrates the continued relevance of subject descriptors as a mode of knowledge production both for information professionals and for those we serve. It also suggests strategies available to librarians to render transgender people more visible and accessible in library catalogs, including incorporating new technologies as well as modifying established cataloging instruments.
KEYWORDS: Library of Congress Subject Headings, information retrieval thesauri, transgender people, LGBT studies, gender studies
The DeathFlip Project: Automating Death Date Revisions to Name Headings in Bibliographic Records
Michael Kreyche, Peter H. Lisius, and Amey Park
ABSTRACT: The 2005 revision of Library of Congress Rule Interpretation 22.17, allowing the addition of death dates to personal name headings with open dates, had a significant impact on the maintenance of bibliographic records. The decision not to include open date forms as "see from" fields in revised authority records destined catalog maintenance staff to higher levels of manual review and editing. At Kent State University this prompted a search for better options and resulted in the creation of a shadow authority file intended solely for automated "flipping" of open date headings to the new forms. The file is available to other libraries for use and experimentation.
KEYWORDS: Automated authority control, name authority records, personal names, death dates, bibliographic maintenance
Metadata Quality Control in Digital Repositories and Collections: Criteria, Semantics, and Mechanisms
Jung-ran Park and Yuji Tosaka
ABSTRACT: This article evaluates practices related to metadata quality control in digital repositories and collections as revealed by an online survey of cataloging and metadata professionals in the United States. The survey questions address the perceived importance of metadata quality, metadata quality evaluation criteria and issues, and mechanisms for building quality assurance into the metadata creation process. The results reveal widespread recognition of the essential role of metadata quality assurance. Accuracy and consistency are found to be the primary criteria for evaluating metadata quality. Semantics affects consistent and accurate metadata application. There was a strong awareness that metadata quality correlates with the widespread adoption of various quality control mechanisms, such as staff training, manual review, metadata guidelines, and metadata generation tools. Metadata guidelines are reported by respondents to be used less frequently as a quality assurance mechanism in digital collections involving multiple institutions.
KEYWORDS: Metadata quality control and evaluation, metadata semantics, metadata guidelines, semi-automatic metadata generation, digital repositories
Metadata Practices in Academic and Non-Academic Libraries: A Survey
ABSTRACT: This article presents the results of a survey examining and comparing the metadata practices of academic and non-academic libraries regarding digital projects. It explores the types of metadata and vocabularies utilized, issues of interoperability, end-user created metadata, and staffing for metadata planning and creation. Participants from 87 academic libraries and 40 non-academic libraries responded to the survey. The survey found that, despite their different environments, academic and non-academic libraries engage in similar metadata practices. The majority of the participating libraries have metadata librarians, who are the primary staff members responsible for all metadata activities. Academic libraries tend to use more metadata schemes, plan for metadata interoperability more frequently, and are more likely to have created new positions responsible for metadata for digital projects.
KEYWORDS: Metadata, academic libraries, non-academic libraries, cataloging, metadata specialist/librarian, digital projects
International Observer (A column in honor of Carlo Revelli)
David Bade, Column Editor
For many years I have been amazed at the number of people, including some catalogers, who sincerely believe that with a couple of keywords (in English) typed into a search box, everything that needs to be considered will be handed to us. In an environment saturated with claims about access to all the world's knowledge and demands for international standards and cooperation, the universe of discourse is more often than not reduced to whatever is online and in English: nothing else counts. An international approach to libraries ought at least to seek out ideas expressed beyond the confines of the English language. Unfortunately, getting to that literature, in spite of our wonderful search engines and specialized databases, is not always easy.
A few years ago in his review of "Le catalogue," a special issue of the Revue de la Bibliothèque nationale de France, Michael Carpenter wrote "For reasons that are not entirely clear, finding information on the cataloging practices of other countries, even Western European countries such as France, is always a difficult task for those working in American libraries." He concluded his review with the remark "It is perhaps a sign of linguistic provincialism that only eleven American libraries are currently recorded as holding issues of the Revue. Given the utility of the material in journals such as the one under review, such collection development failures cannot help American librarianship learn from other traditions." While book review editor of Cataloging & Classification Quarterly, Carpenter brought to the attention of the readers of this journal a number of important works on cataloging which were written in European languages, and this new column is an attempt to take a step further in that direction.
The International Observer will be an occasional column in Cataloging & Classification Quarterly modeled on and named after Carlo Revelli's column Osservatorio Internazionale. Revelli's column has appeared regularly in Biblioteche Oggi since 1994, and in it he reviews the non-Italian periodical literature on a particular topic, weaving a diversity of perspectives and problems together in a running commentary and questioning of that topic. It is a column that represents the thinking of a distinguished scholar of cataloging theory and history as well as the perspectives and problems of a public library director. Revelli does not suffer from provincialism but pays attention to the special needs of "provincial" and special libraries of all kinds, as well as special classes of library users. More than just a survey of a topic and the literature about it, each article combines Revelli's careful selection from the literature with his intelligent discussion, the result being something very different from (and an excellent supplement to) any set of items captured by keyword. It is a column that I have enjoyed so much that I have often wondered why no such column appears in any American or other English language publication. Hence my proposal to CCQ and the initiation of the column you are now reading.
The nature of this column will differ slightly from that of Osservatorio Internazionale. Whereas Revelli's column appears in a journal devoted to librarianship in general, the primary goal here will be to discuss recent literature on particular topics relevant to cataloging and classification, whether managerial issues, standards, tools, practices, cooperation, or any topic at all that may appear in the literature related to librarianship, information science, or any other field that an open mind can relate to cataloging.
Furthermore, Revelli generally devotes a great deal of space to publications from the Anglo-American provinces, whereas here the object will be to focus on publications not in English and published outside the Anglo-American world. In a world in which globalization is often assumed to be Americanization, a knowledge and appreciation of the differences that actually exist among us appears to me to be an urgent necessity. Michael Carpenter wrote in his review of the first edition (1996) of Revelli and Visintin's Il Catalogo, "For those whose outlook on librarianship is parochially American, I definitely recommend the book as a sovereign antidote to the unjustified preconception that all the world holds to the same views on cataloging that the English-speaking world has." It is my hope that this column will function in a similar manner by introducing readers from around the world to the great range of perspectives, problems, research, and experiences that exist today.
The most appropriate topic for the first column is the writings of Carlo Revelli himself. Although Revelli has been writing about cataloging since 1960, I will focus on three books published during the past decade: Il Catalogo (with Giulia Visintin, 3rd ed., 2008), Citazione bibliografica (2002), and La biblioteca come teoria e come pratica: antologia degli scritti (2006). Rather than review these books independently, I will follow Revelli's method of looking at them together with a particular topic in mind. The topic I have chosen arose directly out of my reading these books in close succession, and I will formulate it as a question:
Is it possible (or desirable) to understand cataloging as an autonomous activity?
The question took a while to form in my mind; initially I reflected on the experience of déjà vu while reading his Citazione bibliografica. Nearly all of the issues discussed in that book were familiar as topics in cataloging, but the treatments discussed by Revelli varied. He begins by distinguishing quotation from citation (citazione is used for both in Italian), limiting himself in this book to the latter. He then proceeds to discuss footnotes, again noting that his topic is limited to bibliographical footnotes. Having established the limits of his study, he differentiates between bibliographical description and citation in the section "The description of the document."
We know that the description of a document, whether this be physically independent or contained within the publication, in itself does not present problems of availability but only of identification and comprehensibility.iv
This "double necessity of describing and identifying" a document in the practice of citation differs from the practice of descriptive bibliography in being more modest "in so far as the minimum of information that allows identification and finding the document is sufficient."v
From here he moves on to the elements of citation, discussing these in the context of ISBD, the Italian cataloging norms (RICA, or Regole italiane di catalogazione per autori), and the Anglo-American cataloging standard AACR2. The many descriptive elements that one may include in a citation vary in their importance, and "Therefore the opportunity of offering greater detail depends upon the importance given to the text, to its characteristics."vi He proceeds to discuss the various elements that may be, and variously are, included in bibliographical citations, concluding his remarks with a discussion of punctuation, capitalization, the use of italics, etc. Punctuation in ISBD, he notes, is not really punctuation but a system of signs to qualify the meaning of what follows. ISBD punctuation conceived in this manner calls into question not only many discussions of punctuation but the very different status of punctuation in RDA. I will not pursue the implications of this interpretation here, but his final observation in this section, "We ought to seek not an absolute coherence, but coherence within the document, and this coherence should be compatible with comprehension and legibility,"vii hints at the role of punctuation as a system of signs that work in an international setting.
We move on to problems regarding the order of elements within a citation and access to the citation within a document such as a bibliography or catalog. Description by itself, he notes
does not offer points of access, or in other words, it does not function to find information relative to the document, except in the case of the online catalog within which one can find it by means of any word the description contains. If we consider a bibliographical compilation or even more simply the bibliographical citations at the end of a text, these conditions require the descriptions to be organized according to fixed criteria. Every description can be ordered according to a determinate access point, compatible with the other entries assigned to the other descriptions in the same bibliography. The criteria can be alphabetic by author, alphabetic by subject, systematic, chronological, by type of material and so on.viii
Here Revelli is reprising an argument from one of his most brilliant papers, "L'intestazione principale: un reperto archeologico?" [Main entry: an archeological artifact?], published originally in 1996 and reprinted in La biblioteca come teoria e come pratica. In that article Revelli examines citation practices in the contexts of printed bibliographies, card catalogs, and online databases. He discusses a number of writers from Domanovszky to Gorman and beyond on the obsolete and "arbitrary distinction between main and added entries"ix (quoting Nora Tamberg from a 1974 paper), and among the quotations he provides we read R. Conrad Winke's remark "Catalogers no longer have the luxury of continuing out-dated practices solely for the sake of tradition."x To this unanimous chorus Revelli responds:
In fact, we are not dealing with a luxury that is perhaps even superfluous and devoid of utility: the issue here is to evaluate whether this norm [the concept of main entry] retains any meaning when transported into an environment different from the one for which it was established. Olivia M.A. Madison notes the frequent doubts concerning the utility of the principle of main entry in the online catalog, but in spite of the costs associated with it, considers it useful for controlled access ... as the principal element of the citation that identifies a publication since author-title form of citation is preferred. The alternative would be to present incoherent solutions: "deleting authorship as a prominent part of the citation process would only create greater confusion with catalog organization."xi
With this argument, the question which forms the topic of this first installment of The International Observer hit me with full force. Is cataloging an autonomous, purely technical operation carried out in libraries, to be changed at will to accommodate whatever technical possibilities arise without regard to practices or contexts outside the library? Are the historical theories, practices and norms of bibliography, citation, and cataloging irrelevant in all contexts simply because in one technical context they are no longer important for certain kinds of activities?
With these questions in mind I looked at a number of textbooks, handbooks, and general treatises on cataloging and metadata to see how the activity of cataloging was related to practices of citation and bibliography in the world outside the library. The result was a revealing look at what may be the chief source of disconnect between cataloging practice and the world of library users; it is not the conservatism of catalogers that holds back progress in cataloging theory and cataloging practices, but the common conception of cataloging as an autonomous activity unrelated to other practices, the view of cataloging as a technical operation that follows technical developments while ignoring existing (and longstanding) social practices arising from literacy.
Bakewell in his Manual of cataloguing practice (1972) began his book with a chapter on "The nature and purpose of catalogues" in which we read "The principles of cataloguing apply equally to the entry of items in catalogues, bibliographies, indexes and abstracts."xii In the paragraph which follows this commendable statement Bakewell mentions Andrew Osborn's call for considering "bibliographies and book-trade lists, as well as library catalogues, when formulating codes of cataloguing rules"xiii and with that we are done with the relationships between cataloging and any other activity.
In his 1983 monograph Katalogkunde: Formalkataloge und formale Ordnungsmethoden, Klaus Haller offered the reader a few paragraphs on bibliographies and their relation to catalogs on pages 21-22;xiv in even fewer words Bolognini and Pedrinixv begin with formal definitions of cataloging as distinct from bibliography, and Isabelle Dussert-Carbone and Marie-Renée Cazabon (1988) offer no remarks on bibliography or citation at all.xvi
Arlene Taylor (Wynar's Introduction to cataloging and classification, revised 9th edition, 1994) begins right away with chapter one, "Cataloging in context," the purpose of which is "to set the context in which cataloging takes place."xvii Neither bibliography nor citation nor any user practices are mentioned; instead we are introduced immediately to "bibliographic control": "Cataloging is a subset of the larger field that is sometimes called bibliographic control, or organization of information." She defines bibliographic control in the words of Elaine Svenonius first ("the skill or art ... of organizing knowledge (information) for retrieval") and then Smiraglia ("the creation, storage, manipulation, and retrieval of bibliographic data").xviii The only practices and context considered are those of the librarian working with a technical system, not with a user or user practices.
The literature on metadata is too vast and my knowledge of it too limited for me to make any kind of generalizations. What I have read, browsed, and checked the indexes to reveals a literature situated entirely within the context of computer architecture and programming; citation and bibliography are words nowhere to be found, and cataloging is treated as an irritating and limiting term fit for past situations, not the present much less the future. Even where "the users" are trotted out on nearly every page (e.g. in Weinberger's Everything is Miscellaneous), this literature is theoretically grounded in a technical process and "the users" are considered only as users of some particular technical system.
Perhaps the most revealing remarks appear in Lois Mai Chan's Cataloging and Classification: an introduction (2nd edition, 1994). Chan describes both cataloging and classification as 'operations', not as practices, and neither is related to any practices outside the library. At the very beginning she notes that she will discuss "cataloging and classification in terms of three basic functions: descriptive cataloging, subject access and classification"xix and a couple of pages later we read that "one cannot prepare a bibliographic description of a document without resorting to AACR, nor can one classify an item without using a classification scheme."xx This is an astonishing claim, and Revelli offers us abundant evidence of alternatives, both historical and theoretical, to such a truly insular view of cataloging and classification.
Revelli's discussion of cataloging is remarkable for the extent to which it is founded upon an understanding of cataloging as one of many practices, past and present, that must be integrated in order to be able to use a catalog (of whatever form) or bibliography, as well as when reading and writing. Cataloging is a practice that presupposes a literate culture and all of the practices associated with and arising from the activities of writing and reading.
For Revelli, citation is both the historic justification for such principles as main entry and a continuing practice itself arising from our concepts of authorship, which is in turn rooted in the practice of writing. In an essay written before the online catalog appeared in his library ("Divagazioni sul concetto di autore" [Remarks on the concept of the author], originally published in 1976) Revelli made this point clear: "The principal card is compiled entirely on the basis of the concept of the author ... while the secondary cards are composed on the basis of the probable reasons for searching."xxi In the aforementioned essay "L'intestazione principale: un reperto archeologico?" Revelli returns to this theme, remarking "The conflict between bibliographical entity and literary entity persists and assumes a new vitality appropriate to the alternatives facilitated by the online catalog."xxii Revelli concludes by noting that he is averse to predicting the future and joining the chorus agreeing that the principle of main entry "no longer has any reason to exist in a catalog, whether a card catalog or online."xxiii In certain forms of organizing information, in the case of musical compositions with generic titles, depending upon the design of and policies regulating the implementation of the technical system and the data that we put into it, we may find that the concept of main entry still has a role to play in a wide variety of situations. The future, after all, is not something any of us know with any certainty.
Revelli was a much better prognosticator than that chorus he would not join. Technical systems and their requirements seem to be more clearly understood now than when Revelli was a rather lonely voice arguing for the significance of the concept of the author and the principle of main entry. The desirability of indicating the nature of a person's relationship to a work is now acknowledged, as is the technical necessity of making that relationship explicit. Just having a name in a bibliographical record is not enough; if we want to know why that person is associated with a particular item, that information has to be provided for it to be available to the searching system: automated means for discovering and identifying those relationships are wholly inadequate. With RDA we have not only a relator code for author, "A person, family, or corporate body responsible for creating a work that is primarily textual in content" (RDA, Appendix I),xxiv which is the old principle of main entry, but also codes for hundreds of other kinds of relationships that may hold between a person and a particular item.
Cataloging rules (e.g. for the determination of main entry) are not arbitrary rules invented by catalogers and retained only because we are conservative, but practical responses to those readers who still insist on referring to and searching for publications by their authors and their titles. The long list of writers who have insisted on the uselessness of any concept of main entry have simply thought about catalogs in only one form, as electronic databases, and in no relation to any of the social practices surrounding the creation and use of bibliographical information. This is a mistake Revelli never makes. For him, citations "should be comprehensible and permit one to search for the corresponding document in a bibliography or in a library catalog."xxv That is, the citation practices of authors, the cataloging practices of librarians and the searching practices of library users cannot be considered in isolation.
In Il catalogo Revelli comments on the connection between automation and the fate of the principle of main entry, and brings us back to the user and the world in which the library user lives:
We will not consider the changes as a path towards an actual or future perfection, but as corresponding to the changing exigencies of a changing culture. The one who consults the catalog has already his own information for verifying the existence or not of a publication, a work or simply the name of a person: the reader has therefore the necessity of finding a publication or of retrieving a series of documents and has his own knowledge on the basis of which he searches the catalog.xxvi
Citation, bibliography, cataloging, reading, and searching a library catalog using that citation form a single complex of a wide variety of persons, practices, and tools that must be integrated in our understanding as well as in those practices. That brings us to a second characteristic feature of Revelli's approach to cataloging.
In regard to the question of autonomy, Revelli and many writers of the past decades are in agreement rather than disagreement on one matter, and that is that the library and its catalog must not be considered autonomous with respect to the library's users. While few writers on cataloging and metadata pay attention to bibliography and none discuss citation, it seems that everyone these days is referring to library users, their desires and practices, often scolding anyone who disagrees with them, real or imagined, and holding up "the users" as proof that they are right. Here again, however, reading Revelli one comes up with a very different attitude and approach to 'the users.' Instead of claiming that 'the users' want this or that and hauling them out to justify his approach, Revelli always brings us back to the users to remind the reader that his (Revelli's, or anyone else's) way is not universally binding nor eternal but may be adapted or abandoned according to the needs of those users in the reader's library whom Revelli does not know and about whom he makes only one claim: their practices and needs should inform the way you do things in your library. The development of norms, international or otherwise, and their adoption and application are very different matters.
This approach to the users of the library was in fact the most overwhelming feature of Il catalogo, and is no less evident in the essays in La biblioteca come teoria e come pratica. It is perhaps most spectacularly revealed in the appendices to Citazione bibliografica, where Revelli reproduces about fifty pages of various forms of bibliographic citations. At one point in the text Revelli makes the casual remark that his examples M and N in the appendices "confirm the great variety of solutions" to the problem under discussion.xxvii At another point he asks "And if it were necessary to indicate a title in the absence of an author? Well, whether outside or inside the parentheses, in whatever manner it has to be present ... Let's try to have some faith in the intelligence of the reader."xxviii Perhaps the most direct statement of this approach is found in his remarks on his own practice of citation in the book itself:
In homage to the liberty so often recommended, the method of citation applied in this publication is not intended to be prescriptive. Apart from being coherent with this publication itself, it is only intended to offer ease of using the citations in the text and the bibliography at the end, to render the documents recognizable for the purpose of finding them. In order to give an idea of the variety of criteria for citations, I have thought it convenient to present a certain number of examples furnished with notes.xxix
Revelli's awareness and acceptance of "the great variety of solutions" is in stark contrast with an attitude prevalent these days and recommended by the Library of Congress Working Group on the Future of Bibliographic Control.xxx According to this opposite view, all the "players" in the bibliographic universe should be encouraged to adopt our standards (in some matters at least), and in other matters librarians are called upon to accept, all of us, the standards in use by some other group of "players." In this view of things what is important is that the technical system function as it is designed to function; standards are created or changed to promote technical interoperability, and cooperation means everyone must adapt to that system now, and change with it as it changes. Revelli (and no stronger nor more intelligent advocate of standards and international cooperation can be found) does not think about standards and cooperation in the manner of the aforementioned Working Group.
New instruments create new exigencies and the modalities with which information about documents are formed and modified, but the necessity of collecting and distributing information about documents persists, and the recognition of that necessity is common to the compilers of bibliographies and of catalogs of all times. Cataloging norms themselves are the result of the recognition of an exigency and it has always been the result of a professional deformity to consider them as ends in themselves. Without them, we could not bring together information in a coherent manner. From the necessity of fitting a norm to new situations and exigencies follows the ruptures in the coherence of the catalog, the convenience of constructing new catalogs when they reveal themselves incompatible with the old ones, and the opportunity for establishing new norms more convenient in the new situation. These are contradictions that it would be dishonest to ignore, but which will be more easily surmounted if considered in light of the purposes of the catalog.xxxi
For Revelli, cataloging does not exist to enable technical systems to operate; on the contrary, our technical systems (and they are many and varied, not single) as well as our cataloging practices have always been, are now, and must remain rooted in practices such as citation, a practice which, as he demonstrates so clearly in Citazione bibliografica, reveals an enormous variety at present and historically. In the practice of citation an author describes a resource in a manner which permits the reader to evaluate the source, locate it, and read (or listen to) it; a technical system does not describe anything for anyone. A technical system follows its user's actions according to a program, and offers its users whatever that programmed response produces. Citing a resource using a URL, for example, may sometimes give the reader some clues as to what and where to find the resource to which it "points" but often does not. A DOI tells the reader nothing, and a broken link only reminds the reader of the limitations of a technical system poorly integrated with the practices it is supposed to support.
No other writer on cataloging or metadata has integrated the library user so completely into the theoretical foundations of cataloging; Revelli does not bring library users into his discussion to justify his arguments but as the ones whose practices should provide us with the objects of our endeavors as well as determine which among the many possible alternatives, orientations, methods, and policies available to the library will be chosen. Revelli's users are not abstractions, any more than his norms or his technologies. They figure prominently not only in the arguments he makes, but in his bibliography, and we learn in his preface that his relationship to the library's users is the experience that has formed his theorizing. He acknowledges
the librarians, in large part not known to me personally, but who have conversed with me through their writings on cataloging questions and on the relations between the catalog and the public, that public being the end and reason for being of library catalogs. ... And I offer my thanks to those who frequent the Torino Public Library, whose uncertainties, observations and requests have convinced me to consider the value of the catalog as one of the essential components of the library.xxxii
The library's users are not creatures of no particular time, of no particular place, and engaged in no particular practices. Nor are library users a homogenous group: "for the library public no single necessity exists, but the necessities vary" according to the kind of library.xxxiii The public for any particular library, Revelli insists, "does not correspond to a single person cloned thousands of times, but is composed of individuals with different personalities, needs and knowledge."xxxiv Nor is any library such a no place in no time for no particular purpose: "In other words the public uses the catalog not only according to the needs they have, but according to what information is there."xxxv People come to a library because of what they think the library offers them, and that may or may not be what we think we have to offer them or even want to offer them. We may find we are dealing with
contrasting, when not directly contradictory, needs: what information about what materials for what users? Even if the problems relative to the catalog are presented on the basis of general interests common to all, the actual realization of the catalog is strictly conditioned by the specific characteristics of the library. ... Each library has its own reason for being, on which depend its complex organization, the acquisition of materials, their availability to the public to which it is devoted, and the information related to that material.xxxvi
Thinking about the library's mission requires "the recognition of a strict relationship between the materials it possesses, its growth and its public."xxxvii In cataloging, "not all documents are of equal value for everyone and once again we find ourselves faced with the question of why this library exists, what its public desires" and this means that "each library must formulate cataloging policies appropriate to its own nature and its own public."xxxviii In cataloging, access points "permit communication between the user and the catalog: the cataloger provides the information and the user finds it."xxxix
For Revelli it is clear that cataloging cannot be autonomous with respect to library users, nor with respect to the wide range of social practices associated with a literate culture. And there is yet another sense in which cataloging must not be understood in isolation, as an autonomous activity. In the paragraph "Indexing and cataloging" we read:
Let us recall that the cataloger cannot call himself such if he limits himself to describing isolated documents and establishing relative access points, without evaluating the accumulation of information within a catalog in which the products of such individual operations must result in compatibilities among them. The catalog in fact is a joining of information and not simply the sum of isolated information, and therefore it must be homogenous and not present contradictions: "we are not faced with the isolated cataloging of a single document: the real problem is how it is to be integrated into the particular information system of which it is to become a part."xl
Again, Revelli's approach to the problem of catalogers working with a technical system differs radically from that which we have come to expect. Here the cataloger takes responsibility for seeing that the whole system-cataloger, cataloging data, cataloging system-is to be coherently and effectively integrated for the users of that system. That integration, bringing coherence and meaningfulness for the library user is not something the information system does but something the cataloger and the users themselves do, and experience with the library users ought "to lead the cataloger to modify his own behavior."xli In a discussion of the differences between alphabetic and systematic organization Revelli remarks that this is a matter
"Cataloging descriptions do not function autonomously, but are ancillary instruments for identifying documents in order to find them."xliii The catalog, he wants us to understand, "is one of the means for bringing together the materials of a library or library system, but not the only means. . there are other means for informing readers."xliv Catalog maintenance "cannot be considered in isolation and is open to and connected with problems related to other aspects of library service."xlv
The papers in La biblioteca come teoria e come pratica provide us with many further perspectives on the non-autonomy of cataloging. There we find cataloging related to library management, cooperation, censorship, conservation, minorities, the disabled, special libraries, and public libraries. With so many diverse activities, constituents and problems, Revelli is not easily convinced by the slogans coming from other quarters. "We say that each publication should be cataloged only once? Well, let's try to keep our feet on the ground."xlvi Such an attitude, he suggests, is based on the belief that everyone needs the same thing, that needs do not differ from one community or from one kind of user or from one time to another. The irony here is that this "once and forever"-the ultimate dream of the ultimate conservative-approach to cataloging is being preached by folks who think of themselves as the progressive members of the library world. To be sure many of these prophets assume that the catalog record will change automatically, or that nothing need be cataloged at all: everything will be done on the fly by software operating over a multitude of autonomous but always available and flawlessly interoperable databases. 
This scenario assumes that all problems of interpretation and human interaction with the system will be adequately dealt with in a reasonable amount of time by improvements in software, as for example by a subject access system the structure of which would be "independent of language, having instead a multilingual vocabulary that would not be based on any vocabulary but on conceptual differences."xlvii Revelli on the other hand argues that "the diversity of languages and above all cultural variety" impede the rigid adoption of international norms and "make the possibility of putting together terms and lists of authorized headings for different linguistic communities" unlikely unless local situations (geographical and temporal) as well as minority populations are disregarded.
Revelli argues for standards that should be adapted to local circumstances and must change as often as the world changes, for catalogs and cataloging practices that must take account of local circumstances and must change as often as the world around us changes, and for catalogers who adapt accordingly. Revelli's discussions about how these changes should be considered, planned, implemented, and evaluated are based not on fantasies, futurology, and prognostications of technologies to come, nor are they tied to existing technologies. They are based on a deep and intimate knowledge of the great range of past and present practices associated with recording and studying the human experience using all the means we have had at our disposal, a knowledge which brings with it an expectation of change and the necessity of integrating that past world of practices and expectations into a new situation that we are making one move at a time.
Everywhere people are raising questions about the future of paper, the book, of libraries, of librarians ... but we cannot, today, abandon paper, the book, the library or even the librarian because these exist and function today. ... That is not to say that things are destined to remain the same into some distant future, but in view of any future we cannot abandon current reality: I do not see why in this crisis of values and of certainties we must regard the library as something eternal.xlviii
Manos a la obra!
This brief romp through three books by Revelli has not touched upon his 1970 monograph on subject cataloging (a reprint of which will appear soon), nor on most of the nearly 250 items mentioned in the bibliography in La biblioteca come teoria e come pratica. By writing so much of liberty I risk misrepresenting him by barely mentioning his stress on cataloging norms as necessary "for making possible the compatibility and integration of cataloging information without limiting their area of application" (email to the author, 12 July 2010). My only defense is that my focus was on the necessity of seeing these practices-citation and cataloging, as well as bibliography and catalog searching-as related activities, and that the role of norms in cataloging as well as library cooperation is another topic, too much to deal with here. I hope I have written enough to turn the attention of many more librarians-not just catalogers-towards the work of one of the most extraordinary librarians and theorists of cataloging of the past century.
Readers are invited to participate in this column by sending suggestions for topics and materials for consideration and review, whether electronic documents, citations, links or paper publications. Readers are also encouraged to contribute to this column as guest editor on any relevant topic, whether limited to the literature in a particular language or a particular region of the world. While the editor intends for the column to emphasize topics and literature relevant to current practice and interests, historical topics will be considered. Since the chief mark of really excellent writing and scholarship is that it remains relevant even though the world has changed, bringing older work that has been neglected or remains inaccessible to many will also be one of the objectives of this column. Suggestions, material for discussion, and inquiries for writing as guest editor should be directed to:
David Bade, Senior Librarian
Joseph Regenstein Library Room 170
University of Chicago
1100 East 57th Street
Chicago, IL 60637
i Michael Carpenter, Review of: Revue de la Bibliothèque nationale de France no. 9. "Le catalogue." Cataloging & Classification Quarterly 36, no. 2 (2003): 102.
ii Ibid.: 106.
iii Michael Carpenter, Review of: Carlo Revelli, Il Catalogo, The Library Quarterly 70, no. 3 (2000): 403.
iv Carlo Revelli, Citazione bibliografica (Roma: Associazione italiana biblioteche, 2002), 8.
v Ibid.: 10.
vi Ibid.: 13.
vii Ibid.: 26.
viii Ibid.: 26-27.
ix Carlo Revelli, "L'intestazione principale: un reperto archeologico?," in his La biblioteca come teoria e come pratica (Milano: Editrice Bibliografica, 2006), 204.
x Ibid.: 205.
xii K.G.B. Bakewell, A Manual of Cataloging Practice (Oxford: Pergamon Press, 1972), 1.
xiv Klaus Haller, Katalogkunde: Formalkataloge und formale Ordnungsmethoden (München: K.G. Saur, 1983).
xv Pierantonio Bolognini and Ismaela Pedrini, Manuale del catalogatore (Milano: Editrice Bibliografica, 1986).
xvi Isabelle Dussert-Carbone and Marie-Renée Cazabon, Le catalogage: méthode et pratiques (Paris: Éditions du Cercle de la librairie, 1988).
xvii Arlene Taylor, Wynar's Introduction to Cataloging and Classification, revised 9th edition (Westport, Connecticut: Libraries Unlimited, 1994), 3.
xix Lois Mai Chan, Cataloging and Classification: An Introduction, 2nd edition (New York: McGraw-Hill, 1994), xix.
xx Ibid.: xxi.
xxi Carlo Revelli, "Divagazioni sul concetto di autore," in his La biblioteca come teoria e come pratica: antologia degli scritti (Milano: Editrice Bibliografica, 2006), 99.
xxii Carlo Revelli, "L'intestazione principale: un reperto archeologico?" in his La biblioteca come teoria e come pratica: antologia degli scritti (Milano: Editrice Bibliografica, 2006), 208.
xxv Carlo Revelli, Citazione bibliografica, (Roma: Associazione italiana biblioteche, 2002), 54.
xxvi Carlo Revelli, in collaborazione con Giulia Visintin, Il Catalogo, Nuova edizione con aggiornamenti (Milano: Editrice Bibliografica, 2008), 185.
xxvii Carlo Revelli, Citazione bibliografica, (Roma: Associazione italiana biblioteche, 2002), 37.
xxviii Ibid.: 43.
xxix Ibid.: 50.
xxx Revelli's "great variety of solutions" refers to the practice of citation, not to cataloging which was the object of the Working Group's report. Revelli does not suggest that the same kind of variety of solutions available in citation is possible in cataloging. My intention here is to contrast the general openness in Revelli's work for the necessity of local institutions to develop policies appropriate to their mission, where the great variety of institutions means that there are indeed a "great variety of solutions," with the approach of the Working Group which seems to argue that theirs is a single solution and it is not a local solution.
xxxi Carlo Revelli, in collaborazione con Giulia Visintin, Il Catalogo, Nuova edizione con aggiornamenti (Milano: Editrice Bibliografica, 2008), 18. In an email to me Revelli noted that he does not feel that his approach to standards and cooperation contrast so sharply with the recommendations of the Working Group. Since I agree with what Revelli has written, I assume this means that Revelli and I disagree on how to understand the report of the Working Group.
xxxiii Carlo Revelli, "Il catalogo per soggetti e le aspettative dei bibliotecari nei confronti dell'automazione" in his La biblioteca come teoria e come pratica (Milano, Editrice Bibliografica, 2006), 125.
xxxvi Ibid.: 22.
xxxvii Ibid.: 23.
xxxviii Ibid.: 39.
xxxix Ibid.: 51.
xl Ibid.: 55, quoting Teresa Grimaldi.
xlii Ibid.: 295.
xliii Ibid.: 125.
xliv Ibid.: 27.
xlv Ibid.: 427.
xlvi Carlo Revelli, La biblioteca come teoria e come pratica (Milano: Editrice Bibliografica, 2006), 224.
xlvii Carlo Revelli, in collaborazione con Giulia Visintin, Il Catalogo, Nuova edizione con aggiornamenti (Milano: Editrice Bibliografica, 2008), 481.
xlviii Carlo Revelli, La biblioteca come teoria e come pratica (Milano: Editrice Bibliografica, 2006), 233.
Welcome to the news column. Its purpose is to disseminate information on any aspect of cataloging and classification that may be of interest to the cataloging community. This column is not just intended for news items, but serves to document discussions of interest as well as news concerning you, your research efforts, and your organization. Please send any pertinent materials, notes, minutes, or reports to: Robert Bothmann, Memorial Library, Minnesota State University, Mankato, ML 3097, PO Box 8419, Mankato, MN 56002-8419 (phone: 507-389-2010). News columns will typically be available prior to publication in print from the CCQ website.
We would appreciate receiving items having to do with:
Research and Opinion
Cataloging and Beyond: Publishing for the Year of Cataloging Research
The latest event in the "Year of Cataloging Research," this panel was organized and moderated by Allyson Carlyle of the University of Washington, who served on the ALCTS (Association for Library Collections & Technical Services) Implementation Task Group on the Library of Congress Working Group Report responsible for initiating the declaration of 2010 as the Year of Cataloging Research. (More information can be found on Carlyle's web page,.)
Sponsored by the Cataloging and Classification Section of ALCTS and co-sponsored by the ALA Library Research Round Table, LITA Next Generation Catalog Interest Group, and RUSA RSS Catalog Use Committee, the panel featured four speakers who offered ideas for research on cataloging and metadata to a standing-room only crowd on June 27, 2010 at the American Library Association Annual Conference in Washington, D.C. Slides from the presentations are available at.
Cataloging & Classification Research
Sara Shatford Layne, Principal Cataloger, UCLA Library Cataloging and Metadata Center
The panel's theoretical framework was introduced by Sara Shatford Layne, who defined cataloging as connecting users to the bibliographic world. She described this connection in two parts - first "getting the user to what we are describing" (retrieval and navigation, or the FRBR "find" task) and then "explaining to the user what it is that he/she has found" (display, or the FRBR "identify" and "select" tasks). Cataloging research "attempts to measure the effectiveness of the connection that cataloging makes between the user and the bibliographic world." Cataloging research should help to inform decisions about catalog data and system design.
Catalogers would like research to tell them what practices are useful and what ought to be done differently, but "usefulness" might not be measurable. Layne pointed out that what is easy to measure isn't necessarily what should be the subject of research, and that research should be driven by what we need to learn, not what can be readily measured or counted. There is a need for research not just on how library data is currently accessed and presented, but on how our data could be used. While it is possible to measure what users say (through interviews, surveys, transaction logs, focus groups), measure evidence of what users need (through current or published research results), or to analyze the data created by catalogers outside the context of specific users or systems, taken individually each of these approaches has pitfalls. Returning to the two-part connection, Layne advocated for research that blends these approaches by combining analysis of cataloging data (find) with measurement of use (identify and select).
Make It As Easy As a Google Book Search: Learning How to Make the Catalog Usable
Lynn Silipigni Connaway, Senior Research Scientist, OCLC
Lynn Silipigni Connaway presented highlights from "The Digital Information Seeker," an analysis of twelve user behavior studies conducted in the US and UK by OCLC, JISC, and RIN and published within the last five years. A synthesis of the studies was undertaken to provide a better understanding of user information-seeking behavior and to identify issues for development of user-focused services and systems. Common findings that were identified included
Discussing the implications of these findings for libraries and library systems, Connaway emphasized that libraries serve many constituencies with different needs and behaviors. She argued that libraries need to improve and expand seamless access to an ever greater variety of digital content, provide high-quality metadata based on user needs, make library systems look and function more like search engines, and do better at advertising library resources. The full report, co-authored by Lynn Silipigni Connaway and Timothy Dickey and published February 15, 2010, is available at
Celebration and Opportunity: The Year of Cataloging Research-2010
Jane Greenberg, Professor and Director, SILS Metadata Research Center, School of Information and Library Science, University of North Carolina at Chapel Hill
Jane Greenberg identified three target areas for research: automatic metadata generation, creator/author generated metadata, and metadata theory. Research on automatic metadata generation is necessary because traditional, manual cataloging practices are overwhelmed by current demands. Automatic applications could free catalogers from routine activities and allow them to dedicate their skills and knowledge to tasks requiring human intellect. Potential directions for this research include improving metadata generation algorithms (advancing from experimental to operational, developing genre/content driven methods suitable for some disciplines), and developing workflows to integrate manual and automatic methods. Creators and authors have always produced metadata, but research is needed on how catalogers/metadata experts can help them do it better and take advantage of their subject expertise. We need to cultivate a collective effort by recruiting and educating all those with a contribution to make to the "cataloging enterprise" (such as taggers, college students, digital repository contributors). Theoretical research is needed to transform our traditional philosophical and theoretical understanding of cataloging to an understanding of cataloging/metadata for digital information.
In conclusion, Greenberg noted that daily experiences with information systems can shape potential research questions, and that this along with the desire to enable users should motivate metadata research. Finally, she reported on the first of three "Cataloging Research Blitz" events held at the University of North Carolina in celebration of 2010 as the Year of Cataloging Research (for more details, see).
Research on the Next Generation Catalog
Amy Eklund, Catalog Librarian and Instructor, Georgia Perimeter College Libraries
In the final presentation, Amy Eklund began by discussing characteristics that define a "next-gen" catalog, such as web 2.0 features, faceted browsing, and the ability to handle several metadata schemas. Research is needed because next-gen catalog features and functionality have not been based on large-scale evidence - a "build it and they will come" approach has been used.
From formal literature and library discussion lists (especially the NGC4Lib mailing list - see), Eklund identified functionality and features, cost-benefit, and system design as three areas in which research is needed. Research questions about functionality and features included "Are there things users expect next-gen catalogs to do that they don't? What are those things?" and "Has the faceting of subject headings in next-gen catalogs improved results for users?" Suggested approaches to questions in this area emphasized examining the usefulness of various features through observing user behaviors, search logs, and comparisons of use and search results. Cost-benefit questions ranged from "Do next-gen catalogs increase usage of the catalog? Do they increase circulation? What other impacts on collection usage do next-gen catalogs have (e.g., use of full-text, interlibrary loan, etc.)?" to "What barriers currently exist for libraries to switch to a next-gen catalog?" The bulk of the research questions proposed by Eklund were in the area of system design. They included questions about the indexing and display of MARC fields, authority control in faceted browsing, IFLA's Guidelines for Online Public Access Catalogue Displays, interface customization, emerging technologies (mobile, widgets, touch screen), default search settings, display of cross-references, use of relator terms to improve display of relationships, display of multiple records, use of first subject heading or classification number to designate "primary topic" (and use of primary topic in relevance ranking), indexing/display for genre vs. subject headings, indexing/display for works by and works about an author, and search limiting features.
Catalog Librarian / Assistant Professor
Learning Resources & Technology Services
St. Cloud State University
The Metadata Blog, the official blog of the Association for Library Collections and Technical Services (ALCTS) Metadata Interest Group, has expanded its scope to include reports on research, projects, and events related to the metadata community at large. Interested bloggers should contact Kristin Martin, Blog Coordinator.
In 2006 the Library of Congress' Director of Acquisitions and Bibliographic Access (ABA) requested a review of the pros and cons of pre- and post-coordination of Library of Congress Subject Headings. The Cataloging Policy and Support Office (now the Policy and Standards Division or PSD) responded in 2007 with the paper entitled, "Library of Congress Subject Headings: Pre- vs. Post-Coordination and Related Issues."
The paper concluded that it is desirable to continue to assign pre-coordinated heading strings because they provide context, disambiguate between terms, suggest other searches, provide precision in searching, and allow for browse displays. The sophisticated syntax can express concepts better than single words can, but systems can also break them into facets for post-coordinated displays if desired. On the other hand, post-coordinated terms are single terms or phrases and are seriously limited in terms of recall, precision, understanding, and relevance ranking.
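The point that systems can break pre-coordinated strings into facets can be illustrated with a minimal sketch. It assumes headings are available in their display form, with subdivisions separated by a double hyphen; the record identifiers, the sample headings, and the function names are hypothetical and are not drawn from the PSD paper or from any particular library system.

```python
from collections import defaultdict


def split_heading(heading: str) -> list[str]:
    """Split a pre-coordinated heading string into its component terms."""
    return [part.strip() for part in heading.split("--") if part.strip()]


def build_facets(records: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each individual term to the ids of the records whose headings contain it."""
    facets = defaultdict(set)
    for record_id, headings in records.items():
        for heading in headings:
            for term in split_heading(heading):
                facets[term].add(record_id)
    return dict(facets)


# Hypothetical records, each carrying pre-coordinated heading strings.
records = {
    "rec1": ["Painting, Italian--16th century--Exhibitions"],
    "rec2": ["Painting, Italian--History"],
    "rec3": ["Sculpture, French--16th century"],
}

facets = build_facets(records)
for term in sorted(facets):
    print(term, sorted(facets[term]))
```

The full string stays on the record, so a browse display can still show the heading in context, while the derived terms support a post-coordinated facet display. A production system would more likely work from coded subfields than from display strings.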
The paper recommended several initiatives and projects that if undertaken would reduce the costs of pre-coordination. It has now been three years since the paper was written, and PSD has completed a review of the status of the initiatives and projects. PSD's report, "The Policy and Standards Division's Progress on the Recommendations made in 'Library of Congress Subject Headings: Pre- vs. Post-Coordination and Related Issues'" was recently approved by the Acquisitions and Bibliographic Access Directorate (ABA) managers and is now available to the public through LC's web site at.
Questions and comments may be addressed to.
Janis L. Young
Senior Cataloging Policy Specialist
Policy and Standards Division
Library of Congress
The cataloging treatment of reproductions at the Library of Congress is being reconsidered as part of a full-scale reevaluation of cataloging policy decisions necessitated by the upcoming test of Resource Description and Access (RDA). The basic approach to reproductions is the same in RDA as it is in AACR2, but LC and many other US libraries continue to follow an AACR1 approach as documented in the Library of Congress Rule Interpretation (LCRI) for Chapter 11 (microform reproductions) and LCRI 1.11A (non-microform reproductions).
In order to perform a more accurate test of RDA's provisions, those LC catalogers participating in the US National Library Test of RDA will follow RDA as written for the period of the RDA test (Oct. 1-Dec. 31, 2010), which entails basing the record for a reproduction on the item in hand and providing information about the original in the record when the decision is to have separate records for the original and the reproduction.
A discussion paper is available that provides some background information on how LC's policies came to differ from AACR2's treatment of reproductions, possible approaches to implementing an AACR2/RDA-compatible treatment, and LC's decisions on how its RDA testers will treat reproductions during the US RDA test.
Questions and comments may be sent to
The Library of Congress' Policy and Standards Division started work in 2007 on the creation of genre/form terms for moving images. Terms for recorded sound and cartography have since been added to this body, and LC anticipates adding genre/form terms for law, literature, music, and religion during the next two years.
During the development phase, the genre/form terms have been assigned MARC 21 coding indicating the terms are part of the Library of Congress Subject Headings thesaurus.
The New England Technical Services Librarians (NETSL) held their Spring Conference on April 15, 2010, entitled Crosswalks to the Future: Library Metadata on the Move. Below is a summary of some of the sessions. Most of the presentation slides are available from the NETSL conference Web site at:.
Barbara Tillett, Chief, Policy and Standards Division, Library of Congress
The opening keynote speaker, Barbara Tillett, began by stating that the efforts to lay some of the building blocks for linked data in the web environment have been underway for several years now. The term "cloud computing" was mentioned - our information systems can now be part of the Internet cloud computing environment that exists with Google and other systems where library resources are available not just from an institution's computer, but shared and available to anyone on the web. She discussed three controlled vocabulary projects the Library of Congress developed to help in building a linked web environment.
Virtual International Authority File (VIAF)
The Virtual International Authority File (VIAF) was established as a free service on the web to share authority data created by libraries all over the world. One objective is that libraries can use VIAF to reduce cataloging costs associated with authority control. The service was also designed to make authority control easier on an international scale by providing libraries with the ability to help each other maintain the data. Longer term, the VIAF partners hope the data can be used in linked data services to enable display of bibliographic data in the form, language, and script that the end user wants. For example, say a user needs information about Anton Chekhov in Cyrillic, but has set up a profile to see English. A system application could use VIAF to display the Cyrillic script plus link to other things like the cover art of publications by or about Chekhov.
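The idea of choosing a display form from a cluster of linked name records might be sketched as follows. This is an illustration under stated assumptions, not a description of how VIAF itself is implemented: the NameForm class, the script codes, the source labels, and the pick_display_form function are all hypothetical, and the two name strings are sample data.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class NameForm:
    text: str     # the heading as supplied by one authority file
    script: str   # script code for the heading, e.g. "Latn" or "Cyrl"
    source: str   # hypothetical label for the contributing authority file


def pick_display_form(
    cluster: Sequence[NameForm], preferred_scripts: Sequence[str]
) -> Optional[NameForm]:
    """Return the first form matching the user's script preferences, in order.

    Falls back to the first form in the cluster when nothing matches,
    and to None when the cluster is empty.
    """
    for script in preferred_scripts:
        for form in cluster:
            if form.script == script:
                return form
    return cluster[0] if cluster else None


# A hypothetical cluster linking two forms of the same person's name.
cluster = [
    NameForm("Chekhov, Anton Pavlovich, 1860-1904", "Latn", "file-A"),
    NameForm("Чехов, Антон Павлович, 1860-1904", "Cyrl", "file-B"),
]

chosen = pick_display_form(cluster, ["Cyrl", "Latn"])
print(chosen.text)
```

Because every form in the cluster identifies the same person, the choice of display form is a presentation decision and leaves the underlying links untouched.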
The VIAF project began in 2003 and was originally a partnership between the Library of Congress, the Deutsche Nationalbibliothek (the National Library of Germany), and OCLC; later (2007) the Bibliothèque nationale de France joined the partners. The group considered various models, and they decided to use a centralized model with links to different national authority files. OCLC developed algorithms to use information in the bibliographic records and information in authority records to match names in the two authority files.
As of 2009, VIAF became available as linked data. That means uniform resource identifiers (URIs) for everything. The scripts are in Unicode, data can be submitted as UNIMARC, MARC 21, or MARCXML, and in addition to names of persons, there has been some preliminary work on geographic names. VIAF now has 16 participants and two more about to join. There are 20 authority files with about 13 million names for about 10 million identities or persons and 4.5 million clusters.
Tillett spent a fair amount of time describing how the VIAF works, deriving authority records from bibliographic records. More information about this piece can be obtained from her PowerPoint slides.
Demo of VIAF
Tillett gave a brief demonstration of the interface to the Virtual International Authority File. To try searching the VIAF, follow these steps:
Go to. There is a search box at the top with options to select what file and what type of information you wish to search. Search for "Chekhov."
The result is a display that shows information derived from the cluster of linked name records and associated bibliographic records. You will see the various names and you can scroll through the cover art associated with Chekhov's publications. There are over 200 alternate forms of Chekhov's name in numerous scripts. Here are a few other things you can do:
Currently the Program for Cooperative Cataloging (PCC) and Library of Congress catalogers are not required to search VIAF before creating a Name Authority record to contribute to LC/NACO Name Authority File, but this may change when VIAF moves out of the prototype stage. Currently catalogers are encouraged to use the VIAF to resolve conflicts.
Next Steps for VIAF
Future enhancements include better searching, more linked data (such as "related persons," as is possible in WorldCat Identities), participants beyond libraries (e.g., publishers, museums), and more name types (corporate and family names, uniform titles, geographic names, etc.).
The id.loc.gov Web Site
This Web site provides the Library of Congress' own controlled vocabularies, like the Library of Congress Subject Headings (LCSH), in the format known as SKOS (Simple Knowledge Organization System). If you are not familiar with SKOS, it "provides a model for expressing the basic structure and content of concept schemes such as thesauri, classification schemes, subject heading lists, taxonomies, folksonomies, and other similar types of controlled vocabulary" (SKOS Primer). The intent of this project is to provide access not just by humans but also by machines to commonly found standards and vocabularies developed by LC. The benefits are that other servers can download entire controlled vocabularies and the values within them, in multiple formats, and that they are available for free on the web. LCSH is the first offering (it includes subject headings, genre/form headings, children's subject headings, subdivision records, and validation records). Future additions include the Thesaurus for Graphic Materials (TGM), MARC geographic area codes, MARC language codes, and MARC relator codes.
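As a rough illustration of what a SKOS concept looks like to a machine, the following Python sketch models one concept and prints it in Turtle syntax. The `skos:` property names come from the SKOS vocabulary; the `Concept` class, the `to_turtle` helper, the `example.org` identifiers, and the sample labels are assumptions made for illustration and are not taken from id.loc.gov.

```python
from dataclasses import dataclass, field
from typing import List

SKOS_PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."


@dataclass
class Concept:
    """One entry in a controlled vocabulary (hypothetical helper class)."""
    uri: str
    pref_label: str                                       # authorized heading
    alt_labels: List[str] = field(default_factory=list)   # "use for" references
    broader: List[str] = field(default_factory=list)      # URIs of broader terms


def to_turtle(concept: Concept) -> str:
    """Serialize a concept as a small Turtle document.

    Labels are assumed to contain no double quotes; no escaping is done.
    """
    statements = [
        f"<{concept.uri}> a skos:Concept",
        f'    skos:prefLabel "{concept.pref_label}"@en',
    ]
    statements += [f'    skos:altLabel "{label}"@en' for label in concept.alt_labels]
    statements += [f"    skos:broader <{uri}>" for uri in concept.broader]
    return SKOS_PREFIX + "\n" + " ;\n".join(statements) + " ."


if __name__ == "__main__":
    # Illustrative values only; the URIs are placeholders.
    heading = Concept(
        uri="http://example.org/subjects/cataloging",
        pref_label="Cataloging",
        alt_labels=["Cataloguing"],
        broader=["http://example.org/subjects/information-organization"],
    )
    print(to_turtle(heading))
```

The point of the sketch is that preferred headings, cross-references, and broader terms become explicit, typed statements that another server can parse without knowing anything about MARC authority records.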
RDA Controlled Vocabularies
Because RDA will be available on the web through the RDA Toolkit, the creators of RDA decided to make the lists of terms (controlled vocabularies) freely accessible on the Web through a contract with The Metadata Registry. The controlled vocabularies include the values for naming the types of content (sound, text, still images, etc.), types of carriers (film reel, computer disc, volume), and other elements in RDA that have controlled lists of values. The RDA vocabularies are hosted by the National Science Digital Library (NSDL). It is important that the RDA controlled vocabularies be accessible on the web so that third-party service providers and web applications can easily access and make use of this data.
Jean Godby, Research Scientist, OCLC
Sometimes records are in one format in a particular database or on the web, but they need to be in a different format to be used effectively by some other system. Imagine a black box that magically transforms data from one format to the other. That's what Jean Godby (and other OCLC Research staff) has been working on at OCLC Research. She discussed the Crosswalk Web Service implemented at OCLC. The user specifies an input and an output, and the system does the rest. She discussed their ONIX to MARC crosswalk, Dublin Core to MARC crosswalk in Connexion Browser, and the WorldCat Digital Collection Gateway Tool. These are all examples of tools currently available that make use of the behind the scenes system architecture that OCLC has developed.
In OCLC's implementation, they first translate to MARC, then to the output format. Though they need two translations (input to MARC; MARC to output), Godby explained that this model reduces the total number of translations required and permits more reusability. Right now OCLC only works with bibliographic metadata, though other kinds of data are under discussion.
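The arithmetic behind this design choice can be sketched in a few lines of Python. With n formats (counting MARC), direct pairwise crosswalks need n(n-1) one-way translators, while routing everything through MARC needs only 2(n-1). The element maps below are toy assumptions covering two fields, and the function and dictionary names are hypothetical; this is not OCLC's Crosswalk Web Service.

```python
from typing import Callable, Dict

Record = Dict[str, str]


def direct_translators(n_formats: int) -> int:
    """One-way translators needed if every format maps directly to every other."""
    return n_formats * (n_formats - 1)


def hub_translators(n_formats: int) -> int:
    """One-way translators needed if each non-hub format maps to and from the hub."""
    return 2 * (n_formats - 1)


# Toy element maps (assumptions). Hub keys are MARC tag plus subfield code.
def dc_to_hub(rec: Record) -> Record:
    return {"245a": rec["title"], "260b": rec["publisher"]}


def hub_to_dc(rec: Record) -> Record:
    return {"title": rec["245a"], "publisher": rec["260b"]}


def onix_to_hub(rec: Record) -> Record:
    return {"245a": rec["TitleText"], "260b": rec["PublisherName"]}


def hub_to_onix(rec: Record) -> Record:
    return {"TitleText": rec["245a"], "PublisherName": rec["260b"]}


TO_HUB: Dict[str, Callable[[Record], Record]] = {
    "dc": dc_to_hub, "onix": onix_to_hub, "marc": dict,
}
FROM_HUB: Dict[str, Callable[[Record], Record]] = {
    "dc": hub_to_dc, "onix": hub_to_onix, "marc": dict,
}


def translate(record: Record, source: str, target: str) -> Record:
    """Translate by going input -> hub (MARC) -> output."""
    return FROM_HUB[target](TO_HUB[source](record))


if __name__ == "__main__":
    onix = {"TitleText": "An example title", "PublisherName": "Example Press"}
    print(translate(onix, "onix", "dc"))
    print(direct_translators(5), "direct vs.", hub_translators(5), "via hub")
```

Adding a new format in this arrangement means writing two functions and registering them, rather than writing a new crosswalk for every existing format.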
ONIX to MARC crosswalk
Publishers use the ONIX format to display their publisher data on the web and in their systems. This crosswalk has been published on the web, and it is the most comprehensive crosswalk available. Godby spoke about her OCLC colleague Renee Register's project (now an OCLC service called Metadata Services for Publishers) that utilizes this crosswalk. The intent of this service is to provide a way to make publisher metadata available earlier (and get the records into WorldCat for use). ONIX records are obtained from publishers and mapped to MARC. If a match is found in WorldCat, fields from the matching MARC record are applied to the mapped record. If no match is found, the record is populated from fields in the closest FRBR cluster. The result is mapped back to ONIX to be delivered to publishers. It is also made available to the library community as an enhanced MARC record via WorldCat.
Dublin Core to MARC crosswalk
If you have ever used OCLC Connexion Browser, you may have seen its translation feature, which allows you to view a MARC record as a Dublin Core record. This crosswalk is also used in OCLC's Digital Collection Gateway, which allows users of OCLC CONTENTdm to customize their map from Dublin Core to MARC through an editing interface.
Some Problems/Challenges of Mapping Bibliographic Metadata
Godby stated that we are dealing with two paradigms in metadata processing. She showed a slide in which on the left was a schematic representation of non-MARC metadata streams such as ONIX and Dublin Core terms (e.g., Subject, Publisher, etc.). These are designed to be taken apart, enhanced, and merged as processing needs change.
On the right of the slide was a schematic view of a MARC record. When OCLC creates a crosswalk, they are mapping pairs of elements. But MARC has additional packaging that can get in the way, making the mapping task more complex. For example, extra effort is required to add, validate, and dismantle ISBD and AACR2 rules. AACR2 introduces concepts that the other standards do not have, and the ISBD and AACR2 layers are not a worldwide standard. As one example, Godby displayed a record that describes an audio CD of arias from Puccini's operas.
What are the problems with this? Godby explained that the relationship is over-specified, the information is redundant, and the maps between coded values and textual values are not reliable. Below is another example of the MARC and non-MARC paradigm, taken from Godby's slides.
|MARC Standard||Modern non-MARC standards|
|Record-oriented||Element or field-oriented|
|Tailored to applications in the library community||Agnostic about how the data will be used|
|Designed for storage||Designed for transmission|
What's next for this project?
OCLC plans to participate in RDA testing. The ONIX/RDA Framework solves some of the problems with physical descriptions by proposing a registered common vocabulary that both standards share.
OCLC wants to apply lessons learned from studies of MARC field usage. A recent OCLC Research study, "Implications of MARC Tag Usage on Library Metadata Practices," pointed out that:
Leslie Straus, President, SkyRiver
SkyRiver is a new competitor to OCLC: it provides a cataloging interface and a database of MARC records to be used for copy cataloging. Leslie Straus, SkyRiver President, framed her talk by stating her belief that competition is good for the market. There used to be other bibliographic utilities besides OCLC, such as RLG and WLN, but now only OCLC remains. Why does SkyRiver matter? The company believes that bibliographic metadata is in the public domain and that libraries can share that metadata with whomever they want.
The SkyRiver database has 25+ million records (Library of Congress and member records), and their aim is to build and maintain a database of quality records, not necessarily quantity of records. The rest of her talk focused on the mechanics of the cataloging software, and many questions were asked. I've tried to capture this below:
Find more information at:
Jon Orwant, Google - Engineering Manager of Google Books, Magazines, and Patents
Google has a bottom-up approach to dealing with the mess (and metadata) of online content out there today. Google's mission is to organize the world's information, which is why Google scans books. Orwant gave a brief overview of the Google Books Project.
Google first approached publishers, saying "Give me everything you have." They stripped off the bindings and sheet-fed everything into scanners. Then they started to work with libraries. They can't strip the binding off library books, so they take photos of the book pages. Because a photo of an open book shows a curve at the inside margin, they adjust each captured image so that it looks as if it had been sheet-fed (no curvature). They perform optical character recognition (OCR), then tag the images for determining copyright. For items still within copyright, they show only snippets. So far they have scanned:
12 million books
4 billion pages
2 trillion words
They have metadata for:
174 million books
4 billion records
1 trillion metadata fields
They collect data from 100+ sources (libraries, commercial aggregators, etc.), then they parse records (MARC, ONIX, others) into their own internal format. After that they cluster records into expressions (similar to expressions in FRBR) and manifestations, create a "best of" record for each cluster, and finally index and display elements of that record on the Google Books site.
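The talk as reported here does not say how Google clusters records or builds the "best of" record, so the following Python sketch shows only one naive way the idea could work: group records on a normalized title and author, then keep the most common non-empty value for each field. The function names, the field names, and the sample records are all assumptions.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Record = Dict[str, str]


def cluster_key(rec: Record) -> Tuple[str, str]:
    """Normalize title and author so trivially different records match."""
    return (
        rec.get("title", "").strip().lower(),
        rec.get("author", "").strip().lower(),
    )


def cluster(records: List[Record]) -> Dict[Tuple[str, str], List[Record]]:
    """Group records that share the same normalized key."""
    groups: Dict[Tuple[str, str], List[Record]] = defaultdict(list)
    for rec in records:
        groups[cluster_key(rec)].append(rec)
    return dict(groups)


def best_of(records: List[Record]) -> Record:
    """Build one record from a cluster: most common non-empty value per field."""
    merged: Record = {}
    field_names = sorted({name for rec in records for name in rec})
    for name in field_names:
        values = [rec[name] for rec in records if rec.get(name)]
        if values:
            merged[name] = Counter(values).most_common(1)[0][0]
    return merged


if __name__ == "__main__":
    incoming = [
        {"title": "Example Title", "author": "Doe, Jane", "year": "1999"},
        {"title": "example title ", "author": "doe, jane", "year": "1999",
         "publisher": "Example Press"},
        {"title": "Example Title", "author": "Doe, Jane", "year": ""},
    ]
    for key, members in cluster(incoming).items():
        print(key, "->", best_of(members))
```

A production system would need much fuzzier matching than exact normalized strings, but the two-step shape (cluster, then merge) is the same as the process described above.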
Other features of Google Books mentioned included:
Sara Ring, Coordinator
Bibliographic and Technical Services
Benjamin Abrahamse, Head, Serials Cataloging Section, Cataloging and Metadata Services, MIT Libraries
In this presentation Benjamin explored the use of MarcEdit's integrated Z39.50 client to query bibliographic databases on the Web and ways to manipulate MARC data using various MarcEdit functions. In particular, he looked at the MarcEdit function of exporting MARC as tab-delimited text. He also took a brief look at various ways of exporting MARC records from MarcEdit into other applications.
He began by saying that not all publishers have MARC records. Go Fish! One can take MARC data from these sources via their Z39.50 interface and put it into MS Excel. There is no crosswalk for publisher data that is not in a structured data format, so you work from the publisher-supplied metadata (PSM) directly. The skills needed to harvest the publisher-provided metadata are the following: 1) know how to form basic Z39.50 queries, 2) use a text editor that has support for regular expressions (MarcEdit), and 3) have spreadsheet skills (sort and filter functions, formulas).
The quality of publisher-provided metadata varies greatly. It may or may not exist in MARC format. The key fields to look for are any standard numbers (ISBN, LCCN, DOI), complete title information, and URLs. One may have to go beyond what is on the Web page or scrape the HTML. Once you have found the data, open it in a spreadsheet, select the fields to query, and export or cut/paste to the text editor. He uses Notepad++. Save it as a text file for batch processing.
If it is MARC data, retrieve it into MarcEdit and convert it into text with the save as type 'tabbed delimited text files (*.txt)'. Version 5.2 of MarcEdit has the tab-delimited export utility.
MIT uses it to harvest Web-provided publisher records as well as for e-book collections. For titles that exist in both print and electronic form, they have one bibliographic record for the print and then pull in another MARC record for the electronic version. The purpose of all of this for MIT is to have a method for assembling MARC record sets using non-MARC publisher-supplied metadata.
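Once records are in tab-delimited text, the sorting and filtering steps can be done in a spreadsheet or scripted. The Python sketch below is one possible way to do the latter, using only the standard library. The column headers (`020$a`, `245$a`, `856$u`), the sample rows, and the placeholder ISBN are assumptions; an actual export will have whatever columns were selected when it was created.

```python
import csv
import io
from typing import Dict, List

# Stand-in for the contents of an exported .txt file (hypothetical data).
SAMPLE_EXPORT = (
    "020$a\t245$a\t856$u\n"
    "978-0-00-000000-2 (ebook)\tExample title one\thttp://example.org/1\n"
    "\tExample title two\thttp://example.org/2\n"
    "9780000000002\tExample title one, duplicate\thttp://example.org/3\n"
)


def load_rows(text: str) -> List[Dict[str, str]]:
    """Parse tab-delimited text whose first line holds the column headers."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def normalize_isbn(raw: str) -> str:
    """Keep only the number itself: drop qualifiers such as '(ebook)' and hyphens."""
    first_token = raw.strip().split(" ")[0]
    return "".join(ch for ch in first_token if ch.isdigit() or ch in "xX").upper()


def unique_by_isbn(rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Drop rows without an ISBN and keep the first row seen for each ISBN."""
    seen = set()
    kept = []
    for row in rows:
        isbn = normalize_isbn(row["020$a"])
        if isbn and isbn not in seen:
            seen.add(isbn)
            kept.append({**row, "isbn": isbn})
    return kept


if __name__ == "__main__":
    for row in unique_by_isbn(load_rows(SAMPLE_EXPORT)):
        print(row["isbn"], row["245$a"], row["856$u"], sep=" | ")
```

To run this against a real export, the contents of the saved text file would be read and passed to `load_rows` in place of `SAMPLE_EXPORT`.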
Mary Ann Greenwald
PALS Support and Training
Minnesota State University, Mankato
A Report from the Spring 2010 NOTSL - ALAO/TEDSIG Joint Meeting
Northern Ohio Technical Services Librarians (NOTSL), and the Academic Library Association of Ohio/Technical, Electronic and Digital Services Interest Group (ALAO/TEDSIG) held a jointly sponsored event on May 14, 2010 at the Shisler Convention Center, Ohio State University, Wooster Campus, at Wooster, Ohio. The theme of the conference was Getting Ready for RDA: Preparing for the Transition. The program featured two guest speakers: Dr. Athena Salaba, Assistant Professor at the School of Library and Information Science, Kent State University, and Rick J. Block, Head, Special Collections and Metadata Cataloging, Columbia University, and adjunct professor at the Pratt School of Library and Information Science and the Palmer School of Library and Information Science at Long Island University. Dr. Salaba's presentation centered on Functional Requirements for Bibliographic Records (FRBR) and its potential impact on library catalogs. Mr. Block's presentation focused on how libraries can begin to prepare staff for the transition from AACR2 to RDA.
Dr. Salaba's presentation was titled "FRBR Family." She provided an excellent theoretical overview of the FRBR model, gearing her presentation toward cataloging and technical services staff. The FRBR Family consists of three groups of entities: FRBR (Functional Requirements for Bibliographic Records), FRAD (Functional Requirements for Authority Data), and FRSAD (Functional Requirements for Subject Authority Data). Dr. Salaba discussed the impact of the FRBR model on the bibliographic universe, specifically how the traditional object-oriented model of cataloging, based on AACR2 and current integrated library systems, can be improved by the use of the FRBR content-oriented model. A brief history of the IFLA FRBR working group was provided, which culminated in the publication of IFLA's Functional Requirements for Bibliographic Records, Final Report (1998).
The three groups of entities were individually explained. Group 1 entities (FRBR) are based on the products of intellectual or artistic endeavor. They are labeled as works, expressions, manifestations, and items. Group 2 entities (FRAD), labeled as persons, corporate bodies, and families, are defined as those responsible for the intellectual or artistic content, physical production, or custodianship. Group 3 entities (FRSAD) are labeled as concepts, objects, events, and places. They serve as the subjects of the intellectual or artistic endeavor. Drawing a parallel to current cataloging practice, Group 1 entities correlate to the intellectual creation, Group 2 to name authorities, and Group 3 to subject authorities. Entities have attributes. For example, in Group 1, the date of a work is an attribute of the work entity. Entities can have relationships between groups; for example, a work can have as a subject an entity such as an event (Group 3) or person (Group 2). RDA is being developed to allow local systems to exploit these FRBR relationships and improve the experience of the catalog user. Dr. Salaba, with her colleague Dr. Yin Zhang (Associate Professor, Kent State University, School of Library and Information Science), recently co-authored a publication titled Implementing FRBR in Libraries: Key Issues and Future Directions (New York: Neal-Schuman, 2009), which describes these relationships in greater detail.
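For readers who think in terms of data structures, the Group 1 hierarchy and a cross-group subject relationship can be sketched as nested Python dataclasses. This is a minimal illustration of entities, attributes, and relationships as described above, not a data model from the presentation; the class layout, attribute choices, and sample values are assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """A single copy, e.g. the one on a particular shelf."""
    barcode: str


@dataclass
class Manifestation:
    """A physical embodiment, e.g. one publisher's edition."""
    publisher: str
    year: int
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    """A realization of the work, e.g. a particular translation."""
    language: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Subject:
    """A Group 2 or Group 3 entity acting as a subject of a work."""
    label: str
    kind: str  # e.g. "person", "event", "place", "concept"


@dataclass
class Work:
    """The abstract intellectual or artistic creation."""
    title: str
    date: str  # an attribute of the work entity
    subjects: List[Subject] = field(default_factory=list)
    expressions: List[Expression] = field(default_factory=list)


def count_items(work: Work) -> int:
    """Walk work -> expression -> manifestation -> item and count copies."""
    return sum(
        len(manifestation.items)
        for expression in work.expressions
        for manifestation in expression.manifestations
    )


if __name__ == "__main__":
    work = Work(
        title="An Example Work",
        date="1900",
        subjects=[Subject("An example event", "event")],
        expressions=[
            Expression("eng", [Manifestation("Example Press", 1950, [Item("0001")])]),
            Expression("fre", [Manifestation("Presse Exemple", 1960,
                                             [Item("0002"), Item("0003")])]),
        ],
    )
    print(count_items(work), "items across", len(work.expressions), "expressions")
```

Because every copy hangs off a manifestation, every manifestation off an expression, and every expression off a work, a catalog built this way can collocate all translations and editions of a work in one step.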
After Dr. Salaba's presentation, the main speaker, Rick Block, delivered the keynote presentation, titled "Getting Ready for RDA: Preparing for the Transition." Mr. Block first discussed the reasons why a new cataloging code was needed to replace AACR2 and why the term AACR3 was not adopted. These reasons were given for adopting RDA instead of a revised version of AACR2:
After running through a brief history of the evolution of AACR1 and AACR2, Mr. Block reminded us that, unlike AACR2, RDA is a content standard, not a display or presentation standard. Since RDA is based on the FRBR model, catalogers preparing for RDA training must familiarize themselves with FRBR terminology and relationships. Staff should begin training by using the new terminology on a daily basis. Cataloging workflows should be developed to guide catalogers in making decisions on RDA options. Local documentation will need to be updated for staff. Training options for staff will include combinations of in-house sessions, web-based courses, local/regional/national workshops, one-on-one training, and train the trainer. Concrete examples that compare RDA cataloging to AACR2 will be helpful for staff as a reference tool.
Mr. Block then discussed the structure of the online RDA Toolkit. Comparing it to AACR2, the RDA section on Attributes (Sections 1-4) will correspond to Part 1 (Description) of AACR2, while the RDA section on Relationships (Sections 5-10) will correspond to Part 2 (Headings, Uniform Titles, and References). Mr. Block provided a brief glossary of new RDA terminology, comparing AACR2 to RDA (one example being uniform title in AACR2 and preferred title for a work in RDA). He touched on other differences between AACR2 and RDA, such as the elimination of abbreviations; replacement of the GMD with Content type, Media type, and Carrier type; and the elimination of AACR2's rule of 3 for main entry. Another advantage of RDA will be its ability to make use of persistent resource identifiers, maintained in registry/authority files, when tracking relationships between entities.
The most important point stressed by Mr. Block was that RDA is a way of getting to the future, but everyone should remember that we are not there yet, and that we should not panic. In the meantime, we will continue using MARC 21 while it continues to evolve to support RDA or as it migrates into an XML-based standard. Most ILS systems have no plans yet to implement support for RDA functionality. Mr. Block left us with these thoughts on what catalogers should be doing at this time:
The presentation slides for the NOTSL - ALAO/TEDSIG Spring 2010 Meeting have been posted online.
Roman S. Panchyshyn
Catalog Librarian, Assistant Professor
Kent State University
The OCLC Board of Trustees approved WorldCat Rights and Responsibilities for the OCLC Cooperative. The new policy will go into effect on 1 August 2010 and replaces the existing guideline, Guidelines for Use and Transfer of OCLC Derived Records, which has been in place since 1987.