Welcome to the news column. Its purpose is to disseminate information on any aspect of cataloging and classification that may be of interest to the cataloging community. This column is not just intended for news items, but serves to document discussions of interest as well as news concerning you, your research efforts, and your organization. Please send any pertinent materials, notes, minutes, or reports to: Sandy Roe; Memorial Library; Minnesota State University, Mankato; Mankato, MN 56001-8419 (email: email@example.com; phone: 507-389-2155). News columns will typically be available at the CCQ website (http://catalogingandclassificationquarterly.com/) prior to their appearance in print.
We would appreciate receiving items having to do with:
Abstracts or reports of on-going or unpublished research
Bibliographies of materials available on specific subjects
Analysis or description of new technologies
Call for papers
Comments or opinions on the art of cataloging
Notes, minutes, or summaries of meetings, etc. of interest to catalogers
Description of grants
Description of projects
Announcements of changes in personnel
Announcements of honors, offices, etc.
RESEARCH & OPINION
New Technology: SFX
The Caltech Library System is currently a beta test site for an exciting new Internet linking product named SFX. Originating from research by Herbert Van de Sompel of Ghent University in Belgium, SFX is a “context-sensitive reference linking solution” that allows librarians to define local electronic collections and the way those collections interact and are presented to users. SFX is currently owned by Ex Libris, provider of the ALEPH integrated library system and other applications.
SFX extracts the metadata from a given bibliographic citation in an SFX-aware resource and passes it to an SFX server, which matches the metadata against a set of pre-defined relationships to other resources and returns the set of extended services available for that metadata. For example, a user performs a search in Web of Science, locates a relevant article, and clicks on the SFX button next to that citation. The user sees a new window spawned by the browser that contains all the pre-defined relationships for the metadata from that citation. Depending on what the local collection allows, the user could select “Go to the full-text of this article at the publisher’s website”, “Find local holdings in the OPAC for this journal,” or “Find the impact factor of this journal in the ISI Journal Citation Reports.” Possibilities are only limited by the local collection and the imagination of the librarians who define the services.
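The resolution step described above can be sketched in a few lines of Python. This is a hypothetical illustration only, not the actual SFX implementation: the table, function name, and ISSN key are assumptions standing in for the server's pre-defined relationships.

```python
# Hypothetical sketch of SFX-style service resolution: citation metadata
# is matched against locally defined relationships to produce a menu of
# extended services. All names are illustrative, not the SFX internals.

# Local configuration: which services are offered for which journals (by ISSN).
SERVICE_TABLE = {
    "0028-0836": [  # e.g., a journal the library licenses in full text
        "Go to the full-text of this article at the publisher's website",
        "Find local holdings in the OPAC for this journal",
        "Find the impact factor of this journal in the ISI Journal Citation Reports",
    ],
}

def resolve_services(citation):
    """Return the extended services defined for a citation's metadata."""
    issn = citation.get("issn")
    # Fall back to a generic service when no relationship is defined.
    return SERVICE_TABLE.get(issn, ["Request this item via interlibrary loan"])

citation = {"issn": "0028-0836", "volume": "406", "spage": "119"}
for service in resolve_services(citation):
    print(service)
```

The essential point the sketch captures is that the menu of services is driven entirely by local configuration, which is why two libraries running the same resolver can present entirely different options for the same citation.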
The database that drives SFX includes tables that define Sources, which are resources where a user would be starting their search; Source Services, such as get full-text, find author, show local holdings, etc.; Targets, resources where a user would want to go; Target Services, such as get full-text, find author, etc.; “Colli”, which are conceptual links between Sources and Targets; and Object Portfolios, which define which Sources, Targets, and Services a given object (usually a journal) is associated with.
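The table relationships above can be visualized with a rough data-model sketch; the class and field names below are assumptions made for illustration and do not reproduce the actual SFX schema.

```python
# Illustrative data model for the SFX-style tables described in the text;
# field names are invented, not the real SFX database schema.
from dataclasses import dataclass, field

@dataclass
class Source:
    """A resource where a user starts a search (e.g., Web of Science)."""
    name: str
    services: list = field(default_factory=list)  # e.g., "get full-text"

@dataclass
class Target:
    """A resource where a user may want to go (e.g., PubMed, the OPAC)."""
    name: str
    services: list = field(default_factory=list)

@dataclass
class ObjectPortfolio:
    """Which Sources and Targets a given object (usually a journal) joins."""
    journal_issn: str
    sources: list = field(default_factory=list)
    targets: list = field(default_factory=list)
```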
In order for a resource to be an SFX Source, it must be able to provide users with an SFX button on each bibliographic citation record. This button activates an OpenURL, a standard format proposed by SFX’s developers (available: http://22.214.171.124/OpenURL/opeurl.html). The OpenURL encodes the set of SFX metadata in a way that the SFX server can understand. It may also contain instructions for fetching additional information about the citation. The SFX server parses the OpenURL and finds the set of conceptually related target services.
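A minimal sketch of the encoding step, assuming key names along the lines of the OpenURL 0.1 draft (sid, genre, issn, volume, spage, aulast); the resolver hostname is hypothetical.

```python
# Sketch of encoding citation metadata as an OpenURL-style query string
# that a resolver can parse. Key names follow the OpenURL 0.1 draft;
# the resolver base URL is a made-up example.
from urllib.parse import urlencode

def make_openurl(resolver_base, citation):
    """Serialize citation metadata into a resolver-bound URL."""
    return resolver_base + "?" + urlencode(citation)

citation = {
    "sid": "ISI:WoS",     # identifier for the originating Source
    "genre": "article",
    "issn": "1382-4147",
    "volume": "4",
    "spage": "7",
    "aulast": "Smith",
}
print(make_openurl("http://sfxserver.example.edu/sfxmenu", citation))
```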
URLs for each target are built on the fly when a service is selected. Once the user clicks on a service, a simple Perl program takes the metadata from the original citation and builds a URL that includes the resource’s domain and its standard link-to syntax.
Most information providers already have standard link-to syntax for their resources. Many include standard numbers, volume numbers, issue numbers, starting pages, or author names in the URL. For example, Wiley Interscience’s syntax includes the domain name and a unique journal identifier, while the Royal Society of Chemistry’s syntax includes an abbreviated journal name followed by the journal year and issue number. Syntaxes can either be found as published standards (see Catchword’s syntax at http://www.catchword.co.uk/cgi-bin/fs?pg=/liok.htm) or by simply analyzing a given resource’s URLs.
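The "build on the fly" step amounts to filling a provider's link-to template with the citation's metadata. The sketch below (in Python rather than the Perl the text mentions) uses invented templates; they are not any provider's actual syntax.

```python
# Illustrative on-the-fly URL building: each target service keeps a
# link-to template, and citation metadata is slotted in on selection.
# Both templates are invented for illustration.
LINK_TEMPLATES = {
    "publisher_fulltext": "http://www.example-publisher.com/journal/{journal_id}/{volume}/{spage}",
    "opac_holdings": "http://library.example.edu/search/i{issn}",
}

def build_target_url(service, metadata):
    """Fill the selected target's link-to template from citation metadata."""
    return LINK_TEMPLATES[service].format(**metadata)

meta = {"journal_id": "10982795", "volume": "12", "spage": "45", "issn": "1076-2787"}
print(build_target_url("opac_holdings", meta))
```

Because each provider's syntax draws on a different subset of the metadata (ISSN here, journal identifier and pagination there), the resolver only needs a template per target rather than per article.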
Caltech is currently working with a locally loaded version of ISI’s Web of Science as an SFX source and has been successful in creating links from citations in that resource to the full text of over 375 journals, the local ILS (Innopac), and PubMed. Future plans include adding major electronic resources such as OVID, Innopac, SilverPlatter, and the major CrossRef publishers as SFX Sources. In addition to full-text journals and other A&I databases, targets could include such local services as document delivery. Caltech uses the ILLiad document delivery software and hopes to enable that resource as a target in order to effortlessly pass bibliographic data in user-initiated ILL requests.
For more information see SFX’s homepage: http://www.sfxit.com/ and two articles by Herbert Van de Sompel in D-LIB magazine: http://www.dlib.org/dlib/april99/van_de_sompel/04van_de_sompel-pt1.html
John McDonald, Acquisitions Librarian
Betsy Coles, Manager of Digital Library System
Caltech Library System, California Institute of Technology
[Editor’s note: Readers should note that this technology has the potential to enable the use of already existing metadata (i.e., ISSN, SICI, author/title names, etc.) to facilitate known item searching across heterogeneous electronic resources (article databases, A&I services, online catalogs, etc.). While currently only in beta test at several North American institutions like Caltech, it has been up and running at the University of Ghent since spring 2000.]
From the ALA/ALCTS Preconference: Metadata: Libraries and the Web—Retooling AACR and MARC21 for Cataloging in the Twenty-first Century, Chicago, IL, July 6-7, 2000
This preconference included 30 speakers and was organized to present information about major metadata standards on day one and applications of metadata on day two. Jennifer Younger gave the keynote address providing a history of cataloging from the Greeks at Alexandria forward. She encouraged us to influence and participate in the development of metadata schemes and left us with the phrase, “Let 100 metadata schemes bloom.”
Session one on Methods of Providing Access to Web Resources began with Brian Schottlaender speaking of AACR complexities and specifically the Delsey Report, The Cardinal Principle, and (ER) Harmonization. Rebecca Guenther spoke on MARC21, the Dublin Core, and crosswalks between the two. The next three speakers focused on seriality. Jean Hirons summarized the current state of AACR2 and seriality. Regina Romano Reynolds, Head of the National Serials Data Program, addressed ISSN as a link to data – in the online catalog, as a link out to other metadata, and as a link between publishers and libraries. In Struggling Toward Retrieval, Sheila Intner asked if we were cataloging the chunks that patrons want and encouraged the audience to expand bibliographic options for an increasingly diverse group of users. Matthew Beacom brought Session one to a close with his presentation on the use of AACR2R to catalog web resources, concluding that AACR must adapt further to a pluralistic metadata and data environment.
Session two focused on Methods of Providing Access to Web Resources. Erik Jul opened the session with the statement that library science has the potential to become a guiding, leading science rather than a trailing effort. Metadata specialists can become consultants to the world! Norm Medeiros followed with a summary of New York University School of Medicine’s participation in CORC. One result was that the school’s search engine was reconfigured to look at CORC tags, enhancing retrieval. Lynn Marko was next, bringing the group up to date on TEI (Text Encoding Initiative) with Working Toward a Standard TEI Header for Libraries. Eric Miller and Diane Hillmann combined to do a presentation on XML (eXtensible Markup Language) and RDF (Resource Description Framework) contending that if we perceive XML and RDF as just another way to present traditional materials, we’ve missed the point. Hillmann encouraged the audience to extend their reach – to consider providing access to collections and individual items either “above” or “below” the level of granularity of books and serials. She maintains that we can contribute an understanding of bibliographic structure and indexing, insights into user behavior, and experience managing big data – “let’s face the music and dance.”
After the afternoon break, Carlen Ruschoff brought the audience up to date on various ISO standards for metadata as well as outlining the ISO standard proposal path from initiative to published standard. She spoke specifically about a few ISO standards and drafts and included handout information for many others relating to data elements (6), identifiers (8), codes (3), character sets (14), transliteration of nonroman scripts (6), and formats and protocols for communication and retrieval (4). Carlos Rodriguez and Laura Bayard finished off the afternoon session with presentations on Infomine and MARCit, respectively.
Session three which began day two was built around the theme, Growing your own digital library at home. Four speakers described four different projects, each in a library setting. Elizabeth U. Mangan from the Geography and Map Division of the Library of Congress described a scanning project of map images, the American Memory access information assigned to the records, and the resulting functionality for linkages and retrieval. Constance Mayer of Indiana University provided us with a history and future goals of the VARIATIONS Project. The project currently provides access to over 6,000 titles of near CD-quality music from both their OPAC (linked from the 856 field) and from the course reserve lists. URLs are established for each track. Descriptive metadata (provided by the bibliographic record), structural metadata (such as track information), and administrative metadata (date digitized, initials of the technician, etc.) are created or acquired for each piece. Beth Picknally Camden brought us back to the Dublin Core in her presentation on the cataloging of digital video clips. The video clip collection was created for a class, selected by both the professor and graduate students, and digitized by the Digital Media Center. Because the Dublin Core itself has no content standard, she walked us through their local decisions for each Dublin Core element used. She emphasized increased cooperation between the Digital Library and Cataloging staff as a project benefit and encouraged each of us to get involved with these kinds of projects at our own institutions. Finally, William Fietzer from the University of Minnesota described interpretive encoding being added to electronic texts via TEI, providing examples from their Women’s Travel Writing (1830-1930) collection.
Session four focused on metadata projects apart from library settings – seven speakers in 2 hours! Diane Boehr gave us some insight on plans for metadata at the National Library of Medicine. Stanley Blum, a zoologist, spoke on the use of metadata to integrate biological collections and the similarities between libraries and the natural history communities. Murtha Baca described the work of the Art Information Task Force, which resulted in Categories for the Description of Works of Art (CDWA). CDWA is a guideline for the structure of art databases that could be likened to a hybrid of MARC and AACR2 in that it contains both guidelines for data content and guidelines for data value. She also provided information about other metadata standards and vocabulary resources. Wendy Treadwell presented information on the Data Documentation Initiative (DDI) and its role in Social Science Data access. Kris Kiesling explained Encoded Archival Description (EAD). William Garrison followed with information on the Colorado Digitization Project, including their use of OCLC’s SiteSearch and ability for participants to contribute records in different formats (currently Dublin Core or MARC). Brad Eden closed this session with a description of the Instructional Management System Standard and its aim of enabling an open, rather than proprietary, architecture.
The final session entitled Looking toward the future was a panel discussion which included Clifford Lynch (CNI), Vivian Bliss (Microsoft), and Michael Gorman (CSU-Fresno). Lynch reminded us of the whole point of metadata – to make things more accessible – and maintained that metadata only gets really interesting when we use it. He challenged us to stop equating metadata with description alone and to see it from an information discovery standpoint – to always complement our thinking about metadata with how it is going to be used and how it is going to be transported.
Bliss described her work with a team at Microsoft’s library to create a general portal to their corporate intranet (over 2 million pages) where queries cross several different collections with different tagging schemes and combine results. Their group created and maintains a metadata registry which brings organization to the company’s diverse controlled vocabularies as well as those that come into Microsoft from outside subscriptions to products like news feeds. Their group also markets their knowledge management expertise to other groups within the company. She echoed Eric Jul’s earlier comment, “If not us, then who?”
Gorman likened the task of cataloging the web to catching lightning in a bottle, asking what we are seeking to organize. He presented our range of choices as 1) identify and catalog, 2) identify and produce metadata according to some standard, 3) identify and produce metadata without standards, or 4) leave some items in the murky waters of the net. He asked us to remember that cataloging an item without providing for the preservation of that item is not sufficient.
Eric Jul, Matthew Beacom, and Brad Eden returned briefly to the podium to wrap up the preconference, stating that our greatest day lies ahead! Let’s take action.
[The printed proceedings of this preconference will be published and are expected to be available at ALA Midwinter meeting, January 2001.]
From the Joint MARBI/CC:DA Meeting held during the American Library Association Annual Meeting, Chicago, IL, July 10, 2000
XML and MARC: A Choice or a Replacement? was presented by Dick R. Miller, Head of Technical Services & Systems Librarian, Lane Medical Library, Stanford University Medical Center. After Miller’s presentation Paul Weiss and Matthew Beacom gave formal responses to the presentation representing MARBI and CC:DA respectively. General discussion followed.
Miller reported that in September 1998, Lane Medical Library undertook the Medlane Project. It involved converting catalog records to XML for integration with other web resources – in part a reaction to the feeling that their library information (in MARC format) was under-utilized because of its segregation from mainstream web resources and an awareness of the reluctance of users to search multiple resources. Lane developed sample DTDs (document type definitions) to explore restructuring and simplifying MARC and released XMLMARC software on December 29, 1999 to demonstrate conversion feasibility. It is freely available for noncommercial use and there are currently 300 licensees from over 40 countries. It was developed as a feasibility study, which they believe has been proven. The project is currently focusing on issues related to indexing, search access, and presentation.
Related projects include BiblioML, released in January by a French government agency which converts Unimarc to XML; the Library of Congress’s literal mapping of MARC to SGML from 1995 to 1998; and Logos Research Systems' MARC to XML to MARC Converter. However, in each of these projects the mappings are literal. Lane’s investigation differs in that it advocates changes to MARC to take advantage of XML's strengths and would mean a permanent change to XML rather than another version used as an adjunct to "real" MARC. Miller identified MARC as the chief impediment to an effective integration of the library resources with web resources.
XMLMARC was developed partly as a feasibility study for converting MARC data to XML, but also to explore ways in which cataloging data could be restructured for greater economy and elegance while still preserving content and previous efforts. Some of the MARC problems mentioned which could be better addressed by XML included its blurring of description, access, and relationships; mixing data values and data properties; excessive vs. insufficient subfields; redundancy; and character set issues.
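The flavor of such a conversion can be suggested with a toy sketch: numeric MARC tags become named XML elements while subfield content is preserved. The element names, tag mapping, and input structure below are invented for illustration and do not reproduce Lane's actual DTDs or the XMLMARC software.

```python
# Toy sketch of MARC-to-XML conversion of the kind XMLMARC demonstrates:
# opaque numeric tags become named elements, subfields become children.
# The mapping and element names are invented, not Lane's actual DTDs.
import xml.etree.ElementTree as ET

# A MARC field represented as (tag, indicators, {subfield code: value}).
marc_fields = [
    ("245", "10", {"a": "Cataloging the web :", "b": "metadata in practice"}),
    ("260", "  ", {"a": "Chicago :", "b": "ALA,", "c": "2000."}),
]

TAG_NAMES = {"245": "title", "260": "publication"}  # assumed mapping

record = ET.Element("record")
for tag, indicators, subfields in marc_fields:
    elem = ET.SubElement(record, TAG_NAMES.get(tag, "field"), {"tag": tag})
    for code, value in subfields.items():
        sub = ET.SubElement(elem, "subfield", {"code": code})
        sub.text = value

print(ET.tostring(record, encoding="unicode"))
```

Even this toy version shows the appeal Miller describes: element names carry meaning that numeric tags obscure, and the nesting makes the record navigable by standard XML tools.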
Miller believes that “it is possible to recast MARC, leveraging untold person-years of effort in defining content, identifying relationships, and resolving problems and conflicts, producing a more coherent and eloquent version using XML. This could add luster to librarianship, engendering respect for librarians and needed technical underpinnings at a time when the profession is facing external as well as internal challenges.” He suggests we do more analysis, build a model, consider transitional strategies, and find a faster way to develop standards.
In his response, Weiss listed a few cautions. Storage space has been estimated at twice that of MARC. XML is a meta-language and not a single standard; its flexibility makes standards critical to maximize its benefit – much work would have to be done. He encouraged consideration of XML for areas in the library not currently standardized, such as the circulation protocol written in XML that will be out for comment this summer, and concluded that there is a place for both.
Beacom reminded the group that the discussion had been about how we express the structure, not about the content or fill. We currently have an older expression of our structure (MARC), but having a retooled structure (XML) might more adequately serve the new kinds of things we’re describing. He concurred with Miller’s description of the weaknesses in MARC and mentioned that records need to be able to become a cluster of related things. Beacom stated that it has been the flatness of the file that has frustrated the resolution of the multiple version problem.
The general discussion was favorable to exploring a move toward XML. Diane Hillmann reminded the group that our big investment is in semantics (AACR), not in syntax (MARC) and stated that MARC is eminently replaceable. We want to look forward to more sophisticated linking and better support of hierarchies, but will need to go carefully and not sacrifice what we have. John Attig encouraged the group to define the task – adjust the structure? – adjust the semantics? – and cautioned against ignoring legacy data. The meeting concluded with consensus that the discussion needs to continue.
[Dick R. Miller received an invitation to write an article on XML for Library Journal's NetConnect, which appeared in conjunction with this ALA Annual Meeting. This article advocates not only XML replacement of MARC formats, but also XML replacement of proprietary "library information" formats used by ILS vendors (e.g. ILL, patron data, circulation transactions, orders, check-in data) and predicts an XML-based ILS in the near future. See http://www.ljdigital.com/xml.asp. A related article on bibliographic management at Lane Medical Library appears in Cataloging & Classification Quarterly 30(2).]
From the CONSER At Large Meeting held during the American Library Association Annual Meeting, Chicago, IL, July 9, 2000
The Joint Steering Committee will discuss “Revising AACR2 to Accommodate Seriality: Rule Revision Proposals” prepared by Jean Hirons and the CONSER AACR Review Task Force as well as comments received about this document at its meeting in September 2000.
The CONSER Task Force on Publication Patterns and Holdings reported that the publication pattern experiment has begun use of the local OCLC bibliographic field 891 to share publication pattern and holdings data. The experiment embeds MARC fields 853 and 863 in OCLC’s 891 allowing this data to be communicated among systems and used for predictive check-in. In June, OCLC record #35601086 for Heart Failure Reviews became the first CONSER record in which 891 fields with such data were loaded. This task force is also analyzing responses from a survey of system vendors on their use of MARC Format for Holdings Data; a report will be forthcoming.
Guidelines on Subject Access to Individual Works of Fiction, Drama, Etc., 2nd edition is now available. Prepared by the CCS Subject Analysis Committee subcommittee on the Revision of the Guidelines on Subject Access to Individual Works of Fiction (Hiroko Aikawa, Jan DeSirey, Linda Gabel, Susan Hayes, Kathy Nystrom, Mary Dabney Wilson, Pat Thomas, Chair), this new and revised edition will help catalogers and others in the library apply the suggested headings to individual works of fiction, enrich catalog entries quickly and consistently by following the guidelines, and satisfy library patrons and readers by pointing them to targeted works, characters, settings, and topics. Softcover, ISBN 0-8389-3503-6.
Music and Media at the Millennial Crossroads: Special Materials in Today's Libraries. This joint OLAC (Online Audiovisual Catalogers, Inc.) and MOUG (Music OCLC Users Group) Conference will be held October 12-15, 2000 in Seattle, Washington at WestCoast Grand Hotel (formerly Cavanaughs on Fifth Avenue). Martha Yee (Cataloger, UCLA Film and Television Archive) and Sherry Vellucci (Associate Professor, St. John’s University) will serve as keynote speakers. Topics for the cataloging workshops will be computer files (taught by Iris Wolley), Internet resources (Linda Barnhart), maps (Susan Moore and Kathryn Womble), music scores (Ralph Papakhian), realia (Nancy Olson), sound recordings (Mark Scharff), video recordings (Jay Weitz), and SACO (Adam Schiff). More information can be found at http://ublib.buffalo.edu/libraries/units/cts/olac/.
Bicentennial Conference on Bibliographic Control for the New Millennium: Confronting the Challenges of Networked Resources and the Web. The Library of Congress is hosting this invitational conference on November 15-17, 2000. It is intended to bring together authorities in the cataloging and metadata communities to discuss outstanding issues involving improved discovery and access to Web resources within the framework of international standards. The conference will focus on producing recommendations that will help the Library of Congress, the framers of AACR, and the library profession develop and implement an effective response to the bibliographic challenges posed by the proliferation of Web resources. Michael Gorman will give the keynote address, From Card Catalogues to WebPACs: Celebrating Cataloguing in the 20th Century. Discussion papers include Metadata for Web Resources: How Metadata Works on the Web by Martin Dillon and Metadata Schemes for Discovery in Digital Libraries: Trends, Interactions, and Common Themes by Caroline Arms. Additional information, names and topics of other speakers and commentators, and full text for some papers are available at http://lcweb.loc.gov/catdir/bibcontrol/. A discussion list to foster constructive feedback on issues addressed by the conference papers is currently available. To subscribe, send a message to firstname.lastname@example.org with the message "subscribe bibcontrol [your name]".