The FRBR Family of Conceptual Models: Toward a Linked Future
Guest Editor: Richard P. Smiraglia
Co-Editors: Pat Riva and Maja Žumer
Patrick Le Boeuf
ABSTRACT: FRBR is fourteen years old, and the information environment has changed dramatically during those fourteen years. Is FRBR still relevant today, when the information contained in library catalogs can be transferred to the Linked Data universe without being FRBRized? Is it still needed at all? Its object-oriented redefinition is certainly more in line with current developments of the Semantic Web.
KEYWORDS: FRBR, Bibliographic information as Linked Data
Introduction: Be Careful What You Wish For: Lacunae in the FRBR Family of Models
Richard P. Smiraglia
ABSTRACT: The library catalog as a catalog of works was an infectious idea, which together with research led to reconceptualization in the form of the FRBR conceptual model. Two categories of lacunae emerge: the expression entity, and gaps in the model such as aggregates and dynamic documents. Evidence needed to extend the FRBR model is available in contemporary research on instantiation. The challenge for the bibliographic community is to begin to think of FRBR as a form of knowledge organization system, adding a final dimension to classification. The papers in the present volume offer a compendium of the promise of the FRBR model.
The VTLS Implementation of FRBR
John Espley and Robert Rillow
ABSTRACT: A description of a FRBR implementation by VTLS Inc. The basic cataloging and searching functions are described, followed by a description of how VTLS has extended FRBR to handle recursive, or related, works and aggregates. The benefits to the requesting function of the Circulation subsystem are also provided.
FRBR: The MAB2 Perspective
Michaela Putz, Verena Schaffner, Wolfram Seidler
ABSTRACT: FRBRizing legacy data has been a subject of research since the FRBR model was published in 1998. Studies have mainly been conducted for MARC21, but MAB2, a data format based on the rules for descriptive cataloguing in academic libraries used mainly in Germany and Austria, is still in use in Austria. The implementation of Primo, an Ex Libris software product, made research into FRBRizing MAB2 records necessary, as Primo offers the possibility of building FRBR groups by clustering different manifestations of a work. This paper highlights the first steps of FRBRizing bibliographic records in MAB2 at the Vienna University Library and the challenges encountered in this context.
KEYWORDS: FRBR, FRBRization, MAB2, Primo (Ex Libris), Austria
Implementing FRBR to Improve Retrieval of In-House Information in a Medium-Sized International Institute
Aurélie Signoles, Corinne Bitoun, and Asuncion Valderrama
ABSTRACT: The International Institute for Educational Planning (IIEP) is a specialized institute of UNESCO which undertakes training and research in the field of educational planning and management. IIEP disseminates publications which are the outputs of its research findings.
The Documentation Centre is responsible for the maintenance and upkeep of several databases. In-house databases include a projects database, consisting of activity records (updated by administrative and research staff), and a grey literature document database and reference archive (mission reports, lessons, masters' papers). The latter contains heterogeneous, multilingual documents which are the outputs of activities. The external database is a publicly accessible bibliographic database which follows AACR.
The databases are separate, which results in a loss of information. The process was undertaken within the wider context of reorganizing internal cataloguing rules to comply with changing international standards.
The objective is to make IIEP's various databases interoperable by factorizing the fragmented elements and reconciling heterogeneous data from multiple sources (different contributors, indexed and non-indexed content).
The choice of FRBR can be explained due to the appropriateness of an access point by work. On an information level, it allows the user to optimally retrieve resources through connections between the works. On an institutional level, it would enable the history and evolution of activities and their outputs to be traced.
The FRBRized catalogue would be enriched through inter-database relationships and would offer fuller records.
The first step was to establish the users' different needs and to develop a typology of the data to be processed.
The methodology used was based on the FRBRer model.
Then, identifying the entities enabled the work and its levels, the attributes of each group and the relationships to be determined.
To account for the processes of time and the complexity of the levels of work, FRBRoo and CIDOC-CRM models were envisaged.
Finally, an FRBRoo model was developed.
KEYWORDS: Cataloging, FRBRer, FRBRoo, CIDOC-CRM, Conceptualization
A Strange Model Named FRBRoo
ABSTRACT: Libraries and museums developed rules for the description of their collections prior to formalizing the underlying conceptualization reflected in such rules. That formalizing process took place in the 1990s and resulted in two independent conceptual models: FRBR for bibliographic information (published in 1998), and CIDOC CRM for museum information (developed from 1996 on, and issued as ISO standard 21127 in 2006). An international working group was formed in 2003 with the purpose of harmonizing these two models. The resulting model, FRBROO, was published in 2009. It is an extension to CIDOC CRM, using the formalism in which the former is written. It adds to FRBR the dynamic aspects of CIDOC CRM, and a number of refinements (e.g. in the definitions of Work and Manifestation). Some modifications were made in CIDOC CRM as well. FRBROO was developed with Semantic Web technologies in mind, and lends itself well to the Linked Data environment; but will it be used in that context?
KEYWORDS: FRBROO, Bibliographic and museum information harmonization, Bibliographic information as Linked Data.
Item, document, carrier: An Object Oriented Approach
ABSTRACT: I discuss the concept of Item as stated by the International Federation of Library Associations and Institutions (IFLA) in the conceptual model Functional Requirements for Bibliographic Records (FRBR) and the object-oriented version of it (FRBRoo). Using object-oriented modeling techniques, I analyze the relationship of the Item with the Manifestation entity, the concept of Document, and the physical object as a Carrier of a Content. A class scheme is proposed, not only as an implementation example, but as a way of clarifying some bibliographic concepts as well.
KEYWORDS: object orientation, object technology, objects paradigm, FRBR, conceptual models, FRBRER, FRBROO
Modeling Aggregates in FRBR
Maja Žumer and Edward T. O'Neill
ABSTRACT: In the bibliographic environment, the term aggregate is used to describe a bibliographic entity formed by combining distinct bibliographic units together. Aggregates are a large and growing class of information resources: up to twenty percent of the bibliographic records in OCLC's WorldCat may represent aggregates. The Functional Requirements for Bibliographic Records: Final Report only briefly references aggregates. Difficulties and inconsistencies in the application of the FRBR model to aggregates have been identified as a significant impediment to FRBR implementation. To address the issue, the FRBR Review Group established a Working Group on Aggregates, which completed its charge and submitted its final report in 2011. The Working Group proposed that an aggregate be defined as a "manifestation embodying multiple distinct expressions." This paper examines the proposed definition and explores how aggregates can be modeled.
Arrangement of FRBR Entities in Colon Classification Call Numbers
ABSTRACT: This paper analyzes similarities and differences between FRBR entities and their representation in Colon Classification call numbers. Addressing the lack of organization on library shelves and in the lists of bibliographic records returned by users' searches in present online catalogues, the paper discusses the opportunity to organize bibliographic resources by FRBR entities, using the model of the facet formula provided for call numbers in Colon Classification and drawing on relevant, ready, and usable extant data. The main results of this analysis are: correspondences between FRBR entities and categories expressed in Ranganathan's bibliographic system can be found; a sound, though not completely FRBRized, bibliographic arrangement can be achieved through call numbers even in catalogues not structurally capable of satisfying the FRBR model; in Ranganathan's classified catalogue, semantic and semiotic cataloguing are fully integrated, giving access to the bibliographic universe as a whole; and the facet formula for call numbers could be used as an identifying device.
KEYWORDS: FRBR, Colon Classification, Facet formula, Call numbers, Book numbers, OPACs, Knowledge Organization.
FRSAD and the ontology of subjects of works
ABSTRACT: Critics of the FRSAD model have argued that the FRSAR Working Group failed to make explicit the ontological assumptions underlying the model, and/or failed to make explicit the reasoning behind the choices that were made among competing conceptions. In this article, the philosophical assumptions underlying the design of the FRSAD model are identified and precisely described; the full range of alternatives are discussed and evaluated; and the implications of the Working Group's choices among those alternatives are clarified.
KEYWORDS: aboutness, data modeling, FRSAD, subject authority data, subjecthood
FRBR Entities: Identity and Identification
Martin Doerr, Pat Riva, and Maja Žumer
ABSTRACT: The models in the FRBR family include ways to document names or terms for all entities defined in the models, with identification as the ultimate aim, i.e., to distinguish entities by unique appellations and to use the most reliable appellations for entities in a given context. The intention in this paper is to explore the interrelationships between these different models with regard to their treatment of names, identifiers, and other appellation entities. The specialisation/generalisation structure of the appellation-related entities and the relationships and properties of these entities will be discussed. The paper also tries to clarify the potential confusion of identity itself in this context: when are we talking about an entity via its name, about the name itself, about the name citation in a document, and when about a name of a name?
In FRBR(er), titles for group 1, names for group 2, and terms for group 3 entities are merely defined as attributes of these entities. This serves the basic requirement of associating the appellation (label) with the entity, but does not allow introducing attributes of these appellations or relationships between and among them. FRAD, completed a decade later, defined name, identifier, and controlled access point as entities. Clearly making the distinction between a bibliographic entity and its name is a significant step taken in FRAD. This permits the separate treatment of relationships between the persons, families, and corporate bodies themselves and those relationships which instead operate between their names or between the controlled access points based on those names. In FRSAD, the most recent model, two entities are defined, Thema and Nomen. Again, the bibliographic entity is distinguished from the full range of its appellations.
The FRBRoo model expanded on the treatment of appellations and identifiers in CRM by modeling the identifier assignment process. In FRBRoo, F12 Name was defined but identified with the existing CRM entity E41 Appellation. Current development is concentrating on integrating FRAD and FRSAD concepts into FRBRoo, and this is putting a focus on naming and appellations, causing new classes and properties to be defined, and requiring a re-evaluation of some of the decisions previously made in FRBRoo.
As naming and appellations are such a significant feature of the FRBR family of conceptual models, this work is an important step toward the consolidation of the models into a single coherent statement of the bibliographic universe.
KEYWORDS: FRBR family consolidation, FRBRoo, CIDOC CRM, identity, identification, names
FRBR and Cataloging Rules
FRBR/FRAD and Eva Verona's Cataloguing Code: Toward the Future Development of the Croatian Cataloguing Code
Mirna Willer and Ana Barbarić
ABSTRACT: The purpose of this paper is to research the feasibility of evolving Eva Verona's Code and Manual for Compiling Alphabetical Catalogues, the current Croatian cataloguing code, into a FRBR/FRAD structured code of rules, with the aim of assessing the direction to be taken toward development of the future national cataloguing rules. The methodology used is the mapping of conceptual models FRBR and FRAD entities, attributes and relationships to Verona's Code of rules.
KEYWORDS: FRBR, FRAD, Eva Verona's Code and Manual for Compiling Alphabetical Catalogues, Croatian cataloguing code, mapping
Evaluation of RDA as an implementation of FRBR and FRAD
Pat Riva and Chris Oliver
ABSTRACT: RDA, Resource Description and Access, is based on the foundation of the original entity-relationship statements of the conceptual models FRBR and FRAD. RDA not only uses the vocabulary of entities, attributes and relationships, as well as the user tasks, described in the models, these concepts also form an integral feature of its structure at both the macro level (the organisation of the sections and chapters of RDA reflects the models) and at a more detailed level within chapters. This paper reviews the degree of alignment of RDA with FRBR and FRAD, covering the areas of user tasks, entities, attributes, and relationships, and discusses the divergences of greater or lesser significance which exist.
The FRBR user tasks are almost identical to the corresponding RDA tasks, but in RDA the wording and naming of tasks corresponding to the FRAD user tasks is reoriented towards the point of view of the end user. RDA adopts the bibliographic entities, but does not treat the FRAD entities name, identifier, or controlled access point as entities in their own right, even though the essence of the FRAD model of authority control is integrated into RDA. RDA's data elements can generally be traced back to attributes defined in either FRBR or FRAD, although at times at a greater level of granularity. The FRBR primary relationships are all included in RDA, but a direct link between work and manifestation is also defined in RDA with the work manifested relationship.
RDA takes steps towards the harmonisation of the separate models, some obvious, such as adding the entity family to group 2 and using the FRAD definition of the entities person and corporate body, others less so, for instance in harmonising the different treatment of relationships among group 1 entities in the organisation of the relationship designators in appendix J. The ways in which RDA implements both FRBR and FRAD into a single content standard, as well as the ways in which RDA diverges from the models, may provide valuable insights for the consolidation of the FRBR family of conceptual models.
KEYWORDS: FRBR, FRAD, RDA, entity-relationship models, cataloging standard, user tasks, entities, attributes, relationships
Conceptualizations of cataloguing object: A critique on current perceptions on FRBR Group 1 entities
ABSTRACT: Libraries face a double challenge in the digital age: both the describing framework and the described object are changing. FRBR attempts to generate a coherent theory and yield a new paradigm of cataloging. This study examines current conceptualizations of the FRBR Group 1 entities within the FRBR family of models with a view to semantic interoperability. FRBR cannot be considered simple metadata describing a specific resource, but rather a kind of knowledge related to the resource. This study reveals that there are different perspectives on what FRBR introduces as the cataloging object in the context of various interpretations of the model, namely RDA, FRBRization projects, and FRBRoo.
KEYWORDS: Resource Description, FRBR Group 1 Entities, FRBRization, RDA, FRBRoo, Cataloging Object, Semantic Interoperability
From the FRBR Model to the Italian Cataloguing Code (and Vice Versa?)
ABSTRACT: The Functional Requirements for Bibliographic Records (FRBR) model has been the main framework of reference for the new Italian cataloging rules. The code puts the work at the center of the catalog and of the rules because users are mostly interested in works and the most wanted works are increasingly available in multiple manifestations. Every work should be identified in the catalog and responsibility relations should be recorded at the proper level. The code is tailored to the specific needs of library cataloging and based on a new thorough analysis of the phenomena to be reflected, organized, and made accessible via the catalog.
KEYWORDS: library catalogs, cataloging rules, OPAC, FRBR, work and expression, Italy
Research Using FRBR
The Contribution of FRBR to the Identification of Bibliographical Relationships: The New RDA-based Ways of Representing the Relationships in Catalogs
Virginia Ortiz-Repiso and Paola Picco
ABSTRACT: Libraries that have implemented FRBR are obliged to resort to auxiliary models to display bibliographic records. This article examines the use of RDA for recording the different types of relationships (identifiers, authorized access points, composite structured and unstructured descriptions) and how to represent them in existing formats. It also analyzes the modifications that will be needed, and the elements that should be taken into account, in order to develop innovative tools capable of representing relationships in the clearest and most appropriate way for users, while staying current with the latest trends in information organization.
KEYWORDS: FRBR, FRAD, RDA, bibliographic relationships, semantic web.
Analysis of Work-to-Work Bibliographic Relationships through FRBR: A Canadian Perspective
Clement Arsenault and Alireza Noruzi
ABSTRACT: The purpose of this study is to investigate the characteristics of Canadian publications by analyzing their bibliographic relationships based on the Functional Requirements for Bibliographic Records (FRBR) model. The study indicates frequencies of occurrence of work-to-work bibliographic relationships for manifestations published in 2009 and catalogued in the AMICUS online catalogue. The results show that approximately 4.4 percent of the 2009 bibliographic records in the AMICUS catalogue exhibit a work-to-work bibliographic relationship.
KEYWORDS: Functional Requirements for Bibliographic Records (FRBR), Work-to-work relationships, Bibliographic family, Cataloguing, Canada
Composing in Real Time: Jazz Performances as "Works" in the FRBR Model
ABSTRACT: In FRBR and FRAD, realization of a musical work through performance is unambiguously included as a type of expression when it involves music in the Western canon. There is room for interpretation, however, as to whether an improvisation in jazz or rock constitutes an expression, or a new work with each performance. Multiple expressions, particularly transcriptions, and related works suggest the potential usefulness of treating a jazz performance as a work. This article examines the question of boundaries between one work and another, and illustrates ways that the FRBR model might be applied to cataloging improvisations.
KEYWORDS: FRBR work entity, jazz improvisation, transcriptions, music cataloging, RDA
Identifying Works for Japanese Classics toward Construction of FRBRized OPACs
Takuya Tokita, Maiko Koto, Yosuke Miyata, Yukio Yokoyama, Shoichi Taniguchi, and Shuichi Ueda
ABSTRACT: A research project was conducted in which proper JAPAN/MARC bibliographic records for 158 major Japanese classical works were identified manually, since existing records contain little information about works included in the resources. This paper reports the detailed method used for work identification, including selecting works, obtaining the bibliographic records to be judged, and building the judgment criteria. The results of the work identification process are reported along with average numbers that indicate the characteristics of certain classics. The necessity of manual identification was justified through an evaluation of searches by author and/or title information in a conventional retrieval system.
KEYWORDS: FRBR, work identification, manual identification, Japanese bibliographic records, Japanese classics, FRBRized OPACs
FRBRizing Bibliographic Records Focusing on Identifiers and Role Indicators in the Korean Cataloging Environment
Hyewon Lee and Ziyoung Park
ABSTRACT: This study aims to find a method to convert Korean bibliographic records into the FRBR structure. It also intends to verify that this conversion is possible without main entries or uniform headings, because there is no rule for the choice and form of main headings in the current Korean Cataloging Rules. In this paper, we review the role of identifiers and role indicators for FRBRizing. We also analyze the characteristics of the Korean cataloging environment, focusing on the bibliographic records and authority records based on the current cataloging rules.
As a result, we suggest a methodology for FRBRizing Korean bibliographic records through a combination of identifiers and role indicators. Although there is currently not much information about global identifiers or relator codes in bibliographic records based on KORMARC, the FRBRizing method using identifiers and role indicators would be more effective for the global and networked information environment than a method using main entries or uniform headings.
KEYWORDS: Korean Cataloging Rules, FRBR, identifier, ISTC, ISNI, VIAF, KORMARC, authority records, role indicators, relator codes
What do Users Tell us About FRBR-Based Catalogs?
Yin Zhang and Athena Salaba
ABSTRACT: FRBR user research has been the least addressed area in FRBR research and development. This article addresses the research gap in evaluating and designing catalogs based on FRBR user research. It draws from three user studies concerning FRBR-based catalogs: (1) user evaluation of three FRBR-based catalogs, (2) user participatory design of a prototype catalog based on the FRBR model, and (3) user evaluation of the resulting FRBR prototype catalog. The major findings from the user studies are highlighted and discussed for future development of FRBR-based catalogs that support various user tasks.
KEYWORDS: FRBR (Functional Requirements for Bibliographic Records), library catalog, online catalog, OPAC (Online Public Access Catalog), user research, user tasks, FRBR implementation, system evaluation, system design
FRBR and The Semantic Web
Representing the FR Family in the Semantic Web
ABSTRACT: Each of the FR family of models has been represented in Resource Description Framework (RDF), the basis of the Semantic Web. This has involved analysis of the entity-relationship diagrams and text of the models to identify and create the RDF classes, properties, definitions and scope notes required. The work has shown that it is possible to seamlessly connect the models within a semantic framework, specifically in the treatment of names, identifiers, and subjects, and link the RDF elements to those in related namespaces.
KEYWORDS: Cataloging standards, Metadata standards, Bibliographic data - interoperability, Data models - cataloging research, Entity-relationships models - cataloging research
YouTube: Applying FRBR and Exploring the Multiple Description Coding Compression Model
Jane Greenberg, Ketan Mayer-Patel, and Shaun Trujillo
ABSTRACT: Nearly everyone who has searched YouTube for a favorite show, movie, newscast, or other known item has retrieved multiple video clips (or segments) that appear to duplicate, overlap, and relate. The work presented in this paper considers this challenge and reports on a study examining the applicability of the Functional Requirements for Bibliographic Records (FRBR) for relating varying renderings of YouTube videos. The paper also introduces Multiple Description Coding Compression (MDC2) to extend FRBR and address YouTube preservation/storage challenges. The study sample included 20 video segments from YouTube: 10 connected with the event Small Step for Man (US Astronaut Neil Armstrong's first step on the moon), and 10 with the 1966 classic movie "Batman: The Movie." The FRBR analysis used qualitative content analysis, and the MDC2 exploration was pursued via a high-level approach of protocol modeling. Results indicate that FRBR is applicable to YouTube, although the analyses required a localization of the Work, Expression, Manifestation, and Item (WEMI) FRBR elements. The MDC2 exploration illustrates an approach for exploring FRBR in the context of other models, and identifies a potential means for addressing YouTube-related preservation/storage challenges.
KEYWORDS: Functional Requirements for Bibliographic Records; FRBR, Multiple Description Coding Compression, YouTube, Metadata
FRBR and Linked Data: Connecting FRBR and Linked Data
ABSTRACT: From the time of the earliest catalogues documenting private collections, to the present proliferation of repositories of material and digital objects, the bibliographic record as an aggregation of logical and physical characteristics of a resource has prevailed. The development of the FRBR conceptual model introduced a shift in focus away from the record as a whole to component pieces of data (or disaggregated data), where those data elements have the potential to be shared and used in diverse, even novel ways. Tim Berners-Lee's "rules" underlying the Open Linked Data Project offer an opportunity for FRBR-compliant, quality bibliographic data to be exposed to the digital universe via the Semantic Web. Context and potential for seizing this advantage are explored.
KEYWORDS: Functional Requirements for Bibliographic Records; FRBR; Linked Data; Semantic Web
Robert Bothmann, News Editor
Welcome to the news column. Its purpose is to disseminate information on any aspect of cataloging and classification that may be of interest to the cataloging community. This column is not intended just for news items, but serves to document discussions of interest as well as news concerning you, your research efforts, and your organization. Please send any pertinent materials, notes, minutes, or reports to: Robert Bothmann, Memorial Library, Minnesota State University, Mankato, ML 3097, PO Box 8419, Mankato, MN 56002-8419 (phone: 507-389-2010). News columns will typically be available prior to publication in print from the CCQ website.
We would appreciate receiving items having to do with:
Research and Opinion
The Library of Congress' (LC) announcement to implement RDA: Resource Description & Access on March 31, 2013 has set in motion a number of events related to "Day 1." The official announcement includes a link to an outline for LC's training plan:
Several Program for Cooperative Cataloging (PCC) RDA task groups have been formed to draft and recommend policy for various aspects such as authority control issues, acceptable headings, provider-neutral records, and training materials among others. The task group names, membership, and charges are available on the PCC RDA Web site. In early March 2012 the PCC Secretariat began sending monthly PCC RDA update messages to its discussion lists. Items of note from those messages include:
VIAF, the Virtual International Authority File, is a joint project of several national libraries to match and link widely used authority files into one virtual record. The project, hosted by OCLC, became an official OCLC service in April 2012.
"iLibraries: Digital Futures for Libraries," May 3, 2012
Submitted by Jennifer Eustis, Catalog/Metadata Librarian, University of Connecticut, Storrs, Conn., USA
The New England Technical Services Librarians (NETSL) held their annual conference at the College of the Holy Cross, Hogan Campus Center, Worcester, Mass. This year's topic was "iLibraries: Digital Futures for Libraries." This conference focused on digital initiatives and solutions to better manage and organize digital resources. The keynote presentation was delivered by John Unsworth on "Big Data: Big Deal? New Challenges for Scholars and Librarians." The afternoon had two panels. The first one, "Transforming Technical Services in the Library," was presented by Alicia Morris and Roger Brisson. The second one, "ILS (Integrated Library System) in the Cloud: Promise or Peril?" was delivered by Martha Rice Sanders and Bob Gerrity. In addition, the conference held four breakout sessions on Dataverse and Data Management Plans, Digital Repository Services with Fedora, OverDrive, and Using Technical Services Skills as a Systems Librarian.
The two afternoon panels delivered excellent information. Both were able to draw the crowd into a discussion on how technical services is changing. Alicia Morris explained how she transformed her department from what was a print-oriented cataloging department to one in which metadata plays an essential role. Roger Brisson gave the example of how Alma, from Ex Libris, has improved shared workflows, allowing Boston University to better meet its digital needs of the future. In the second panel, one of the most interesting comments was from Martha Rice Sanders, who said that we need to stop thinking of the ILS as the integrated library system and start conceptualizing the ILS as integrated library services. Bob Gerrity seconded this view. He stressed that one of the problems with the integrated library system is that it is a standalone service that silos libraries and our data from the rest of the world.
Overall, the conference was a success and drew people from outside of technical services. It brought a younger crowd this year. It also delivered timely and thought-provoking material. The presentations are slowly being added to the conference website (http://nelib.org/netsl/2012conference). For further information, NETSL now has a Facebook page and a Google Moderator topic discussion list for next year's conference themes.
British Library, London, England, April 26-27, 2012
Submitted by Barbara B. Tillett, Chief, Policy & Standards Division, Library of Congress and Chair of the Joint Steering Committee for Development of RDA
Diane Hillmann and Gordon Dunsire organized two days of meetings, graciously hosted by the British Library in London, that were broadcast in live streaming Web connections, recorded (thanks to Corey Harper), and followed by Twitter comments. The presentations and papers from the meetings will be made available on the Dublin Core Metadata Initiative Web site. (The original meeting, April 30-May 1, 2007, also held at the British Library in London, brought together interested individuals from the RDA: Resource Description and Access, Dublin Core, IEEE/LOM, and Semantic Web/W3C environments to discuss data models used in various metadata communities.)
The Dublin Core Metadata Initiative's Bibliographic Metadata Task Group had its inaugural meeting to discuss 1) application profiles, 2) alignments and mappings, and 3) the multilingual environment, with a focus on clarifying concepts and building action items for a Task Group.
1) Application Profiles: Gordon Dunsire led the discussion of "what is an application profile." We were reminded of the Dublin Core "Singapore framework" that provides guidelines, but it was also suggested that, for an RDA application profile, we simply need to finish the detail. We have the bounded properties, relationships, domains, and ranges for the elements in RDA, but do not yet have the information about what data is mandatory, repeatable, etc. There are also issues of some data being mandatory if applicable, so it was posited that separate profiles for different materials might be needed. Hopefully a way will be found to avoid that, learning lessons from the experience with ISBD and the MARC format, where separate approaches were later consolidated; not everyone agrees with that view, however.
2) Alignments and Mappings: There was a discussion about the differing needs of various communities in describing things. As an example, RDA's vocabulary has 3 or 4 terms for types of colorization, but other communities need more gradations. A lot has been written about mapping ISBD with FRBR, ISBD with RDA, RDA with FRBR, RDA with MARC 21, etc., and there are many methods, such as a "hub and spoke" approach. A recent draft ISO standard, ISO/DIS 25964-2, which discusses the interoperability of vocabularies and describes the various techniques, was also mentioned.
3) Multilingual Environment: There are many efforts underway to provide translations of the RDA value vocabularies, but again the ISO standard on the interoperability of vocabularies was mentioned as a good source describing the issues of mapping across languages: terminology is not always one-to-one, sometimes multiple words are needed to express a term from one language in another, and there are issues of declensions and other linguistic challenges. Definitions are needed; given a URI, the label used in display can then be in whatever language or script is desired and available in the system.
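The idea of a language-neutral URI carrying labels in many languages can be sketched in a few lines of Python. This is a minimal illustration, not any actual RDA registry API; the URI and the label set are invented for the example.

```python
# Minimal sketch (hypothetical data): a concept is identified by a URI,
# with labels stored per language; display picks the user's language,
# falling back to English when no label exists in that language.
LABELS = {
    "http://example.org/rdacontent/1020": {  # invented URI for the concept "text"
        "en": "text",
        "fr": "texte",
        "de": "Text",
    }
}

def display_label(uri: str, language: str, fallback: str = "en") -> str:
    """Return the label for a concept URI in the requested language."""
    labels = LABELS.get(uri, {})
    return labels.get(language, labels.get(fallback, uri))

print(display_label("http://example.org/rdacontent/1020", "fr"))  # texte
print(display_label("http://example.org/rdacontent/1020", "es"))  # falls back to "text"
```

The point of the design is that the URI, not any one term, is the identifier, so systems in different languages can interoperate while displaying whatever label suits the user.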
Diane Hillmann led the DCMI Vocabulary Management Community presentations and discussions, which focused on management and preservation of element sets, multilingual vocabularies, sustainably tracking the growing vocabulary space, and discovering vocabularies and evaluating whether to trust them.
Bernard Vatant of Mondeca spoke about "sustainability, discovery, and selection," using his project LOV (Linked Open Vocabularies) as an example for smaller Resource Description Framework Schema/Web Ontology Language vocabularies. Since March 2011 they have gathered about 250 vocabularies comprising about 20,000 elements and over 200,000 triples, cached daily from the various vocabularies. They are conducting a survey to get social feedback from users and to see how vocabularies rely on each other. He noted that harvesting metadata "is a pain" because basic metadata is often lacking and it is often hard to identify responsible agents. Their explorations raised issues of sustainability, versioning over time, archiving, and the need to describe responsibility so that a user can determine how reliable the data is. He suggested that there needs to be a charter for the commons, with a minimal level of commitment endorsed by the major actors (DCMI, W3C, libraries, etc.), and that there needs to be a business model. He stated that libraries are a "natural home" for vocabularies.
Mike Lauruhn of Elsevier Labs spoke about multilingual vocabulary development. He described the developments since last year's DCMI "unconference" in The Hague, where diverse participants explored AgroVoc as a use case for the EU (European Union). The EU works in 23 languages, and there is a need to transform them all to FRBR and RDF, so they developed authority and value vocabularies to validate metadata: countries, languages, currencies, and government bodies. Jon Phipps noted the multilingual support mechanisms on the Open Metadata Registry (OMR), using GitHub and VocabHub as the basis for new services, where vocabularies can be managed in spreadsheets with a column for each language, and RDF can then be generated from the spreadsheet. They have found this strategy a quick way to see gaps in translations while providing tracking of changes and provenance data. Another issue for multilingual vocabularies was multiple scripts and system limitations on using the full Unicode character set, which is itself not complete. Finding a way to move beyond that "brick wall" would certainly help move us forward in providing multiscript access for users.
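The spreadsheet-to-RDF strategy described above can be sketched as follows. This is a hedged, simplified illustration of the general approach (one row per concept, one column per language), not the actual OMR/VocabHub tooling; the URIs and terms are invented.

```python
import csv
import io

# Hypothetical spreadsheet: one row per concept, one column per language.
# Each filled cell becomes a skos:prefLabel triple; empty cells expose
# translation gaps, which is what makes this layout useful for review.
sheet = """uri,en,fr,de
http://example.org/term/1,book,livre,Buch
http://example.org/term/2,map,carte,
"""

def spreadsheet_to_turtle(text):
    """Turn language columns into Turtle-style triples; report missing labels."""
    rows = list(csv.DictReader(io.StringIO(text)))
    triples, gaps = [], []
    for row in rows:
        uri = row.pop("uri")
        for lang, label in row.items():
            if label:
                triples.append(f'<{uri}> skos:prefLabel "{label}"@{lang} .')
            else:
                gaps.append((uri, lang))
    return triples, gaps

triples, gaps = spreadsheet_to_turtle(sheet)
print("\n".join(triples))
print("missing translations:", gaps)  # [('http://example.org/term/2', 'de')]
```

A real pipeline would also track provenance and versioning (the report notes this is handled via GitHub), but the gap-detection benefit is visible even in this toy version.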
Barbara Tillett gave a brief overview of the 2007 meeting's goals and outcomes and reminded the attendees of what is still to be completed (a schema and application profile for RDA).
Diane Hillmann then spoke about "What we learned by building the RDA vocabularies." Part of the lesson learned was that there is a clash between the XML world view, a closed world defined by schemas, and the chaotic, open world of RDF, which has no "records" as we are used to thinking of them in the library community. She noted the need for "unconstrained" vocabularies, without the links to FRBR entities, which she said the non-library communities have asked for. One decision made in building the RDA vocabularies was to treat RDA's relationship designators (RDA Appendices I, J, and K) as RDF "properties" rather than "attributes."
Pat Riva, chair of IFLA's FRBR Review Group, presented an overview of a Canadian project, the "Pan-Canadian Documentary Heritage Network," which has pooled data from five sources and, using linked data techniques with mash-ups, produces an interesting mix of information for end users. She noted that work is underway to consolidate the FRBR family of conceptual models (FRBR, FRAD, and FRSAD) and to align ISBD and FRBR.
Alan Danskin, metadata standards manager at the British Library spoke on the topic "From Tags to Triples." He described the business and technical challenges in placing the British National Bibliography data on the Web as linked open data. In July 2011, the British Library released 85 million triples (equivalent to about 2.6 million BNB records, dating back to 1950) as linked data, connected to GeoNames, Lexvo, RDF book mashup, VIAF, LCSH, and DDC. The work was done in collaboration with Talis, who provided training, the technical infrastructure and support, but using, so far as possible, existing staff and utilities to convert MARC 21 records into RDF triples. In the 8 months since BNB was released, monthly transactions have increased from 85,000 to more than 6 million. By exposing the data and publishing the model, the British Library hopes to encourage re-use and innovation.
Ed Chamberlain from Cambridge University Library spoke about COMET, the Cambridge Open Metadata project, a 6-month project to test open linked data. The project identified issues about ownership of the records, noting that most vendors are OK with non-MARC data having a license of ODC-BY (Open Data Commons, with attribution). He noted that Harvard University recently published their records as CC0 (Creative Commons Zero, open with no attribution). The Cambridge data was a mix of more than 3 million records, including MARC 21 data from Library of Congress records and records from OCLC. They had mappings from MARC to Dublin Core and then to RDF. He noted that mapping from MARC 21 to RDF is a pain due to the use of numbers for fields and ISBD punctuation. They linked to LCSH, OCLC's FAST (Faceted Application of Subject Terminology, derived from LCSH), and VIAF. About 74% of their records had links to LCSH. The project failed because it hit the limits of the platform, ARC2, which was too lightweight for more than 1 million triples. They also found SPARQL was not good enough to index, so there was a high entry barrier to RDF as a result of the accompanying technologies. Open Bibliography 2 is a follow-on project in collaboration with the UK Publications Central, and they are exploring copy cataloging using open data in the Cambridge-Lincoln Open Cataloguing Knowledge Base (Open Knowledge Foundation).
Gordon Dunsire's animated presentation, "Turtles Dreaming," followed Terry the Turtle from the MARC 21 swamp into the world of linked data, where he realizes he is a triple in Turtle (a textual syntax for RDF), is linkable, and looks best as an RDF graph. He can arrive (link) at higher-level places without travelling, because he can be cloned, albeit with a semantic loss of definition. It is impossible to round trip, but this is not necessary in this context because Terry and his clones remain available to applications at their original levels of granularity. Terry has a cousin, Timmy, with issues related to aggregation and links to an instance group (for example, a publication statement that includes place and date of publication and name of publisher), who can't get where he's going due to a DCAM/DCAP chasm (Dublin Core Abstract Model and Dublin Core Application Profile). Work to bridge the gap and come to the rescue is underway in the Dublin Core Architecture Forum and the W3C RDF Working Group, which are looking at named graphs, application profiles, syntax encoding schemes, OWL, description sets, and more. And then there is the cosmic model in which a turtle carries 4 elephants that support the world: is there a triple to rule them all?
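For readers unfamiliar with Turtle, what it means for a MARC-derived statement to "be a triple" can be shown in a few lines. All URIs here are illustrative (the Dublin Core title property aside), and the book URI is invented; this is not any particular conversion tool's output.

```python
# A toy illustration of what it means for a statement to "be a triple in
# Turtle": a single MARC-derived fact becomes one subject-predicate-object
# statement in Turtle syntax.
subject = "http://example.org/book/1"          # hypothetical resource URI
predicate = "http://purl.org/dc/terms/title"   # Dublin Core "title" property
literal = "Alice's Adventures in Wonderland"   # content of a MARC title field

turtle = f'<{subject}> <{predicate}> "{literal}" .'
print(turtle)
```

A set of such triples forms the RDF graph Terry "looks best as," and anyone else's triples about the same subject URI link to it automatically.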
Mikael Nilsson (Google software engineer) spoke about "RDA and Metadata Harmonization: A Five-Year Plan." Libraries have traditionally been keepers of massive databases and were early adopters of computerization and standardization, but in 2007, at the time of the first London meeting, they were not in the forefront of open access and interlinking of data. Interestingly, the linked data world really got going in 2007 (the DBpedia image first appeared in May 2007, at the same time as the London meeting). Following encouragement from the London meeting, Barbara Tillett enlisted Ed Summers to launch the Library of Congress posting of LCSH in SKOS format, which was probably the first exposure of library data in that way, followed soon by work at the National Library of Sweden. The September 2011 image of DBpedia and linked data shows a huge complex of interconnected data, but nothing really yet links to RDA. How much of it should? How many will connect to RDA in 5 years, or rather, which are suited to using RDA in this context?
There was a shift in 2007 from viewing RDA as a self-contained closed standard only usable in libraries to an interlinked standard, re-usable in whole or in parts, based on an interoperable model. It was seen as a part of a bigger puzzle. RDA now has interoperable vocabularies posted on the Open Metadata Registry and lots of collaboration, including interaction with W3C, linked data communities, national libraries, etc.
He then went on to describe various linked data and harmonization goals in 2007 to connect databases using World Wide Web technology and RDF, connect vocabularies using RDF schema/OWL and SKOS, and connect systems based on the above. He mentioned the Pan-Canadian project as a good example, but we have not yet seen many great applications.
So by 2017, in another 5 years, he predicts we will still be exploring these same issues, but hopefully we will be a bit farther along. For libraries, Mikael sees the next steps as:
1. Open up the data: keep licenses open, plan for RDA, make RDA "documents" more accessible and re-usable for external developers, and provide format conversion from MARC, etc. He sees the library community already moving in the right direction.
2. Interoperate and experiment: provide more guidelines for using RDA data and help create the ecosystem of library data.
3. Vocabulary interoperation: develop more overlap and cross connections among vocabularies. Libraries should participate in and support vocabulary activities and develop application profiles at the intersection of data modeling and quality control/publishing.
4. Third-party applications: encourage these and involve unexpected users to help see creative new applications for library data. Publish the unconstrained vocabularies, use agile technologies, and support small-scale users and new technologies.
5. RDA should not develop ontological constraints: it should be used for simple vocabulary descriptions, not for quality control and not for record delimitation. For that, we need to work on the RDA application profile and description sets/named graphs, and separate the re-usable parts from local domain data and uses.
His conclusions are that we have travelled far and there are big opportunities ahead. The surrounding communities are advancing rapidly around linked data and libraries are waking up.
Robina Clayphan spoke next about "Europeana Data Model and Europeana Libraries." Europeana is a service covering the cultural heritage sectors in Europe with 1,500 providers, offering a portal to search and access digital objects that stay on the providers' sites. On 1 July 2012 they will move to a CC0 (Creative Commons Zero) license. They link to GeoNames for places, GEMET (General Multilingual Environmental Thesaurus) for subjects, and others. They are based on a Dublin Core Application Profile, but found it suffers from a lack of one-to-one rules, and they wanted to better distinguish between real-world objects and digital repository images. She noted that museums are event based while libraries are object based. They use RDA as a source for the attributes/properties for persons, families, and corporate bodies. They find they can identify cultural heritage objects (CHOs) more precisely, can have non-literal values, and can use event classes for aggregated statements in MARC; their first implementation missed some properties that they are now hoping to add.
Owen Stephens (JISC) spoke about the Resource Discovery Task Force and the JISC Discovery Programme. Among the issues are the publication and aggregation of open, re-usable metadata, with open licenses for re-use. They have prepared the "Open Bibliographic Data Guide," recommending that libraries start by thinking about making all data open (public domain or CC0), and then pull back if needed. JISC recently funded 8 projects to make metadata available, and 6 of those were linked data, COMET included. The projects typically aim to build skills locally and to foster community engagement and service development, in addition to making the data openly available. Their Phase 1 shares lessons learned from the projects. Many projects have used VIAF, OCLC FAST, and LCSH. They stress the importance of globally unique identifiers and of paying attention to how data works on the Web, especially with search engines. Phase 2 goes through July 2012 with 8 more projects, including Will's World (a Shakespeare project) and a World War I aggregator. They hope to find which bits of aggregation can be made easier for multiple audiences, purposes, formats, schemas, etc., using existing records that are available without having to translate from one format to another. JISC is in a period of transformation and their funding position is unclear beyond July.
Tom Baker (DCMI) led the closing discussion, starting with a review of the two days of meetings. He suggested the following goals for the next 5-year reunion in 2017:
If there are about 200,000,000 unique records in OCLC WorldCat, with about 30 MARC tags per record and so roughly 30 RDF triples per record, that's 6 billion triples; and if we bring in archives and museums (other cultural heritage organizations), there are trillions of triples, which is likely a lot more than DBpedia has. Gordon Dunsire suggested that cultural heritage organizations need to continue a public service mission. We need to make the Library of Congress Name Authority File and LCSH more granular so they are more usable by special communities and smaller, focused applications. There are lots of hooks for people to hang their data on. He suggested we watch the development of ISBN-A, which is a DOI with an ISBN; identifiers like the ISTC, with millions registered already; and name identifiers, with VIAF being a success story for libraries. The challenge is seen as matching library data with the questions people want answered. We want to put a positive spin on the data input over the decades by libraries as a key selling point. Diane Hillmann said that they tried that in 1995 with Dublin Core, and perhaps now it is time to focus on providing navigable rivers, to get libraries, museums, and archives out of the old silos. Corey Harper added that we can provide context, with everyone adding descriptive data linked to narratives to help people tell their stories.
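The triple-count estimate above is simple to verify; the figures are the round numbers cited in the discussion, not measurements.

```python
# Back-of-the-envelope check of the triple-count estimate quoted above.
records = 200_000_000     # approximate unique records in WorldCat (figure from the talk)
triples_per_record = 30   # roughly one RDF triple per MARC tag
total = records * triples_per_record
print(f"{total:,} triples")  # 6,000,000,000 triples
```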
Note: The author wishes to thank Alan Danskin, Gordon Dunsire, and Diane Hillmann for assisting with the facts of this report.
ISKO UK and the BCS Location Information Specialist Group (LISG), 29 March 2012, London, England
Submitted by Kimberly Kowal, Curator of Digital Mapping, British Library
This event presented an introduction to the current environment of geographic information (GI) and its complexities, in the form of a mixed bag of use cases illustrating geospatial data use, the associated challenges, and the diverse manner in which they are approached, primarily by UK government agencies and quangos (quasi-autonomous non-governmental organizations). Relative to the US, spatial data produced by the public sector in the UK (and EU) is only just opening up, and there is now a scramble of initiatives and efforts towards access, data linking, and interoperability. The presentations illustrated the disparate attempts to do just this, and conveyed the fairly accurate impression that there is no comprehensive framework, tool set, or standard being employed. The library community, as immense and diverse as it is, can in comparison seem a uniform, monolithic authority in its approach to data, or more precisely to metadata standards.
Amidst this topical morass, there were numerous nuggets of information relevant to library and information science professionals and research uses. Jo Walsh of EDINA, a national data centre based in higher education, presented the plethora of geospatial data access and linking tools developed there, including metadata catalogs, vocabularies and gazetteers, as well as some public online external initiatives, e.g., the Old Maps Online linked data project providing a geographic search interface and metadata management tool for scanned historical maps in libraries. Standards for GI and address data, issued by the British Standards Institution Committee for Geographic Information and used to inform data produced by Ordnance Survey (OS), were introduced by Carsten Rönsdorf, before their application in numerous OS address-based datasets was demonstrated.
All of the abstracts and PowerPoint presentations from the day may be viewed online.
Submitted by Hallie Cantor, Acquisitions, Hedi Steinberg Library, Yeshiva University, New York, N.Y., USA
These days, when the Mona Lisa can be a screensaver and the Hebrew Grace after Meals can be read off an Android phone, libraries and museums face the "virtual reality": future researchers may opt to view paintings or prayer books through pixels, rather than print, and in the comfort of their living rooms. The workshop "Rare Books and Archival Collections in the Digital Age: The Impact, the Direction and the Conversation," held March 26, 2012 at The Library of The Jewish Theological Seminary of America in New York City (JTS) and hosted by the Association of Jewish Libraries, New York Metropolitan Area chapter, addressed the need for, impact of, and use of digitization within specialized collections.
According to the six experts who addressed a packed audience, the Web will become not only a repository but a virtual library and museum in one. Art and incunabula, formerly the domain of stodgy glass vaults and exhibit rooms, will be downloaded for scholars and public alike, as well as integrated within other media. While for some institutions this may be a good thing, Judaica collections, in particular, face a double burden: general issues of digitization and issues that are uniquely cultural and religious.
Opening the conference, Naomi Steinberger, JTS Director of Library Services, remarked on her workplace as the appropriate conference venue. For over a decade the Jewish Theological Seminary, headquarters of the American Conservative Judaism movement and a prominent site of research, has been in the forefront of digital preservation, not only providing full access to sound recordings of their immense holdings, but utilizing the Web to publicize many unknown items. Much of the vaunted JTS library collection focuses on primary source material, i.e., newspaper clippings, 18th and 19th century pamphlets, folk music collections, and student theses, in addition to the traditional, historical, and religious documents. "Digitization determines the manuscript codices of the future," Ms. Steinberger declared.
In addition to developing in-house projects, JTS has collaborated with other institutions, e.g., Berkeley and Columbia. Among the images of medieval illuminated manuscripts found on one webpage, JTS has contributed 200 Spanish manuscripts; of the Friedberg Geniza (a burial site of sacred Hebrew texts), JTS holds the world's second largest number of fragments (30,000). In addition, several websites are committed to preserving old American-Jewish books, and there are also small funded projects with websites on educational topics.
Given JTS's amazingly rich resources, finding partners willing and able to undertake the more ambitious projects is a challenge. Funding has come from METRO (the Metropolitan New York Library Council) and the American Jewish Historical Society. Through the Eldridge Street Synagogue collections (on New York's Lower East Side, a famous Jewish neighborhood), archival documents were brought together, e.g., cemetery records and pre-1900 board meeting minutes. Digital photography has led to shared interfaces, where the researcher can move from one collection to another.
The upgrade of metadata has naturally created the need for state-of-the-art technology. Through the recent Leonard Polonsky Fund, a new digitization lab will make high-quality images of precious manuscripts and accommodate bound works of paper and parchment, among them illuminated and decorated manuscripts, in a climate-controlled environment. It will also keep the public apprised of ongoing projects. One inspiration is the National Library of Israel, which has digitized the world's 40,000 Hebrew manuscripts.
The sheer volume of projects and websites has pointed to incredible progress. Nevertheless, Ms. Steinberger concluded, "JTS is yet still in learning stages and has far to go."
In "Wonders of the World: The Taj Mahal and Rare Books," Dr. Peggy K. Pearlstein, Head of the Hebraic Section, African and Middle Eastern Division, Library of Congress, credited Divine Providence for her lecture topic: namely, two New York Times articles, both published within the same week and both covering India, which she herself had recently visited.
In the first article "From Exile to Everywhere" (Mar. 23, 2012), controversial author Salman Rushdie, upon first seeing the Taj Mahal, described his awe at the "real life" depiction of a familiar object, adding how constant reproduction devalues the original. Will this be the effect of digitization, Dr. Pearlstein wonders. The second article, found in the Sunday (Mar. 25) Travel Section, described the Taj Mahal as less overwhelming than expected, in spite of the 16 years needed to build the great mausoleum. So digitization cuts both ways.
While in India, Dr. Pearlstein came across, and purchased, two bilingual prayer books in Hebrew and Marathi, the regional language. Apparently serendipity (or, depending on one's point of view, Divine Providence) may lead to such finds.
So how, then, does the Library of Congress acquire its share of rare or unusual items? ("Rare" by LC definition means anything pre-1801.) As with other libraries, items come through gifts and purchases, as well as Federal exchange, in addition to the thousands of books that arrive daily through copyright deposit. When appraising an item, it is vital to ask how it would enhance the collections, and to try to determine its value within the context of its contemporaneous surroundings. Do similar books or materials already exist? The condition of the item is also important; some items are simply too fragile.
For a Jewish library, where contribution to the Jewish cultural heritage takes primacy, certain items will hold more value for Jewish patrons or researchers, for example, Bomberg's second edition of the Talmud, which was printed in the 17th century by the Jews in Amsterdam, as opposed to the first edition, which was printed by Christians and even contained a preface by the contemporary pope. Such an appreciation would require intimate knowledge of Judaica, as well as cultural sensitivity.
Referring to LC webpages, which feature some 18,000 manuscripts and get at least five billion hits a year, Dr. Pearlstein discussed the proper look of digital reproductions, which should have sharp enough resolution. Scanning and conversion, digital standards, and finding ways to make resources available to libraries with special collections are all critical. In these economic times, too, cost may be a key factor in the decision. So who decides ultimat