RLG DigiNews,
ISSN 1093-5371


Feature Article

Copyright Clearance in the Refugee Studies Centre Digital Library Project
by Mike Cave, Marilyn Deegan, and Louise Heinink
mike.cave@qeh.ox.ac.uk
marilyn.deegan@qeh.ox.ac.uk
louise.heinink@qeh.ox.ac.uk

Introduction
Clearing and protecting copyright are two major challenges facing every digital library today. With the increasing diversity of materials held within a given collection and specific rules applying to the different media, libraries must ensure that procedures are put in place to deal with complex issues of copyright when acquiring new material, digitizing existing material, and looking to protect against misuse of their own collections. This article describes the copyright issues encountered and the procedures adopted by the Refugee Studies Centre Digital Library Project.

Background
The Refugee Studies Centre (RSC) and its Library are part of the University of Oxford's International Development Centre at Queen Elizabeth House. It was set up in the early 1980s as the Refugee Studies Programme with entirely soft money, and is now a well respected academic department with slightly more secure funding—though it still relies heavily on grant income. The RSC's objectives are to carry out multidisciplinary research and teaching on the causes and consequences of forced migration; to disseminate the results of that research to policy makers and practitioners, as well as within the academic community; and to understand the experience of forced migration from the point of view of the affected populations. Forced migration includes a number of different areas: refugee issues, development-induced displacement and resettlement, internal displacement, trafficking, etc. These issues are worldwide concerns, and the RSC has a broad remit.

The RSC's Library is the largest dedicated to forced migration in the world, with a catalogued collection of more than 33,500 items. It is now both an invaluable archive and a centre for the collection and dissemination of current material in the field of forced migration. The majority of the collection is grey literature, that is, material of an unpublished or semi-published nature. All items in the RSC library collections are catalogued electronically, and a simple Web catalogue has been available for searching since 1995. The library collection is visited by scholars, students, and practitioners from all over the world. More of the RSC's readers come from outside Oxford than from within it.

The Digital Library Project
In order to make the unpublished materials more widely available, The Andrew W. Mellon Foundation made a grant of $500,000 for the digitization of a substantial portion of these materials, and the Digital Library Project started in September 1997. A grant of c.80,000 ecus was also obtained from the European Union under its Phare democracy programme. The Phare grant was made for joint work with the Czech Helsinki Committee in Prague. Between 1997 and 1999, comprehensive feasibility and pilot studies were carried out, and a test set of digital documents (attached to catalogue records) is available for searching. This system will need substantial revision for the full project; it is intended as a "proof-of-concept" at the moment. A CD-ROM of background information on the Balkans containing approximately 100 searchable documents was also produced. The feasibility study was carried out by the Higher Education Digitisation Service (HEDS).

The project has now moved into a production phase, with documents going through a workflow process of selection, copyright clearance, removal from the collection, inventory, dispatch to vendors for scanning, return of documents, receipt of digital images and text, and reintegration of documents into the collection. Specification of delivery systems is under way (in collaboration with other services within Oxford University), and roll-out of the digital library will begin in 2001.

Copyright Issues
Materials in the Library have been collected since the early 1980s. Many of the archive materials are older than this, but everything in the collection is in copyright. Some of the grey literature comes from organizations; much of it is supplied by individuals. Over the next two to three years, 8,000-10,000 documents from the collection will be digitized, and the process of clearing the copyright on these is under way. It is a fearsome task. Many libraries eschew tackling the issues by deciding to work with older, rights-free materials. We cannot do that. It is the up-to-date nature of our collection that makes it a potentially vital resource for those who need the information it contains. In the original estimates of timescales, costs, and deliverables for the project, it was never assumed that the copyright clearance process would take so long or cost us so much even though we don't actually pay anything to obtain rights to digitize.

Copyright Clearance: Our Process
It has taken some time to establish the workflows for copyright clearance. First of all, we had to ascertain exactly what the legal situation was, and what, under UK law, we could and could not do. Laws differ from country to country, but that does not greatly affect the overall process. Professor Charles Oppenheim of Loughborough University is one of the leading experts in copyright for digital libraries in the UK, and does a great deal of work for the Joint Information Systems Committee (JISC) of the UK Higher Education Funding Councils. We therefore commissioned Professor Oppenheim to review the copyright situation for us, and to produce a model license agreement to use as a basis for negotiation. The next step was to confirm with the University of Oxford's legal department that the license was appropriate, and they in turn had to check with the University's insurers so that they could indemnify us against anything going wrong. This proved to be a lengthy process. Finally, we had everyone's agreements, and a license that all were happy with. The agreement and a copy of a covering letter explaining what we were going to do with the documents are both available online.

Our intention when we began work on copyright clearance was to do the best we could to clear as many rights as possible, and then to use what is called 'best endeavour' and digitize and make available materials that we had been unable to clear. 'Best endeavour' is a term that assumes the best effort possible has been made to find the rights owners and seek their permission. If the rights owners have not been found, or have not replied after several attempts, materials can be reproduced and made available as long as there is a statement saying what efforts have been made to clear the rights, and all steps taken have been fully documented. For various reasons given below, we may not now scan materials without explicit, signed permissions.

Workflow
While all the legal work was in progress, we began to think through workflows, and how we could streamline these as much as possible. Copyright clearance is only done on materials that have been selected for scanning, but as it is the most time-consuming part of the document preparation process, we felt that it was not possible to select documents, pull them from the collection, then start the clearance. What we have is a working collection, and items must only be out of commission for the shortest possible time. They therefore have to be available during the clearance, and are taken from the collection just before being sent away for scanning. It was also clear to us that if we were dealing with thousands of documents, each one needing several transactions in the selection, clearing, and scanning process, we would need some means of recording and tracking all the steps taken. A database was therefore developed in Microsoft Access to manage this. The database (1) contains bibliographic records of all selected documents (downloaded from the library catalogue), contact details of all individuals and organizations who hold the rights, and records of all the transactions entered into during the clearance and scanning phases. At any time, we can find out what the status of a document or transaction is, and can print reports on batches of documents. We can also print out lists of documents to be sent to an individual or organization when we write to secure the rights, and we can generate bulk mailings using the database. What we have established is a sophisticated process, and one that works well.
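The tracking database described above can be pictured with a small sketch. The project used Microsoft Access; the example below swaps in Python's built-in sqlite3 module, and every table name, column name, and step label is an assumption made for illustration rather than the project's actual schema.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative only.
SCHEMA = """
CREATE TABLE rights_holders (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    kind TEXT NOT NULL CHECK (kind IN ('individual', 'organization')),
    address TEXT
);
CREATE TABLE documents (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    holder_id INTEGER REFERENCES rights_holders(id)
);
CREATE TABLE transactions (
    id INTEGER PRIMARY KEY,
    document_id INTEGER NOT NULL REFERENCES documents(id),
    step TEXT NOT NULL,
    happened_on TEXT NOT NULL
);
"""


def open_tracker(path=":memory:"):
    """Create the three tables: rights holders, documents, and transactions."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_step(conn, document_id, step, happened_on):
    """Log one clearance or scanning step; happened_on is an ISO date string."""
    conn.execute(
        "INSERT INTO transactions (document_id, step, happened_on) VALUES (?, ?, ?)",
        (document_id, step, happened_on),
    )


def current_status(conn, document_id):
    """Return the most recent step recorded for a document, or None."""
    row = conn.execute(
        "SELECT step FROM transactions WHERE document_id = ? "
        "ORDER BY happened_on DESC, id DESC LIMIT 1",
        (document_id,),
    ).fetchone()
    return row[0] if row else None


def documents_for_holder(conn, holder_id):
    """List the titles to enclose when writing to one rights holder."""
    rows = conn.execute(
        "SELECT title FROM documents WHERE holder_id = ? ORDER BY title",
        (holder_id,),
    ).fetchall()
    return [title for (title,) in rows]
```

Keeping transactions in their own table means a document's status is simply its latest recorded step, and a mailing list for one organization is a single query over the documents it holds rights to.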

Problems
The processes described above have all taken much longer to set in place than could ever have been imagined, and progress is slow. We had to appoint a new member of staff to manage copyright (Louise Heinink), who has been working three days a week for over a year, and the end is not yet in sight. This has sent the costs of the project up significantly. One of the most difficult tasks is tracking down the rights owners. Where documents have been produced by organizations, this is generally straightforward. We may have a record of the address in the Centre, it may be on the document, or they may have a Web site that we can find. With organizations, too, the chances are that we have a whole batch of material, so one transaction enables us to clear a substantial volume. Individuals are a different matter. Refugee studies is a fast-moving field in many senses, and the academics and practitioners engaged in it move around a great deal according to current need. Someone in Kosovo in 1999 is likely to have moved on by now, and we have been trying to find people who deposited reports, conference papers, and other documentation with us in the 1980s. Locating authors is our single biggest problem, and we get many of our communications returned as "unknown."

As suggested above, we had planned to make our best efforts to find people, and then scan if we did not hear from them. The great majority of people and organizations we hear from are warmly supportive of our aims; they sign the forms gladly, and also often send further documents for scanning. But some sensitive issues have arisen that have led us to conclude that we will probably need to have an explicit release for all documents. For instance, some of the documents may have been written for circulation to a very small audience, and while it is acceptable to have them available on paper in one location, much wider access through the Internet (which is of course the key aim of the project) may not be appropriate for a number of reasons, such as anonymity of asylum seekers. Then, too, some of the holdings are conference papers that have been given at an early stage in a project, and then published in a revised form. Another, very strange, problem is that people often do not remember writing a particular document, and they telephone or email us to ask us to send a copy! We have had several responses when we do send them out, the most common being that we should get rid of the document.

Costs
Estimating the costs of copyright clearance is very difficult. The costs are likely to be highest in the first year, as all of the up-front work of assessing the scale of the problem, building the database, establishing the workflows, and writing the licenses has to be done then. Thereafter, the costs probably drop by around 50%. There are also huge differences in the costs per item: it can take less work to clear 30 long documents from one organization than to clear one short one from an individual. Some rough calculations on our own expenses show that it cost perhaps an average of £5-6 per document during the first year, and is now dropping to an average of around £2-3 per document. These are just the administration costs: we pay no fees. In 1998, we did scan some published documents for a teaching module we were developing, and we were charged fees by publishers for these. The highest was £200.
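As a rough illustration of the arithmetic, the sketch below turns the per-document ranges quoted above into a cost range for a batch. The function name and the batch size in the usage note are hypothetical; only the two per-document ranges (in pounds) come from the text, and the result covers administration only, not any publisher fees.

```python
# Per-document administration costs in pounds, as quoted in the text.
FIRST_YEAR_RANGE = (5, 6)
LATER_YEAR_RANGE = (2, 3)


def clearance_cost_range(first_year_docs, later_docs):
    """Return (low, high) administration cost in pounds for a mix of documents."""
    low = first_year_docs * FIRST_YEAR_RANGE[0] + later_docs * LATER_YEAR_RANGE[0]
    high = first_year_docs * FIRST_YEAR_RANGE[1] + later_docs * LATER_YEAR_RANGE[1]
    return low, high
```

For a hypothetical batch of 1,000 documents cleared at the later rate, `clearance_cost_range(0, 1000)` returns `(2000, 3000)`. Because per-item effort varies so much between organizations and individuals, such a range is only a planning aid.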

Conclusion
We have now cleared almost 2,000 documents for scanning, which has involved thousands of letters and licenses to be mailed, chased up, and remailed; thousands of dates and transactions to be entered into the database; and many hours of work tracking down people all over the world. No one has asked us to pay for the rights we have secured, and there has been much goodwill and enthusiasm for the project. We do have a few words of advice to offer others in similar situations. While this has been difficult and costly, our conclusion is that there is no way to avoid having to clear copyright when scanning modern printed materials—whether they are published or not. There are still some unresolved legal issues, but conservatism in interpreting the law is probably wise, and getting indemnity insurance in case something goes wrong is vital. This may be a cost on the project, but those working in large institutions may find that the institution's insurers will be prepared to apply coverage from existing policies. In our case, what we had to do was satisfy the University's insurers that we were taking all reasonable steps to apply the proper processes. They then considered that we were covered on the overall block insurance of the University. The key points were a) the risks were low because we were doing things properly and there was a lot of goodwill on the part of the rights holders to our aims and b) there was never any question that someone could lose revenue from what we were doing as these are materials of no monetary value.

It is the sheer complexity of the copyright, and the time it takes to send out all the letters and process all the replies that is most daunting. This has become routine for us now, but it has taken us a long time to get there. If we can offer any help or advice to others embarking on similar endeavors, don't hesitate to contact us.

Footnotes

(1) Enquiries about the database should be addressed to Mike Cave (mike.cave@qeh.ox.ac.uk).

Feature Article

Digitization Grants and How to Get One: Advice from the Director, Office of Library Services, Institute of Museum and Library Services
by Joyce Ray
Director, Office of Library Services, Institute of Museum and Library Services
jray@imls.gov

With Internet access rapidly becoming ubiquitous, digital content is beginning to supersede connectivity as the hot-button issue. If the Internet is to realize its potential to transform the way society uses information, it must offer high-quality resources. Commercial and entertainment sites abound on the World Wide Web, but there is a great need for good educational content. The growth of educational resources has been slower than other sectors, but many libraries, museums, and archives are trying to change that. These institutions hold a wealth of materials spanning the whole spectrum of human knowledge, and documentation that was once available only to a small number of scholars may now be used by students, teachers, researchers, and the general public worldwide, without fear of theft or damage to fragile and valuable items. When a museum or library establishes a Web presence and makes its holdings available to the cyber-public, it is contributing educational content to the Internet and serving the public good as well as enhancing the institution's visibility.

Yet the creation of accessible digital content is expensive. It requires an investment in hardware and software, a trained labor force to prepare materials and scan them at the appropriate level of quality—while handling fragile items carefully—and catalogers and indexers to create the metadata that is needed to retrieve and manage the digital information. Even when the scanning is outsourced, the institution must develop conversion guidelines and Requests for Proposals, pay for vendor services, perform quality control checks, and create metadata. It is no wonder that most institutions turn to outside sources of funding—usually private foundations or government agencies—for some of the costs of digitization.


The Institute of Museum and Library Services (IMLS) was fortunate enough to benefit at its creation from the growing interest in digitization. This new U.S. federal agency, established by Act of Congress in 1996, has statutory authority for the "preservation or digitization of library materials and resources." Since 1998, IMLS has been funding digitization projects through its National Leadership Grants program. Over the past three years, our understanding of what is required for a successful digitization project, and what reviewers want to see in a grant proposal, has improved. This knowledge is reflected in the increasing length of our guidelines. Potential applicants may find them a bit daunting at first, but the guidelines will provide even novices with the information resources and guidance they need to plan and carry out successful digitization projects.

The new 2001 National Leadership Grant guidelines will be on the IMLS Web site after November 1. The deadlines for digitization projects are now established as:

February 1 for libraries
March 1 for museums
April 1 for library and museum collaborations

The key to writing a successful digitization proposal for IMLS is really the same as for writing any successful proposal. Applicants should always establish a clear vision of what they hope to accomplish, should read and follow the program guidelines, and consult with program staff along the way. National Leadership Grant proposals are evaluated by peer reviewers on a set of evaluation criteria described in the guidelines. Applicants should read all the evaluation criteria carefully and keep them in mind while preparing the application. Some important points to consider are:

Footnotes

(1) Located with the National Leadership Grant Application and Deadlines, under "All About Grants and Awards" on the IMLS Web site at http://www.imls.gov

Highlighted Web Site

Moving Theory into Practice: Cornell's Digital Imaging Tutorial
The Department of Preservation and Conservation of Cornell University Library announces the public release of its online digital imaging tutorial, Moving Theory into Practice. Although designed as an adjunct to the recently published book and workshop series known by the same name, the tutorial can also serve as a standalone introduction to the use of digital imaging to convert and make accessible cultural heritage materials.

Produced with funding from the National Endowment for the Humanities, the tutorial is currently available in English, with a Spanish language version to follow in December 2000 (from the same Web address). The tutorial consists of sections encompassing all the major aspects of digital imaging: Selection, Conversion, Quality Control, Metadata, Technical Infrastructure, Presentation, Digital Preservation, and Management. Designed to be self-guided and self-paced, the tutorial includes frequent "reality checks" for evaluating the understanding of the presented material. Most sections are heavily illustrated, and provide suggestions for further reading. The tutorial also includes several tables, providing reference data on topics such as graphic file formats, compression techniques, scanner characteristics, and institutional guidelines for conversion and presentation.

FAQ

What is "zip" compression? Can it be used for storing digital image files?

Let's first clear up one potential source of confusion. Zip has two common uses in current computer parlance. Zip drive or Zip disk (always capitalized) refers to a proprietary removable storage device and its media, patented by Iomega Corporation. At present Zip drives are popular storage peripherals on both Windows and Macintosh computers, using magnetic media that store either 100 MB or 250 MB per disk. Zip disks can hold any kind of file, including zip compressed files, but there is no connection between the two.

The zip we're referring to is not really a compression technique per se, though it is often thought of that way. It is more accurately described as an "archive" format. Archive in this usage refers to any set of computer files that are bundled together into a single file (called a zip file) in a manner that preserves their individual identities and hierarchical directory arrangement. A main usage for archive formats has been to ensure that all the files needed for a software application (especially shareware) stay together. There are a number of other archive formats, including arc, sit (used on Macintoshes), and tar (used on Unix machines), but zip is the most commonly seen.

As an example, an archive of a shareware database manager might include the main program, related utility programs, various documentation files in a separate folder, and a form for sending payment to the program's designer. A zip utility would bundle all these files together into a single file, while an "unzip" utility would restore the files to their original form.
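The bundling and restoring described above can be sketched with Python's standard zipfile module, used here simply as one readily available zip utility. The function names `bundle` and `restore` are hypothetical, chosen for illustration.

```python
import zipfile
from pathlib import Path


def bundle(source_dir, archive_path):
    """Bundle every file under source_dir into one zip file, keeping relative paths.

    archive_path should live outside source_dir so the archive does not
    try to include itself.
    """
    source_dir = Path(source_dir)
    with zipfile.ZipFile(archive_path, "w") as archive:
        for path in sorted(source_dir.rglob("*")):
            if path.is_file():
                # arcname stores the path relative to the bundle's root folder.
                archive.write(path, arcname=path.relative_to(source_dir).as_posix())


def restore(archive_path, target_dir):
    """Unpack the archive, recreating the original folder layout."""
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target_dir)
```

Storing each file under its relative path is what lets the unzip step put documentation back in its separate folder alongside the main program.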

File compression is a commonly used, though optional, feature of zip archives. In most cases, the ability to bundle multiple files together is combined with compression of the files in order to create a smaller archive that uses less disk space and requires less time to send across a network. Zip utilities support several different lossless compression algorithms, and may analyze each file to try to determine which compression technique will be most effective.
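To illustrate that compression is an option layered on top of bundling, the sketch below writes the same data into two in-memory zip archives using Python's standard zipfile module: once stored as-is and once with the lossless "deflate" method. The helper name `archive_size` and the sample text are assumptions for illustration.

```python
import io
import zipfile


def archive_size(data, compression):
    """Return the size in bytes of an in-memory zip holding one file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=compression) as archive:
        archive.writestr("sample.txt", data)
    return len(buffer.getvalue())


repetitive = b"forced migration " * 1000
stored = archive_size(repetitive, zipfile.ZIP_STORED)      # bundled, not compressed
deflated = archive_size(repetitive, zipfile.ZIP_DEFLATED)  # bundled and compressed
```

On highly repetitive text like this, the deflated archive comes out much smaller than the stored one, while both unpack to identical bytes. Data with little redundancy will shrink far less.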

Zip has the advantage of being widely supported across computing platforms, and could conceivably be used for digital image files. However, zip was not designed with image data in mind. It has no special functionality for image files, and the supported compression techniques are not optimized for image data. Better alternatives include TIFF, which supports multi-image files and several image-specific compression techniques, and PDF, which supports multi-image files, internal hierarchies, and image-specific compression.

--RE

Calendar of Events

Digital Strategies - 2000
November 16 - 17, 2000, College Park, MD

To be held at the National Archives and Records Administration, this conference will focus on topics such as building the information infrastructure and NARA initiatives that tap into the enabling technologies of the next generation to improve management, preservation, and access to electronic records. For further information contact: digitalstrat@arch2.nara.gov.

Information Infrastructures for Digital Preservation
December 6, 2000, York, England

Preservation 2000: An International Conference on the Preservation and Long Term Accessibility of Digital Materials
December 7-8, 2000, York, England
These meetings will focus on the long-term accessibility of digital materials, and discuss the key issues linked to their preservation. The provisional program is now available.

ADL2000: Third International Conference on the Asian Digital Library
December 6-8, 2000, Seoul, Korea
The goal of this conference is to share and disseminate information and knowledge about current issues regarding digital library research and technology. Special emphasis will be on experiences and problems with available systems and technology for digital libraries.

DELOS Network of Excellence Workshop on Information Seeking, Searching, and Querying in Digital Libraries
December 11-12, 2000, Zurich, Switzerland
Jointly sponsored by the European Commission under the DELOS Network of Excellence on Digital Libraries and the National Science Foundation (NSF), this workshop will bring together researchers and practitioners interested in digital libraries to present and discuss recent results as well as future research directions.

 

Announcements

Getty Trust Funds Survey and Guide to Good Practice
The J. Paul Getty Trust has announced the award of $140,000 to the National Initiative for a Networked Cultural Heritage (NINCH) to direct an innovative project to review and evaluate current practice in the digital networking of cultural heritage resources. NINCH will publish in the fall of 2001 a Guide to Good Practice in the Digital Representation and Management of Cultural Heritage Materials in print and electronic form.

Visual Resources Association's New Image Collection Guidelines
The Visual Resources Association recently announced the publication of a new guide for visual resources professionals, The Image Collection Guidelines: The Acquisition and Use of Images in Non-Profit Educational Visual Resource Collections. The guide will provide practical principles for the acquisition, attribution, and display of visual images for educational use.

Technical Advisory Service for Images (TASI) Revamped Web Site
TASI, the JISC's Technical Advisory Service for Images, has launched a new-look Web site, making it even easier to access the useful information offered by the service. The TASI information is provided free of charge to organizations contemplating or implementing a digitization project.

Thomas A. Edison Papers Now Online
Available on the Web, the documents are searchable in several ways: by names, dates, and document types. The text of the 4,000 introductory editorial targets is also searchable, and the edition can be browsed by record groups. The database contains records for 100,000 documents, 80,000 of which are now available as digital images, and some 15,000 names.

Preserving Australian Physical Format Electronic Publications - Selection Guidelines
The National Library of Australia (NLA) has recently compiled some guidelines for internal use on the selection for preservation of physical format digital materials in the general collections.

Northeast Document Conservation Center Announces Handbook for Digital Projects: A Management Tool for Preservation and Access
For the past five years, the Northeast Document Conservation Center (NEDCC) has explored the complex issues surrounding digital preservation through its School for Scanning conferences. The Handbook for Digital Projects is focused on meeting the information needs of libraries, museums, and archives.

In the Picture, Preservation and Digitisation of European Photographic Collections
Within the framework of the EU project "Safeguarding European Photographic Images for Access" the European Commission on Preservation and Access (ECPA) published In the Picture, Preservation and Digitisation of European Photographic Collections. This report describes the way in which European institutions manage their photographic collections in terms of preservation and digitisation.

RLG News

Long-Term Retention of Digital Information: Events in Early December

Preservation 2000: An International Conference on the Preservation and Long-Term Accessibility of Digital Materials is a two-day, international conference in York, England on Thursday and Friday, December 7 and 8, 2000. It is sponsored by the Consortium of University Research Libraries' Cedars Project (CURL exemplars in digital archiving), RLG, and OCLC, in association with the UK Office for Library Networking. The main goal for the conference is to share, disseminate, and discuss current key issues concerning the preservation of digital materials. These include models for digital archives, the economics of digital preservation, and content and selection issues surrounding digital preservation. RLG and the other sponsors are seeking to facilitate meaningful dialog among the wide array of organizations and individuals currently working with digital archives and preservation.

Immediately preceding Preservation 2000 is a related, one-day preconference on Wednesday, December 6: Information Infrastructures for Digital Preservation. Limited to the first 60 registrants, this intensive day includes presentations and papers on current work in digital preservation metadata and standards for description. Attendees will participate in discussions and debate on developments in this key area.

Both events feature expert speakers from North America, Australia, and Europe. The registration fees, depending on options chosen, range from £110 (approximately $160 US) to £250 ($362 US). November 13 is the last day to register for one or both of the events focusing on digital preservation. We encourage participation by libraries, archives, museums and other cultural and heritage organizations that are dealing with digital preservation and access issues. For programs and registration forms, see the event Web sites:

http://www.rlg.org/events/cedars-2000/ (access from outside of Europe)
http://www.ukoln.ac.uk/events/cedars-2000/ (access from within Europe).

Hotlinks Included in This Issue

Feature Article 1
Refugee Studies Centre Digital Library Project http://www.qeh.ox.ac.uk/rsc/
RSC Web catalogue http://www.bodley.ox.ac.uk/rsc/
Phare democracy programme http://www.dlinnt.ee/phare/what.shtml
Phare's test set of digital documents http://rsc.qeh.ox.ac.uk/rsccat
Higher Education Digitisation Service http://heds.herts.ac.uk
Copyright Clearance http://earlybird.qeh.ox.ac.uk/fm/copyright

Feature Article 2
The Institute of Museum and Library Services http://www.imls.gov

Highlighted Web Site
Digital Imaging Tutorial http://www.library.cornell.edu/preservation/tutorial/
Moving Theory into Practice http://www.rlg.org/preserv/mtip2000.html

Calendar of Events
Information Infrastructures for Digital Preservation http://www.ukoln.ac.uk/events/cedars-2000/programme.html
Preservation 2000 http://www.ukoln.ac.uk/events/cedars-2000/programme.html
ADL 2000 http://adl2000.kaist.ac.kr/
DELOS Network of Excellence Workshop http://www.lib.uoa.gr/delos/

Announcements
National Initiative for a Networked Cultural Heritage http://www.ninch.org/
Visual Resources Association http://www.oberlin.edu/~art/vra/guidelines.html
Technical Advisory Service for Images http://www.tasi.ac.uk/
Thomas Edison Papers http://edison.rutgers.edu
National Library of Australia http://www.nla.gov.au/policy/selectgl.html
Northeast Document Conservation Center http://www.nedcc.org
In the Picture, Preservation and Digitisation of European Photographic Collections http://www.knaw.nl/ecpa/publ/picture.pdf

RLG News
Access to CEDARS from outside of Europe http://www.rlg.org/events/cedars-2000/
Access to CEDARS from within Europe http://www.ukoln.ac.uk/events/cedars-2000/

Publishing Information

RLG DigiNews (ISSN 1093-5371) is a newsletter conceived by the members of the Research Libraries Group's PRESERV community. Funded in part by the Council on Library and Information Resources (CLIR), it is available internationally via the RLG PRESERV Web site (http://www.rlg.org/preserv/). It will be published six times in 2000. Materials contained in RLG DigiNews are subject to copyright and other proprietary rights. Permission is hereby given for the material in RLG DigiNews to be used for research purposes or private study. RLG asks that you observe the following conditions: Please cite the individual author and RLG DigiNews (please cite URL of the article) when using the material; please contact Jennifer Hartzell at jlh@notes.rlg.org, RLG Corporate Communications, when citing RLG DigiNews.

Any use other than for research or private study of these materials requires prior written authorization from RLG, Inc. and/or the author of the article.

RLG DigiNews is produced for the Research Libraries Group, Inc. (RLG) by the staff of the Department of Preservation and Conservation, Cornell University Library. Co-Editors, Anne R. Kenney and Oya Y. Rieger; Production Editor, Barbara Berger Eden; Associate Editor, Robin Dale (RLG); Technical Researcher, Richard Entlich; Technical Assistant, Carla DeMello.

All links in this issue were confirmed accurate as of October 13, 2000.

Please send your comments and questions to preservation@cornell.edu.
