RLG DigiNews, ISSN 1093-5371

Editors' Note

With this issue we are introducing a new feature to RLG DigiNews, the Editors' Interview. Interviews will appear on an irregular basis in lieu of our feature articles. We are pleased that Kevin Guthrie, President of JSTOR, agreed to be our first interviewee and to discuss with us JSTOR's plans to safeguard its growing archives of electronic scholarly literature.
Anne R. Kenney and Oya Y. Rieger, Co-Editors

 

Editors' Interview

Developing a Digital Preservation Strategy for JSTOR, an interview with Kevin Guthrie
KG@jstor.org


For information on JSTOR's services, see http://www.jstor.org/.

One of JSTOR's primary missions is to provide a trusted and lasting archives of electronic scholarly literature. How do you define a trusted archives? Could you describe your plans for creating one?

Well, trust in this context is both very important and rather difficult to define. It is important because the goal is to be able to establish a relationship whereby a library can rely on a third party to provide a service that has been a core function of the library; that is, archiving. That is no small responsibility and any enterprise that aims to provide such a service is going to have to earn a very high level of trust.

At this early juncture, I don't think there exists a standard definition or "litmus test" of what it will take to become such a trusted archives. I do think that the mission of the enterprise is a fundamental component of the assessment. If the goal of providing long-term preservation is anything less than a core component of the "archiving" organization's reason for being, it does not seem to me that it can truly be relied upon to offer long-lasting access. In this fast-moving technological and economic environment, too many things can happen.

JSTOR was founded on the goal of providing a trusted archives of journal literature. Our objective is to offer a centrally-held repository that will both reduce system-wide costs of archiving and increase the archives' usefulness. Because this is our mission, every aspect of our organizational and financial planning is built around this objective.

In our experience, the job of electronic archiving has five primary components, which I will just highlight here. An archiving organization must address each of these components in some way if it is to deliver a reliable and long-term archival solution.

1. Technological Approach
Data formats, software infrastructure, and data storage specifications must be selected with long-term access in mind. That generally means open standards should be pursued whenever practicable.

2. Preservation and Back-up Plans
Extensive plans must be in place for protecting the electronic data. If we take steps to develop a more centralized approach to archiving electronic data than we have for the existing print model, there will be fewer copies of the data spread around the system. We must therefore be sure that the appropriate safeguards are in place to prevent loss due to disasters, human error, etc.

3. Relationships with Content Providers
If a library were going to be responsible for an archives for which it owned all legal rights, the relationship between the archival organization and content owner would not be an issue, since they would be one and the same. But if an archives is to act as a third-party safe repository for data owned by other entities, it must reach agreement with the content owner that allows it to provide long-term access to that material. There is a natural tension here between the content owner's desire to restrict access in order to maximize the return on investment, and the archives' desire to provide access to facilitate migration and usability of the data in a changing technological environment. This tension must be addressed in some kind of license agreement.

4. Relationships with Users/Libraries
If the archiving entity is going to provide access to content it holds through license agreements with libraries, the issue of long-term access to the content at each individual library must be addressed. This can be a complicated and difficult issue for electronic materials.

Consider the case of journal subscriptions. In the paper world, the publisher has fulfilled its responsibilities to the subscriber when the library receives the journal issue. From that point on, a library owns the issue and is responsible for all costs associated with preserving and providing access to it. The library decides if keeping the material is justified, whether it needs to be housed in off-site storage, or even if it is time to discard the material.

In the electronic environment, the content can be centrally held and access can be distributed over networks. In that case, the ongoing costs of maintaining and providing access to the material remain with the publisher when a library subscribes. So herein lies the problem: if the library cancels a subscription in the paper environment, it has the content to archive and it chooses to incur costs locally to maintain it. If the library cancels a subscription in the electronic environment, and wants to maintain access, the content provider incurs the on-going costs of maintaining access. Unless some revenue flows to the content provider, that model is not sustainable. The costs need somehow to be matched to the resources to cover them. It is of course possible to distribute the electronic content on fixed media, for example, on a CD-ROM, but that replicates the costly archiving system that exists with paper. It doesn't take advantage of the opportunity to centralize the storage of material to reduce costs in the system.

That is why, for entities providing electronic archives for libraries, the issue of trust is so critically important. The goal is for a third party, whether it be another library, a society, a government agency, or another not-for-profit, to be able to provide what are, in effect, archival services to libraries. This is a core component of a library's mission, so it is not something that can be taken lightly. It really suggests a new way of looking at the problem, and one that will require new organizational, accounting, and financial models if it is to be addressed optimally. The dynamic nature of the evolving technological environment demands some form of more centralized approach; it is not possible to imagine every library storing and migrating electronic data in the way that they have archived print. The costs at each institution are simply too great.

5. An Economic Model with a Reasonable Probability of Reaching Self-Sufficiency
If an archives is to be trusted, it will have to have a reasonable plan for generating the resources necessary to cover the costs of the regular maintenance and continuing migration of software, systems, data, and metadata that will be required to ensure long-term accessibility. This does not necessarily mean it will have to charge fees for access or archiving, although that is one model. It could mean that there are barter arrangements between libraries, or it could be that the archives has an endowment or some clear permanent commitment from a large institution or government agency. There are surely other possibilities, including combinations of the choices just listed. But the point is that there must be evidence that an organization can reasonably be expected to have the resources at hand that will be required to keep the data accessible.

Have you established base funding figures for this work? How do you plan to cover these costs?

In general, JSTOR generates revenues from fees charged to licensing organizations, what we call participating institutions. These fees have two components: the one-time Archive Capital Fee and the Annual Access Fee. The Archive Capital Fee is used both to help underwrite the costs of digitizing new collections and to fund the development of an archives reserve fund. Just as there is a capital cost associated with building shelf space, there are capital costs associated with building a database and the infrastructure to house and maintain it. Our Archive Capital Fee is structured to be significantly less than the comparable costs of building the infrastructure to store paper or the data locally. It is also substantially less than the cost of building the collection in another format. For example, to purchase the complete microfilm runs of the 117 titles in our Arts & Sciences I collection would cost over $400,000. The Archive Capital Fee for Very Large institutions is one-tenth that; the Archive Capital Fee for Very Small institutions is 2.5% of that figure.
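
To make the fee comparison concrete, here is a minimal sketch of the arithmetic in Python, using only the figures quoted above (the $400,000 microfilm cost, the one-tenth multiplier for Very Large institutions, and the 2.5% multiplier for Very Small institutions); the variable names are ours.

```python
# Worked comparison for the Arts & Sciences I collection, using only the
# figures quoted above; all values are approximate.
MICROFILM_COST = 400_000  # complete microfilm runs of the 117 titles, in dollars

archive_capital_fee = {
    "Very Large institution": MICROFILM_COST * 0.10,   # one-tenth of the microfilm cost
    "Very Small institution": MICROFILM_COST * 0.025,  # 2.5% of the microfilm cost
}

for tier, fee in archive_capital_fee.items():
    print(f"{tier}: ${fee:,.0f} one-time Archive Capital Fee "
          f"(vs. ${MICROFILM_COST:,} for the microfilm runs)")
```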

The Annual Access Fee supports the provision of ongoing access to the database, and also covers the addition of new volumes as each year is added to the database. (I discuss JSTOR's "moving wall" in an answer to a later question in this interview). Like the Archive Capital Fee, the Annual Access Fees are structured to be significantly less than the ongoing costs associated with providing access to this material if it were housed locally in print or electronic form. So the economic model is based on the notion that participating institutions can save resources over the long-term by contributing to the costs of a centrally managed archives.

It might be useful for me to comment briefly on some of the principles behind this fee structure. It is my opinion that not-for-profit organizations must develop revenue sources that match the nature of the mission-based uses of their funds. As we know, the job of archiving is an unrelenting obligation; thus it is prudent to establish funding sources that can be counted on to provide similarly stable and recurring revenue. That is why we are building an archives reserve. The investment proceeds generated from that reserve will be used to help fund the ongoing cost of archiving. We also rely, though not exclusively, on the annual access fees from participating institutions.

In addition to these funding sources, I should add that JSTOR also pursues grants from foundations to help underwrite the costs of digitization of collections, such as for our new General Science Collection. By subsidizing the costs of digitizing new collections, grant funds enable us to lower the amount we need to generate from library participation fees.

What provision have you made for the long-term preservation of these files if JSTOR were to go out of business?

Well, naturally we try to balance our desire to be cautious with what I think is a natural aversion to spending too much time planning our own demise!

First, our library license agreement includes a provision that states that we would provide copies of the database in the prevailing physical medium if something like this were to happen. While this provides a kind of protection, it is not altogether practical for the same reasons I expressed above; it doesn't take advantage of the benefits that accrue from storing the database centrally and using network technologies to distribute it. But the right is there. In addition, the JSTOR publisher license agreement also includes a provision stating that in the event JSTOR ceases to operate, we will provide a copy of the journals to each corresponding publisher.

That said, quite independent of JSTOR the organization, we are building a database that, judging by the reaction in the community and how much the resource is being used, is quite important and valuable. If it turned out that that value was not large enough to support an organization, it is likely that instead of distributing the database to the individual libraries and publishers, we would begin discussions with our publishers and libraries to determine who might be in the best position to care for the archives going forward. It is probably worth noting that the database itself is currently housed on servers managed and maintained at three universities (Princeton University, the University of Michigan, and the University of Manchester in the United Kingdom). These are all institutions with substantial resources that are unlikely to disappear in the foreseeable future. So even if JSTOR were to go out of business, there would not be immediate risk to the database itself and there would be time to work out new arrangements.

Are there other institutions or organizations with similar plans? For example, how would you distinguish this approach from OCLC's preservation plan?

We are in communication with OCLC, and have a special agreement with them to provide cold storage backup of the JSTOR database. So they play a part in our preservation and backup plans. I mentioned JSTOR's mirror sites above. We actually have duplicate copies of our five-million-page database operating at each of the three universities. These sites are updated on a nightly basis to ensure synchrony, and each site is actively engaged in serving researchers and students on an ongoing basis. So these are not backup copies that are at risk of not being refreshed or migrated. At each of these sites, reliable backup procedures are in place. In addition, on a quarterly basis, archive tapes are shipped to OCLC and to our New York office as an additional layer of protection.

What reactions have you received from JSTOR members to this plan?

I have presented my thoughts on electronic archiving and on JSTOR's specific plans regarding archiving at numerous meetings. We have a copy of a paper I presented at a JSTOR participants' meeting in February 2000 available on our Web site. In general, I have found the response to be quite positive.

How can you demonstrate the viability of this approach to JSTOR members? What would it take for members to trust JSTOR?

If you are asking me to show hard evidence that JSTOR is going to be around forever, well that is hard to do! In the grand scheme of things, even universities have not been around very long. But we are off to a very positive start, and the financial resources being generated by JSTOR are covering the costs associated with building and maintaining the database. So I am very confident that the passage of time will demonstrate that we have a viable concept and organization and that libraries, publishers, and researchers will come to trust that JSTOR will be around.

We are also building strong relationships with a number of foundations. Seven different foundations have provided support for JSTOR over the last three years. These foundations recognize the value of the archival service we are providing. In addition to the previously mentioned grants to subsidize the digitization of new content, we have received several grants to support access to institutions in countries that could not afford to make contributions at a level consistent with similar institutions in more developed countries. So we are diversifying the support base across foundations, nearly 800 libraries in 35 countries, and an archives reserve. That diversified support base should provide reassurance. In the end though, steady and consistent delivery of the archives is what is going to be required to build the requisite level of trust. But we recognize that trust is not easily earned, and that it is going to take time.

Is there any evidence to suggest that JSTOR members have begun to let their paper collections go and rely solely on the JSTOR archives? If so, what is the main motivation of these institutions in doing so?

In December 1999, we conducted a survey of participating institutions asking these and related questions. (If you are interested in reviewing a summary of the results of the survey, it is available at our Web site.) We were very pleased that librarians from 214 institutions responded. As you would expect, research libraries that consider archiving central to their mission are not discarding back volumes. What they tend to do is to move the older journals to lower-cost remote storage. Of the 214 institutions, 39% (84 institutions) have moved or plan to move JSTOR journals to remote storage. Here is a sample comment that expresses their motivation for doing so: "Because we have such severe space problems, and they are well-known on campus, and because faculty have had a very good experience with JSTOR, it looks like we will have solid support for the move."

There are some institutions that have discarded titles. Most often, these are institutions that have incomplete runs or that do not consider it their responsibility to archive the complete runs of journals. 31% of our respondents said that they have discarded, or plan to discard, JSTOR titles. What follows is a sample comment from the survey regarding their motivation: "There are many reasons we chose to discard JSTOR titles. The library has very limited space ... This is primarily an undergraduate institution and these are titles which, although important to the collection and a liberal arts education, are not heavily used ... Students and faculty are more and more going to the 'net' for journals."

Libraries may become more comfortable in the future with discarding or storing paper copies remotely as efforts to create centralized print archives for JSTOR journals are developed. This spring the Center for Research Libraries announced a program to acquire, on deposit, the print copies of journals currently available through JSTOR. We are very pleased by CRL's announcement and are eager to work with them to make this program a success. The University of California System is also considering consolidating and saving journal volumes that are included in JSTOR. And we have recently heard of a similar effort in Indiana to send materials to one regional library for long-term storage. I can imagine such regional efforts providing a reliable and duplicated array of these journal runs. Properly maintained, and rarely used, these should last for quite a long time.

Technical components are one thing; what organizational and legal obstacles have you encountered and overcome to position JSTOR as an electronic journal archives?

First, I should say that I think the important issues regarding archiving are organizational. These are the points I have been trying to make in my previous answers. I do not believe that there is a technical solution. We are not at a point, I don't think, where we can develop software that can predict the future course of technical development. So I don't regard the archiving issue as being about technology really. There is no black box fix.

And as you can see there are many factors involved in maintaining this dynamically changing electronic collection, but since you asked about legal challenges, I will address one. I mentioned in a previous answer the tension that exists between archives and content owners concerning access. If a content owner wants to sell access to the same material that exists in an archives' collection, then the owner is not going to want the archives to provide access, or at least is going to want to limit it in some way. The archives wants to provide access not only because it helps to justify its investment in storing the material, but also because providing access to an electronic resource is an excellent way to guarantee that material remains current with evolving technologies. One needs to find room for compromise in these two positions.

JSTOR included provisions for compromise in its publisher license agreement. To protect the publishers' interest in their content, JSTOR established the concept of the "moving wall." The moving wall determines the age of the most recent issue that will be available in JSTOR. For example, if a journal has a moving wall of 3 years, and it is presently the year 2000, journals will be available up to 1997. The length of the moving wall, 3 years, is constant, but the wall moves with the passage of time. At the end of the year, a new volume will be added, so the volumes available will extend up to 1998. The moving wall protects publisher revenues from the sale of current issues, while still allowing JSTOR to provide access to the older material. It also provides libraries with an archives of material on which they can rely.
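
As a rough illustration of the moving wall described above, here is a minimal sketch in Python; the three-year wall and the year 2000 come from the example in the preceding paragraph, and the function name is ours.

```python
def most_recent_available_year(current_year: int, moving_wall_years: int) -> int:
    """Most recent volume year available in the archive under a moving wall."""
    return current_year - moving_wall_years

# With a 3-year moving wall in the year 2000, volumes are available up to 1997;
# a year later the wall moves and 1998 becomes available.
print(most_recent_available_year(2000, 3))  # 1997
print(most_recent_available_year(2001, 3))  # 1998
```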

Do you distinguish between a short-term archives and a long-term (e.g., 100 year) archives? If so, what are the distinguishing characteristics of these two approaches?

At this point in our development, we really don't distinguish between these two concepts. We believe that maintaining access depends upon a steady progression of data and software migrations. We are currently engaged in several projects to migrate both data and software. Our reason for these migrations is related both to technological developments and to our desire to work in standard and non-proprietary ways. For example, our metadata has been stored using the original EFFECT specification that was developed for the TULIP project. We are preparing now to migrate that data into an XML data structure. As another example, we are rewriting much of the backend software in Java to make it more easily migratable and interoperable. These are just two of many such efforts we have underway. This kind of revision has to be part of the organizational and financial plan.
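
As a purely hypothetical sketch of what one small step in such a migration might look like, the snippet below rewrites a flat journal-article record as XML; the field names and record structure are invented for illustration and are not JSTOR's actual EFFECT or XML schemas.

```python
import xml.etree.ElementTree as ET

# Hypothetical flat metadata record; the field names are illustrative only.
record = {"journal": "Example Journal", "volume": "12", "year": "1997",
          "title": "An Example Article", "start_page": "101"}

# Wrap each field in a correspondingly named XML element under <article>.
article = ET.Element("article")
for field, value in record.items():
    ET.SubElement(article, field).text = value

print(ET.tostring(article, encoding="unicode"))
# <article><journal>Example Journal</journal>...<start_page>101</start_page></article>
```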

As mentioned previously, we are not aware of any technical solution or time capsule that we can put the JSTOR database into that we know will be readable and usable 100 years from now. To provide another layer of long-term protection, we support efforts such as CRL's noted earlier to help ensure that there are multiple copies of the paper volumes properly preserved.

Do you have any evidence to suggest that investing up front in the creation of rich digital masters will pay off in terms of long-term management? What are the other lifecycle management issues that you would promote to ensure longevity?

Whenever possible, one should seek out file formats that will stand the test of time. When JSTOR was first conceived, and Anne, you will remember this well because you were involved, there was much discussion about the appropriate resolution for digitizing pages in the database ("way" back in 1994!). The debate revolved around whether text-based pages should be scanned at 300 or 600 dots-per-inch resolution. Those concerned about costs advocated 300 dpi and felt pretty strongly about it. Those concerned about archiving, and you were one of them, supported the notion of digitizing the pages at 600 dpi. Fortunately, 600 dpi won out. That was a good result not simply because 600 dpi is better than 300 dpi or because 600 dpi printers are now ubiquitous. It is a good result because it is a stable level of resolution. At resolutions higher than 600 dpi, even though the images may be "better" in some absolute sense, the discernible benefits diminish markedly. Even if we have screens and printers that process 1200 or 2400 dpi resolutions, the text-based information will not be enhanced significantly, and surely not enough to justify the increased storage costs associated with the larger files. So, in that sense, I regard the 600 dpi bitonal TIFF image as a "rich digital master" for text-based content, and it was a very worthwhile investment. Had we scanned the original material at 300 dpi, I expect we would now be feeling pressure to re-digitize those materials.
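
To give a rough sense of the storage trade-off being described, here is a minimal sketch of the file-size arithmetic, assuming an 8.5 x 11 inch page scanned as uncompressed 1-bit-per-pixel data; real bitonal TIFFs are compressed, so actual files are considerably smaller, but the scaling with resolution is the same.

```python
def uncompressed_bitonal_mb(dpi: int, width_in: float = 8.5, height_in: float = 11.0) -> float:
    """Uncompressed size, in megabytes, of a 1-bit-per-pixel page scan."""
    pixels = (dpi * width_in) * (dpi * height_in)
    return pixels / 8 / 1_000_000  # 8 pixels per byte; decimal megabytes

for dpi in (300, 600, 1200):
    print(f"{dpi} dpi: {uncompressed_bitonal_mb(dpi):.1f} MB per page, uncompressed")
# File size grows with the square of resolution (roughly 1 MB at 300 dpi,
# 4 MB at 600 dpi, 17 MB at 1200 dpi), while the visible benefit for
# text-based pages levels off above 600 dpi.
```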

One last question: How can RLG DigiNews readers keep up to date on JSTOR's archiving plans?

I very much appreciate this opportunity to respond to these questions here, and I would welcome the chance to engage in this kind of discussion periodically. For those interested in JSTOR's approach to a variety of topics, we do our best to notify the community of our activities. We have a newsletter, which we publish both electronically and in a paper format. We also try to publish articles on our Web site from time to time about topics we think may be of interest to people. The archiving paper I mentioned previously is one example. Another example would be the paper we published a while back written by Ira Fuchs that gives a broad technical overview of the problems associated with remote access and authentication.

I'd like to thank RLG DigiNews, and both of you, Anne and Oya, for the opportunity to talk at some length about these issues. They are very important and growing more so as we all rely more heavily on electronic resources.

Technical Feature

Image Quality Metrics

by Don Williams, Image Scientist, Image Engineering and Simulation Lab,
Image Science Division, Eastman Kodak Company
williams@image.Kodak.com

" I can't describe it, but I know it when I see it." This is a common, albeit unsatisfying, answer to the question, "What is image quality?" It does not need to be so ambiguous though. For over the last half-century, a great deal of scientific work has gone into developing physical image measurements and understanding how they play into the psychophysics of human vision, and in turn, image quality perception. Today, image quality models are refined to the point where image quality metrics and the techniques for applying them have generally stabilized to lend themselves for practical implementations. While individual, cultural, and task-specific preferences prevent an absolute answer to the opening question, tuning model parameters through experiment can usually accommodate differences.

There are only a handful of physical image quality metrics that drive these models. (1) Their vernacular names are Resolution; Noise; Tone Reproduction; Color Reproduction; and a wonderful catch-all category called Artifacts. Each of these components has one or more quantitative measurement methods to be used in evaluation. The tools for properly making these measurements are only now finding their way into lay use through standards organizations. This article includes brief descriptions of these image quality metrics.

Tone Reproduction
Tone reproduction is the rendering of a source document's densities into luminances in its digital version. Of all the image quality metrics, tone reproduction is the most important and it forms the foundation for the evaluation of other image quality metrics. Indeed, the effectiveness of all other image quality metrics assumes that the tone reproduction is satisfactory. It determines how dark or light an image is as well as its contrast. It is evaluated objectively by a tone reproduction curve that relates the physical amount of light (i.e., luminance) of points in a scene or source document to the luminances of those same points in the reproduction. Tone reproduction is difficult to measure. Besides viewing conditions, preferred tone reproduction can be driven by cultural and professional preferences as well as the need for image enhancements.

The attributes of a scanner that determine tone reproduction quality are opto-electronic conversion function (OECF), dynamic range, and flare. All of these factors are inherent in the hardware. OECF describes the relationship between the optical density of a document and the digital count associated with that density. Dynamic range is the capacity of a scanner to distinguish the extent of variations in density. Flare is attributed to stray light in an optical system and manifests itself by reducing the capture device's dynamic range.
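
As a minimal sketch of what an OECF measurement might look like in practice, the snippet below pairs the known densities of a gray-scale target's patches with the mean digital counts read from a scan of that target; the numbers are invented for illustration, and a real measurement would follow the relevant ISO test-chart procedure.

```python
# Known densities of the patches on a calibrated gray-scale target, paired
# with the mean 8-bit digital counts measured from a scan of each patch.
# (All values below are invented for illustration.)
patch_densities = [0.05, 0.40, 0.80, 1.20, 1.60, 2.00]
mean_counts     = [242,  180,  118,   70,   38,   20]

# The OECF is this tabulated relationship between document density and
# digital count; plotting count against density gives the curve. The highest
# density at which the counts still change hints at the scanner's dynamic range.
for density, count in zip(patch_densities, mean_counts):
    print(f"density {density:.2f} -> mean digital count {count}")
```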

Color Reproduction
Of all image quality metrics this is the most difficult to explain because of its complexity with respect to the human visual system. There are many well-intentioned color reproduction metrics that are useless because of their inability to correlate with subjective image quality.

A good place to start, though, is by measuring the extent to which gray or neutral areas in an object are maintained (i.e., balanced) as gray or neutrals. This can be as simple as examining a gray scale target after reproduction and evaluating the equivalence of the red, green, and blue (RGB) luminances. This is a rather shallow way of examining color reproduction, though, since a black and white system would rank high in color reproduction using this criterion.

A better objective metric for color reproduction, which often assumes balanced neutrals, is Average Delta E*. It is a single number that is calculated by comparing the input and output colors in a visually derived color space called CIE L*a*b*. One would reproduce a color chart with color patches of known L*a*b* values, compare them to the reproduced L*a*b* values, and calculate their average difference. The goodness of this metric assumes that the colors used in the target are a representative selection for the task at hand.
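
As a minimal sketch of that calculation, the snippet below uses the simple CIE76 color difference, i.e., the Euclidean distance in L*a*b* space; the patch values are invented for illustration.

```python
import math

# Known L*a*b* values of the target patches and the corresponding values
# measured from the reproduction (both sets are invented for illustration).
reference = [(53.2, 80.1, 67.2), (87.7, -86.2, 83.2), (32.3, 79.2, -107.9)]
measured  = [(51.8, 78.5, 65.0), (86.9, -84.0, 81.5), (33.1, 77.0, -105.2)]

def delta_e_76(lab1, lab2):
    """CIE76 color difference: Euclidean distance in L*a*b* space."""
    return math.dist(lab1, lab2)

average_delta_e = sum(delta_e_76(r, m) for r, m in zip(reference, measured)) / len(reference)
print(f"Average Delta E*: {average_delta_e:.2f}")
```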

Resolution
Resolution is the ability to capture or reproduce spatial detail. Most readers may be familiar with dots-per-inch (dpi) measurements or bar target readings as physical measures of resolution. While these serve as generic measures of resolution, they are misleading and do not correlate well with judged image quality. Resolution is truly quantified by measuring the extent to which light spreads in the imaging process. The physical measurement that describes this light spread is called the line spread function (LSF). The smaller the spread of light, the greater the resolution. Because of interpretability difficulties, the line spread function is often transformed into the Modulation Transfer Function (MTF) for analysis purposes. (2)

The MTF is not a single number like dpi or line pairs but rather a continuous curve that describes how well the contrast of increasingly finer details (i.e., spatial frequencies) is maintained after being imaged by the component (scanner, camera, printer) of interest. It is probably the most widely used physical measure of resolution in the imaging evaluation community. Its value lies in its capacity to be cascaded with the MTFs of other components in the imaging chain and the human visual system to arrive at combined MTFs. (3)
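
As a minimal sketch of the relationship between the two measures, the snippet below computes an MTF as the normalized magnitude of the Fourier transform of a line spread function; the Gaussian LSF is synthetic, standing in for one measured from a scanned edge or slit target.

```python
import numpy as np

# Synthetic line spread function: a narrow Gaussian sampled over 64 pixels.
x = np.arange(-32, 32)
lsf = np.exp(-0.5 * (x / 1.5) ** 2)

# MTF = magnitude of the Fourier transform of the LSF, normalized so MTF(0) = 1.
mtf = np.abs(np.fft.rfft(lsf))
mtf /= mtf[0]

frequencies = np.fft.rfftfreq(lsf.size)  # in cycles per pixel
for f, m in list(zip(frequencies, mtf))[:5]:
    print(f"{f:.3f} cycles/pixel: MTF = {m:.3f}")
# The wider the spread of light (a broader LSF), the faster this curve falls
# off with frequency; that is, the lower the resolution.
```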

Noise
For traditional photographic imaging, film grain is the major contributor to imaging noise. These random, unwanted point-to-point (i.e., pixel-to-pixel) fluctuations also occur in electronic imaging systems but are due to the sensor and associated electronics. The physical measurement of noise is usually as simple as measuring the statistical standard deviation of pixel values of an image of a gray scale. Usually the noise varies as a function of the average count level in each patch. The greater the standard deviation, the greater the noise, and usually the poorer the image quality. However, one needs to be careful to measure only the noise element that is associated with the scanner, eliminating the noise due to the target, nonuniform lighting, or other artifacts. Currently, draft standards are in place that recommend ways to objectively measure noise levels of digital cameras similar to that just described. The principles in this proposed standard could also apply to scanners. (4)
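
As a minimal sketch of that measurement, the snippet below reports the mean level and standard deviation for a few nominally uniform gray patches; the patch data are simulated, and a real measurement would use crops from a scanned gray-scale target and correct for target and illumination non-uniformity as noted above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated 50 x 50 pixel crops from three gray patches of a scanned step
# target, each with a different mean level and amount of added noise.
patches = {
    "light patch": rng.normal(loc=220, scale=1.5, size=(50, 50)),
    "mid patch":   rng.normal(loc=128, scale=2.5, size=(50, 50)),
    "dark patch":  rng.normal(loc=30,  scale=4.0, size=(50, 50)),
}

for name, patch in patches.items():
    print(f"{name}: mean level {patch.mean():.1f}, "
          f"noise (standard deviation) {patch.std():.2f}")
# The greater the standard deviation within a nominally uniform patch,
# the greater the noise, and usually the poorer the image quality.
```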

Artifacts
Distortion, streaks, non-uniformity, aliasing, dust, scratches, and over-sharpening are some of the common names associated with an image quality category that really has no objective image quality metrics. Most of these artifacts are a result of digital imaging technologies. While many can be physically quantified, the way they correlate to image quality is relatively unknown. At a low level they are considered a distraction or nuisance; depending on their magnitude, however, they can render an image defective. Judging and evaluating artifacts is best accomplished by becoming sensitized to them through experience.

Footnotes

(1) For a detailed discussion of image quality metrics components and associated quality control methods, see Don Williams, "An Overview of Image Quality Metrics," in Moving Theory into Practice: Digital Imaging for Libraries and Archives, Anne R. Kenney and Oya Y. Rieger (editors and principal authors). Mountain View, CA: Research Libraries Group, 2000. Also see Don Williams, "Selecting a Scanner" in Guides to Quality in Visual Resource Imaging, July 2000, http://www.rlg.org/visguides.

(2) A detailed description of MTF with examples is included in: Don Williams. "What is an MTF ... and Why Should You Care?" RLG DigiNews 2, no. 1 (February 15, 1998), http://www.rlg.org/preserv/diginews/diginews21.html

(3) Several software tools are available for measuring MTF. Two of them can be downloaded from the Technical Committee on Electronic Still Picture Imaging Web site. Another MTF software utility can be found at the Mitre site.

(4) The scope of the PIMA/IT10 committee includes specifying storage media, device interfaces, and image formats for electronic still picture imaging. The scope also includes standardizing measurement methods, performance ratings for devices and media, and definitions of technical terms.

Highlighted Web Site

World Wide Web Consortium Graphics Activity
The World Wide Web Consortium (W3C) has just released (on August 2, 2000) a candidate recommendation for a new Web graphics format called SVG (Scalable Vector Graphics). The highlighted Web site describes the advantages of SVG over the various raster or bitmapped graphics formats and how SVG's use of XML will help create Web documents that are "smaller, faster and more interactive" than those currently produced. The W3C is an international body founded in 1994 to "lead the World Wide Web to its full potential by developing common protocols that promote its evolution and ensure its interoperability" (see About the World Wide Web Consortium). Its membership of over 400 includes hardware and software companies, professional societies, standards organizations, government agencies, and advocacy groups. W3C maintains and updates such key Web technologies as HTTP, HTML, and URLs and issues new "recommendations" in a variety of areas. Some of W3C's technical recommendations include XML (eXtensible Markup Language) and RDF (Resource Description Framework). The fate of its first recommendation in the graphics area, PNG (Portable Network Graphics), remains fuzzy (see the FAQ, below).

FAQ

What's the current status of the PNG image file format? There was a lot of attention a few years ago when it was touted as a replacement for GIF, but I haven't heard much lately.

Background
First, a bit of history. In 1985, Sperry Univac (which later became part of Unisys) received a patent for a data compression algorithm known as LZW. Compuserve used LZW in its 1987 specification for GIF (Graphics Interchange Format - a bitmap or raster image file format), apparently without realizing that LZW had been patented.

GIF became very popular for the creation of graphics on the World Wide Web, and by the time Unisys decided to assert its patent rights over the use of LZW in GIF in late 1994, millions of GIF files were already on the Internet. Initially, Unisys only pursued licenses from companies producing commercial software packages that could create GIFs. Later it expanded its enforcement efforts to freeware and shareware products as well as Web site maintainers who could not prove that their GIFs were produced by licensed software.

The Internet has shown a strong aversion to all things proprietary, and the response to Unisys' announcement of its plans to collect license fees for GIF-creating software was swift. Within two months, an essentially complete specification for a non-proprietary replacement for GIF was released, dubbed PNG (Portable Network Graphics).

GIF vs PNG
The designers of PNG sought not only to replace GIF, but to substantially improve upon it. (Note that many of GIF's limitations stem from the low network bandwidth and 8-bit video hardware that were common in the late 1980s when its specification was developed.) Some of PNG's advantages:

- More efficient lossless compression
- Support for greater bit depths, including truecolor images and 16-bit color channels
- Variable (alpha-channel) transparency
- Gamma correction for more consistent display across platforms
- Progressive display through two-dimensional interlacing
- No reliance on the patented LZW compression algorithm

Current Status
There is widespread agreement that PNG is superior to GIF. PNG was adopted by the World Wide Web Consortium (see this issue's Highlighted Web Site) in 1996 as a replacement for GIF, and it is making its way through the process of adoption as an ISO/IEC standard. It was selected as one of two required image formats for VRML (Virtual Reality Modeling Language). Starting with Office 97, Microsoft made PNG the native image format for Word, Excel, and PowerPoint. Considering the number of new image formats that appear each year, the rapidity of acceptance and degree of support for PNG is impressive.

Nevertheless, all is not well. In order for an image format to challenge one of the two established Web formats (GIF or JPEG), it must have both good applications support for creating and editing files and be recognized as a native format by Web browsers. The level of support for PNG in application software is inconsistent, and the quality varies. For example, Adobe Photoshop 5.5 supports PNG only partially: it can't use 16-bit color channels or text annotation, and its compression of PNGs is poor.

Support by the principal Web browsers is even more spotty. As of this writing, only version 5.0 of Microsoft's Internet Explorer for the Macintosh has what PNG experts consider to be "near perfect support." Most of the current Windows (and Unix) versions of Internet Explorer (as well as all versions of Netscape Navigator/Communicator) will decode PNG files without the need for plug-ins, but support for key features such as variable transparency, progressive display, and gamma correction is poor.

The Future
Most technological innovations face a "chicken and egg" period during which their ultimate success or failure is uncertain. Users may be unwilling (or unable) to utilize the innovation until readily available and reasonably priced tools appear. Toolmakers may hesitate to invest substantial resources in an unproven technology. Generally, a technology either breaks through to wide acceptance, establishes itself in a small niche market, or fails completely within a fairly short period of time.

A number of factors unique to the WWW have held PNG in technological limbo for an unusually long period. Despite its youth, the Web has developed some inertia when it comes to acceptance of major new coding schemes or file types. This is especially true where there is a desire to accommodate users of older Web browsers. Since PNG support only appeared in version 4 of Internet Explorer and Navigator, any site wishing to remain friendly to earlier browsers would avoid using PNGs.

In addition, one of the major uses for GIFs on today's Web is for animated banner advertisements. Unfortunately, animation is one of the few GIF features not available in PNG, though a related standard, MNG - Multiple-image Network Graphics - does support animation.

Also, despite the widespread outrage generated by the Unisys licensing push, the average Internet user hasn't been seriously affected by it. Even the motivation for software and Web site producers to avoid GIFs will lessen shortly since the Unisys patent on LZW expires in 2003.

Finally, the sheer volume of GIFs in use today precludes a wholesale movement to a new lossless graphics format. More likely is that those who can benefit from PNG's unique advantages will use it despite the current gaps in mainstream support. As older, non-PNG supporting browsers disappear from use, and the quality of support from current browsers improves, PNGs should start to grow in popularity. For those currently using GIF but not concerned about compatibility with older browsers or the use of transparency, PNG merits consideration as a replacement now. More efficient compression provides an immediate advantage, while features such as gamma and color correction will pay off as soon as mainstream browser support improves.

More Info
Considerable information about PNG can be found at the official PNG site. A fairly regularly updated assessment of the current status of PNG and links to the official PNG specification and related documents are available at the site.

--RE

Calendar of Events

Northeast Document Conservation Center Fall Workshops
September 18-20, 2000, Seattle, WA - School for Scanning
October 3-5, 2000, Albany, NY - To Film or To Scan Seminar
The deadline for late registration for the School for Scanning is August 25 and the application deadline for the To Film or To Scan Seminar is August 22.

Digital Futures 2000: The Royal Photographic Society Imaging Science Group Annual Conference
September 11-13, 2000, Harrow, England
Digital Futures 2000 offers an opportunity for archivists, curators, and creators of images to communicate their needs to image scientists and for image scientists to relate their understanding of the medium to the imaging and archival communities. Topics include imaging, archiving, and conservation using digital technologies.

EVA-GIFU 2000: Electronic Imaging and the Visual Arts: Shaping the Cultural Information Society
October 4-6, 2000, Gifu, Japan
Topics to be discussed include digital archiving of cultural materials, and new multimedia technology, applications, and products.

Preservation and Conservation Issues Related to Digital Printing
October 26-27, 2000, London, England
This conference will explore the impact of digital print on preservation and conservation issues and will inform those responsible for the preservation of digitally printed material about the processes and development of this technology.

To Preserve & Protect: The Strategic Stewardship of Cultural Resources
October 30-31, 2000, Washington, DC
To be held at the Library of Congress, this symposium will focus on preservation and collections security programs in libraries, museums, and archives. There will be discussions on topics such as developing preservation and security strategies, priorities, and expectations; the preservation and security challenges of electronic information and digitization; and innovations in security and preservation.

Computing Arts - Digital Resources for Research in the Humanities Conference - DRRH2001
September 26-28, 2001, Sydney, Australia
This conference will provide a forum for the creators, users, distributors, and custodians of electronic resources in the humanities to present and discuss their work, experiences, and ideas.

Announcements

DLF and RLG Issue Guidelines for Digitizing Visual Resources
The Digital Library Federation and the Research Libraries Group have issued five "Guides to Quality in Visual Resource Imaging" (see RLG News for a description of the guides).

The National Library of Australia's Digitisation Policy is Now Online
The policy is a guide to both the digitizing of items held by the NLA and the management of the resulting digital objects.

Cultivate Interactive: a New European Web Magazine
Sponsored by the European Commission's DIGICULT programme, Cultivate Interactive is aimed at the European cultural heritage community. Feature articles in the premiere issue include the DELOS network and a discussion of intellectual property issues.

Performing Arts Data Service: Guide to Good Practice - Creating Digital Performance Resources
The final version of the "Guide to Good Practice - Creating Digital Performance Resources" is now available online. Discussions include creating digital archives, developing e-journals, and Web site building.

Dublin Core Releases Recommended Qualifiers
The Dublin Core Metadata Initiative (DCMI), an organization leading the development of international standards to improve electronic resource management and information, has announced the formal recommendation of the Dublin Core (DC) Qualifiers. The addition of the DC Qualifiers enhances the semantic precision of the existing DC Metadata Element Set.

"Introduction to Metadata" Version 2.0 Now Available on the Web
The Getty Standards Program has released Version 2.0 of "Introduction to Metadata: Pathways to Digital Information," which includes updated essays by original authors Anne Gilliland-Swetland (Defining Metadata) and Tony Gill (Metadata and the World Wide Web), plus a new essay by Mary Woodley on metadata "crosswalks."

RLG News

DLF and RLG Issue Guides for Digitized Visual Resources

The Digital Library Federation (DLF) and Research Libraries Group (RLG) have issued Guides to Quality in Visual Resource Imaging, available at http://www.rlg.org/visguides/ (http://www.rlg.ac.uk/visguides/ from UK Janet sites). This new Web-based reference is designed to serve the growing community of museums, archives, and research libraries that are turning to digital conversion to provide greater access to their visual resources as well as to help preserve the original materials. "Visual resources" include original photographs, prints, drawings, and maps. Both project managers and technicians will find the guides particularly valuable in filling a gap in the literature for serious digital imaging projects. They provide concrete guidelines as well as help in addressing rapidly changing aspects of technology and practice.

The five guides - which range from project planning to scanner selection, considerations for imaging systems, digital master quality, and masters' storage - share the experience and knowledge of leaders in this field. In addition to providing advice based on the uses to which the images will be put and the technology now available, they also flag areas where further research and testing are needed.

The guides are the outcome of a project begun by DLF and RLG in 1998, when they created an editorial board of experts to review the state of the art in digital imaging of visual resources. Although sources for instruction in digitizing text or text and images existed (and more have become available since then), none specifically addressed the challenges of two- and three-dimensional as well as color-intensive materials. These experts outlined a set of needed guides covering the science of imaging, objective measures of image quality, and how these measures can be controlled in various aspects of the imaging process. DLF then commissioned board-recommended authors to write the guides, which the two organizations have now jointly published.

The five guides cover, in turn, planning an imaging project, selecting a scanner, the range of factors in an imaging system that affect image quality, measuring the quality of digital masters, and storing digital masters.

Each guide is a module that can stand on its own; as a set, the guides provide comprehensive advice on how to find what an imaging team needs to accomplish stated goals with the available technology. The guides also help to clarify the consequences of trade-offs that all managers must make to stay within organizations' means. The guides will be updated periodically.

Hotlinks Included in This Issue

Editors' Interview
Ira Fuchs authentication paper: http://www.jstor.org/about/remote.html
JSTOR: http://www.jstor.org/
JSTOR Newsletter: http://www.jstor.org/news/index.html
Kevin Guthrie's archiving paper: http://www.jstor.org/about/archiving.html
survey of participating institutions: http://www.jstor.org/about/bvs.html

Technical Feature
Mitre: http://www.mitre.org
PIMA/IT10: http://www.pima.net/standards/IT10a.htm
Technical Committee on Electronic Still Picture Imaging: http://www.pima.net/standards/IT10a.htm

Highlighted Web Site
About the World Wide Web Consortium: http://www.w3.org/Consortium/
World Wide Web Consortium: http://www.w3.org/
World Wide Web Consortium Graphics Activity: http://www.w3.org/Graphics/Activity

FAQs
current status of PNG: http://www.libpng.org/pub/png/pngstatus.html
links to the official PNG specification and related documents: http://www.libpng.org/pub/png/pngdocs.html
official PNG site: http://www.libpng.org/pub/png/

Calendar of Events
Computing Arts - Digital Resources for Research in the Humanities Conference - DRRH2001: http://setis.library.usyd.edu.au/drrh2001/
Digital Futures 2000: The Royal Photographic Society Imaging Science Group Annual Conference: http://leonardo.itrg.wmin.ac.uk/DF2000/
To Film or To Scan Seminar: http://www.nedcc.org/albany.htm
EVA-GIFU 2000: Electronic Imaging and the Visual Arts: Shaping the Cultural Information Society: http://www.vasari.co.uk/eva/gifu/index.htm
Northeast Document Conservation Center Fall Workshops: http://www.nedcc.org/welcome.htm
Preservation and Conservation Issues Related to Digital Printing: http://www.iop.org/IOP/Confs/PPP/
To Preserve & Protect: The Strategic Stewardship of Cultural Resources: http://www.loc.gov/bicentennial/symposia_preserve.html
School for Scanning: http://www.nedcc.org/sfsinfo.htm

Announcements
Cultivate Interactive: a New European Web Magazine: http://www.cultivate-int.org/
DLF and RLG Issue Guidelines for Digitizing Visual Resources: http://www.rlg.org/visguides/
Dublin Core Releases Recommended Qualifiers: http://purl.org/dc/
"Introduction to Metadata" Version 2.0: http://www.getty.edu/gri/standard/intrometadata/
The National Library of Australia's Digitisation Policy: http://www.nla.gov.au/policy/digitisation.html
Performing Arts Data Service: Guide to Good Practice - Creating Digital Performance Resources: http://www.pads.ahds.ac.uk/padsGGPPerformance

RLG News
Guides to Quality in Visual Resource Imaging: http://www.rlg.org/visguides/
UK Janet sites access: http://www.rlg.ac.uk/visguides/

Publishing Information

RLG DigiNews (ISSN 1093-5371) is a newsletter conceived by the members of the Research Libraries Group's PRESERV community. Funded in part by the Council on Library and Information Resources (CLIR), it is available internationally via the RLG PRESERV Web site (http://www.rlg.org/preserv/). It will be published six times in 2000. Materials contained in RLG DigiNews are subject to copyright and other proprietary rights. Permission is hereby given for the material in RLG DigiNews to be used for research purposes or private study. RLG asks that you observe the following conditions: Please cite the individual author and RLG DigiNews (please cite URL of the article) when using the material; please contact Jennifer Hartzell at jlh@notes.rlg.org, RLG Corporate Communications, when citing RLG DigiNews.

Any use other than for research or private study of these materials requires prior written authorization from RLG, Inc. and/or the author of the article.

RLG DigiNews is produced for the Research Libraries Group, Inc. (RLG) by the staff of the Department of Preservation and Conservation, Cornell University Library. Co-Editors, Anne R. Kenney and Oya Y. Rieger; Production Editor, Barbara Berger Eden; Associate Editor, Robin Dale (RLG); Technical Researcher, Richard Entlich; Technical Assistant, Allen Quirk.

All links in this issue were confirmed accurate as of August 11, 2000.

Please send your comments and questions to preservation@cornell.edu.

