The Part -> Whole Relationship in German and American Cataloging Data

Final version

REUSE+
Joint Project of OCLC and Niedersächsische Staats- und Universitätsbibliothek Göttingen
The Part -> Whole Relationship in German and American Cataloging Data
Results and suggestions
Bernhard Eversberg
27 June 1998

[This text as WinWord6 file (75 K)]

[Example database]

[Text of talk in CC:DA meeting]

[PowerPoint slides]

Members of task group:

Feruzan Akdogan [Staats- und Universitätsbibliothek Göttingen]
Bernhard Eversberg [Universitätsbibliothek Braunschweig]
Monika Münnich [Universitätsbibliothek Heidelberg / Chair, Working Group for Descriptive Cataloging]
Glenn Patton [OCLC]
Barbara Tillett [Library of Congress].

Contents

1. Introduction
2. Remarks on "work" and "item"
3. Situation A: Multipart item (Manifestation of a work consists of several physical parts)
4. Situation B: Several works manifested in one physical item
5. Synthesis: A and B are essentially the same
6. Notes on authority records vs. bibliographic records
7. "Work records"? A new suggestion
8. Improvements for Germany under the status quo
9. Examples

Appendix: Links between bibliographic records

Acknowledgements
For help with this paper, we would like to thank Stefan Gradmann, who headed the first REUSE project [now Pica, Leiden], Dierk Höppner [Universitätsbibliothek Braunschweig], Cornelia Katz [Universitätsbibliothek Konstanz], and Emma Lee Yu [Brooklyn College Library, New York].

1. Introduction

The subject of Project REUSE+ has been to study possible improvements for the exchange of multipart records between USMARC and German data. This was the only major problem that remained unresolved when the first Project REUSE (1) was concluded in 1997.
The present paper summarizes the findings of this additional study and pursues two purposes:

to explore new concepts of dealing with Part -> Whole relationships (as they emerge from an IFLA study, "Functional Requirements for Bibliographic Records" (2)), and

to show how exchange between German and Anglo-American agencies can benefit from these.

Proper handling of Part -> Whole relationships according to the IFLA study does not call for radical innovations in either AACR or USMARC - just innovative application within these frameworks. Both rules and format do make provisions for almost everything needed, but, as sometimes happens, there are useful options or alternatives which have never been employed.
Whereas in principle, and within the environments of their respective rules and formats, German and AACR catalogers could produce near-identical results, there are big differences in practice, and the biggest are the following:
Germany: Every part of a multipart item is described in a separate record, and these are linked to a main (bibliographic) record describing the work as a whole. In shared cataloging, every library can attach their exact holdings data to precisely those parts they own copies of. Volume details are viewed as bibliographic data and therefore shared. This makes retrieval more precise and interlibrary loan requests more predictable.
AACR: There is either one and only one record for the whole work, with a very brief description (if anything) of the parts in a contents note, or there are separate records for every part but no main record for the collection - except, sometimes, an authority record in cases, for example, where a uniform title was determined.
In the first case, holdings data in a shared database do not reflect which parts a given library actually has. This is established on the local level (circulation system) only. This means that, unlike in Germany, bibliographic data of volumes are not shared.
Discussion in this paper does not cover "multilevel hierarchies". Of course, as soon as one can link something to something bigger, one can extend this to more than two levels and construct tree-like hierarchies. On the other hand: We have been using multi-level linking for a long time in Germany but are now phasing it out in favor of a simpler two-tier approach, which could even be implemented in USMARC without difficulty. We try to demonstrate this in the Appendix.
That means, in Germany we are already in the process of taking big steps in paving the road for international harmonization - without, however, giving up the essentials of record linking which we continue to find of vital importance for our union catalogs as well as OPACs. Recent developments show us to be on the right track: we have long since had the equivalent of linking techniques described in the IFLA study, and also of the metadata element "DC.relation".
Statistics of USMARC data and our scrutiny of pertinent examples have confirmed once again that very little can be done to improve the conversion of USMARC records, as they presently are, into our formats. The reverse seems to be easier, especially once we have completed the simplification to two levels. We therefore find it appropriate, and take the liberty, to present and discuss approaches beyond the limits of the status quo of USMARC, especially in the light of the recent Toronto conference and the requirements now taking shape in the metadata projects. We do hope these suggestions will be found constructive, while we are aware that not everything can possibly be implemented in the short or even medium term.
It appears very realistic, eventually, to augment USMARC data with subrecords for parts and components converted from German data because these come universally and consistently equipped with these additions, now lacking in USMARC. Examples are provided and a database has been set up to demonstrate the effects.

Suggestions for improvements are tentatively flagged in the left margin with these letters:

S  Short-term, easy improvements

M  Medium-term, not really easy

L  Longer-term, involving conceptual changes

Except for section 8, these suggestions apply to USMARC.

An Appendix is provided to put this study into the wider context of record linking and to explain in more detail some of the suggestions made in the text.

2. Remarks on "work" and "item"

The time seems to have finally arrived when the concept of the "work" has become a real focus of attention in the world of cataloging. Whatever the wording of the various suggested definitions, for example by Martha M. Yee (on p. 33/34 of her paper "What is a work"), they all revolve around the idea of the work being a "product of intellectual or artistic activity ... which can stand alone as a publication...".

A publication is thus a physical embodiment or manifestation of a work, but it IS not itself the work. Cataloging has focused on "the piece in hand" for a long time, which is always a "copy of a manifestation of an expression of a work" (IFLA "Functional Requirements for Bibliographic Records (FRBR)" study (2)). The result is that catalog records (USMARC or other) always contain elements from each of these four levels. Asking "What's the work in this publication" often reveals that either the piece in hand is only a part of what one would call a work, or it may be looked upon both as self-contained AND as part of some bigger intellectual product, or it contains more than one product of intellectual activity, where each one could conceivably stand alone as a publication and sometimes does.

Should cataloging become serious about work orientation, we have to conclude that the unwritten principles of "one book - one record" or even "one call-number - one record" are inadequate, as well as the equation "bibliographic record = main entry card". Though these may characterize the most frequent situation, two other situations need to be distinguished:

A) Multi-part items: A manifestation consists of more than one physical part, each of which may or may not represent a smaller work.
Typical case: a volume in a series; it has its own title but is also part of a larger entity.

B) Containers: One physical item is host to (contains manifestations of) more than one work
Typical case: Audio CD containing recordings of several pieces.

These situations are catered for in AACR2 chapter 13, "Analysis".

The way these rules are translated into MARC records is still governed by card-oriented, not work-oriented thinking: One item - one main entry card. Additional cards (added entries) are then made as needed, using the appropriate fields for headings. This principle is adequate for an inventory or a shelf list, but less for a catalog. Not for one based on Cutter's objectives at least, or the Paris principles - and these are supposed to govern German rules as well as AACR2.

To sum it up: At present, a "multi-volume item" is very often regarded by AACR2 practice as one item. German rules and practice always regard it as several items. "Containers" are treated as single items in both worlds, though there is the concept of "In" analytics in AACR2 and somewhat more elaborate rules have been worked out in Germany for the same purpose: to produce bibliographic records for contained manifestations (see 4. below).

3. Situation A. Multipart item: Manifestation of a work consists of several physical parts

First, some terminology: "Series" and "multivolume monographs"

Simply stated: English and German terminology are incongruent.

German catalogers' jargon uses "mehrbändige begrenzte Werke" (multivolume works with finite number of volumes) and "Serie mit Stücktiteln" (indefinite series with distinctively titled volumes), and these two categories receive different treatment. There is thus no exact equivalent for "series".

Works by personal authors are always regarded as multipart items in the finite sense ("mehrbändiges begrenztes Werk") and never as "series": since the authors' lifetimes are finite, such works will have a finite number of separate parts. AACR2 (or at least LC), by contrast, quite often treats these like series intended for indefinite publication, just because there is no indication as to exactly how many volumes there will eventually be.

The traditional German term "mehrbändige Werke" should rather be abolished because a "work" (= "Werk"), in current understanding, cannot consist of volumes. The work is an abstract entity. Only a manifestation of a work is physical and can thus appear in several volumes.

The term "mehrteilige Veröffentlichung" (= multipart publication) has been suggested as a more exact equivalent for "series", but is not established yet.

The AACR term "collection title" for the general title (found on every part) of a series, multivolume monograph or serial translates as "Übergeordneter Gesamttitel" (superimposed or superordinate collective title). "Collection", on the other hand, cannot be translated as "Sammlung" in German, since "Sammlung" is reserved for collections of works by one author (one or several volumes, "collected works"), which is again an incongruent concept.
Lastly, the German expression for "set" is the rather awkward "mehrbändiges begrenztes Werk ohne Stücktitel".

Shortcomings of AACR2 cataloging, as we perceive them in Germany, boil down to the following, and we attempt to indicate possible solutions in terms of USMARC (see section 9 for examples):
We are aware that some or all of the following, or some equivalent, have been implemented in various US local systems. These could immediately benefit from the corresponding upgrading of shared records.

A.1
Volumes with no titles or indistinct titles are in some cases listed in a note (MARC 505, based on AACR 13.4) instead of just mentioned in the 300. This note, however, can contain other information like tables of contents of single-volume publications, and there is not sufficient formatting in the note text, nor other coding in other fields, to make it apparent for software that the record being converted does in fact describe a multipart item, and then to extract the titles or volume designations. OCLC statistics show there to be fewer than 2% of book records with these characteristics in their database (a 505 plus "v." in the 300); in Music, however, the proportion of records with a 505 is 67%!

An analysis done in Göttingen of 4.1 million LC records shows there to be about 3.3% of book records containing a 300 with a "v." AND a 505, but 3.6% of book records with a "v." in the 300 and no 505. The latter are mostly works where the volumes carry nothing more than a numbering beside the general title. In Germany, these too would get subrecords - with an empty title field. This way, we can always record volume-related dates, paginations, series titles and numberings, ISBNs etc. on the volume level. USMARC data lack these volume details entirely, 505 or not.

No German system, to the best of our knowledge, uses a technique like the 505. Instead, as stated above, we have linked subrecords for all parts of a multipart, whether they have a distinctive title or not.

If the MARC world cannot warm to this idea but wants to retain the 505 (although, statistically, it is not a large section, and thus not a lot of work), then what can be done?

1.

S
The easiest improvement appears to be the use of indicator 2: create the new values of 1 and 2 (when the present values are blank or 0). Values 1 and 2 would then mean: this contents note is a "volume list". Our conversion software could then at least create a main record and rudimentary, linked subrecords.

2.

M
Additionally, the formatting (subfields plus punctuation) within the 505 might be made unambiguous to enable software to extract volume designation, date, and pagination for every volume listed (provided someone has put them in). This punctuation could enable our conversion software to produce subrecords for every volume, with a bit more than rudimentary detail.
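To illustrate suggestions 1 and 2, here is a minimal sketch in Python of what conversion software might do with a 505 that has been flagged as a volume list. The indicator values "1" and "2" are the values proposed above and are hypothetical, not defined USMARC. The " -- " separator and the "v. 1." style of designation are assumptions about how the note text is punctuated, and the function name and dictionary keys are ours, for illustration only.

```python
import re

# Hypothetical: second-indicator values proposed in suggestion 1 to mean "volume list".
VOLUME_LIST_INDICATORS = {"1", "2"}

# A leading volume designation such as "v. 1." or "Bd. 2.", followed by an optional title.
VOLUME_PREFIX = re.compile(
    r"^(?P<designation>(?:v|vol|Bd|pt)\.\s*\d+)\.?\s*(?P<title>.*)$",
    re.IGNORECASE,
)

def subrecords_from_505(main_id, indicator2, note_text):
    """Split a 505 volume list into rudimentary subrecords linked to main_id."""
    if indicator2 not in VOLUME_LIST_INDICATORS:
        return []  # an ordinary contents note: nothing can safely be extracted
    subrecords = []
    for entry in note_text.rstrip(". ").split(" -- "):
        entry = entry.strip()
        match = VOLUME_PREFIX.match(entry)
        if match:
            designation = match.group("designation")
            title = match.group("title").strip()
        else:
            # No recognizable designation: keep the whole entry as the title.
            designation, title = "", entry
        subrecords.append({"link_to": main_id, "volume": designation, "title": title})
    return subrecords
```

A volume that carries only a numbering comes out with an empty title, which corresponds to the German subrecord with no title field. Dates and paginations could be extracted the same way only if the punctuation within the note were made unambiguous, as suggestion 2 asks.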

AACR2 talk of "contents notes" in a rather roundabout way. Maybe the term "volume list" or "list of parts" or something to this effect should be introduced to denote the case of a multipart publication.

3.

L
On the other hand, a more far-reaching (and thus less realistic) suggestion would be this: AACR2 leave it to the "cataloging agency" (13.1A) to decide if and when they want to make contents notes. Specifically, AACR2 do not strictly dictate a distinctive / nondistinctive title differentiation. Thus, nothing in the rules would stop an agency from phasing out the contents note for multiparts and introducing something similar to German practice. In card printouts, a contents note could still be generated by software. No rule change is called for, but some changes in MARC encoding and software appear necessary.

4.

S
This leaves us with those cases where there is only a "v." in the 300 and no 505. Not all of these are multi-volume: "1 v." can occur. The simplest solution would be to have an indicator for the 300, just like for the 505, saying "this is a multi-part item". If one could sort out the "1 v." cases, one might even generate these indicators retrospectively.
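The retrospective sorting-out of the "1 v." cases mentioned in suggestion 4 might look roughly like the following Python sketch. It is a heuristic under stated assumptions: it inspects only the text of the 300 $a, treats a bare "v." or a count greater than one as a sign of a multipart item, and uses a function name of our own invention.

```python
import re

# "v." optionally preceded by a count, e.g. "3 v." or a bare "v." (open-ended set).
# The word boundary keeps abbreviations such as "rev." from matching.
VOLUMES = re.compile(r"(?:(?P<count>\d+)\s+)?\bv\.")

def looks_multipart(extent_300a):
    """Guess from the 300 $a whether a record describes a multipart item."""
    match = VOLUMES.search(extent_300a)
    if match is None:
        return False
    count = match.group("count")
    # "1 v." usually denotes a single volume, not a multipart item.
    return count is None or int(count) > 1
```

Records for which this returns True would be candidates for the proposed indicator. The heuristic cannot, of course, supply the volume details that are missing from such records.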

A.2
Volumes with distinctive titles are sometimes cataloged under these titles (i.e., given their own, separate records) with a 4XX (pre AACR2) or 8XX referring to the title of the series.

Of course, there is the occasional argument over whether a volume title is distinctive or not. Cases of doubt are probably more often decided pro-nondistinctive (it's less work), but the opposite is certainly more user-friendly.

AND there is the whole issue of classification decisions entering into cataloging treatment decisions (one call-number - one record!) - in our view, a most distressing aspect because there is no way of knowing which alternative will have been chosen for any particular multi-part item. For countries not using the LC Classification this is all the more mysterious.

To us, it appears this jagged borderline could be completely eliminated.

Here's a new suggestion for USMARC (not for AACR2 since these allow it already):

5.

M
The series as such should be considered a work in its own right and cataloged as such (like a serial record) and apart from the volumes.

Presently, AACR2 do not prescribe a dedicated bibliographic record for the series (13.3). Instead, in USMARC, an authority record is sometimes, but not always, made for the series on the basis of 26.5. (More on this, since it is quite fundamental, see under 6.) German catalogers find this use of authority records quite unusual. And annoying, because this way we never get to see the series record. Why not?
The LC authority file has never been loaded into any German systems for reasons of incompatibility and incongruence, and in any case, we would have no use for the vast majority of the records.
That means USMARC bibliographic records, when converted and merged into German systems, are essentially incomplete: we lack all the references contained in the authority records and particularly, we lack the main records for series. (See section 8.)

A.3
The subfield $p (name of part/section of a work) in the 245 often looks like an uneasy compromise.
The trouble with this is that software cannot determine if the part title given in $p is a distinctive one.

6.

S
This technique, very rarely applied (as statistics reveal), might better be abandoned - in our view - in favor of solution 1. or 2., where $p would become $a.

7.

M
Alternatively, one could restrict the use of 245 $a$p for cases where $p is non-distinctive - but then use this technique for all such cases, instead of the 505. That would mean: one record for every part, 245 $a the same for all parts, $p different. Or: one main record with the 245 $a, and linked subrecords containing only a 245 $p and no $a. (We are aware that $a is presently mandatory in 245.)

In Germany, we would rather see the 245 $a$p phased out completely, in favor of the other solutions.

Taking a closer look, the solution typically implemented in Germany consists of

One main record for the series or multipart item (This is a bibliographic record, not an authority record!) This main record contains no links to the subordinate records.

No holdings are attached to this main record.

It is not a "work record", for every manifestation gets its own record.

One subordinate record for every part or volume, whether it has a distinctive title or not. If the latter, then there is no equivalent of 245 $a. These subordinate records are linked (upwards only!) to the main record via its IDNr. In the union catalog databases, holdings are attached to these records so as to accurately reflect which library has which parts of the publication.

Volumes with non-distinctive titles just have no title field - obviously. All other fields (date, physical description, names of persons related to that volume, and so on) can be present in the subordinate record.

Cataloging or OPAC software can always present the comprehensive work with all its parts (following the control numbers) but also, when a subrecord is hit, the main record can be displayed with just the subrecord in question.
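The two display behaviours just described can be sketched in a few lines of Python. This is an illustration of the upward-only linking principle, not a description of any actual German system: the records are plain dictionaries, and the keys "id", "parent", "volume" and "title" as well as the sample data are invented for the purpose.

```python
from collections import defaultdict

# Illustrative records: the main record has no links; subrecords point upwards.
records = [
    {"id": "100", "parent": None, "title": "Collected essays"},
    {"id": "101", "parent": "100", "volume": "1", "title": "Early writings"},
    {"id": "102", "parent": "100", "volume": "2", "title": ""},  # no distinctive title
]

def build_index(records):
    """Index subrecords by the IDNr of the main record they point up to."""
    by_id = {r["id"]: r for r in records}
    children = defaultdict(list)
    for r in records:
        if r["parent"] is not None:
            children[r["parent"]].append(r["id"])
    return by_id, children

def display(record_id, by_id, children):
    """Main record hit: show all parts. Subrecord hit: show main record plus that part."""
    record = by_id[record_id]
    if record["parent"] is None:
        main, parts = record, [by_id[c] for c in children[record_id]]
    else:
        main, parts = by_id[record["parent"]], [record]
    lines = [main["title"]]
    for part in parts:
        label = part["title"] or "[no distinctive title]"
        lines.append(f"  v. {part['volume']}: {label}")
    return lines
```

Because the downward direction is derived from an index rather than stored in the main record, adding a new volume never requires touching the main record.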

This solution can be coded in USMARC as well, at least in theory:
Field 773 can be used to link a volume record to a "collection" record:

773 $w IDNr of collection [ $a heading $t collective title ] $g volume information

For all intents and purposes apart from this, the volume record would be a regular USMARC bibliographic record.
A paper authored by Sally McCallum described this in full detail and was made available to us ("Multilevel descriptions in USMARC", 20 Jan 1997). But to the best of our knowledge, nobody is using this technique. Statistics confirm that LC has not implemented the 773.
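The 773 pattern given above can be written out as a small formatting helper. The Python sketch below follows the subfield order of that pattern ($w first, $a and $t optional, $g last) and produces a display string only; the function name, the parameter names, and the sample control number are our own assumptions.

```python
def make_773(collection_id, volume_info, heading=None, collective_title=None):
    """Format a 773 linking field as a display string with $-delimited subfields."""
    subfields = [("w", collection_id)]
    if heading:
        subfields.append(("a", heading))
    if collective_title:
        subfields.append(("t", collective_title))
    subfields.append(("g", volume_info))
    return "773 " + " ".join(f"${code} {value}" for code, value in subfields)
```

For example, make_773("12345", "v. 2") yields the string 773 $w 12345 $g v. 2, which is all a system with real record links would need. The textual $a and $t are redundant there, but remain useful for systems that can only display them.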

8.

M
The approach of least change for USMARC users would be to go on using 8XX for series access points, but then change the series authority records into bibliographic records. This would make the indexing easier, too. To establish real links (via control numbers) and thereby help USMARC users abroad, there's the possibility of introducing $w into the 8XX. As of now, USMARC generally has "textual links" only. (Whereas local systems do have all sorts of sophisticated linking techniques.)

9.

M
Alternative solution: (see Example 2 and for the linking concept, the Appendix)
If the above is found too difficult or unacceptable for other reasons, one may consider using the 787 tag, with 'p' for the 2nd indicator. This way, one would have all relationships between bibliographic records implemented in a uniform way in just one additional field. And all multipart publications could be treated alike in this concept - the cumbersome contents note could be done away with altogether for multiparts. (Software can, of course, using the links, assemble a contents note for card output or display. Software could also produce an added entry out of a 787 just as from an 8XX, for the structure is virtually identical for this purpose. This means one could even avoid redundancy.)

Navigating the relationships (i.e., to write software for this purpose) would be made easier this way than with any other solution.
Every physical part would have its own record in this solution, linked to a common main record.
Besides, every physical part can relate to a separate work, and the main record in turn could be marked as a part of an even larger work.
With the 787 being repeatable, every physical item can have links to (be a part of) more than one comprehensive work.
Circulation (copy) records can then be, quite naturally, attached to the subrecord describing the physical part on the local level.

4. Situation B. Containers: Several works manifested in one physical item

The most frequent examples are in music, but festschriften or conference volumes are much the same. For the latter two, however, hardly any library is doing analytics for all the contributions in such volumes.

In German, the components are called "Unselbständige Werke" (dependent works). Again, the use of the term "works" here is inaccurate, of course - "manifestations" would be correct.

Everybody will know the structure of USMARC music records: (showing only those parts relevant for our discussion)

100 1 Composer of first piece
240 10 Uniform title of first piece
245 10 Title of container
505 Contents: composers and titles (not in authority form!)
511 Performers and conductors, as given on the piece
700 1 $aPerformer (R)
700 1 $aConductor (R)
700 12 $aComposer of 2nd piece $tTitle of 2nd piece
700 ...... 3rd ... 3rd ...

For the 700$a$t fields, indicator 2 is set to 2 (analytic). These 700 fields can serve for analytical added entries (13.2) as well as for "In" analytic entries (13.5). In practice, only the former is done, because the "In" analytics would be somewhat deficient:

The big problem is, of course, that no program can determine from this which conductor and performer(s) belong to which of the pieces listed in the 700$a$t entries. Results of boolean searches for performer AND composer are therefore often misleading because the performer and composer are in fact unrelated and only happen to be listed on the same CD. For the same reason, keyword searches for composer and title word or opus number can be equally disappointing. (See Example 3 below.)

10.

M
The simplest solution to this dilemma would be to make one separate ("analytic") record for every piece or "cut" on the CD, with its own 100, 240, 245, and 700s for the performer(s) and the conductor belonging to this piece. These analytic records would be linked (upwards) to the main record for the CD. This main record would not have a contents note but just a 245 and the descriptive data necessary to identify the CD. Example 3 below shows the details.

Rule 6.1G4 allows this and it had been the norm for sound recordings under AACR1.
Unfortunately, the existing, convoluted USMARC music records cannot be dissected by software into a main record plus the necessary number of analytic records, for the reason mentioned. The only possibility is to produce incomplete analytic records with composer/title in a 100/240, derived from the 700$a$t fields; the other 700s and the 505 and 511 would have to remain in the main record. We went through this exercise in Braunschweig and produced a classical music database totaling 40,000 records arranged in this way. The advantage is that an any-word boolean search for composer AND title word then yields only relevant titles. Otherwise, when keyword indexing original USMARC, you may search for "mozart and trio", for example, and get a CD containing a Mozart quartet and a Beethoven trio. In terms of retrieval precision, which is what really matters for OPAC users, the current USMARC practice is known to be suboptimal for music.
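The "mozart and trio" effect is easy to reproduce. The following Python sketch compares an any-word boolean AND search over one flat record for the whole CD with the same search over a main record plus analytic records. The record contents, field names and the toy indexing are invented for illustration and do not reproduce the Braunschweig database.

```python
def keywords(record):
    """All lower-cased words occurring anywhere in a record's field values."""
    words = set()
    for value in record.values():
        words.update(value.lower().replace(",", " ").replace(".", " ").split())
    return words

def search(records, *terms):
    """Boolean AND search over the any-word index of each record."""
    return [r["id"] for r in records if all(t in keywords(r) for t in terms)]

# One flat record for the whole CD, as in current USMARC practice (abbreviated).
flat = [{
    "id": "cd",
    "title": "Chamber music",
    "analytic_1": "Mozart, Wolfgang Amadeus. String quartet, K. 465",
    "analytic_2": "Beethoven, Ludwig van. Piano trio, op. 97",
}]

# The same CD as a main record plus one analytic record per piece.
analytic = [
    {"id": "cd", "title": "Chamber music"},
    {"id": "cd-1", "composer": "Mozart, Wolfgang Amadeus", "title": "String quartet, K. 465"},
    {"id": "cd-2", "composer": "Beethoven, Ludwig van", "title": "Piano trio, op. 97"},
]
```

Searching the flat record for "mozart" AND "trio" returns the CD although it contains no Mozart trio; the same search over the analytic records returns nothing, while "beethoven" AND "trio" finds exactly the one piece.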

11.

L
A new, work-oriented solution could be based on the same linking technique as in Situation A: here, however, the main record represents the physical volume, whereas the upward-linking subrecords relate to separate works, representing manifestations which in this case do not stand alone as publications. Holdings (or copy records in local circulation systems) would be attached to the physical volume, i.e. the main record. No AACR rule change would be required: Chapter 13 on analytics says nothing on how to implement analytic records in any format.

In Germany, we are only just beginning to catalog components. Rules for "dependent works" have been worked out, at least. There is the chance now to harmonize this area of cataloging from the beginning - if only we can find common ground here.

5. Synthesis: A and B are essentially the same

12.

L
If we turn to a work-oriented approach, we have to focus on logical entities rather than physical items, on identifiable intellectual products rather than "pieces in hand". If we do that, Situations A and B become essentially the same. Two different solutions are no longer needed - one linking technique can cover both situations. And it goes almost without saying: Questions of shelving or classification should (and need) no longer influence cataloging decisions. Let any two (or twenty) volumes share a call number, but not a record!

As early as 1989 Patrick Wilson stated that ".. the control of items is achieved at the expense of the control of works." ("The Second Objective" in: "The conceptual foundations of descriptive cataloging" / ed. by E. Svenonius. - San Diego: Academic Press, 1989, p. 8)

To rectify this, the cost in terms of rule changes is zero (not rule usage and interpretations!), the cost in format implementation is rather low. The real problems are with the legacy data. No complete conversion is possible, whatever model one might choose. Nevertheless, the reasoning of the FRBR study very much supports the view presented here, while at the same time we acknowledge that there seems to be no easy alternative even for the long term.

6. Notes on authority records vs. bibliographic records (see 3.A.2)

Authority records owe their existence partly to the card concept of references, AACR2 chapter 26. Faithful to the letter of rule 26.5, authority records are created for series. The online equivalent of a reference can have more functions than a reference card, however, depending on the software.

13.

M
The same effect can be achieved by having a bibliographic record for the series instead of an authority record. (Unlike names and subjects, or "works" for that matter, a series is a bibliographic entity!) This would overcome the awkwardness of using different fields and subfields in authority records and in bibliographic records (100 $t instead of 245, 643 instead of 260, etc.), which, in the view of database programmers, is a problematic design feature in USMARC.

Rule 26.5B (references for serials) is analogous to 26.5A for series. Yet, for serials with no distinctive title volumes, no authority record but a bibliographic record is made - for obvious reasons: there would otherwise be no catalog entry for the serial. Or the authority record would more or less duplicate the bibliographic record, only with different tagging. And of course, one cannot attach holdings to an authority record. Thus we have two perfectly analogous sections of a rule (26.5A/B), yet their treatment in the format is very different. The only reason for this was that it was the easiest way to produce the appropriate card headings for series volumes (100 $a$t directly provides the heading). It is not the only possible way: if we had a bibliographic record for the series as such, the heading could still be produced, using the 100/240 or 100/245 of the series record. In terms of linking, the volume record would ideally contain not the series title but the record control number of the series record only, which is enough to produce the heading when needed. Where this kind of data linking is not possible, the volume record can retain exactly the structure it has now, with the parts of the 8XX composed out of 100 and 240/245 of the series record instead of the 100$a$t of the authority record.

It should not be difficult to convert series authority records into series bibliographic records. The biggest difficulty will be that these records, like most authority records, are doing double duty as subject authority records. To have, conversely, a series title record (as bibliographic record) double up as authority record is questionable and would surely be rejected. To have two records of equal content but different structure to serve these different purposes is not sensible either.

We are assuming here that authority records of all kinds are merged into an OPAC only if there are records in the OPAC to which they relate.

14.

XL
Probably, if it comes to a work-oriented approach, the whole dichotomy of bibliographic vs. authority records should be re-evaluated. Logically, authority records could be restructured to look largely like bibliographic records, lacking a 245 and 300 etc. That would eliminate the difference in designator definition between the two formats, ever so annoying for implementors. Eventually, the authority format could be phased out altogether. All kinds of links are made easier. Is this mere speculation?

7. "Work records"? A new suggestion.

A work record cannot be a bibliographic description because a work, by any definition, is not a bibliographic entity. (Certainly, what is not envisaged is a combined description of various manifestations in different physical forms.)

It is the elusive quality of the "intellectual product" that needs to be pinpointed, to provide an anchoring point for new types of links (see the Appendix), to collocate records of manifestations in useful ways.

Uniform title or series authority records have aspects of what a work record might be in that they provide standard names (collocation points) for works. A bibliographic record for an original edition can also be regarded as representing the work which is manifested in it for the first time.

This function of "representing a work" does not call for a new type of record then, but it can be made an additional feature of existing records.

15.

L
The simplest implementation would be to define a flag, and one might suggest position 8 or 9 in the leader (hitherto always blank), to say "this record, among other things, represents a work". This byte would then make the record eligible for all those links, anchored in 787 fields in all kinds of other records (see Appendix), each of which, in turn, could also have this same feature of "representing a work".

This solution appears quite alluring because, all of a sudden, one would have work records for most anything they would be needed for. There would be just that indicator switch to flick to make a record an official "work record".

8. Improvements for Germany under the status quo
Again: There is not much that can be done in Germany with USMARC multipart data as they presently are. That's why we took courage and went to these lengths to work out new suggestions.
To dispel unrealistic expectations, let us state one important point: we do not have the option, in Germany, to simply adopt the USMARC ways of dealing with multipart items. That would mean massive restructuring in shared and local databases and an abandonment of information found helpful for searching and necessary for efficiency in interlibrary loan, and it would destroy a high level of consistency. Too high a price, clearly, for implementing an anachronism.
Series authority records have an 'a' in position 16 of the 008 fixed field and thus are easily selected. Position 13 even indicates whether or not the series is numbered ('a' = numbered, 'b' = unnumbered), and position 12 has an 'a' or 'b' for series / multipart.
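
The selection just described can be sketched in a few lines of Python. The function name and the returned labels are our own illustrative choices; the sketch assumes the 008 field is given as a plain string with positions counted from 0, and it checks only the three positions named above.

```python
def classify_series_authority(field_008):
    """Return None unless position 16 is 'a'; otherwise describe the series."""
    if len(field_008) < 17 or field_008[16] != "a":
        return None
    # position 12: 'a' = series, 'b' = multipart item
    kind = {"a": "series", "b": "multipart"}.get(field_008[12], "other")
    # position 13: 'a' = numbered, 'b' = unnumbered
    numbering = {"a": "numbered", "b": "unnumbered"}.get(field_008[13], "other")
    return {"kind": kind, "numbering": numbering}
```

Records for which the function returns a "multipart" kind would be the candidates for restructuring into series main records.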

Therefore, if nothing else happens, at least we can go ahead, in Germany, and restructure those records for our databases, i.e. turn them into series main records. Conversely, series main records produced in Germany could be restructured into series authority records suitable for USMARC. BUT we do not always have an indicator in the main record saying "this series is a multipart item without distinctive titles", in which case the record would have to become a bibliographic record with a possible 505 instead. In most cases, this fact can be derived from the subrecords (which then have no equivalent of a 245 $a), but to program this change would not be quite straightforward. Does the MARC world want these records, however?

Statistics show, unfortunately, that the number of series authority records available in USMARC is rather small, so the attainable benefit is not as big as one might hope. Also, these authority records are not normally very rich in detail. There is an alternative:

From the examples we looked at it appears we might just as well generate our main records from the contents of 8XX. In that case, of course, we would have to de-duplicate these main records since we would get a new one out of every instance of an 8XX. This would be a bit easier if the 8XX had control numbers (in a subfield $w) referring to the authority records.
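
The de-duplication step might be sketched as follows. This is only an illustration under stated assumptions: each 8XX field is represented as a Python dict keyed by subfield code, the skeletal main record is reduced to a name, a title and a list of volume designations, and the text normalization is a crude stand-in for real matching rules.

```python
import re

def normalize(text):
    """Lowercase and drop punctuation so that textual variants compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()

def main_records_from_8xx(series_fields):
    """Build one skeletal main record per distinct series found in 8XX fields."""
    mains = {}
    for field in series_fields:
        if field.get("w"):
            # A control number, where present, identifies the series directly.
            key = ("w", field["w"])
        else:
            key = ("text", normalize(field.get("a", "")), normalize(field.get("t", "")))
        main = mains.setdefault(
            key, {"100": field.get("a", ""), "245": field.get("t", ""), "volumes": []}
        )
        if field.get("v"):
            main["volumes"].append(field["v"])
    return list(mains.values())
```

Note that a field with a $w and another without it would not be merged by this sketch, which is one reason why consistent control numbers would make the task easier.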

However:
Both indications of "multi-partness" are statistically less frequent than we had expected, judging from German data. In just as many cases, around 3 %, there is neither a 505 nor an 8XX but only a "v." in the 300. In these cases, we can convert the record into a main record, but for the subordinate records, there is nothing there to construct them from.
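
The three kinds of evidence mentioned here can be tested for mechanically. The following Python sketch assumes a record is a dict mapping a tag to a list of subfield dicts (our own simplified representation) and takes the 8XX tags to be 800, 810, 811 and 830.

```python
import re

SERIES_TAGS = {"800", "810", "811", "830"}

def multipart_evidence(record):
    """List which indications of multi-partness a record carries."""
    evidence = []
    if record.get("505"):
        evidence.append("505")
    if any(record.get(tag) for tag in SERIES_TAGS):
        evidence.append("8XX")
    extents = [field.get("a", "") for field in record.get("300", [])]
    # "v." standing alone or after a count, as in " v." or "3 v."
    if any(re.search(r"(?:^|[\s\d])v\.", extent) for extent in extents):
        evidence.append("300 v.")
    return evidence
```

A result of just `["300 v."]` marks the case described above: the record can become a main record, but there is nothing to build subordinate records from.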

9. Examples

Much of the following is probably unrealistic for the short term - but there are no convincing short-term solutions anyhow. This material is supposed to illustrate the suggestions made above. (All examples are from real life, but abbreviated to the essential parts. The different versions can all be studied in the example database.)

Example 1 : for Situation A.1

(Case 1) Volumes without DISTINCTIVE titles, but with SOME titles

a) Bibliographic record for the 3-part series

001 97002147
100 1 $aKnuth, Donald Ervin,$d1938-
245 14$aThe art of computer programming$c[by] Donald E. Knuth.
260 $aReading, Mass. :$bAddison-Wesley Pub. Co.$c1968-
300 $a v.$billus.$c25 cm.
505 1 $av. 1. Fundamental algorithms.--v. 2. Semi-numerical algorithms.--v.
3. Sorting and searching.

b) Suggested new 2nd indicator for the 505: (saying "this is a volume list")

505 12$av. 1. Fundamental algorithms -- v. 2. Semi-numerical algorithms --
v. 3. Sorting and searching.

c) Improved structuring of 505: (This would also help improve the indexing of contents notes)

505 13$vv. 1. $aFundamental algorithms.--$vv. 2. $aSemi-numerical
algorithms.--$vv. 3. $aSorting and searching.

or even a repeatable 505 (breaking up the above at every "--"):

505 13$vv. 1. $aFundamental algorithms.
505 13$vv. 2. $aSemi-numerical algorithms.
505 13$vv. 3. $aSorting and searching.
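
Breaking up an existing contents note in this way is a mechanical step. A minimal Python sketch, assuming the note follows the "v. N. Title" pattern of this example and that "--" occurs only as a separator:

```python
import re

def split_volume_list(contents):
    """Split a 505 volume list at every "--" into (volume, title) pairs."""
    flat = " ".join(contents.split())  # undo line wrapping
    pairs = []
    for part in flat.split("--"):
        part = part.strip()
        if not part:
            continue
        match = re.match(r"(v\.\s*\d+\.)\s*(.*)", part)
        pairs.append((match.group(1), match.group(2)) if match else ("", part))
    return pairs

def as_repeated_505(contents):
    """Render each pair as one 505 with the suggested $v and $a subfields."""
    return [f"505 13$v{vol} $a{title}" for vol, title in split_volume_list(contents)]
```

Parts that do not begin with a volume designation are passed through with an empty $v, so that nothing is lost when a note does not fit the pattern.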

d) A much better solution: series main record + volume records (This is what might be produced out of German records)

001 97002147
100 1 $aKnuth, Donald Ervin,$d1938-
245 14$tThe art of computer programming
260 $aReading, Mass. :$bAddison-Wesley
300 1 $av. 1-

001 85028675 //r955
020 $a0201038099
100 1 $aKnuth, Donald Ervin,$d1938-
245 00$aFundamental algorithms /$cDonald E. Knuth
260 $aReading, Mass. :$bAddison-Wesley,$cc1968.
300 $axxi, 634 p. :$bill. ;$c24 cm.
800 1 $aKnuth, Donald Ervin,$d1938-$tThe art of computer programming ;$v1.

or, even better (instead of the 800)

787 1p$w(DLC) 97002147$v1

Volume 2:
001 85028997
020 $a0201038021
100 1 $aKnuth, Donald Ervin,$d1938-
245 00$aSeminumerical Algorithms /$cDonald E. Knuth
260 $aReading, Mass. :$bAddison-Wesley,$cc1969.
300 $axxi, 634 p. :$bill. ;$c24 cm.
787 1p$w(DLC) 97002147$v2

Volume 3:
001 85028998
020 $a020103803X
100 1 $aKnuth, Donald Ervin,$d1938-
245 00$aSorting and searching /$cDonald E. Knuth
260 $aReading, Mass. :$bAddison-Wesley,$cc1975.
300 $axi, 723 p. :$bill. ;$c24 cm.
787 1p$w(DLC) 97002147$v3

(Case 2) Volumes with NO TITLES at all (i.e., no 505)

00L cam 22002291
001 ocm00531535
008 720427m19631965maua 00010 eng
010 $a 63020717 //r65
050 0 $aQC23$b.F47
082 $a530
100 1 $aFeynman, Richard Phillips
245 14$aThe Feynman lectures on physics$c[by] Richard P. Feynman,
      Robert B. Leighton [and] Matthew Sands
260   $aReading, Mass.$bAddison-Wesley Pub. Co.$c[1963-65]
300   $a3 v.$billus.$c29 cm
500   $aVol. 2 has subtitle: The electromagnetic field; 3 has
      subtitle: Quantum mechanics
650 0$aPhysics
700 1 $aLeighton, Robert B$ejoint author
700a1 $aSands, Matthew Linzee,$ejoint author
740 1$aLectures on physics

What we would like to see here is subrecords like this, for all three volumes:

001 85028675 //r955
100 1 $aFeynman, Richard Phillips
245 00$bVol. 1. Mainly mechanics, radiation and heat
300 $ap. 1-1 - 52-12 :$bill.
787 p$w(DLC) 00531535$v1 (see 001 of main record)

Example 2 : Situation A.2 : A multipart (or series?) WITH distinctive titles

Typically, there is an 8XX in the records for the individual volumes. We are not quite sure whether there will be an authority record for the series in all of these cases. But if one exists, then the pattern is the following:

a) The authority record, as it is now:

001 n 84717754
100 1 $aKnuth, Donald Ervin,$d1938-$tComputers & typesetting
400 10$aKnuth, Donald Ervin,$d1938-$tComputers and typesetting
640 1 $aComplete in 5 v.$zCIP t.p. verso of v. A
643 $aReading, Mass.$bAddison-Wesley

b) The series main bibliographic record as replacement, as it might be:

001 n 84717754
100 1 $aKnuth, Donald Ervin,$d1938-
240 10$tComputers & typesetting
245 10$tComputers and typesetting
260 $aReading, Mass.,$bAddison-Wesley
300 $aComplete in 5 v.$zCIP t.p. verso of v. A

c) Two of the 5 (distinctively titled) volumes:

001 85030845 //r933
020 $a0201134373
100 1 $aKnuth, Donald Ervin,$d1938-
245 14$aThe TeXbook /$cDonald E. Knuth ; illustrations by Duane Bibby.
260 $aReading, Mass. :$bAddison-Wesley,$cc1986.
300 $axii, 483 p. :$bill. ;$c24 cm.
490 1 $aComputers & typesetting ;$vA
800 1 $aKnuth, Donald Ervin,$d1938-$tComputers & typesetting :$vA.

001 85028675 //r955
020 $a0201134446
100 1 $aKnuth, Donald Ervin,$d1938-
245 14$aThe METAFONTbook : a complete user's guide to typeface design with
      METAFONT /$cDonald E. Knuth ; illustrations by Duane Bibby.
250   $a6th pr., rev.
260   $aReading, Mass. :$bAddison-Wesley,$cc1991.
300   $axi, 361 p. :$bill. ;$c24 cm.
490 1 $aComputers & typesetting ;$vC
800 1 $aKnuth, Donald Ervin,$d1938-$tComputers & typesetting :$vC.

The 800 is a textual link to the main record (its 100 and 245 combined).
To introduce a data link, the 800 might be extended like this:

800 1 $aKnuth, Donald Ervin,$d1938-$tComputers & typesetting :$vC.
$w(DLC) 84717754

Or, more radical again, a data link of the Part -> Whole category:

787 1p$w(DLC) 84717754$vC

The 800 would then be, in principle, redundant, but might still be supplied, minus the $w, for those systems that don't understand the 787.

Example 3 : Container - Several works manifested in one physical item
In our Classical Music Database you can find many examples like this. Go to index 3 and look up "zz"; under this pseudo-keyword we have arranged the examples you see above and a few more.

On the cover of a CD recording (Sony Classical SBK 62412) we find

Mozart: Serenade, K. 388
Beethoven: Octet, Op. 103
Dvorak: Serenade, Op. 44

A typical USMARC record for it looks like this:

00L njm 2200265Ia
001 ocm11129457
005 19861023
007 sdrbmm
008 840907s1996 nyu nn 0 N/A d
028 01$aSBK 62412$bSony Classical
040   $aVLY$cVLY$dPMC
245 00$aMozart: Serenade, K. 388, Beethoven: Octet, Op. 103, Dvorak:
      Serenade, Op. 44$hsound recording
260   $aNew York, N.Y.$bSony Classical$cp1996
300   $a1 sound disc (71 min.)$bdigital$c4 3/4 in
500   $aCover title
505   $aSerenade for winds in C minor, K 388 / Mozart (23:01) --
      Octet for winds in E-flat major, op. 103 / Beethoven
      (21:00) -- Serenade for winds in D-minor, op. 44 / Dvorak (23:55)
511   $aPinchas Zukerman, violin (1st work); Marcel Moyse (2nd
      work); Louis Moyse (3rd work)
650 0$aWind octets (Bassoons (2), clarinets (2), horns (2), oboes (2))
650 0$aSuites. (Instrumental ensemble)
700 10$aZukerman, Pinchas,$d1948-$4prf
700 1 $aMoyse, Marcel,$d1889-$ecnd
700 1 $aMoyse, Louis,$d1912-$ecnd
700 1 $aMozart, Wolfgang Amadeus,$d1756-1791$tSerenade$mwood-
      winds, horns (2)$nK. 384a (388)$rC minor$wnm
700 1 $aBeethoven, Ludwig van$tOctet,$mwood-winds, horns (2),
      $nop. 103,$rE_ major.$hsound recording
700 1 $aDvorak, Antonin$tSerenade,$mwinds and strings,$nop.
      44,$rD minor.$hsound recording
710 2 $aLos Angeles Philharmonic Orchestra
710 2 $aMarlboro Festival Octet
710 2 $aMarlboro Woodwind Ensemble

This is fine - as long as all you want is catalog cards. In OPAC searches, however, you get this record when asking for "zukerman and beethoven" or "marlboro and mozart". This record is also brought up in opus number searches for "mozart and 44" or "dvorak and 103". In other words, this kind of encoding cannot provide the level of precision users expect from computer catalogs.
Work-oriented cataloging, as we envision it now, would have to split this into four records. Basically, the 700s would be turned into linked subrecords, looking like these (note the 787 containing the link):

00L njm 2200265Iar
001 ocm12345678
007 sdrbmm
008 840907s1960 nyu nn 0 N/A d
100 1 $aMozart, Wolfgang Amadeus,$d1756-1791
240 10$aSerenade$mwood-winds, horns (2)$nK. 384a (388)$rC
minor$wnm
787 1p$w11129457,1

00L njm 2200265Iar
001 ocm12345679
007 sdrbmm
008 840907s1960 nyu nn 0 N/A d
100 1 $aBeethoven, Ludwig van
240 00$aOctet,$mwood-winds, horns (2),$nop. 103,$rE_
major.$hsound recording
787 1p$w11129457,2

00L njm 2200265Iar
001 ocm12345680
007 sdrbmm
008 840907s1960 nyu nn 0 N/A d
100 1 $aDvorak, Antonin
240 00$aSerenade,$mwinds and strings,$nop. 44,$rD minor.$hsound
recording
787 1p$w11129457,3

These can be produced by a little program that turns a 700 into this kind of structure if it contains both a $a and a $t, which then make up the 100 and 240, respectively. The 787 contains the 001 control number of the main record plus an appended numbering. For the example database in Braunschweig, this is exactly what we did.
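
The following Python sketch shows the principle of such a program. It is not the program used in Braunschweig: field bodies are taken as plain strings with "$" as the subfield delimiter, indicators and the fixed fields are left out, and the function names are our own.

```python
def split_subfields(body):
    """'$aMozart$tSerenade' -> [('a', 'Mozart'), ('t', 'Serenade')]"""
    return [(chunk[0], chunk[1:]) for chunk in body.split("$")[1:] if chunk]

def subrecords_from_700s(main_control_number, bodies_700):
    """Turn every name/title 700 into a skeletal subrecord with a 787 link."""
    subrecords = []
    for body in bodies_700:
        subfields = split_subfields(body)
        codes = [code for code, _ in subfields]
        if "a" not in codes or "t" not in codes:
            continue  # a plain name entry, e.g. a performer
        t_index = codes.index("t")
        # Everything before $t is the name (100); $t and the rest form the 240.
        name = "".join(f"${c}{v}" for c, v in subfields[:t_index])
        title = "$a" + subfields[t_index][1]
        title += "".join(f"${c}{v}" for c, v in subfields[t_index + 1:])
        sequence = len(subrecords) + 1
        subrecords.append(
            {"100": name, "240": title, "787": f"$w{main_control_number},{sequence}"}
        )
    return subrecords
```

Performer entries such as the one for Zukerman are skipped because they have no $t, which is precisely why the resulting subrecords lack the performer information.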

What no software can do, however, would be to produce subrecords with ALL the information belonging to every single work. The first one would then have to look like this (and you find this in the database as well):

#00L njm 2200265Iar
#001 ocm12345678
#007 sdrbmm
#008 840907s1960 nyu nn 0 N/A d
#100 1 $aMozart, Wolfgang Amadeus,$d1756-1791
#240 10$aSerenade$mwood-winds, horns (2)$nK. 384a (388)$rC
minor$wnm
#700 1 $aZukerman, Pinchas,$d1948-$4prf
#710 2 $aLos Angeles Philharmonic Orchestra
#787 1p$w(DLC) 11129457,1

Appendix: Links between bibliographic records

The following is an updated version of part 2 of a three-part series of contributions to the subject of linking, distributed last year to the e-mail list set up prior to the Toronto conference. Those postings were in fact an outgrowth of REUSE.

Since then, a very new development became prominent: the "Dublin Core" metadata community has tackled the issue of relationships between documents. One result is the list of relation terms included in the DC Simple standard (http://purl.oclc.org/metadata/dublin_core_elements).

What is attempted here is not to theorize but to present a minimal implementation of record linking into USMARC without so much as a single new field or subfield, yet capable of handling a wide range of logical links - including the DC types. Hopefully, this will at least make the recent suggestions of theorists clearer (see the FRBR study), and hopefully too, it will show that those suggestions are not extremely difficult to realize.

This implementation follows S.L. Vellucci's reasoning in her "Bibliographic relationships" paper ((2) p.28/29) and M. Yee's outline suggestions in her paper "What is a work" ((3) p. 25/26). Both these papers were presented to the Toronto conference.

A "conceptual schema" for a full-scale model can be found in a paper by Gregory Leazer (4).
To make things really simple, let us use just one tag to accommodate all links, namely the

787 Nonspecific relationship entry (Repeatable)

and two subfields :

$w Record control number (target to link current record to)
$g Relationship information (textual; optional)

All other subfields, strictly speaking, are redundant because they all contain fields from the target record pointed to by the number in $w.

The other subfields may be used, on the other hand, in situations where the software cannot handle these links or the target record is not present in the database. They can be inserted into exchange records, of course, which would then look as they do now, plus the $w. This way, local software would not be affected if it is not prepared for linking.

The subfields are defined repeatable in USMARC, but that makes matters unnecessarily complicated. It is better to have another 787 for every link to be established.

Why 787? The 787 was first defined for serials, later made applicable for all types, and it is the closest thing that can be found to suit the purpose in question.

Instead of using 787, one might think of extending the 700, which currently has no $w. (For German readers: the $6 for "linkage" has nothing to do with $w, for it is meant for intra-record linking to an 880 field in an alternate script!) It may be better, however, to restrict the 700 to additional personal name access points and take the name/title references out of it and make true 787 work links instead, or separate analytic records altogether in the case of contained works - but that's for Part 3.

On the other hand, the 787 could be made to look almost like a 700, except the $w, so that local systems incapable of handling it could turn it into a 700. But to provide a 700 in addition to the 787 would be a waste.

To distinguish between the various relationships, and to make them specific, our simple model proposes the use of indicator 2 in 787, as yet undefined. This indicator might take on the following values (and here, a full-scale model would not have to differ): (in parentheses: DC Simple terms for relations)

0 Equivalence (facsimile or reproduction) (IsFormatOf)
1 Simultaneous edition (IsVersionOf)
2 Successive derivation, edition, version (IsVersionOf)
3 Amplification (incl. commentaries, illustrations, criticism etc.) (IsBasedOn)
4 Extraction (abridgements, condensations, excerpts)
5 Recordings of performances
6 Adaptation, modification (change of genre or medium, arrangement) (IsFormatOf)
9 Translations (IsVersionOf)

a Accompanying relationship (supplements of any kind) (IsRequiredBy)
p Part -> whole relationship (IsPartOf)
r Review or other descriptive relationship
s Sequential relationship (like successive title of a serial)
u Unspecific relationship, based on shared characteristics of other kinds

This list is based very closely on B. Tillett's taxonomy of relationships and Smiraglia's extensions (these being the numbers 1 through 9 above; they are subcategories of Tillett's "Derivative" category which Smiraglia, by empirical evidence, found to be too broad). (4)

Differing from Tillett and Smiraglia, the list is ordered, more or less, from very close (identity) to rather distant (unspecific) relationships.

One might (but should one?) call this "relevance ranking". However, "relevance" is subjective, which means it is for the end-user to judge, not for the database producer.

Dublin Core terms for relationships are added in parentheses. These are part of the DC Simple standard which includes only a small list of very broad relationship terms.

What if more than one indicator applies? Like, say, for a sound recording of a part (one movement, an aria etc.)? Then use the first in the list that applies. Thus, "5" takes precedence over "p". This appears sensible for a minimal model. A full-size model would have additional work records for parts of a work, or it would define a repeatable subfield instead of the (not repeatable) indicator.
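
The precedence rule amounts to a lookup in the ordered list of indicator values. A minimal Python sketch, with names of our own choosing:

```python
# Indicator values in the order of the list above, closest relationship first.
RELATIONSHIP_ORDER = "01234569aprsu"

def pick_indicator(applicable):
    """Return the first value in list order among those that apply."""
    for code in RELATIONSHIP_ORDER:
        if code in applicable:
            return code
    raise ValueError("no known relationship value given")
```

For a sound recording of one movement, `pick_indicator({"p", "5"})` yields "5", as stated in the text.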

Another possibility: use repeated 787s to indicate several different relationships to the same work.

Yet another suggestion: add subfields $p for Part and $v for Volume to the 787 definition and use these subfields to relate to parts. But make $v sortable to allow for automatic arrangement of parts in correct sequences! One might think of adding a subfield $l to 787, but the language would be redundant because of 008 and/or 041.
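
What "sortable" could mean in practice is sketched below in Python: a sort key that compares numeric parts of a $v by value rather than character by character. This is only one possible approach, assuming volume designations consist of digits and letters with arbitrary punctuation between them.

```python
import re

def volume_sort_key(designation):
    """Sort key that orders 'v. 2' before 'v. 10' and 'A' before 'C'."""
    key = []
    for token in re.findall(r"\d+|[^\W\d_]+", designation):
        if token.isdigit():
            key.append((0, int(token), ""))
        else:
            key.append((1, 0, token.lower()))
    return key
```

A call such as `sorted(volumes, key=volume_sort_key)` would then arrange the parts in sequence.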

The physical medium or format adds another dimension which we need not discuss here. Of course, to have a work record act as a device to collocate several editions in different media would be a welcome effect.

The OPAC or retrieval software can certainly be made to give the user a selection qualifying for media type, in addition to everything else.

Work authority records? (see section 7)

How can this scheme be applied in cataloging, what will the OPAC user see? This at once raises the question "Don't we need work authority records first, to serve as targets for the links in the 787?" Not necessarily. If we have a record for the original edition of a work, then this may do double duty as a work record. If not, then a uniform title authority record with a $f and $t can be used. For the minimal model, this is sufficient. It means there is no need to define anything new beyond what USMARC already has.

The $t subfield is a bit ugly in that the initial article cannot be marked. To simply omit it may have been found tolerable by many, but for work authority records it is not good enough, not for every language at least. But that's another subject. It is being dealt with by discussion paper 106 and the subsequent proposal paper 98-16.

Instead of a name/title authority record one might think of a skeletal bibliographic record with a 100, 240$a and 260$c to serve as work record.

Suppose, by way of example, we have three items to catalog:

[1] A translated edition of Shakespeare's "Macbeth"

[2] An English and Italian vocal score of Verdi's "Macbeth" opera

[3] A sound recording of a performance of the latter.

For the two works involved, we use these authority records as work records:

001 a888
100 1 $aShakespeare, William$d1564-1616$tMacbeth$f1605

and

001 a999
100 1 $aVerdi, Giuseppe$d1813-1901$tMacbeth$f1847
787 16$wa888

The second has a 787 link to the first, saying it is an adaptation of it.
Our bib records will then have these core elements:

[1]
100 1 $aShakespeare, William$d1564-1616
240 10$aMacbeth.$lItalian [redundant because of 787?]
245 10$aMacbeth
787 12$wa888 [it is an edition of a888]
787 19$wa888 [it is also a translation]

[2]
100 1 $aVerdi, Giuseppe$d1813-1901
240 10$tMacbeth.$sVocal score.$lEnglish & Italian
245 10$aMacbeth :$bopera in four acts
787 16$wa999 [an arrangement of a999]

[3]
100 1 $aVerdi, Giuseppe$d1813-1901
245 10$tMacbeth$hsound recording
787 15$wa999 [it is a performance recording of a999]

The links in 787 can enable software to find and display the work records upon demand, and to collocate and display records related to a work record in all the various ways. In other words, the 787 as defined here is all that's needed, the rest is "only" software.

(At Braunschweig Univ. Lib., we did an implementation along these lines into the USMARC database we keep for compatibility studies, and it was only about an hour of work.)
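
To indicate how little software is involved, here is a minimal Python sketch of the collocation step. It is not the Braunschweig implementation. It assumes each record is reduced to its control number and its 787 links, given as pairs of indicator 2 and $w target; the control numbers bib1 to bib3 for the three items are invented, and the recording [3] is taken to point at the Verdi work record a999.

```python
from collections import defaultdict

# Display labels for a subset of the indicator values listed earlier.
RELATION_LABELS = {
    "2": "Later editions",
    "5": "Recordings of performances",
    "6": "Adaptations",
    "9": "Translations",
}

def build_link_index(records):
    """Invert the 787 links: target -> relationship -> linking records."""
    index = defaultdict(lambda: defaultdict(list))
    for control_number, links in records.items():
        for relation, target in links:
            index[target][relation].append(control_number)
    return index

def related(index, work_id):
    """Group the records linked to a work record under display labels."""
    groups = index.get(work_id, {})
    return {RELATION_LABELS.get(rel, rel): sorted(ids) for rel, ids in sorted(groups.items())}

records = {
    "a999": [("6", "a888")],                 # Verdi's opera adapts Shakespeare
    "bib1": [("2", "a888"), ("9", "a888")],  # translated edition
    "bib2": [("6", "a999")],                 # vocal score
    "bib3": [("5", "a999")],                 # sound recording
}
index = build_link_index(records)
```

With this index, `related(index, "a999")` groups the vocal score under "Adaptations" and the recording under "Recordings of performances", which is the raw material for the displays shown further down.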

Just a matter of software too is the support of inputting the 787 into new and existing records. One easily envisages a "point and click" method of making a link, and the system would have to prompt for the type of link and the (optional) $g input (plus, probably, $v and/or $p).

Establishing links can be made a quick and easy process with this model, and it wouldn't otherwise be feasible. The task, however, to introduce any linking model into a large database remains a daunting one. Maybe one can write software to turn 100/240 combinations and part of the 700 $a$t fields into work authority records and new type links, but these would have to be checked and the indicators supplied.

And what will the end-user see? That depends, as always, on the software that presents the OPAC interface. The emerging WebPAC technology can make use of these new elements and index entries to let the user traverse the links.

It will be no difficulty to produce displays like this for a work record:

Verdi, Giuseppe (1813-1901)
Macbeth. 1847.

1. [Adaptation of:] Shakespeare, William: Macbeth.
2. Adaptations
3. Recordings of performances

Here, 1. to 3. may be "blue" links in a WebPAC or just numbers for the user to enter in conventional OPACs. Following link number 1, one might get (depending what types of relationships actually exist in the catalog)

Shakespeare, William (1564-1616)
Macbeth. 1605.

1. Original edition
2. Later editions
3. Amplifications (incl. commentaries, illustrations, criticism etc.)
4. Extracts (abridgements, condensations, excerpts)
5. Recordings of performances
6. Adaptation, modification (change of genre or medium, arrangement)
7. Translations
8. Other related works

Following link number 2 (Adaptations) in the Verdi display will bring up records like

Verdi, Giuseppe:
[Macbeth. Vocal score (English & Italian)]

Macbeth : opera in four acts
...

1. [Adaptation of:] Verdi, Giuseppe: [Macbeth] (1847)

Link 1 here, of course, leads back to the Verdi/Macbeth work record.

Final remark.

It has become clear that the whole matter of linking bibliographic records is one that relies on implementation into the format, less on cataloging rules. Work records can be defined in cataloging terms alone and printed on cards as well. Links can be defined as new kinds of "added entries", potentially replacing current added entries (cf. what was said about 700 vs. 787 above).
For brevity, these added entries could even be called "links". How much sense all that makes for conventional catalogs, at least in terms of feasibility, is another question. In Germany, concern for conventional catalogs is no longer considered important in discussions of cataloging rules.

References

(1) REUSE Final Report, 15 July 1997. - URL: http://www.oclc.org/oclc/cataloging/reuse_project/index.htm

(2) Vellucci, Sherry L.: Bibliographic Relationships. - International Conference on the Principles and Future Development of AACR. Toronto, Canada, October 23-25, 1997. URL: http://www.nlc-bnc.ca/jsc/r-bibrel.pdf

(3) Yee, Martha: What is a Work? - International Conference on the Principles and Future Development of AACR. Toronto, Canada, October 23-25, 1997. URL: http://www.nlc-bnc.ca/jsc/r-whatis.pdf

(4) Leazer, Gregory: A conceptual schema for the control of bibliographic works. - In: Navigating the networks. Proceedings of the ASIS mid-year meeting, Portland, Oregon, May 21-25, 1994. (p. 115-135). - Medford, N.J.: Learned Information Inc., 1994. ISBN 0-938734-85-7
