ARABMARC: A Long Way to Go

Zahiruddin Khurshid

Abstract

Notwithstanding the availability of USMARC and UNIMARC as international exchange formats, the need for national formats continues to be felt, especially in countries or regions which use non–Roman scripts. This paper reviews the attempts made in the Arabian Gulf Region to prepare the framework for developing the ARABMARC format during the last 13 years. It also gives details of the availability of Arabic script support in various automated systems currently in operation, as well as formerly in use, in the Gulf Region, including STAIRS, DOBIS/LIBIS, MINISIS, VTLS, and Horizon. A critical review of the Arabic script support on RLIN is also provided.

Introduction

MARC, an exchange format developed by the Library of Congress in the 1960s, received wide acceptance by libraries, first in the United States and later all over the world. However, to meet the requirements for local cataloging practices, languages, and scripts, many national libraries or national agencies developed their own formats based upon the original MARC format. Some of them are so closely related to the MARC record structure that they are called “offspring” of the original LC MARC [1]. The differences have generally occurred in the content designation of national formats, which caused problems in the exchange of records among libraries. To resolve this incompatibility among national formats, IFLA developed UNIMARC, so that data generated by one country using its own format could be translated into UNIMARC and imported into the local system of another. However, like USMARC, UNIMARC was not well suited to processing non–Roman script data. Therefore, the need for developing national formats continued to be felt, especially in countries using non–Roman scripts. As a result, such national formats as JAPAN/MARC, KORMARC, and Chinese MARC were developed. Some other countries, including the Arabian Gulf countries, are also investigating the possibility of developing their own formats, such as ARABMARC.

Arabic Language and Script

Arabic is the official language of all of the member countries of the Arab League. The Arab League comprises 21 countries, from Yemen in the Middle East to Morocco in North Africa. In addition, Arabic is written and spoken in the Muslim world from Indonesia in the Far East to Morocco in North Africa. The script of some other languages, such as Urdu and Persian, is also Arabic. While Arabic books may be found in libraries all over the world, their major concentration is in libraries in the Middle East and North Africa. Arab countries generally take pride in their language and prefer to see Arabic data in their own script. By contrast, western libraries like to provide non–Roman data, including Arabic data, in Romanized form in regular fields of the USMARC format, which has “value in a network where the majority of the users are at terminals without the capability to display the non–Roman scripts. And many of them have no need for this display” [2]. This is the main reason why many Arab libraries do not subscribe to Romanized MARC records of Arabic materials. Even in the card catalog era, these libraries did not subscribe to LC catalog cards, as this would have required them to replace Romanized headings with the headings in Arabic, to be able to file cards in a separate all–Arabic catalog. Some libraries might even replace LC subject headings with the headings from local standard lists, such as Ibrahim A. El–Khazindar’s List of Arabic Subject Headings [3] or Nasser M. Swaydan’s Arabic Subject Headings [4]. The cost in time and effort of replacing Romanized headings with Arabic headings forced many libraries to prepare card sets locally. The amount of original cataloging, therefore, has always been higher in Arab libraries than in libraries elsewhere in the world.

Arabization

When Arab libraries began to implement automation in the early 1980s, they were faced with yet another problem of how to input and display data in Arabic script. Unfortunately, instead of working together to find a solution to this problem, libraries took separate initiatives by deciding to acquire software which, although it might not support Arabic script directly, could support it through modification (Arabization). In the Arabian Gulf Region alone, five libraries worked separately to Arabize the software they were using for processing non–Arabic materials — and three of these libraries (King Saud University, King Fahd University of Petroleum and Minerals, and the Institute of Public Administration) Arabized the same software (DOBIS/LIBIS).

The pioneering effort in Arabizing library application software was made by the Arab League Documentation Centre (ALDOC), when it developed an Arabized version of MINISIS in Tunis in 1982. MINISIS soon became very popular in the Arab world because of its Arabic support and because it is inexpensive. King Fahd National Library and King Faisal Center for Islamic Studies and Research, both located in Riyadh, Saudi Arabia, are two of its major users. The National Scientific and Technical Information Center (NSTIC) of the Kuwait Institute for Scientific Research (KISR) was next to develop the Arabic version of STAIRS, in 1983, by translating the panels, commands, and operands, and increasing the character set to include Arabic characters. The system was based upon the USMARC format, LC rule interpretations, and the List of Arabic Subject Headings by Ibrahim A. El–Khazindar [5]. Following the liberation of Kuwait in 1992, NSTIC re–started automation with VTLS, the Arabic version of which is being used for creating catalog data in Arabic.

King Saud University, Riyadh, Saudi Arabia, initiated a project of Arabizing DOBIS/LIBIS in 1984. They translated all code tables and maps in DOBIS/LIBIS and added Arabic as one of the dialog languages for conversation. Instead of creating a second version of the system just for Arabic, they chose to stay with one system and one common database for Latin and Arabic data, to facilitate future maintenance and simplify daily operations [6].

The King Fahd University of Petroleum and Minerals Library, Dhahran, Saudi Arabia, took a different approach in Arabizing DOBIS/LIBIS in 1987. Instead of integrating Arabic and non–Arabic data in one system file, KFUPM created a separate local file for storing Arabic data. As such, a non–Arabic database existed at the system level, and the Arabic database at the local level, with a single document number controlling data elements from both Arabic and non–Arabic files. The biggest advantage of this structure is that additional access to an Arabic document could be provided from one or more system (non–Arabic) files, such as the subject file containing LC subject headings. This type of access is particularly useful for bilingual documents with title pages in both English and Arabic.

The third institution, which started the Arabization of DOBIS/LIBIS in 1986, was the Institute of Public Administration (IPA) in Riyadh, Saudi Arabia. “However, due to the lack of a strong data processing support, IPA dropped the Arabization plan and decided instead to create an original program written in COBOL language covering the various functions related to the Arabic collection” [7]. This system developed in–house is known as Ibn Al–Nadeem.

During the last five years, some renowned vendors have also entered the Arabian Gulf marketplace by offering fully Arabized systems in cooperation with their local representatives or distributors. For example, Arabian Advanced Systems, the local representative of Ameritech Library Services, has introduced Al–Ufuq, the Arabic version of Horizon.

This is a fully integrated library automation system including modules for OPAC, cataloging, circulation, serials control, and acquisitions. The code set implemented in Al–Ufuq is the IBM code page 864 (ASCII in the lower half and a modified ASMO 449 in the upper half of the code set), allowing the handling of Arabic and English bibliographic data … . [8]

The system has a user base of 12 academic and special libraries in the area. The other system is VTLS, which is also fully Arabized and has installations in Kuwait and the United Arab Emirates. A more recent addition is Unicorn, which is currently being Arabized by a Riyadh–based company, Systems & Communications House, a local representative of SIRSI Corporation.

Following the Arabization of commercial software, libraries were able to achieve a major breakthrough in creating Arabic script databases. However, it soon became clear that their record formats were seriously incompatible, which would inhibit any exchange of records among libraries, as well as the creation of a national or regional bibliographic network. Even two libraries running the same system could not exchange Arabic records. This was the time when libraries started to think seriously about the need for an ARABMARC format, which, if fully developed, would prompt vendors to offer support in their systems.

ARABMARC

The first major effort for the development of ARABMARC or Arabic MARC was made by the Saudi Arabian National Center for Science and Technology (now called King Abdulaziz City for Science and Technology, or KACST). KACST was established in Riyadh by Royal decree on 27 November 1977, with one of its objectives being to “conduct applied scientific research programs in the fields that serve the economic and social development objectives of the Kingdom” [9]. KACST decided to adopt computerization in support of its national research programs, and the development of ARABMARC was undertaken as part of the computerization project, with the appointment of an Arabic MARC Development and Implementation Committee in May 1984. The task assigned to the Committee [10] was to review the technical feasibility of

developing an Arabic script MARC record
making online entry of Arabic MARC records into a database running under SCANCT retrieval software
generating an output tape in the MARC format based upon records in the Arabic script database
determining — if the above steps are feasible — the next steps necessary to meet the objectives in the shortest possible time

The task on Arabic MARC, which KACST initiated as an internal project, later gained national interest when the Council of Deans of Library Affairs in Saudi Arabia formally assigned the project to KACST.

The Arabic MARC Committee began its work by setting the following conditions for the format.

It should include all fields for representing bibliographic information for books. The support for other forms of materials could be provided later. This format should form the basis for developing a shared database for Saudi Arabia, Gulf Region, or Arab countries.
All library systems should conform to this format so that exchange of records between different systems would be possible.
Any union catalog project to be started in the region should support the Arabic MARC format.
The format should be compatible with AACR2, standard classification systems, and subject headings lists.

Following a review of the existing MARC formats, such as USMARC, UKMARC, CANMARC, AUSMARC, Chinese MARC, and UNIMARC, the Committee considered the following two options:

Adopt one of the existing formats and translate it into Arabic.
Develop a completely new format, taking into consideration the requirements for Arabic materials.

They did not choose the first option, because the existing formats do not fully support the requirements for Arabic materials. The second option was also not found viable, because it would have been very time consuming and expensive. The Committee then decided to take a middle approach, taking full advantage of the existing formats to prepare a new format, and then modifying it for Arabic materials. For example, 008/06 would be modified to represent the Hijrah (Islamic calendar) date. The Committee also decided to translate into Arabic the labels of all USMARC fields.

An outline of the Arabic MARC Format for Books was presented to the Council at its annual meeting in 1986 for their review. Comments and suggestions received from the Council were incorporated, and a revised draft format was to be presented to the Council for their comments later. KACST hired the services of a highly qualified Saudi librarian with teaching experience to write detailed descriptions of the various fields, with explanatory notes and easily understandable examples [11]. Unfortunately, after this initial work, the project was not continued.

Arabic Script on RLIN

An important development took place in 1991, when the Research Libraries Group (RLG), with a grant from the Kuwait Foundation for the Advancement of Sciences, added support for Arabic script to its automated bibliographic system, the Research Libraries Information Network (RLIN). The following are the major features of the RLIN system [12].

Arabic script support on RLIN is not limited to the Arabic language alone, but extends to other languages written in that script, such as Persian, Ottoman Turkish, and Urdu.
RLIN adopted ASMO 449, the standard for the encoding of Arabic script characters for bibliographic exchange, and augmented it with some additional characters for other Arabic script languages.
RLIN follows USMARC specifications. The Romanized data are given in a regular field and their alternative graphic representation (Arabic data) in an 880 field. A field containing Arabic script data is linked to its Romanized equivalent through subfield 6, which contains the tag of the associated field.
In keeping with AACR2 rule 1.0E, RLIN provides Arabic data in the following fields, called core fields: 245, 250, 260, and 4XX. However, the RLIN application allows Arabic data to be included in all variable fields of a record.
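The linkage between a Romanized field and its 880 counterpart can be sketched in a few lines of Python. This is a minimal illustration, not RLIN's implementation: the in-memory record layout, the function names, and the sample title and place data are all hypothetical. It also assumes the commonly documented USMARC convention in which subfield 6 carries a two-digit occurrence number after the linked tag (for example, 880-01 on the regular field and 245-01 on the 880 field), a detail the description above does not spell out.

```python
import re

# Hypothetical in-memory record: a list of (tag, subfields) pairs, where each
# subfield is a (code, value) tuple. The sample data are illustrative only.
record = [
    ("245", [("6", "880-01"), ("a", "Kitab al-fihrist")]),
    ("260", [("6", "880-02"), ("a", "al-Qahirah")]),
    ("880", [("6", "245-01"), ("a", "كتاب الفهرست")]),
    ("880", [("6", "260-02"), ("a", "القاهرة")]),
]

# Assumed subfield 6 layout: linked tag, hyphen, two-digit occurrence number.
LINK = re.compile(r"^(\d{3})-(\d{2})")


def parse_link(subfields):
    """Return (linked_tag, occurrence) from subfield 6, or None if absent."""
    for code, value in subfields:
        if code == "6":
            match = LINK.match(value)
            if match:
                return match.group(1), match.group(2)
    return None


def first_value(subfields, wanted="a"):
    """Return the value of the first subfield with the wanted code."""
    for code, value in subfields:
        if code == wanted:
            return value
    return None


def pair_scripts(record):
    """Map (regular_tag, occurrence) to (romanized, arabic) subfield a values."""
    romanized, arabic = {}, {}
    for tag, subfields in record:
        link = parse_link(subfields)
        if link is None:
            continue
        linked_tag, occurrence = link
        if tag == "880":
            # The 880 field points back at the regular field's tag.
            arabic[(linked_tag, occurrence)] = first_value(subfields)
        elif linked_tag == "880":
            # The regular field points forward at an 880 field.
            romanized[(tag, occurrence)] = first_value(subfields)
    return {key: (romanized[key], arabic.get(key)) for key in romanized}


if __name__ == "__main__":
    for (tag, occurrence), (roman, arab) in sorted(pair_scripts(record).items()):
        print(tag, occurrence, roman, "|", arab)
```

The sketch keeps the Romanized field as the primary key and treats the Arabic text as an attachment to it, which mirrors the arrangement described above and is the arrangement Arab libraries object to.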

The Arabic script on RLIN does provide the opportunity for libraries to enter, display, and output data in Arabic script. However, a number of factors have led to the slow growth of Arabic records in the RLIN system and their lack of acceptance by Arab libraries:

Most online catalogs in American libraries are limited to Latin script. Therefore, libraries are not inclined to add parallel Arabic script fields to existing Romanized records.
Libraries consider the requirement for parallel core fields as “double keying”, which adds to the cost of cataloging.
The handling of Arabic script data in alternative graphic representation fields rather than in main fields is not acceptable to Arab libraries.

The Arabic script on RLIN is not ARABMARC. It is “intended solely to add the capability to create, store, retrieve, and display Arabic characters in our bibliographic network” [13]. Its primary users are non–Arabic speakers for whom the Romanized data are more important than the Arabic data. “Each library makes its own policy decision whether to include non–Roman data in its records, and, if so, whether to include non–Roman access points. RLG cataloging standards do not mandate the use of RLIN’s non–Roman capabilities” [14].

Having realized that Arabic script on RLIN is not a standard that can be followed in creating or exchanging Arabic records, Arab libraries renewed their interest in the development of ARABMARC in a Workshop on Arabic Online Cataloging Network, held in Cairo, in 1992. It was followed by another workshop in Al–Ain, United Arab Emirates, in 1993. One of the recommendations of the 1993 workshop was to appoint a Technical Committee, made up of representatives from Saudi and Kuwaiti organizations, to prepare additional recommendations on technical issues related to MARC and Arabic data [15]. The Dean of Library Affairs at KFUPM, who was appointed chairman of the Committee, initiated a study on ARABMARC at KFUPM. The result of this study was presented in the form of a paper at the Second Special Libraries Association/Arabian Gulf Chapter Conference in Bahrain, in 1994 [16]. The paper discussed, among other things, the characteristics of Arabic letters and their shapes, ASMO 449, the standard Arabic character set codes, and the status of Arabic support in various library automation systems in operation in the Kingdom of Saudi Arabia, including DOBIS/LIBIS and MINISIS. It also included a critical review of Arabic script on RLIN. Finally, the paper sought answers to a number of questions which would provide the framework for the development of ARABMARC. As a follow–up of this study, it was recommended that a number of tasks be achieved through the formation of various subcommittees comprising representatives from various libraries in the region. The main tasks would be to achieve consensus on such issues as format structure, character set, cataloging standards, and hardware and software options, which are directly related to ARABMARC. Unfortunately, no national or regional agency has come forward yet to sponsor and provide funding for the project. The only positive outcome of the KFUPM study is that it generated more interest in ARABMARC among the library community in the region. This subject is now being discussed in most professional meetings.

The Arab Bureau of Education for the Gulf States Initiative

The deans and directors of libraries in the Arabian Gulf Region held a meeting in Kuwait in 1995 under the auspices of the Arab Bureau of Education for the Gulf States (ABEG) to discuss ARABMARC, among other issues. Having acknowledged the role of the KFUPM Library in doing earlier work on ARABMARC, the meeting decided to ask the KFUPM Library to submit a proposal on the development of ARABMARC so that it can be resubmitted to concerned authorities for their support.

The KFUPM Proposal

The KFUPM proposal on ARABMARC was prepared following a review of various national formats and the work done earlier by various national and regional agencies, as well as by individual libraries in the Arabian Gulf Region. The proposal identifies a number of points to be addressed, defined, and followed before and during the course of developing ARABMARC, such as scope and structure of the format, content designation, character set, cataloging standards, code lists, and hardware and software to handle Arabic, bilingual, and multilingual data. Following a brief explanation of each of these points, the proposal includes a clear line of action or a preferred approach to be taken. For example, ARABMARC should be based on ISO 2709, and conform to ASCII and ASMO 449 character set codes (Unicode, a universal character set, should also be considered), and USMARC specifications for record structure, etc. The idea is to guide the prospective groups or committees on ARABMARC in the right direction. The KFUPM guidelines are, however, flexible, not rigid. The proposal also includes an action plan detailing various tasks, such as designating a national or regional agency to sponsor the project, preparing a team of resource persons, forming sub-committees or groups for each task, preparing documentation, providing instruction and training, and maintaining ARABMARC. A timeframe of up to two years is proposed for the first phase of the project of developing the ARABMARC format for books. An initial amount of SR 500,000 is estimated to support all activities of this phase.
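To make the reference to ISO 2709 concrete, the following Python sketch builds and then re-reads a record in the ISO 2709 layout as USMARC applies it: a 24-character leader, a directory of 12-character entries (3-character tag, 4-digit field length, 5-digit starting position), and control characters that terminate fields and records. It is only an illustration under stated assumptions. The function names and sample fields are hypothetical, the leader is filled with placeholder values, and the Arabic text is encoded as UTF-8 purely for convenience; the proposal itself names ASCII and ASMO 449, with Unicode as an option to consider.

```python
FT = b"\x1e"  # field terminator
RT = b"\x1d"  # record terminator
SF = b"\x1f"  # subfield delimiter


def build_record(fields):
    """Serialize (tag, body_bytes) pairs into a minimal ISO 2709-style record."""
    directory = b""
    data = b""
    for tag, body in fields:
        body = body + FT
        # Directory entry: tag (3) + field length (4) + start position (5).
        directory += f"{tag}{len(body):04d}{len(data):05d}".encode("ascii")
        data += body
    base = 24 + len(directory) + 1   # leader + directory + its terminator
    total = base + len(data) + 1     # plus the record terminator
    leader = (
        f"{total:05d}"   # 00-04 record length
        + "nam"          # 05-07 placeholder status, type, and level codes
        + "  "           # 08-09 left blank in this sketch
        + "22"           # 10-11 indicator count, subfield code length
        + f"{base:05d}"  # 12-16 base address of data
        + "   "          # 17-19 left blank in this sketch
        + "4500"         # 20-23 directory entry map
    ).encode("ascii")
    return leader + directory + FT + data + RT


def parse_record(raw):
    """Recover the (tag, body_bytes) pairs by walking the directory."""
    base = int(raw[12:17])
    directory = raw[24:base - 1]  # exclude the directory's own terminator
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        tag = entry[:3].decode("ascii")
        length = int(entry[3:7])
        start = int(entry[7:12])
        body = raw[base + start:base + start + length - 1]  # drop terminator
        fields.append((tag, body))
    return fields


if __name__ == "__main__":
    sample = [
        ("001", b"demo0001"),
        ("245", b"10" + SF + b"a" + "كتاب".encode("utf-8")),
    ]
    raw = build_record(sample)
    for tag, body in parse_record(raw):
        print(tag, body)
```

Because the directory counts bytes rather than characters, the choice of character set directly affects field lengths and offsets, which is one reason the character set has to be settled before records can be exchanged.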

The KFUPM proposal has been circulated by ABEG among libraries for their review and comments. However, there is as yet no indication that the proposal will soon be discussed at a national or regional forum in order to prepare a future action plan.

Conclusion

The Arabian Gulf libraries have made several attempts during the last thirteen years to prepare the framework for developing the ARABMARC format. The most promising attempt was the work done by KACST in 1984. They were moving in the right direction until 1986, when an outline of the format with the Arabic translation of USMARC field labels was prepared and distributed to libraries for their review. Unfortunately, the project was abandoned the same year without making any further progress. The subsequent work done by some institutions also failed to make any headway, as no national or regional organization came forward to sponsor and provide funding for the project.

In the light of past experiences, wherein the committees and groups resulting from the recommendations of conferences, workshops, and meetings have disintegrated without accomplishing their assigned tasks, and following the examples of success of various institutions in developing national formats in other parts of the world, the Arab libraries need to create a permanent ARABMARC Office at one of the national or regional organizations or institutions, such as King Fahd National Library, Arab Bureau of Education for the Gulf States, King Abdulaziz City for Science and Technology, or Saudi Arabian Standards Organization, with full financial support from the Gulf Cooperation Council (GCC) countries. The ARABMARC Office should be entrusted with the responsibility of developing the ARABMARC format, as well as maintaining it, just as the USMARC format is developed and maintained by the Network Development and MARC Standards Office at the Library of Congress. The pioneer work done by KACST and KFUPM could provide a good foundation for the proposed ARABMARC Office in developing the format.

In spite of witnessing progress during the last several years, Arab libraries still have a long way to go to achieve their goal of developing an ARABMARC format.

References

[1] Betty W. Lee, “The MARC Formats: UK MARC vs. US MARC, UNIMARC and Chinese MARC,” HKLA Journal no.10 (1986): 27–34.

[2] John A. Eilts, “The Design of Arabic Script Support in a North American Database,” unpublished paper, p. 11.

[3] Ibrahim A. El–Khazindar, List of Arabic Subject Headings, 3rd ed. (Kuwait: Scientific Research House, Kuwait University, 1983).

[4] Nasser M. Swaydan, Arabic Subject Headings (Riyadh: Riyadh University Libraries, 1978).

[5] Farooq A. Khalid, “Automation in a Special Library in Kuwait,” Information Technology and Libraries 2 (December 1983): 351–63.

[6] L. Booth, et al., “Arabization of an Automated Library System,” in Information Security in Computers and Communications: Conference Proceedings, the Ninth National Computer Conference & Exhibition, 1986 (Riyadh: National Information Center, Ministry of Interior, 1986), v.2, pp. 10–40.

[7] Mohammad Saleh Ashoor, “Arabization of Automated Library Systems in the Arab World: The Need for Compatibility and Standardization,” Libri 39 (1989): 294–302.

[8] Elizabeth Vernon, Decision–Making for Automation: Hebrew and Arabic Script Materials in the Automated Library (Champaign, Ill.: Graduate School of Library and Information Science, University of Illinois at Urbana–Champaign, 1996), p. 34.

[9] Suhail Manzoor, “Saudi Arabian National Center for Science and Technology (SANCST) Database,” International Library Review 17 (1985): 77–90.

[10] SANCST Internal Memo, 1 May 1984.

[11] Report on the Projects Assigned to KACST, Presented to the Meeting of Council of Deans of Library Affairs, held at UPM on 25–26 Jumada I, 1407 (1987).

[12] Joan M. Aliprand, “Arabic Script on RLIN,” Library Hi Tech 10 (1992): 59–80.

[13] John A. Eilts to Husni Al–Muhtaseb: personal communication, 14 February 1994.

[14] Aliprand, “Arabic Script on RLIN,” p. 72.

[15] “Minutes and Summary of Meeting,” Workshop on Arabic Online Cataloging Network, United Arab Emirates University, 5–6 October 1993, pp. 8–9.

[16] Husni Al–Muhtaseb, M. Saleh Ashoor, and Zahiruddin Khurshid, “A Step Towards Arabic Machine–Readable Cataloging (ARABMARC),” paper presented at the Second Annual SLA/AGC Conference, Bahrain, 12–14 January 1994.

About the Author

Zahiruddin Khurshid is Senior Manager, Cataloging Operations Division, and Library Systems Analyst (Acting), King Fahd University of Petroleum and Minerals Library, Dhahran, Saudi Arabia.
