Cultural content
COVAX (Contemporary Culture Virtual Archive in XML) is an international
two-year project that began in January 2000 with the following objectives:
To build a web service for search and retrieval of European Contemporary
Culture descriptions and documents from Memory Institutions.
To make accessible over the Internet existing document descriptions
in Libraries, Archives and Museums.
To satisfy needs of Memory Institutions, regardless of their size
or document types, to provide access to their collections.
To implement standards and achieve inter-operability between systems.
COVAX partners have large and small collections described by existing
museum, library and archive records, including some full text records.
The project is assessing the future provision of improved access
to these collections by converting samples of existing data into
a small number of common structured formats, each of which can be
expressed as XML (eXtensible Markup Language).
Although it is already possible to conduct searches across distributed
systems using the Z39.50 protocol, this protocol cannot be used
directly from web browsers. Results from searches must then be converted
to HTML or other formats for integration with web services. This
restricts the range of discovery and learning contexts from which
the underlying collections can be accessed.
COVAX is exploring the idea that future collection records in libraries,
museums and archives will be stored in a variety of XML formats
instead of proprietary formats, or formats such as MARC which have
not been widely adopted outside their developer sectors. Since much
material is described already in machine-readable form, we need
to develop widely applicable methods and approaches for converting
it to XML and integrating it with native XML data for building user-friendly
websites and data feeds.
Partner Profile: LASER
COVAX includes partners with academic library, public library, museum
and archive resources. Future newsletters will describe data from
other partners. Here we outline one of the main bibliographic data
contributions, from London and South Eastern Library Region (LASER)
in the UK. LASER is the networking agency for libraries and organisations
connected with information and document provision. Since the 1920s
LASER and its predecessors have offered an extensive range of resource
sharing services, consultancy, advice, research and development
and training opportunities.
V3.Online is a networked service for resource sharing and interlending
requests. It is the only fully integrated search, message and loan management
service in the United Kingdom. It offers access to over 4 million
bibliographic records in UKMARC format to LASER members, together
with 17.5 million holding locations.
The V3 database covers mainly catalogue records for printed monographs
that are held in public libraries of all the local authorities in
LASER's areas, including London, the South East and West Midlands
of England and selected authorities in Wales. The geographic and
subject scope of LASER's collections is unlimited: they
include anything that appears in the British National Bibliography
or has been purchased by at least one member library.
V3.Online is used by interloans staff in public libraries to conduct
interlending transactions on behalf of library users. Use of the
transaction management system requires training and support. Access
to the V3 catalogue would also be useful to a wider range of people,
including other staff, library staff in institutions other than
public libraries, and the general public. For this purpose, a web
interface is being introduced and a Z39.50 target is being established.
Because of a period of major enhancements to V3 services during
2000, and because of the need to limit content to the field of culture,
LASER is exporting a subset of its bibliographic data for use in
COVAX. This will allow it to be searched either separately or together
with data from other COVAX partners. Three levels of sample data
have been planned. A pilot sample has been used to test handling
and conversion software and to validate standards used. A sample
of 5,000 records is being prepared to test the COVAX middleware
and user delivery systems in a demonstrator. Finally, a larger sample
will be prepared if required to contribute to a second prototype,
particularly to ensure that bulk handling of records and updating
are feasible.
COVAX's cross-domain agenda is very relevant to LASER, which
is to change its status in October 2001, reflecting new regional
government boundaries and structures being introduced in England
for cultural services. LASER's website is at http://www.viscount.org.uk/
MARC everywhere
Each COVAX partner must create, or already have, a database in native
XML. Data for this database must be mapped from existing structures.
Led by the Residencia de Estudiantes (Spain), COVAX has surveyed
in depth the content systems and formats used by partners. Aspects
included software used, number of records, original formats, subject
scope, geographical scope, language scope and availability of electronic
records. A study of searchable fields used by partners has also
been completed, to assist in the identification of common practice
and the selection of COVAX searchable fields.
The studies found that partners use five different MARC formats
and three non-standard formats. These
formats are used regardless of the format of the underlying resources.
In addition, ISO 8859-1 is a very widely used character set standard.
COVAX concluded that bibliographic data will use USMARC (MARC21),
with available tools used where necessary to convert records
to this format before the final conversion to XML.
Other formats
Partners are supplying poster collection data, including images,
managed in an MS Access database; multimedia XML data based on a
proprietary Document Type Definition (DTD); and records from locally
developed MS Access, Lotus Notes and FileMaker databases. These
will be converted to one of the appropriate XML formats using existing
DTDs for archives (EAD), museums (AMICO) and electronic texts (TEI Lite).
The DTDs have to be interpreted by the technical team to produce
XML schemas and schemas for the particular local XML repository
software used at each site, such as Tamino and TeXtML. A large aggregate
COVAX DTD, which incorporates the other DTDs such as MARC, is being
used and developed to standardise data for the project. The wider community
is developing most of these DTDs as the project proceeds, and the COVAX
team uses new versions as appropriate.
Conversions
We have explored a range of methods to convert existing data, depending
on whether existing databases are large and MARC based (typically
university or public library management systems) or smaller and
proprietary. Partners are also taking two approaches to providing
data for inclusion: Adaptation and Migration. Adaptation involves
changing existing systems to deliver suitable COVAX standard XML
interfaces. Migration involves exporting existing records, converting
them and storing them as COVAX compliant XML. A range of software
components and tools have been used to handle COVAX data, including
XML Spy and USEMARCON, as well as exporting software within library
systems such as V3 and Aleph. In Barcelona the Universitat Oberta
de Catalunya is testing a CATMARC-USMARC VTLS Conversion Program.
A report on the State of the Art of XML tools suitable for COVAX
has been made public via the COVAX website at www.covax.org.
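As a rough illustration of the Migration approach described above, the following Python sketch converts a CSV export from a small local database into XML. The column names, the `records` and `record` wrapper elements and the `migrate_export` function are hypothetical and are not taken from any COVAX DTD; the ISO 8859-1 input encoding is assumed only because the partner survey found it widely used.

```python
import csv
import io
import xml.etree.ElementTree as ET


def migrate_export(raw: bytes, encoding: str = "iso-8859-1") -> bytes:
    """Convert a CSV export (one row per record) into UTF-8 encoded XML.

    Assumes the CSV column names are valid XML element names.
    """
    text = raw.decode(encoding)  # decode the legacy character set first
    root = ET.Element("records")  # hypothetical wrapper element
    for row in csv.DictReader(io.StringIO(text)):
        record = ET.SubElement(root, "record")
        for field, value in row.items():
            if value:  # skip empty cells rather than emit empty elements
                ET.SubElement(record, field).text = value
    return ET.tostring(root, encoding="utf-8")


# Hypothetical two-column export from a small poster database.
export = "title,creator\nExposición de otoño,Anónimo\n".encode("iso-8859-1")
xml_bytes = migrate_export(export)
print(xml_bytes)  # the bytes repr shows accented characters encoded as UTF-8
```

A real migration would map each source field to an element defined in the target DTD rather than reusing column names directly.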
XML and MARC
Conversion of MARC data formats to MARC21 enables us to use freely
available tools from the Library of Congress and elsewhere to convert
records to XML (for example http://lcweb.loc.gov/marc/marcsgml.html).
In particular Software AG in Spain has adapted the SGML MARC21 DTD
to XML for COVAX as well as testing the newer LoC MARC XML DTD.
MARC-XML conversion utilities have been successfully tested on COVAX
sample data, although the tests raised a number of issues surrounding
character encoding, use with Dublin Core and batch transfers of multiple records.
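To make the MARC to XML step more concrete, here is a minimal Python sketch that serialises an already parsed MARC21 record as XML. It is not the Library of Congress tool or the Software AG DTD adaptation: the element names (`record`, `leader`, `controlfield`, `datafield`, `subfield`) are an assumption modelled on the naming later used in the MARCXML schema, and the sample record is invented for illustration.

```python
import xml.etree.ElementTree as ET


def marc_to_xml(leader, control_fields, data_fields):
    """Build an XML element tree from the parts of one MARC21 record.

    control_fields: list of (tag, value) pairs
    data_fields: list of (tag, ind1, ind2, [(subfield_code, value), ...])
    """
    record = ET.Element("record")
    ET.SubElement(record, "leader").text = leader
    for tag, value in control_fields:
        ET.SubElement(record, "controlfield", tag=tag).text = value
    for tag, ind1, ind2, subfields in data_fields:
        field = ET.SubElement(record, "datafield", tag=tag, ind1=ind1, ind2=ind2)
        for code, value in subfields:
            ET.SubElement(field, "subfield", code=code).text = value
    return record


# Invented sample record: 001 is the control number, 100 the main entry
# (personal name) and 245 the title statement in MARC21.
sample = marc_to_xml(
    leader="00000nam a2200000 a 4500",
    control_fields=[("001", "example-0001")],
    data_fields=[
        ("100", "1", " ", [("a", "Author, A. N.")]),
        ("245", "1", "0", [("a", "An example title /"), ("c", "A. N. Author.")]),
    ],
)
print(ET.tostring(sample, encoding="unicode"))
```

Reading the binary MARC transmission format and handling its character sets are left out here; those are the parts where the encoding issues mentioned above would arise.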
System Architecture
COVAX is designed to demonstrate the feasibility of providing access
to distributed resources held in both XML and non-XML formats. It
will provide a public demonstrator website offering simple and advanced
search of e-collections and resource descriptions within them. Users
will be able to retrieve all types of resource as a result of a
single search. Results will be formatted by middleware using stylesheets
once they have been retrieved from the distributed repositories via XML
queries, although exact standards for these are not yet fixed (see
http://www.w3.org/XML/). These queries will be modelled on the Z39.50 protocol, although
the syntax encoding for transmissions will be different, using XER
(XML Encoding Rules) instead of BER (Basic Encoding Rules, as used
for Z39.50 implementations to date). This will make the current
distinctions between searching for web documents and searching databases
less significant, so that searching is more comprehensive and browsing
more effective. COVAX is also considering how data such as holdings
information can be maintained through updates from source legacy
systems.
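The single-search idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the three repositories are in-memory XML strings with invented records, the `title` element and all function names are hypothetical, and a plain substring match stands in for the XML queries modelled on Z39.50 that the real middleware would send over the network.

```python
import xml.etree.ElementTree as ET

# Invented sample data standing in for three distributed XML repositories.
REPOSITORIES = {
    "library": "<records>"
               "<record><title>Poster art in Madrid</title></record>"
               "<record><title>History of printing</title></record>"
               "</records>",
    "museum": "<records>"
              "<record><title>Exhibition poster, 1925</title></record>"
              "</records>",
    "archive": "<records>"
               "<record><title>Letters of a publisher</title></record>"
               "</records>",
}


def search_repository(name, xml_text, term):
    """Return the records in one repository whose title contains the term."""
    hits = []
    for record in ET.fromstring(xml_text).iter("record"):
        title = record.findtext("title", default="")
        if term.lower() in title.lower():
            hits.append({"source": name, "title": title})
    return hits


def federated_search(term):
    """Query every repository and merge the hits into one sorted list."""
    results = []
    for name, xml_text in REPOSITORIES.items():
        results.extend(search_repository(name, xml_text, term))
    return sorted(results, key=lambda hit: hit["title"].lower())


def format_results(results):
    """Plain-text stand-in for the stylesheet formatting step."""
    return "\n".join(f"{hit['title']} [{hit['source']}]" for hit in results)


print(format_results(federated_search("poster")))
```

The user sees one merged result list regardless of which type of institution holds each record, which is the behaviour the demonstrator aims for.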
Repositories
COVAX partners are currently implementing XML repositories using
two software packages, Tamino from Software AG, a COVAX technical
partner (http://www.softwareag.com/taminoplatform/)
and TeXtML from IXIAsoft (http://www.textmlserver.com/).
Sites are being established in London, Rome, Salzburg, Graz and
Madrid.
More about COVAX
A website has been established in Catalan, English, German, Italian,
Spanish, and Swedish at http://www.covax.org.
Carlos Wert, the project Coordinator, and Francisca Hernández
have published an introductory article on COVAX in Cultivate Interactive,
Issue 3, January 2001 (http://www.cultivate-int.org/issue3/covax/).
Contacts
Project Coordinator
Carlos Wert, Residencia de Estudiantes, Pinar 23, 28006-Madrid, Spain
Email: ile@interlink.es
URL: http://www.covax.org/
Dissemination
Robin Yeates
Associate Director
LITC, South Bank University
Postal address: 103 Borough Rd, London SE1 0AA United Kingdom
Tel: +44 (020) 7815 6924
Fax: +44 (020) 7815 7050
email: yeatesrb@sbu.ac.uk
URL: http://www.sbu.ac.uk/litc
More information
If you have any questions about COVAX or would like to receive more information about activities and news, please click
here