Catalog - Mainframe Programming : muraleedharan.com


By Thierry Falissard (France)

Catalogs are used by MVS to locate datasets when a task attempts to allocate them without supplying their volume serial number; they hold records of the volume(s) on which each cataloged dataset exists. All allocations of VSAM datasets and SMS-managed datasets must go through the relevant catalog, and in this case the catalog also holds other information - for VSAM datasets, for example, this includes the physical location of each extent of the dataset on the disk, DCB-type information, and much more (this information is held in the VTOC entry, also known as the DSCB, for non-VSAM DASD datasets). This section looks briefly at the logical and physical structure of catalogs and how the catalog management process works.

Over the course of MVS's history, catalogs have gone through a number of different structures. The current flavour is known as ICF (Integrated Catalog Facility), and I shall concentrate on ICF catalogs here, though I will make a few comments on the earlier flavours in passing (the predecessors to ICF catalogs were known as VSAM catalogs, and the previous generation as CVOLs).

Logical Catalog Structure

Each MVS system must have a master catalog, and there will usually also be a number of lower-level catalogs known as user catalogs. All catalog searches begin by searching the master catalog, and many major system datasets must be cataloged in it.

Prior to MVS Version 4, the master catalog was described in the SYSCATxx member of SYS1.NUCLEUS, where the suffix xx was specified in response to the message "IEA347A SPECIFY MASTER CATALOG PARAMETER" at IPL time, and defaulted to "LG". This is still a valid option on MVS Version 4 systems, but it is also now possible (and simpler) to specify the master catalog parameters in the LOADxx member of SYS1.PARMLIB. The use of LOADxx is discussed in more detail in chapter 7. MVS obtains the name of the master catalog and the VOLSER of the disk it is on from the SYSCATxx or LOADxx member, and then opens the master catalog, which it uses to find the other datasets it requires for the early stages of the IPL process. At this stage, the master catalog is the only catalog which is open and usable by the system, which is why any system datasets opened during the IPL process must be cataloged in it. Later in the IPL process, catalog management services are fully initialised, allowing the use of user catalogs.

The physical structure of the master catalog is identical to that of a user catalog - they are defined with the same IDCAMS command, and any catalog can be used as either a master catalog or a user catalog. So it is only the selection of a particular catalog at IPL time that identifies it as the master catalog for the MVS system concerned.

However, once you have established which catalog is your master catalog, it is necessary to define its relationships with its user catalogs, and in practice this will make the entries in a master catalog very different from those in a user catalog. Whenever MVS initiates a catalog search in order to locate a dataset, it starts by looking at the master catalog. The dataset to be located may be cataloged in the master catalog, but more commonly it will be cataloged in a user catalog. If the search process is to find it, that user catalog must be defined as a user catalog in the master catalog, and there must be an "alias" entry in the master catalog which relates the high-level qualifier of the dataset's name to the user catalog. When catalog management finds such an alias entry, it interprets this as an instruction to look in the specified user catalog for the catalog entries of all datasets with this high-level qualifier.

Thus there is a two-level hierarchy of catalogs, with most user datasets cataloged in the user catalogs, and the master catalog containing alias entries pointing to the user catalogs, plus catalog entries for system datasets required at IPL time. This is illustrated in Figure 3.8 below. It is generally the systems programmer's responsibility to design and enforce this hierarchy.
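The search order described above can be sketched in a few lines of Python. This is only an illustrative model, not how catalog management is implemented: the Catalog class, the locate function, and all catalog, dataset, and volume names are hypothetical, and only single-level aliases are modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Catalog:
    """Toy model of a catalog (hypothetical structure, for illustration only)."""
    name: str
    datasets: dict = field(default_factory=dict)  # dataset name -> volser
    aliases: dict = field(default_factory=dict)   # HLQ -> user catalog name (master only)


def locate(dsname, master, user_catalogs):
    """Return (catalog name, volser) for a cataloged dataset, or None if not found."""
    hlq = dsname.split(".", 1)[0]
    # Every search starts in the master catalog: is there an alias for this HLQ?
    target = master.aliases.get(hlq)
    # An alias redirects the search to the user catalog; otherwise stay in the master.
    catalog = user_catalogs[target] if target is not None else master
    volser = catalog.datasets.get(dsname)
    if volser is None:
        return None
    return catalog.name, volser


# Made-up example data: one system dataset in the master, one alias to a user catalog.
master = Catalog(
    name="CATALOG.MASTER",
    datasets={"SYS1.LINKLIB": "SYSRES"},
    aliases={"PAYROLL": "CATALOG.USER1"},
)
# This dict stands in for the USERCATALOG entries defined in the master catalog.
user_catalogs = {
    "CATALOG.USER1": Catalog("CATALOG.USER1", {"PAYROLL.MASTER.FILE": "USR001"}),
}

print(locate("PAYROLL.MASTER.FILE", master, user_catalogs))
print(locate("SYS1.LINKLIB", master, user_catalogs))
```

In this model a dataset whose high-level qualifier has an alias is looked for only in the user catalog the alias points to, which is why a missing or incorrect alias makes an otherwise correctly cataloged dataset impossible to find.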

Physical Catalog Structure

Physically, the ICF catalog structure consists of two types of datasets: the BCS (Basic Catalog Structure) and the VVDS (VSAM Volume Data Set). When we define a master catalog or a user catalog (using IDCAMS), we are creating a BCS. The BCS is itself a VSAM KSDS, whose cluster name is 44 bytes of binary zeros. Because the cluster name is used as the key to the dataset, this ensures that the first entry in the data component of the BCS is always its own self-describing entry. The name of the data component is the name you assign in your IDCAMS DEFINE command, and the name of the index component, assigned by IDCAMS, always begins "CATINDEX" and continues with a timestamp.
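The reason a key of binary zeros always comes first can be shown with a small Python sketch. It assumes, purely for illustration, that dataset names are held as EBCDIC (code page 037) padded with EBCDIC blanks to 44 bytes; the bcs_key helper and the sample names are hypothetical and do not reflect the real BCS record layout.

```python
KEY_LENGTH = 44
EBCDIC_BLANK = b"\x40"


def bcs_key(dsname: str) -> bytes:
    """Encode a dataset name as a 44-byte EBCDIC key (assumed layout, see lead-in)."""
    return dsname.encode("cp037").ljust(KEY_LENGTH, EBCDIC_BLANK)


# The self-describing entry is keyed by 44 bytes of binary zeros.
SELF_DESCRIBING_KEY = bytes(KEY_LENGTH)

keys = [bcs_key(name) for name in ("SYS1.LINKLIB", "A.B", "PAYROLL.MASTER.FILE")]
keys.append(SELF_DESCRIBING_KEY)
keys.sort()  # a KSDS keeps its records in ascending key sequence

# No byte value collates lower than X'00', so this key always sorts first.
print(keys[0] == SELF_DESCRIBING_KEY)
```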

The BCS contains entries of various types, such as ALIAS, NONVSAM, USERCATALOG, and CLUSTER, describing the various types of entity which may be searched for by catalog management:

* ALIAS entries were discussed in the previous section, and redirect a catalog search from the master catalog to a user catalog for a given high-level qualifier. Note that DFP Version 3 permits multi-level alias structures, but these are still uncommon.

* USERCATALOG entries define user catalogs

* NONVSAM entries describe non-VSAM datasets, and for non-SMS managed datasets include simply their name and the device type and volume serial number of the volume on which they are cataloged - further information, e.g. on the DCB and physical location of the dataset on the volume, is found in the VTOC (Volume Table of Contents) for DASD datasets or in the dataset labels for tape datasets.

* CLUSTER entries describe VSAM datasets, and point in turn to DATA entries describing the data component of the cluster, and (for KSDS's) to INDEX entries describing the index component. This is where ICF catalogs differ most markedly from their predecessors.

In an ICF catalog, the entry in a BCS for a physical component of a VSAM cluster works in a similar way to a NONVSAM entry. The BCS entry contains only minimal information about the component, such as the name, the device type and the volume serial number. All the physical details, such as location of extents, CISIZE, etc, are held on the same volume as the dataset itself, to simplify recovery and space management. For VSAM components, however, this information is not held in the VTOC entry. Instead it is held in the second component of the ICF catalog structure - the VSAM Volume Data Set (VVDS).

The VVDS is a special type of ESDS. It is created automatically whenever a VSAM component (including a BCS) is allocated on a volume which does not yet have a VVDS. The VVDS is always called SYS1.VVDS.Vvolser, where volser is the volume serial number of the volume, and you can preallocate a cluster with this name on the volume if you wish to override any of the defaults for VVDS allocation (or control its physical location on the volume).
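As a small illustration of the naming rule, the following Python sketch derives the VVDS name for a volume. The vvds_name function is a hypothetical helper, and the sample volume serial is made up; the only validation applied is the usual one-to-six character length of a volume serial number.

```python
def vvds_name(volser: str) -> str:
    """Build the VVDS dataset name for a volume (hypothetical helper)."""
    volser = volser.strip().upper()
    # Volume serial numbers are between one and six characters long.
    if not 1 <= len(volser) <= 6:
        raise ValueError(f"invalid volume serial: {volser!r}")
    return f"SYS1.VVDS.V{volser}"


print(vvds_name("prod01"))
```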

The first record in every VVDS is known as the VSAM Volume Control Record (VVCR), and consists of two parts. The first of these lists the BCS catalogs which own (or have owned) VSAM datasets cataloged on this volume - these entries are known as "back pointers". The second part maps free space within the VVDS itself, and allows reuse of space within the VVDS (this is the main way in which it differs from a normal ESDS). The second record in the VVDS is a self-describing record, and the rest of the records are either VSAM Volume Records (VVRs) or Non-VSAM Volume Records (NVRs - a new record type introduced with DFSMS).

There is at least one VVR for each VSAM component on the volume, describing its physical extents, key and record sizes, CI/CA sizes, etc. The physical extent information duplicates information in the VTOC, but the VTOC is only used when the component is being defined, deleted, or extended, while normal read/write accesses to the component use the extent information in the VVR.

It should be clear that in this structure, there is a many-to-many relationship between BCS and VVDS datasets. That is, each BCS can own VSAM components on multiple volumes (and therefore uses multiple VVDS's), while each VVDS can contain entries for VSAM components owned by multiple BCS's. Just as the BCS entry for each component contains a pointer to the volume it exists on (and therefore to the VVDS), each VVR contains an indicator which connects it to the back pointer in the VVCR which corresponds to its owning BCS.
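The many-to-many relationship can be made concrete with a toy Python model. Everything here is a sketch under stated assumptions: the VVDS class, the define_component function, and all names are hypothetical, a BCS is reduced to a plain dict of component name to volser, and the "indicator" in each VVR is modelled simply as an index into the list of back pointers.

```python
from dataclasses import dataclass, field


@dataclass
class VVDS:
    """Toy VVDS: back pointers (as in the VVCR) plus one VVR per component."""
    volser: str
    back_pointers: list = field(default_factory=list)  # names of owning BCSs
    vvrs: dict = field(default_factory=dict)           # component -> back pointer index

    def add_component(self, component, bcs_name):
        # Record the owning BCS once, then link the VVR to it by position.
        if bcs_name not in self.back_pointers:
            self.back_pointers.append(bcs_name)
        self.vvrs[component] = self.back_pointers.index(bcs_name)

    def owning_bcs(self, component):
        return self.back_pointers[self.vvrs[component]]


def define_component(bcs, bcs_name, vvds, component):
    """Catalog a component: BCS entry points at the volume, VVR points back."""
    bcs[component] = vvds.volser
    vvds.add_component(component, bcs_name)


vol1, vol2 = VVDS("VOL001"), VVDS("VOL002")
ucat_a, ucat_b = {}, {}  # each BCS modelled as component name -> volser

define_component(ucat_a, "UCAT.A", vol1, "A.KSDS.DATA")
define_component(ucat_a, "UCAT.A", vol2, "A.KSDS.INDEX")
define_component(ucat_b, "UCAT.B", vol1, "B.ESDS.DATA")

print(sorted(set(ucat_a.values())))    # one BCS spans two volumes
print(vol1.back_pointers)              # one VVDS serves two BCSs
print(vol1.owning_bcs("B.ESDS.DATA"))  # VVR leads back to its owning BCS
```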

To complicate the matter just a little more, the introduction of DFSMS has extended the function of the VVDS. Before DFSMS, non-VSAM datasets were represented in the catalog structure only by an entry in a BCS, while all other dataset information was held in the dataset's entry in the VTOC. SMS-controlled non-VSAM datasets, however, also have an entry (an NVR) in the VVDS of the volume on which they are located.

Before we leave catalog structures, let us briefly mention the differences between ICF catalogs and their predecessors, VSAM catalogs, which you may occasionally come across (if you do, I strongly recommend getting rid of them as quickly as possible!). The main differences were:

* VSAM catalogs did not use VVDS's - all information was held in the main catalog structure, which was simply known as a VSAM catalog

* Because this would have made recovery of volumes with VSAM components belonging to different catalogs extremely difficult, there was a concept of "VSAM volume ownership" - each volume was "owned" by a VSAM catalog, and VSAM datasets could only be allocated on the volume if they were cataloged in the VSAM catalog which owned it

* Under ICF catalogs, every VSAM component corresponds to a dataset which has space allocated to it using normal DADSM (DASD Space Management) routines and which appears in the VTOC with the same name as the VSAM component. VSAM catalogs, however, allowed the user to allocate a "dataspace" (no connection to the MVS/ESA dataspace concept discussed below!) which was then available for VSAM to suballocate to VSAM components without telling DADSM or the VTOC.

The Catalog Address Space

With the introduction of MVS/XA, a catalog address space known as CAS was created, which is started up at every IPL. This is used by the DFP catalog management function to hold most of its program modules and control blocks, which were previously held in the PLPA and CSA respectively. (DFP, or Data Facility Product, like JES, is a product which for all intents and purposes is part of MVS, though IBM markets it as a separate product.)

Catalog management routines also use CAS to "cache" records from catalogs - in other words, frequently referenced catalog records, including ALIAS entries from the master catalog, will be held in virtual storage in the CAS address space to avoid the need to repeatedly perform real I/O to disk for them. Unfortunately, in pre-ESA versions of DFP, records from catalogs shared with other MVS systems are flushed out when the system attempts to re-use them.

Each request for a catalog management service is handled by a "CAS task", which is assigned a task ID, and the status of these tasks can be monitored using the MODIFY CATALOG operator command. The LIST subcommand lists the CAS tasks, showing for each its task ID, the catalog it is trying to access, and the job on whose behalf it is trying to access it. The END or ABEND subcommand can then be used to terminate a CAS task if necessary.

MVS/ESA versions of DFP also provide commands which allow you to allocate and deallocate user catalogs, enabling certain maintenance functions to be performed more easily.