*** COB2SAS, Release 2 Reference

Files in which the programs are stored:

   R2CMS   ........ Invokes COB2SAS on CMS.
   R2MVS   ........ Invokes COB2SAS on MVS.
   R2VMS   ........ Invokes COB2SAS on VMS(TM).
   R2VSE   ........ Invokes COB2SAS on VSE.
   R2COB1  ........ Creates formats used for parsing.
   R2COB2  ........ Parses COBOL data description entries.
                    Creates the data dictionary.
                    Creates the group data set.
   R2COB3  ........ Updates the data dictionary with the
                    information in the group data set.
   R2COB4  ........ Compresses COBOL data names to no more
                    than 8 characters.
   R2COB5  ........ Expands OCCURS variables.
   R2COB6  ........ Uses the information in the data dictionary
                    to produce SAS(R) language input statements.
   R2COB7  ........ Uses the information in the data dictionary
                    to produce SAS language label statements.

Files in which the documentation is stored:

   R2NSTL  ........ COB2SAS, Release 2 Installation Guide
   R2RFRN  ........ COB2SAS, Release 2 Reference
   R2RNTS  ........ COB2SAS, Release 2 Release Notes
   R2USGD  ........ COB2SAS, Release 2 Usage Guide
   COPYRIT ........ COPYRIGHT Notice
   DSCLMR  ........ DISCLAIMER
   TDMK    ........ TRADEMARK Notice

Files in which test data and their results are stored:

   CP1     ........ COBOL Program 1
   CP1LOG  ........ Log of an execution on COBOL Program 1.
   CP1LST  ........ Listing of an execution on COBOL Program 1.

Contents

   - Statement of Purpose & Overview
   - Description of the Data Dictionary
   - Expansion of items and groups occurring more than once
   - SAS language statements produced
   - Assumptions, Restrictions and Caveats
   - Clauses that are recognized
   - Attributes that are recognized
   - The programs that invoke COB2SAS, Release 2
   - R2COB1 through R2COB7
   - The programs that create the data dictionary
   - Processing performed to build the data dictionary
   - Parsing algorithm
   - Actions taken for each clause
   - Level Nesting
   - Usage Stack
   - Group Stack & Group Data Set
   - Redefinition Stack
   - Offset Stack
   - Conversion from Usage and Picture to Informat
   - Options available by way of the SWITCHES data set
   - Extending recognition of attributes of data
     description entries
   - Modifying the program to parse other entries
   - Acknowledgements

* Statement of Purpose

COB2SAS is a tool that can assist you in converting COBOL language
entries into equivalent SAS language statements.

COB2SAS, Release 2 is designed to assist you in converting COBOL
language data description entries into SAS language statements. It
is also designed to be extendible so that you can easily modify it
to parse entries other than data description entries.

* Overview of COB2SAS, Release 2

COB2SAS, Release 2 uses the information in the data description
entries of COBOL language programs to create a data dictionary.
Succeeding steps use this data dictionary to create SAS language
statements equivalent to the data description entries.

There are several reasons for producing a data dictionary before
producing any SAS language statements. One important reason is to
give the user a way to reconcile the differing conventions of the
COBOL and SAS languages. Once a complete data dictionary has been
produced, these differences can be reconciled by programs that use
the data dictionary as input.

* Differences in the conventions of the COBOL and SAS languages

The COBOL language allows variable names to have up to 30
characters, but the SAS language allows only 8 characters for
variable names.

The COBOL language has provisions for storing character strings
whose length exceeds 200, but the SAS language cannot store more
than 200 characters in a single variable.

* Description of the Data Dictionary

A description of each variable in the data dictionary is given here:

   FILENAME (File Name)
      The name of the file after the level indicator FD or SD in
      the File Section of the Data Division.
   LEVEL (Level Number)
      The level number of the data description entry.
   NST_DPTH (Nest Depth)
      The depth of level nesting. Also known as the normalized
      level number.
   DATANAME (Data Name)
      The name of the data item described by the data description
      entry. This is either a programmer supplied word or the
      keyword FILLER.
   NEWNAME (New Name)
      A compressed version of the data name. This name has no more
      than 8 characters.
   USAGE (Usage Value)
      The usage of the data item described by the data description
      entry. In the case of the beginning of a record, this has
      the value 'GROUP'.
   PICTURE (Picture Value)
      The picture value, if any, of the data item described by the
      data description entry. This is blank when USAGE has a value
      of GROUP, COMP-1, COMP-2, BINARY, or INDEX.
   INFMT (Informat)
      The SAS language informat that corresponds to the usage and
      picture value of the data description entry.
   OCR_BASE (Occurs Base)
      The byte at which an item or group that occurs more than
      once begins.
   ITM_DISP (Item Displacement)
      The displacement from OCR_BASE of each item within a group
      that occurs more than once.
   ATBYTE (At Byte)
      The byte at which the data item is located.
   BYTES (Bytes)
      The number of bytes in a group or data item.
   OCR_VAL (Occurs Value)
      The number of times that a group or item occurs when it
      occurs more than once.
   RDF_NAME (Redefines Name)
      The data name, if any, of the data description entry being
      redefined.

* Expansion of items and groups occurring more than once

After the data dictionary has been built, succeeding programs use
it as input. One of these programs, R2COB5, expands items and
groups that occur more than once. This program expands only
one-dimensional tables; it does not attempt to expand
multidimensional tables.

This program uses the values in the variables OCR_VAL, LEVEL,
OCR_BASE and ITM_DISP to create the number of entries specified in
OCR_VAL.

The programs used to invoke COB2SAS, Release 2 on each operating
system (for example, R2MVS on MVS) are shipped with the statement
that includes this program commented out. If you wish to invoke
it, you must uncomment the line that includes R2COB5 for execution.

* SAS language statements produced

COB2SAS, Release 2 is able to produce SAS language INPUT and LABEL
statements.

* Assumptions, Restrictions and Caveats

COBOL language statements are parsed in the program called R2COB2.
R2COB2 does no checking for syntax errors in the COBOL language
statements. Furthermore, R2COB2 makes the following assumptions
about the data description entries that it will read.

Assumptions regarding the location of entries within margins:

   (1) The contents of columns 1 through 6 are ignored.
   (2) Any character in column 7, other than a blank or a hyphen,
       causes the entire line to be ignored.
   (3) The level indicators CD, FD, RD, SD, and 01 are in columns
       8 through 11.
   (4) All clauses, other than the division and section
       identifiers, and the level indicators, are in columns 12
       through 72.
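
The margin assumptions above can be illustrated with a short sketch.
It is written in Python purely for illustration and is not taken from
R2COB2; the function name classify_line and the returned tuple layout
are hypothetical. Dropping everything past column 72 is an assumption
based on assumption (4).

```python
def classify_line(line):
    """Split one fixed-format COBOL source line by its margins.

    Returns None when the line would be ignored, otherwise a tuple
    (indicator, area_a, area_b). Hypothetical helper, not R2COB2 code.
    """
    # Pad so that short lines can still be sliced by column.
    padded = line.rstrip("\n").ljust(72)

    # Columns 1 through 6 (indexes 0-5) are ignored.
    indicator = padded[6]  # column 7

    # Anything other than a blank or a hyphen in column 7
    # causes the entire line to be ignored.
    if indicator not in (" ", "-"):
        return None

    area_a = padded[7:11].strip()   # columns 8 through 11
    area_b = padded[11:72].strip()  # columns 12 through 72
    return (indicator, area_a, area_b)


if __name__ == "__main__":
    for text in ("000100 FD  PAYROLL-FILE.",
                 "000200* A COMMENT LINE",
                 "000300     05  AMOUNT PIC 9(5)."):
        print(classify_line(text))
```

A level indicator such as FD shows up in the second element of the
tuple, while ordinary clauses show up in the third.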

Assumptions regarding the clauses that identify divisions & sections:

   (1) In order for it to recognize the beginning of divisions, it
       checks for occurrences of the strings:
          IDENTIFICATION DIVISION
          ENVIRONMENT DIVISION
          DATA DIVISION
          PROCEDURE DIVISION
   (2) In order for it to recognize the beginning of the file
       section, it checks for an occurrence of the string:
          FILE SECTION
   (3) In order for it to recognize level indicators, it checks
       for occurrences of the following strings within columns 8
       through 11:
          01 FD SD CD RD

Caveats regarding level numbers:

   (1) 1 through 9 are recognized as valid level numbers. However,
       they are converted to 01 through 09 before they are output
       to the data dictionary.

Caveat regarding FILLER:

   (1) When a level number is not followed by either a programmer
       supplied data name or the keyword FILLER, FILLER is output
       to the data dictionary.

Restrictions regarding the PICTURE clause:

   (1) No attempt is made to make use of the operational character
       S in the picture string.
   (2) No attempt is made to either parse or make use of the
       operational character P in the picture string.
   (3) No attempt is made to honor a continuation indicator within
       a picture string.

Restrictions regarding the OCCURS clause:

   (1) No attempt is made to recognize the OCCURS clause on
       entries with level number 01, 66 or 88.
   (2) If the OCCURS clause has a DEPENDING ON clause, the number
       of items is assumed to be the value after the keyword TO.
       In other words, variable length tables are treated as fixed
       length.
   (3) No attempt is made to address the issue of interrecord or
       intrarecord slack bytes.
   (4) No attempt is made to use any INDEX variables.

Assumptions and Caveats regarding the REDEFINES clause:

   (1) The level number in the entry with a REDEFINES clause must
       be the same as the level number of the item being redefined.
   (2) There may be no intervening entries with the same or
       smaller level number between a redefining entry and the
       item being redefined.
   (3) The redefining area may have a length that is less than or
       equal to the length of the redefined area.
   (4) No attempt is made to recognize the REDEFINES clause on
       entries with level number 01.
   (5) Multiple redefinition is supported.
       Multiple redefinition is of the form:
          B REDEFINES A
          C REDEFINES A
       Multiple redefinition is not of the form:
          B REDEFINES A
          C REDEFINES B
       In other words, each redefinition in a multiple
       redefinition redefines the same data area.
   (6) Implicit redefinition caused by more than one level 01
       entry subordinate to the level indicator FD is supported.

Caveat regarding the BLANK WHEN ZERO clause:

   (1) The BLANK WHEN ZERO clause is recognized only on elementary
       items.

Caveats regarding the SYNCHRONIZED and JUSTIFIED clauses:

   (1) No attempt is made to appropriately process these clauses.
   (2) No attempt is made to address the issue of interrecord or
       intrarecord slack bytes.

Caveats and restrictions regarding the COPY clause:

   (1) No attempt is made to parse pseudo-text.
   (2) When a COPY clause is being built, R2COB2 looks for either
       a level indicator, a level number, or the end of the file
       section to terminate the COPY clause.

Assumptions regarding COPY members:

   (1) The entries in copy members are syntactically correct COBOL
       statements. In other words, the following assumptions
       regarding the margins and the interpretation of entries
       within those margins are made:
       (1.1) The contents of columns 1 through 6 are ignored.
       (1.2) Any character in column 7, other than a blank or a
             hyphen, causes the entire line to be ignored.
       (1.3) The level indicators CD, FD, RD, SD, and 01 are in
             columns 8 through 11.
       (1.4) All clauses, other than the division and section
             identifiers, and the level indicators, are in columns
             12 through 72.
   (2) In order for R2COB2 to begin processing copy members, it
       must set DIVISION to DATA_DIV, SECTION to FILE SECTION, and
       NTRYTYPE to IN_DD. In order to do this, it searches for the
       character strings 'DATA DIVISION' and 'FILE SECTION'. If it
       does not find these strings, then it checks for a level
       indicator in area A. If it does not find a level indicator
       in area A, it inspects the first token of each line for one
       that identifies a clause in a data description entry. If it
       finds the division or section identifier, a level indicator,
       or a token that initiates a data description entry clause,
       it will begin processing.

Caveats regarding variable length tables:

   (1) When building the data dictionary, variable length tables
       are treated as fixed length.
   (2) No attempt is made to produce SAS language statements that
       will read variable length tables.

Restrictions regarding variable length files:

   (1) No attempt is made to handle variable length files.

* Clauses that are recognized

A list of clauses that R2COB2 is able to parse is given here:

   01 CLAUSE
      01 dataname
      01 FILLER
   02-49 CLAUSE
      LEVEL-NUMBER dataname
      LEVEL-NUMBER FILLER
   REDEFINES CLAUSE
      REDEFINES dataname
      REDEFINES FILLER
   EXTERNAL CLAUSE
      EXTERNAL
      IS EXTERNAL
   GLOBAL CLAUSE
      GLOBAL
      IS GLOBAL
   USAGE CLAUSE
      usage-value
      USAGE usage-value
      USAGE IS usage-value
   PICTURE CLAUSE
      PIC pic-value
      PIC IS pic-value
      PICTURE pic-value
      PICTURE IS pic-value
   SIGN CLAUSE
      sign-value
      SIGN sign-value
      SIGN IS sign-value
      sign-value SEPARATE
      sign-value SEPARATE CHARACTER
      SIGN sign-value SEPARATE
      SIGN IS sign-value SEPARATE
      SIGN sign-value SEPARATE CHARACTER
      SIGN IS sign-value SEPARATE CHARACTER
   SYNCHRONIZED CLAUSE
      SYNCHRONIZED
      SYNCHRONIZED LEFT
      SYNCHRONIZED RIGHT
   JUSTIFIED CLAUSE
      JUSTIFIED
      JUSTIFIED RIGHT
   BLANK CLAUSE
      BLANK ZERO
      BLANK WHEN ZERO
   VALUE CLAUSE
      VALUE val-value
      VALUE IS val-value
   OCCURS CLAUSE
   66 CLAUSE
   88 CLAUSE
   COPY CLAUSE

* Attributes that are recognized

   01 Clause
   01 Clause  USAGE Clause
   01 Clause  PICTURE Clause
   01 Clause  PICTURE Clause  BLANK WHEN ZERO Clause
   01 Clause  PICTURE Clause  USAGE Clause
   01 Clause  PICTURE Clause  USAGE Clause  BLANK WHEN ZERO Clause

   02-49 Clause
   02-49 Clause  USAGE Clause
   02-49 Clause  PICTURE Clause
   02-49 Clause  PICTURE Clause  BLANK WHEN ZERO Clause
   02-49 Clause  PICTURE Clause  USAGE Clause
   02-49 Clause  PICTURE Clause  USAGE Clause  BLANK WHEN ZERO Clause

   02-49 Clause  REDEFINES Clause
   02-49 Clause  REDEFINES Clause  USAGE Clause
   02-49 Clause  REDEFINES Clause  PICTURE Clause
   02-49 Clause  REDEFINES Clause  PICTURE Clause
                 BLANK WHEN ZERO Clause
   02-49 Clause  REDEFINES Clause  PICTURE Clause  USAGE Clause
   02-49 Clause  REDEFINES Clause  PICTURE Clause  USAGE Clause
                 BLANK WHEN ZERO Clause

   02-49 Clause  OCCURS Clause
   02-49 Clause  OCCURS Clause  USAGE Clause
   02-49 Clause  OCCURS Clause  PICTURE Clause
   02-49 Clause  OCCURS Clause  PICTURE Clause
                 BLANK WHEN ZERO Clause
   02-49 Clause  OCCURS Clause  PICTURE Clause  USAGE Clause
   02-49 Clause  OCCURS Clause  PICTURE Clause  USAGE Clause
                 BLANK WHEN ZERO Clause

   66 Clause
      This attribute is ignored.
   88 Clause
      This attribute is ignored.
   Copy Clause
      This attribute is ignored.

* The programs that invoke COB2SAS, Release 2

Included in COB2SAS, Release 2 are 4 programs that invoke COB2SAS,
one for each operating system for which COB2SAS is distributed.
These 4 files differ in the conventions used to invoke R2COB1
through R2COB7 and in the conventions used to reference files.

   - R2CMS is used to invoke R2COB1 through R2COB7 on CMS.
   - R2MVS is used to invoke R2COB1 through R2COB7 on MVS.
   - R2VMS is used to invoke R2COB1 through R2COB7 on VMS.
   - R2VSE is used to invoke R2COB1 through R2COB7 on VSE.

Any special requirements for the INFILE and FILE statements are
provided in these programs.
Also, in these programs, you can provide statements that will
direct the data dictionary to a permanent SAS data set and the SAS
language statements to a flat file.

* Differences between CMS, MVS, VMS and VSE

   - INFILE & FILE statements
     Under CMS and MVS, there is no need to specify DCB information
     on the INFILE and FILE statements.
     Under VSE, it is necessary to provide complete and explicit
     DCB information (that is, RECFM, LRECL and BLKSIZE).
     Under VMS, it is necessary to explicitly provide RECFM=F
     LRECL=80 so that the SAS System will treat the file as if it
     has fixed length records.

   - Conventions for including code for execution
     Code is included for execution from a MACLIB under CMS.
     Code is included for execution from a PDS under MVS.
     Code is included for execution from a sublibrary under VSE.
     Code is included for execution from a directory under VMS.

   - Appending to sequential files
     CMS, MVS and VMS allow specification of the MOD disposition.
     This mechanism allows programs to append output to existing
     sequential files.
     VSE provides no mechanism for appending output to existing
     sequential files.

* R2COB1 through R2COB7

The programs that convert COBOL language data description entries
into SAS language statements are stored in 7 files.

   - R2COB1 uses PROC FORMAT to create formats that are used while
     parsing COBOL language data description entries.
   - R2COB2 uses a data step to parse the data description entries
     and produce 2 SAS data sets. The first data set contains an
     observation for each data description entry. The second data
     set has information about the lengths of groups of data items.
   - R2COB3 uses PROC SORT and a data step to combine the group
     data set and the data dictionary data set.
   - R2COB4 uses PROC SORT and the data step to compress the
     lengths of variable names to no more than 8 characters.
   - R2COB5 is an optional step that uses a data step and PROC SORT
     to expand 1 dimensional tables.
       * R2COB5 is designed to:
            => expand items that occur no more than 999 times
            => expand 1 dimensional tables
         In the event that either of the limits is exceeded, it
         will generate an appropriate error message.
       * By default, the statement that includes R2COB5 is
         commented out. If you want to use R2COB5, you will need
         to acquire a copy of the program that invokes COB2SAS on
         your operating system (for example, R2MVS on MVS) and
         uncomment the line that includes R2COB5 for execution.
   - R2COB6 uses the data step to produce SAS input statements.
   - R2COB7 uses the data step to produce SAS label statements.

Regardless of the operating system, the contents of R2COB1 through
R2COB7 are identical.

* The programs that create the data dictionary

* R2COB1

R2COB1 consists of PROC FORMAT statements. These formats are used
while parsing the COBOL language data description entries. The
following is a description of each of the formats.

   $DDTFMT.
      This format has a list of all keywords in the data
      description entries. Any programmer supplied word in the
      data description entries will resolve to the value OTHER.
   $DDICFMT.
      This format has a list of all the keywords that initiate a
      clause. In the COBOL data description entries, each clause
      begins with a keyword. For this reason, when the program is
      attempting to identify the clause, it will always have
      exactly one keyword that identifies the clause.
   $DDCFMT.
      This format is used to resolve each clause into a single
      value. Since the exact construction of each clause often
      includes optional words (for example, IS), each combination
      of required and optional keywords in a clause is resolved
      into a single value here.
   DDSHOFMT
      This format is used when the program encounters an entry
      that it does not yet recognize. In that case, it will use
      this format to list each type of clause in the entry.
   DDAVFMT
      This format is used to convert those entries that it
      recognizes into a single value and to return UNRECOGNIZED
      for those entries that have not yet been defined.

* R2COB2

R2COB2 consists of a single data step. This data step creates two
SAS data sets called DICTNRY and GROUP. The DICTNRY data set
contains an observation for each data description entry. The GROUP
data set has information about the lengths of groups of elementary
items.

This data step reads either complete COBOL programs or COPY
members. It keeps track of which division, section and type of
entries it is processing. When it is processing data description
entries within the File Section, it evaluates each token, clause
and entry. When a clause is complete, it extracts information from
that clause for later processing. When an entry is complete, it
processes the information it has extracted and it outputs an
observation to the DICTNRY data set.

This data step keeps track of:

   - the depth of level nesting within the entries in the File
     Section.
   - the appropriate usage value for each level.
   - the total number of bytes within each group of elementary
     items.
   - the position, within the file, at which each group begins.
       * For records that occur more than once, when the end of
         that group is encountered, it updates the number of bytes
         for that record by multiplying the number of bytes in a
         single occurrence by the number of times that it occurs.
       * For groups that are redefined, it updates the position,
         within the file, at which the redefining group begins.
       * For nestings of groups within groups and the various
         combinations of records, occurs and redefines, it
         maintains an accurate count of the total number of bytes
         and the position, within the file, for each group.

* R2COB3

At the end of the R2COB2 data step, there is an observation in the
DICTNRY data set for each data description entry.
If there are any records, there is an observation for each in the
GROUP data set. The PROC SORT and data step in R2COB3 combine these
two data sets.

* Processing performed to build the data dictionary

The data dictionary is a SAS data set that is built by the data
step in the R2COB2 file. The following is a description of the
algorithm used to build the data dictionary.

   Read a line from the input file.

   Skip blank lines.

   If the variable TRACEPRS has a value of 9, then display a
   ruler, and the contents of the current and next lines.
   Return to the top of the data step.

   Evaluate the contents of column 7. If the indicator area has
   anything other than a blank or a hyphen, return to the top of
   the data step.

   Check for a line that initiates a new division.

   Check for the beginning of a copy member.

   Depending upon the division, process the input appropriately.

   In the case of the Identification and Environment Divisions,
   no actions are taken.

   In the case of the Data Division, the file and data description
   entries are processed.

   In the case of the Procedure Division, unless the SEVERAL flag
   has been set to 'Y', the program performs end of file
   processing and stops.

* Processing within the Data Division

Upon initial entry into the Data Division, any pending processing
is finished.

Check for a line that initiates a new section.

In the case of sections other than the File Section, no actions
are taken.

In the case of the File Section, data description entries are
processed.

* Processing within the File Section

Upon initial entry into the File Section, any pending processing
is finished.

The current line is evaluated to determine whether it has a level
indicator initiating entries in a file, sort-merge file, or data
description entry.

Depending on the type of entries, process the input appropriately.

In the case of file and sort-merge file description entries, any
pending processing is completed, initial values are set
appropriately, the value of the file name is saved for later
processing, and the program returns to the top of the data step.

In the case of data description entries, any pending processing is
completed, each word is parsed and evaluated (the details of this
process are described later), and the program returns to the top
of the data step.

* Processing Data Description Entries

   - Each data description is known as an entry.
   - Each entry is made up of one or more clauses.
   - Each clause specifies an attribute of the entry.

When parsing data description entries, each clause is stored in a
variable CLS_STR, known as the clause string. When the clause
string is formatted with $DDCFMT, known as the data description
clause format, the value returned is used to set a bit in the
variable AV_SUM, known as the attribute vector. Depending on the
clause, information is extracted from the clause string for
processing when the end of the entry is encountered.

When the end of an entry is encountered (which is when the
beginning of the next entry or the end of file description is
encountered), the value in the attribute vector is formatted with
the DDAVFMT, known as the attribute vector format. At this point,
the program can tell whether this particular set of attributes is
defined to it.

If the set of attributes is defined to it, it will make use of the
information that it acquired from each clause and attempt to
perform appropriate processing.

If the set of attributes is not defined to it, and only if the
variable ATTR_ERR is set to 'Y', it will use the SHOCLS format,
known as the show clause format, to show each type of clause that
is in the entry. (It does not have the ability to show the actual
clause itself; instead it shows each type of clause in the
particular set of attributes that is undefined to it.)
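
The attribute vector mechanism described above can be sketched as
follows. This is an illustration in Python, not the SAS code in
R2COB2. The bit numbers are the ones this reference lists under
"Actions taken for each clause"; the names CLAUSE_BITS,
KNOWN_ATTRIBUTE_SETS, attribute_vector and describe are hypothetical,
and the lookup table stands in for DDAVFMT with only the single
mapping (68 to '2.6.') that this reference shows.

```python
# Bit number assigned to each clause (from "Actions taken for each
# clause"); only the clauses that are acted upon are listed here.
CLAUSE_BITS = {
    "01": 1,
    "02-49": 2,
    "REDEFINES": 3,
    "PICTURE": 6,
    "USAGE": 7,
    "BLANK WHEN ZERO": 11,
    "OCCURS": 13,
}

# Hypothetical stand-in for the DDAVFMT format.
KNOWN_ATTRIBUTE_SETS = {68: "2.6."}


def attribute_vector(clauses):
    """Combine the clauses of one entry into a single number."""
    av_sum = 0
    for clause in clauses:
        # Setting bit n is the same as adding 2**n, provided each
        # clause appears at most once in the entry.
        av_sum |= 2 ** CLAUSE_BITS[clause]
    return av_sum


def describe(av_sum):
    """Look up the attribute set, as DDAVFMT would at end of entry."""
    return KNOWN_ATTRIBUTE_SETS.get(av_sum, "UNRECOGNIZED")


if __name__ == "__main__":
    av = attribute_vector(["02-49", "PICTURE"])
    print(av, describe(av))
```

Because every clause owns a distinct bit, the single number identifies
the exact combination of clauses regardless of the order in which they
were written in the entry.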

* Parsing Algorithm

The syntax of COBOL data description entries is defined in general
formats by the American National Standards Institute. In addition
to these general formats, each implementor of a COBOL compiler may
define extensions to these formats. COB2SAS, Release 2 is designed
to recognize the general formats for data description entries in
the ANSI specification X3.23-1985, as well as some of the
extensions implemented in IBM(R) OS/VS COBOL.

* Overview of the parsing algorithm

Each data description entry consists of one or more attributes.
Each attribute is specified by a clause. For example, the entry

   05 MONTHLY-PAYMENT PICTURE IS 9(6)V99 USAGE IS COMP-3
      OCCURS 1 TO 12 TIMES DEPENDING ON CURRENT-MONTH.

consists of 4 attributes, namely, the level clause, the picture
clause, the usage clause and the occurs clause.

Each clause consists of one or more reserved words and zero or
more programmer supplied words. Reserved words may only be used as
specified in the general formats that describe each entry. The
values of programmer supplied words are, as the name implies,
supplied by the programmer. For example, in the clause

   OCCURS 1 TO 12 TIMES DEPENDING ON CURRENT-MONTH

the tokens '1', '12' and 'CURRENT-MONTH' are programmer supplied
words and the remaining tokens are reserved words.

The parsing algorithm adds tokens to the current clause until it
recognizes that it has built one complete clause. When the end of
a clause is found, a bit in the attribute vector is set to indicate
that the clause is part of the entry. When the end of the entry
has been encountered, the attribute vector is evaluated and acted
upon.

Each clause begins with exactly 1 keyword and consists of 1 or
more keywords and programmer supplied words. Within the context of
several of the clauses, the numbers 1 through 49, 66 and 88 are
valid programmer supplied words (for example, 'OCCURS 1 TO 49
TIMES DEPENDING ON VAR1').
These very same numbers can also indicate the beginning of a clause
(for example, '01 VAR1'). For this reason, a variable known as
Parse Mode is employed. Parse Mode has either the value 'Identify
Clause' or the value 'Building Clause'. When it has the value
'Identify Clause', the keywords encountered are interpreted as the
beginning of a new clause. On the other hand, when it has the
value 'Building Clause', the keywords and programmer supplied
words are interpreted as being tokens within the clause being
built.

Although most of the clauses are relatively simple constructions
of a few words, there are several clauses that consist of more
than one repetition of the same set of keywords and programmer
supplied words. For this reason, a variable known as Clause Mode
is employed. For each clause, this variable is given an appropriate
value. When parsing data description entries, Clause Mode may have
the value 'Picture Clause', 'Value Clause', 'Occurs Clause',
'66 Clause', '88 Clause', 'Copy Clause' or 'Simple Clause'.

* Detailed example of parsing a data description entry

* Valid forms of the 02-49 Clause and the PICTURE Clause

   level-number
   level-number FILLER
   level-number data-name

   PICTURE character-string
   PICTURE IS character-string
   PIC character-string
   PIC IS character-string

* Partial listing of formats used while parsing

   /* Data Description Identify Clause Format */
   $DDICFMT
      '01' - '49'     = 'LEVEL NUMBER'
      'PICTURE','PIC' = 'PICTURE'
      OTHER           = 'UNDEFINED'

   /* Data Description Token Format */
   $DDTFMT
      '02' - '49'     = '2.'
      OTHER           = '3.'
      'FILLER'        = '4.'
      'IS'            = '6.'
      'PICTURE','PIC' = '9.'

   /* Data Description Clause Format */
   $DDCFMT
      '2.3.','2.4.'   = '2'
      OTHER           = 'UNDEFINED'

   /* Data Description Show Format */
   DDSHOFMT
      2     = '02-49 CLAUSE'
      6     = 'PICTURE CLAUSE'
      OTHER = 'UNRECOGNIZED CLAUSE'

   /* Data Description Attribute Vector Format */
   DDAVFMT
      68    = '2.6.'
      OTHER = 'UNRECOGNIZED'

* Algorithm used to parse a data description entry

The string in the input buffer is:

   05 MONTHLY-PAYMENT PICTURE IS 9(6)V99.

The value of the Attribute Vector Sum is 0.

-> The next token is obtained. It is '05'.
   Parse Mode has the value 'Identify Clause'.
   The token is formatted with the $DDICFMT format.
   The result is 'LEVEL NUMBER'.
   Parse Mode is given the value 'Building Clause'.
   Clause Mode is given the value 'Level Clause'.
   Clause String is given the value of the token.
   Clause String has the value '05'.
   The token is formatted with the $DDTFMT and the result is
   stored in the Token Vector.
   Token Vector has the value '2.'.
   Token Vector is formatted with the $DDCFMT format.
   Clause ID is set to 'UNDEFINED'.

-> The next token is obtained. It is 'MONTHLY-PAYMENT'.
   Parse Mode has the value 'Building Clause'.
   Clause Mode has the value 'Level Clause'.
   The token is concatenated to the Clause String.
   Clause String has the value '05 MONTHLY-PAYMENT'.
   The token is formatted with the $DDTFMT and the result is
   concatenated to the Token Vector.
   Token Vector has the value '2.3.'.
   Token Vector is formatted with the $DDCFMT format.
   Clause ID is set to '2'. (This number is used as a power of 2
   in order to set a bit in the Attribute Vector Sum.)
   At this point, the Clause String has a complete 02-49 Clause.
   The value 2**2 is added to the Attribute Vector Sum.
   The value of the Attribute Vector Sum is 4.
   Token Vector is set to the value ' '.
   Clause String is set to the value ' '.
   Parse Mode is set to the value 'Identify Clause'.

-> The next token is obtained. It is 'PICTURE'.
   Parse Mode has the value 'Identify Clause'.
   The token is formatted with the $DDICFMT format.
   The result is 'PICTURE'.
   Parse Mode is given the value 'Building Clause'.
   Clause Mode is given the value 'Picture Clause'.
   Clause String is given the value 'PICTURE'.
   Clause ID is set to 'UNDEFINED'.

-> The next token is obtained. It is 'IS'.
   Parse Mode has the value 'Building Clause'.
   Clause Mode has the value 'Picture Clause'.
   The token is concatenated to the Clause String.
   Clause String has the value 'PICTURE IS'.
   Clause ID is left with the value 'UNDEFINED'.

-> The next token is obtained. It is '9(6)V99'.
   Parse Mode has the value 'Building Clause'.
   Clause Mode has the value 'Picture Clause'.
   The token is concatenated to the Clause String.
   Clause String has the value 'PICTURE IS 9(6)V99'.
   Clause ID is set to '6'. (The value in Clause ID is used as a
   power of 2 in order to set a bit in the Attribute Vector Sum.)
   At this point, the Clause String has a complete picture clause.
   The value 2**6 is added to the Attribute Vector Sum.
   The value of the Attribute Vector Sum is 68.

Since this is the end of the data description entry, the Attribute
Vector Sum is formatted with the DDAVFMT format. The result is
'2.6.'. The program attempts to appropriately process these
attributes.

* Actions taken for each clause

The actions taken at the end of each clause are given here:

   01 Clause
      - Bit 1 is set in the attribute vector.
      - Initial values are set for a new record.
   02-49 Clause
      - Bit 2 is set in the attribute vector.
      - Initial values are set for a new record.
   REDEFINES Clause
      - Bit 3 is set in the attribute vector.
      - The data name being redefined is stored in RDF_NAME.
   EXTERNAL Clause
      - If ATTR_ERR equals 'Y', bit 4 is set in the attribute
        vector.
      - This clause is ignored.
   GLOBAL Clause
      - If ATTR_ERR equals 'Y', bit 5 is set in the attribute
        vector.
      - This clause is ignored.
   PICTURE Clause
      - Bit 6 is set in the attribute vector.
      - The picture value is put into a standard format.
   USAGE Clause
      - Bit 7 is set in the attribute vector.
      - The usage value is put into a standard format.
   SIGN Clause
      - If ATTR_ERR equals 'Y', bit 8 is set in the attribute
        vector.
      - This clause is ignored.
SYNCHRONIZED Clause
   - If ATTR_ERR equals 'Y', bit 9 is set in the attribute vector.
   - This clause is ignored.

JUSTIFIED Clause
   - If ATTR_ERR equals 'Y', bit 10 is set in the attribute vector.
   - This clause is ignored.

BLANK WHEN ZERO Clause
   - Bit 11 is set in the attribute vector.
   - The BWZ_FLAG is set to 'Y'.

VALUE Clause
   - If ATTR_ERR equals 'Y', bit 12 is set in the attribute vector.
   - This clause is ignored.

OCCURS Clause
   - Bit 13 is set in the attribute vector.
   - R2COB2 continues to build this clause until it has built
     either:
          OCCURS ocr_value TIMES
     or
          OCCURS integer-1 TO ocr_value TIMES
     at which point it simply ignores the remaining tokens. The
     first token that it encounters that identifies the beginning
     of a new clause causes it to stop building the OCCURS clause
     altogether.

66 Clause
   - If ATTR_ERR equals 'Y', bit 14 is set in the attribute vector.
   - This clause is ignored.

88 Clause
   - If ATTR_ERR equals 'Y', bit 15 is set in the attribute vector.
   - This clause is ignored.

COPY Clause
   - If ATTR_ERR equals 'Y', bit 16 is set in the attribute vector.
   - This clause is ignored.
   - No attempt is made to parse pseudo-text.

* Level Nesting

The R2COB2 data step keeps track of the depth of level nesting
within file description entries. Since it makes use of a stack for
tracking usage values, the index to this stack is the value of the
depth.

* Usage Stack

In the COBOL language, it is not required that every item have an
explicitly stated usage value. For this reason, R2COB2 keeps track
of each item's usage value in the STKUSAGE routine. When an entry
has a usage clause, its level number is compared to the previous
level number. Depending on the result of the comparison, the usage
stack is updated with the value in the usage clause. On the other
hand, if an entry does not have a usage clause, the usage value is
extracted from the usage stack.
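
The usage inheritance described above can be illustrated with a
short Python sketch. This is not the STKUSAGE routine itself, which
is SAS code in R2COB2. The function name resolve_usages, the tuple
layout of the entries and the use of level numbers (rather than the
nesting depth) to decide when stack entries have ended are all
assumptions made for this illustration. The fallback to 'DISPLAY'
reflects the COBOL default when no usage clause applies.

```python
DEFAULT_USAGE = "DISPLAY"  # COBOL default when no USAGE clause applies


def resolve_usages(entries):
    """Return (name, usage) pairs for entries given in source order.

    entries is a list of (level, name, usage) tuples, where usage is
    None when the entry has no USAGE clause. This layout is hypothetical.
    """
    stack = []     # (level, usage) for enclosing entries that stated a usage
    resolved = []
    for level, name, usage in entries:
        # An entry at the same or a lower level number ends the scope of
        # every stacked entry whose level number is not smaller.
        while stack and stack[-1][0] >= level:
            stack.pop()
        if usage is None:
            # No USAGE clause: take the value from the usage stack.
            usage = stack[-1][1] if stack else DEFAULT_USAGE
        else:
            # USAGE clause present: record it for subordinate entries.
            stack.append((level, usage))
        resolved.append((name, usage))
    return resolved


if __name__ == "__main__":
    record = [
        (1, "PAYMENT-REC", None),
        (5, "AMOUNTS", "COMP-3"),
        (10, "AMT-1", None),
        (10, "AMT-2", None),
        (5, "CUST-NAME", None),
    ]
    for name, usage in resolve_usages(record):
        print(name, usage)
```

In this sketch AMT-1 and AMT-2 take COMP-3 from the group AMOUNTS,
while CUST-NAME, which follows the end of that group, falls back to
DISPLAY.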
* Group Stack & Group Data Set

To track groups, R2COB2 makes use of two data structures. First, it
maintains a group stack, which is used to keep track of the
position, within the file, at which a group begins and the total
number of bytes in that group. When a group is first encountered, a
new entry is pushed onto the group stack. As elementary items
within that group are encountered, the total number of bytes in
that group is incremented by the number of bytes in that item. When
the end of that group is encountered (that is, when a level number
that is less than or equal to the level number of the group is
encountered), the number of bytes in all groups on the stack is
reevaluated, the group just ended is popped off of the stack and an
observation is output to the GROUP data set, which is the second
data structure.

* Redefinition Stack

The redefinition stack keeps track of the position and number of
bytes in entries with level numbers greater than the current entry.
When redefinition is encountered, R2COB2 uses this information to
update the ATBYTE and ITM_DISP variables.

* Offset Stack

The offset stack keeps track of the position at which those groups
that occur more than once begin and the displacement of each item
within those groups.

* Conversion from Usage and Picture to Informat

All picture values are converted to a standard format. For
alphabetic or alphanumeric strings, the format is either 'A(w)' or
'X(w)'. For numbers, the format is '9(w)V9(f)'; where w and f are
numbers of bytes. For example, the picture 'S99999V99' is converted
to '9(5)V9(2)' and the picture 'X(48)' remains 'X(48)'.

(1) When USAGE is 'DISPLAY':
    1.A) If PICTURE is 'A' or 'X', then the informat is '$CHAR'
    1.B) If PICTURE is '9', then
         If BWZ_FLAG = 'Y', then the informat is 'ZDB'
         Otherwise the informat is 'ZD'

(2) When USAGE is 'COMP' or 'COMP-4':
    The PICTURE should be '9' and the informat is 'IB'

(3) When USAGE is 'COMP-1':
    The PICTURE should be ' ' and the informat is 'RB4.'
(4) When USAGE is 'COMP-2' or 'BINARY':
    The PICTURE should be ' ' and the informat is 'RB8.'

(5) When USAGE is 'COMP-3' or 'PCKDCML':
    The PICTURE should be '9' and the informat is 'PD'

(6) When USAGE is 'INDEX':
    The PICTURE should be ' ' and the informat is 'IB4.'

The variable PICTURE will have values in the form 'A(w)' or 'X(w)',
or in the form '9(w)V9(f)'; where 'w' stands for the whole part and
'f' stands for the fractional part. The whole and the fraction are
appropriately combined to produce a width and a decimal for the SAS
informat.

(1) When the informat is '$CHAR':
    The WIDTH = WHOLE

(2) When the informat is 'ZD' or 'ZDB':
    The WIDTH = (WHOLE + FRACTION)
    The DECIMAL = FRACTION

(3) When the informat is 'IB':
    The NUM_DIG = (WHOLE + FRACTION)
    The DECIMAL = FRACTION
    If (NUM_DIG GE 1) and (NUM_DIG LE 4) Then WIDTH = 2
    If (NUM_DIG GE 5) and (NUM_DIG LE 9) Then WIDTH = 4
    If (NUM_DIG GE 10) and (NUM_DIG LE 18) Then WIDTH = 8

(4) When the informat is 'PD':
    The WIDTH = CEIL((WHOLE + FRACTION + 1) / 2)
    The DECIMAL = FRACTION

* Options available by way of the SWITCHES data set

TRACEPRS (Trace Parse) (Default Value: TRACEPRS='0')

   Trace Parse determines the type of tracing information produced
   while parsing data description entries.

   TRACEPRS = '0': No parse trace information is produced.
   TRACEPRS = '4': Show Divisions, Entry Types and program
                   statements.
   TRACEPRS = '5': Show parsing variables.
   TRACEPRS = '6': Show Divisions, Entry Types, program statements
                   and parsing variables.
   TRACEPRS = '9': Do nothing, but show the action of the Look
                   Ahead Buffer and the value of EOF. (It always
                   shows EOF=1 twice because it is looking one line
                   ahead and recognizes the last line in the file
                   twice.)

TRACESTK (Trace Stack) (Default Value: TRACESTK='0')

   Trace Stack determines the type of tracing information produced
   by the stack routines.

   TRACESTK = '0': No stack trace information is produced.
   TRACESTK = '1': Show information used by routines:
                   STKUSAGE, STKGROUP, STKREDEF, STKOFFST.
   TRACESTK = '2': Show information used by routine: STKUSAGE.
   TRACESTK = '3': Show information used by routine: STKGROUP.
   TRACESTK = '4': Show information used by routine: STKREDEF.
   TRACESTK = '5': Show information used by routine: STKOFFST.
   TRACESTK = '6': Show information used by routines:
                   STKGROUP, STKREDEF.
   TRACESTK = '7': Show information used by routines:
                   STKGROUP, STKREDEF, STKOFFST.

ATTR_ERR (Attribute Error) (Default Value: ATTR_ERR='N')

   Set ATTR_ERR to 'Y' if you wish to flag unrecognized attributes.
   Set ATTR_ERR to 'N' if you wish to process unrecognized
   attributes.

SEVERAL (Several) (Default Value: SEVERAL='N')

   Set SEVERAL to 'Y' if you wish to process more than one COBOL
   source program or COPY members.
   Set SEVERAL to 'N' if you wish to process only one COBOL source
   program or COPY member.

DEL_FLLR (Delete FILLER) (Default Value: DEL_FLLR = 'Y')

   Set DEL_FLLR to 'Y' if you wish to exclude entries, from the
   INPUT statements produced, that have 'FILLER' in the data name.
   Set DEL_FLLR to 'N' if you wish to include entries, in the INPUT
   statements produced, that have 'FILLER' in the data name.

USE_AT (Use @) (Default Value: USE_AT = 'Y')

   Set USE_AT to 'Y' if you wish to have the @ column pointer in
   INPUT statements produced.
   Set USE_AT to 'N' if you wish to have no @ column pointer in
   INPUT statements produced.

MAKE_LBL (Make LABEL) (Default Value: MAKE_LBL = 'Y')

   Set MAKE_LBL to 'Y' if you wish to have LABEL statements
   produced.
   Set MAKE_LBL to 'N' if you want no LABEL statements produced.

* Extending recognition of attributes of data description entries

If you wish to add processing for new sets of attributes within the
data description entries, follow these guidelines.

- Create a COBOL data description entry with a set of attributes
  not yet implemented.

- Set TRACEPRS to '5' and ATTR_ERR to 'Y'.
- Execute COB2SAS, Release 2.

- Inspect the SAS Log for the message:

       UNRECOGNIZED ATTRIBUTES IN DATA DESCRIPTION

- This message lists the combination of attributes that is not yet
  recognized.

- In the line above this message, find the value of AV_SUM.

- You must now edit R2COB1 and implement that combination of
  attributes. For example, the attributes:

       02 FIELD PIC 99 SYNC

  result in AV_SUM=580. So, in DDAVFMT add,

       580 = '2.6.9.'

  '2.6.9.' is arrived at by inspecting the DDSHOFMT format and
  adding a period to the end of each attribute's id number. In
  other words, in the DDSHOFMT format, 2 = '02-49 Clause', so
  append a period after the 2. Another way of saying this is that
  the 02-49 Clause sets bit 2 in the attribute vector, the PICTURE
  Clause sets bit 6 in the attribute vector and the SYNCHRONIZED
  Clause sets bit 9 in the attribute vector.

- Once you have added this to the DDAVFMT format in R2COB1, you
  will most likely have to add appropriate processing for it in
  R2COB2. If you must extract any information from the clause, edit
  the IN_DD routine. The clause will be in the CLS_STR. If you must
  perform processing peculiar to this set of attributes when the
  end of the entry is encountered, then add that in the EODDNTRY
  routine. (NOTE: For clauses which must set Clause Mode to values
  other than 'Level Clause' and 'Simple Clause', be sure to update
  the EODDSCTN routine as well.)

* Modifying the program to parse other entries

If you wish to modify the program to parse entries other than data
description entries, follow these guidelines.

1) Build a Token Format for the entry.

   - Take the general format from the ANSI Reference Summary.

   - Remove any clause that contains either ellipses or more than 1
     programmer supplied word.

   - Working down from the top of the general format, enter each
     reserved word in the description token format.

   - Items enclosed in braces will produce the same output when
     formatted (for example, PICTURE and PIC).
   - Simply assign a formatted value to each set of reserved words
     (for example, 'PICTURE','PIC' = '9.').

2) Build an Identify Clause Format for the entry.

   - Take the general format from the ANSI Reference Summary.

   - Enter the reserved word or words that identify each clause in
     the entry. (In other words, enter each token that initiates a
     clause in the entry.)

   - In the case where the clause contains ellipses or more than 1
     programmer supplied word, assign a unique value for the
     formatted output.

   - Assign a unique value for the formatted output to those tokens
     that initiate the entire entry. For example, the level
     indicators.

   - For all tokens that initiate a simple clause (that is, a
     clause that does not contain an ellipsis or more than 1
     programmer supplied word), assign the value 'IDENTIFIED'.

   - Any other token, when formatted, has the value 'UNIDENTIFIED'.

3) Build a Clause Format for the entry.

   - Take the general format from the ANSI Reference Summary.

   - Remove any clause that contains either ellipses or more than 1
     programmer supplied word.

   - Working down from the top of the general format, combine the
     formatted output from the Token Format to form each valid
     clause.

   - Programmer supplied words are assigned the value 'OTHER' in
     the Token Format. Underlined reserved words are required in
     the clause. At most 1 formatted value is available for items
     in braces. It is up to you to form all possible valid
     combinations of tokens within each clause.

   - The formatted output of the Clause Format is a string. Assign
     values to this string by choosing sequential numbers starting
     at the top and working down.

   - All other values, when formatted, are assigned the value
     'UNDEFINED'.

4) Build an Attribute Vector Format.

   - The formatted output from the Clause Format is treated as a
     number. That number is used as an exponent of 2. The result of
     raising the base 2 to that exponent value is added to the
     Attribute Vector Sum (AV_SUM).
   - When the end of an entry is found (which is either at the end
     of the section or the beginning of the next entry) the value
     in AV_SUM is formatted with Attribute Vector Format.

   - This formatted output is used to determine which attributes
     are in the entry just completed.

* Acknowledgements

COBOL is an industry language and is not the property of any
company or any group of companies, or of any organization or any
group of organizations.

No warranty, expressed or implied, is made by any contributor or by
the CODASYL Programming Language Committee as to the accuracy and
functioning of the programming system and language. Moreover, no
responsibility is assumed by any contributor, or by the committee,
in connection therewith.

The authors and copyright holders of the copyrighted material used
herein

   FLOW-MATIC (Trademark of Sperry Rand Corporation), Programming
   for the UNIVAC(R) I and II, Data Automation Systems copyrighted
   1958, 1959, by Sperry Rand Corporation; IBM Commercial
   Translator Form No. F 28-8013, copyrighted 1959 by IBM; FACT,
   DSI 27A5260-2760, copyrighted 1960 by Minneapolis-Honeywell

have specifically authorized the use of this material in whole or
in part, in the COBOL specifications. Such authorization extends to
the reproduction and use of COBOL specifications in programming
manuals or similar publications.

Copies of ANSI standard X3.23-1985 are available from the American
National Standards Institute, 1430 Broadway, New York, NY 10018.

IBM is a registered trademark of International Business Machines
Corporation.

SAS is a registered trademark of SAS Institute Inc., Cary, NC USA.

VMS is a trademark of Digital Equipment Corporation.