

12. Contributing to COBOL for GCC Development

If you would like to work on improving COBOL for GCC, please contact me at tej@melbpc.org.au.

You can help by doing testing, by writing tests, by fixing bugs, or by writing code.

Contributions to the documentation are also very welcome, particularly for the chapter on the GCC internals.

We are working on a subset which will allow most of the runtime and parts of the compiler itself to be written in COBOL. So even if you don't know C, you will be able to work on most of the compiler and the runtime routines using COBOL.

A list of tasks is available on the web site at http://sourceforge.net/projects/CobolForGCC/.

Note that the GNU coding standards are used. These are available at http://www.fsf.org. They include naming and documentation standards as well as layout standards for C code.

Extreme programming is used. See section 12.3 Programming Standards and Methodology. In particular this means every feature or fixed bug must have a test, and we only add code we need for the features we are implementing now.

The core compiler will be released under the GNU General Public Licence, but the runtime routines will be released under the GNU Library General Public Licence. This means that in the library, software licenced under the standard GPL cannot be used without a release from the author.

For programming tasks, it is a good idea to define the interface to the program you are writing first and post that to the development mailing list. Once we have that agreed, write the tests and then the code.

Some of the types of tasks are:

  1. Documentation. Documentation of features supported by other compilers and the syntax they use, for example what the various COMP usages mean for the various compilers.
  2. COBOL Test Code (not hard if you know COBOL). Provide COBOL code that can be used to test various features of the compiler. Provide the code plus details of the expected output and any expected error messages or failure conditions, plus details of the feature being tested. Things we need to test include: are bad input programs detected; are non-standard programs detected; do verbs work as expected; does the compiler support the minima prescribed by the standard (e.g. minimum number of nested ifs); are useful error messages output in various situations; how well do we recover from certain error conditions; how fast do certain programs run. Normally we divide tests into small focussed tests for testing features, bugs or performance, and large torture tests for trying to push the compiler to and past its limits. At this stage tests of function in the COBOL nucleus are most important.
  3. Runtime Routines. Some of the COBOL features will not be compiled directly into compiled code; rather the compiled code will call runtime routines which will perform complex functions, such as sorting. In each of these cases, you need to define an interface to the routine from the generated code and then write a program that implements that interface. You can write this in C or in COBOL. No special skills are needed other than command of the language used and an understanding of the COBOL 85 standard. If you use COBOL, then you need to confine yourself to the compiler-use subset of COBOL. See section 12.2 COBOL Subset for use within the compiler.
  4. Grammar. YACC grammars for any part of COBOL outside the nucleus. We will need to parse the whole standard, even though we will probably not actually implement some of the language (segmentation, debug and communications will probably not be supported but we need to be able to parse them and report that we do not support them, to be standards compliant). COBOL is not LALR(1) compliant and YACC requires LALR(1) compliance. Therefore there is a need for some 'hacks' to help YACC along. See the file `cobctok.def' for documentation on the existing hacks and to see how they should be structured.
  5. Enhancements to GCC to improve its support of COBOL, e.g. native packed decimal support (very hard). One approach is to contribute patches to GCC to provide native support for packed decimal arithmetic on platforms that support it, like S/390. This requires excellent knowledge of GCC internals and the target machine codes involved, and good C coding skills. We would also need to provide builtin routines for platforms without useful packed decimal support. This is a hard task because the GCC internal structure assumes operations are done with registers, which is usually not the case for packed decimal instructions. An alternative approach to this problem could be to provide builtin routines for packed decimal which just happen to include some 'asm' assembler inline code on certain platforms (you can check for the platform using #ifdef type constructs).
  6. Fixes for any bugs.
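
The builtin-routine approach described in task 5 can be sketched in C roughly as follows. This is only an illustration, not code from COBOL for GCC: the names pd_from_int64, pd_to_int64 and pd_add are hypothetical, the layout (two digits per byte, sign in the low nibble of the last byte, 0xC for plus and 0xD for minus) is assumed to follow the usual S/390 convention, and the arithmetic goes through a 64-bit integer, so it only suits values of up to 18 digits.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed sign nibbles (the common S/390 convention).  */
#define PD_SIGN_PLUS  0x0C
#define PD_SIGN_MINUS 0x0D

/* Hypothetical helper: store VALUE as packed decimal in BUF[0..LEN-1].
   Every byte holds two digits except the last, which holds one digit
   and the sign nibble.  Returns 0 on success, -1 if VALUE needs more
   digits than the field has.  */
int
pd_from_int64 (unsigned char *buf, size_t len, int64_t value)
{
  uint64_t magnitude;
  size_t i;

  if (len == 0)
    return -1;

  /* Negate in unsigned arithmetic so INT64_MIN does not overflow.  */
  magnitude = value < 0 ? 0 - (uint64_t) value : (uint64_t) value;

  /* Last byte: lowest digit in the high nibble, sign in the low nibble.  */
  buf[len - 1] = (unsigned char) (((magnitude % 10) << 4)
                                  | (value < 0 ? PD_SIGN_MINUS : PD_SIGN_PLUS));
  magnitude /= 10;

  /* Remaining bytes, right to left, two digits each.  */
  for (i = len - 1; i-- > 0;)
    {
      unsigned int low, high;

      low = (unsigned int) (magnitude % 10);
      magnitude /= 10;
      high = (unsigned int) (magnitude % 10);
      magnitude /= 10;
      buf[i] = (unsigned char) ((high << 4) | low);
    }

  return magnitude == 0 ? 0 : -1;
}

/* Hypothetical helper: read a packed decimal field back into an integer.
   Assumes valid digit nibbles and a value of at most 18 digits.  */
int64_t
pd_to_int64 (const unsigned char *buf, size_t len)
{
  int64_t magnitude = 0;
  size_t i;

  if (len == 0)
    return 0;

  for (i = 0; i + 1 < len; i++)
    magnitude = magnitude * 100 + (buf[i] >> 4) * 10 + (buf[i] & 0x0F);
  magnitude = magnitude * 10 + (buf[len - 1] >> 4);

  return (buf[len - 1] & 0x0F) == PD_SIGN_MINUS ? -magnitude : magnitude;
}

/* Hypothetical builtin: RESULT = A + B, all fields LEN bytes long.
   Returns 0 on success, -1 on overflow of the result field.  */
int
pd_add (unsigned char *result, const unsigned char *a,
        const unsigned char *b, size_t len)
{
#if defined (__s390__)
  /* On S/390 a port could replace the portable code below with inline
     'asm' that uses the native packed decimal instructions.  Not shown.  */
#endif
  return pd_from_int64 (result, len, pd_to_int64 (a, len) + pd_to_int64 (b, len));
}
```

The generated code would call pd_add with the addresses of the operand fields; the non-zero return value is where a size error condition could be raised.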

12.1 Contributors to COBOL for GCC

The original author was Tim Josling. The COBOL for GCC program actually consists for the most part of the main GCC compiler, so all the contributors to GCC have indirectly also contributed to COBOL for GCC.

Others have contributed in various ways.

12.2 COBOL Subset for use within the compiler

The subset of the language that can be used in the runtime of the compiler and in compiling the rest of the language will be a limited subset. This will allow much of the compiler and runtime to be written in COBOL.

The compiler will be built in two phases. One, build a compiler that can handle the limited subset. Then build the full compiler. The code within the full compiler including the full compiler's runtime library can use the limited subset of COBOL.

The subset will have access to all the C runtime (including memory allocation, file IO, formatting, string, date and time) via the function call facility.
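
One practical detail when calling the C runtime from the subset is that a pic x(nnn) item is a fixed-length field with no terminating NUL, whereas most C string functions expect a NUL-terminated string. The sketch below shows one way a small C glue layer could convert between the two. The names picx_from_cstring and picx_to_cstring are hypothetical, and the space-padding rule is assumed from the usual behaviour of alphanumeric MOVE; nothing here is taken from the actual runtime.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the C string SRC into the fixed-length
   field FIELD of FIELD_LEN bytes, truncating on the right if SRC is
   too long and padding with spaces if it is too short.  Returns the
   number of characters taken from SRC.  */
size_t
picx_from_cstring (char *field, size_t field_len, const char *src)
{
  size_t src_len = strlen (src);
  size_t copied = src_len < field_len ? src_len : field_len;

  memcpy (field, src, copied);
  memset (field + copied, ' ', field_len - copied);
  return copied;
}

/* Hypothetical helper: copy FIELD into DEST as a NUL-terminated
   string with trailing spaces dropped.  DEST must have room for
   FIELD_LEN + 1 bytes.  Returns the length of the resulting string.  */
size_t
picx_to_cstring (char *dest, const char *field, size_t field_len)
{
  size_t len = field_len;

  while (len > 0 && field[len - 1] == ' ')
    len--;
  memcpy (dest, field, len);
  dest[len] = '\0';
  return len;
}
```

A wrapper around a C routine such as fopen would call picx_to_cstring on its file name argument before passing it on.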

What should be in the subset?

I am thinking:

Only parts of the Nucleus, plus support for functions (required to
interface to C), plus some new data types (pointers, binary-*) from the
new draft standard, and parts of interprogram communication, and ability
to create and call functions.

Excluding lots of things...

IDENTIFICATION DIVISION:

Unsupported: AUTHOR - comments can do this
Unsupported: INSTALLATION - comments
Unsupported: DATE-WRITTEN - comments
Unsupported: DATE-COMPILED - comments
Unsupported: SECURITY - comments

You get program-id/function-id and the repository paragraph from
COBOL 2000.

ENVIRONMENT DIVISION:

Unsupported: SOURCE-COMPUTER - obsolete (no debugging mode)
Unsupported: OBJECT-COMPUTER - obsolete
Unsupported: SPECIAL-NAMES - luxury

So you get basically nothing here.

DATA DIVISION:

A limited subset of PIC only - only pic x(nnn) or pic x. All numerics are via binary-xxx

A limited subset of USAGE only - only display plus
binary-char/short/long/double plus pointer (data pointers, program
pointers and function-pointers (an extension beyond COBOL 2002)).

Unsupported: SIGN IS
Unsupported: SYNCHRONIZED
Unsupported: JUSTIFIED
Unsupported: BLANK WHEN ZERO
Unsupported: RENAMES (66 level)
Unsupported: condition names (88 level) - syntactic sugar only.
OCCURS is allowed but not ascending/descending key. Occurs
depending on can have occurs from n to 0 where the zero means 'no
limit'.

So you get linkage and working storage, structures, redefines,
pointers, binary numbers and alphanumeric data (pic x), and
occurs.
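
For readers who think in C, the data items that remain map fairly directly onto C types. The sketch below is an assumption-laden illustration rather than the compiler's actual layout: it assumes binary-char, binary-short, binary-long and binary-double are 8, 16, 32 and 64 bit signed integers, that items are laid out without padding (SYNCHRONIZED is unsupported), and it uses the GNU C packed attribute to express that. The record and all names in it are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical COBOL subset record:

   01  customer-record.
       05  cust-id        usage binary-long.
       05  cust-flags     usage binary-char.
       05  cust-name      pic x(20).
       05  cust-balance   usage binary-double occurs 3.
       05  cust-next      usage pointer.  */

struct customer_record
{
  int32_t cust_id;          /* binary-long.  */
  int8_t cust_flags;        /* binary-char.  */
  char cust_name[20];       /* pic x(20): fixed length, no NUL.  */
  int64_t cust_balance[3];  /* binary-double occurs 3.  */
  void *cust_next;          /* usage pointer.  */
} __attribute__ ((packed));

/* REDEFINES gives two views of the same storage, much like a union:

   05  raw-bytes   pic x(4).
   05  as-number   redefines raw-bytes usage binary-long.  */

union redefined_field
{
  char raw_bytes[4];
  int32_t as_number;
};
```

With this layout cust-name starts at byte offset 5 and cust-balance at offset 25, which is what a C routine receiving the record would have to assume.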

The only values that are allowed are integers and alphanumerics
(character strings). Strings can be joined by the "&" operator. Thus
"AB" & "CD" is the same as "ABCD". Hex strings are allowed too: X"C1".

From COBOL 2000 you also get local-storage (automatic variables).

PROCEDURE DIVISION:

Limited support for: ACCEPT - "from" phrase not allowed.
Unsupported: ADD, DIVIDE, SUBTRACT, MULTIPLY - use compute
Unsupported: ALTER - obsolete
Limited support for: COMPUTE - only one receiving item allowed, no rounding, no
size error.
CONTINUE - allowed
Limited support for: DISPLAY - "upon" phrase not allowed
Unsupported: ENTER - obsolete - maybe later for embedding assembler
Unsupported: EVALUATE - not allowed
Unsupported: EXIT - has no effect, not allowed
Limited support for: GO TO - go to without procedure name not
allowed. Go to depending on supported.
Limited support for: IF - OK but next sentence not allowed
Unsupported: INITIALIZE - not allowed
Unsupported: INSPECT - not allowed
Limited support for: MOVE - corresponding phrase not allowed.
PERFORM - allowed
Unsupported: SEARCH - not allowed
Limited support for: SET - pointer arithmetic only.
Limited support for: STOP - OK but stop literal not allowed
Unsupported: STRING - not allowed
Unsupported: UNSTRING - not allowed
Unsupported: USE - not allowed

FUNCTION - ability to define and call functions per COBOL 2002, but the COBOL
intrinsic functions will not be there.

So you have limited forms of

accept, compute, display, go to, if, move, set, stop.

Also from inter-program communication you would have 

Limited support for: Procedure division using/returning but no by reference/content
Limited support for: CALL - no on overflow or on exception. No dynamic
calls, only call "literal".
Unsupported: CANCEL - not allowed
Unsupported: GLOBAL/EXTERNAL phrases - not allowed
EXIT PROGRAM - allowed (no 'goback')
linkage section.

12.3 Programming Standards and Methodology

I use the ExtremeProgramming methodology. (See c2.com/cgi/wiki?ExtremeProgrammingRoadmap)

In particular:

Development languages: GNU C, Flex, Bison, configure, dejagnu, COBOL subset, autogen.

Coding standards: Free Software Foundation standards and ExtremeProgramming standards.

