What is COBOL? COBOL programming explained

The 60-year-old programming language that powers a huge slice of the world’s most critical business systems needs programmers


Some technologies never die—they just fade into the woodwork. 

Ask the average software developer about COBOL (Common Business Oriented Language) and they’ll look at you as if you mentioned carbon paper, leaded gasoline, or the 78 RPM record. Compared to modern languages like Go or Python—or even Pascal or C!—COBOL seems wordy, clunky, passé.

But COBOL has endured. Far from an obsolescent technology we’ve happily parted company with, COBOL has become an institution. Massive COBOL codebases are still in use around the world, many of them running almost exactly as they were when first created. In Hollywood parlance, the COBOL language has “legs.”

So, yes, COBOL is still relevant and timely—painfully so, in fact. In recent months COBOL has re-entered the public consciousness, as states like New Jersey have put out a call for programmers to help move their COBOL applications into the 21st century.

In this piece we’ll look at COBOL’s origins, how the design of the programming language stands out even today, and what makes COBOL both so enduring and so intractable.

COBOL history

COBOL arose in the late 1950s and early 1960s. Its development was sponsored by the United States Department of Defense (DoD) and involved a consortium of computer companies, among them IBM, Honeywell, Sperry Rand, and Burroughs. The goal was to create a programming language with the following attributes:

  • Portability between computer systems, thus making it easier to migrate software both across generations of hardware and between hardware makers.
  • More English-like syntax than other languages of the time (e.g., FORTRAN) as a way to encourage programming by a wider audience, even if at the expense of some operational speed.
  • The ability to accommodate future changes to the language.

The first official COBOL specifications came out in 1960. Over the next decade, and to the consternation of its critics, COBOL became the default choice for writing business applications. One reason for its fast spread was network effects: IBM, one of the original collaborators on the language, became an aggressive early adopter, and IBM’s dominating presence in the computing world helped contribute to COBOL adoption.

Due to its design advantages and heavyweight industry backing, COBOL has stuck around, outliving the original systems it was designed for by a wide margin. According to various estimates, by 1970 COBOL was the most widely used programming language in the world. By 1997, COBOL was believed to be running some 80 percent of business apps.

COBOL language

The designers of COBOL broke with the terse syntax of other programming languages at the time (again, such as FORTRAN). The idea was to create a programming language that could be read and understood by non-programmers, particularly accounting, finance, insurance, and other business professionals.

Consider a “hello world” program written in an early dialect of COBOL:

IDENTIFICATION DIVISION.
PROGRAM-ID. HELLO-WORLD.

PROCEDURE DIVISION.
DISPLAY 'Hello World!'.
STOP RUN.

For modern software developers reared on the terseness of languages like Python, this code is verbose. But the verbosity of COBOL (if not its execution) springs from the same conceit that informs modern languages like Python — that code is read many more times than it’s written, so it should be written to be readable.

A similar program in a more modern version of COBOL might look something like this:

program-id. hello.
procedure division.
display "Hello world!".
stop run.

While this example is more concise, the same basic principles apply: The code strives to be explicit about what’s going on at each step.

COBOL has strict rules regarding syntax and the internal organization of programs. A COBOL program is explicitly divided into sections, or divisions, that make it easier to locate and understand its components at a glance:

  • Identification division: Essentially a metadata section, containing details about the program, its author, and so on.
  • Environment division: Contains details about the runtime environment, for instance aliases for external devices, which might need editing when running the program on different hardware. This aided portability of a program between systems, where for instance I/O might be handled entirely differently.
  • Data division: Containing file and working storage sections, the Data division describes the files and variables (respectively) used in the program.
  • Procedure division: The actual program code lives here, broken into logical units called sections, paragraphs, sentences, and statements. It’s tempting to liken these structures to modules or functions, because they serve roughly the same purpose (dividing code into blocks, with constrained inputs and outputs), but they are far less flexible.
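To make the four divisions concrete, here is a minimal sketch of a program that uses all of them. The program name, the author entry, and the WS- data names are hypothetical, chosen only for illustration, and the code is laid out in traditional fixed format (statements indented past column 7).

```cobol
       IDENTIFICATION DIVISION.
      * Metadata about the program. The AUTHOR paragraph is marked
      * obsolete in later standards but many compilers still accept it.
       PROGRAM-ID. INVOICE-TOTAL.
       AUTHOR. EXAMPLE AUTHOR.

       ENVIRONMENT DIVISION.
      * Nothing hardware-specific is needed for this sketch, so the
      * division is left empty.

       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Variables, each declared with a PICTURE (PIC) clause.
       01  WS-UNIT-PRICE   PIC 9(3)V99 VALUE 19.99.
       01  WS-QUANTITY     PIC 9(3)    VALUE 3.
       01  WS-LINE-TOTAL   PIC 9(5)V99 VALUE ZERO.
       01  WS-TOTAL-OUT    PIC ZZ,ZZ9.99.

       PROCEDURE DIVISION.
       MAIN-PARAGRAPH.
      * The program logic, written as English-like statements.
           MULTIPLY WS-UNIT-PRICE BY WS-QUANTITY
               GIVING WS-LINE-TOTAL.
           MOVE WS-LINE-TOTAL TO WS-TOTAL-OUT.
           DISPLAY "Line total: " WS-TOTAL-OUT.
           STOP RUN.
```

With an open source compiler such as GnuCOBOL, a file like this can typically be built with cobc -x and should display a line total of 59.97.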

COBOL also has extremely strict formatting rules for the code, down to the number of spaces preceding a command. (Python users will find this familiar!) Some of these restrictions are a by-product of COBOL’s coming-of-age during the mainframe era of the 1960s, when programs were encoded on punched cards and the exact formatting of 80-column lines mattered. But other formatting restrictions enforce readability.
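As a rough illustration of those column rules, the sketch below annotates the traditional fixed-format layout. The program and paragraph names are hypothetical, and modern compilers usually offer a free-format mode that relaxes these rules.

```cobol
      * Cols 1-6: sequence area (left blank here).
      * Col 7: indicator area; an asterisk marks a comment line.
      * Cols 8-11: Area A (division, section, paragraph names).
      * Cols 12-72: Area B (statements).
       IDENTIFICATION DIVISION.
       PROGRAM-ID. COLUMN-DEMO.
       PROCEDURE DIVISION.
       MAIN-PARAGRAPH.
           DISPLAY "Statements begin in Area B.".
           STOP RUN.
```

Columns 73 through 80 of each 80-column card were traditionally reserved for identification, so compilers ignore anything placed there.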

The idea behind the strict regimentation of COBOL programs is to make them as self-documenting as possible. After all, COBOL programs tended to remain in place for years or decades on end. The intent (if not always the end result) was to make every COBOL program an artifact that any COBOL programmer could understand, even years later, without the help of the programmer who created it. 

COBOL challenges

Much of COBOL’s continued prevalence—and inertia—comes from the fact that COBOL applications, once written, tended to be left in place indefinitely, with only minor modifications. The bigger and more mission-critical the app, the less likely it was to be disturbed. Mainframes, like IBM’s offerings, played a key role: They were built to be highly backward compatible and to run legacy software—like COBOL apps—across generations of hardware with minimal modifications. The result: Billions of lines of COBOL code running essentially unchanged for decades on end.

Over the years, COBOL has evolved, if slowly. It now even has an object-oriented variant, OO-COBOL, and newer versions of the language support modern features like Unicode, locales, and more advanced data types beyond strings and integers. But COBOL aggressively retains backward compatibility, so even these improvements and extensions adhere to the mandate that existing COBOL applications must continue to run.

Not all of COBOL’s language design choices have been popular with COBOL programmers. Some have led to overly complex programs that proved difficult to understand or debug, discouraging rewrites or improvements. COBOL’s GO TO command, like its counterpart in C, allowed programmers to jump freely around a program, and thus write more powerful applications. But undisciplined use of GO TO could turn a COBOL program into a rat’s nest of hard-to-trace cross-references.
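The contrast can be sketched in a few lines. The hypothetical program below sums the numbers 1 through 5 twice: first with a loop built from GO TO jumps between paragraphs, then with a structured PERFORM loop. All names are invented for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. GOTO-DEMO.

       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-COUNTER   PIC 9(2) VALUE 1.
       01  WS-SUM       PIC 9(3) VALUE ZERO.

       PROCEDURE DIVISION.
      * Style 1: the loop exists only as jumps between paragraphs,
      * so the reader has to trace every GO TO to follow the flow.
       ADD-STEP.
           ADD WS-COUNTER TO WS-SUM.
           ADD 1 TO WS-COUNTER.
           IF WS-COUNTER > 5
               GO TO SHOW-RESULT
           END-IF.
           GO TO ADD-STEP.
       SHOW-RESULT.
           DISPLAY "GO TO sum:   " WS-SUM.
      * Style 2: the same loop as one structured PERFORM, with its
      * start, step, and exit condition visible in one place.
           MOVE ZERO TO WS-SUM.
           PERFORM VARYING WS-COUNTER FROM 1 BY 1
                   UNTIL WS-COUNTER > 5
               ADD WS-COUNTER TO WS-SUM
           END-PERFORM.
           DISPLAY "PERFORM sum: " WS-SUM.
           STOP RUN.
```

Both loops arrive at the same sum of 15. In a five-line loop the GO TO version is still easy to follow; the trouble starts when dozens of such jumps crisscross a program of thousands of lines.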

COBOL programming today

COBOL survives today in a few incarnations. IBM actively maintains its own COBOL implementations, such as a version for the z/OS mainframe, and sustains many existing COBOL applications where they run. Micro Focus COBOL is a commercial COBOL edition that runs on Microsoft Windows, compiles COBOL applications to Java and .NET, and even deploys to cloud environments like Azure. You’ll also find open source implementations of COBOL, such as GnuCOBOL and the gcc-cobol frontend for the GCC compiler, which are freely available and compile to native machine code. However, they may lack some of the more advanced deployment or debugging features of the commercial COBOLs.

While COBOL remains in wide use, deep COBOL expertise is becoming harder to come by with each passing year. As a result, many former COBOL programmers have to be coaxed out of retirement to wrangle older applications into the 21st century. Often, it isn’t COBOL programming knowledge that’s most at a premium, but intimate understanding of the mainframe environments where COBOL runs. Many COBOL applications work hand-in-hand with legacy technology such as IBM’s IMS and CICS transaction management and database systems, all of which require expertise that is increasingly rare.

Thus, as old-school as COBOL might seem, the need for COBOL language and development-environment expertise has grown with each passing year. Job listings for COBOL and related expertise abound. In April 2020, New Jersey put out an emergency call for COBOL programmers to help upgrade state unemployment benefits systems in the wake of the COVID-19 crunch.

Learn COBOL

Learning resources for COBOL are proliferating again, given the growing demand for the language. Modern developers who want to get up to speed with this most enduring of languages have a few options:

  • The University of Limerick, in Ireland, offers a complete COBOL programming course online, courtesy of its Department of Computer Science and Information Systems. It is not as up-to-date as some other resources, but given how little COBOL changes with time, that’s not necessarily a defect.
  • The Open Mainframe Project (part of the Linux Foundation) also offers COBOL resources. One is a full course in COBOL programming, co-sponsored by IBM. It is more modern than the University of Limerick course, and tailored to IBM’s z/OS implementation of COBOL, which is a widely deployed version of the language.

COBOL has been a staple of business computing for decades, and the demand for COBOL programming talent only continues to grow. If maintaining or modernizing COBOL programs interests you, the time seems riper than ever to dive in.

Copyright © 2020 IDG Communications, Inc.