Technical Descriptions of Ada and Other Third-Generation Programming Languages
This appendix gives technical and historical descriptions of Ada and the most often cited third-generation programming languages (3GLs), C and C++, together with a new 3GL, Java, which is the focus of rapidly growing interest in the programming community and a potential candidate for replacing C, C++, or Ada in certain application domains. These descriptions are followed by a comparison of the languages in terms of their capability for ensuring high reliability and for supporting the requirements of long-lived, embedded, real-time, and/or distributed systems.
Any software system can be implemented in essentially any reasonably complete programming language. However, languages vary with respect to how effectively (in terms of cost, schedule, and level of risk) they support the programming of a solution that successfully achieves the required functionality and quality. The descriptions below are intended to clarify such variations among languages. In contrast to the approach taken in the section in Chapter 2 titled "Software Engineering Process and Architecture," architecture, design, and development and maintenance processes here are held fixed; for the purposes of the discussion in this appendix, the only independent variable is programming language choice. As pointed out in Chapter 2, certain programming language choices may enhance the development and maintenance process itself, but that interaction is ignored for the purposes of this comparison.
Ada 83 was the result of a requirements-driven language design competition, beginning in 1975 with the first "Strawman" requirements document, continuing through a series of requirements documents culminating in the "Steelman" document, and resulting in a preliminary standard in 1980, an American National Standards Institute (ANSI) standard in 1983, and an International Organization for Standardization (ISO) endorsement of the ANSI standard in 1987. The actual design work was performed by a design team, with review by a panel of experts and the interested public at large.
Major concerns in the design of Ada 83 were reliability, maintainability, human engineering, and efficiency. Human engineering refers to choosing keywords, syntax, and semantics to maximize readability, while trying to minimize "surprise" and error-prone constructs. For example, all control flow constructs have a distinct "end" marker (e.g., "end if," "end loop"), and all program units allow the name of the unit to be repeated (and checked at compile-time) at the end marker. Parameters may be specified as "in," "out," or "in out" to indicate the direction of information flow upon subprogram call. Formal parameter names may be used at the call point to identify unambiguously the association between formal and actual parameters.
Ada 83 supports strong type checking, extended to provide strong distinctions between otherwise structurally equivalent numeric types, as well as between otherwise structurally equivalent array types and pointer types. Ada 83 is unusual in that it allows the programmer to distinguish two same-sized integer types as representing distinct abstractions, and to specify that an array is meaningfully indexed by one, but not the other, or that a subprogram can meaningfully be passed one, but not the other. For example, the two integer type declarations
type Month_Number is range 1..12;
type Hour_Number is range 1..12;
introduce two distinct integer types, and the fact that they have identical ranges does not alter the fact that they are distinguishable at compile-time when used as array indices, subprogram parameters, and record components. The compiler will detect the use of a value of one type when the other is expected. Furthermore, a change to one, such as switching Hour_Number to be range 0..23, does not have an unintended effect on some other abstraction.
Ada 83 supports data abstraction, modularity, and information hiding through a module construct called a "package" and through "private" types, types whose internal structure is hidden from code outside the defining package. Objects, subprograms, and any other language entity may be declared in the private part or body of a package, thereby hiding it from external access, and allowing revision during maintenance without disturbing external clients of the package.
Program units may be separately compiled while preserving full compile-time consistency checking across units. All program units may have a separate specification and body. This separation permits physical configuration control of interfaces, which supports productive parallel development of large systems, and it enables interface integrity to be verified before, rather than after, the code is developed.
Packages and subprograms may be defined as "generic" units, which are parameterized by types, objects, and subprograms. Such generic units must be explicitly instantiated with appropriate actual parameters prior to use. Like other units, generic units have a separate specification and body. When a generic unit is compiled, it is checked for legality. Further checks are performed when the unit is instantiated.
Ada 83 defines a complete set of run-time consistency checks to enforce range constraints on numeric types, index constraints on array types, and "discriminant" constraints on other composite types. In addition, all pointers are default initialized to null, and checked for null prior to dereferencing. Ada 83 defines an ability to raise and handle run-time exceptions. The predefined run-time checks all raise such run-time exceptions, allowing the programmer to write fault-tolerant code that catches unanticipated software problems, and performs appropriate recovery or disciplined shutdown actions.
Ada 83 includes a standard multithreading model, with a rendezvous construct to support inter-thread communication and synchronization. Explicit delays are supported, as is timed rendezvous. Finally, Ada 83 includes constructs for explicit user control over representation of types, as well as a "pack" directive to influence the compiler's selection of representation.
The current Ada standard, Ada 95, was developed between 1990 and 1995. As with Ada 83, the development was performed by a language design team, and requirements and review were provided through an open forum. In February 1995, the revised language was approved as an ISO standard, replacing the former edition of the standard. The overall goal of the Ada 95 design process was to maintain the reliability, maintainability, human engineering, and efficiency of Ada 83, while enhancing the flexibility and extensibility of the language, and the programmer's control over storage management and synchronization.
Ada 95 generalized the type definition mechanisms of Ada 83 to allow a type to be defined as an "extension" of another type, and to treat a type and all its extensions, direct and indirect, as a "derivation class" of types, with "class-wide" operations and dynamically bound implementations of operations. Added to the existing support for abstraction and modularity, type extension and dynamic binding give Ada 95 support for the object-oriented programming paradigm.
Ada 95 also enhanced the multithreading model, by providing "protected objects" that allow the programming of data-oriented synchronization mechanisms, without introducing additional threads.
Ada 95 added support for pointers to subprograms, as well as pointers to declared, as opposed to heap-allocated, objects. All access types include an "accessibility" level, which is checked by the implementation, generally at compile-time, to prevent the creation of dangling references.
The numeric model was enhanced with the addition of modular (unsigned, wraparound) integers with bit-wise logical operators, and decimal fixed-point types, to support exact financial calculations.
The generic facility was enhanced to allow parameterization by packages that are instances of other generics, so that layered generic abstractions may be defined. In addition, the generic "contract" model was strengthened so that the legality of an instantiation is fully determined by the actual parameters and the generic specification, allowing the body of the generic to be altered during maintenance without endangering the legality of existing instantiations.
Where appropriate, additional run-time checks were defined in Ada 95 to support the enhanced features. In particular, a conversion from a class-wide type to an extension of its root type involves a run-time check to ensure that the conversion is meaningful, as does a conversion from an "anonymous" access type to a named access type to prevent the creation of a dangling reference (based on the "accessibility" level mentioned above).
In addition to these syntactic and semantic enhancements to the language, a number of additional standard packages, pragmas, and attributes are defined in "annexes" to the standard. Some of these packages, pragmas, and attributes must be supported by all implementations, such as packages for string manipulation and random number generation and pragmas for interfacing to other languages. Others are specifically designed to support particular application domains, such as real-time, distributed systems, and safety/security-critical systems.
The C language was designed at Bell Laboratories in the early 1970s, as a successor to the language BCPL, for the purpose of writing an operating system (Unix) and associated utilities for minicomputers. During the late 1970s, C and Unix were used widely in universities, and during the 1980s C emerged as the language of choice for systems programming on minicomputers, workstations, and personal computers. The ANSI standard for C was approved in 1989, and the ISO standard based on ANSI C was approved in 1990.
C has a sparse syntax, with braces used for begin and end markers in all control flow, program unit, and type declaration constructs. Single-character operators are provided for assignment, indirection, address-of, bit-wise and, or, "xor," and not, and the usual arithmetic operations. Operators are also provided for pre- and post-increment and decrement, operate-and-assign, and left- and right-shift.
Numeric data types are selected by names, such as "short int" or "long double." There is no capability to select a numeric type by required range or precision, and there is no notion of implementation-enforced range constraint. Enumeration data types are supported, but are implicitly convertible to and from integer types in any context.
Historically, interface definitions have not been necessary for C functions, with the default being that a function returns an "int" and takes any number of parameters. ANSI C introduced the notion of a function "prototype" to specify the function interface, and some implementations can be directed to require the presence of a prototype for all functions.
All arrays are indexable by any integer or enumeration type; all arrays have a low bound of zero, and a high bound of one less than the specified size. No bounds information is carried with array parameters, and no bounds checking is defined by the language standard, although some tools exist that will check for out-of-bounds references. Arrays are treated by the language as essentially constant-valued pointers, and array indexing is defined in terms of an indirection applied to the sum of a pointer and an integer index.
Strings in C are represented by a pointer to their first character, with a null character used by convention to signify the end of the string. There is no language-defined checking for running off the end of a string.
Record-like "structs" are supported, but there is no language-defined data abstraction mechanism. Opaque, incomplete pointer types can be used to provide some degree of data abstraction. A "union" construct allows the creation of an undiscriminated union of types. There is no language-defined check for accessing the "wrong" member of a union.
The "cast" construct may be used to explicitly convert between numeric types (although implicit conversion is performed as part of a function call, and implicit widening is performed during arithmetic). The cast construct may also be used to convert between pointer types, or between an integer and a pointer type. There is no language-defined check associated with a cast.
No default initialization is defined by the standard for local variables; pointers, in particular, are not default initialized. No null-checking is defined for pointer indirection.
There is no language-defined construct for raising and handling exceptions, although there are standard functions for sending and handling "signals," which can be used to emulate exceptions in certain circumstances.
C provides some control over representation by the use of bit field indicators on "struct" components. However, it does not define the ordering of bit fields within a word. Some implementations provide "pack" pragmas or other means of providing more representation control.
There is no language-defined "module" construct other than a source file; objects and functions declared "static" are local to the source file. Objects and functions not declared "static," when defined at the top level, are externally visible from any other file that includes an "extern" declaration for the entity. By convention, the "extern" declarations for a source file, and associated type definitions, are usually grouped into a header file (".h" file), which can be textually included ("#include") in any source file requiring access to the type, object, or function.
C includes a standard preprocessor that supports textual include, conditional compilation, and parameterized textual macros.
The ANSI C standard includes a full set of library functions to support string manipulation (where a string is a null-terminated array of characters), random number generation, and input/output, among others.
The C++ language was first released in 1983 as an enhancement to C, with the major enhancement being the addition of a "class" construct inspired by the same-named feature of the language Simula-67. The language was initially defined by the implementation available from AT&T ("cfront") that translated C++ to C. Cfront, and hence C++, went through several major updates that added features such as multiple inheritance, generic templates, and exception handling. In the early 1990s, ANSI and ISO committees were formed to produce a standard for the language. A few additional features, such as run-time type identification and "namespaces," have been added during the standardization process. Approval of the ISO C++ standard is expected within the next year.
C++ includes all the features of C, although some features are revised to be more strongly typed. For example, enumeration types in C++ are implicitly convertible to integer types, but not implicitly convertible back. Also, function prototypes are required for all C++ functions. C++ adds to C support for data abstraction, type inheritance, and dynamic binding (virtual functions). Two kinds of multiple inheritance are supported: the default inheritance replicates the fields if the same base class is inherited through multiple paths, and the virtual inheritance shares fields if the same base class is inherited through multiple paths.
C++ also supports a generic template facility. No checking is defined for templates prior to instantiation; there is no template "contract" model. Instantiation is implicit by referring to an instance via "template_name<parameters>." Template functions are also supported; instantiation of a template function is automatic at a call, with the template parameters determined implicitly by the types of the call parameters.
C++ supports "throw"ing and "catch"ing exceptions. Exceptions can be represented by objects of any type; the "catch" is selected by type matching. The standard C++ library defines certain exception types, instances of which are thrown when an allocator fails to allocate storage, or when other errors occur.
The array indexing and cast constructs inherited from C remain unchecked in C++. There are standard templates for defining checked arrays and checked casts. Local pointers in C++ are not default initialized, and there is no language-defined check for dereferencing a null pointer. "Smart pointer" abstractions can be developed to check for null pointers, or to implement persistence or similar capabilities.
As in C, all numeric types are implicitly convertible on assignment and parameter passing, and implicitly "widened" in calculations.
C++ supports information hiding through the notion of protected and private data and function members. Private members are visible only inside a class (and to its "friends"). Protected members are visible inside all descendants of a class. C++ supports a multilevel namespace through a "namespace" construct, which provides no information hiding (there is no "private" part of a namespace) and is simply a hierarchical naming mechanism.
Java was developed over the past 5 years at Sun Microsystems. It was originally called "Oak" and was intended for use in small appliances, set-top boxes, and other embedded applications. In April 1995, a World Wide Web browser written in Java, called HotJava, was announced by Sun. HotJava had the ability to download small programs written in Java over the Web and execute them in the context of a Hypertext Markup Language (HTML) page being displayed by the Web browser. Since then, Sun's Java technology has been licensed by essentially all other Web browser developers, including Netscape and Microsoft, and has achieved widespread attention for its potential to provide many of the capabilities of client/server systems without many of the attendant complexities.
Java is syntactically based on C++ but semantically is closer to Modula-3 or Ada 95. It provides modularity through a combination of a "package" concept, which is a namespace with some information hiding associated with it, and the "class" construct, which is modeled closely on the C++ (and Simula-67) class construct. To support information hiding, methods (called "member functions" in C++) and data components may be marked as public, protected, or private, much as in C++, but with the added notion that, by default, methods and data are visible only to classes within the same package. Unlike C++, there is no textual "include" in Java; instead, individual classes, or a whole package of classes, are explicitly imported using an "import" statement at the top of the source file defining a class.
All code and objects in Java must be inside some class. The methods of a class are by default "virtual" in Java; calls to such methods are "dynamically" bound. Methods may be explicitly specified as "static"; calls to such methods are "statically" bound. The data components of a class are by default "per-instance," as in C++. Data components may be marked "static," which means that they are "statically" allocated and shared across the class, rather than one per instance.
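The difference between dynamically bound instance methods and statically bound "static" methods, and between per-instance and "static" data, can be sketched as follows. The class names (Shape, Circle, BindingDemo) are hypothetical and chosen only for illustration.

```java
// Hypothetical classes for illustration; none of these names come from the text.
class Shape {
    static int created = 0;      // static data: one copy shared across the class
    final String label;          // per-instance data

    Shape(String label) {
        this.label = label;
        created++;
    }

    // Instance method: "virtual" by default, so calls are dynamically bound.
    String describe() {
        return "shape " + label;
    }

    // Static method: bound at compile time to the class named in the call.
    static String kind() {
        return "Shape";
    }
}

class Circle extends Shape {
    Circle(String label) {
        super(label);
    }

    @Override
    String describe() {
        return "circle " + label;
    }

    // Hides (does not override) Shape.kind().
    static String kind() {
        return "Circle";
    }
}

final class BindingDemo {
    static String describeViaBase(Shape s) {
        return s.describe();     // resolved by the run-time class of s
    }

    public static void main(String[] args) {
        Shape s = new Circle("c1");
        System.out.println(describeViaBase(s));
        System.out.println(Shape.kind());
        System.out.println(Shape.created);
    }
}
```

Although the variable in main is declared as Shape, the call to describe() reaches the Circle version, while Shape.kind() is selected purely by the class named in the source.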
Java fully supports single inheritance between classes. By default a class inherits from the single "root" type called "java.lang.Object." Alternatively, it may explicitly specify one parent class from which it inherits non-static methods and data components. Java provides a limited kind of multiple inheritance through the concept of an "interface" type: a list of methods that any "implementer" of the interface must provide. A class may specify any number of interface types that it claims to "implement." The compiler verifies that the methods required by each identified interface are present in the class. There is no separate specification for a class (other than that provided by interface types it implements). There is no separate "prototype" for a method of a class. A tool may be used to extract the documentation and specification for a class.
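As a minimal sketch of the interface mechanism, the following class inherits implicitly from java.lang.Object and claims to implement two interfaces. The names Resettable, Named, Counter, and InterfaceDemo are assumptions made for this example.

```java
// Hypothetical interfaces: each is only a list of required methods.
interface Resettable {
    void reset();
}

interface Named {
    String name();
}

// Inherits from java.lang.Object by default and implements two interfaces.
// The compiler rejects the class if reset() or name() is missing.
class Counter implements Resettable, Named {
    private int count;

    void increment() {
        count++;
    }

    int value() {
        return count;
    }

    public void reset() {
        count = 0;
    }

    public String name() {
        return "counter";
    }
}

final class InterfaceDemo {
    // Works on any implementer of Resettable, whatever its parent class.
    static void resetAll(Resettable[] items) {
        for (int i = 0; i < items.length; i++) {
            items[i].reset();
        }
    }

    public static void main(String[] args) {
        Counter c = new Counter();
        c.increment();
        resetAll(new Resettable[] { c });
        System.out.println(c.name() + " = " + c.value());
    }
}
```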
Java has no direct support for enumeration types. Named integer constants may be used, but the compiler provides implicit widening between integer types on assignment and parameter passing and allows any integer type to index any array. Arrays in Java are indexed from zero, as in C and C++, but unlike C or C++, their semantics are not defined in terms of pointer arithmetic. In fact, Java does not support pointer arithmetic. Arrays are first class types, and carry a length at run-time against which all indexing is checked.
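A short sketch of these array rules follows. The constant names and the helper indexRejected are hypothetical; the example only shows that a narrower integer type can index an array, that integer values widen implicitly, and that an out-of-range index is caught by the run-time length check.

```java
final class ArrayCheckDemo {
    // Named integer constants standing in for an enumeration (hypothetical names).
    static final int RED = 0;
    static final int GREEN = 1;
    static final int BLUE = 2;

    // Returns true if reading table[index] was rejected by the run-time bounds check.
    static boolean indexRejected(int[] table, int index) {
        try {
            int unused = table[index];
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int[] counts = new int[3];       // indexed 0..2; the length travels with the array
        byte small = BLUE;               // constant value fits in a byte
        counts[small] = 7;               // a byte index is accepted without a cast
        long wide = counts[small];       // int widens implicitly to long
        System.out.println(wide);
        System.out.println(indexRejected(counts, counts.length));
    }
}
```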
Pointers ("references") in Java are default initialized to null, and all pointer dereferences are checked for null. Conversions between references are checked at run-time for meaningfulness.
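These reference checks can be sketched as below. Holder and ReferenceCheckDemo are hypothetical names. The example relies on the default initialization of a reference field (local variables, by contrast, must be assigned before use) and shows the exceptions raised by the null check and the conversion check.

```java
// Hypothetical class whose reference field is never explicitly assigned.
class Holder {
    String text;    // default initialized to null
}

final class ReferenceCheckDemo {
    // Returns true if dereferencing h.text was stopped by the null check.
    static boolean nullDereferenceRejected(Holder h) {
        try {
            h.text.length();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Returns true if converting value to String was stopped by the conversion check.
    static boolean conversionRejected(Object value) {
        try {
            String s = (String) value;
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(nullDereferenceRejected(new Holder()));
        System.out.println(conversionRejected(Integer.valueOf(1)));
        System.out.println(conversionRejected("a string"));
    }
}
```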
Java has exceptions, much as in C++, except that it enforces compatibility and completeness of "throw" signatures at compile-time (C++ enforces "throw" signatures at run-time). Failures of run-time checks, such as an array-bounds check, or a null-pointer check, result in a "throw" of a predefined exception. Run-time error exceptions do not need to be mentioned in a "throw" signature; other exceptions, including user-defined exceptions, do need to be mentioned in the "throw" signature of a method if it is going to throw or propagate the exception.
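The distinction between exceptions that must appear in a method's signature and run-time error exceptions that need not can be sketched as follows. SensorFault and the method names are assumptions introduced for the example.

```java
// Hypothetical user-defined exception; it must be declared wherever it can propagate.
class SensorFault extends Exception {
    SensorFault(String message) {
        super(message);
    }
}

final class ThrowsDemo {
    // The compiler requires "throws SensorFault" here because the exception can escape.
    static int readSensor(int raw) throws SensorFault {
        if (raw < 0) {
            throw new SensorFault("negative reading: " + raw);
        }
        return raw;
    }

    // No declaration needed: division by zero raises a predefined run-time error exception.
    static int ratio(int a, int b) {
        return a / b;
    }

    // A caller must either handle SensorFault, as here, or declare it in turn.
    static int readOrDefault(int raw, int fallback) {
        try {
            return readSensor(raw);
        } catch (SensorFault e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(readOrDefault(5, -1));
        System.out.println(readOrDefault(-3, -1));
    }
}
```

Removing the try/catch from readOrDefault without adding a matching declaration would be rejected at compile-time, which is the enforcement the paragraph above describes.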
Java has no generic templates; the root type java.lang.Object can be used in some contexts to define (heterogeneous) "generic" data structures. Proposals exist to add a parametric polymorphism facility to Java, which could provide some of the added compile-time type checking associated with "homogeneous" data structures provided by the generic template features of Ada and C++.
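A heterogeneous structure built on java.lang.Object might look like the sketch below. ObjectStack and HeterogeneousDemo are hypothetical names. Because the stack accepts any object, a mistaken assumption about the element type is found only by the run-time conversion check when the element is retrieved.

```java
// Hypothetical "generic" stack: elements are typed only as Object.
final class ObjectStack {
    private final Object[] items;
    private int size;

    ObjectStack(int capacity) {
        items = new Object[capacity];
    }

    void push(Object item) {
        items[size++] = item;
    }

    Object pop() {
        Object top = items[--size];
        items[size] = null;     // drop the reference so it can be collected
        return top;
    }
}

final class HeterogeneousDemo {
    // Returns true if the popped element turned out not to be a String.
    static boolean popAsStringFails(ObjectStack stack) {
        try {
            String s = (String) stack.pop();
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        ObjectStack stack = new ObjectStack(4);
        stack.push("text");
        stack.push(Integer.valueOf(42));    // nothing at compile-time prevents mixing types
        System.out.println(popAsStringFails(stack));
        System.out.println(popAsStringFails(stack));
    }
}
```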
Java has largely the same control flow constructs as C++. As in C and C++, switch statements rely on a programmer-inserted "break" to terminate a case. Java defines a special type "boolean" and requires a value of boolean type in the expression of an "if," "while," or "for" test. There is no implicit conversion to "boolean"; the relational operators return boolean, as do the logical operators. The operator "=" is for assignment; "==" is for equality. A "break" or "continue" statement may have an identifier to identify the particular construct being exited or continued, providing some additional flexibility and maintainability relative to C and C++.
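The labeled "break" and "continue" forms, together with the boolean requirement on tests, can be sketched as follows. The labels and method names are hypothetical.

```java
final class LabeledBreakDemo {
    // Returns the index of the first row containing target, or -1 if none does.
    static int firstRowContaining(int[][] grid, int target) {
        int found = -1;
        search:
        for (int row = 0; row < grid.length; row++) {
            for (int col = 0; col < grid[row].length; col++) {
                // The test must be boolean; writing "=" instead of "==" here
                // would yield an int and be rejected at compile-time.
                if (grid[row][col] == target) {
                    found = row;
                    break search;       // leaves both loops at once
                }
            }
        }
        return found;
    }

    // Counts the rows that contain no negative entry.
    static int rowsWithoutNegatives(int[][] grid) {
        int count = 0;
        rows:
        for (int row = 0; row < grid.length; row++) {
            for (int col = 0; col < grid[row].length; col++) {
                if (grid[row][col] < 0) {
                    continue rows;      // skip to the next row
                }
            }
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        int[][] grid = { { 1, 2 }, { 3, -4 }, { 5, 6 } };
        System.out.println(firstRowContaining(grid, -4));
        System.out.println(rowsWithoutNegatives(grid));
    }
}
```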
All class and array instances are allocated dynamically on a garbage collected heap. That is, there are no "stack-resident" arrays or class instances. All implementations of Java provide a garbage collector. There are no class instances "nested" inside other class instances, only references to dynamically allocated class instances. The same goes for arrays.
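One consequence of this model is that a component of a class instance is always a reference, so two instances can share the same component object. The sketch below uses hypothetical Point and Segment classes to show the sharing.

```java
// Hypothetical classes for illustration.
class Point {
    int x;
    int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

class Segment {
    // These are references to heap-allocated Points, not Points stored inside the Segment.
    Point start;
    Point end;

    Segment(Point start, Point end) {
        this.start = start;
        this.end = end;
    }
}

final class ReferenceFieldsDemo {
    // True if the two segments refer to the very same Point object at their junction.
    static boolean sharesEndpoint(Segment first, Segment second) {
        return first.end == second.start;
    }

    public static void main(String[] args) {
        Point joint = new Point(1, 1);
        Segment a = new Segment(new Point(0, 0), joint);
        Segment b = new Segment(joint, new Point(2, 2));
        joint.x = 5;    // visible through both segments
        System.out.println(a.end.x + " " + b.start.x + " " + sharesEndpoint(a, b));
    }
}
```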
There is no user control over representation of data objects. There is no user control over storage management, other than a system method to force a garbage collection.
Java has a large standard library of classes and includes support for multithreading through a combination of a standard thread class and the notion of "synchronized" methods.
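A minimal sketch of the standard thread class combined with "synchronized" methods follows. SharedCounter and SynchronizedDemo are hypothetical names, and the thread counts are arbitrary.

```java
// Hypothetical shared object; each synchronized method holds the object's lock while it runs.
final class SharedCounter {
    private int value;

    synchronized void increment() {
        value++;
    }

    synchronized int get() {
        return value;
    }
}

final class SynchronizedDemo {
    // Starts the given number of threads, each incrementing a shared counter perThread times.
    static int countWithThreads(int threads, final int perThread) {
        final SharedCounter counter = new SharedCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(new Runnable() {
                public void run() {
                    for (int n = 0; n < perThread; n++) {
                        counter.increment();
                    }
                }
            });
            workers[i].start();
        }
        try {
            for (int i = 0; i < threads; i++) {
                workers[i].join();      // wait for every worker to finish
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(4, 1000));
    }
}
```

Without the "synchronized" marking on increment(), concurrent updates could be lost and the final count could fall short of the expected total.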
In comparing the features of Ada, C, C++, and Java, various principles underlying each language can be identified.
With C, the underlying goal is to provide reasonable portability (certainly when compared with assembly language) while giving the programmer full control of the machine. There is little attempt to provide strong consistency checking at compile-time, and no notion whatsoever of run-time checking built into the language (other than via use of the standard "assert" macro).
C++ provides more tools for defining abstractions, and increases the strength of the type checking on enumeration types. However, the default run-time behavior in C++ is still inherited from C, which means no run-time initialization or checking of pointers, no checking of array indexing, and no notion of range checking. The default conversion syntax, the simple "cast" inherited from C, does no checking. The basic primitives of C++ remain unsafe, although there are additional mechanisms available for creating safe abstractions.
Java takes the route of strongly enforcing run-time consistency, with all the necessary checks to ensure that a program does not corrupt data outside its prescribed space, including pointer initialization and null checking, array-bounds checking, and conversion checks. However, at compile-time, Java has essentially gone one step backward from C++ by dropping support for enumeration types, thereby eliminating an important source of compile-time consistency checks. Java very successfully creates a language that prevents code from corrupting data outside its purview, but it fails to provide tools for supporting thorough compile-time enforcement of interface consistency.
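The gap between run-time and compile-time checking can be sketched with plain integer constants, as the text describes. All names below are hypothetical. An hour value passed where a month is expected compiles without complaint, while an out-of-range value is caught only when the array is indexed at run-time.

```java
final class WeakConstantsDemo {
    // Two unrelated abstractions, both represented as plain int (hypothetical names).
    static final int MONTH_MARCH = 3;
    static final int HOUR_THREE = 3;

    private static final String[] MONTH_NAMES = {
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
    };

    // Intended to take a month number in 1..12, but the signature can only say "int".
    static String monthName(int month) {
        return MONTH_NAMES[month - 1];
    }

    // Returns true if the bad month value was caught, which happens only at run-time.
    static boolean rejectedAtRunTime(int month) {
        try {
            monthName(month);
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Passing an hour where a month is expected is accepted by the compiler.
        System.out.println(monthName(HOUR_THREE));
        System.out.println(rejectedAtRunTime(13));
    }
}
```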
A second area of concern with regard to use of Java for critical systems development is that it is inextricably tied to a dynamic storage allocation model. Garbage collection is certainly less error-prone than is manual storage reclamation, but any use of dynamic storage allocation opens up the possibility of eventual storage exhaustion, as does dynamic stack extension. For an embedded or critical system, it is standard practice to require that all storage be allocated statically (at link time), including the stacks for all threads of control; recursion is also disallowed.
In comparison to the above languages, Ada 83 and Ada 95 attempt to provide more features to make compile-time consistency checking useful for finding mistakes, backed up by run-time consistency checks for cases in which only a dynamic check is meaningful. As mentioned above, Ada is one of the few languages that allows the programmer to create strong distinctions between structurally equivalent numeric, array, and pointer types. These distinctions allow an Ada interface to capture more of the semantics, and allow the Ada compiler to catch more mistakes in the use of an interface. The last decade has seen an explosion in the number of application programming interfaces (APIs) used to build systems. Inappropriate uses of an API are among the most common mistakes in such systems. By creating stronger distinctions between numeric, enumeration, array, and pointer types, an Ada version of an API can reduce the likelihood of inappropriate use, and identify more such errors at compile-time.
At run-time, Ada has pointer default initialization, pointer null checking, array bounds checking, with user control over both the low and high bound, and conversion checking. In addition, Ada provides range checking, variant record checking, and, in Ada 95, both compile-time and run-time checks designed to eliminate "dangling" references associated with pointers to deallocated stack variables. This set of "dangling reference" checks ("accessibility checks") allows an embedded or critical program to avoid completely the use of dynamic storage allocation, while still providing the convenience of using pointers.
Both Ada and Java have support for multithreaded applications as a standard, portable part of the language, whereas C and C++ support multithreading generally through operating-system-dependent libraries. The Ada multithreading support includes various real-time-oriented features, such as timed entry calls and selective accepts with delay alternatives, whereas Java has only a basic timed "sleep" operation. To the basic Ada 83 multithreading support, Ada 95 adds protected objects, which are designed to support real-time systems by reducing overhead, minimizing "priority inversion," and generally improving predictability of thread synchronization. Java's synchronized methods, with wait/notify operations, provide similar capability, although with less encapsulation of the fields requiring synchronized access, a more race-prone "notification"-oriented synchronization model, and no particular concern for priority inversion.
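The wait/notify style of synchronization mentioned above can be sketched with a one-slot mailbox. Mailbox and MailboxDemo are hypothetical names. Each waiting thread rechecks its condition in a loop after being notified, which is the usual precaution in a notification-oriented model.

```java
// Hypothetical one-slot buffer guarded by synchronized methods.
final class Mailbox {
    private Object slot;    // null means the mailbox is empty

    synchronized void put(Object message) throws InterruptedException {
        while (slot != null) {
            wait();         // releases the lock until notified, then rechecks
        }
        slot = message;
        notifyAll();
    }

    synchronized Object take() throws InterruptedException {
        while (slot == null) {
            wait();
        }
        Object message = slot;
        slot = null;
        notifyAll();
        return message;
    }
}

final class MailboxDemo {
    // A producer thread sends 1..n through the mailbox; the caller sums what it receives.
    static int sumThroughMailbox(final int n) {
        final Mailbox box = new Mailbox();
        Thread producer = new Thread(new Runnable() {
            public void run() {
                try {
                    for (int i = 1; i <= n; i++) {
                        box.put(Integer.valueOf(i));
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        producer.start();
        int sum = 0;
        try {
            for (int i = 0; i < n; i++) {
                sum += ((Integer) box.take()).intValue();
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while receiving", e);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumThroughMailbox(100));
    }
}
```

Note that the slot is protected only because every access goes through a synchronized method; nothing in the language ties the field itself to the lock, which is the weaker encapsulation the comparison refers to.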
Although Ada is a general-purpose 3GL, it was designed with extra attention to the concerns of real-time, embedded, and critical systems developers, namely very thorough consistency checking, mechanisms to support a very "static" storage allocation model, and multithreading support with time- and priority-cognizant constructs. As such, at a technical level, it is a better fit to the needs of DOD critical and embedded systems development than are the other languages in widespread commercial use. These reliability-oriented features of the Ada language make development and maintenance more cost-effective, when the cost to achieve the required level of quality and correct functionality is included. Of course, there are other non-technical issues involved in language choice (as discussed in Chapter 1), and other non-language issues involved in managing successful software development (discussed in Chapter 2).