Technical Protection of Electronic Documents in Computer Systems
Valery A. Konyavsky
Scientific Research Institute of Problems of Computer Technology and Information Russian Ministry of Communications and Information
Information results from the reflection of the movement of material objects in living systems.1 It circulates among such organisms in the form of data and reports. Data are formed as a result of the reflection by these organisms of material objects, including reports. Reports are formed by organisms for the transmission of data to other organisms. They contain the totality of data being transmitted and represent a selection of signs with the help of which data may be transmitted to and received by another organism.
The transformation of data into reports and reports into data is carried out by individuals using algorithms for coding and decoding the set of symbols received into the elements comprising their “information” model of the world. Thus, information in the form of data is born in the minds of individuals (and only there) and cannot be protected through technical means.
Until recently, the problems of technical protection were reduced to the protection of computers from unauthorized access, the limitation of access to data, and network protection. Paradoxically, at none of these stages was there any discussion of what exactly we were protecting. It is obvious that if a plant produces teapots and a factory makes boots, then those goods will be the potential targets for crime. But computer systems do not produce information; they process certain reports and elaborate others. What, then, does the information system produce? What should be protected? If we agree that it does not produce information, then it remains to determine exactly what it does produce.
The results of information systems operations include spam (informational trash) and electronic documents. What is not deemed a document is just a useless scrap of paper. Structured and combined together, electronic documents represent information resources, which have value when—and only when—they are complete, authentic, accessible, and current.
For a report to be an electronic document, it must include a number of attributes attesting to its compliance with the special requirements society imposes on high-tech end products having legal weight. It must adhere to technical and technological requirements for document creation and transmission, and this adherence must itself be documented by various generally recognized means.
Again, information systems produce electronic documents, a process that involves elements such as
data (other electronic documents)
network (telecommunications) resources
One important information security-related event that has occurred in the last decade is the appearance and development of the concept of device security. The main ideas of device security include the following:2
recognition of the multiplicative protection paradigm and, as a result, equal attention to implementation of control procedures at all stages of information systems operations (the protection of the entire system is no greater than the protection of its weakest link)
“materialist” resolution of the fundamental question of information security: “What comes first—hardware or software?”
consistent rejection of software-oriented control methods as obviously unreliable (attempting to use software to monitor the correctness of other software is equivalent to attempting to resolve the unsolvable question of self-applicability—“Munchausen syndrome”) and the shifting of the most critical control procedures to the device level (Archimedes’ principle), in accordance with which “support points” must be created to carry out device-based control procedures
maximum possible separation of condition-stable (software) and condition-variable (data) elements of control operations (divide-and-conquer principle)
The need to protect information technologies has only recently been recognized. Up to now, the public has defined an electronic document as a file signed with an electronic signature. This is incorrect. Here are two illustrations—a coded message and a piece of currency. Neither has a signature or a seal, but they are documents nonetheless. Why do we accept them as documents? Only because (and this is enough) we trust the technologies by which they were produced. If the commander of a military unit receives a coded message with orders from his command from the hands of the code officer, he has every reason to accept the text he has received as a document (order). And if he finds that same
text lying on his desk without knowing how it got there, then it is time to investigate the matter. Such investigations involve methods little known in broader circles. Matters are different with regard to the currency. Few of us ever receive bills directly from the printing plant. More often, the ways in which bills come into our possession are not completely known. Our behavior is also different—when we receive bills at a local branch of the state savings bank, as a rule we count them quickly, but if we receive them as change from a trader at the market, we might examine them more carefully for evidence they may be counterfeit.
Technologies for electronic exchanges must meet certified standards, and this compliance must be monitored. The various stages of the information exchange process involve people (operators, users) and information technologies—technical (personal computers, servers) and programmatic (operating systems, preprocessor output programs). Information is created by people, then transformed into data, and then entered into automated systems in the form of electronic documents, which together with other such documents represent information resources. Computers exchange data over communications channels. During the operation of automated systems, data (electronic documents) are transformed in accord with information technologies being applied. Therefore, we may identify the following seven components of technical security:
authentication of participants involved in information exchange
protection of hardware from unauthorized access
delineation of access to documents, personal computer resources, and networks
protection of electronic documents
protection of data in communications channels
protection of information technologies
delineation of access to datastreams
In working with the last component, protection is required not only for data in communications channels but also for the channels themselves. At present it is practically impossible to build a system of any large scale on dedicated channels—doing so is expensive, ineffective, and unprofitable, and it is almost impossible to make full use of a given channel. Existing channels operate at barely 10 percent of capacity, which suggests the obvious conclusion: organize virtual private networks over existing common channels. This requires datastream tunneling; that is, data in the various virtual private networks created over common channels must be isolated from one another, and access to these data must be restricted.
Taken together, points 1, 2, 3, and 5, and in part point 7, compose the focus of information protection as it is traditionally understood. The actual focus, however, is much broader, including at least points 4 and 6, and this gap goes far toward explaining the lack of significant practical success of traditional approaches to these problems.
Having clarified this,2 we can now state a few requirements for implementing the various levels of protection. As a house is built of bricks or other structural components, an information system is likewise built from various premanufactured elements, with only a small applied component being created from scratch, as a rule (although this newly created component is the most important, as it determines the functionality of the system). It is appropriate to recall the multiplicative protection paradigm, particularly that the level of information security of a system is no higher than that of its weakest link. For us this means that when using premade components we must select them so as to ensure that the protection level for each one is no lower than that required for the system as a whole. This applies to the protection of both information technologies and electronic documents; a lack of protection for either means that efforts in other areas are wasted.
AUTHENTICATION OF PARTICIPANTS IN INFORMATION EXCHANGE
Operator identification/authentication (IA) must be performed on a device basis at each stage in the operating system loading process. IA databases must be stored in the nonvolatile memory of the information security system so that access to them via a personal computer is impossible; that is, the nonvolatile memory must be located outside the personal computer’s address space. The control software must be stored in the controller’s memory and protected against unauthorized modifications. The integrity of the control software must be ensured through the technology built into the information security system controller. Identification must be performed using an alienable information carrier.
As with operator IA, device-based procedures for remote user IA are necessary. Authentication may be handled through various means, including by electronic signature. A requirement for “intensified authentication” is becoming mandatory, that is, periodic repetition of the procedure during the work session at various time intervals short enough to prevent an ill-intentioned individual from doing significant damage if protection measures are thwarted.
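The “intensified authentication” requirement can be illustrated with a small sketch. Assuming, as one possible realization not prescribed here, that the remote user and the server share a secret key, the server issues a fresh challenge at short intervals and invalidates the session once a proof is overdue:

```python
import hashlib
import hmac
import os
import time

class Session:
    """Tracks when a remote user last proved possession of the shared key."""

    def __init__(self, key: bytes, interval: float = 60.0):
        self.key = key
        self.interval = interval              # maximum seconds between proofs
        self.last_proof = time.monotonic()
        self._pending = b""

    def challenge(self) -> bytes:
        self._pending = os.urandom(16)        # fresh nonce for each round
        return self._pending

    def verify(self, response: bytes) -> bool:
        expected = hmac.new(self.key, self._pending, hashlib.sha256).digest()
        ok = hmac.compare_digest(expected, response)
        if ok:
            self.last_proof = time.monotonic()
        return ok

    def still_valid(self) -> bool:
        """False once the re-authentication interval has elapsed."""
        return time.monotonic() - self.last_proof < self.interval

def respond(key: bytes, nonce: bytes) -> bytes:
    """Client side: answer a challenge with the shared key."""
    return hmac.new(key, nonce, hashlib.sha256).digest()
```

The interval is chosen short enough that an intruder who hijacks a session cannot do significant damage before the next proof falls due.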
PROTECTION OF HARDWARE FROM UNAUTHORIZED ACCESS
Means of protecting computers from unauthorized access may be divided into two categories: electronic locks and device modules for authorized loading. The main difference between them is the way in which integrity control is implemented. Electronic locks operate on a device basis to carry out user IA procedures, but they must resort to the use of external software to perform integrity control procedures. Device modules perform all the functions of electronic locks as well as integrity control and administration functions. As a result, these modules not only provide user IA but also handle the authorized loading of the operating system, a most important function in the construction of an isolated programming environment. Device modules handle a significantly wider range of functions than electronic locks, and they require device-based performance (not using operating system resources) of complex functions such as file system selection, reading of real data, and so forth. In addition, by integrating control functions into the hardware, device modules also offer greater reliability of results.
• Control of technical integrity of personal computers and servers. Control of the integrity of personal computer components must be carried out by the information security system controller before the operating system is loaded. All resources that might be used must be monitored, including
system BIOS (Basic Input/Output System)
interrupt vectors INT 13 and INT 40
CMOS, including floppy disks, hard disks, and CD-ROMs
The integrity of the technical components of servers must be ensured through intensified network authentication procedures. These procedures must be carried out at the point at which the computer being verified logs on to the network and again at time intervals previously determined by the security administrator. Intensified authentication must be implemented using the recommended type of random number generator device, and the performance of that device must be monitored with a system of recommended statistical tests.
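The text does not specify which tests are recommended. As an illustrative assumption, the following sketch implements the simplest well-known statistical test for randomness, the frequency (monobit) test, which might be run periodically against a sample of the generator’s output:

```python
import math

def monobit_test(sample: bytes, alpha: float = 0.01) -> bool:
    """Frequency (monobit) test: checks that 0s and 1s occur in roughly
    equal numbers in the sample; returns True if no bias is detected."""
    n = len(sample) * 8
    ones = sum(bin(byte).count("1") for byte in sample)
    s_obs = abs(2 * ones - n) / math.sqrt(n)    # normalized excess of 1s
    p_value = math.erfc(s_obs / math.sqrt(2))
    return p_value >= alpha
```

A generator whose output repeatedly fails such tests should be taken out of service.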
• Control of operating system integrity. Control of the integrity of system components and files must be carried out by the controller before the operating system is loaded, through the mechanism of real data reading. Since the electronic document exchange process may involve the use of various operating systems, the controller’s built-in software must handle the most popular file systems. The integrity of a given software package must be guaranteed by the technology of the data security system controllers, whose devices must protect the software from unauthorized modifications. A well-known (published) hash function must be used to monitor integrity, and its standard value must be stored in the nonvolatile memory of the controller, protected on a device basis from access from the computer.
• Control of applications software and data. The integrity of applications software and data may be monitored by the data security system on a device or software basis, provided that this integrity was registered on a device basis at the preceding stage. A well-known (published) hash function must be used to monitor integrity, and its standard value must be authenticated with the help of a remote technical data carrier (identifier).
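A minimal sketch of such a check, assuming SHA-256 as the well-known (published) hash function; the reference digest passed in stands in for the standard value held in the controller’s protected memory:

```python
import hashlib
import hmac

def file_digest(path: str) -> bytes:
    """Compute the published hash function (here SHA-256) over a file."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.digest()

def integrity_ok(path: str, reference: bytes) -> bool:
    """Compare against the reference digest registered at the preceding
    stage (in the real system, stored outside the computer's address space)."""
    return hmac.compare_digest(file_digest(path), reference)
```

The constant-time comparison is incidental here; what matters is that the reference value itself is beyond the reach of software running on the computer.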
LIMITATION OF ACCESS TO DOCUMENTS, COMPUTER RESOURCES, AND NETWORKS
Modern operating systems increasingly include built-in access limitation capabilities. As a rule, these capabilities utilize particular features of specific file systems and are based on attributes closely linked to one of the application program interface (API) levels of the operating system. This inevitably leads to problems, including at least the following:
• Linkage to features of specific file systems. As a rule, modern operating systems use not one but several file systems, both new and old. It usually happens that an operating system’s built-in access limitations work with a new file system but might not with an old one, as they make use of substantially different features in the new file system. This circumstance is usually not directly addressed in the system documentation, which could lead the user to become confused. Let us imagine that a computer with a new operating system is running software developed for the previous version, which is oriented toward the features of an older file system. The user rightly assumes that the established security mechanisms, certified and intended for this operating system, will perform their functions, while in reality they will be turned off. In real life, such cases may be encountered fairly often. Why rewrite an application just because you have changed operating systems? Especially when the goal is to ensure the compatibility of old file systems and link them to new operating systems!
• Linkage to the API of the operating system. As a rule, operating systems now change frequently, once every year or year and a half, and it is not impossible for them to change even more often. Some of these changes involve changes in the API—for example, the replacement of Win 9x with WinNT. Since the access limitation attributes are a reflection of the API, a move to an updated version of an operating system requires the reworking of security system add-ons, retraining of personnel, and so forth. Thus, we might posit the following general requirement: the access limitation subsystem must be overlaid on the operating system and must be independent of the file system. Of course, the set of attributes must be sufficient for the purposes of describing the security policy, and the description must be made not in operating system API terms but rather in the terms customarily used by security administrators.
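As a sketch of what “administrator terms” might look like, the following hypothetical model expresses access rules directly over subjects, resources, and rights, with no reference to any file system attribute or operating system API (all names here are illustrative assumptions):

```python
from dataclasses import dataclass, field

@dataclass
class Policy:
    """Access rules keyed by (subject, resource), independent of any
    file system or operating system API."""
    rules: dict = field(default_factory=dict)

    def grant(self, subject: str, resource: str, *rights: str) -> None:
        """Record that the subject holds the given rights on the resource."""
        self.rules.setdefault((subject, resource), set()).update(rights)

    def allowed(self, subject: str, resource: str, right: str) -> bool:
        """Check a single access request against the recorded rules."""
        return right in self.rules.get((subject, resource), set())
```

Because the policy is expressed abstractly, a change of operating system or file system requires only a new enforcement layer, not a rewrite of the rules themselves.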
ELECTRONIC DOCUMENT SECURITY
The life cycle of the electronic document occurs in three spheres of existence, located concentrically one within the other: the electronic environment of numerical processes, the analog environment of subjects and objects, and the social environment of cognitive subjects. The outermost layer is formed by the multitude of cognitive subjects of the social environment, which forms the document’s sector of effectiveness and dictates the rules of information exchange for its subject members, including requirements for interactive technology. If these rules and requirements are met, the report is deemed a document, while the information it contains is deemed by the sector to be a (juridical) fact, a formal basis for initiating, changing, or terminating specific relations between subjects in the society.
The requirements of the sector of effectiveness may be divided into two categories: semantic, which are applicable to the representation of the meaning of the information, and technological, which dictate the formation of the document. The semantic aspects are the prerogative of the social environment and therefore are not considered in this paper and are deemed to be fulfilled. Given this condition, in order for a report to be recognized as a document, the parameters of the technologies used in its creation, transformation, transmission, and storage must fall within the bounds of allowable deviations from a certain standard prescribed by the sector for document-based electronic interaction. Only in this case do we have the legal grounds for considering that the requirements are met, for example, with regard to ensuring the integrity, confidentiality, and authenticity of the document.
The traditional analog document is created once in object form—a sheet of paper bearing a surface of designs or letters. The physical parameters of the object are stable with regard to external effects, and any changes made are relatively easy to detect. Over the course of its entire life cycle, the object document is not transformed into a different object. At any moment, the analog document is concentrated at a single point in space, so opportunities for unauthorized access are limited. The selection of available traditional information technologies is narrow, so the requirement of a standard technology is satisfied implicitly and need not be stated. Electronic documents are another matter. The ease and simplicity of modifying them are rooted in the very environment in which they exist—copying and replacement operations are fundamental even in Turing machines. The electronic document is transformed many times during its life cycle, and physical indications of its distortion are difficult to find. Here, requirements for the correspondence of technologies to standards are extremely significant. Therefore, protecting the electronic exchange of information involves two classes of tasks: ensuring that the document remains equivalent to the original electronic document or standard over the course of its life cycle and ensuring that the electronic technologies used remain equivalent to the standards prescribed by the sector of effectiveness.
In the electronic environment it makes no sense to interpret information as data, sense, knowledge, or fact. To a computer, random numbers and poetry are alike—a set of binary symbols on which an order is given, a sequence of 0’s and 1’s. Any two such sets reflect the same information if the given order relation is maintained—they are isomorphic.3 Thus, a finite binary sequence can always be transformed into a number, and in the electronic environment, information is a number. A number does not change over time and space but is always fixed and static. When stored on a magnetic disk, a number is reflected by a “painting” of the disk surface with magnetic domains of various orientations. It is said that a computer’s memory stores data, understood as the fixed form of existence of electronic information: data are numbers.
The purpose of any sort of protection is to ensure the stability (fixation) of the given properties of the protected object at all points of its life cycle. The degree of protection of an object is determined by comparing the standard (the object at an initial point in space and time) with the result (the object at the moment of observation). In our case, at the point of observation (when the electronic document is received) there is only very limited contextual information about the standard (the content of the initial electronic document), although we have full information on the result (the document as observed). This means that the electronic document must contain attributes attesting to its compliance with technical and technological requirements, particularly the immutability of the information at all stages of document creation and transmission. One such attribute might be an authentication security code.2
• Protection of documents during creation. An authentication security code must be produced on a device basis during the creation of a document. Before code production begins, the isolation of the software environment must be ensured. There must be no opportunity for copying the document onto an external storage disk before the security code is produced. If a document is created by an operator, the code must indicate the operator’s identity. If a document is created by the software component of the automated system, the code must indicate which software component it was.
• Protection of documents during transmission. Protection of the document during its transmission over external (open) communications channels must be implemented through the application of certified cryptographic tools, including those involving electronic digital signatures, for every document transmitted. Another option is also possible, in which a packet of documents is signed with an electronic digital signature, and each document is verified with another analog of a handwritten signature, for example, an authentication security code.
• Protection of documents during processing, storage, and execution. During these stages, document protection is ensured with the use of authentication security codes, which are required at the start and finish of each stage. These codes must be produced on a device basis and be linked to the processing procedure (the stage of information technology). For an incoming document bearing an authentication security code (ASC) and an electronic digital signature, a new code, ASC2, is produced, and only then is the digital signature removed. Subsequently, upon entering stage n, ASCn+1 is produced and ASCn-1 is removed, with ASCn-1 removed only after ASCn+1 is put in place. Thus, at any moment, the document is protected by two codes, ASCn and ASCn+1. Authentication security codes must be produced and verified for any document placed in the operating memory of a computer in which software environment isolation has been established and maintained.
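The two-code discipline can be sketched as follows. The construction of an authentication security code is not specified in the text; as an assumption, it is modeled here as an HMAC with a per-stage key:

```python
import hashlib
import hmac

def asc(stage_key: bytes, document: bytes) -> bytes:
    """Authentication security code for one processing stage (modeled,
    as an assumption, as an HMAC keyed per technological stage)."""
    return hmac.new(stage_key, document, hashlib.sha256).digest()

def enter_stage(document: bytes, codes: dict, stage_keys: dict, n: int) -> dict:
    """On entering stage n: produce ASC[n+1] first, and only then remove
    ASC[n-1], so the document is always covered by at least two codes."""
    codes[n + 1] = asc(stage_keys[n + 1], document)
    codes.pop(n - 1, None)
    return codes
```

Adding the new code before dropping the old one guarantees there is no moment at which the document is covered by fewer than two codes.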
• Protection of documents during access from the external environment. When a document is being accessed from the external environment, its protection involves two mechanisms that have already been described above, namely, identification/authentication of remote users and limitation of access to documents, computer resources, and networks.
PROTECTION OF DATA IN COMMUNICATIONS CHANNELS
Channel coders have traditionally been used to protect data in communications channels, and there is no alternative to them. Two points must be kept in mind: (1) the coders must be certified, and (2) channels transmit not only data but also control signals.
PROTECTION OF INFORMATION TECHNOLOGIES
Electronic documents in automated systems are not only stored but also processed. A computer represents memory plus computation. When a document is processed, some data disappear and others appear, but the information remains the same. The numbers change but the information does not, as isomorphism is maintained between the sets of binary signals in the old and new formats. In principle, then, there must be some new form of existence for information in the electronic environment accompanying the process of data transformation—information cannot disappear between the start and conclusion of the process. We must therefore assume that information also exists in a dynamic form, in the form of a process.
A process is by definition dynamic, involving the changing of something over time, while information must be constant. To avoid a contradiction, a dynamic process must have some feature that is fixed in time. Such a feature exists: the description of the process is fixed, no matter at what point in space (on which computer) or at what moment in time the process is observed. In fact, the specific process of information processing in a computer is determined by a fixed algorithm, procedure, or protocol. Assuming there are two forms for representing information in the electronic environment—static, in the form of an object, and dynamic, in the form of a process—we may also assume that there are two fundamentally different classes of elements in the electronic environment. As soon as the first class is defined as numbers, the second class may logically be termed functions (transformations, representations). At the start of a function we have numbers (data), while at its end we obtain new numbers (data). At any moment in time and any point in space, the function remains a function: the function (or, in related terms, the representation, algorithm, transformation) is unchanged.
In passive form (storage), an electronic document is a fixed object in an analog environment (memory device), while in an active form, an electronic
document exists as a fixed process in an electronic environment. Accordingly, let us identify two components involved in protection: protection of data (numbers), or the electronic document itself as a physical object, and protection of processes (functions), representing the active form of existence of the electronic document. Information (data) is defined as a set with a specified order relation. Protection of functions (algorithms) means protection of the computing environment, which must remain invariant with regard to the information or data processed in it. Electronic technology also represents an ordered set (of operations or processes) and therefore can be formally recognized as information technology. The internal unity of the components of protection is thus revealed: it is protection of information as data and protection of information technology. Thus, the status of the document includes not only the identity of the document itself (its compliance with a standard) but also the compliance of the information technologies used with standard requirements.
Despite the obvious similarity, mechanisms for the protection of the electronic document as an object (number, data) and as a process (function, computing environment) are radically different. In contrast to the situation with protection of an electronic document, with protection of information technology, the characteristics of the required technology standard are reliably known, but there is limited information about the compliance with these requirements by the technology actually used, that is, the result. The electronic document itself, or more accurately its attributes, is the only object that could carry information about the actual technology (such as sequence of operations). As before, one such attribute would be the authentication security code. The equivalence of technologies could be established more accurately with a greater number of functional operations linked through this security code. The mechanisms would not differ from those used in protecting electronic documents. Furthermore, the presence of a particular authentication security code could be considered to signify the presence of a corresponding operation in the technological process, and the value of the code could also indicate the integrity of data at a given stage in the technological process.
LIMITATION OF ACCESS TO DATASTREAMS
As a rule, routers with a virtual private network builder function are used to limit access to datastreams. This function may be carried out reliably only with the help of cryptographic tools, and in such situations special attention must be paid to the key system and the reliability of key storage. Naturally, requirements for access policy in the differentiation of datastreams are completely different from those in the limitation of access to files and catalogs. For datastreams, only the simplest mechanism is possible—access is either permitted or forbidden.
Complying with the requirements discussed above ensures a sufficient level of protection for electronic documents as the most important type of reports processed in information systems.