Availability Problems Experienced in Information Technology Systems
Numerous well-publicized failures of major systems show that current technology and operating practices are not meeting expectations. For example, the 3-year-old central computer system that monitors the position of trains in the Washington, D.C., Metrorail system reportedly crashed 50 times in the first 15 months after its deployment. In September 1999, it failed for unknown reasons, delaying morning startup by 45 minutes and causing significant delays in the rush hour. A number of high-profile Internet companies have also experienced problems with World Wide Web sites for electronic commerce, many stemming from problems in upgrading systems and growing traffic volume. Charles Schwab's online brokerage service, for example, experienced more than a dozen outages in 1999, during which users could not access real-time quotes, check account information and margin balances, or execute trades. Online retailer Beyond.com experienced an extended outage in October 1999 as a result of complications stemming from a scheduled upgrade; in 1998, problems with unscheduled maintenance caused Amazon.com to take its site offline for several hours; eBay and E*Trade Securities are experiencing intermittent outages as the volume of visitors to their Web sites increases. Indeed, a survey conducted in late 1999 by the consulting company Deloitte & Touche found that the primary business concerns of online brokerage firms were system outages and an inability to accommodate growing numbers of online investors. Performance and reliability were also cited as significant concerns.
SOURCES: Junnarkar (1999), Layton (1999), Luenig (1999), and Meehan (2000).
scale systems themselves reliable is more difficult. The telephone system, which is based heavily on software, may be the closest to reaching this goal, but its robustness has been achieved only at considerable cost and with delays in development.28 The race to develop new critical applications, driven by the rapid pace of innovation in Internet applications and services, has resulted in inadequate, even dangerously poor, robustness. Often prototypes or simplistic implementations become so popular so quickly that expectations far exceed the reliability achievable with the initial design. Moreover, even when systems are designed carefully to address reliability concerns, their complexity makes it doubly difficult to achieve reliability and robustness goals.
The spread of IT bears witness to the fact that, overall, hardware reliability has advanced significantly but software reliability has lagged