Hints of a Different Way: Case Studies in Practice-Based Evidence
The Institute of Medicine Roundtable on Evidence-Based Medicine seeks “the development of a learning healthcare system that is designed to generate and apply the best evidence for the collaborative healthcare choices of each patient and provider; to drive the process of discovery as a natural outgrowth of patient care; and to ensure innovation, quality, safety, and value in health care” (Roundtable on Evidence-Based Medicine 2006). Generating evidence by driving the process of discovery as a natural outgrowth and product of care is the foundational principle of the learning healthcare system. Some have termed this “practice-based evidence” (Greene and Geiger 2006). Practice-based evidence focuses on the needs of decision makers and on narrowing the research-practice divide by identifying the questions most relevant to clinical practice and conducting effectiveness research in typical clinical practice environments and unselected populations (Clancy 2006 [July 20-21]).
This chapter highlights several examples of the use of healthcare experience as a practical means of both generating and successfully applying evidence for health care. In the first paper, Peter B. Bach discusses how the Coverage with Evidence Development policy at the Centers for Medicare and Medicaid Services (CMS) has aided the development of important evidence on effectiveness for a range of interventions, including lung volume reduction surgery (LVRS), PET (positron emission tomography) scanning for oncology, and implantable cardioverter defibrillators. By identifying information needed for improved understanding of intervention risks and benefits and designing appropriate trials or mechanisms to accumulate such evidence, CMS has accelerated access to health innovations. Moreover, generation of needed evidence from clinical practice is a means to better inform some of the many difficult clinical decisions inherent to medical practice. In the specific case of LVRS, the work of CMS identified an unproven approach that could have had an adverse impact on many patients before enough evidence was collected. By taking the lead, CMS helped develop timely information useful to other payers, clinicians, and patients.
Many risks or benefits of health technologies are not evident when initially introduced into the marketplace, and Jed Weissberg demonstrates the value of collecting, linking, and utilizing data for pharmacovigilance purposes in his paper on Kaiser Permanente’s use of accumulated data for a post-market evaluation of cyclooxygenase-2 (COX-2) inhibitors. In this case, analysis was hypothesis driven and Weissberg notes that substantial work is needed to achieve a system in which such insights are generated customarily as a by-product of care. For such a transformation, we must improve our ability to collect and link data but also make the organizational and priority changes necessary to create an environment that values “learning”—a system that understands and values data and has the resources to act upon such data for the betterment of care.
Stephen Soumerai discusses the potential for quasi-experimental study designs to inform the entire process of care. His examples highlight well-designed studies that have been used to analyze health outcomes and demonstrate unintended consequences of policy decisions. He notes that widespread misperception of observational trials belies their strength in generating important information for decision making. Sean Tunis expands this argument by illustrating how practical clinical trials (PCTs) could serve as an effective means to evaluate issues not amenable to analysis by randomized controlled trials (RCTs), using the example of a PCT designed to evaluate the use of PET for diagnosing Alzheimer’s disease. Alan H. Morris’ work with computerized protocols (termed adequately explicit methods) demonstrates the considerable potential for such protocols to enhance a learning healthcare system. In his example, protocols for controlling blood glucose with IV insulin (eProtocol-insulin) provide a replicable and exportable experimental method that enables large-scale complex clinical studies at the holistic clinical investigation scale while reducing bias and contributing to generalizability of trial results. These protocols were also integrated into clinical care electronic health records (EHRs), demonstrating their utility for improving the translation of research methods into clinical practice. Additionally, they could represent a new way of developing and distributing knowledge both by formalizing experiential learning and by enhancing education for clinicians and clinical researchers.
COVERAGE WITH EVIDENCE DEVELOPMENT
Peter B. Bach, M.D., M.A.P.P.
Centers for Medicare and Medicaid Services
Coverage with Evidence Development (CED) is a form of National Coverage Decision (NCD) implemented by CMS that provides an opportunity to develop evidence on the effectiveness of items or services that have great promise but where there are potentially important gaps between efficacy and effectiveness, the potential for harm without benefit in sub-populations, or an opportunity to greatly enrich knowledge relevant to everyday clinical decision making. Most Medicare coverage determinations are made at a local level through carriers and fiscal intermediaries under contract with CMS. However, a few times each year, an NCD is made at the central level that dictates coverage policy for the entire country.
Whether the coverage determination is made locally or through an NCD, these determinations are based on historical data regarding the risks and benefits of items or services. Once coverage decisions are made, Medicare very rarely evaluates utilization, checks whether beneficiaries receiving the services are similar to those studied, or assesses whether outcomes of the covered services match those in the reports used to make the determination. At the extreme, there are many instances in which a Medicare coverage determination is based on a brief trial of a handful of volunteer research subjects, and the service is then provided to hundreds of thousands of patients for a far greater duration, where the patients are also more elderly and have a greater degree of comorbid illness than any of the patients included in the original study. This lack of information collection about real-world utilization and outcomes, the potential for differences between effectiveness and efficacy, and different trade-offs between benefits and risks is viewed by many as an important “forgone opportunity” in health care.
CED aims to integrate further evidence development into service delivery. Technically, CED is one form of “coverage with restrictions,” where the restrictions include limiting coverage to specific providers or facilities (e.g., the limitation on which facilities can perform organ transplants), limiting coverage to particular patients, or in the case of CED, limiting coverage to contexts in which additional data are collected. From an implementation standpoint, CED requires that, when care is delivered, data collection occurs. Although it is not a requirement of CED per se, the expectation is that the additional data generated will lead to new knowledge that will be integrated both into the CMS decision-making process to inform coverage reconsideration and into the knowledge base available for clinical decision making. Two case studies illustrate how CED can be used to directly or indirectly develop evidence that augments healthcare decision making and CMS coverage policy.
Specific Examples of CED
The National Emphysema Treatment Trial (NETT), funded by CMS, was a multicenter clinical trial designed to determine the role, safety, and effectiveness of bilateral lung volume reduction surgery (LVRS) in the treatment of emphysema. The study had, as a secondary objective, to develop criteria for identifying patients who are likely to benefit from the procedure. While conducted prior to the coinage of the term “coverage with evidence development,” the trial was implemented through a CMS NCD that eliminated coverage of LVRS outside of the trial but supported coverage for the surgery and routine clinical costs for Medicare beneficiaries enrolled in the trial. NETT demonstrates how coverage decisions can be leveraged to directly drive the development of evidence necessary for informed decision making by payers, physicians, and patients. The trial clarified issues of risk and benefit associated with the procedure and defined characteristics to help identify patients who were likely to benefit—information that was incorporated into the revised CMS NCD on lung volume reduction surgery and had significant impact on guidance offered for treatment of emphysema.
Emphysema is a major cause of death and disability in the United States. This chronic lung condition leads to the progressive destruction of the fine architecture of the lung, which reduces its capacity to expand and collapse normally and leaves patients increasingly unable to breathe. The presence of poorly functioning portions of the lung is also thought to impair the capacity of healthy lung tissue to function. For patients with advanced emphysema, LVRS was hypothesized to confer benefit by removing these poorly functioning lung portions (up to 25-30 percent of the lung) and reducing lung size, thus pulling airways open and allowing breathing muscles to return to normal positioning, increasing the room available for healthy lung function, and improving the ability of patients to breathe. Prior to the trial, evidence for LVRS consisted of several case series that noted high up-front mortality and morbidity associated with the surgery and anecdotes of sizable benefit to some patients. At the time of the NCD, the procedure was a high-cost item, with the operation and months of rehabilitation costing more than $50,000 on average. Many health economists predicted that utilization would rise rapidly, with tens of thousands of patients eligible for the procedure and a cost to Medicare estimated at as much as $15 billion per year (Kolata 2006).
Because of the surgery’s risks and the absence of clear evidence on its efficacy, patient selection criteria, and level of benefit, CMS initiated an interagency project with the National Heart, Lung, and Blood Institute (NHLBI) and the Agency for Healthcare Research and Quality (AHRQ). AHRQ’s Center for Health Care Technology and NHLBI carried out independent assessments of LVRS; they concluded that the current data on the risks and benefits were too inconclusive to justify unrestricted Medicare reimbursement for the surgery and suggested a trial to assess the effectiveness of the surgery. NHLBI conducted a scientific study of LVRS to evaluate the safety and efficacy of the current best available medical treatment alone and in conjunction with LVRS by excision. CMS funded the routine and interventional costs. The trial was conducted with the expectation that it would provide answers to important clinical questions about the benefits and risks of the surgery compared with good medical therapy, including the duration of any benefits, and clarification of which subgroups experienced benefit. Some initial barriers included resistance by the public, which considered it unethical to pay for some patients but not others to receive treatment.
The trial evaluated four subgroups prespecified by the case series studies and physiological hypotheses. One group (homogeneous lung, severe obstruction, very low diffusing capacity) was dropped early due to severe adverse outcomes, including a high up-front mortality. The other three subgroups experienced some level of benefit, and patients were followed for two years. On average, patients with severe emphysema who underwent LVRS with medical therapy were more likely to function better and did not face an increased risk of death compared to those who received only medical therapy. However, results for individual patients varied widely. The study concluded that overall, LVRS increased the chance of improved exercise capacity but did not confer a survival advantage over medical therapy. The overall mortality was the same for both groups, but the risk of up-front mortality within the first three months was significantly increased for those receiving the surgery (Ries et al. 2005). In addition to identifying patients who were poor candidates for the procedure, the trial identified two characteristics that could be used to predict whether an individual participant would benefit from LVRS, allowing clinicians to better evaluate risks and benefits for individual patients. CMS responded by covering the procedure for all three subgroups with any demonstrated benefit.
In this case, a well-designed and implemented CED NCD led to the creation of data that clarified the CMS coverage decision, refined questions in need of future research, and provided the types of evidence important to guide treatment evaluation by clinicians (subgroups of patients who might benefit or be at increased risk from LVRS) and patients (symptoms and quality-of-life data not previously available). Such evidence development led to informal and formative impressions among patients and providers that caused them to reconsider the intervention’s value. As a result, from January 2004 to September 2005, only 458 Medicare beneficiaries received LVRS at a total cost to the government of less than $10.5 million (Kolata 2006).
Alternatively, CED can indirectly provide a basis for evidence development. PET is a diagnostic imaging procedure that has the ability to differentiate cancer from normal tissue in some patients, and thus can help in diagnosing and staging cancer and monitoring a patient’s response to treatment. While the available evidence indicated that PET can provide more reliable guidance than existing imaging methods on whether the patient’s cancer has spread, more data were required to help physicians and patients make better-informed decisions about the effective use of PET scanning.
CMS implemented an NCD to cover the costs of PET scanning for diagnosis, staging, re-staging, and monitoring of cancer patients, with the requirement that additional clinical data be collected into a registry. This type of CED allowed CMS to ensure patients would receive treatment benefit and build upon emerging evidence that PET was safe and effective by creating a platform from which other questions of clinical interest could be addressed. The NCD articulated questions that could lead to a reevaluation of the NCD, such as whether and in what specific instances PET scanning altered treatment decisions or other aspects of management of cancer patients. CMS required that information about each PET scan be submitted to a registry. The registry then conducted research by following up with physicians to ask why a PET scan was ordered and whether the results of the PET scan altered disease outcomes. Participating patients and physicians were given the opportunity to give consent for their data to be used for research purposes, and other HIPAA (Health Insurance Portability and Accountability Act) issues were avoided by restricting research to the registry. While such research questions are simple and not likely to be independently pursued by agencies engaged in broader investigations, such as the National Institutes of Health (NIH), they are typical of the kinds of evidence often needed to ensure the delivery of appropriate and effective health care.
Overarching Issues Affecting CED
Several overarching issues will affect the long-term viability of CED as a robust policy that spurs the development of a learning healthcare system. Of particular interest are the statutory authorities on which CED is based, the implications for patients who are eligible for services covered under CED, the role that the private-public interface must play for the learning to take place, and the issue of capacity in the healthcare system more broadly for such data collection, analysis, interpretation, and dissemination. Each of these issues has been extensively considered in the development of existing CED determinations, so moving forward the implementation of further CED determinations should be somewhat more straightforward.
In describing CED, CMS released a draft guidance followed by a final guidance that articulated the principles underpinning CED and the statutory authorities on which it is based. Both are available on the CMS coverage web site (www.cms.hhs.gov/coverage). In truth, there are two separate authorities, depending on the type of CED determination. When participation in a clinical trial is required as part of coverage, as in the NETT, the authority being used by CMS is based on section 1862(a)(1)(E) of the Social Security Act. CMS terms this “Coverage with Clinical Study Participation (CSP).” This section of the act allows CMS to provide coverage for items or services in the setting of a clinical research trial, and the use of this authority clarifies further that the item or service is not “reasonable and necessary” under section 1862(a)(1)(A) of the Social Security Act, the authority under which virtually all routine services are covered. The CED guidance further articulates that decisions such as NETT, in which coverage is provided only within the context of a clinical study, are meant as a bridge toward a final coverage determination regarding the service being “reasonable and necessary” under section 1862(a)(1)(A). Coverage such as that provided for the PET registry is based on section 1862(a)(1)(A) of the Social Security Act because CMS has made the determination that the service is reasonable and necessary for the group of patients and indications that are covered, but that additional data are required to ensure that the correct service is being provided to the correct patient with the correct indications. As such, the registry is being used to collect additional data elements needed to better clarify the details of the service, patient, and indication. CMS terms this type of CED “Coverage with Appropriateness Determination” (CAD).
Implications for Patients
Unlike an NCD that provides coverage without restrictions, all NCDs that include restrictions affect how or where or which beneficiaries can receive services. As in coverage for organ transplants being provided only in certain hospitals, CED requires that patients receive services in locales where evidence can be collected. This limitation may be quite significant in terms of its effect on access or not significant at all. For instance, the NETT was conducted at only a handful of centers throughout the United States, so Medicare beneficiaries who wanted to receive the service had to travel to one of these centers and be evaluated. The coverage of fluorodeoxyglucose (FDG) PET for cancer is also limited to those PET facilities that have put in place a registry; but in this case, virtually all facilities in the country have been able to do so relatively easily. Not only can CED in some cases limit geographic access, but when CED requires clinical research participation, patients may have to undergo randomization in order to have a chance to receive the service. In the case of the NETT, some patients were randomized to best medical care instead of the surgery. In general, it is not unethical to offer services that are unproven only in the context of a clinical trial, when the scientific community is in equipoise regarding the risks and benefits of the service versus usual care and the data are insufficient to support a determination that the service is reasonable and necessary. Sometimes, patients may also be asked to provide “informed consent” to participate in research as part of CED, as in the NETT. However, patients have not been required to allow their data to be used for research when receiving a service such as the FDG-PET scan under CED. Rather, patients have been able to elect to have their data used for research, or not, but their consent has not been required for the service to be covered. (Early reports suggest that about 95 percent of Medicare beneficiaries are consenting to have their data used for research.) Theoretically, under some scenarios in which registries are being used to simply gather supplementary medical information, a requirement for informed consent could be waived due to the minimal risk posed and the impracticability of obtaining it.
The Private-Public Interaction Necessitated by CED
Because CED leads only to the requirement for data collection, but not to the requirement for other steps needed for evidence development, such as data analysis, scientific hypothesis testing, or publication and dissemination, CED requires a follow-on process to achieve its broader policy goals. To date, these goals have been achieved through partnerships with other federal agencies, providers, professional societies, academic researchers, and manufacturers. For instance, in the NETT, as noted above, the scientific design of data collection and the analysis and publication of study results were orchestrated through NHLBI, which engaged and funded investigators at multiple participating institutions. The involvement of the NHLBI and investigators from around the country ensured that CED would lead to a better understanding of the clinical role of LVRS in the Medicare population. In the case of the FDG-PET registry, the registry was required by CED but was set up through a collaboration involving researchers at several academic institutions and professional societies to form the National Oncologic PET Registry (NOPR). These researchers constructed a research study design around the CED requirement, such that there is a high probability that both clinicians and CMS will have a far better understanding of the role of FDG-PET scans in the management of Medicare patients with cancer.
Recently, CMS issued another CED decision covering implantable cardiac defibrillators (ICDs), in which all patients in Medicare receiving ICDs for primary prevention are required to submit clinical data to an ICD registry. The baseline registry, which captures patient and disease characteristics for the purpose of gauging appropriateness (i.e., CAD), forms the platform for a 100 percent sample of patients receiving this device for this indication in Medicare. The entity running the registry has since engaged other private payers, cardiologists, researchers, and device manufacturers so that a follow-on data collection can be put in place to capture the frequency of appropriate ICD “firings” (where the device restores the patient’s heart to an appropriate rhythm). In other words, because CMS requires only the core data elements to be submitted, evidence development is driven only indirectly by CED. However, the establishment of the registry mechanism and baseline data creates the framework for a powerful and important tool that, if utilized, provides the opportunity to conduct and support the kind of research necessary for a learning approach to health care.
The NETT has been completed, and as previously noted, the results of the trial substantially altered clinical practice. The FDG-PET registry and the ICD registry are both still ongoing. These are two more examples of how needed clinical evidence could be gathered through CED to ensure that the best available information about utilization, effectiveness, and adverse events is made available to clinicians and policy makers. It is easy to imagine that for these two decisions, it will not be difficult to find qualified researchers to analyze the data or journals interested in publishing the findings. However, these few CED decisions are just a model for what could theoretically become a far more common process in coverage, not only at CMS but more broadly. As the healthcare system moves toward increasing standardization of medical information and toward adoption of EHRs more extensively, better clinical detail should be readily available to satisfy CAD requirements, and longitudinal data should be readily accessible to address study questions. At that point, the current scientific infrastructure, the number of qualified researchers, and the appetite of peer-reviewed journals for such data analyses may constitute obstacles to a learning healthcare system. Aware that these system-level limitations could arise were CED to be implemented more broadly, CMS has cautiously applied the policy in settings where the infrastructure and science were in place or could quickly be put into place. Going forward, CMS will likely make relatively few CED determinations, judiciously choosing those areas of medical care in which routine data collection could enhance the data on which coverage determinations are made and improve the quality of clinical care.
USE OF LARGE SYSTEM DATABASES
Jed Weissberg, M.D.
Kaiser Permanente Medical Care Program
Integrated delivery systems and health maintenance organizations (HMOs) have a long history of epidemiologic and health services research utilizing linked, longitudinal databases (Graham et al. 2005; East et al. 1999; Friedman et al. 1971; Selby 1997; Platt et al. 2001; Vogt et al. 2004). Research networks currently supported by the government are examining healthcare interventions in diverse populations, representative of the U.S. citizenry. Hypothesis-driven research utilizing existing clinical and administrative databases in large healthcare systems is capable of answering a variety of questions not answered when drugs, devices, and techniques come to market. The following case study illustrates the value of collecting, linking, and utilizing data for pharmacovigilance purposes, outlines key elements necessary to encourage similar efforts, and hints at changes that might develop the potential to discover such insights as a natural outcome of care within a learning healthcare system.
A project using a nested, case-control design to look at the cardiovascular effects of the COX-2 inhibitor, rofecoxib, in a large HMO population within Kaiser Permanente (KP) (Graham et al. 2005) demonstrates the potential value of pharmacoepidemiological research and the opportunities offered with the advent of much greater penetration of full EHRs to rapidly increase knowledge about interventions and delivery system design. Much can be learned from this case study on what it will take to move to a learning system capable of utilizing data, so that valid conclusions, strong enough on which to base action, can be identified routinely. While the potential for such a system exists, many barriers including technical data issues, privacy concerns, analytic techniques, cost, and attention of managers and leaders will need to be overcome.
Nonsteroidal anti-inflammatory drugs (NSAIDs) are widely used to treat chronic pain, but this treatment is often accompanied by upper gastrointestinal toxicity leading to admission to the hospital for ulcer complications in around 1 percent of users annually (Graham et al. 2005). This is due to NSAID inhibition of both isoforms of COX: COX-1, which is associated with gastroprotection, and COX-2, which is induced at sites of inflammation. The first COX-2 selective inhibitors, rofecoxib and celecoxib, were thus developed with the hope of improving gastric safety. In the five years from the approval and launch of rofecoxib to its withdrawal from the market, there were signs of possible cardiovascular risk associated with rofecoxib use.
Using Kaiser Permanente data, Graham et al. examined the potential adverse cardiovascular effects of “coxibs.” The nested, case-control study design was enabled by the availability of a broad set of data on Kaiser Permanente members, as well as the ability to match data from impacted and non-impacted members. As a national integrated managed care organization providing comprehensive health care to more than 6.4 million residents in the State of California, Kaiser Permanente maintains computer files of eligibility for care, outpatient visits, admissions, medical procedures, emergency room visits, laboratory testing, outpatient drug prescriptions, and mortality status for all its members. While the availability of prescription and dispensing data as well as longitudinal patient data (demographics, lab, pathology, radiology, diagnosis, and procedures) was essential to conduct such a study, several other elements related to organizational culture precipitated and enabled action.
The organization and culture of KP created an environment that can be described as the “prepared mind” (Bull et al. 2002). The interest of clinicians and pharmacy managers in the efficacy, safety, and affordability of the entire class of COX-2 drugs had resulted in the relatively low market share of COX-2 drugs within KP NSAID use (4 percent vs. 35 percent in the community, Figure 1-1). This 4 percent of patients was selected based on a risk score developed in collaboration with researchers to identify appropriate patients for COX-2 therapy (Trontell 2004). Of additional importance to the investigation was the presence of a small, internally funded group within KP, the Pharmacy Outcomes Research Group (PORG), with access to KP prescription information and training in epidemiology. These elements were brought together when a clinician expressed concern about the cardiovascular risk associated with COX-2 drugs and suggested that KP could learn more based on its own experiences. A small grant from the Food and Drug Administration (FDA), combined with the operating budget of the PORG, enabled the authors to design and execute a case-control study. The study concluded that rofecoxib usage at a dose greater than 25 mg per day increased the risk of acute myocardial infarction (AMI) and sudden cardiac death (SCD). Additional insights of the study pointed to the differences among other NSAIDs and the inability to assume cardiovascular safety for other COX-2 drugs. The conduct of this study contributed to the FDA’s scrutiny of rofecoxib, which resulted in the manufacturer’s decision to withdraw the drug from the marketplace. In addition, the initial release of the study abstract stimulated similar analyses that clarified the clinical risks associated with this drug class and illustrated the gap between marketing promises and real-world outcomes. Similar observational studies conducted elsewhere, some including propensity scoring, have confirmed these results. A meta-analysis of related observational studies was also conducted and offers a promising method for strengthening the credibility of observational studies.
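The core quantity in a case-control analysis of this kind is the odds ratio comparing exposure among cases and controls. The following is a minimal sketch, assuming an unmatched 2x2 table and a Woolf (log-based) confidence interval; a matched, nested design such as the one described above would typically call for a conditional analysis instead. The function name and the counts in the usage note are hypothetical and are not taken from the KP study.

```python
import math


def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, z=1.96):
    """Crude odds ratio with a Woolf (log-based) confidence interval.

    Simplifying assumption: the table is treated as unmatched, so this
    is an illustration of the idea rather than a matched analysis.
    """
    cells = (exposed_cases, unexposed_cases,
             exposed_controls, unexposed_controls)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive")

    # Odds of exposure among cases divided by odds among controls.
    odds_ratio = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls)

    # Standard error of log(OR) is the root of the summed reciprocals.
    se_log_or = math.sqrt(sum(1.0 / c for c in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

With hypothetical counts of 20 exposed and 80 unexposed cases against 10 exposed and 90 unexposed controls, `odds_ratio_ci(20, 80, 10, 90)` yields an odds ratio of 2.25; whether the interval excludes 1 is what indicates a signal worth acting on.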
As we implement EHRs widely throughout the healthcare system, we will have unprecedented opportunities for data capture. Well-designed functionalities will allow generation of data, whether coded or through analysis of free text, as a by-product of the usual documentation of care. The benefits are obvious: detection of rare events or other insights that do not readily emerge from pre-marketing studies with small sample sizes, comparisons of practice to gain insights into what factors drive variation in outcomes, the creation of tools to track the dissemination of new technologies and health impacts, real-time feedback systems to improve healthcare system design, safety, and management, and the collection of data demonstrating compliance in process, which will provide the opportunity to demonstrate proof of the value of adherence to evidence-based guidelines. All these are critical elements as we move to a system of evidence-based medicine and management.
The accumulation and availability of data, however, is simply a start, and much work will need to be done to resolve methodological issues to ensure that the data are comprehensive and of high quality. For example, current managed care databases do not include over-the-counter medications and have incomplete information on potential confounders. The experience of Kaiser Permanente in establishing such databases indicates that the technical challenges are manifold. Within KP, epidemiologists and clinical researchers have indicated that missing data, particularly in registries, complicate analysis, and relevant data often are not collected. In KP’s migration to a common EHR platform (Garrido et al. 2005) we discovered more than 12 different ways to code for gender, so standards will have to be set on even the most basic level to encourage the use of consistent data definitions. Clinicians will need to adhere to certain charting conventions so that relevant data can be found, because many EHRs allow storage of the same coded or free text data elements in multiple locations. Various studies have documented that while EHR data can be richer than routine claims data, even EHR data can misclassify patients when judged against the gold standard of combining such data with careful, expert review of free text medical documentation (Persell et al. 2006).
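The problem of inconsistent gender coding can be illustrated with a minimal sketch. The local codes, the target vocabulary, and the function names below are hypothetical and are not KP's actual values. The only point is that each site-specific code is mapped explicitly, and unrecognized codes are set aside for review rather than guessed.

```python
# Hypothetical local codes from different source systems, mapped to one
# illustrative target vocabulary. None of these values come from KP.
GENDER_MAP = {
    "m": "male", "male": "male", "1": "male",
    "f": "female", "female": "female", "2": "female",
    "u": "unknown", "unk": "unknown", "": "unknown",
}


def normalize_gender(raw):
    """Return a standard value, or None when the code is not recognized."""
    if raw is None:
        return "unknown"
    key = str(raw).strip().lower()
    return GENDER_MAP.get(key)


def normalize_records(records):
    """Split records into normalized rows and rows needing manual review."""
    clean, review = [], []
    for rec in records:
        value = normalize_gender(rec.get("gender"))
        if value is None:
            review.append(rec)  # unknown code: do not guess
        else:
            clean.append({**rec, "gender": value})
    return clean, review
```

Keeping the mapping in a single table makes the data definition itself reviewable, which is the kind of basic standard the paragraph above calls for.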
Other ongoing challenges will be how to define cases and “normal ranges” for things such as viral load or different laboratory assay systems. Standardized nomenclature and messaging (e.g., SNOMED, LOINC, HL7) are vital. Additionally, unlike in clinical trials, the timing of data collection for registries and in clinical practice varies. Therefore, as we shift to collection of data as part of patient care, nonstandard time points will be a variation that will increasingly compound the issue of missing data. Quality-of-life and functional surveys are essential to evaluating outcomes of care. Routine administration via multiple modalities, storage, and association with clinical events will require standardized input from patients, which still poses a challenge for current EHRs.
Finally, it will be necessary to ensure that the data are captured and routinely available in a timely fashion. In this respect, HIPAA is an issue, as is the proprietary nature of data that help maintain business advantage. For example, at KP, our use of various pharmaceuticals and our ability to move market share are critical in maintaining affordability for our membership. These are not data that we share lightly. We also use registry data from our total joint implant experience to drive purchasing decisions from implant vendors. Timeliness is another concern. Even something as basic as identifying when a patient dies is not straightforward. Via Social Security and Medicare death tapes, with reasonable accuracy we can determine deaths within four months of occurrence for our Medicare beneficiaries. However, death of commercial members may not appear in state death files reliably for several years.
While standardization will make data amenable to comprehensive data analysis, an even greater challenge will be to create a learning system in which meaningful insights emerge organically. Currently, data mining is the source of hundreds of associations, but analysis is most efficiently done
when a particular hypothesis is pursued. Non-hypothesis-driven analysis will likely yield far more false positive associations than true positives, and each association might require further study. While improved methods or search capacity may partially alleviate this problem, there will need to be thought regarding how to prioritize which findings merit further study. Moreover, these analyses will take time, interest, and money; so as we think about expanding our capacity to conduct such research, we also need to ensure that we develop the funding resources and expertise necessary to pursue these associations. While federal, pharmaceutical, and health provider funding exists for such studies, it is often driven by regulatory requirements, business needs, or the specific interests of researchers and will not be adequate for exploratory approaches to data mining.
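The concern about false positive associations can be made concrete with a short sketch. It computes the number of chance findings expected when many null associations are screened at a fixed significance level, and applies the Benjamini-Hochberg step-up rule as one widely used way to limit the false discovery rate. The function names and thresholds are illustrative assumptions. The source does not prescribe any particular correction method.

```python
def expected_false_positives(n_true_null, alpha=0.05):
    """Expected count of truly null associations flagged at level alpha."""
    return n_true_null * alpha


def benjamini_hochberg(p_values, q=0.05):
    """Return indices of hypotheses rejected by the Benjamini-Hochberg rule.

    Sort the p-values, find the largest rank k with p_(k) <= k * q / m,
    and reject the k hypotheses with the smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            cutoff_rank = rank  # keep the largest rank that passes
    return sorted(order[:cutoff_rank])
```

For instance, screening 1,000 associations that are all truly null at a level of 0.05 would be expected to flag about 50 by chance alone, each of which might then demand follow-up study.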
Reaping the benefits of large, linked systems of EHRs will require more work on interoperability, data definitions, and statistical techniques. Making health information technology (HIT) truly interoperable will require testing, utilization of standardized nomenclatures, and vendor cooperation. Users of EHRs will have to learn to document consistently in order to best aggregate data among sites of care. Also, we probably will have to develop more robust natural language processing algorithms in order to glean important medical information from noncoded sections of the EHR. However, as this case study reveals, data are only the means to an end. Data collection must be embedded within a healthcare system, like Kaiser’s, that can serve as a prepared mind—an environment in which managers and clinicians are trained to discern patterns of interest and work within a system with the inclination, resources, and capacity to act upon such findings (Garrido et al. 2005). We already have the opportunity to build upon the many analyses being done for the varied purposes of quality improvement (QI), quality assurance (QA), utilization studies, patient safety, and formal research; however, only prepared minds can see patterns and data of interest beyond their own inquiries, and only a system that understands and values data will act on it for the betterment of patient care.
QUASI-EXPERIMENTAL DESIGNS FOR POLICY ASSESSMENT
Stephen Soumerai, Sc.D.
Harvard Medical School and Harvard Pilgrim Health Care
Although randomized controlled trials produce the most valid estimates of the outcomes of health services, strong quasi-experimental designs (e.g., interrupted time series) are rigorous and feasible alternative methods, especially for evaluating sudden changes in health policies occurring in large populations (Cook and Campbell 1979). These methods are currently underutilized but have the potential to inform and impact policy decision
making (Wagner et al. 2002). These case examples illustrate the application of such designs to the quality and outcomes of care and demonstrate how these studies influence state and national health policies, including the Medicare drug benefit. These case studies will focus on evaluation of the unintended impacts of statewide pharmaceutical cost containment policies, and how quasi-experimental design can contribute to the understanding and modification of health policy. Since randomization of health policies is almost never feasible (Newhouse et al. 1981), quasi-experimental designs represent the strongest methods for evaluating policy effects.
We often speak of randomized controlled trials as the gold standard in research design to generate evidence and of other trial designs as alternatives to RCTs. These many alternative designs, however, should not be grouped together as “observational studies” because they provide vastly different ranges of validity of study conclusions. For example, we have previously reported that the weakest nonexperimental designs, such as uncontrolled pre-post designs or post-only cross-sectional designs frequently produce biased estimates of policy impacts (Soumerai et al. 1993; Soumerai 2004). However, there are several strong quasi-experimental designs that have been effectively used to evaluate health policies and thereby affect policy decision making. These methods have been used to analyze health services, but here, we will focus on the evaluation of health policy (Park 2005). Two strong quasi-experimental designs are the pre-post with non-equivalent control group design that observes outcomes before and after an intervention in a study and comparison group and the interrupted time series design with and without comparison series. Such designs, when carefully implemented, have the potential to control for many threats to validity, such as secular trends. Although these designs have been used for many decades, they are not often applied, in part because they are not emphasized in medical and social science training. Because these are natural experiments that use existing data and can be conducted in less time and expense than many RCTs, they have great potential for contributing to the evidence base. The following case studies will focus on the use of one of the strongest quasi-experimental designs, the interrupted time series design.
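As a minimal sketch of the pre-post with non-equivalent control group design, the calculation below subtracts the change seen in the comparison group from the change seen in the study group. This difference-in-differences estimate is one common way to analyze such a design, not necessarily the method used in the studies cited here, and the function names are assumptions.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def difference_in_differences(study_pre, study_post, comp_pre, comp_post):
    """Change in the study group minus change in the comparison group.

    The comparison group's change stands in for the secular trend that
    the study group would have experienced without the intervention.
    """
    study_change = mean(study_post) - mean(study_pre)
    comparison_change = mean(comp_post) - mean(comp_pre)
    return study_change - comparison_change
```

The estimate is only as credible as the assumption that the two groups would have followed parallel trends without the intervention, which is why the comparison group has to be chosen carefully.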
Health policy interventions can be analyzed with interrupted time series designs using segmented linear regression. The interrupted time series (ITS) method assumes that the counterfactual experience of patients (the trend had the policy not been implemented) is reflected by the extrapolation of the pre-policy trend. Changes in level or slope after the intervention, as well as the immediate magnitude of change following implementation, give information on the effect of the intervention, measured as a discontinuity in the time series compared with the counterfactual or extrapolation of the baseline trend (Figure 1-2). Because of the use of a baseline trend, one can control for a number of threats to validity such as history (e.g., ongoing changes in medical knowledge and practice), maturation (e.g., aging of the study population), and changing composition of the study population.
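The segmented regression idea can be sketched in a few lines. The model assumed here is y_t = b0 + b1*t + b2*post_t + b3*(t - start)*post_t, where b2 is the change in level at the policy date and b3 is the change in slope afterward. The function names and the `start` parameter are illustrative assumptions. The sketch fits ordinary least squares only and ignores autocorrelation, which a real ITS analysis would typically need to examine.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_segmented_regression(y, start):
    """Least-squares fit of the segmented model described above.

    `start` is the index of the first post-policy observation.
    Returns (b0, b1, b2, b3): baseline level, baseline slope,
    level change at the policy, and slope change after it.
    """
    rows = []
    for t in range(len(y)):
        post = 1.0 if t >= start else 0.0
        rows.append([1.0, float(t), post, (t - start) * post])
    k = 4
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yt for r, yt in zip(rows, y)) for i in range(k)]
    return tuple(solve_linear_system(xtx, xty))


def counterfactual(b0, b1, t):
    """Extrapolated pre-policy trend at time t (the assumed counterfactual)."""
    return b0 + b1 * t
```

The estimated policy effect at any post-policy time is the difference between the fitted post-policy line and `counterfactual(b0, b1, t)`, which mirrors the comparison against the extrapolated baseline trend described in the text.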
A good illustration of this, using real-world data, was a study that looked at the effects of a Medicaid drug coverage limit (cap) in New Hampshire on use of essential and nonessential drugs (Soumerai et al. 1987) (see Figure 1-3). After implementation of the cap, prescriptions filled by chronically ill Medicaid patients in New Hampshire dropped sharply and affected both essential (e.g., cardiac and antidiabetic) and nonessential drugs. The time series clearly shows a 46 percent sudden reduction in the level of medication use from the pre-intervention trend, and an immediate increase in both level and trend when the policy was suspended, adding to the internal validity of the findings (off-on-off design). A similar approach was used to look at the effects of this drug cap on nursing home admissions. Segmented survival analysis shows that this policy increased nursing home admissions among chronically ill elderly (Soumerai et al. 1991). Taken together, these data indicate the intended and unintended consequences of this policy. Similar work has been done on schizophrenia, in which the limitations on drug coverage affected the use of psychotropic agents and substantially increased utilization of acute mental health services intended to treat psychotic episodes among patients with schizophrenia (Soumerai et al. 1994). The Medicaid costs of such increased treatment exceeded the drug savings from the cap by a factor of 17 to 1. Similar time series data again show the remarkable and unintended consequences of this policy.
These types of studies can clearly provide good evidence for the effectiveness of health policies but they also have a significant effect on policy. While this is somewhat unusual for most academic research, the impact of
this study has in part supported the decision of six states and several countries in changing their health policy, was instrumental in developing reports from public advocacy groups such as the National Alliance on the Mentally Ill (NAMI) and the AARP to make the case for adequate drug benefits, and contributed to more rational state policy decisions. Recently, these data were used by CMS as evidence to support subsidies for up to 5 million poor or near-poor Medicare beneficiaries to achieve greater economic access to medications in the Medicare drug benefit (Tunis 2005). The policy impact of these studies was due, in part, to their very visible effects and because they produced usable and understandable information for policy makers. A key need however will be a system that values and actively seeks out such information. These data had significant impact but required the efforts of individuals who were trained to recognize the value of such data and able to reach out and find results that were relevant to their decision making (Soumerai et al. 1997). Based on the data reported in these two natural experiments (Soumerai et al. 1991; Soumerai et al. 1994), if the findings were applied to all 18 states with drug benefit caps today, it might be possible to reduce hundreds or thousands of nursing home admissions and psychotic episodes, while reducing net government health expenditures.
These types of quasi-experimental studies can answer a range of questions in the field of health services research and pharmacoepidemiology, and the use of interrupted time series designs is increasing (Smalley 1995; Tamblyn et al. 2001; Ray et al. 2003). One example is a study done by Tamblyn et al., in which interrupted time series analysis looked at the effect of changes in cost sharing on the elderly and welfare populations in Quebec in terms of the use of essential and nonessential drugs, rate of emergency department visits, and serious adverse events associated with reductions in drug use before and after policy implementation. The study, which has significant application to the Medicare drug benefit in the United States, showed that increased cost sharing led to a reduction in use of essential drugs, particularly among the adult welfare population. This was associated with higher rates of serious adverse events and emergency department (ED) visits. The findings of this study, combined with public advocacy pressure, caused the policy of increased cost sharing to be rescinded, with major impacts on illness and potential mortality. A similar study (Roblin et al. 2005) was used to determine the effects of increased cost sharing on oral hypoglycemic use and found that within 5 HMOs, an increase of ≥$10 in cost sharing for the intervention resulted in an immediate and persistent decline in oral hypoglycemic use, which was not observed with smaller incremental increases (see Figure 1-4).
As a final example of the potential utility of this approach, an interrupted time series (Ross-Degnan et al. 2004; Pearson 2006) examined the effect of a triplicate prescription policy on the likelihood of benzodiazepine use in Medicaid patients. To set the background, there are several clinical issues regarding the utilization of benzodiazepines, which are commonly used as hypnotics, anxiolytics, and muscle relaxants, but are also used for seizure and bipolar disorders (APA 1998; Bazil and Pedley 1998; Henriksen 1998). In elderly populations, they are considered to confer an increased risk of falls, given their side effects, and, as controlled substances under Schedule IV of the Drug Enforcement Administration (DEA), they carry an additional risk of dependence. Thus policies have been enacted to attempt to reduce inappropriate prescribing of this drug class. The study demonstrated that, upon initiation of a triplicate prescription policy for benzodiazepines, there was an abrupt reduction in the prescribing of the entire class of drug, with equal effects on likely appropriate (e.g., short-term, low-dose) and inappropriate use. Examination of the effect of this policy by race showed that despite the observation that black populations were about half as likely to use benzodiazepines to begin with, the triplicate prescription policy disproportionately affected blacks, with an approximately 50 percent greater likelihood of prescription stoppage due to the policy (Pearson 2006) (see Figure 1-5). Thus, while there may be a high rate of inappropriate use of this class of medication overall, the policy caused an unintended decrease in appropriate use of benzodiazepines in a way that disproportionately affects black populations. This leads to a different sort of question: Is it ever appropriate to halt the use of an entire class of drug? Even if 50 percent of the use of that class is inappropriate, the other 50 percent may be very important for the health and well-being of those patients. To further confound the issue, our study published in the January 2007 issue of the Annals of Internal Medicine shows that the above reduction in benzodiazepine use among elderly Medicaid or Medicare patients did not result in any change in rates of hip fracture, which casts doubt on the conventional wisdom derived from 20 years of epidemiological research (Wagner et al. 2007 [in press]).
In summary, quasi-experimental design has many benefits and can clearly delineate effects of policy on health outcomes and healthcare utilization. Interrupted time series designs address many threats to validity. As natural experiments, these studies are cheaper and faster than RCTs, can use existing data, and are useful when RCTs are not feasible (e.g., most policies cannot be randomized). Because analysis often produces very visible
effects (especially if interventions are applied suddenly) and these visible effects are often significant, this type of research conveys an intuitive understanding of the effects of policy decisions and has had a significant impact on changing and influencing health policy decisions. The use of comparison series increases validity and can be used to look at high-risk subgroups and unintended outcomes. In addition, these studies can also illuminate the mechanism of these effects, for example, by observing simultaneous changes in processes (e.g., adherence) and health outcomes (e.g., hospital admission). Currently much of this research is conducted at the Centers for Education and Research in Therapeutics and is funded primarily by AHRQ. More quasi-experimental studies of natural experiments are needed, especially on the effects of numerous changes in coverage, cost sharing, and utilization in the Medicare drug benefit. This will require more extensive training of clinical and social scientists, journal editors, and federal study sections to recognize the potential strengths of quasi-experimental design in health policy evaluation. In light of the substantial impact that health policies can have on the population’s health, there is a need to redress the relative scarcity of scientific data on the outcomes of policy interventions.
PRACTICAL CLINICAL TRIALS
Sean Tunis, M.D., M.Sc.
Health Technology Center
Developing useful and valid evidence for decision making requires several steps, all of which must be successfully completed with full engagement of the key decision makers—patients, clinicians, and payers. These include (1) identifying the right questions to ask, (2) selecting the most important questions for study, (3) choosing study designs that are adequate to answer the questions, (4) creating or partnering with organizations that are equipped to implement the studies, and (5) finding sufficient resources to pay for the studies. This paper discusses specific case studies that highlight progress and challenges related to each of these steps from the perspective of those decision makers who are trying to decide whether particular healthcare services are worth paying for. The primary focus is on identifying relevant questions, real-world (practical, pragmatic) study designs and funding, which are listed in their logical order, as well as in the order of increasing challenge. Considerable work is needed to develop more efficient and affordable methods for generating reliable data on the comparative effectiveness of healthcare services. While observational methods may have value for selected questions, increasing attention is necessary to developing strategies to design and implement faster, larger, and more efficient prospective intervention studies. Serious attention, along with some creative thinking, will also be required to ensure that adequate and sustainable real-time funding is available to support what might be called “decision-based evidence making.”
Clinical trials can be thought of as falling into two broad categories: explanatory trials and pragmatic (or practical) trials. Explanatory trials focus on the mechanism of disease and whether things can work under optimal circumstances. Pragmatic or practical clinical trials (PCTs) are designed to inform choices between feasible alternatives or two different treatment options by estimating real-world outcome probabilities for each (Schwartz and Lellouch 1967; Tunis et al. 2003). That is, PCTs are purposefully designed to answer the specific policy or clinical questions of those who make policy and clinical decisions. Key features of a practical trial include meaningful comparison groups (i.e., generally not placebo, but rather comparisons to other reasonable alternatives); broad patient inclusion criteria, such that the patient population offers the maximum opportunity for generalizability of study results; multiple outcome measures, including functional status and resource utilization; conduct in real-world settings (where research is not the primary organizational purpose); and minimal interventions to ensure patient and clinician compliance with the interventions being studied.
The underlying notion of PCTs is to design studies in ways that maximize the chances that the results will be translatable and implementable. The usual concept of translation is to design studies without careful consideration of the needs of decision makers, and then use various communication techniques to encourage adoption of the results. However, the major barrier to translation may be how the research question was framed, which patients were enrolled, et cetera. PCTs make a serious effort to anticipate potential translation problems and address those through the design of the study.
One key consideration in designing a study is to determine at the outset whether a non-experimental design would be adequate to persuade decision makers to change their decision. For example, most technology assessment organizations exclude nonrandomized studies from systematic reviews because they know that payers will not consider such studies in their coverage decisions. Given that, one would need to think carefully about the value of conducting observational studies if those studies are intended to influence coverage decisions. There are certain questions of real-world significance that cannot be answered through the collection of data generated in the routine provision of care, whether administrative data or data extracted from EHRs. In those circumstances, creative methods will need to be developed for conducting prospective real-world comparative effectiveness studies.
Over the past several years, a working group of methodologists—the Pragmatic Randomized Controlled Trials in Healthcare, or PRaCTIHC, workgroup—has been working to identify a set of domains upon which trials can be rated according to the degree to which they are pragmatic or explanatory. These domains include study eligibility criteria, flexibility of the intervention, practitioner expertise, follow-up intensity, follow-up duration, participant compliance, practitioner adherence to protocol and intervention, and primary analysis scope and specification. For each domain, there are definitions that indicate whether a study is highly explanatory, highly pragmatic, or somewhere in between. For example, on the patient eligibility domain, a maximally pragmatic approach would enroll “all comers” who might benefit from the intervention regardless of prior information about their risk, responsiveness to the intervention, or past compliance. The maximally explanatory trial would enroll only individuals or clusters thought, on the basis of prior information, to be at high risk, highly responsive to the intervention, and who have demonstrated high compliance in a pre-trial test. Similarly, the highly pragmatic and highly explanatory descriptors for the expertise of practitioners taking care of the patients are as follows: A maximally pragmatic approach would involve care applied by usual practitioners (regardless of expertise), with only their usual monitoring for dose setting and side effects (and no external monitoring). The explanatory
approach would involve care applied by expert practitioners who monitor patients more closely than usual for optimal dose setting and the early detection of side effects. These domains can be represented pictorially on numbered scales, creating a two-dimensional image that provides information about the degree to which the trial is pragmatic or explanatory, and on what specific domains. These maps then convey the nature of a trial and give a sense to clinical or health policy makers on how best to consider that trial in informing their decisions. For example, mapping a highly pragmatic trial such as the Directly Observed Therapy for TB study, gives a very different map than that of the highly explanatory North American Symptomatic Carotid Endarterectomy Trial. Such a visual representation could be very useful for decision makers in evaluating evidence by providing a common framework in which to understand strengths and weaknesses of evidence. The instrument is currently being piloted and refined by a Cochrane workgroup and various other Evidence-Based Medicine (EBM) experts, with the goal of developing a valid and reliable instrument for scoring the studies included in systematic reviews.
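A rating instrument of this kind can be sketched as a simple data structure. The domain names follow the list given above, but the 1-to-5 scale, its direction (1 = highly explanatory, 5 = highly pragmatic), and the function names are assumptions made for illustration. They are not taken from the workgroup's instrument, and no ratings of actual trials are implied.

```python
DOMAINS = [
    "eligibility criteria",
    "intervention flexibility",
    "practitioner expertise",
    "follow-up intensity",
    "follow-up duration",
    "participant compliance",
    "practitioner adherence",
    "primary analysis",
]

# Assumed scale: 1 = highly explanatory, 5 = highly pragmatic.
SCALE_MIN, SCALE_MAX = 1, 5


def validate_ratings(ratings):
    """Check that every domain has an integer rating within the scale."""
    missing = [d for d in DOMAINS if d not in ratings]
    if missing:
        raise ValueError(f"missing domains: {missing}")
    for d in DOMAINS:
        value = ratings[d]
        if not isinstance(value, int) or not SCALE_MIN <= value <= SCALE_MAX:
            raise ValueError(f"rating for {d!r} must be an integer from "
                             f"{SCALE_MIN} to {SCALE_MAX}")
    return ratings


def text_map(ratings):
    """Render one bar per domain so two trials can be compared at a glance."""
    validate_ratings(ratings)
    width = max(len(d) for d in DOMAINS)
    lines = []
    for d in DOMAINS:
        bar = "#" * ratings[d] + "." * (SCALE_MAX - ratings[d])
        lines.append(f"{d.ljust(width)} {bar} {ratings[d]}")
    return "\n".join(lines)
```

Printing `text_map` for two hypothetical trials side by side gives a crude text version of the two-dimensional picture described above, showing on which domains each trial leans pragmatic or explanatory.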
The Medicare program has attempted to build interest in pragmatic clinical trials by highlighting the value of such trials in making national coverage decisions (NCDs). One such example is the pragmatic trial of FDG-PET for suspected dementia, which Medicare called for in the context of an NCD in 2004. The original noncoverage decision for use of FDG-PET in the context of suspected dementia was issued in April 2003. The decision was based in part on a decision analysis conducted for and reviewed by the Medicare Coverage Advisory Committee. It was concluded that, even though FDG-PET had greater sensitivity and specificity than expert evaluation by a neurologist, the scan provided no measurable benefit because existing treatments are very safe and of limited benefit. The improved accuracy of the diagnosis therefore offered no clinical benefit to the patient undergoing the study. Because this was a controversial decision, CMS agreed to continue to review the technology with the assistance of an expert panel convened by NIH. CMS also committed to conducting some type of demonstration project of FDG-PET for suspected dementia. The NIH panel analysis concluded that FDG-PET was a promising diagnostic tool but recommended that additional evidence be developed before broad adoption of the technology. In September 2004, a revised NCD was issued providing limited coverage of FDG-PET for suspected dementia when the differential diagnosis included frontotemporal dementia (FTD) and Alzheimer’s disease (AD). The NCD also allowed for broad coverage of FDG-PET in the context of a large community-based practical clinical trial designed to evaluate the clinical utility of this test.
Since this NCD was issued in September 2004, a group at the University of California-Los Angeles (UCLA) has designed such a PCT. The proposed study would enroll 710 patients with diminished cognitive function at nine academic centers that specialize in the evaluation and management of AD. Every patient enrolled in the study would undergo an FDG-PET scan, but would be randomized as to whether the results of the scan are made available at the time of the scan or whether they are sealed and available only at the end of the two-year trial. The outcomes to be measured include initial or working diagnosis, initial management plan, measures of cognitive decline, utilization of other imaging studies, functional status, and percentage of patients admitted to nursing homes within two years. CMS has reviewed this study, approved the trial design, and agreed to cover the costs of PET scans for all patients enrolled in the study. However, as of late 2006, funding had not been approved for the research costs of this study. Efforts are still under way to secure this funding, but more than two years have passed since the coverage decision was issued. Some of the questions that the case study poses are the following:
Is it a good idea or is it necessary to do a pragmatic trial of PET for AD? Such a trial would require substantial resources to organize and implement, and it would be necessary to ensure that the value of the information would be sufficient to justify the effort and expense. Some experts have proposed an approach called “value of information analysis,” and this type of careful consideration would be essential to ensuring that the proposed question was worth answering, that the optimal method would be a PCT, and that this topic would be a high priority in the context of the range of questions that might be addressed.
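One simple form of value of information analysis is the expected value of perfect information (EVPI): the gap between the payoff expected if the uncertainty were resolved before choosing and the payoff of the best choice made under current uncertainty. The sketch below is illustrative only. The scenarios, probabilities, option names, and net benefit figures are hypothetical and do not describe FDG-PET or any actual coverage decision.

```python
def expected_value_of_perfect_information(scenarios):
    """Compute EVPI for a discrete decision problem.

    `scenarios` is a list of (probability, {option: net_benefit}) pairs,
    with probabilities summing to 1 and the same options in every scenario.
    """
    options = list(scenarios[0][1].keys())
    # Best option chosen now, under current uncertainty.
    value_now = max(sum(p * nb[o] for p, nb in scenarios) for o in options)
    # Expected payoff if the true scenario were known before choosing.
    value_informed = sum(p * max(nb.values()) for p, nb in scenarios)
    return value_informed - value_now


# Hypothetical inputs, in arbitrary net benefit units per patient.
example = [
    (0.3, {"adopt": 100, "do not adopt": 0}),   # technology turns out useful
    (0.7, {"adopt": -50, "do not adopt": 0}),   # technology turns out not useful
]
evpi_per_patient = expected_value_of_perfect_information(example)
```

Multiplying the per-patient EVPI by the number of patients affected gives an upper bound on what a study that resolves the uncertainty could be worth, which can then be compared with the study's cost.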
Is it necessary to conduct a study prior to adoption widely into practice; are there alternative methodologies to the method developed; could this be better evaluated through a quasi-experimental study design? Would observational methods possibly provide sufficiently reliable information? For the most part, such questions have not been addressed prior to the design and funding of clinical research, probably because decision makers have not taken a prominent role in the selection and design of clinical research studies.
Then, for whatever study is proposed and necessary, how is it going to happen? Who will design, fund, and implement the study? While Medicare now has coverage in place for the use of FDG-PET in cases of suspected dementia and at least two reasonable trial protocols have been developed, funding of the studies has not yet been secured, and there is still currently no access for Medicare patients to PET scanning for AD. Conversations are now under way to discuss how best to expand the infrastructure to support comparative effectiveness research, including PCTs such as the one envisioned by Medicare.
Where and how would it best be implemented? How could you actually connect the efforts of the UCLA group with test beds such as AHRQ’s Developing Evidence to Inform Decisions about Effectiveness (DEcIDE) network, the HMO research network, practice-based research networks, the American College of Radiology Imaging Network, et cetera? With adequate funding, existing research networks could be engaged to conduct these studies, and new networks would likely be established as demand increases.
Although there are groups willing to design and implement PCTs, an important limitation faced by these groups is the lack of current organizational capacity to focus on their design and implementation. One effort under way to address this need is a project recently initiated at the Center for Medical Technology Policy (CMTP). CMTP is based in San Francisco and funded by the California Health Care Foundation and the Blue Shield of California Foundation. It provides a neutral forum for decision makers (payers, purchasers, patients, and clinicians) to take leadership in creating evidence about comparative clinical effectiveness, with an emphasis on those questions for which prospective, experimental studies may be required. The CMTP will focus on identifying critical knowledge gaps and priority setting of studies from the perspective of decision makers; study design from a pragmatic perspective (i.e., focused on information important to decision makers); and implementation that is rapid, affordable, and still provides reliable evidence for decision making. The work of CMTP is intended to be complementary to the work of the Institute of Medicine (IOM) Roundtable on Evidence-Based Medicine, since its focus is to develop pilot projects around evidence gaps and on promising technologies of high demand or with a major health impact. Some examples include CT angiography, molecular diagnostic tests, intensity-modulated radiation therapy (IMRT), tele-ICU (intensive care unit), and bariatric surgery (minimally invasive). Each of these examples is a technology that is promising, important, and unlikely to be evaluated through the structure and funding mechanisms of the current clinical research enterprise.
COMPUTERIZED PROTOCOLS TO ASSIST CLINICAL RESEARCH
Alan H. Morris, M.D.
Latter Day Saints Hospital and University of Utah
Adequately explicit methods and computerized protocols could allow researchers to efficiently conduct large-scale complex clinical studies and enable translation of research methods into clinical practice. Additionally, they could formalize experiential learning and provide an innovative means of enhancing education for clinicians and clinical researchers. These tools should be embraced by the nation’s clinical research infrastructure.
Clinicians and clinical investigators do not have tools that enable uniform decision making. As a result, currently conducted clinical trials, es-
pecially non-blinded trials, do not use replicable experimental and clinical care methods. This may explain why many critical care clinical trials have failed to produce evidence of clinical benefit, in spite of large investments of resources (Marshall 2000). The disappointingly low quality of critical care clinical trials (Cronin et al. 1995; Lefering and Neugebauer 1995; Cook et al. 1996) could, in part, be due to the widespread use of suboptimal methods. Meta-analyses cannot overcome this low clinical trial quality since meta-analyses can generate credible conclusions only if the analyzed clinical trial data are credible and representative (Morris and Cook 1998; LeLorier et al. 1997). Meta-analyses focus on methodology at the trial design scale (e.g., were true randomization and effective blinding employed?) but do not deal with the methodologic details of the patient-clinician encounter for either outpatient (Johnson 2004) or critical care (Cronin et al. 1995; Lefering and Neugebauer 1995; Cook et al. 1996) clinical trials. The medical community is thus challenged to develop replicable clinical trial methods and to use them to produce more rigorous clinical experiments and results.
Similar challenges exist in the translation of findings to practice. Two major problem areas impede effective healthcare responses and healthcare learning: information overload and absence of effective tools to aid decision-makers at the point and time of decision. Human decision-makers are limited by short-term memory constraints, making them able to deal effectively with only about four individual constructs when making a decision (Morris 2006). This contrasts, strikingly, with the hundreds of patient variables, thousands of constructs, and tens to hundreds of thousands of published documents faced by clinical decision-makers. The recent introduction of genomics and proteomics into medicine has only compounded the problem and will likely increase the information overload by orders of magnitude. Nevertheless, we still depend on a four-year medical school model for learning, influenced by the early twentieth century Flexner Report. Even with current extended postgraduate medical training, we need a new approach to learning in medicine.
Clinician Performance and Clinical Trial Reproducibility
These issues are highlighted by the unnecessary variation in clinical practice that was brought to the healthcare community’s attention in the 1970s (Wennberg and Gittelsohn 1973) and appears to be an unavoidable feature of modern medicine (Senn 2004; Lake 2004; Wennberg and Gittelsohn 1973; Wennberg 2002; Morris 2004). The argument that variability is desirable because of individual patient needs incorporates two assumptions. First is the assumption that clinicians can consistently tailor treatment to a patient’s specific needs, particularly when reliable evidence for preferable therapies is absent. However, clinicians cannot easily predict who will re-
spond to a specific intervention (Senn 2004; Lake 2004) and frequently fail to deal correctly with the individualized needs of patients and thereby cause harm (Silverman 1993; IOM 2001; Horwitz et al. 1996; Redelmeier et al. 2001; Berwick 2003; Runciman et al. 2003; Barach and Berwick 2003; Sox 2003; Senn 2004; Lake 2004; Corke et al. 2005; Berwick 2005; Redelmeier 2005; Warren and Mosteller 1993; IOM 1999). In general, variability is fostered by incorrect perceptions (Morris 2006; Arkes and Hammond 1986; Arkes 1986; Morris 1998, 2000a) and is associated with unwanted and widespread error (IOM 1999; IOM 2001; Runciman et al. 2003; Leape et al. 2002; Kozer et al. 2004; Schiff et al. 2003; Lamb et al. 2003; Zhang et al. 2002). For many, if not most medical interventions, the medical community and the community of patients can only draw conclusions about the balance between potential good and harm through examination of the results of systematic investigations. Second is the assumption that nonuniformity is itself desirable because it fosters insight and innovation. However, many questions addressed in modern medicine frequently involve small improvements (odds ratios of 3 or less) that will escape the attention of most observers if not examined within systematic studies (Hulley and Cummings 1988). The mismatch between human decision-making ability (Redelmeier et al. 2001; Redelmeier 2005; Kahneman et al. 1982) and the excess information clinicians routinely encounter probably contributes to both the variability of performance and the high error rate of clinical decisions (Tversky and Kahneman 1982; Jennings et al. 1982; McDonald 1976; Abramson et al. 1980; Morris et al. 1984; Morris 1985; Morris et al. 1985; Iberti et al. 1990; Iberti et al. 1994; Leape 1994; Gnaegi et al. 1997; Wu et al. 1991). 
The clinical process improvement movement has successfully adopted a standardization approach (Berwick 2003; James and Hammond 2000; Berwick 1994; Horn and Hopkins 1994; James et al. 1994). Without standardization, our chances of detecting promising elements of clinical management are reduced and frequently low.
It is not enough that the medical community develops standards—clinicians must also follow standards. There is a chasm between perception and practice as well as between healthcare delivery goals and their achievement (IOM 2001). These lead to error and reduce the quality of medical care (IOM 1999). This, in part, has led to the NIH Roadmap call for strategies and tools for translation of research results into clinical practice. This effort should involve a serious engagement with clinical trials. Even with compelling clinical trial results, compliance of physicians with evidence-based treatments or guidelines is low across a broad range of healthcare topics (Evans et al. 1998; Nelson et al. 1998; Schacker et al. 1996; Kiernan et al. 1998; Galuska et al. 1999; Dickerson et al. 1999) and persists even when guidelines based on reputable evidence are available (Akhtar et al. 2003; Rubenfeld et al. 2004; Safran et al. 1996; Redman 1996). Many
factors, including cultural issues and health beliefs, influence compliance (Cochrane 1998; Jones 1998). Widespread distribution of evidence-based guidelines (Schultz 1996; Lomas et al. 1989; Greco and Eisenberg 1993) and education programs (Singer et al. 1998; Pritchard et al. 1998; Teno et al. 1997a; Teno et al. 1997b; Lo 1995) have had only limited impact on this low compliance. However, both paper-based and computerized decision support tools that provide explicit, point-of-care (point-of-decision making) instructions to clinicians have overcome many of these problems and have achieved clinician compliance rates of 90-95 percent (Morris 2000a; East et al. 1999; Acute Respiratory Distress Syndrome Network 2000a, 2000b). There is no threshold beyond which protocols are adequately explicit. Our current operational definition of adequate clinician compliance is: clinicians adequately comply when they accept and carry out at least 90 percent of protocol instructions (Morris 2006, 1998, 2003).
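The operational definition of adequate compliance can be written as a simple calculation. The following is a minimal sketch in Python, not the authors' software: the function names and the example counts are hypothetical, and only the 90 percent threshold comes from the text.

```python
ADEQUATE_COMPLIANCE = 0.90  # threshold stated in the text (at least 90 percent)


def compliance_rate(accepted: int, issued: int) -> float:
    """Fraction of protocol instructions that clinicians accepted and carried out."""
    if issued <= 0:
        raise ValueError("at least one instruction must have been issued")
    if not 0 <= accepted <= issued:
        raise ValueError("accepted must lie between 0 and issued")
    return accepted / issued


def is_adequate(accepted: int, issued: int) -> bool:
    """True when the compliance rate meets the 90 percent operational definition."""
    return compliance_rate(accepted, issued) >= ADEQUATE_COMPLIANCE


# Hypothetical counts, for illustration only.
print(is_adequate(accepted=93, issued=100))
print(is_adequate(accepted=85, issued=100))
```

Counting an instruction as "accepted" only when it is both accepted and carried out follows the wording of the definition; how a partially executed instruction should be counted is not specified in the text.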
Adequately Explicit Methods
Replication of an experimental result requires a detailed experimental method. This is a challenge for clinical trials for two reasons. First, editorial policies severely restrict methodologic detail in publications. Second and more important, most clinical trials are not conducted with adequately explicit methods. For example, high-frequency ventilation studies in neonates (Courtney et al. 2002; Johnson et al. 2002) were described (Stark 2002) as “rigorously controlled conditions with well-defined protocols” (Courtney et al. 2002) with a method that includes “aggressive weaning if blood gases … remained … in range.” This method statement will not lead different clinicians to the same interpretations and actions. Thus the method is not adequately explicit. Adequately explicit methods include detailed and specific rules such as “if (last PaO2 – current PaO2) < 10, and (current PaO2 time – last PaO2 time) < 2 hours and > 10 minutes, and FIO2 > 0.8, and PEEP > 15 cm H2O, then decrease PEEP by 1 cm H2O.” A rule such as this can lead multiple clinicians to the same decision.
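Because the example rule is fully specified, it can be encoded directly. The following is a minimal sketch in Python and is not taken from any of the computerized protocols discussed here. The class and function names are hypothetical, the PaO2 unit (mm Hg) is an assumption because the quoted rule does not state one, and the "no change" branch is added only so that the function always returns an instruction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatientState:
    last_pao2: float        # previous PaO2 (mm Hg assumed; unit not given in the rule)
    current_pao2: float     # most recent PaO2 (same unit)
    minutes_between: float  # current PaO2 time minus last PaO2 time, in minutes
    fio2: float             # fraction of inspired oxygen, 0.21 to 1.0
    peep: float             # positive end-expiratory pressure, cm H2O


def peep_instruction(state: PatientState) -> str:
    """Apply the single quoted rule; every threshold is copied from the text."""
    pao2_fall = state.last_pao2 - state.current_pao2
    if (pao2_fall < 10
            and 10 < state.minutes_between < 120  # more than 10 minutes, less than 2 hours
            and state.fio2 > 0.8
            and state.peep > 15):
        return f"Decrease PEEP to {state.peep - 1:g} cm H2O"
    # Hypothetical fallback; a real protocol would evaluate its other rules here.
    return "No change from this rule"


# Identical inputs always yield the identical instruction.
print(peep_instruction(PatientState(70, 65, 60, 0.9, 16)))
```

The point of the sketch is replicability: two clinicians (or two sites) who supply the same five values receive the same instruction, which is not true of a statement such as "aggressive weaning if blood gases remained in range."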
The explicitness of protocols varies continuously. An adequately explicit protocol has detail adequate to generate specific instructions (patient-specific orders). An adequately explicit protocol can elicit the same decision from different clinicians when faced with the same clinical information. Inadequately explicit protocols omit important details (Armstrong et al. 1991; Don 1985; Karlinsky et al. 1991) and elicit different clinical decisions from different clinicians because clinical decision-makers must fill in gaps in the logic of inadequately explicit protocols or guidelines. Judgment, background, and
experience vary among clinicians and so will their choices of the rules and variables they use to fill in the gaps of inadequately explicit guidelines and protocols. In addition, because humans are inconsistent, any single clinician may produce different choices at different times, even though faced with the same patient data (Morris 2006; Morris 1998; Morris 2003).
Computerized adequately explicit protocols can contain the greatest detail (East et al. 1992) and may lead to the upper limit of achievable uniformity of clinician decision making with open-loop control (East et al. 1999; Henderson et al. 1992; Morris et al. 1994; Morris 2000b). Closed-loop controllers, by contrast, automate processes and automatically implement decisions (Sheppard et al. 1968; Blesser 1969; Sheppard 1980; Sheppard et al. 1974). Paper-based versions can also contain enough detail to be adequately explicit (Acute Respiratory Distress Syndrome Network 2000a, 2000b). Adequately explicit protocols, unlike general guidelines, can serve as physician orders and can function as dynamic standing orders since they can respond to changes in patient state. In continuous quality improvement terms, an adequately explicit method is part of the “stabilization of process” necessary to improve quality (Deming 1986; Shewart 1931; Walton 1986; Deming 1982).
In manuscripts currently being prepared, computerized protocol decision support provides exportable and replicable adequately explicit methods for clinical investigation and clinical care. These computerized protocols enable replicable clinician decisions in single or multiple clinical sites. They thus enable a rigorous scientific laboratory at the holistic clinical investigation scale. We have, during the past two decades (Morris 2006; East et al. 1999; East et al. 1992; Henderson et al. 1992; Morris et al. 1994; Morris 2000b), used computerized protocols for more than 1 million hours in thousands of patients in multiple hospitals with support from NIH (Extra-corporeal CO2 Removal in ARDS, Acute Respiratory Distress Syndrome Network, Reengineering Clinical Research in Critical Care) and AHRQ (Computerized Protocols for Mechanical Ventilation in ARDS).
Our current work with a computerized protocol to control blood glucose with intravenous (IV) insulin (eProtocol-insulin) provides a case study. We compared three 80-110 mg/dL blood glucose target decision support strategies with different detail and process control: (1) a simple guideline (without a bedside tool), (2) a bedside paper-based protocol, and (3) a bedside computerized protocol (eProtocol-insulin). The distributions of blood glucose were significantly different (P < .001) as were the mean blood glucose values and fractions of measurements within the 80-110 mg/dL target range. Thereafter, eProtocol-insulin was introduced and used
at these multiple clinical sites, located in different cultures, after which the blood glucose distributions became almost superimposable. We conclude that eProtocol-insulin provides a replicable and exportable experimental method in different cultures (southeastern and western United States, Asian). eProtocol-insulin has also been used, with similar results, in pediatric ICUs, leading to the conclusion that a common set of rules can operate successfully in both pediatric and adult medicine.
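The comparison described above rests on two summary statistics for each strategy: the mean blood glucose and the fraction of measurements within the 80-110 mg/dL target range. The following is a minimal sketch in Python of how those two quantities could be computed. The function name is hypothetical, the sample values are invented for illustration and are not study data, and treating both range limits as inclusive is an assumption.

```python
from statistics import mean

TARGET_LOW = 80    # mg/dL, lower limit of the target range given in the text
TARGET_HIGH = 110  # mg/dL, upper limit of the target range given in the text


def summarize_glucose(values_mg_dl):
    """Return the mean and the fraction of measurements inside the target range."""
    if not values_mg_dl:
        raise ValueError("at least one measurement is required")
    # Both limits are treated as inside the range (an assumption).
    in_range = sum(1 for v in values_mg_dl if TARGET_LOW <= v <= TARGET_HIGH)
    return {
        "mean": mean(values_mg_dl),
        "fraction_in_target": in_range / len(values_mg_dl),
    }


# Invented measurements, for illustration only.
example = [75, 80, 95, 110, 140]
print(summarize_glucose(example))
```

Comparing whole distributions between sites, as the text describes, would require a formal statistical test in addition to these summaries; the sketch does not attempt that.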
The rules and knowledge base of eProtocol-insulin were embedded in a clinical care electronic medical record in Intermountain Health Care, Inc. Multiple ICUs in different hospitals used eProtocol-insulin for blood glucose management in the usual clinical care of thousands of patients. Instructions for patient-tailored treatment with insulin were generated for more than 100,000 blood glucose measurements with high clinician bedside acceptance (>90 percent of the instructions) and adequate safety. This is a direct approach with computerized decision support tools (eProtocol-insulin) to rapidly translate research results (eProtocol-insulin developed for clinical investigation) into usual clinical practice.
In addition, in studies of computerized protocols for mechanical ventilation of patients with lung failure, the performance of the computerized protocols exceeds that of clinicians who adopt the same goals and metarules as those in the computerized protocol. This suggests that computerized adequately explicit protocols can function as enabling tools that lead clinician decision-makers to more consistently produce the clinical decisions they desire.
Adequately Explicit Methods and Scientific Experimental Requirements
Guidelines and protocols can reduce variation and increase compliance with evidence-based interventions, can effectively support clinical decision making (Grimshaw and Russell 1993), and can influence favorably both clinician performance and patient outcome (Safran et al. 1996; East et al. 1999; Grimm et al. 1975; Wirtschafter et al. 1981; Johnston et al. 1994; Mullett et al. 2001). They likely reduce error (Morris 2002), but this has not been formally studied. Simple protocols, such as physician reminders for a serum potassium measurement when a diuretic is ordered, are commonly employed (Hoch et al. 2003) and have an intuitive appeal to clinicians. More complex protocols have the same potential to aid clinicians and reduce error, but they are more difficult to comprehend. Decision support tools such as guidelines and protocols (Miller and Goodman 1998) are intended to standardize some aspect of clinical care and thereby help lead to uniform implementation of clinical interventions (IOM 1990; Tierney et al. 1995; Fridsma et al. 1996; Ely et al. 2001; MacIntyre 2001). However, many guidelines and protocols lack specific instructions and
are useful only in a conceptual sense (Tierney et al. 1995; Fridsma 1996; Audet, Greenfield, and Field 1990; Fletcher and Fletcher 1990; Hadorn et al. 1992; Miller and Frawly 1995; Tierney et al. 1996). They neither standardize clinical decisions nor lead to uniform implementation of clinical interventions. Guidelines are general statements with little instruction for making specific decisions (Guidelines Committee Society of Critical Care Medicine 1992). In contrast, protocols are more detailed and can provide specific instructions. Unfortunately, even systematic and scholarly collections of flow diagrams commonly lack the detail necessary to standardize clinical decisions (Armstrong et al. 1991; Don 1985; Karlinsky et al. 1991). The distinction between guidelines and protocols, particularly adequately explicit protocols, is crucial (Morris 1998, 2000a; Holcomb et al. 2001). Most clinical investigators do not seem to recognize the difference between common ordinary protocols and guidelines and the uncommon adequately explicit methods that satisfy the scientific requirement of replicability (Morris 1998, 2000a; Hulley and Cummings 1988; Morris 2003; Pocock 1983; Atkins 1958).
Replicability of experimental results (confirmation of an observation) is a fundamental requirement for general acceptance of new knowledge in scientific circles (Campbell and Stanley 1966; Guyatt et al. 1993, 1994; Justice et al. 1999; Emanuel et al. 2000; Babbie 1986; Barrow 2000; Hawe et al. 2004). Actual or potential replicability of results is a basic requirement of all rigorous scientific investigation, regardless of scale (Campbell and Stanley 1966; Barrow 2000; Giancoli 1995; Pocock 1983; Bailey 1996; Piantadosi 1997; Brook et al. 2000; Hulley et al. 2001; Sackett et al. 1991; Babbie 1990). It applies equally to reductionist research in cell biology and to holistic research in the integrated clinical environment. Recognition of the scale of inquiry (investigation) is important because results at one scale may be inapplicable at another (Morris 1998; Mandelbrot 1983). The Cardiac Arrhythmia Suppression Trial provides one example. While the test drugs effectively suppressed premature ventricular contractions (at the electrophysiologic scale) they were associated with an excess death rate in the treatment group at the holistic clinical care scale (Cardiac Arrhythmia Suppression Trial [CAST] Investigators 1989; Greene et al. 1992). The disparity between the results at the electrophysiologic scale and those at the holistic clinical care scale is a sobering example of emergent properties of complex systems (Schultz 1996) and a striking reminder of the need for replicable holistic clinical outcome data from rigorously conducted clinical studies.
Evidence-based clinician decision-making information emanates from the holistic clinical environment and comes from two major sources: clinical studies and clinical experience. Advances at lower scales of inquiry cannot replace the study of sick patients in the holistic clinical environment (Morris 1998; Schultz 1996). Clinical studies include observational studies and the
more scientifically rigorous clinical trials (experiments). Usual experiential learning by clinicians can lead to important sentinel observations, but it also contributes to unnecessary variation in practice. Experiential learning is achieved within local contexts that contribute local bias. Since local contexts vary, this leads to highly variable, although strongly held, clinical opinions. Adequately explicit methods can also formalize experiential learning by enabling common methods among multiple users, sites, and disciplines (e.g., pediatrics, internal medicine). Interactions between different users of a common computerized protocol in an extended (distributed) development laboratory permit greater refinement of protocol rules than is likely to be realized in a single clinical development site. This reduces local bias and contributes to generalizability.
Attention to sources of nonuniformity between experimental groups is an essential part of experimental design (Pocock 1983; Hulley et al. 2001; Hennekens and Buring 1987; Rothman and Greenland 1998; Friedman et al. 1998; Chow and Liu 2004; Piantadosi 1997). Confounding variables (confounders) exist among the multiple variables that may determine subject outcome and can alter or reverse the results of clinical trials (Pocock 1983; Hulley et al. 2001; Hennekens and Buring 1987; Rothman and Greenland 1998; Cochrane-Collaboration 2001). Confounders can be present both before and after random allocation of subjects to the experimental groups of a clinical trial (Rothman and Greenland 1998). Those confounders present before allocation, commonly recognized in epidemiology texts (Hennekens and Buring 1987; Rothman and Greenland 1998), are usually adequately addressed by randomization and by restriction of randomized subjects. However, confounders introduced after subject assignment to the experimental groups include cointerventions (Hulley et al. 2001; Cochrane-Collaboration 2001; Sackett et al. 1991). Like the experimental intervention, cointerventions result from the interaction of the subject with the clinical environment (e.g., mechanical ventilation strategy, drug therapy for hypotension, intravenous fluid therapy, diagnostic strategies for suspected infection, monitoring intervals, laboratory tests, antibiotic therapy, sedation). They are easily overlooked but can alter or invalidate the results of clinical trials. Unlike the experimental intervention, cointerventions rarely receive adequate attention in the protocols of RCTs.
The Potential for Adequately Explicit Methods
Large-Scale Distributed Research Laboratories
Consider the difference in the number (N) of units of analysis for a chemical experiment involving 10 mL of 1M HCl and that of units of analysis for a clinical trial. The chemical experiment engages 6.02 × 10^21 interactions (0.01 × Avogadro’s number). A clinical trial usually involves no more than 1,000 patients. Because of the constraints of small N, clinical trials require sophisticated, and frequently difficult to understand, statistical analyses. To overcome this serious clinical experimental limitation, the medical community should, in my opinion, develop large-scale distributed human outcomes research laboratories. These laboratories could be developed within the clinical care environment on multiple continents if we had easily distributable, replicable clinical experimental and clinical care methods. Such laboratories could, for example, deliver 200,000 ICU experimental subjects within a few months. They could enable the experimental definition of dose-response curves, rather than the common current goal of comparing two experimental groups. They could avoid the onerous attributes of current clinical trial techniques, including loss of enthusiasm among investigative teams, and pernicious secular changes.
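The arithmetic behind this comparison is short. The following is a minimal sketch in Python, assuming that each dissolved HCl formula unit counts as one unit of analysis; the function name is hypothetical, and the 1,000-patient figure is the upper bound quoted in the text.

```python
AVOGADRO = 6.02214076e23  # entities per mole (SI defined value)


def entities_in_solution(volume_ml: float, molarity: float) -> float:
    """Number of solute entities in a solution of given volume (mL) and molarity (mol/L)."""
    moles = (volume_ml / 1000.0) * molarity
    return moles * AVOGADRO


n_chemical = entities_in_solution(volume_ml=10, molarity=1.0)  # 0.01 mol of HCl
n_trial = 1_000                                                # patients, from the text

print(f"chemical experiment N = {n_chemical:.3g}")
print(f"ratio to clinical trial N = {n_chemical / n_trial:.3g}")
```

Under these assumptions the chemical experiment has roughly eighteen orders of magnitude more units of analysis than a typical clinical trial, which is the disparity the argument for large-scale distributed laboratories starts from.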
Adequately explicit computerized protocols have already been implemented for continuous quality improvement, for clinical trials, and for clinical care. Work for the past two decades has been focused on the ICU, because of two enabling ICU attributes: a highly quantified environment and rapid evolution of clinical problems. Many clinical environments do not possess these attributes. However, certain clinical problems in non-ICU environments seem good targets for application of computerized protocols as well. These include the clinical care (e.g., medication titration) of outpatients with congestive heart failure and outpatients with insulin-dependent diabetes mellitus. Both of these problems could benefit from titration in the home with capture of only a few clinically important data elements. Other clinical problems such as psychiatric disorders seem less obvious targets for application of adequately explicit protocols. However, even early work suggests that this domain of clinical practice is also appropriate for application of rule-based decision-support systems (Meehl and Rosen 1955).
Emergency situations may preclude clinician keyboard-based interactions, such as those we have been using. However, voice recognition, handheld wireless communication devices, and other technologies will likely be widely available in the future and will enable more efficient and less distracting interactions with adequately explicit protocols. Future developments will likely reduce or remove the barrier that keyboard data entry now represents for emergency and other urgent clinician decisions that require immediate and repeated interactions. We believe that our work with adequately explicit computerized protocols has just scratched the surface. Much more exploration will be necessary before appropriate, and certainly before comprehensive, answers to these questions can be offered.
At this early stage of application and evaluation, it is likely that many, but not all, clinical problems will be appropriate targets for adequately explicit protocols (computerized or not). This seems a reasonable conclusion if only
because of the extensive literature indicating favorable outcome changes when decision support tools of many kinds are employed to aid clinician decision makers. It has seemed to my colleagues and to me that application of adequately explicit computerized treatment protocols for titration of clinical problems is more easily achieved than application of protocols for diagnosis. The diagnostic challenges frequently seem broader and more encompassing than the treatment challenges once a diagnosis has been made. Furthermore, many clinical decisions should embrace the wishes of patients or their surrogates. Capturing patient or surrogate assessments of outcome utilities and incorporating them in the rules of adequately explicit protocols seems a daunting but surmountable challenge. More systematic work is needed to define the roles of adequately explicit computerized protocols in many diagnostic and therapeutic arenas.
The evaluation of a potential target includes assessment of the reliability of available measurements and other replicable data. A measurement-rich and quantified clinical setting increases the likelihood of driving adequately explicit rules with patient-specific data. However, even difficult-to-define constructs such as “restlessness” can be made more replicable by listing the specific observations a clinician might use to identify the construct. We all have only five senses through which we receive information from the world about us. The challenge of knowledge engineering is to specify the few elements received by these senses that drive specific decisions. Our experience during the past two decades indicates that this is manageable.
Formalizing Experiential Learning as a Means of Enabling a Learning Healthcare System
Adequately explicit computerized protocols could supplement traditional peer-reviewed publication with direct electronic communication between research investigators and thereafter between investigators and clinical care users. This could introduce a new way of developing and distributing knowledge. Evidence-based knowledge for clinical decision making comes from two sources: first, from formal studies that include observational and experimental work (RCTs provide the most compelling results); second, from experiential knowledge. Currently, this experiential knowledge is derived primarily from individual experience and thus is influenced by local factors and bias. This individual experience contributes to strongly held but variable opinions that lead to unnecessary variation in clinical practice (Wennberg 2002). Adequately explicit computerized protocols could formalize this experiential learning in two sequential stages.
In the first stage, knowledge could be captured through multiple investigator and center participation in development and refinement of protocol rules. We have used this process successfully in our current NIH Roadmap
contract work with blood glucose management. Our current computerized protocol for blood glucose management with IV insulin (eProtocol-insulin) was developed and refined by collaborators who include multiple pediatric and adult intensivists in more than 14 U.S. and Canadian clinical sites. This diminishes local factor and bias concerns; they become smaller as the number of different participants and institutions increases.
In the second stage, education of practitioners could occur during utilization of a protocol for clinical care. Adequately explicit computerized protocols could take advantage of an electronic infrastructure and translate research experience into clinical practice by adopting a direct electronic education strategy at the point of care or point of decision making. For example, the adequately explicit instructions of eProtocol-insulin could be linked to a new on-demand explanatory educational representation of the protocol logic. A user could question the specific eProtocol-insulin instruction at whatever level of detail the user wishes. The knowledge captured by the protocol developers during the first stage could thus be presented at the time, and within the context, of a specific clinical care question, but only when demanded and without requiring the user to address the published literature. This new educational strategy could complement traditional knowledge transfer through education based on reading and coursework and through published work. For some activities this new educational strategy could become the dominant learning strategy for clinicians. For example, when protocols are modified to incorporate new knowledge, the updated electronic protocol, once validated appropriately, could become the expected and most direct route for transferring this new knowledge to the clinical practitioner.
Abramson, N, K Wald, A Grenvik, D Robinson, and J Snyder. 1980. Adverse occurrences in intensive care units. Journal of the American Medical Association 244:1582-1584.
Acute Respiratory Distress Syndrome Network. 2000a. Mechanical Ventilation Protocol. Available from www.ardsnet.org or NAPS Document No 05542 (Microfiche Publications, 248 Hempstead Turnpike, West Hempstead, NY).
———. 2000b. Ventilation with lower tidal volumes as compared with traditional tidal volumes for acute lung injury and the Acute Respiratory Distress Syndrome. New England Journal of Medicine 342(18):1301-1308.
Akhtar, S, J Weaver, D Pierson, and G Rubenfeld. 2003. Practice variation in respiratory therapy documentation during mechanical ventilation. Chest 124(6):2275-2282.
American Psychiatric Association (APA). 1998. Practice Guideline for the Treatment of Patients with Bipolar Disorder. Washington, DC: APA.
Arkes, H. 1986. Impediments to accurate clinical judgment and possible ways to minimize their impact. In Judgment and Decision Making: An Interdisciplinary Reader, edited by H Arkes and K Hammond. Cambridge, UK: Cambridge University Press.
Arkes, H, and K Hammond, eds. 1986. Judgment and Decision Making: An Interdisciplinary Reader. Cambridge, UK: Cambridge University Press.
Armstrong, R, C Bullen, S Cohen, M Singer, and A Webb. 1991. Critical Care Algorithms. New York: Oxford University Press.
Atkins, H. 1958 (December). The three pillars of clinical research. British Medical Journal 2(27):1547-1553.
Audet, A-M, S Greenfield, and M Field. 1990. Medical practice guidelines: current activities and future directions. Annals of Internal Medicine 113:709-714.
Babbie, E. 1986. Observing Ourselves: Essays in Social Research. Belmont, CA: Wadsworth Publishing Co.
———. 1990. Survey Research Methods. Belmont, CA: Wadsworth Publishing Co.
Bailey, R. 1996. Human Performance Engineering. 3rd ed. Upper Saddle River: Prentice Hall.
Barach, P, and D Berwick. 2003. Patient safety and the reliability of health care systems. Annals of Internal Medicine 138(12):997-998.
Barrow, J. 2000. The Book of Nothing. New York: Pantheon Books.
Bazil, C, and T Pedley. 1998. Advances in the medical treatment of epilepsy. Annual Review of Medicine 49:135-162.
Berwick, D. 1994. Eleven worthy aims for clinical leadership of health system reform. Journal of the American Medical Association 272(10):797-802.
———. 2003. Disseminating innovations in health care. Journal of the American Medical Association 289(15):1969-1975.
———. 2005. My right knee. Annals of Internal Medicine 142(2):121-125.
Blesser, W. 1969. A Systems Approach to Biomedicine. New York: McGraw-Hill Book Company.
Brook, R, E McGlynn, and P Shekelle. 2000. Defining and measuring quality of care: a perspective from U.S. researchers. International Journal of Quality Health Care 12(4):281-295.
Bull, S, C Conell, and D Campen. 2002. Relationship of clinical factors to the use of Cox-2 selective NSAIDs within an arthritis population in a large HMO. Journal of Managed Care Pharmacy 8(4):252-258.
Campbell, D, and J Stanley. 1966. Experimental and Quasi-Experimental Designs for Research (reprinted from Handbook of Research on Teaching, 1963). Boston, MA: Houghton Mifflin Co.
Cardiac Arrhythmia Suppression Trial (CAST) Investigators. 1989. Preliminary report: effect of encainide and flecainide on mortality in a randomized trial of arrhythmia suppression after myocardial infarction. New England Journal of Medicine 321:406-412.
Chow, S-C, and J-P Liu. 2004. Design and Analysis of Clinical Trials. Hoboken, NJ: John Wiley & Sons, Inc.
Clancy, C. 2006 (July 20-21). Session 1: Hints of a Different Way—Case Studies in Practice-Based Evidence, Opening Remarks. Presentation at the Roundtable on Evidence-Based Medicine Workshop, The Learning Health Care System. Washington, DC: Institute of Medicine, Roundtable on Evidence-Based Medicine.
Cochrane, G. 1998. Compliance in asthma. European Respiratory Review 8(56):239-242.
Cochrane Collaboration. 2001. The Cochrane Reviewer’s Handbook Glossary. Version 4.1.4. M Clarke and A Oxman, eds. In The Cochrane Library, Issue 4, 2001. Oxford: Update Software.
Cook, D, B Reeve, G Guyatt, D Heyland, L Griffith, L Buckingham, and M Tryba. 1996. Stress ulcer prophylaxis in critically ill patients. Resolving discordant meta-analyses. Journal of the American Medical Association 275(4):308-314.
Cook, T, and D Campbell. 1979. Quasi-Experimentation: Design and Analysis Issues for Field Settings. Boston, MA: Houghton Mifflin Co.
Corke, C, P Stow, D Green, J Agar, and M Henry. 2005. How doctors discuss major interventions with high risk patients: an observational study. British Medical Journal 330(7484):182.
Courtney, S, D Durand, J Asselin, M Hudak, J Aschner, and C Shoemaker. 2002. High-frequency oscillatory ventilation versus conventional mechanical ventilation for very-low-birth-weight infants. New England Journal of Medicine 347(9):643-652.
Cronin, L, D Cook, J Carlet, D Heyland, D King, M Lansang, and C Fisher, Jr. 1995. Corticosteroid treatment for sepsis: a critical appraisal and meta-analysis of the literature. Critical Care Medicine 23(8):1430-1439.
Deming, W. 1982. Quality, Productivity, and Competitive Position. Cambridge, MA: Massachusetts Institute of Technology, Center for Advanced Engineering Study.
———. 1986. Out of the Crisis. Cambridge, MA: Massachusetts Institute of Technology, Center for Advanced Engineering Study.
Dickerson, J, A Hingorani, M Ashby, C Palmer, and M Brown. 1999. Optimisation of antihypertensive treatment by crossover rotation of four major classes. Lancet 353(9169):2008-2013.
Don, H, ed. 1985. Decision Making in Critical Care, Clinical Decision Making Series. Philadelphia, PA: BC Decker Inc.
East, T, S Böhm, C Wallace, T Clemmer, L Weaver, J Orme Jr., and A Morris. 1992. A successful computerized protocol for clinical management of pressure control inverse ratio ventilation in ARDS patients. Chest 101(3):697-710.
East, T, L Heermann, R Bradshaw, A Lugo, R Sailors, L Ershler, C Wallace, A Morris, G McKinley, A Marquez, A Tonnesen, L Parmley, W Shoemaker, P Meade, P Taut, T Hill, M Young, J Baughman, M Olterman, V Gooder, B Quinn, W Summer, V Valentine, J Carlson, B Bonnell, B deBoisblanc, Z McClarity, J Cachere, K Kovitz, E Gallagher, M Pinsky, D Angus, M Cohen, L Hudson, and K Steinberg. 1999. Efficacy of computerized decision support for mechanical ventilation: results of a prospective multi-center randomized trial. Proceedings of the American Medical Informatics Association Symposium 251-255.
Ely, E, M Meade, E Haponik, M Kollef, D Cook, G Guyatt, and J Stoller. 2001. Mechanical ventilator weaning protocols driven by nonphysician health-care professionals: evidence-based clinical practice guidelines. Chest 120(90060):454S-463S.
Emanuel, E, D Wendler, and C Grady. 2000. What makes clinical research ethical? Journal of the American Medical Association 283(20):2701-2711.
Evans, R, S Pestotnik, D Classen, T Clemmer, L Weaver, J Orme Jr., J Lloyd, and J Burke. 1998. A computer-assisted management program for antibiotics and other anti-infective agents. New England Journal of Medicine 338(4):232-238.
Fletcher, R, and S Fletcher. 1990. Clinical practice guidelines. Annals of Internal Medicine 113:645-646.
Fridsma, D, J Gennari, and M Musen. 1996. Making Generic Guidelines Site-Specific. Paper read at Proceedings 1996 AMIA Annual Fall Symposium (Formerly SCAMC), at Washington, DC.
Friedman, G, M Collen, L Harris, E Van Brunt, and L Davis. 1971. Experience in monitoring drug reactions in outpatients. The Kaiser-Permanente Drug Monitoring System. Journal of the American Medical Association 217(5):567-572.
Friedman, L, C Furberg, and D DeMets. 1998. Fundamentals of Clinical Trials. 3rd ed. New York: Springer-Verlag.
Galuska, D, J Will, M Serdula, and E Ford. 1999. Are health care professionals advising obese patients to lose weight? Journal of the American Medical Association 282(16):1576-1578.
Garrido, T, L Jamieson, Y Zhou, A Wiesenthal, and L Liang. 2005. Effect of electronic health records in ambulatory care: retrospective, serial, cross sectional study. British Medical Journal 330(7491):581.
Giancoli, D. 1995. Physics. 3rd ed. Englewood Cliffs, NJ: Prentice Hall.
Gnaegi, A, F Feihl, and C Perret. 1997. Intensive care physicians’ insufficient knowledge of right-heart catheterization at the bedside: time to act? Critical Care Medicine 25(2):213-220.
Graham, D, D Campen, R Hui, M Spence, C Cheetham, G Levy, S Shoor, and W Ray. 2005. Risk of acute myocardial infarction and sudden cardiac death in patients treated with cyclo-oxygenase 2 selective and non-selective non-steroidal anti-inflammatory drugs: nested case-control study. Lancet 365(9458):475-481.
Greco, P, and J Eisenberg. 1993. Changing physicians’ practices. New England Journal of Medicine 329(17):1271-1274.
Greene, H, D Roden, R Katz, R Woosley, D Salerno, R Henthorn, and CAST Investigators. 1992. The Cardiac Arrhythmia Suppression Trial: First CAST … Then CAST-II. Journal of the American College of Cardiology 19:894-898.
Greene, S, and A Geiger. 2006. A review finds that multicenter studies face substantial challenges but strategies exist to achieve Institutional Review Board approval. Journal of Clinical Epidemiology 59(8):784-790.
Grimm, R, K Shimoni, W Harlan, and E Estes. 1975. Evaluation of patient-care protocol use by various providers. New England Journal of Medicine 292:507-511.
Grimshaw, J, and I Russell. 1993. Effect of clinical guidelines on medical practice: a systematic review of rigorous evaluations. Lancet 342:1317-1322.
Guidelines Committee Society of Critical Care Medicine. 1992. Guidelines for the care of patients with hemodynamic instability associated with sepsis. Critical Care Medicine 20(7):1057-1059.
Guyatt, G, D Sackett, and D Cook. 1993. User’s guide to the medical literature: II. How to use an article about therapy or prevention: A. Are the results of the study valid? Journal of the American Medical Association 270(21):2598-2601.
———. 1994. User’s guide to the medical literature: II. How to use an article about therapy or prevention: B. What were the results and will they help me in caring for my patient? Journal of the American Medical Association 271(1):59-63.
Hadorn, D, K McCormick, and A Diokno. 1992. An annotated algorithm approach to clinical guideline development. Journal of the American Medical Association 267(24):3311-3314.
Hawe, P, A Shiell, and T Riley. 2004. Complex interventions: how “out of control” can a randomised controlled trial be? British Medical Journal 328(7455):1561-1563.
Henderson, S, R Crapo, C Wallace, T East, A Morris, and R Gardner. 1992. Performance of computerized protocols for the management of arterial oxygenation in an intensive care unit. International Journal of Clinical Monitoring and Computing 8:271-280.
Hennekens, C, and J Buring. 1987. Epidemiology in Medicine. Edited by S Mayrent. 1st ed. 1 volume. Boston, MA: Little, Brown and Company.
Henriksen, O. 1998. An overview of benzodiazepines in seizure management. Epilepsia 39(Suppl. 1):S2-S6.
Hoch, I, A Heymann, I Kurman, L Valinsky, G Chodick, and V Shalev. 2003. Countrywide computer alerts to community physicians improve potassium testing in patients receiving diuretics. Journal of the American Medical Informatics Association 10(6):541-546.
Holcomb, B, A Wheeler, and E Ely. 2001. New ways to reduce unnecessary variation and improve outcomes in the intensive care unit. Current Opinion in Critical Care 7(4):304-311.
Horn, S, and D Hopkins, eds. 1994. Clinical Practice Improvement: A New Technology for Developing Cost-Effective Quality Health Care. Vol. 1, Faulkner & Gray’s Medical Outcomes and Practice and Guidelines Library. New York: Faulkner & Gray, Inc.
Horwitz, R, B Singer, R Makuch, and C Viscoli. 1996. Can treatment that is helpful on average be harmful to some patients? A study of the conflicting information needs of clinical inquiry and drug regulation. Journal of Clinical Epidemiology 49(4):395-400.
Hulley, S, and S Cummings. 1988. Designing Clinical Research. Baltimore, MD: Williams and Wilkins.
Hulley, S, S Cummings, S Warren, D Grady, N Hearst, and T Newman. 2001. Designing Clinical Research. 2nd ed. Philadelphia, PA: Lippincott Williams and Wilkins.
Iberti, T, E Fischer, M Leibowitz, E Panacek, J Silverstein, T Albertson, and PACS Group. 1990. A multicenter study of physicians’ knowledge of the pulmonary artery catheter. Journal of the American Medical Association 264:2928-2932.
Iberti, T, E Daily, A Leibowitz, C Schecter, E Fischer, and J Silverstein. 1994. Assessment of critical care nurses’ knowledge of the pulmonary artery catheter. The Pulmonary Artery Catheter Study Group. Critical Care Medicine 22:1674-1678.
IOM (Institute of Medicine). 1990. Clinical Practice Guidelines: Directions for a New Program. Washington, DC: National Academy Press.
———. 1999. To Err Is Human: Building a Safer Health System. Washington, DC: National Academy Press.
———. 2001. Crossing the Quality Chasm: A New Health System for the 21st Century. Washington, DC: National Academy Press.
James, B, and M Hammond. 2000. The challenge of variation in medical practice. Archives of Pathology & Laboratory Medicine 124(7):1001-1003.
James, B, S Horn, and R Stephenson. 1994. Management by fact: What is CPI and how is it used? In Clinical Practice Improvement: A New Technology for Developing Cost-Effective Quality Health Care, edited by S Horn and D Hopkins. New York: Faulkner & Gray, Inc.
Jennings, D, T Amabile, and L Ross. 1982. Informal covariation assessment: data-based versus theory-based judgments. In Judgment Under Uncertainty: Heuristics and Biases, edited by D Kahneman, P Slovic, and A Tversky. Cambridge, UK: Cambridge University Press.
Johnson, A, J Peacock, A Greenough, N Marlow, E Limb, L Marston, and S Calvert. 2002. High-frequency oscillatory ventilation for the prevention of chronic lung disease of prematurity. New England Journal of Medicine 347(9):633-642.
Johnson, B. 2004. Review: prophylactic use of vitamin D reduces falls in older persons. Evidence Based Medicine 9(6):169.
Johnston, M, K Langton, B Haynes, and A Mathieu. 1994. Effects of computer-based clinical decision support systems on clinician performance and patient outcome. Annals of Internal Medicine 120:135-142.
Jones, P. 1998. Health status, quality of life and compliance. European Respiratory Review 8(56):243-246.
Justice, A, K Covinsky, and J Berlin. 1999. Assessing the generalizability of prognostic information. Annals of Internal Medicine 130(6):515-524.
Kahneman, D, P Slovic, and A Tversky. 1982. Judgment Under Uncertainty: Heuristics and Biases. Cambridge, UK: Cambridge University Press.
Karlinsky, J, J Lau, and R Goldstein. 1991. Decision Making in Pulmonary Medicine. Philadelphia, PA: BC Decker.
Kiernan, M, A King, H Kraemer, M Stefanick, and J Killen. 1998. Characteristics of successful and unsuccessful dieters: an application of signal detection methodology. Annals of Behavioral Medicine 20(1):1-6.
Kolata, G. 2006. Medicare says it will pay, but patients say ‘no thanks’. New York Times, March 3.
Kozer, E, W Seto, Z Verjee, C Parshuram, S Khattak, G Koren, and D Jarvis. 2004. Prospective observational study on the incidence of medication errors during simulated resuscitation in a paediatric emergency department. British Medical Journal 329(7478):1321.
Lake, A. 2004. Every prescription is a clinical trial. British Medical Journal 329(7478):1346.
Lamb, R, D Studdert, R Bohmer, D Berwick, and T Brennan. 2003. Hospital disclosure practices: results of a national survey. Health Affairs 22(2):73-83.
Leape, L. 1994. Error in medicine. Journal of the American Medical Association 272:1851-1857.
Leape, L, D Berwick, and D Bates. 2002. What practices will most improve safety? Evidence-based medicine meets patient safety. Journal of the American Medical Association 288(4):501-507.
Lefering, R, and E Neugebauer. 1995. Steroid controversy in sepsis and septic shock: a meta-analysis. Critical Care Medicine 23(7):1294-1303.
LeLorier, J, G Gregoire, A Benhaddad, J Lapierre, and F Derderian. 1997. Discrepancies between meta-analyses and subsequent large randomized, controlled trials. New England Journal of Medicine 337(8):536-542.
Lo, B. 1995. Improving care near the end of life: why is it so hard? Journal of the American Medical Association 274:1634-1636.
Lomas, J, G Anderson, K Domnick-Pierre, E Vayda, M Enkin, and W Hannah. 1989. Do practice guidelines guide practice? The effect of a consensus statement on the practice of physicians. New England Journal of Medicine 321(19):1306-1311.
MacIntyre, N. 2001. Evidence-based guidelines for weaning and discontinuing ventilatory support: a collective task force facilitated by the American College of Chest Physicians; the American Association for Respiratory Care; and the American College of Critical Care Medicine. Chest 120(90060):375S-396S.
Mandelbrot, B. 1983. The Fractal Geometry of Nature. New York: W. H. Freeman and Company.
Marshall, J. 2000. Clinical trials of mediator-directed therapy in sepsis: what have we learned? Intensive Care Medicine 26(6 Suppl. 1):S75-S83.
McDonald, C. 1976. Protocol-based computer reminders, the quality of care and the non-perfectability of man. New England Journal of Medicine 295:1351-1355.
Meehl, P, and A Rosen. 1955. Antecedent probability and the efficiency of psychometric signs, patterns, or cutting scores. Psychological Bulletin 52(3):194-216.
Miller, P, and S Frawley. 1995. Trade-offs in producing patient-specific recommendations from a computer-based clinical guideline: a case study. Journal of the American Medical Informatics Association 2:238-242.
Miller, R, and K Goodman. 1998. Ethical challenges in the use of decision-support software in clinical practice. In Ethics, Computing, and Medicine: Informatics and the Transformation of Health Care, edited by K Goodman. Cambridge, UK: Cambridge University Press.
Morris, A. 1985. Elimination of pulmonary wedge pressure errors commonly encountered in the ICU. Cardiologia (Italy) 30(10):941-943.
———. 1998. Algorithm-based decision making. In Principles and Practice of Intensive Care Monitoring, edited by M Tobin. New York: McGraw-Hill, Inc.
———. 2000a. Developing and implementing computerized protocols for standardization of clinical decisions. Annals of Internal Medicine 132:373-383.
———. 2000b. Evaluating and Refining a Hemodynamic Protocol for use in a multicenter ARDS clinical trial. American Journal of Respiratory and Critical Care Medicine (ATS Proceedings Abstracts) 161(3 Suppl.):A378.
———. 2002. Decision support and safety of clinical environments. Quality and Safety in Health Care 11:69-75.
———. 2003. Treatment algorithms and protocolized care. Current Opinion in Critical Care 9(3):236-240.
———. 2004. Iatrogenic illness: a call for decision support tools to reduce unnecessary variation. Quality and Safety in Health Care 13(1):80-81.
———. 2006. The importance of protocol-directed patient management for research on lung-protective ventilation. In Ventilator-Induced Lung Injury, edited by D Dreyfuss, G Saumon, and R Hubmayr. New York: Taylor & Francis Group.
Morris, A, and D Cook. 1998. Clinical trial issues in mechanical ventilation. In Physiologic Basis of Ventilatory Support, edited by J Marini and A Slutsky. New York: Marcel Dekker, Inc.
Morris, A, R Chapman, and R Gardner. 1984. Frequency of technical problems encountered in the measurement of pulmonary artery wedge pressure. Critical Care Medicine 12(3):164-170.
———. 1985. Frequency of wedge pressure errors in the ICU. Critical Care Medicine 13:705-708.
Morris, A, C Wallace, R Menlove, T Clemmer, J Orme, L Weaver, N Dean, F Thomas, T East, M Suchyta, E Beck, M Bombino, D Sittig, S Böhm, B Hoffmann, H Becks, N Pace, S Butler, J Pearl, and B Rasmusson. 1994. Randomized clinical trial of pressure-controlled inverse ratio ventilation and extracorporeal CO2 removal for ARDS [erratum 1994;149(3, Pt 1):838, Letters to the editor 1995;151(1):255-256, 1995;151(3):1269-1270, and 1997;156(3):1016-1017]. American Journal of Respiratory and Critical Care Medicine 149(2):295-305.
Mullett, C, R Evans, J Christenson, and J Dean. 2001. Development and impact of a computerized pediatric antiinfective decision support program. Pediatrics 108(4):E75.
Nelson, E, M Splaine, P Batalden, and S Plume. 1998. Building measurement and data collection into medical practice. Annals of Internal Medicine 128:460-466.
Newhouse, J, W Manning, C Morris, L Orr, N Duan, E Keeler, A Leibowitz, K Marquis, M Marquis, C Phelps, and R Brook. 1981. Some interim results from a controlled trial of cost sharing in health insurance. New England Journal of Medicine 305(25):1501-1507.
Park, S, D Ross-Degnan, A Adams, J Sabin, and S Soumerai. 2005. A population-based study of the effect of switching from typical to atypical antipsychotics on extrapyramidal symptoms among patients with schizophrenia. British Journal of Psychiatry 187:137-142.
Pearson, S, S Soumerai, C Mah, F Zhang, L Simoni-Wastila, C Salzman, L Cosler, T Fanning, P Gallagher, and D Ross-Degnan. 2006. Racial disparities in access following regulatory surveillance of benzodiazepines. Archives of Internal Medicine 166:572-579.
Persell, S, J Wright, J Thompson, K Kmetik, and D Baker. 2006. Assessing the validity of national quality measures for coronary artery disease using an electronic health record. Archives of Internal Medicine 166(20):2272-2272.
Piantadosi, S. 1997. Clinical Trials: A Methodologic Perspective. New York: John Wiley & Sons, Inc.
Platt, R, R Davis, J Finkelstein, A Go, J Gurwitz, D Roblin, S Soumerai, D Ross-Degnan, S Andrade, M Goodman, B Martinson, M Raebel, D Smith, M Ulcickas-Yood, and K Chan. 2001. Multicenter epidemiologic and health services research on therapeutics in the HMO Research Network Center for Education and Research on Therapeutics. Pharmacoepidemiology and Drug Safety 10(5):373-377.
Pocock, S. 1983. Clinical Trials: A Practical Approach. Original edition. New York: John Wiley & Sons, Inc.
Pritchard, R, E Fisher, J Teno, S Sharp, D Reding, W Knaus, J Wennberg, and J Lynn. 1998. Influence of patient preferences and local health system characteristics on the place of death. SUPPORT Investigators. Study to Understand Prognoses and Preferences for Outcomes and Risks of Treatment. Journal of the American Geriatrics Society 46(10):1242-1250.
Ray, W, J Daugherty, and K Meador. 2003. Effect of a mental health “carve-out” program on the continuity of antipsychotic therapy. New England Journal of Medicine 348(19):1885-1894.
Redelmeier, D. 2005. The cognitive psychology of missed diagnoses. Annals of Internal Medicine 142(2):115-120.
Redelmeier, D, L Ferris, J Tu, J Hux, and M Schull. 2001. Problems for clinical judgment: introducing cognitive psychology as one more basic science. Canadian Medical Association Journal 164(3):358-360.
Redman, B. 1996. Clinical practice guidelines as tools of public policy: conflicts of purpose, issues of autonomy, and justice. Journal of Clinical Ethics 5(4):303-309.
Ries, A, B Make, S Lee, M Krasna, M Bartels, R Crouch, and A Fishman. 2005. The effects of pulmonary rehabilitation in the national emphysema treatment trial. Chest 128(6):3799-3809.
Roblin, D, R Platt, M Goodman, J Hsu, W Nelson, D Smith, S Andrade, and S Soumerai. 2005. Effect of increased cost-sharing on oral hypoglycemic use in five managed care organizations: how much is too much? Medical Care 43(10):951-959.
Ross-Degnan, D, L Simoni-Wastila, J Brown, X Gao, C Mah, L Cosler, T Fanning, P Gallagher, C Salzman, R Shader, T Inui, and S Soumerai. 2004. A controlled study of the effects of state surveillance on indicators of problematic and non-problematic benzodiazepine use in a Medicaid population. International Journal of Psychiatry in Medicine 34(2):103-123.
Rothman, K, and S Greenland. 1998. Modern Epidemiology. 2nd ed. Philadelphia, PA: Lippincott-Raven.
Roundtable on Evidence-Based Medicine. 2006. Institute of Medicine Roundtable on Evidence-Based Medicine Charter and Vision Statement. Available from http://www.iom.edu/CMS/28312/RT-EBM/33544.aspx (accessed April 27, 2007).
Rubenfeld, G, C Cooper, G Carter, T Thompson, and L Hudson. 2004. Barriers to providing lung-protective ventilation to patients with acute lung injury. Critical Care Medicine 32(6):1289-1293.
Runciman, W, A Merry, and F Tito. 2003. Error, blame and the law in health care: an antipodean perspective. Annals of Internal Medicine 138(12):974-979.
Sackett, D, R Haynes, G Guyatt, and P Tugwell. 1991. Clinical Epidemiology: A Basic Science for Clinical Medicine. 2nd ed. Boston, MA: Little, Brown and Company.
Safran, C, D Rind, R Davis, D Ives, D Sands, J Currier, W Slack, D Cotton, and H Makadon. 1996. Effects of a knowledge-based electronic patient record on adherence to practice guidelines. MD Computing 13(1):55-63.
Schacker, T, A Collier, J Hughes, T Shea, and L Corey. 1996. Clinical and epidemiologic features of primary HIV infection [published erratum appears in Ann Intern Med 1997 Jan 15;126(2):174]. Annals of Internal Medicine 125(4):257-264.
Schiff, G, D Klass, J Peterson, G Shah, and D Bates. 2003. Linking laboratory and pharmacy: opportunities for reducing errors and improving care. Archives of Internal Medicine 163(8):893-900.
Schultz, S. 1996. Homeostasis, humpty dumpty, and integrative biology. News in Physiological Sciences 11:238-246.
Schwartz, D, and J Lellouch. 1967. Explanatory and pragmatic attitudes in therapeutical trials. Journal of Chronic Diseases 20(8):637-648.
Selby, J. 1997. Linking automated databases for research in managed care settings. Annals of Internal Medicine 127(8 Pt 2):719-724.
Senn, S. 2004. Individual response to treatment: is it a valid assumption? British Medical Journal 329(7472):966-968.
Sheppard, L. 1980. Computer control of the infusion of vasoactive drugs. Annals of Biomedical Engineering 8(4-6):431-434.
Sheppard, L, N Kouchoukos, and M Kurtis. 1968. Automated treatment of critically ill patients following operation. Annals of Surgery 168:596-604.
Sheppard, L, J Kirklin, and N Kouchoukos. 1974. Chapter 6: computer-controlled interventions for the acutely ill patient. In Computers in Biomedical Research. New York: Academic Press.
Shewhart, W. 1931. Economic Control of Quality of Manufactured Product. New York: D. Van Nostrand Co., Inc. (republished in 1980, American Society for Quality Control, Milwaukee, WI).
Silverman, W. 1993. Doing more good than harm. Annals of the New York Academy of Sciences 703:5-11.
Singer, M, R Haft, T Barlam, M Aronson, A Shafer, and K Sands. 1998. Vancomycin control measures at a tertiary-care hospital: impact of interventions on volume and patterns of use. Infection Control and Hospital Epidemiology 19(4):248-253.
Smalley, W, M Griffin, R Fought, L Sullivan, and W Ray. 1995. Effect of a prior authorization requirement on the use of nonsteroidal anti-inflammatory drugs by Medicaid patients. New England Journal of Medicine 332:1641-1645.
Soumerai, S. 2004. Benefits and risks of increasing restrictions on access to costly drugs in Medicaid. Health Affairs 23(1):135-146.
Soumerai, S, J Avorn, D Ross-Degnan, and S Gortmaker. 1987. Payment restrictions for prescription drugs under Medicaid. Effects on therapy, cost, and equity. New England Journal of Medicine 317(9):550-556.
Soumerai, S, D Ross-Degnan, J Avorn, T McLaughlin, and I Choodnovskiy. 1991. Effects of Medicaid drug-payment limits on admission to hospitals and nursing homes. New England Journal of Medicine 325(15):1072-1077.
Soumerai, S, D Ross-Degnan, E Fortess, and J Abelson. 1993. A critical analysis of studies of state drug reimbursement policies: research in need of discipline. Milbank Quarterly 71(2):217-252.
Soumerai, S, T McLaughlin, D Ross-Degnan, C Casteris, and P Bollini. 1994. Effects of a limit on Medicaid drug-reimbursement benefits on the use of psychotropic agents and acute mental health services by patients with schizophrenia. New England Journal of Medicine 331(10):650-655.
Soumerai, S, D Ross-Degnan, E Fortess, and B Walser. 1997. Determinants of change in Medicaid pharmaceutical cost sharing: does evidence affect policy? Milbank Quarterly 75(1):11-34.
Sox, H. 2003. Improving patient care. Annals of Internal Medicine 138(12):996.
Stark, A. 2002. High-frequency oscillatory ventilation to prevent bronchopulmonary dysplasia—are we there yet? New England Journal of Medicine 347(9):682-684.
Tamblyn, R, R Laprise, J Hanley, M Abrahamowicz, S Scott, N Mayo, J Hurley, R Grad, E Latimer, R Perreault, P McLeod, A Huang, P Larochelle, and L Mallet. 2001. Adverse events associated with prescription drug cost-sharing among poor and elderly persons. Journal of the American Medical Association 285(4):421-429.
Teno, J, J Lynn, A Connors, N Wenger, R Phillips, C Alzola, D Murphy, N Desbiens, and W Knaus. 1997a. The illusion of end-of-life resource savings with advance directives. SUPPORT Investigators. Study to Understand Prognoses and Preferences for Outcomes and Risks of Treatment. Journal of the American Geriatrics Society 45(4):513-518.
Teno, J, S Licks, J Lynn, N Wenger, A Connors, R Phillips, M O’Connor, D Murphy, W Fulkerson, N Desbiens, and W Knaus. 1997b. Do advance directives provide instructions that direct care? SUPPORT Investigators. Study to Understand Prognoses and Preferences for Outcomes and Risks of Treatment. Journal of the American Geriatrics Society 45(4):508-512.
Tierney, W, J Overhage, B Takesue, L Harris, M Murray, D Vargo, and C McDonald. 1995. Computerizing guidelines to improve care and patient outcomes: the example of heart failure. Journal of the American Medical Informatics Association 2(5):316-322.
Tierney, W, J Overhage, and C McDonald. 1996. Computerizing Guidelines: Factors for Success. Paper read at Proceedings of the 1996 American Medical Informatics Association Annual Fall Symposium (formerly SCAMC), Washington, DC.
Trontell, A. 2004. Expecting the unexpected—drug safety, pharmacovigilance, and the prepared mind. New England Journal of Medicine 351(14):1385-1387.
Tunis, S. Chief Medical Officer, Centers for Medicare and Medicaid Services, Office of Clinical Standards and Quality. February 25, 2005, letter.
Tunis, S, D Stryer, and C Clancy. 2003. Practical clinical trials: increasing the value of clinical research for decision making in clinical and health policy. Journal of the American Medical Association 290(12):1624-1632.
Tversky, A, and D Kahneman. 1982. Availability: a heuristic for judging frequency and probability. In Judgment Under Uncertainty: Heuristics and Biases, edited by D Kahneman, P Slovic and A Tversky. Cambridge, UK: Cambridge University Press.
Vogt, T, J Elston-Lafata, D Tolsma, and S Greene. 2004. The role of research in integrated healthcare systems: the HMO Research Network. American Journal of Managed Care 10(9):643-648.
Wagner, A, S Soumerai, F Zhang, and D Ross-Degnan. 2002. Segmented regression analysis of interrupted time series studies in medication use research. Journal of Clinical Pharmacy and Therapeutics 27(4):299-309.
Wagner, A, D Ross-Degnan, J Gurwitz, F Zhang, D Gilden, L Cosler, and S Soumerai. 2007 (in press). Restrictions on benzodiazepine prescribing and rates of hip fracture: New York state regulation of benzodiazepine prescribing is associated with fewer prescriptions but no reduction in rates of hip fracture. Annals of Internal Medicine.
Walton, M. 1986. The Deming Management Method. New York: Putnam Publishing Group (Perigee Books).
Warren, K, and F Mosteller, eds. 1993. Doing More Good than Harm: The Evaluation of Health Care Interventions. Vol. 703, Annals of the New York Academy of Sciences. New York: The New York Academy of Sciences.
Wennberg, J. 2002. Unwarranted variations in healthcare delivery: implications for academic medical centres. British Medical Journal 325(7370):961-964.
Wennberg, J, and A Gittelsohn. 1973. Small area variation analysis in health care delivery. Science 182:1102-1108.
Wirtschafter, D, M Scalise, C Henke, and R Gams. 1981. Do information systems improve the quality of clinical research? Results of a randomized trial in a cooperative multi-institutional cancer group. Computers and Biomedical Research 14:78-90.
Wu, A, S Folkman, S McPhee, and B Lo. 1991. Do house officers learn from their mistakes? Journal of the American Medical Association 265:2089-2094.
Zhang, J, V Patel, and T Johnson. 2002. Medical error: is the solution medical or cognitive? Journal of the American Medical Informatics Association 9(6 Suppl.):S75-S77.