Biomarkers are characteristics that are objectively measured and evaluated as indicators of normal biological processes, pathogenic processes, or pharmacologic responses to an intervention. Cholesterol and blood sugar levels are biomarkers, as are blood pressure, enzyme levels, measurements of tumor size from magnetic resonance imaging (MRI) or computed tomography (CT), and the biochemical and genetic variations observed in age-related macular degeneration. Biomarkers can enable faster, more efficient clinical trials for life-saving and health-promoting interventions. They can help improve understanding of healthy dietary choices, and they can help public health professionals to identify and track health concerns. Biomarkers help health care practitioners and their patients make decisions about patient care. The use of biomarkers depends on the quality of data that supports their use and on the context in which they are applied. Evaluation of the quality of the measurements and data linking the biomarkers to clinical outcomes is important for assessing biomarker utility.
The Food and Drug Administration (FDA) asked the Institute of Medicine (IOM) to recommend a framework for the evaluation of biomarkers. The committee has recommended such a framework, with critical components of analytical validity, evidentiary qualification, and utilization analysis (Box S-1). The framework is intended to bring consistency and transparency to a previously non-uniform process. During its deliberations, the committee identified a need for the FDA to evaluate biomarker use with the same degree of scientific rigor across the product categories regulated by the agency, including drugs, biologics, devices, foods, and supplements. The committee has also recommended strategies for implementing the evaluation framework, supporting the use of evidence-based regulation and the protection and promotion of public health.
Biomarkers are measurements that indicate biological processes (see Box S-2 for definitions of key terms). Biomarkers include physiological measurements, blood tests, and other chemical analyses of tissue or bodily fluids, genetic or metabolic data, and measurements from images. Cholesterol and blood sugar levels are biomarkers, as are blood pressure, enzyme levels, measurements of tumor size from MRI or CT, and the biochemical and genetic variations observed in age-related macular degeneration. Emerging technologies have also enabled the use of simultaneously measured “signatures,” or patterns of co-occurring sets, of genetic sequences, peptides, proteins, or metabolites as biomarkers. These signatures can also be combinations of several of these types of measurements; ideally, each component of a signature is identified.
Analytical Validation: “assessing [an] assay and its measurement performance characteristics, determining the range of conditions under which the assay will give reproducible and accurate data.”
Biomarker: “a characteristic that is objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes, or pharmacologic responses to a[n] … intervention.” Example: cholesterol level.
Chronic Disease: a culmination of a series of pathogenic processes in response to internal or external stimuli over time that results in a clinical diagnosis/ailment and health outcomes. Example: diabetes.
Clinical Endpoint: “a characteristic or variable that reflects how a patient [or consumer] feels, functions, or survives.” Example: death.
Fit-for-Purpose: being guided by the principle that an evaluation process is tailored to the degree of certainty required for the use proposed.
Qualification: “evidentiary process of linking a biomarker with biological processes and clinical endpoints.”
Surrogate Endpoint: “a biomarker that is intended to substitute for a clinical endpoint. A surrogate endpoint is expected to predict clinical benefit (or harm or lack of benefit or harm) based on epidemiologic, therapeutic, pathophysiologic, or other scientific evidence.” Example: blood pressure for trials of several classes of antihypertensive drugs.
Biomarkers are used to describe risk, exposures, intermediate effects of treatment, and biologic mechanisms; as surrogate endpoints, biomarkers are used to predict health outcomes. Biomarkers can provide information about risk and physiological parameters that is useful in a variety of contexts: (1) insight into the health and well-being of patients and consumers, (2) the status of patient and consumer response to an intervention, (3) a basis for interpreting research results and comparing results across studies, (4) indications of health status and disease risk in population groups, and (5) important data for planning and evaluating public health programs. Biomarker measurements support the practice of modern medicine; the development of effective drugs, biologics, and devices; the communication of information about healthy food choices and dietary habits; and the planning and monitoring of public health initiatives; in some circumstances, use of biomarkers is essential for these goals. A variety of biomarkers and uses have advantages for patients and consumers, physicians and other healthcare practitioners, scientists and researchers, industry, payers, regulators, and policy makers.
It is important to note the distinction between biomarkers, risk factors, and endpoints. Biomarkers are patient and consumer characteristics that are measured and evaluated. As measurements, they are subject to measurement quality issues such as accuracy, precision, reliability, reproducibility, and the need for standards and quality control. Risk factors are variables that predict outcomes and are composed of biomarkers and social and environmental factors. The value of a risk factor depends on the degree to which it can predict an event. Finally, there are endpoints, which often include biomarkers, alone or in combination with clinical events. Endpoints range from something a patient or consumer clearly experiences, such as mortality, to a variable that is only to some degree related to events impacting a patient or consumer’s life. An example of an endpoint more closely related to patient or consumer experience would be acute myocardial infarction with full recovery and without impact on a patient or consumer’s quality of life, and a less clearly related example is an LDL cholesterol level (more accurately, non-HDL cholesterol), as associated with cardiovascular disease mortality. The value of an endpoint increases in relation to the degree to which it conveys information about the effect of an intervention on a patient or consumer’s experience of life. For endpoints that are less clearly related to patient or consumer experience, there is a need to acknowledge that we cannot know with certainty whether a beneficial change in the endpoint will impact a patient or consumer’s experience of life. Further, the committee notes that endpoints can be conceptualized in a spectrum.
At one end are endpoints defined by biomarkers alone that have less relationship to patient or consumer experience; in the middle are clinical events that depend on biomarkers as part of the definition; further along the spectrum are endpoints that are more closely related to events that affect patients’ and consumers’ lives; and at the other end of the spectrum are the clearest clinical endpoints, such as death.
Following the recommendations from the 2007 Institute of Medicine report Cancer Biomarkers: The Promises and Challenges of Improving Detection and Treatment (IOM, 2007), the Center for Food Safety and Applied Nutrition of the FDA asked the IOM to generate recommendations on the evaluation process for biomarkers, with a focus on biomarkers and surrogate endpoints in chronic disease. The committee was to recommend a framework for biomarker evaluation and test it using case studies of biomarkers and surrogate endpoints in various diseases, including low-density and high-density lipoprotein cholesterol levels as biomarkers of coronary heart disease.
Focusing on this charge, the committee outlined considerations for determining the appropriate use of biomarkers across a variety of contexts, including foods, drugs, biologics, and devices.
FINDINGS, CONCLUSIONS, AND RECOMMENDATIONS
The recommendations developed by the committee fall into two main categories: the biomarker evaluation process and strengthening evidence-based regulation. Recommendation 1 is meant to be applicable to all uses of biomarkers. Recommendations 2, 3, and 4 are focused on uses of biomarkers that result in regulatory decisions and the impacts these decisions have on public health, whether for drugs, biologics, or device development; for relationships between diet or nutrients/food substances and disease; or for public health monitoring and interventions. Recommendations 5 and 6 are ancillary recommendations that provide for efficient and effective implementation of Recommendations 1–4. The report will explain why scientific rigor is important when describing relationships among food, biomarkers, and chronic disease. This report uses biomarkers of cardiovascular disease for many of its illustrative examples, but examples from other diseases are also considered.
Biomarker Evaluation Process
The committee concluded that it was important to address several challenges revealed by previous biomarker evaluation efforts. First, preanalytical and analytical validation of biomarker tests has often been underemphasized in that it has not been considered an integral component of biomarker qualification. Therefore, the committee has included preanalytical and analytical validation as a necessary component, and it has used the term “biomarker evaluation” to include both validation and qualification. Second, in general, the evidentiary assessment and utilization or context-of-use components of qualification are not adequately separated. The committee’s proposed process separates these steps so that the different investigative and analytical processes required to evaluate evidence and contexts of use are defined. Finally, previous evaluation frameworks have not explicitly incorporated a process for reevaluation of analytical validation, evidentiary assessment, and context of use based on new data. The committee also recognizes that some biomarker evaluation steps may occur concurrently.
The evaluation framework is intended to be applicable across a wide range of biomarker uses, from exploratory uses for which less evidence is required to surrogate endpoint uses for which strong evidence is required. The framework is meant for, but not limited to, use in research, clinical, product, and claim development in food, drug, and device industries, and public health settings, and it is intended to function for panels of biomarkers in addition to single biomarkers and for circulating, genetic, and imaging biomarkers. The committee employed case studies to illustrate the use of the evaluation framework because different biomarkers and uses will emphasize different aspects of the general principles set forth in the report.
Recommendation 1: The biomarker evaluation process should consist of the following three steps:
1a. Analytical validation: analyses of available evidence on the analytical performance of an assay;
1b. Qualification: assessment of available evidence on associations between the biomarker and disease states, including data showing effects of interventions on both the biomarker and clinical outcomes; and
1c. Utilization: contextual analysis based on the specific use proposed and the applicability of available evidence to this use. This includes a determination of whether the analytical validation and qualification conducted provide sufficient support for the use proposed.
It is important to emphasize that the steps listed above are interrelated; they are not necessarily separated in time, and conclusions in one step may require revisions or additional work in other steps (see Figure S-1).
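The interrelation of the three steps can be made concrete with a small bookkeeping sketch. The class, field, and method names below are hypothetical and are not part of the committee's framework; the sketch only illustrates that a proposed use is supported when all three steps have evidence and no step has been sent back for further work.

```python
from dataclasses import dataclass, field
from enum import Enum


class Step(Enum):
    """The three steps named in the framework."""
    ANALYTICAL_VALIDATION = "analytical validation"
    QUALIFICATION = "qualification"
    UTILIZATION = "utilization"


@dataclass
class BiomarkerEvaluation:
    """Hypothetical record of one biomarker evaluated for one proposed use."""
    biomarker: str
    proposed_use: str
    # Evidence notes collected under each step (illustrative structure).
    evidence: dict = field(default_factory=lambda: {step: [] for step in Step})
    # Steps flagged for rework because findings elsewhere raised a question.
    needs_revisit: set = field(default_factory=set)

    def add_evidence(self, step: Step, note: str) -> None:
        self.evidence[step].append(note)

    def flag_revisit(self, step: Step) -> None:
        # Steps are interrelated: a conclusion in one may reopen another.
        self.needs_revisit.add(step)

    def clear_revisit(self, step: Step) -> None:
        self.needs_revisit.discard(step)

    def supported(self) -> bool:
        """True only if every step has evidence and none is flagged."""
        return all(self.evidence[s] for s in Step) and not self.needs_revisit


if __name__ == "__main__":
    ev = BiomarkerEvaluation("LDL cholesterol", "hypothetical proposed use")
    for step in Step:
        ev.add_evidence(step, f"placeholder note for {step.value}")
    ev.flag_revisit(Step.ANALYTICAL_VALIDATION)  # e.g., a new assay appears
    print(ev.supported())
```

Because the steps are not necessarily sequential, the sketch stores them as an unordered mapping rather than a pipeline; flagging any step withdraws support until that step is cleared.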
Recommendation 2 provides further guidance on the application of the framework to uses of biomarkers that have regulatory impact. Specifically omitted from this recommendation are biomarker discovery activities and biomarkers for use in drug discovery, development, or other pre-clinical uses. The committee sought ways to achieve a rigorous evaluation framework without stifling innovation. Experts qualified by experience and training are needed to conduct the evaluation reviews, focusing on the utilization step, because case-by-case analyses are the only way to ensure proper use of biomarkers given the state of the science.
Due to the complexity and progressive increase in the amount of data, the need for fit-for-purpose and context-of-use analysis, and the need to deal with sometimes contradictory evidence, expert input is essential to provide scientific judgment in areas of uncertainty. Likewise, as evidence evolves even after a biomarker is evaluated, it is imperative that biomarkers be reevaluated on a continuing basis so that both the scientific evidence and context-of-use analyses capture the current state of the science. Recommendation 2 will be discussed in the context of each of the three steps of Recommendation 1.
Recommendation 2:
2a. For biomarkers with regulatory impact, the FDA should convene expert panels to evaluate biomarkers and biomarker tests.
2b. Initial evaluation of analytical validation and qualification should be conducted separately from a particular context of use.
2c. The expert panels should reevaluate analytical validation, qualification, and utilization on a continual and a case-by-case basis.
Biomarker evaluation is a dynamic process. By considering additional evidence, it is possible that the expert panel may alter its past findings by revoking recommendations for a previously accepted biomarker use, choosing not to recommend a biomarker for uses similar to those for which it was granted permission in the past, providing a more nuanced explanation as to how a biomarker should be used, or qualifying the biomarker for use in new contexts. The panels may resemble FDA advisory committees. The panelists should possess relevant scientific expertise and experience; a variety of stakeholders should have opportunity for input; and attention should be paid to conflict-of-interest standards in a manner similar to government and IOM advisory committees. By continual, the committee refers to the need for regular reevaluation on the basis of new scientific developments and data.
The first step of the proposed evaluation framework is to catalogue the data addressing the analytical validity of the biomarker in question. In the utilization step of the framework, evaluators will determine whether a suitable biomarker test possesses appropriate validation given the proposed use of the biomarker or whether further data gathering is needed. As mentioned earlier, preanalytical and analytical validation is a necessary prerequisite for biomarker qualification. The terminology used in the recommendation, analytical performance, is not meant to describe how well a biomarker correlates with the clinical outcomes of interest. Instead, analytical validation of an assay includes the biomarker’s limit of detection, limit of quantitation, reference (normal) value cutoff concentration, and the total imprecision at the cutoff concentration. Depending on the use, biomarker tests need to be reliable, to be reproducible across multiple laboratories and clinical settings, and to possess adequate sensitivity and specificity for the biomarker being measured before data based on their use can be relevant in the subsequent biomarker evaluation steps. Appropriate standards for ensuring quality and reproducibility in different clinical and laboratory settings and across relevant populations should be available. Validation of biomarker tests should be done on a test-by-test basis and must then be deemed sufficient for the use proposed in the utilization step. Validation may also include efforts to determine the extent to which data from different tests for the same biomarker may be compared to one another. When comparability is achieved, it both strengthens the biomarker itself and adds power to retrospective analyses of data related to the biomarker. As indicated in Recommendation 2, the expert panel will need to reevaluate the validation assessments on a continuing and as-needed basis and evaluate new tests that become available.
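The analytical performance characteristics named above can be estimated from replicate measurements. The report does not prescribe formulas, so the sketch below assumes one widely used parametric convention (limit of blank and limit of detection from blank and low-concentration replicates, in the style of CLSI EP17) and expresses imprecision as a coefficient of variation. The function names, the 1.645 multiplier, and the imprecision goal are assumptions chosen for illustration.

```python
from statistics import mean, stdev


def limit_of_blank(blank_values):
    """LoB = mean(blank) + 1.645 * SD(blank); assumes roughly normal noise."""
    return mean(blank_values) + 1.645 * stdev(blank_values)


def limit_of_detection(blank_values, low_sample_values):
    """LoD = LoB + 1.645 * SD(replicates of a low-concentration sample)."""
    return limit_of_blank(blank_values) + 1.645 * stdev(low_sample_values)


def percent_cv(replicates):
    """Imprecision as the coefficient of variation, in percent."""
    return 100.0 * stdev(replicates) / mean(replicates)


def limit_of_quantitation(replicates_by_level, max_cv):
    """Lowest tested concentration whose replicate CV meets the goal.

    replicates_by_level maps a nominal concentration to its replicates.
    Returns None if no tested level meets the (hypothetical) goal max_cv.
    """
    for level in sorted(replicates_by_level):
        if percent_cv(replicates_by_level[level]) <= max_cv:
            return level
    return None


if __name__ == "__main__":
    # Made-up replicate values, in arbitrary concentration units.
    blanks = [0.0, 1.0, 2.0]
    low = [4.0, 5.0, 6.0]
    at_cutoff = [9.0, 10.0, 11.0]
    print(limit_of_blank(blanks))
    print(limit_of_detection(blanks, low))
    print(percent_cv(at_cutoff))
```

Whether a given limit of detection or imprecision at the cutoff is adequate is not decided here; in the framework that judgment belongs to the utilization step and depends on the proposed use.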
The second step of the committee’s evaluation framework incorporates a factual description of the available evidence. The first component of qualification is to evaluate the prognostic value of the biomarker–disease relationship, or the nature and strength of evidence about whether the biomarker is associated with disease outcomes. This is discussed further below. The second component is to gather available evidence showing the biomarker’s ability to predict the effects of interventions on clinical endpoints of interest; this evidence may also be used to support the associations described in the first component. If the biomarker–clinical endpoint relationship persists over multiple interventions, it is considered more generalizable. It is important to note, however, that the type of reasoning that may be used in qualification is probabilistic rather than deterministic. Although deterministic reasoning ultimately means that every contributing factor to the biomarker–intervention–clinical endpoint link is defined and understood, probabilistic reasoning emphasizes epidemiological and statistical relationships, acknowledging that all contributing factors are generally not fully understood and that some factors may be fundamentally random.
Related to the first component of qualification, prognostic value can be assessed by using concepts described by criteria proposed for establishing causation of non-infectious diseases (Advisory Committee to the Surgeon General, 1964; Hill, 1965). These criteria evaluate characteristics such as temporality, strength of association, biological plausibility, and consistency, among others. Given that biomarkers are “indicators” (in that they are not necessarily causal) and that an abnormal value or a gradient in level over time is not necessarily informative or predictive depending on the clinical situation, the committee instead used these criteria as a structure for assessing the prognostic value, or degree of association between the biomarker and the clinical outcomes of interest absent any interventions. For a surrogate endpoint, or a biomarker deemed useful as a substitute for a defined, disease-relevant clinical endpoint, prognostic value is a necessary but not sufficient criterion for the evaluation. Depending on the situation, not all of the criteria must be fulfilled; temporality, strength of association, and consistency are particularly important, however. Observational data in human populations and preliminary clinical data (e.g., phase I or II data) are considered. Nonetheless, determination of whether a biomarker can be used as a surrogate endpoint for a specified intervention is done in the utilization step of the evaluation process.
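Two of the criteria singled out above, strength of association and consistency, lend themselves to a simple numerical illustration. The sketch below uses relative risk as the measure of association, which is a common epidemiological choice but an assumption here, since the report does not name a specific statistic. The function names and all counts are hypothetical.

```python
def relative_risk(events_high, total_high, events_low, total_low):
    """Risk of the outcome with a high biomarker level over risk with a low level."""
    risk_high = events_high / total_high
    risk_low = events_low / total_low
    return risk_high / risk_low


def consistent_direction(relative_risks):
    """True if every study points the same way (all above 1 or all below 1)."""
    return all(rr > 1 for rr in relative_risks) or all(rr < 1 for rr in relative_risks)


if __name__ == "__main__":
    # Made-up cohort counts: (events_high, total_high, events_low, total_low).
    studies = [
        (30, 100, 10, 100),
        (45, 200, 20, 200),
        (12, 80, 6, 80),
    ]
    rrs = [relative_risk(*counts) for counts in studies]
    print(rrs)
    print(consistent_direction(rrs))
```

A large and consistent relative risk speaks only to prognostic value. It says nothing about whether an intervention that moves the biomarker also moves the clinical endpoint, which is the second component of qualification.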
To address the second component of qualification, robust, adequately controlled clinical study data using clinical endpoints (i.e., phase III data or equivalent studies) are necessary. In the description of the evidence about the biomarker, applicable populations and conditions for use need to be articulated and taken into consideration in the utilization step of the biomarker evaluation framework for all types of proposed uses, including those for dietary and nutritional purposes.
The third step of the committee’s biomarker evaluation framework is a contextual analysis of the available evidence about a biomarker with regard to the proposed use of the biomarker. It is most essential that this analysis be carried out by a panel of experts, as scientific and medical judgment is necessary to weigh the possible advantages and disadvantages of the proposed biomarker use. These evaluations should take place on a per use basis, because use depends on the context of use proposed and because knowledge and technology continually evolve. Applicable populations and conditions for use need to be articulated. Utilization can be divided into several components. The first is a determination of the general category of use for which the biomarker is intended (e.g., prevention in the general population or a diseased population, diagnosis, treatment, or mitigation); this can guide the panel in determining important factors to consider in the second component of utilization. The second component is consideration of factors such as the prevalence, morbidity, and mortality of the disease; the risks and benefits associated with the intervention; opportunity cost; and whether the biomarker is being considered for use as a surrogate endpoint.
Strong evidence and a compelling context are needed for the utilization of a biomarker as a surrogate endpoint in situations with regulatory impact. In the case of chronic disease, where there are multiple pathogenetic pathways leading to development of clinical outcomes and multiple manifestations of disease, the probabilistic nature of predictions made using biomarker data means that no biomarker can give absolute certainty of an event’s future occurrence or of the timing of the predicted event. Nonetheless, there are situations in which use of a biomarker as a surrogate endpoint in situations with regulatory impact may be supported, such as in situations where the need for interventions is urgent or where studies including clinical endpoints are not feasible for technical or ethical reasons. Situations with regulatory impact are defined in Chapter 3. Again, this is not meant to discourage use of biomarkers in product development; biomarkers play an important role in research and decision-making. Finally, it is essential to remember that the information that an individual surrogate endpoint or clinical endpoint can give is inherently limited; as a result, it is important to emphasize the need to evaluate data relating to adverse events and unintended effects of biomarker use. As will be discussed and shown in Chapters 3 and 4, the status of a biomarker as a surrogate endpoint is context specific, and a biomarker cannot be assumed to be a general surrogate endpoint separate from a designated use.
The committee does not intend to imply that selection of endpoints for clinical trials would be simple or risk free if investigators were simply to avoid surrogate endpoints. Clinical and surrogate endpoints have been defined in a way that may imply a clear distinction between the two, in that clinical endpoints typically reflect patient or consumer experience and surrogate endpoints do not. However, there is discussion surrounding this issue, which illustrates the scientific complexity of the distinction between clinical and surrogate endpoints. Some clinical endpoints have many similarities with biomarkers, and can be thought of as a step removed from patient or consumer experience, and therefore subject to similar potential failings as surrogate endpoints (e.g., pain scales). Some surrogate endpoints are highly robust (e.g., HIV-1 RNA for effectiveness of antiretroviral medications in the treatment of HIV infection). Clinical endpoints share many features of biomarkers, such as the need for analytical validation, but they differ from biomarkers in that clinical endpoints address how a patient or consumer feels, functions, or survives and also commonly utilize multiple diagnostic criteria. The committee recognizes that selection of clinical endpoints is beyond the scope of this report. Nonetheless, there are many important interests at stake in this discussion and some issues, such as the best way to choose endpoints for trials, may be context specific. In such settings, stakeholders such as industry, the public as represented by government and community representatives, and academic researchers may benefit from convening to discuss these issues.
Scientific Process Harmonization
Recommendation 3: The FDA should use the same degree of scientific rigor for evaluation of biomarkers across regulatory areas, whether they are proposed for use in the arenas of drugs, medical devices, biologics, or foods and dietary supplements.
The importance of rigorous biomarker evaluation has been discussed for decades in the context of drug development. For foods, supplements, and devices, however, based on legislative and legal mandates, the FDA’s regulation of claims and the scientific standards for evaluating such claims are governed by different regulatory frameworks as compared to drugs; legislation may be required to revise the science-based standards and regulatory processes for these non-drug products. The committee concluded that the same standards of scientific evidence are required across regulatory areas and different products in the various FDA centers as well as for comparative effectiveness research because decisions about foods, drugs, biologics, and devices need to evaluate the evidence for claimed benefits within the context of use. The public health implications are important, and a critical evaluation of the strength of the evidence on safety is an important component of the context-of-use considerations for health claims on foods. Although it may be tempting to assume, for example, that health claims on foods have less potential risk for adverse consequences than is the case for drugs, it is important to realize that health claims on foods potentially impact a far greater portion of the population than do drug claims, that health claims are not interpreted with the mediation of a trained health professional, and that misleading or poorly substantiated health claims—or those later discovered to be incorrect due to insufficient evidence—can result in harm. These potential harms emphasize the need to weigh a biomarker’s potential context of use in the utilization step.
The committee’s biomarker evaluation framework is intended to accomplish the goal of consistent evaluation of biomarkers across different types of products and contexts of use. The committee recognizes the differences between scientific assessments of data and policy decisions. The first two steps of the evaluation framework are scientific steps. The third step provides a framework in which scientists and other experts can use rigorous scientific information to make recommendations for complex policy decisions.
Recommendation 4: The FDA should take into account a nutrient’s or food’s source as well as any modifying effects of the food or supplement that serves as the delivery vehicle and the dietary patterns associated with consumption of the nutrient or food when reviewing health-related label claims and the safety of food and supplements.
Drugs, biologics, and devices are evaluated for efficacy and safety on the basis of the whole products. Recommendation 4 seeks to extend this approach to foods and supplements. The differing health effects of individual nutrients or other food substances in food or supplement products composed of multiple substances are important. For this reason, focusing on a single nutrient or food substance contained in a food or in several different foods can be misleading, because it fails to take into account potential modifying effects of the source of the substance and matrix effects of other components in the food, meal, and diet. When these evaluations are based on biomarker data, the difficulties that arise from incomplete data on unintended effects and side effects are compounded. While review of proposed health claims takes into account the relationship of the specific substance that is the subject of the health claim to the health outcome of interest, it may not adequately consider the modifications of the substance’s effect on the disease outcome by other bioactive components in that food or the diet.
An individual substance or product composed of multiple substances may impact one or more biological pathways, each raising or lowering risk for a chronic disease or condition. An intervention may also have multiple health outcomes, and although it would be difficult or infeasible to discover or assess all of these effects, it is important to acknowledge them. Figure S-2 illustrates the multiplicity of possibilities inherent in the presence of multiple ingredients, each potentially impacting multiple pathways, in turn leading to multiple outcomes.
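The multiplicity described here grows quickly, which a short enumeration makes visible. The sketch below is illustrative only: the ingredient, pathway, and outcome names are invented placeholders, and the mapping structure is an assumption rather than anything taken from Figure S-2.

```python
def effect_chains(ingredient_pathways, pathway_outcomes):
    """List every ingredient -> pathway -> outcome chain implied by two mappings."""
    chains = []
    for ingredient, pathways in ingredient_pathways.items():
        for pathway in pathways:
            for outcome in pathway_outcomes.get(pathway, []):
                chains.append((ingredient, pathway, outcome))
    return chains


if __name__ == "__main__":
    # Hypothetical product with two ingredients, each touching two pathways.
    ingredient_pathways = {
        "ingredient_A": ["pathway_1", "pathway_2"],
        "ingredient_B": ["pathway_2", "pathway_3"],
    }
    # Each hypothetical pathway may raise or lower risk for several outcomes.
    pathway_outcomes = {
        "pathway_1": ["outcome_X", "outcome_Y"],
        "pathway_2": ["outcome_Y", "outcome_Z"],
        "pathway_3": ["outcome_X", "outcome_Z"],
    }
    for chain in effect_chains(ingredient_pathways, pathway_outcomes):
        print(" -> ".join(chain))
```

A biomarker typically observes only one pathway, so any chain that bypasses that pathway is invisible to it; this is one way to read the committee's caution about unintended effects.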
Effective implementation of the committee’s biomarker evaluation framework across all contexts of use will benefit from coordination within the FDA and with other government agencies. Useful components of this coordination include the systematic collection of data, building and supporting needed information technology infrastructure, and strengthening the surveillance systems required for linking biomarker and clinical outcome data. The FDA needs these tools to gather and use evidence when making regulatory decisions, which have important effects across the spectrum of research, clinical practice, and public health surveillance. Recommendations 5 and 6 address this need.
Improving Evidence-Based Regulation
REFERENCES
Advisory Committee to the Surgeon General. 1964. Report of the Advisory Committee to the Surgeon General. Washington, DC: U.S. Department of Health, Education, and Welfare.
Biomarkers Definitions Working Group. 2001. Biomarkers and surrogate endpoints: Preferred definitions and conceptual framework. Clinical Pharmacology and Therapeutics 69(3):89–95.
Hill, A. B. 1965. The environment and disease: Association or causation? Proceedings of the Royal Society of Medicine 58:295–300.
IOM (Institute of Medicine). 2007. Cancer biomarkers: The promises and challenges of improving detection and treatment. Washington, DC: The National Academies Press.
Wagner, J. A. 2002. Overview of biomarkers and surrogate endpoints in drug development. Disease Markers 18(2):41–46.
Wagner, J. A. 2008. Strategic approach to fit-for-purpose biomarkers in drug development. Annual Review of Pharmacology and Toxicology 48:631–651.