Although summing the annual award counts reported by the five agencies would seem to define the larger sample, the process was more complicated. Agency reports usually involve some estimation and anticipate the successful negotiation of selected proposals, and agencies rarely correct reports after the fact. Setting limits on the number of projects to be surveyed from each firm required knowing how many awards each firm had received from all five agencies. Thus, the first step was to obtain the award databases from each agency and combine them into a single database. Defining the database was further complicated by variations in firm identification, location, phone numbers, and points of contact within the individual agency databases. Ultimately, we determined that 4,085 firms had been awarded 11,214 Phase II awards (an average of 2.7 Phase II awards per firm) by the five agencies during the 1992–2001 time frame. Firm records were then updated with the most current contact information, taken from each firm’s most recent awards.
The Phase II survey used an array of sampling techniques to ensure adequate coverage of projects, to address a wide range of both outcomes and potential explanatory variables, and to address the problem of skew: a relatively small percentage of funded projects typically accounts for a large percentage of commercial impact in the field of advanced, high-risk technologies.
Random samples. After the 11,214 awards were integrated into a single database, a random sample of approximately 20 percent was drawn. The sample was then checked to ensure that it included 20 percent of the awards from each year; e.g., 20 percent of the 1992 awards, 20 percent of the 1993 awards, etc. Verifying the sample one year at a time made it easier to account for changes in the program over time, as otherwise the larger number of awards made in recent years might dominate the sample.
Random sample by agency. Surveyed awards were grouped by agency; additional respondents were randomly selected as required to ensure that at least 20 percent of each agency’s awards were included in the sample.
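The overall draw and the per-year and per-agency checks described above can be sketched as follows. This is a minimal illustration, not the procedure actually used: the field names (`id`, `year`, `agency`), the agency labels A through E, the synthetic award list, and the choice to round each 20 percent quota up are all assumptions made for the example.

```python
import random
from collections import defaultdict

def ensure_coverage(awards, sampled_ids, key, percent=20, rng=None):
    """Top up a sample so at least `percent` of each `key` group is included."""
    if rng is None:
        rng = random.Random(0)
    groups = defaultdict(list)
    for award in awards:
        groups[award[key]].append(award["id"])
    result = set(sampled_ids)
    for value in sorted(groups):
        ids = groups[value]
        # Ceiling of percent% of the group size (rounding up is an assumption).
        needed = (percent * len(ids) + 99) // 100
        have = sum(1 for i in ids if i in result)
        if have < needed:
            unsampled = [i for i in ids if i not in result]
            result.update(rng.sample(unsampled, needed - have))
    return result

# Hypothetical data: 1,000 awards spread over 10 years and 5 agencies.
awards = [{"id": n, "year": 1992 + n % 10, "agency": "ABCDE"[n % 5]}
          for n in range(1000)]

rng = random.Random(42)
# Step 1: simple random sample of about 20 percent of all awards.
overall = set(rng.sample([a["id"] for a in awards], len(awards) // 5))
# Step 2: make sure every award year reaches 20 percent.
by_year = ensure_coverage(awards, overall, "year", rng=rng)
# Step 3: make sure every agency reaches 20 percent.
final = ensure_coverage(awards, by_year, "agency", rng=rng)
```

Topping up an existing sample, rather than redrawing it, keeps the original random selection intact and only adds awards where a year or agency falls short of its quota.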
Firm surveys. After the random selection, 100 percent of the Phase IIs that went to firms with only one or two awards were polled. These are the hardest firms to locate for older awards, because address information is highly perishable, particularly for earlier award years. For firms that had more than two awards, 20 percent of their awards were selected, but no fewer than two.
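The per-firm rule can be expressed as a small quota function. The sketch below is illustrative only: the function names, the `firm` and `id` fields, the sample data, and the decision to round 20 percent up are assumptions, and the sketch applies the firm rule on its own rather than combining it with the earlier random selection.

```python
import random
from collections import defaultdict

def firm_quota(n_awards, percent=20, minimum=2):
    """Number of a firm's Phase II awards to survey under the rule above."""
    if n_awards <= 2:
        return n_awards  # firms with one or two awards: survey all of them
    share = (percent * n_awards + 99) // 100  # 20 percent, rounded up (assumed)
    return max(minimum, share)  # never fewer than two

def select_by_firm(awards, seed=0):
    """Randomly pick each firm's quota of awards."""
    rng = random.Random(seed)
    by_firm = defaultdict(list)
    for award in awards:
        by_firm[award["firm"]].append(award["id"])
    selected = []
    for firm in sorted(by_firm):
        ids = by_firm[firm]
        selected.extend(rng.sample(ids, firm_quota(len(ids))))
    return selected

# Hypothetical firms with 1, 2, 10, and 50 awards.
sizes = {"solo": 1, "pair": 2, "mid": 10, "big": 50}
awards = [{"id": f"{firm}-{i}", "firm": firm}
          for firm, n in sizes.items() for i in range(n)]
selected = select_by_firm(awards)
```

Under these assumptions a firm with 10 awards contributes 2 projects and a firm with 50 awards contributes 10, so the two-award floor matters only for firms with 3 to 10 awards.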
Top performers. The problem of skew was dealt with by ensuring that all Phase IIs known to meet a specific commercialization threshold (a total of $10 million in sales plus additional investment), as identified from the DoD commercialization database, were surveyed. Since 56 percent of all awards were