The Selection of Data
Deborah, a third-year graduate student, and Kamala, a postdoctoral fellow, have made a series of measurements on a new experimental semiconductor material using an expensive neutron test at a national laboratory. When they return to their own laboratory and examine the data, a newly proposed mathematical explanation of the semiconductor’s behavior predicts results indicated by a curve.
During the measurements at the national laboratory, Deborah and Kamala observed electrical power fluctuations that they could not control or predict were affecting their detector. They suspect the fluctuations affected some of their measurements, but they don’t know which ones.
When Deborah and Kamala begin to write up their results to present at a lab meeting, which they know will be the first step in preparing a publication, Kamala suggests dropping two anomalous data points near the horizontal axis from the graph they are preparing. She says that due to their deviation from the theoretical curve, the low data points were obviously caused by the power fluctuations. Furthermore, the deviations were outside the expected error bars calculated for the remaining data points.
Deborah is concerned that dropping the two points could be seen as manipulating the data. She and Kamala could not be sure that any of their data points, if any, were affected by the power fluctuations. They also did not know if the theoretical prediction was valid. She wants to do a separate analysis that includes the points and discuss the issue in the lab meeting. But Kamala says that if they include the data points in their talk, others will think the issue important enough to discuss in a draft paper, which will make it harder to get the paper published. Instead, she and Deborah should use their professional judgment to drop the points now.