Suggested Citation:"References." National Academies of Sciences, Engineering, and Medicine. 2016. Statistical Challenges in Assessing and Fostering the Reproducibility of Scientific Results: Summary of a Workshop. Washington, DC: The National Academies Press. doi: 10.17226/21915.

References

Allen, L., A. Brand, J. Scott, M. Altman, and M. Hlava. 2014. Publishing: Credit where credit is due. Nature 508(7496):312-313.

Alogna, V., M. Attaya, P. Aucoin, Š. Bahník, S. Birch, B. Bornstein, A.R. Birt, et al. 2014. Registered replication report: Schooler & Engstler-Schooler (1990). Perspectives on Psychological Science 9(5):556-578.

Altman, M. 2002. A review of JMP 4.03 with special attention to its numerical accuracy. The American Statistician 56(1):72-75.

Altman, M. 2008. A fingerprint method for scientific data verification. Pp. 311-316 in Advances in Computer and Information Sciences and Engineering (T. Sobh, ed.). Springer, The Netherlands.

Altman, M., and M. Crosas. 2013. The evolution of data citation: From principles to implementation. IASSIST Quarterly 37.

Altman, M., and G. King. 2007. A proposed standard for the scholarly citation of quantitative data. D-lib Magazine 13(3/4).

Altman, M., and M.P. McDonald. 2001. Choosing reliable statistical software. Political Science and Politics 34(03):681-687.

Altman, M., and M.P. McDonald. 2003. Replication with attention to numerical accuracy. Political Analysis 11(3):302-307.

Altman, M., L. Andreev, M. Diggory, G. King, E. Kolster, A. Sone, S. Verba, et al. 2001. Overview of the virtual data center project and software. Pp. 203-204 in JCDL ‘01: First Joint Conference on Digital Libraries.

Altman, M., J. Gill, and M.P. McDonald. 2004. Sources of inaccuracy in statistical computation. In Numerical Issues in Statistical Computing for the Social Scientist. John Wiley & Sons, Hoboken, N.J.

Altman, M., J. Fox, S. Jackman, and A. Zeileis. 2011. A special volume on “Political Methodology.” Journal of Statistical Software 42(i01).

Azoulay, L., H. Yin, K.B. Filion, J. Assayag, A. Majdan, M.N. Pollak, and S. Suissa. 2012. The use of pioglitazone and the risk of bladder cancer in people with type 2 diabetes: Nested case-control study. The BMJ 344:e3645.

Baggerly, K., and K.R. Coombes. 2009. Deriving chemosensitivity from cell lines: Forensic bioinformatics and reproducible research in high-throughput biology. Annals of Applied Statistics 3(4):1309-1334.

Barni, M., and F. Perez-Gonzalez. 2005. Pushing science into signal processing [my turn]. IEEE Signal Processing Magazine 22(4):119-120.

Bayarri, M.J., and A.M. Mayoral. 2002. Bayesian design of ‘successful’ replications. The American Statistician 56:207-214.

Begley, C.G. 2013. Reproducibility: Six red flags for suspect work. Nature 497(7450):433-434.

Begley, C.G., and L.M. Ellis. 2012. Drug development: Raise standards for preclinical cancer research. Nature 483:531-533.

Begley, C.G., and J.P.A. Ioannidis. 2015. Reproducibility in science: Improving the standard for basic and preclinical research. Circulation Research 116(1):116-126.

Begley, S. 2012. In cancer science, many “discoveries” don’t hold up. Reuters, Health, March 28. http://www.reuters.com/article/2012/03/28/us-science-cancer-idUSBRE82R12P20120328.

Benjamini, Y., and Y. Hochberg. 1995. Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B (Methodological) 57(1):289-300.

Berk, R., L. Brown, A. Buja, K. Zhang, and L. Zhao. 2013. Valid post-selection inference. Annals of Statistics 41(2):802-837.

Bernau, C., M. Riester, A.L. Boulesteix, G. Parmigiani, C. Huttenhower, L. Waldron, and L. Trippa. 2014. Cross-study validation for the assessment of prediction algorithms. Bioinformatics 30(12):i105-i112.

Berry, D. 2012. Multiplicities in cancer research: Ubiquitous and necessary evils. Journal of the National Cancer Institute 104(15):1124-1132.

Berry, D.A. 2007. The difficult and ubiquitous problems of multiplicities. Pharmaceutical Statistics 6(3):155-160.

Bezeau, S., and R. Graves. 2001. Statistical power and effect sizes of clinical neuropsychology research. Journal of Clinical and Experimental Neuropsychology 23(3):399-406.

Blanken, I., N. van de Ven, M. Zeelenberg, and M.H.C. Meijers. 2014. Three attempts to replicate the moral licensing effect. Social Psychology 45(3):223-231.

Boos, D.D., and L.A. Stefanski. 2011. P-value precision and reproducibility. The American Statistician 65:213-221.

Bossuyt, P.M., J.B. Reitsma, D.E. Bruns, C.A. Gatsonis, P.P. Glasziou, L.M. Irwig, J.G. Lijmer, D. Moher, D. Rennie, and H.C. de Vet. 2003. Toward complete and accurate reporting of studies of diagnostic accuracy: The STARD initiative. Academic Radiology 10(6):664-669.

Brandt, M.J., H. IJzerman, and I. Blanken. 2014. Does recalling moral behavior change the perception of brightness? A replication and meta-analysis of Banerjee, Chatterjee, and Sinha (2012). Social Psychology 45(3):246-252.

Buckheit, J., and D.L. Donoho. 1995. Wavelab and reproducible research. Pp. 55-81 in Wavelets and Statistics (A. Antoniadis, ed.). Springer, New York, N.Y.

Buja, A., D. Cook, H. Hofmann, M. Lawrence, E.-K. Lee, D.F. Swayne, and H. Wickham. 2009. Statistical inference for exploratory data analysis and model diagnostics. Philosophical Transactions of the Royal Society A 367(1906):4361-4383.

Calin-Jageman, R.J., and T.L. Caldwell. 2014. Replication of the Superstition and Performance Study by Damisch, Stoberock, and Mussweiler (2010). Social Psychology 45(3):239-245.

Cardwell, C.R., C.C. Abnet, M.M. Cantwell, and L.J. Murray. 2010. Exposure to oral bisphosphonates and risk of esophageal cancer. Journal of the American Medical Association 304(6):657-663.

Casella, G., and R.L. Berger. 1987. Reconciling Bayesian and frequentist evidence in the one-sided testing problem. Journal of the American Statistical Association 82(397):106-111.

Chan, A.-W., A. Hróbjartsson, M.T. Haahr, P.C. Gøtzsche, and D.G. Altman. 2004. Empirical evidence for selective reporting of outcomes in randomized trials: Comparison of protocols to published articles. Journal of the American Medical Association 291(20):2457-2465.

Clayton, J.A., and F.S. Collins. 2014. NIH to balance sex in cell and animal studies. Nature 509(7500):282-283.

Cloninger, D.O., and R. Marchesini. 2001. Execution and deterrence: A quasi-controlled group experiment. Applied Economics 33:569-576.

Cohen, J. 1962. The statistical power of abnormal-social psychological research: A review. Journal of Abnormal and Social Psychology 65:145-153.

Collins, F.S., and L.A. Tabak. 2014. NIH plans to enhance reproducibility. Nature 505(7485):612-613.

Colquhoun, D. 2014. An investigation of the false discovery rate and the misinterpretation of p-values. Royal Society Open Science 1(3):140216.

Commission on Presidential Debates. 2010. October 17, 2000 Debate Transcript. http://debates.org/index.php?page=october-17-2000-debate-transcript.

Cossins, D. 2014. Setting the record straight. Scientist 28(10):48-53.

Couzin-Frankel, J. 2015. Trust me, I’m a medical researcher. Science 347(6221):501-503.

Crabbe, J.C., D. Wahlsten, and B.C. Dudek. 1999. Genetics of mouse behavior: Interactions with laboratory environment. Science 284(5420):1670-1672.

Dezhbakhsh, H., and J. Shepherd. 2006. The deterrent effect of capital punishment: Evidence from a “judicial experiment.” Economic Inquiry 44(3):512-535.

Dezhbakhsh, H., P.H. Rubin, and J.M. Shepherd. 2003. Does capital punishment have a deterrent effect? New evidence from post moratorium panel data. American Law and Economics Review 5(2):344-376.

Donoho, D.L. 2010. An invitation to reproducible computational research. Biostatistics 11(3):385-388.

Donoho, D.L., and X. Huo. 2004. Beamlab and reproducible research. International Journal of Wavelets, Multiresolution and Information Processing 2(04):391-414.

Donoho, D.L., A. Maleki, I.U. Rahman, M. Shahram, and V. Stodden. 2009. Reproducible research in computational harmonic analysis. Computing in Science and Engineering 11(1):8-18.

Donohue, J.J., and J. Wolfers. 2005. Uses and abuses of empirical evidence in the death penalty debate. Stanford Law Review 58(3):791-845.

Donohue, J.J., and J. Wolfers. 2006. The death penalty: No evidence for deterrence. Economists’ Voice. http://www.deathpenaltyinfo.org/DonohueDeter.pdf.

Doshi, P., T. Jefferson, and C. Del Mar. 2012. The imperative to share clinical study reports: Recommendations from the Tamiflu experience. PLOS Medicine 9(4):e1001201.

Doyen, S., O. Klein, C. Pichon, and A. Cleeremans. 2012. Behavioral priming: It’s all in the mind, but whose mind? PLOS ONE 7(1):e29081.

Ehrlich, I. 1975. The deterrent effect of capital punishment: A question of life or death. American Economic Review 65(3):397-417.

Errington, T.M., E. Iorns, W. Gunn, F.E. Tan, J. Lomax, and B.A. Nosek. 2014. An open investigation of the reproducibility of cancer biology research. eLife 3:e04333.

Esarey, J., A. Wu, R.T. Stevenson, and R.K. Wilson. 2014. Editorial statement. The Political Methodologist 22(1):1-2.

Etminan, M., F. Forooghian, J.M. Brophy, S.T. Bird, and D. Maberley. 2012. Oral fluoroquinolones and the risk of retinal detachment. Journal of the American Medical Association 307(13):1414-1419.

Fanelli, D. 2010. Do pressures to publish increase scientists’ bias? An empirical support from U.S. states data. PLOS ONE 5(4):e10271.

Fanelli, D., and J.P.A. Ioannidis. 2013. U.S. studies may overestimate effect sizes in softer research. Proceedings of the National Academy of Sciences 110(37):15031-15036.

Fisher, R.A. 1926. The arrangement of field experiments. Journal of the Ministry of Agriculture 33:503-513.

Fisher, R.A. 1935. The Design of Experiments, II. Oliver and Boyd, Edinburgh, Scotland.

Freire, J., P. Bonnet, and D. Shasha. 2012. Computational reproducibility: State-of-the-art, challenges, and database research opportunities. Pp. 593-596 in SIGMOD ‘12 Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data.

Fonio, E., I. Golani, and Y. Benjamini. 2012. Measuring behavior of animal models: Faults and remedies. Nature Methods 9(12):1167-1170.

Furman v. Georgia, 408 US 238 – Supreme Court 1972.

Gardner, T. 2014. A swan in the making. Science 345(6199):855.

Garijo, D., S. Kinnings, L. Xie, P.E. Bourne, and Y. Gil. 2013. Quantifying reproducibility in computational biology: The case of the Tuberculosis Drugome. PLOS ONE 8(11):e80278.

Garnett, A., M. Altman, L. Andreev, S. Barbarosa, E. Castro, M. Crosas, G. Durand, et al. 2013. Linking OJS and Dataverse. PKP Scholarly Publishing Conference. https://pkp.sfu.ca/pkp2013/paper/view/390.

Gelman, A. 2015. “Academics Should Be Made Accountable for Exaggerations in Press Releases about Their Own Work.” Blog. Statistical Modeling, Causal Inference, and Social Science. Posted February 22. http://andrewgelman.com/2015/02/22/academics-made-accountable-exaggerations-press-releases-work/.

Gelman, A., and E. Loken. 2014. The statistical crisis in science. American Scientist 102(6):460.

Gerber, A.S., and D.P. Green. 2000. The effects of canvassing, telephone calls, and direct mail on voter turnout: A field experiment. American Political Science Review 94(03):653-663.

Gibson, C.E., J. Losee, and C. Vitiello. 2014. A replication attempt of stereotype susceptibility (Shih, Pittinsky, and Ambady, 1999): Identity salience and shifts in quantitative performance. Social Psychology 45(3):194-198.

Goodfellow, I.J., D. Warde-Farley, M. Mirza, A. Courville, and Y. Bengio. 2013. “Maxout Networks.” Pp. 1319-1327 in Proceedings of the 30th International Conference on Machine Learning, JMLR Workshop and Conference Proceedings, Volume 28. http://jmlr.csail.mit.edu/proceedings/papers/v28/goodfellow13.pdf.

Goodman, S. 2001. Of p-values and Bayes: A modest proposal. Epidemiology 12(3):295-297.

Goodman, S.N. 1992. A comment on replication, p-values and evidence. Statistics in Medicine 11:875-879.

Goodman, S.N., D.G. Altman, and S.L. George. 1998. Statistical reviewing policies of medical journals. Journal of General Internal Medicine 13(11):753-756.

Green, J., G. Czanner, G. Reeves, J. Watson, L. Wise, and V. Beral. 2010. Oral bisphosphonates and risk of cancer of esophagus, stomach, and colorectum: Case-control analysis within a UK primary care cohort. The BMJ 341:c4444.

Gregg v. Georgia, 428 US 153 – Supreme Court 1976.

Harris, C.R., N. Coburn, D. Rohrer, and H. Pashler. 2013. Two failures to replicate high-performance-goal priming effects. PLOS ONE 8(8):e72467.

Hayes, D.N., S. Monti, G. Parmigiani, C.B. Gilks, K. Naoki, A. Bhattacharjee, M.A. Socinski, C. Perou, and M. Meyerson. 2006. Gene expression profiling reveals reproducible human lung adenocarcinoma subtypes in multiple independent patient cohorts. Journal of Clinical Oncology 24(31):5079-5090.

Heller, R., and D. Yekutieli. 2014. Replicability analysis for genome-wide association studies. Annals of Applied Statistics 8(1):481-498.

Heller, R., M. Bogomolov, and Y. Benjamini. 2014. Deciding whether follow-up studies have replicated findings in a preliminary large-scale omics study. Proceedings of the National Academy of Sciences 111(46):16262-16267.

Hill, A.B. 1965. The environment and disease: Association or causation? Proceedings of the Royal Society of Medicine 58:295-300.

Hines, W.C., Y. Su, I. Kuhn, K. Polyak, and M.J. Bissell. 2014. Sorting out the FACS: A devil in the details. Cell Reports 6(5):779-781.

Hothorn, T., and F. Leisch. 2011. Case studies in reproducibility. Briefings in Bioinformatics 12(3):288-300.

Hume, D. 1738. A Treatise of Human Nature. Available as a Project Gutenberg EBook. Released February 13, 2010 [EBook #4705], last updated November 10, 2012. https://www.gutenberg.org/files/4705/4705-h/4705-h.htm.

IJzerman, H., I. Blanken, M.J. Brandt, J.M. Oerlemans, M.M.W. Van den Hoogenhof, S.J.M. Franken, and M.W.G. Oerlemans. 2014. Sex differences in distress from infidelity in early adulthood and in later life: A replication and meta-analysis of Shackelford et al. (2004). Social Psychology 45(3):202-208.

Imai, K. 2005. Do get-out-the-vote calls reduce turnout? The importance of statistical methods for field experiments. American Political Science Review 99(02):283-300.

Ioannidis, J.P.A. 2005. Why most published research findings are false. PLOS Medicine 2(8):e124.

Ioannidis, J.P.A., and M.J. Khoury. 2011. Improving validation practices in “omics” research. Science 334(6060):1230-1232.

Ioannidis, J.P.A., R. Tarone, and J.K. McLaughlin. 2011. The false-positive to false-negative ratio in epidemiologic studies. Epidemiology 22(4):450-456.

Jager, L.R., and J.T. Leek. 2014. An estimate of the science-wise false discovery rate and application to the top medical literature. Biostatistics 15(1):1-12.

John, L.K., G. Loewenstein, and D. Prelec. 2012. Measuring the prevalence of questionable research practices with incentives for truth telling. Psychological Science 23(5):524-532.

Johnson, D.J., F. Cheung, and M.B. Donnellan. 2014. Does cleanliness influence moral judgments? A direct replication of Schnall, Benton, and Harvey (2008). Social Psychology 45(3):209-215.

Johnson, V.E. 2013a. Uniformly most powerful Bayesian tests. Annals of Statistics 41(4):1716-1741.

Johnson, V.E. 2013b. Revised standards for statistical evidence. Proceedings of the National Academy of Sciences 110(48):19313-19317.

Karr, A.F. 2014. Why data availability is such a hard problem. Journal of the International Association for Official Statistics 30(2):101-107.

Katz, L., S.D. Levitt, and E. Shustorovich. 2003. Prison conditions, capital punishment and deterrence. American Law and Economics Review 5(2):318-343.

Kerr, N.L. 1998. HARKing: Hypothesizing after the results are known. Personality and Social Psychology Review 2(3):196-217.

King, G. 1995. Replication, replication. PS: Political Science & Politics 28(3):444-452.

Kiselycznyk, C., and A. Holmes. 2011. All (C57BL/6) mice are not created equal. Frontiers in Neuroscience 5(10).

Klein, R.A., K.A. Ratliff, M. Vianello, R.B. Adams, Jr., Š. Bahník, M.J. Bernstein, K. Bocian, et al. 2014. Investigating variation in replicability: A “many labs” replication project. Social Psychology 45(3):142-152.

Laine, C., S.N. Goodman, M.E. Griswold, and H.C. Sox. 2007. Reproducible research: Moving toward research the public can really trust. Annals of Internal Medicine 146(6):450-453.

Lambert, D., and W.J. Hall. 1982. Asymptotic lognormality of p-values. Annals of Statistics 10:44-64.

Lash, T.L. 2015. Truth and consequences. Epidemiology 26(2):141-142.

Leamer, E.E. 1978. Specification Searches: Ad Hoc Inference with Nonexperimental Data. Wiley & Sons, New York, N.Y.

Lee, C.-Y., S. Xie, P.W. Gallagher, Z. Zhang, and Z. Tu. 2015. Deeply supervised nets. Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, Vol. 38. http://jmlr.org/proceedings/papers/v38/lee15a.pdf.

Leek, J.T., and R.D. Peng. 2015a. What is the question? Science 347(6228):1314-1315.

Leek, J.T., and R.D. Peng. 2015b. Opinion: Reproducible research can still be wrong: Adopting a prevention approach. Proceedings of the National Academy of Sciences 112(6):1645-1646.

Lehrer, J. 2010. The truth wears off: Is there something wrong with the scientific method? The New Yorker, December 13.

LeVeque, R.J. 2013. “Top ten reasons to not share your code (and why you should anyway).” SIAM News. http://sinews.siam.org/DetailsPage/tabid/607/ArticleID/386/Top-Ten-Reasons-To-Not-Share-Your-Code-and-why-you-should-anyway.aspx.

LeVeque, R.J., I.M. Mitchell, and V. Stodden. 2012. Reproducible research for scientific computing: Tools and strategies for changing the culture. Computing in Science and Engineering 14(4):13.

Li, Q., J.B. Brown, H. Huang, and P.J. Bickel. 2011. Measuring reproducibility of high-throughput experiments. Annals of Applied Statistics 5(3):1752-1779.

Liberati, A., D.G. Altman, J. Tetzlaff, C. Mulrow, P.C. Gøtzsche, J.P.A. Ioannidis, M. Clarke, P.J. Devereaux, J. Kleijnen, and D. Moher. 2009. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: Explanation and elaboration. Annals of Internal Medicine 151(4):W65-W94.

Lynott, D., K.S. Corker, J. Wortman, L. Connell, M.B. Donnellan, R.E. Lucas, and K. O’Brien. 2014. Replication of “Experiencing physical warmth promotes interpersonal warmth” by Williams and Bargh (2008). Social Psychology 45(3):216-222.

Madigan, D., P.E. Stang, J.A. Berlin, M. Schuemie, J.M. Overhage, M.A. Suchard, W. DuMouchel, A.G. Hartzema, and P.B. Ryan. 2014. A systematic statistical approach to evaluating evidence from observational studies. Annual Review of Statistics and Its Application 1:11-39.

Makel, M.C., J.A. Plucker, and B. Hegarty. 2012. Replications in psychology research: How often do they really occur? Perspectives on Psychological Science 7(6):537-542.

Mann, C.C. 1994. Behavioral genetics in transition. Science 265(5166):1686-1689.

McNutt, M. 2014. Journals unite for reproducibility. Science 346(6210):679.

Miller, R.G. 1986. Beyond ANOVA, Basics of Applied Statistics. Wiley, New York.

Mills, J.L. 1993. Data torturing. New England Journal of Medicine 329:1196-1199.

Mobley, A., S.K. Linder, R. Braeuer, L.M. Ellis, and L. Zwelling. 2013. A survey on data reproducibility in cancer research provides insights into our limited ability to translate findings from the laboratory to the clinic. PLOS ONE 8(5):e63221.

Mocan, H.N., and R.K. Gittings. 2003. Getting off death row: Commuted sentences and the deterrent effect of capital punishment. Journal of Law and Economics XLVI:453-478.

Mojirsheibani, M., and R. Tibshirani. 1996. Some results on bootstrap prediction intervals. Canadian Journal of Statistics 24:549-568.

Molina, H., G. Parmigiani, and A. Pandey. 2005. Assessing reproducibility of a protein dynamics study using in vivo labeling and liquid chromatography tandem mass spectrometry. Analytical Chemistry 77(9):2739-2744.

Moon, A., and S.S. Roeder. 2014. A secondary replication attempt of stereotype susceptibility (Shih, Pittinsky, and Ambady, 1999). Social Psychology 45(3):199-201.

Moore, J. 2013. “Onstage Speech Transcript: Actress in a Leading Role.” Transcript. http://www.oscars.org/press/onstage-speech-transcript-actress-leading-role.

Mosteller, F., and J.W. Tukey. 1977. Data Analysis and Regression: A Second Course in Statistics. Addison-Wesley, Reading, Mass.

Motulsky, H.J. 2014. Common misconceptions about data analysis and statistics. British Journal of Pharmacology 172(8):2126-2132.

Müller, F., and K. Rothermund. 2014. What does it take to activate stereotypes? Simple primes don’t seem to be enough: A replication of stereotype activation (Banaji and Hardin, 1996; Blair and Banaji, 1996). Social Psychology 45(3):187-193.

NASEM (National Academies of Sciences, Engineering, and Medicine). 2015. Reproducibility Issues in Research with Animals and Animal Models. The National Academies Press, Washington, D.C.

Nature Methods. 2012. “All Things Being Equal.” Editorial. 9(2):111.

Nature Neuroscience. 2013. “Raising Standards.” Editorial. 16(5):517.

Nauts, S., O. Langner, I. Huijsmans, R. Vonk, and D.H.J. Wigboldus. 2014. Forming impressions of personality: A replication and review of Asch’s (1946) evidence for a primacy-of-warmth effect in impression formation. Social Psychology 45(3):153-163.

Netzer, Y., T. Wang, A. Coates, A. Bissacco, B. Wu, and A.Y. Ng. 2011. “Reading Digits in Natural Images with Unsupervised Feature Learning.” Deep Learning and Unsupervised Feature Learning Workshop, NIPS. http://ufldl.stanford.edu/housenumbers/nips2011_housenumbers.pdf.

Nicholson, J., and Y. Lazebnik. 2014. “The R-Factor: A Measure of Scientific Veracity.” The Winnower. https://thewinnower.com/papers/1-the-r-factor-a-measure-of-scientific-veracity.

Nosek, B.A. 2014. Registered reports: A method to increase the credibility of published results. Social Psychology 45(3):137-141.

Nosek, B.A., and D. Lakens. 2014. Registered reports. Social Psychology 45(3):137-141.

Nosek, B.A., J.R. Spies, and M. Motyl. 2012. Scientific utopia: II. Restructuring incentives and practices to promote truth over publishability. Perspectives on Psychological Science 7(6):615-631.

Nosek, B.A., G. Alter, G.C. Banks, D. Borsboom, S.D. Bowman, S.J. Breckler, S. Buck, et al. 2015. Promoting an open research culture. Science 348(6242):1422-1425.

NRC (National Research Council). 1966. Languages and Machines: Computers in Translation and Linguistics. National Academy Press, Washington, D.C.

NRC. 1978. Deterrence and Incapacitation: Estimating the Effects of Criminal Sanctions on Crime Rates. National Academy Press, Washington, D.C.

NRC. 1991. The Future of Statistical Software: Proceedings of a Forum. National Academy Press, Washington, D.C.

NRC. 2012. Deterrence and the Death Penalty. The National Academies Press, Washington, D.C.

NSF (National Science Foundation). 2014. “A Framework for Ongoing and Future National Science Foundation Activities to Improve Reproducibility, Replicability, and Robustness in Funded Research.” Prepared for the Office of Management and Budget. Submitted December 31, 2014. https://www.nsf.gov/attachments/134722/public/Reproducibility_NSFPlanforOMB_Dec31_2014.pdf.

NSF. 2015. Social, Behavioral, and Economic Sciences Perspectives on Robust and Reliable Science. Report of the Subcommittee on Replicability in Science Advisory Committee to the National Science Foundation Directorate for Social, Behavioral, and Economic Sciences. May. http://www.nsf.gov/sbe/AC_Materials/SBE_Robust_and_Reliable_Research_Report.pdf.

Obama, B. 2006. The Audacity of Hope: Thoughts on Reclaiming the American Dream. Random House Crown Publishing, New York, N.Y.

Open Science Collaboration. 2015. Estimating the reproducibility of psychological science. Science 349(6251):943-951.

Pallett, D.S. 1985. Performance assessment of automatic speech recognizers. Journal of Research of the National Bureau of Standards 90(5):371-387.

Pashler, H., N. Coburn, and C.R. Harris. 2012. Priming of social distance? Failure to replicate effects on social and food judgments. PLOS ONE 7(8):e42510.

Pasternak, B., H. Svanström, M. Melbye, and A. Hviid. 2013. Association between oral fluoroquinolone use and retinal detachment. Journal of the American Medical Association 310(20):2184-2190.

Patel, C.J., and J.P.A. Ioannidis. 2014. Studying the elusive environment in large scale. Journal of the American Medical Association 311(21):2173-2174.

Patel, C.J., B. Burford, and J.P.A. Ioannidis. 2015. Assessment of vibration of effects due to model specification can demonstrate the instability of observational associations. Journal of Clinical Epidemiology 68:1046-1058.

Peers, I.S., P.R. Ceuppens, and C. Harbron. 2012. In search of preclinical robustness. Nature Reviews Drug Discovery 11(10):733-734.

Peng, R. 2009. Reproducible research and biostatistics. Biostatistics 10(3):405-408.

Peng, R. 2011. Reproducible research in computational science. Science 334(6060):1226-1227.

Peng, R.D., F. Dominici, and S.L. Zeger. 2006. Reproducible epidemiology research. American Journal of Epidemiology 163(9):783-789.

Pereira, T.V., R.I. Horwitz, and J.P.A. Ioannidis. 2012. Empirical evaluation of very large treatment effects of medical intervention. Journal of the American Medical Association 308(16):1676-1684.

Perrin, S. 2014. Preclinical research: Make mouse studies work. Nature 507(7493):423-425.

Pierce, J.R. 1969. Whither speech recognition? Journal of the Acoustical Society of America 46:1049-1050.

Piwowar, H., R.S. Day, and D.B. Fridsma. 2007. Sharing detailed research data is associated with increased citation rate. PLOS ONE 2(3):e308.

Political Science Journal Editors. 2014. “Data Access and Research Transparency (DA-RT): A Joint Statement.” http://www.dartstatement.org/.

Prinz, F., T. Schlange, and K. Asadullah. 2011. Believe it or not: How much can we rely on published data on potential drug targets? Nature Reviews Drug Discovery 10:712.

Rasko, J., and C. Power. 2015. What pushes scientists to lie? The disturbing but familiar story of Haruko Obokata. The Guardian, February 18.

Redelmeier, D.A., and S.M. Singh. 2001. Survival in Academy Award-winning actors and actresses. Annals of Internal Medicine 134:955-962.

Reiter, J.P., and S.K. Kinney. 2011. Sharing confidential data for research purposes: A primer. Epidemiology 22(5):632-635.

Rekdal, O.B. 2014. Academic urban legends. Social Studies of Science 44(4):638-654.

Richter, S.H., J.P. Garner, and H. Würbel. 2009. Environmental standardization: Cure or cause of poor reproducibility in animal experiments. Nature Methods 6:257-261.

Richter, S.H., J.P. Garner, B. Zipser, L. Lewejohann, N. Sachser, C. Touma, B. Schindler, et al. 2011. Effect of heterogenization on the reproducibility of mouse behaviour: A multi-laboratory study. PLOS ONE 6(1):e16461.

Rosenthal, R. 1979. The file drawer problem and tolerance for null results. Psychological Bulletin 86(3):638-641.

Rothman, K.J., and J.D. Boice, Jr. 1979. Epidemiologic Analysis with a Programmable Calculator. NIH Publication 79-1649. U.S. Government Printing Office, Washington, D.C.

Ryan, P.B., P.E. Stang, J.M. Overhage, M.A. Suchard, A.G. Hartzema, W. DuMouchel, C.G. Reich, M.J. Schuemie, and D. Madigan. 2013. A comparison of the empirical performance of methods for a risk identification system. Drug Safety 36(1):143-158.

Schooler, J.W. 2014. Turning the lens of science on itself: Verbal overshadowing, replication, and metascience. Perspectives on Psychological Science 9(5):579-584.

Schuemie, M.J., P.B. Ryan, W. DuMouchel, M.A. Suchard, and D. Madigan. 2013. Interpreting observational studies: Why empirical calibration is needed to correct p-values. Statistics in Medicine 33(2):209-218.

Scott, S., J.E. Kranz, J. Cole, J.M. Lincecum, K. Thompson, N. Kelly, A. Bostrom, et al. 2008. Design, power, and interpretation of studies in the standard murine model of ALS. Amyotrophic Lateral Sclerosis 9(1):4-15.

Sedlmeier, P., and G. Gigerenzer. 1989. Do studies of statistical power have an effect on the power of studies? Psychological Bulletin 105:309-316.

Sellin, T. 1959. The Death Penalty. American Law Institute, Philadelphia, Pa.

Sena, E., H.B. van der Worp, D. Howells, and M. Macleod. 2007. How can we improve the pre-clinical development of drugs for stroke? Trends in Neuroscience 30:433-439.

Shao, J., and S.-C. Chow. 2002. Reproducibility probability in clinical trials. Statistics in Medicine 21:1727-1742.

Shao, J., and D. Tu. 1996. The Jackknife and Bootstrap. Springer, New York, N.Y.

Shepherd, J.M. 2005. Deterrence versus brutalization: Capital punishment’s differing impacts among states. Michigan Law Review 104:248.

Simmons, J.P., L.D. Nelson, and U. Simonsohn. 2011. False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science 22(11):1359-1366.

Simons, D.J., A.O. Holcombe, and B.A. Spellman. 2014. An introduction to registered replication reports at Perspectives on Psychological Science. Perspectives on Psychological Science 9(5):552-555.

Simonsohn, U. 2012. It does not follow: Evaluating the one-off publication bias critiques by Francis (2012a, 2012b, 2012c, 2012d, 2012e, in press). Perspectives on Psychological Science 7(6):597-599.

Sinclair, H.C., K.B. Hood, and B.L. Wright. 2014. Revisiting the Romeo and Juliet effect (Driscoll, Davis, and Lipetz, 1972): Reexamining the links between social network opinions and romantic relationship output. Social Psychology 45(3):170-178.

Spence, D. 2014. Evidence based medicine is broken. The BMJ 348.

Steward, O., P.G. Popovich, W.D. Dietrich, and N. Kleitman. 2012. Replication and reproducibility in spinal cord injury. Experimental Neurology 233(2):597-605.

Stodden, V. 2009a. Enabling reproducible research: Open licensing for scientific innovation. International Journal of Communications Law and Policy. March 3. Available at Social Science Electronic Publishing, http://ssrn.com/abstract=1362040.

Stodden, V. 2009b. The legal framework for reproducible scientific research: Licensing and copyright. Computing in Science and Engineering 11(1):35-40.

Stodden, V. 2013. Resolving irreproducibility in empirical and computational research. IMS Bulletin Online. http://bulletin.imstat.org/2013/11/resolving-irreproducibility-in-empirical-and-computational-research/.

Stodden, V., J. Borwein, and D. Bailey. 2013a. “Setting the default to reproducible” in computational science research. SIAM News 46:4-6.

Stodden, V., P. Guo, and Z. Ma. 2013b. Toward reproducible computational research: An empirical analysis of data and code policy adoption by journals. PLOS ONE 8(6):e67111.

Stodden, V., F. Leisch, and R.D. Peng. 2014. Implementing Reproducible Research. CRC Press, Boca Raton, Fla.

Stodden, V., S. Miguez, and J. Seiler. 2015. ResearchCompendia.org: Cyberinfrastructure for reproducibility and collaboration in computational science. Computing in Science and Engineering 17(1):12-19.

Sylvestre, M.-P., E. Huszti, and J.A. Hanley. 2006. Do Oscar winners live longer than less successful peers? A reanalysis of the evidence. Annals of Internal Medicine 145:361-363.

Terrorist Penalties Enhancement Act of 2003: Hearing on H.R. 2934 Before the Subcomm. on Crime, Terrorism, and Homeland Security of the H. Comm. on the Judiciary, 108th Cong. 10-11 (2004). http://commdocs.house.gov/committees/judiciary/hju93224.000/hju93224_0f.htm.

Trafimow, D., and M. Marks. 2015. Editorial. Basic and Applied Social Psychology 37:1-2.

Vandewalle, P., J. Kovacevic, and M. Vetterli. 2009. Reproducible research in signal processing—What, why, and how. IEEE Signal Processing Magazine 26(3):37-47.

Vermeulen, I., A. Batenburg, C.J. Beukeboom, and T. Smits. 2014. Breakthrough or one-hit wonder? Three attempts to replicate single-exposure musical conditioning effects on choice behavior (Gorn, 1982). Social Psychology 45(3):179-186.

Wahlsten, D. 2001. Standardized tests of mouse behavior: Reasons, recommendations, and reality. Physiology &amp; Behavior 73(5):695-704.

Waldron, L., B. Haibe-Kains, A.C. Culhane, M. Riester, J. Ding, X.V. Wang, M. Ahmadifar, et al. 2014. Comparative meta-analysis of prognostic gene signatures for late-stage ovarian cancer. Journal of the National Cancer Institute 106(5):dju049.

Wei, L., T.M. MacDonald, and I.S. Mackenzie. 2013. Pioglitazone and bladder cancer: A propensity score matched cohort study. British Journal of Clinical Pharmacology 75(1):254-259.

Wellcome Trust. 2014. Establishing Incentives and Changing Cultures to Support Data Access. May. http://www.wellcome.ac.uk/stellent/groups/corporatesite/@msh_peda/documents/web_document/wtp056495.pdf.

Wesselmann, E.D., K.D. Williams, J.B. Pryor, F.A. Eichler, D.M. Gill, and J.D. Hogue. 2014. Revisiting Schachter’s research on rejection, deviance, and communication (1951). Social Psychology 45(3):164-169.

White, H. 2000. A reality check for data snooping. Econometrica 68(5):1097-1126.

Wicherts, J.M., D. Borsboom, J. Kats, and D. Molenaar. 2006. The poor availability of psychological research data for reanalysis. American Psychologist 61(7):726-728.

Wilson, E.O. 1998. Consilience: The Unity of Knowledge. Random House, New York, N.Y.

Wolfinger, R.D. 2013. Reanalysis of Richter et al. (2010) on reproducibility. Nature Methods 10:373-374.

Würbel, H., S.H. Richter, and J.P. Garner. 2013. Reply to: “Reanalysis of Richter et al. (2010) on reproducibility.” Nature Methods 10:374.

Yale Roundtable Participants. 2010. Reproducible research. Computing in Science and Engineering 12(5):8-13.

Youden, W.J. 1972. Enduring values. Technometrics 14(1):1-14.

Young, S.S., and A. Karr. 2011. Deming, data, and observational studies: A process out of control and needing fixing. Significance 8(3):116-120.

Žeželj, I.L., and B.R. Jokić. 2014. Replication of experiments evaluating impact of psychological distance on moral judgment (Eyal, Liberman and Trope, 2008; Gong and Medin, 2012). Social Psychology 45(3):223-231.

Zilliox, M.J., and R. Irizarry. 2007. A gene expression barcode for microarray data. Nature Methods 4(11):911-913.

Zimmerman, P.R. 2004. State executions, deterrence, and the incidence of murder. Journal of Applied Economics 7(1):163-193.

Questions about the reproducibility of scientific research have been raised in numerous settings and have gained visibility through several high-profile journal and popular press articles. Quantitative issues contributing to reproducibility challenges have been considered (including improper data measurement and analysis, inadequate statistical expertise, and incomplete data, among others), but there is no clear consensus on how best to approach or to minimize these problems.

A lack of reproducibility of scientific results has created some distrust in scientific findings among the general public, scientists, funding agencies, and industries. Studies fail for a variety of reasons, and many factors contribute to imperfect reproducibility, including insufficient training in experimental design, misaligned incentives for publication and their implications for university tenure, intentional manipulation, poor data management and analysis, and inappropriate statistical inference.

The workshop summarized in this report was designed not to address the social and experimental challenges but instead to focus on the latter issues of improper data management and analysis, inadequate statistical expertise, incomplete data, and difficulties in applying sound statistical inference to the available data. Many efforts have emerged in recent years to draw attention to and improve the reproducibility of scientific work. This report uniquely focuses on the statistical perspective of three issues: the extent of reproducibility, the causes of reproducibility failures, and the potential remedies for these failures.
