Abstract
Evidence-based practice requires professionals to critically assess the results of psychological research. However, incorrect interpretations of p values are abundant and recurrent. These misconceptions may affect professional decisions and compromise both the quality of interventions and the accumulation of valid scientific knowledge. Identifying the types of fallacies that underlie statistical decisions is therefore fundamental for planning statistical education strategies designed to correct these interpretations. Accordingly, the aim of this study is to analyze the interpretation of the p value among university psychology students and academic psychologists. The sample comprised 161 participants (43 academics and 118 students). The mean number of years working as an academic was 16.7 (SD = 10.07). The mean age of the university students was 21.59 years (SD = 1.3). The findings suggest that college students and academics do not know the correct interpretation of p values. The inverse probability fallacy posed the greatest comprehension problems. In addition, participants confused statistical significance with the practical or clinical significance of the findings. There is a need for statistical education and statistical re-education.
This journal is registered under a Creative Commons Attribution 4.0 International Public License. Thus, this work may be reproduced, distributed, and publicly shared in digital format, as long as the names of the authors and Pontificia Universidad Javeriana are acknowledged. Others are allowed to quote, adapt, transform, auto-archive, republish, and create based on this material, for any purpose (even commercial ones), provided the authorship is duly acknowledged, a link to the original work is provided, and it is specified if changes have been made. Pontificia Universidad Javeriana does not hold the rights of published works and the authors are solely responsible for the contents of their works; they keep the moral, intellectual, privacy, and publicity rights. Approval of interventions in the work (review, copy-editing, translation, layout) and of its subsequent dissemination is granted through a use license and not through an assignment of rights. This means the journal and Pontificia Universidad Javeriana cannot be held responsible for any ethical malpractice by the authors. As a consequence of the protection granted by the use license, the journal is not required to publish retractions or modify information already published, unless the erratum stems from the editorial management process. Publishing contents in this journal does not generate royalties for contributors.