Abstract
The prevalence of depressive disorders is approximately 1.7 times higher in women than in men. This difference could be influenced, at least in part, by gender bias arising from the differential functioning of certain items in measurement instruments, which occurs when women and men receive different scores despite having similar levels of the trait being evaluated. This review examined recent evidence on gender bias due to differential item functioning in eight frequently used instruments for measuring depression. The comparative analysis of scores identified 20 items with differential functioning, as well as gender differences in other aspects related to depression measurement. Women scored higher on 15 items (13 in children and adolescents), while men scored higher on four items (three in children and adolescents) and lower on one item. Conclusion: six frequently used instruments evaluate depression differently depending on gender, showing signs of bias that could explain the differences in the prevalence rates of depressive disorders.

This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright (c) 2026 Bartolomé Cantador, Adoración Antolí

