Gender Bias in Depression Measurement Instruments: A Systematic Review

Keywords

depression
gender
bias
differential functioning
measuring instruments

How to Cite

Cantador Toril, B., & Antolí Cabrera, A. (2026). Gender Bias in Depression Measurement Instruments: A Systematic Review. Universitas Psychologica, 24, 1-15. https://doi.org/10.11144/Javeriana.upsy24.sgim

Abstract

The prevalence of depressive disorders is approximately 1.7 times higher in women than in men. This difference could be influenced, at least in part, by gender bias arising from the differential functioning of certain items in measurement instruments, which occurs when women and men receive different scores despite having similar levels of the trait being evaluated. This review examined recent evidence on gender bias due to differential item functioning in eight frequently used instruments for measuring depression. The comparative analysis of scores identified 20 items with differential functioning, as well as gender differences in other aspects of depression measurement. Women scored higher on 15 items (13 in children and adolescents), while men scored higher on four items (three in children and adolescents) and lower on one item. Conclusion: six frequently used instruments evaluate depression differently depending on gender, showing signs of bias that could explain the differences in the prevalence rates of depressive disorders.
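
As an illustration of what differential item functioning means in practice, the following minimal sketch computes a Mantel-Haenszel common odds ratio for a single yes/no item, comparing two groups only among respondents with the same total score. It is not the procedure of this review or of any specific study it covers; the function names, the group labels "reference" and "focal", and the example counts are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def make_records(group, score, n_endorsed, n_not_endorsed):
    """Build hypothetical (group, total_score, endorsed) records for one stratum."""
    return [(group, score, 1)] * n_endorsed + [(group, score, 0)] * n_not_endorsed


def mantel_haenszel_odds_ratio(records):
    """Mantel-Haenszel common odds ratio for one dichotomous item.

    records: iterable of (group, total_score, endorsed) tuples, where group is
    "reference" or "focal" and endorsed is 0 or 1. Respondents are matched on
    total_score, used here as a stand-in for the level of the measured trait.
    """
    # For each score stratum, count [endorsed, not endorsed] per group.
    strata = defaultdict(lambda: {"reference": [0, 0], "focal": [0, 0]})
    for group, score, endorsed in records:
        strata[score][group][0 if endorsed else 1] += 1

    numerator = 0.0
    denominator = 0.0
    for cells in strata.values():
        a, b = cells["reference"]  # reference group: endorsed, not endorsed
        c, d = cells["focal"]      # focal group: endorsed, not endorsed
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n

    if denominator == 0:
        raise ValueError("odds ratio is undefined for these counts")
    return numerator / denominator


# Hypothetical data: both groups endorse the item equally often at each score.
no_dif = (
    make_records("reference", 1, 10, 10) + make_records("focal", 1, 10, 10)
    + make_records("reference", 2, 15, 5) + make_records("focal", 2, 15, 5)
)

# Hypothetical data: the focal group endorses the item more often at each score.
with_dif = (
    make_records("reference", 1, 5, 15) + make_records("focal", 1, 10, 10)
    + make_records("reference", 2, 10, 10) + make_records("focal", 2, 15, 5)
)

print(mantel_haenszel_odds_ratio(no_dif))
print(mantel_haenszel_odds_ratio(with_dif))
```

A ratio close to 1 suggests that the item behaves alike in both groups once respondents are matched on total score. A ratio clearly below 1 in this setup means the focal group endorses the item more often at the same score, which is the pattern described above as differential functioning.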

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Copyright (c) 2026 Bartolomé Cantador, Adoración Antolí