Content Analysis of Textbooks via Natural Language Processing

Authors

  • Mahshad Nasr Esfahani

DOI:

https://doi.org/10.47672/ajep.2252

Keywords:

Artificial Intelligence, Case Studies, Content Analysis, Curriculum, Data Science, Gender Studies, History, Natural Language Processing, Race, Textbooks, Textual Analysis

Abstract

Purpose: Advanced methods from data science have the potential to shed fresh light on basic questions in educational research. Natural language processing tools, such as lexicons, word embeddings, and topic models, are applied to 15 United States history textbooks that were widely used in Texas between 2015 and 2017.

Material and Methods: This study analyzes these textbooks for their portrayal of historically oppressed populations. Latinx individuals are rarely mentioned, while the renowned figures who are mentioned are almost all white men. According to lexicon-based methods, people of African descent are described as acting in ways that imply helplessness and lack of control. According to the word embeddings, women are most often discussed in the contexts of the home and the workplace. Topic modeling highlights issues of a political rather than a social nature.
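The lexicon-based step described above can be illustrated with a minimal sketch. It is not the study's pipeline, which this abstract does not specify. The term lists, the agency lexicon, its scores, and the sample sentences are all hypothetical and invented for illustration. The sketch counts how often terms for each group occur and averages a toy agency score over the scored words in a passage.

```python
import re
from collections import Counter

# Hypothetical term lists and agency lexicon, invented for illustration only.
# They are NOT the lexicons used in the study.
GROUP_TERMS = {
    "women": {"woman", "women", "she", "her"},
    "men": {"man", "men", "he", "his"},
}
AGENCY_LEXICON = {
    "led": 1.0, "built": 1.0, "demanded": 1.0,
    "suffered": -1.0, "obeyed": -1.0, "endured": -1.0,
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def count_mentions(text):
    """Count how often each group's terms occur in the text."""
    counts = Counter()
    for token in tokenize(text):
        for group, terms in GROUP_TERMS.items():
            if token in terms:
                counts[group] += 1
    return counts


def mean_agency(text):
    """Average the lexicon score over tokens found in the lexicon (0.0 if none)."""
    scores = [AGENCY_LEXICON[t] for t in tokenize(text) if t in AGENCY_LEXICON]
    return sum(scores) / len(scores) if scores else 0.0


# Made-up sentences, not drawn from any textbook.
sample = "He led the army and built a fort. She endured the winter."
mentions = count_mentions(sample)
agency = mean_agency(sample)
print(dict(mentions), round(agency, 3))
```

In practice a per-group agency score would require linking each scored verb to its subject, for example with a syntactic parser. The sketch skips that step and scores the passage as a whole.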

Findings: We also found that textbooks with less representation of women and people of African descent are more often purchased by conservative counties.

Implications to Theory, Practice and Policy: Building on a rich history of textbook analysis, our computational toolkit has recently been released to support new lines of research.

Published

2024-07-29

How to Cite

Esfahani, M. N. (2024). Content Analysis of Textbooks via Natural Language Processing. American Journal of Education and Practice, 8(4), 36–54. https://doi.org/10.47672/ajep.2252

Section

Articles