Content Analysis of Textbooks via Natural Language Processing
DOI:
https://doi.org/10.47672/ajep.2252
Keywords:
Artificial Intelligence, Case Studies, Content Analysis, Curriculum, Data Science, Gender Studies, History, Natural Language Processing, Race, Textbooks, Textual Analysis
Abstract
Purpose: Advanced methods from data science have the potential to shed fresh light on fundamental questions in educational research. Natural language processing tools, such as lexicons, word embeddings, and topic models, are applied to 15 United States history textbooks that were widely used in Texas between 2015 and 2017.
Material and Methods: This study analyzes these textbooks for their portrayal of historically oppressed populations. Latinx individuals are rarely mentioned, while the famous figures mentioned most often are almost all white men. Lexicon-based methods show that people of African descent are described as acting in ways that imply little agency or power. Word embeddings show that women are most often discussed in the contexts of the home and the workplace. Topic modeling shows that political issues are given more prominence than social ones.
Findings: We also found that more conservative counties tend to purchase textbooks with less representation of women and people of African heritage.
Implications to Theory, Practice and Policy: Building on a rich tradition of textbook analysis, our computational toolkit has been released to support new directions of research.
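The lexicon-based step named in the abstract can be illustrated with a minimal sketch. It is not the study's toolkit: the word lists, the scores, the sample text, and the function names (`count_mentions`, `mean_agency`) are all hypothetical and chosen only to show the general idea of counting group mentions and averaging lexicon scores over the sentences that mention a group.

```python
import re
from collections import Counter

# Hypothetical toy lexicons, invented for illustration only.
GROUP_TERMS = {
    "women": {"women", "woman"},
    "men": {"men", "man"},
}
AGENCY_LEXICON = {
    "led": 1.0, "built": 1.0, "organized": 1.0,
    "suffered": -1.0, "endured": -1.0, "received": -1.0,
}

# Made-up sample passage, not taken from any textbook.
SAMPLE = "The men led the march. The women endured hardship. Men built the railroad."


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def count_mentions(text):
    """Count how often each group's terms appear in the text."""
    counts = Counter()
    for token in tokenize(text):
        for group, terms in GROUP_TERMS.items():
            if token in terms:
                counts[group] += 1
    return counts


def mean_agency(text, group):
    """Average the lexicon scores of words in sentences that mention the group.

    Returns None when no scored word co-occurs with the group.
    """
    scores = []
    for sentence in re.split(r"[.!?]+", text):
        tokens = tokenize(sentence)
        if GROUP_TERMS[group] & set(tokens):
            scores.extend(AGENCY_LEXICON[t] for t in tokens if t in AGENCY_LEXICON)
    return sum(scores) / len(scores) if scores else None


if __name__ == "__main__":
    print(count_mentions(SAMPLE))
    print(mean_agency(SAMPLE, "men"), mean_agency(SAMPLE, "women"))
```

Sentence-level co-occurrence is a deliberate simplification here; a fuller analysis would likely tie each scored verb to its grammatical subject with a dependency parser and use a validated lexicon rather than a hand-written one.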
License
Copyright (c) 2024 Mahshad Nasr Esfahani
This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution (CC-BY) 4.0 License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.