Topic Model of a corpus in Digital Government Research

aguileracastillo/topic_model

Unveiling Patterns in Digital Government Research: A Structural Topic Modeling Approach for Literature Review

Abstract: The exponential growth of research output in most fields of knowledge is both a challenge and an opportunity for researchers to apply new computational techniques to scientific inquiry. Digital Government Research (DGR) is a vibrant multidisciplinary field at the intellectual crossroads of established academic disciplines such as information systems, public administration, political science, economics, and innovation studies, among others. This makes it the optimal “arena” for exploring the relationship between digitalization and the public sector workforce. Structural Topic Modeling is a technique for the systematic analysis of large quantities of text data. It enables researchers to perform evidence synthesis on the bibliographic sample considered, map the scientific discipline under review, explore thematic evolution over time, and identify and quantify the prevalence of topics in a selected corpus, thus helping to detect promising areas for further research. This chapter applies a Structural Topic Model (STM), an unsupervised machine-learning technique for classifying and discovering scientific topics, to a corpus of bibliographic records collected and curated in the Digital Government Reference Library (DGRL) version 17.5, a collection of over 16,500 documents that includes journal articles, conference proceedings, and book chapters. For this dissertation, we trained and tested a structural topic model on over 6,600 abstracts from journal articles to estimate, report, and visualize the latent topics in this subset of the Digital Government Reference Library. To the best of our knowledge, this study is the first attempt to apply unsupervised machine learning techniques to a Digital Government Research corpus.
By applying Structural Topic Modeling, we uncovered and explored key themes and research topics in the Digital Government literature. This enabled us to generate insights into the intellectual structure of the field over the years, identify dominant topics in the literature, and estimate which topics are attracting growing interest, thus helping us identify promising areas for research and further inquiry. Among the thirty topics explored in the held-out data, four relate to automation technologies: artificial intelligence, cloud infrastructure, blockchain, and the Internet of Things. All four show increasing topic prevalence over time, meaning that scholarly interest in them is growing within the field. The model also yields one topic containing words related to employment and work, but graphical analysis via the intertopic distance map reveals no content overlap between the automation-technology topics and the topic related to work and employment in the public sector. This indicates a relatively new and promising subfield in the extant literature and opens an opportunity to explore the relationship between automation technologies and the public sector workforce.

Keywords: digital government research, topic modeling, structural topic modeling, literature review, text mining, machine learning, text classification, bag-of-words models
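
As an illustration of what "increasing topic prevalence over time" means, the following is a minimal sketch in Python. It is not the model or code used in this study: a Structural Topic Model estimates prevalence effects jointly with the topics themselves. The sketch assumes a hypothetical, already fitted document-topic proportion matrix (`theta`) and a publication year per document, averages the proportions by year, and fits a simple least-squares slope. All names and the toy data are invented for illustration.

```python
from collections import defaultdict


def mean_prevalence_by_year(theta, years):
    """Average each topic's proportion over the documents published in each year.

    theta: list of per-document topic proportions (each row sums to 1).
    years: list of publication years, one per document.
    Returns a dict mapping year -> list of mean proportions, sorted by year.
    """
    sums = {}
    counts = defaultdict(int)
    for props, year in zip(theta, years):
        if year not in sums:
            sums[year] = [0.0] * len(props)
        for k, p in enumerate(props):
            sums[year][k] += p
        counts[year] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sorted(sums)}


def trend_slope(prevalence, topic):
    """Ordinary least-squares slope of a topic's yearly mean prevalence on year."""
    xs = list(prevalence)
    ys = [prevalence[y][topic] for y in xs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return 0.0  # only one distinct year: no trend can be estimated
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Toy data (hypothetical): two topics, one document per year.
theta = [[0.8, 0.2], [0.6, 0.4], [0.4, 0.6], [0.2, 0.8]]
years = [2019, 2020, 2021, 2022]

prevalence = mean_prevalence_by_year(theta, years)
slopes = [trend_slope(prevalence, k) for k in range(2)]
growing = [k for k, s in enumerate(slopes) if s > 0]
print(growing)
```

A positive slope flags a topic whose average share of the corpus rises over the years. This descriptive shortcut ignores estimation uncertainty, which a full STM analysis would report.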
