PhilSci Archive

LDA Topic Modeling: Contexts for the History & Philosophy of Science

Allen, Colin and Murdock, Jaimie (2020) LDA Topic Modeling: Contexts for the History & Philosophy of Science. [Preprint]

LDA Topic Modeling for HPS_PHILSCI-ARCHIVE.pdf



In this paper we discuss the application of LDA topic modeling to questions that interest historians & philosophers of science, which we illustrate primarily through our own work on modeling Charles Darwin's reading and writing behavior. We discuss the need to go beyond simplistic presentations of topic models that tend to give scholars the idea that the algorithms produce results that are superficial and perhaps unreliable. The ways in which topic models are often misrepresented and misunderstood frame our attempt to convince readers that, despite appearances, topic modeling provides a lot more of value to HPS research than merely providing for enhanced search and information retrieval from large sets of documents. Rather than "topics" we prefer to think of these topic models as revealing contexts for individual reading and writing, leading us to ask questions about the individual exploration and exploitation of the materials to which a scientist such as Darwin had access. We discuss the use of topic models as tools for identifying influence and measuring creativity within those contexts and conclude that the interplay between human intelligence and sophisticated algorithms will expand the range of questions about science that HPS scholars will ask, and can answer.


Item Type: Preprint
Creators: Allen, Colin (prof.colin.allen@gmail.com, ORCID 0000-0003-4497-1725)
Murdock, Jaimie (jaimie.murdock@gmail.com, ORCID 0000-0002-1732-5499)
Additional Information: Preprint of a chapter forthcoming in Ramsey, G., De Block, A. (Eds.) The Dynamics of Science: Computational Frontiers in History and Philosophy of Science. Pittsburgh University Press; Pittsburgh.
Keywords: text mining, topic modeling, context, meaning, Darwin, HPS
Subjects: General Issues > Data
General Issues > History of Science Case Studies
Depositing User: Colin Allen
Date Deposited: 30 May 2020 00:39
Last Modified: 30 May 2020 00:39
Item ID: 17261
Date: 29 May 2020
