Special Courses
Text Data in Economics
Lecturer
Dr Benjamin Arold (University of Cambridge)
Summary
Much of human knowledge is stored in unstructured formats, in particular in written text. This course teaches methods to process and analyze text data. The learning goals are to understand the structure and concepts of text-as-data methods and to evaluate the use of text-analysis tools in economics research. The course concludes with an overview of non-standard data in economics beyond text, in particular audio and image data. The course covers the 10 topics listed under Content. For most topics, a theoretical lecture comes first, followed by a discussion of a recent research paper in economics/NLP. The paper discussion is run as a collaborative seminar in which students take turns presenting and discussing the papers.
Schedule
05.01.2027: 12:00-16:30
06.01.2027: 12:00-16:30
07.01.2027: 12:00-16:30
08.01.2027: 10:00-14:30
Venue
ifo Institute Dresden
Content
– Overview
– Dictionaries
– Tokenization & Distance
– Unsupervised and Supervised ML with Text
– Word Embeddings and Linguistic Parsing
– Embedding Sequences with Attention, LLMs
– Generative AI; and Using Transformers for YOUR Research
– Image Data in Business & Economics
– Audio Data in Business & Economics
– Ethical Considerations
Course requirements
Attendance at all lectures is mandatory and a prerequisite for taking the exam.
The final assessment consists of a short research paper in which each student develops an original research design using text as data or other non-standard data covered in the course. The paper should state a clear research question, situate it in the relevant literature, and outline the proposed empirical design. This includes the data sources, the text, image, or audio processing methods to be used, the measurement strategy, and, where applicable, the identification approach. Students should also reflect on relevant methodological and ethical issues discussed in the course. The focus is on demonstrating a sound understanding of text-as-data methods and their appropriate use in economic research, rather than on producing completed empirical results.
Required reading
Nine recent scientific papers in applied economics that use NLP or related methods. The list of papers will be sent to students before the course.
Registration
Please register for the course by November 30, 2026 by sending an e-mail to: yvonne.bludau@tu-dresden.de
Subject
Registration Text Data in Economics.