Data Analysis and Visualization for the Humanities and Social Sciences
Full course description
Research data in the humanities and social sciences can take many forms. It is frequently rich and complex, filled with uncertainties and difficulties in its encoding, analysis and structure. The amount of data we have to deal with today can be overwhelming, both for research and in our personal lives. Harnessing the power of large data stores for research in the humanities and social sciences is a core objective of this course.
To work with tens, hundreds, or even thousands of texts, we do not expect you to read as you typically do for your studies (i.e., close reading: one word, one paragraph, one page after the other) but rather to read digitally. Digital ‘reading’ of texts goes by many names, including data analysis, text analysis, text mining, and data mining. This course focuses on the first of these, data analysis, an algorithm-driven method of extracting text from (large) corpora, and applies it to literary and historical sources as well as social media. The data analysis tools we will introduce you to visualise the text, making it easier to see patterns, come to insights, and develop research questions in minutes or hours, where previously this might have taken days, months or years. We will explore these methods and practices through distant reading, a recent concept used to theorise the practice of reading algorithmically.
This course will take you through a mini big data project to give you hands-on experience and an understanding of the affordances and limitations of data analysis methods. No background in the methods and no programming skills are needed; we will be using easy-to-learn web-based tools and software. Theoretically, we will explore how representing text in more visual formats, which are typically removed from its semantic contexts, offers opportunities both for new insights and for misrepresentation. Concepts to be covered include distant reading, algorithmic visualisation, and data feminism. An overarching goal of the course is to help you become more savvy users of digital information, aware of the implications and challenges that these methods and technologies pose to conventional research, analysis and publication in the arts, humanities, and social sciences, including issues such as transparency, authenticity, and bias.
Course objectives
- Explore different methodological approaches to computationally analyse textual corpora;
- Use text analysis to develop and respond to research hypotheses and questions;
- Understand how to analyse text (non-semantically) through visualisations;
- Critically reflect on the challenges researchers face when working with textual data through new concepts, such as distant reading and data feminism.
Prerequisites
None.
Recommended reading
- Jänicke, S., Franzini, G., Cheema, M.F., and Scheuermann, G. (2015). On Close and Distant Reading in Digital Humanities: A Survey and Future Challenges. In R. Borgo, F. Ganovelli, and I. Viola (Eds.) Eurographics Conference on Visualization (EuroVis).
- Leurs, K. (2017). Feminist data studies: using digital methods for ethical, reflexive and situated socio-cultural research. Feminist Review, 115(1), 130-154.
- Sinclair, S. and Rockwell, G. (2016). Text Analysis and Visualization: Making Meaning Count. In S. Schreibman, R. Siemens, and J. Unsworth (Eds.) A New Companion to Digital Humanities (pp. 274–90). Wiley Blackwell.