Utrecht University

Digital Humanities Lab

Tips & Tutorials

Societies


Journals


Diversity


Text Mining and Topic Modelling

  • The Programming Historian: tutorials on Digital Humanities tools and methods in English, Spanish and French, suitable for beginners. Its text analysis tutorials are grouped under ‘distant reading’ (a term coined by Franco Moretti).
  • Quanthum: a treasure trove of interesting material.
  • Voyant Tools
  • AntConc: a tool for concordancing.
  • LancsBox: a tool that incorporates many methods from computational linguistics. It lets you compare corpora and supports POS tagging, so you can, for instance, count the verbs, nouns, etc. in your corpus.
  • Iramuteq: French GUI for text mining in R
  • Article surveying (some) text analysis packages for R, published in Language Resources and Evaluation, Vol. 53, Issue 4 (December 2019).
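
For readers new to the term, concordancing (the task AntConc is listed for above) means showing every occurrence of a keyword together with the words around it. The sketch below is only an illustration of that idea in plain Python, not how AntConc or any other tool in this list works internally; the function name `kwic`, the `width` parameter and the sample sentence are assumptions made up for the example.

```python
import re


def kwic(text, keyword, width=3):
    """Return keyword-in-context lines for each occurrence of `keyword`.

    Hypothetical helper: shows `width` tokens on either side of every hit.
    Tokenisation is a crude lowercase word split, which is an assumption.
    """
    tokens = re.findall(r"\w+", text.lower())
    target = keyword.lower()
    lines = []
    for i, tok in enumerate(tokens):
        if tok == target:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            # Mark the hit with brackets so it stands out in the context line.
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines


# Made-up sample text, used only to demonstrate the output format.
sample = "The historian reads the archive. The archive shapes what the historian can ask."

for line in kwic(sample, "archive"):
    print(line)
```

Dedicated tools add sorting, regular-expression search and corpus-level statistics on top of this basic keyword-in-context view.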

Books

  • Shawn Graham, Ian Milligan and Scott B. Weingart, Exploring Big Historical Data: The Historian’s Macroscope. Pre-draft version available here.
  • Ashish Kumar, Avinash Paul, Mastering Text Mining with R, Packt Publishing.

Tutorials


Text mining in R


Courses


Utrecht