Digital Humanities Lab


11 March 2020
11:00 - 11:30
Bushuis, VOC zaal, University of Amsterdam

Jelte van Boheemen and Ortal-Paz Saar will present a paper on extracting dates from Hebrew and Aramaic texts


DHLab developer Jelte van Boheemen and Ortal-Paz Saar have been invited to present a paper on extracting dates from Hebrew and Aramaic texts at the Time and Society in the Ancient World Conference. The event will be held at the University of Amsterdam from 9 to 11 March 2020. This multidisciplinary conference addresses the question of how social constructs of time shaped political, social, cultural and artistic life from the third millennium BC until the Early Middle Ages. In their paper, Jelte van Boheemen and Ortal-Paz Saar will address time in Hebrew and Aramaic historical texts from a digital humanities perspective.

Hebrew and Aramaic historical texts, such as funerary inscriptions from 500 BCE onward, rabbinic and responsa literature, or Memorbücher, are rich in temporal information. Many such corpora have been digitized, yet there is no efficient system for automatically extracting the temporal information (explicit or relative) from them. The few date parsers that support Hebrew are designed for modern texts with standard notations and Arabic numerals, so they cannot be used on historical texts. In the latter, dates are invariably expressed through Hebrew letters, not numerals. For example, the number 14 is written as two letters, yod and dalet (יד). When taken as a word, this two-letter sequence may also be read as yad, “hand”. Additionally, the dating conventions vary widely, for instance: “since the Temple destruction”, “in the reign of King Seleucus”, or “the third year of the seven-year cycle”. This variation makes it even harder to recognize which date is mentioned in a text. The present paper will describe the development of a novel algorithm suited to automatically extracting dates from historical Hebrew and Aramaic texts.
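
To illustrate the letter-numeral notation mentioned above, here is a minimal Python sketch that sums the conventional numeric values of Hebrew letters. It is not the algorithm described in the paper; the function name `hebrew_numeral_value` and the simplifications (final letter forms share the value of their regular forms, and geresh/gershayim marks are simply ignored) are assumptions made for this example.

```python
import unicodedata

# Conventional numeric values of the Hebrew letters, keyed by Unicode name.
# Assumption for this sketch: final forms carry the same value as the
# regular forms of the same letter.
_VALUES_BY_NAME = {
    "ALEF": 1, "BET": 2, "GIMEL": 3, "DALET": 4, "HE": 5,
    "VAV": 6, "ZAYIN": 7, "HET": 8, "TET": 9,
    "YOD": 10, "KAF": 20, "LAMED": 30, "MEM": 40, "NUN": 50,
    "SAMEKH": 60, "AYIN": 70, "PE": 80, "TSADI": 90,
    "QOF": 100, "RESH": 200, "SHIN": 300, "TAV": 400,
    "FINAL KAF": 20, "FINAL MEM": 40, "FINAL NUN": 50,
    "FINAL PE": 80, "FINAL TSADI": 90,
}
HEBREW_VALUES = {
    unicodedata.lookup(f"HEBREW LETTER {name}"): value
    for name, value in _VALUES_BY_NAME.items()
}

# Geresh and gershayim often mark a letter sequence as a numeral;
# ASCII quotes are common stand-ins in digitized text.
NUMERAL_MARKS = {"\u05F3", "\u05F4", "'", '"'}


def hebrew_numeral_value(token):
    """Return the additive value of a Hebrew letter sequence.

    Returns None if the token is empty or contains anything other than
    Hebrew letters and numeral marks.
    """
    letters = [ch for ch in token if ch not in NUMERAL_MARKS]
    if not letters or any(ch not in HEBREW_VALUES for ch in letters):
        return None
    return sum(HEBREW_VALUES[ch] for ch in letters)


print(hebrew_numeral_value("יד"))  # yod (10) + dalet (4) = 14
```

The function returns 14 for יד whether the sequence was meant as a number or as the word yad, so a real date extractor would still need surrounding context to decide between the two readings.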