Big data

Big data refers to extremely large and/or complex datasets, and to the methods used to manage and analyse them.

quod: A Tool for Querying and Organising Digitised Historical Documents
EN
This blog post from EHRI introduces 'quod' (querying OCRed documents), a prototype Python-based command line tool for OCRing and querying digitised historical documents, which can be used to organise large collections and improve information about provenance. To demonstrate its use in context, the post takes the reader through a case study of the International Tracing Service, showing the workflows and the steps taken from start to finish.
Authors
Reinier De Valk
EHRI in TEITOK
EN
This blog post examines TEITOK, a corpus framework used as an alternative to Omeka. TEITOK is centred on texts and resembles the Omeka interface: both let you search through the documents and display their transcriptions. The main difference is that Omeka treats the transcription as an object description, whereas TEITOK shows not only that a word appears in a document, but also where it appears and how it is used.
Authors
Maarten Janssen
Computational Museology
EN
This keynote lecture, delivered by Sarah Kenderdine at the DARIAH Annual Event 2021, explores how computation has become ‘experiential, spatial and materialized; embedded and embodied’.
Authors
Sarah Kenderdine
Cultural Big Data - Building a European Internet of Cultural Things
EN
In this lecture, Mark Cote takes us on a journey through a host of research projects that contextualise the way that he and other researchers try to address cultural data.
