Disentangling mixed use of script and language in the Ottoman Empire

The heterogeneity of the Ottoman Empire ensured that its written records contain a variety of language and script combinations. At the same time, the dislocation and regulation of its constituent populations have made such materials difficult to find in existing archives. By training a small language model on expert annotations, we are uncovering and cataloging hidden texts, along with the practices they reflect, from communities that often no longer exist in living memory.

Researchers

Outcomes