Index talk:Koht jemne mil̦t vьjet.pdf


This book uses an alphabet that is mostly Latin, but with some Cyrillic letters mixed in. There is also a letter that looks like a small 8, which I was initially unable to find in Unicode, so I planned to transcribe it as ŝ until a better solution was found. Update: a better solution has been found, namely S with oblique stroke (Ꞩ, ꞩ).

The following Cyrillic letters are used mixed in with Latin:

  • в
  • є
  • з
  • ь

Some of them possibly have better-suited Latin script variants. I will research this later. Jon Harald Søby (talk) 12:31, 26 February 2019 (UTC)
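
For anyone proofreading pages of this book, here is a minimal Python sketch of how the two points above could be checked mechanically. It is only an illustration, not an established Wikisource tool: the function names `cyrillic_letters` and `replace_placeholder` are made up for this example, and it assumes that every ŝ in the transcription is the provisional placeholder for the "small 8" letter and never a genuine ŝ.

```python
import unicodedata


def cyrillic_letters(text: str) -> dict:
    """Map each distinct Cyrillic character in `text` to its Unicode name."""
    found = {}
    for ch in text:
        # unicodedata.name returns the default ("") for unnamed characters
        name = unicodedata.name(ch, "")
        if name.startswith("CYRILLIC"):
            found[ch] = name
    return found


def replace_placeholder(text: str) -> str:
    """Swap the provisional ŝ/Ŝ for S with oblique stroke (ꞩ/Ꞩ).

    Assumption: ŝ is used only as a placeholder in this transcription.
    """
    # Normalize first so that "s" + combining circumflex is also caught.
    text = unicodedata.normalize("NFC", text)
    table = str.maketrans({"\u015d": "\ua7a9", "\u015c": "\ua7a8"})
    return text.translate(table)


if __name__ == "__main__":
    # Report the Cyrillic letters found in a word from the book's title.
    for ch, name in cyrillic_letters("vьjet").items():
        print(ch, name)
```

Running `cyrillic_letters` over a transcribed page gives a quick list of the Cyrillic code points that are in use, which may help when looking for Latin-script equivalents later.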

@Jon Harald Søby: This is the so-called Unified Northern Alphabet. I also have a relevant publication on the topic (An OCR system for the Unified Northern Alphabet). Michael.riessler (talk) 12:25, 13 November 2019 (UTC)
@Michael.riessler: Oh, that is awesome. Do you have any idea how we can apply this OCR in Wikisource? I'm afraid I have no experience with OCR at all, really. Jon Harald Søby (talk) 12:34, 13 November 2019 (UTC)