Text-to-speech

Lekatha is a text-to-speech (TTS) project that is currently in its infancy. The word "Lekatha" has no meaning of its own; it is coined from two Odia-language words, ଲେଖା (lekha, meaning text) and କଥା (katha, meaning voice), and refers to voice constructed from text using a TTS engine.

Find the source code or fork Lekatha on GitHub.

What does it do?

If Lekatha is a locomotive, its engine is a Python-based TTS tool originally written by Alex I. Ramirez. Thanks to Alex for releasing it as open-source software. The code and workflow for Odia were then created by Subhashish Panigrahi.

Lekatha’s workflow can be understood in four basic steps:

  1. Each phoneme of an Odia word is converted into its Latin-character equivalent using a converter. For this, an Odia–Latin transcription chart (similar to Arpabet) was created that defines a Latin equivalent for each Odia phoneme.
  2. The output from the converter is copied into a text file.
  3. Each phoneme is recorded and saved as a .wav file in a folder. For instance, the Latin equivalent of the Odia phoneme “କା” is “KA”, and its audio file is named KA.wav.
  4. When the tool is run, it asks for a word or phrase, which must be entered in the Latin alphabet. The tool then matches the input against the recorded phonemes and joins them to create the word.
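
The joining in step 4 can be illustrated with a minimal sketch. This is not the code in load.py; it only assumes that each phoneme is stored as <PHONEME>.wav in one folder and that all recordings share the same channel count, sample width and sample rate. The function names phoneme_files and join_phonemes are hypothetical.

```python
import wave
from pathlib import Path


def phoneme_files(transcription, sounds_dir):
    """Map a space-separated transcription such as "KO SI" to .wav paths."""
    return [Path(sounds_dir) / f"{p}.wav" for p in transcription.split()]


def join_phonemes(transcription, sounds_dir, out_path):
    """Concatenate the phoneme recordings into a single .wav file.

    Assumption: every recording uses the same audio format as the first
    one; mixed formats would need converting before they are joined.
    """
    params = None
    frames = []
    for path in phoneme_files(transcription, sounds_dir):
        with wave.open(str(path), "rb") as wav:
            if params is None:
                params = wav.getparams()  # format of the first recording
            frames.append(wav.readframes(wav.getnframes()))
    if params is None:
        raise ValueError("transcription is empty")
    with wave.open(str(out_path), "wb") as out:
        out.setnchannels(params.nchannels)
        out.setsampwidth(params.sampwidth)
        out.setframerate(params.framerate)
        out.writeframes(b"".join(frames))
    return out_path
```

With KO.wav and SI.wav in a folder named sounds, join_phonemes("KO SI", "sounds", "KOSI.wav") would write one file containing both recordings back to back.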

Odia transcription chart

Odia phoneme Latin equivalent Odia phoneme Latin equivalent
OO NY
O T
AA TH
A D
II ଡ଼ RD
ି I RH
II ଢ଼ RDH
I NN
UU TT
U TTH
EE DD
E DDH
OI N
OY P
OOA PH
OA B
OOU BH
OU M
RRU J
RU R
୍ୟ Y LL
K L
KH UO
G S
GH H
UN କ୍ଷ KY
C ଜ୍ଞ GN
CH NG
J MM
JH
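
How a converter could apply such a chart is sketched below. This is an illustration, not the project's converter: it covers only the two consonants (କ, ସ) and five vowel signs that appear in the Test run examples, and it assumes, as those examples suggest, that a consonant without a vowel sign takes the vowel "O". The name transcribe and the three lookup tables are hypothetical.

```python
import unicodedata

# Partial mapping, read off the Test run examples only.
CONSONANTS = {
    "\u0B15": "K",  # କ
    "\u0B38": "S",  # ସ
}
VOWEL_SIGNS = {
    "\u0B3E": "A",   # ା
    "\u0B3F": "I",   # ି
    "\u0B47": "E",   # େ
    "\u0B4B": "OA",  # ୋ
    "\u0B4C": "OU",  # ୌ
}
INHERENT_VOWEL = "O"  # assumed for a consonant with no vowel sign


def transcribe(word):
    """Return the space-separated Latin phonemes for an Odia word."""
    # NFC composes two-part vowel signs into single code points.
    text = unicodedata.normalize("NFC", word)
    phonemes = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch not in CONSONANTS:
            raise ValueError(f"no mapping for character {ch!r}")
        vowel = INHERENT_VOWEL
        if i + 1 < len(text) and text[i + 1] in VOWEL_SIGNS:
            vowel = VOWEL_SIGNS[text[i + 1]]
            i += 1  # the vowel sign has been consumed
        phonemes.append(CONSONANTS[ch] + vowel)
        i += 1
    return " ".join(phonemes)


print(transcribe("\u0B15\u0B38\u0B3F"))  # କସି -> KO SI
```

A full converter would need every row of the chart, plus handling for independent vowels and conjuncts, which this sketch does not attempt.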

Test run

A small set of phonemes was recorded to test how the tool works. Not all of the examples below are real words; some were constructed to see how the tool handles different phoneme combinations.

Odia word   Latin equivalent   Transcription (per the Odia chart)
ସୌକ         SOUKO              SOU KO
ସୋକ         SOAKO              SOA KO
ସିକି         SIKI               SI KI
ସିକା         SIKA               SI KA
କୌସି         KOUSI              KOU SI
କୋସ         KOASO              KOA SO
କେସୌ         KESOU              KE SOU
କେସି         KESI               KE SI
କିସ         KISO               KI SO
କାସା         KASA               KA SA
କସି         KOSI               KO SI
କସା         KOSA               KO SA
କସ          KOSO               KO SO

How to test the tool for your language?

Prerequisites

  • A computer running Linux or macOS (preferably updated to the latest stable OS version)
  • Python 3 or later (download the latest stable version. You might already have Python 2.7 or another version older than Python 3; do NOT delete it.)
  • PyAudio (follow its download and installation instructions)
  • Audacity (download the latest version)
  • A wordlist containing words of your language, in a text file, in the following format:

<Word in Latin alphabet><space><space><Phonetic transcription>

It would look like this:

WORD  WO AR D

For instance, the Odia language word କସି should be added in the text file as:

KOSI  KO SI

The phonetic transcription is different for each language. If your language is written in the Latin alphabet, you can use this tool to create a wordlist that can be used directly with our software. If not, you can first create a phonetic transcription chart like the one shown here for Odia.
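
To make the wordlist format concrete, here is a minimal sketch of how such a file could be read. It assumes only what is stated above: one entry per line, the word and its transcription separated by two spaces, and phonemes separated by single spaces. The function name parse_wordlist is hypothetical and is not taken from the tool.

```python
def parse_wordlist(lines):
    """Turn wordlist lines into a {word: [phoneme, ...]} dictionary."""
    entries = {}
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        # The first run of two spaces separates the word from its phonemes.
        word, sep, transcription = line.partition("  ")
        if not sep or not transcription.strip():
            raise ValueError(
                f"line {number}: expected '<word><space><space><phonemes>'"
            )
        entries[word] = transcription.split()
    return entries


print(parse_wordlist(["KOSI  KO SI", "WORD  WO AR D"]))
# {'KOSI': ['KO', 'SI'], 'WORD': ['WO', 'AR', 'D']}
```

A line with only one space after the word is rejected here, which makes a common typing slip visible before the tool is run.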

Steps

  • Install Python 3, PyAudio and Audacity.
  • Download the tool and unzip it.
  • Go to the “sounds” folder and delete everything in it.
  • Record all the phonemes of your language using Audacity, and name each file exactly as you transcribed the phoneme. For instance, if a phoneme is defined as “KO”, save it as KO.wav. Ideally all the phonemes of your language should be recorded for the tool to work fully, but you can record only a few to test it.
  • Edit “wordlist.txt” inside your folder, replace its contents with a list of words (see the previous section to learn how to create one for your language), and save it.
  • Open your Linux/Mac terminal and use cd FILELOCATION to go to the folder. For instance, if your “Lekatha” folder is on the Desktop, type:

cd Desktop
cd Lekatha

  • Now type:

python3 load.py

You will see the message “Enter a word or phrase:”
  • Type the word in the Latin alphabet, e.g. “KOSI”, and press Enter.
  • The tool should then pronounce the word.
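
If a word is not pronounced, one likely cause is a phoneme in wordlist.txt that has no matching recording. The sketch below is a hypothetical helper, not part of the tool; it assumes the two-space wordlist format described earlier and recordings named <PHONEME>.wav in the “sounds” folder, and it lists the phonemes that still need recording.

```python
from pathlib import Path


def missing_sounds(wordlist_path, sounds_dir):
    """Return the phonemes used in the wordlist that have no .wav file."""
    # File names without the extension, e.g. "KO" for KO.wav.
    recorded = {p.stem for p in Path(sounds_dir).glob("*.wav")}
    missing = set()
    with open(wordlist_path, encoding="utf-8") as handle:
        for line in handle:
            word, sep, transcription = line.strip().partition("  ")
            if sep:  # skip blank or malformed lines
                missing.update(
                    p for p in transcription.split() if p not in recorded
                )
    return sorted(missing)


if __name__ == "__main__":
    # Run from inside the Lekatha folder, next to wordlist.txt and sounds.
    if Path("wordlist.txt").exists():
        print(missing_sounds("wordlist.txt", "sounds"))
```

An empty list means every phoneme in the wordlist has a recording.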

Licensing

  • All software components are licensed under the GNU General Public License v3.0, and the text, audio-visual material and documentation are licensed under the Creative Commons Attribution-ShareAlike 4.0 license.
  • When forking the software, please attribute as follows:
 Original software: Alex I. Ramirez, Apache License 2.0. <https://github.com/alexram1313/text-to-speech-sample>. Derivative: Subhashish Panigrahi, GNU General Public License v3.0. <https://github.com/OdiaWikimedia/Lekatha>
  • When making derivatives of anything other than the software, please attribute as follows:
Subhashish Panigrahi, 2017, CC-BY-SA 4.0