Kathabhidhana, our pronunciation toolkit, is an open toolkit for recording the pronunciations of a large number of words. It consists of free/libre and open source software, open datasets, methodologies and documentation. It can be used to record pronunciations of words to make a talking dictionary, or to record phonemes to build text-to-speech software.
A tool with many faces
Wikipedia has a sister project called Wiktionary, a multilingual dictionary where you can find not just the meanings of words in your own language but also the equivalent meanings of foreign-language words. Unlike many available dictionaries that help you learn pronunciations, Wiktionary does not have pronunciations for all words in all languages. Kathabhidhana was originally started by Subhashish Panigrahi to add pronunciations to the Odia-language Wiktionary. It is adapted from free software created by Shrinivasan T. It works on both Linux and Mac. The iOS version of Kathabhidhana was created by Prateek Pattanaik. You can certainly create pronunciations and add them to Wiktionary. But you can also use Kathabhidhana beyond that, by building a large library of pronunciations that can be used to build any machine learning or Natural Language Processing (NLP) tool.
What does this toolkit contain?
- Using a computer?
  - Linux or macOS
  - Linux running in a virtual machine
- Using an iOS device? (check more here)
  - iOS (iPad or iPhone)
  - Workflow (app)
How to use it?
- Download and set up Kathabhidhana (see the next section)
- Set up your recording hardware (see mine in the picture above), e.g. a microphone (if using an external one), and computer settings such as the input level
- Record using Kathabhidhana
- Batch-process the recordings using Audacity (tutorial coming soon, download Audacity from here)
- Manually clean up each file (tutorial coming soon)
- Set up Pattypan and upload the files to Commons (download from here)
The installation can be done from the command line on a Linux or Mac computer, or on any iOS device.
ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)" < /dev/null 2> /dev/null
brew install portaudio
brew install vorbis-tools
sudo easy_install pip
sudo pip install pyaudio
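Once the packages are installed, a quick check can confirm that the pieces are visible before recording. The snippet below is a minimal sketch, not part of Kathabhidhana itself: it assumes Python 3, that PyAudio is importable under the module name `pyaudio`, and that the `oggenc` encoder shipped with vorbis-tools is what the .ogg export relies on. The function name `check_dependencies` is hypothetical.

```python
import importlib.util
import shutil


def check_dependencies():
    """Report which recording dependencies are visible on this machine."""
    return {
        # PyAudio is imported under the module name "pyaudio".
        "pyaudio": importlib.util.find_spec("pyaudio") is not None,
        # vorbis-tools ships the oggenc encoder (assumed to be needed for .ogg output).
        "oggenc": shutil.which("oggenc") is not None,
    }


if __name__ == "__main__":
    for name, found in check_dependencies().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

If either entry is reported as missing, rerunning the matching install command above is the first thing to try.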
Running the tool
1. Open the file named “file” and replace its contents with the words you want to record
2. Run the command below (it records each word as both .wav and .ogg)
python voice-record.py 2> err
This will record the sounds in .ogg and .wav formats. You can then use a tool like Pattypan to batch-upload either the .wav or the .ogg files to Wikimedia Commons.
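The recording step can be pictured as a loop over the word list that saves one audio file per word. The sketch below illustrates that idea only and is not the actual `voice-record.py`. It assumes Python 3, a word list with one word per line, and a caller-supplied `capture` function that returns raw 16-bit PCM bytes for a word. The names `read_words`, `save_wav`, `record_all` and `capture` are all hypothetical.

```python
import wave
from pathlib import Path


def read_words(list_path):
    """Return the non-empty lines of the word list, one word per line."""
    text = Path(list_path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def save_wav(frames, out_path, rate=44100, channels=1, sample_width=2):
    """Write raw PCM bytes to a .wav file (16-bit mono by default)."""
    with wave.open(str(out_path), "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)
        wav_file.setframerate(rate)
        wav_file.writeframes(frames)


def record_all(list_path, out_dir, capture):
    """Call capture(word) for each word and save the result as <word>.wav."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for word in read_words(list_path):
        target = out_dir / f"{word}.wav"
        save_wav(capture(word), target)
        saved.append(target)
    return saved
```

For a dry run without a microphone, `capture` can be a stand-in that returns silence, for example `record_all("file", "recordings", lambda word: b"\x00\x00" * 44100)`, which writes one second of silence per word. Naming each file after its word keeps the recordings easy to match to dictionary entries when preparing a batch upload.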
Findings so far
- The project is led by Subhashish Panigrahi, and the iOS tool is led by Prateek Pattanaik. All media and text content is available under a CC-BY-SA 4.0 license
- All software components are licensed under the GNU General Public License (GPL) version 3 (read the License page for more details)
- This project and part of the documentation are based on the Voice recorder for Tawiktionary project created by Shrinivasan T (please attribute Shrinivasan T if you’re making a derivative of the software)
- Panigrahi, Subhashish. “A simple command-line tool for recording audio”. Opensource.com (May 12, 2017)
- Ojha, Bikash. Mishra, Chinmayee. Pattanaik, Prateek. Panigrahi, Subhashish. Patnaik, Sailesh. Elsharbaty, Samir. “Community digest: As Odia Wikisource turns two, a project to digitize rare books kicks off; news in brief”. Wikimedia Blog (March 30, 2017)
- Rezwan. “A New Audio Uploading Tool for Crowdsourced Wiktionary Project in Odia Language”. Global Voices (February 13, 2017)
- Workshop: “Kathabhidhana: Recording words for Wiktionary and preparing for an AI assistant”. Wikimania 2017, Montreal, Canada. (Selected for workshop on August 12. Check back in late August for more updates about the workshop)
- “Kathabhidhana, open source toolkit to record pronunciations of any world language”. Celtic Knot Conference 2017, University of Edinburgh. (Selected, Workshop on July 6)