Accion Labs at FSCONS 2014

Ashutosh Bijoor

The [Free Society Conference and Nordic Summit (FSCONS)](https://fscons.org/2014/) is being held in Göteborg, Sweden, from October 31st to November 2nd, 2014.

FSCONS exists to provide a meeting place where subjects covering society, culture and technology can be discussed and brought to life in peer discussions, without being confined to each particular subject area. It should provide both the physical and virtual space where people, organisations and governments, with interest in the three subject areas can meet in a participatory and constructive dialogue. The unique combination of topics creates a platform where cross-pollination between the areas can occur, and where new co-operations and thoughts can emerge which allows the participants to find new inspiration even from areas outside of their own.

One of the speakers at the conference is Accion Labs’ senior consultant Bhavani Shankar Ravindra, or Bhavi, as he likes to be called. Apart from his work at Accion Labs, he is a regular contributor to the open source community, especially the Ubuntu and Debian communities, as a developer and package maintainer. His research interests in the Ubuntu/Debian community revolve around mobile networking, natural language processing (NLP), software compliance and localization (building and sustaining local teams). At FSCONS 2014, Bhavi has been invited to speak on NLP and on sustaining localized activities effectively through teams. In his first talk, he intends to speak about natural language processing and platform-independent speech recognition, using Indian languages as an example of how it is achieved. In his second talk, Bhavi intends to speak on building and sustaining local teams and localized activities in an international open source community like Ubuntu, as he introduces the working of the Ubuntu Local Community Council: https://wiki.ubuntu.com/LoCoCouncil

ABSTRACT: PLATFORM AND LANGUAGE-INDEPENDENT FRAMEWORK FOR SPEECH RECOGNITION

The aim is to create an easily extensible framework that takes speech input in any language, queries a dataset of content and returns a result set. The demo system takes a speech query in Telugu (a South Indian language) and converts it into text. The resulting text is refined with the help of part-of-speech (POS) taggers, and the relevant information is retrieved and delivered as speech in the same language. The system uses standard open source tools for speech recognition, synthesis and POS tagging, fine-tuned to produce the necessary results, with Indian languages as an example.

In this presentation, Bhavi would like to show how the above is implemented through the following steps:
Preparation of global phone set which typically replaces Latin letters with a unique combination of English letters
Transliteration of the available data in Unicode with the use of global phone set
Write and test parsing to chunk the transcription into individual phonemes
Use of ehmm in festvox or HTKAlign to do automatic labeling of speech into phonemes
Use of speech tools such as festvox or HTS to train the voice
Use of festival to generate the synthetic speech
Write and test JavaScript for the Text-To-Speech synthesis system
Write code to automatically label the speech data that is to be used for speech recognition
Prepare a dictionary with each word in the vocabulary against its corresponding phoneme representation
Use of speech tools such as SphinxTrain or HTK to train the acoustic models for speech recognition
Use of speech tools such as CMUCLMTK or HLMTools to prepare language model
Build and test the system with sphinxdecode or HDecode
Collect sufficient transcribed phrases and corresponding exemplar recordings to test the accuracy of system in decoding
Enhance the system by passing speech to voice activity detection or noise reduction algorithms before decoding
Use of POS tagging and word-sense disambiguation to retrieve the necessary information from the resultant output of recognition
Integration of all the modules and preparation of a simple interface
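
The accuracy test on transcribed phrases mentioned in the steps above is commonly reported as word error rate (WER). The talk abstract does not say which metric is used, so the following is only a minimal sketch, assuming WER computed as the word-level edit distance divided by the number of reference words. The function name and the romanized sample phrases are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance between the two strings, divided by the
    number of words in the reference transcription."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcription must not be empty")
    # prev[j] holds the edit distance between the reference words seen so
    # far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, 1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from output
            insertion = cur[j - 1] + 1  # extra word in the decoder output
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)


# Hypothetical romanized reference and decoder output.
print(word_error_rate("nenu telugu maatlaadutaanu", "nenu telugu"))
```

A lower value means the decoder output is closer to the reference transcription; 0.0 means the two match word for word.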

Here is a bit more detail about the implementation and the challenges involved. The system mainly requires two important modules:

Speech-To-Text
Text-To-Speech

Along with these, some knowledge of dialogue systems and POS tagging is required.

Tasks involved:

SPEECH-TO-TEXT

Collection of audio data and corresponding text.
Text in UTF-8 format and its transliteration to IT3 or Roman
Construction of pre-defined dictionary based on given vocabulary
Automatic labeling of data
Preparation of acoustic models
Preparation of a Language Model (LM) or Finite State Grammar (FSG) in the case of CMUSphinx
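
To illustrate the dictionary task above: CMUSphinx reads a plain-text pronunciation dictionary with one entry per line, the word followed by its phones separated by spaces. The sketch below writes such lines from a word-to-phones mapping. The function name, the romanized words and their phone sequences are hypothetical and only show the layout.

```python
def build_dictionary(pronunciations):
    """Return dictionary lines of the form 'word phone1 phone2 ...',
    sorted by word."""
    lines = []
    for word in sorted(pronunciations):
        phones = pronunciations[word]
        if not phones:
            raise ValueError(f"no phones given for {word!r}")
        lines.append(f"{word} {' '.join(phones)}")
    return lines


# Hypothetical vocabulary with hand-written phone sequences.
vocabulary = {
    "telugu": ["t", "e", "l", "u", "g", "u"],
    "raama": ["r", "aa", "m", "a"],
}
for line in build_dictionary(vocabulary):
    print(line)
```

Every phone used in the dictionary also has to appear in the phone list that the acoustic model is trained on, which is one reason to fix the phone set before building the dictionary.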

TEXT-TO-SPEECH

Collection of audio data and corresponding text.
Text in UTF-8 format and its transliteration to IT3 or Roman
Language parser for chunking transcription into phones
Automatic labeling of data
Question file preparation and acoustic models in case of HTS
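
The language parser listed above can be pictured as a function that walks a Telugu word code point by code point and emits phone labels. The following is a minimal sketch, not the parser used in the talk. The phone labels are a hypothetical, deliberately tiny mapping (not IT3), and the only script rules handled are the inherent vowel "a" after a consonant, dependent vowel signs, and the virama that suppresses the inherent vowel.

```python
# Hypothetical miniature phone set: Telugu code point -> ASCII phone label.
CONSONANTS = {
    "\u0c15": "k",  # TELUGU LETTER KA
    "\u0c17": "g",  # TELUGU LETTER GA
    "\u0c24": "t",  # TELUGU LETTER TA
    "\u0c2e": "m",  # TELUGU LETTER MA
    "\u0c30": "r",  # TELUGU LETTER RA
    "\u0c32": "l",  # TELUGU LETTER LA
}
VOWEL_SIGNS = {
    "\u0c3e": "aa",  # TELUGU VOWEL SIGN AA
    "\u0c3f": "i",   # TELUGU VOWEL SIGN I
    "\u0c41": "u",   # TELUGU VOWEL SIGN U
    "\u0c46": "e",   # TELUGU VOWEL SIGN E
}
INDEPENDENT_VOWELS = {
    "\u0c05": "a",   # TELUGU LETTER A
    "\u0c06": "aa",  # TELUGU LETTER AA
    "\u0c07": "i",   # TELUGU LETTER I
}
VIRAMA = "\u0c4d"  # suppresses the inherent vowel of the preceding consonant


def to_phones(word):
    """Chunk a Telugu word into a list of phone labels."""
    phones = []
    inherent_pending = False  # True right after a consonant with no vowel yet
    for ch in word:
        if ch in CONSONANTS:
            if inherent_pending:
                phones.append("a")
            phones.append(CONSONANTS[ch])
            inherent_pending = True
        elif ch in VOWEL_SIGNS:
            phones.append(VOWEL_SIGNS[ch])
            inherent_pending = False
        elif ch == VIRAMA:
            inherent_pending = False
        elif ch in INDEPENDENT_VOWELS:
            if inherent_pending:
                phones.append("a")
            phones.append(INDEPENDENT_VOWELS[ch])
            inherent_pending = False
        else:
            raise ValueError(f"character {ch!r} is not in the phone set")
    if inherent_pending:
        phones.append("a")
    return phones


# The word "Telugu" written in Telugu script, given as escapes.
print(" ".join(to_phones("\u0c24\u0c46\u0c32\u0c41\u0c17\u0c41")))  # t e l u g u
```

Raising an error on unknown characters, rather than skipping them, makes gaps in the phone set visible early when the mapping is extended to other languages.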

CHALLENGES INVOLVED

Collection of a large amount of data and its corresponding text for ASR (availability of speakers for recording)
Preparation of global phone-set (so that the work can be easily extended to other languages)
Language parser development (for chunking into phones – depends on transliteration)
Automatic labeling of data (use of ehmm or HTKAlign)
Use of noise-reduction algorithm or voice activity detection algorithm to enhance the system
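
As an illustration of the last point, a very simple form of voice activity detection marks a frame as speech when its short-term energy exceeds a threshold. The abstract does not name the algorithm actually used, so this is only a sketch under stated assumptions: samples are floats in the range -1.0 to 1.0, frames do not overlap, and the frame length and threshold are hypothetical values that would need tuning on real recordings.

```python
import math


def frame_energies(samples, frame_len):
    """Mean squared amplitude of each full, non-overlapping frame."""
    return [
        sum(s * s for s in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def detect_speech(samples, frame_len=160, threshold=0.01):
    """Return one boolean per frame: True when the frame energy exceeds
    the threshold (both defaults are assumed values, not tuned ones)."""
    return [energy > threshold for energy in frame_energies(samples, frame_len)]


# Synthetic input: silence, then a 440 Hz tone sampled at 16 kHz, then silence.
silence = [0.0] * 320
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(320)]
print(detect_speech(silence + tone + silence))
# [False, False, True, True, False, False]
```

Only the frames flagged True would be passed on to the decoder; production systems typically add smoothing across frames and an adaptive noise estimate.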

Watch this space for a copy of his presentation after the event.

WORKSHOP: BUILDING LOCAL COMMUNITIES AND MAKING THEM ROCK!

In this talk, Bhavi would like to showcase the work of the Ubuntu LoCo Council in handling and sustaining local communities. More on the Ubuntu LoCo Council here: https://wiki.ubuntu.com/LoCoCouncil

Specifically, Bhavi would like to show how the Ubuntu LoCo Council handles teams:

To provide independent guidance for the LoCo Community.
To maintain the quality of governance in the LoCo community.
To assess and re-assess teams for verified state.
To provide input and feedback to other Ubuntu governance boards regarding the needs and achievements of the LoCo community.
To act as an independent, objective third party to resolve conflict in teams by acting as a mediator for a group or individuals.
How we allocate resources.
How we help people to create and sustain teams, and so on.