Accion Labs at FSCONS 2014

Ashutosh Bijoor

The Free Society Conference and Nordic Summit (FSCONS) is being held in Göteborg, Sweden, from October 31 to November 2, 2014.

FSCONS exists to provide a meeting place where subjects covering society, culture and technology can be discussed and brought to life in peer discussions, without being confined to any one subject area. It aims to provide both the physical and the virtual space where people, organisations and governments with an interest in the three subject areas can meet in a participatory and constructive dialogue. This combination of topics creates a platform for cross-pollination between the areas, where new collaborations and ideas can emerge and participants can find inspiration even from areas outside their own.

One of the speakers at the conference is Accion Labs’ senior consultant Bhavani Shankar Ravindra, or Bhavi, as he likes to be called. Alongside his work at Accion Labs, he is a regular contributor to the open source community, especially the Ubuntu and Debian communities, as a developer and package maintainer.
His research interests in the Ubuntu/Debian community revolve around mobile networking, natural language processing (NLP), software compliance, and localization (building and sustaining local teams). At FSCONS 2014, Bhavi has been invited to speak on NLP and on sustaining localized activities effectively through teams.
In his first talk, he intends to speak about natural language processing and platform-independent speech recognition, using Indian languages as an example of how it is achieved.
In his second talk, Bhavi intends to speak about building and sustaining local teams and localized activities in an international open source community like Ubuntu, introducing the work of the Ubuntu Local Community Council: https://wiki.ubuntu.com/LoCoCouncil

Abstract: Platform and Language-Independent Framework for Speech Recognition

The goal is to create an easily extensible framework that takes speech input in any language, queries a dataset of content, and returns a result set. The demo system takes a speech query in Telugu (a South Indian language) and converts it into text. The resulting text is refined with the help of POS taggers, and the relevant information is retrieved and returned as speech in the same language. The system uses standard open source tools for speech recognition, synthesis and POS tagging, fine-tuned to produce the necessary results, with Indian languages as an example.

In this presentation, Bhavi would like to show how the above is implemented through the following steps:

  • Preparation of a global phone set, which typically replaces Latin letters with a unique combination of English letters, based on the link below:
    http://homepage.ntlworld.com/stone-catend/trimain1.html
  • Transliteration of the available Unicode data using the global phone set
  • Write and test a parser to chunk the transcription into individual phonemes
  • Use of ehmm in festvox or HTKAlign to do automatic labeling of speech into phonemes
  • Use of speech tools such as festvox or HTS to train the voice
  • Use of festival to generate the synthetic speech
  • Write and test JavaScript for the Text-To-Speech synthesis system
  • Write code to automatically label the speech data used for speech recognition
  • Prepare a dictionary that maps each word in the vocabulary to its phoneme representation
  • Use of speech tools such as SphinxTrain or HTK to train the acoustic models for speech recognition
  • Use of speech tools such as CMUCLMTK or HLMTools to prepare the language model
  • Build and test the system with sphinxdecode or HDecode
  • Collect sufficient transcribed phrases and corresponding exemplar recordings to test the system's decoding accuracy
  • Enhance the system by passing speech through voice activity detection or noise reduction algorithms before decoding
  • Use of POS tagging and word-sense disambiguation to retrieve the necessary information from the recognition output
  • Integration of all the modules and preparation of a simple interface
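
One of the steps above is testing decoding accuracy against transcribed phrases. The abstract does not say which metric is used; a common choice is word error rate (WER), the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch under that assumption, and the function name is hypothetical:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in recognizer output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

For example, `word_error_rate("open the door", "open door")` counts one deleted word against three reference words. Averaging this over the collected test phrases gives a single accuracy figure to track while tuning the models.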

Here is a bit more detail about the implementation and the challenges involved. The system mainly requires two important modules:

  • Speech-To-Text
  • Text-To-Speech

Along with these, some knowledge of dialogue systems and POS tagging is required.

Tasks involved:

Speech-To-Text

  • Collection of audio data and corresponding text.
  • Text in UTF-8 format and its transliteration to IT3 or Roman
  • Construction of pre-defined dictionary based on given vocabulary
  • Automatic labeling of data
  • Preparation of acoustic models
  • Preparation of Language Model (LM) or Finite State Grammar (FSG) in case of CMUSphinx
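
To make the transliteration and dictionary tasks concrete, here is a rough sketch in Python. It assumes a tiny, hypothetical phone set covering only a handful of Telugu characters; the phone symbols are illustrative and are not the IT3 scheme or the phone set used in the talk. The output uses the word-followed-by-phones layout of CMUSphinx-style pronunciation dictionaries.

```python
# Hypothetical mini phone set for a few Telugu characters (illustration only).
CONSONANTS = {
    "\u0C15": "k",   # TELUGU LETTER KA
    "\u0C2E": "m",   # TELUGU LETTER MA
    "\u0C32": "l",   # TELUGU LETTER LA
}
INDEPENDENT_VOWELS = {
    "\u0C05": "a",   # TELUGU LETTER A
    "\u0C06": "aa",  # TELUGU LETTER AA
}
VOWEL_SIGNS = {
    "\u0C3E": "aa",  # TELUGU VOWEL SIGN AA
    "\u0C3F": "i",   # TELUGU VOWEL SIGN I
    "\u0C41": "u",   # TELUGU VOWEL SIGN U
}
VIRAMA = "\u0C4D"     # TELUGU SIGN VIRAMA: suppresses the inherent vowel
INHERENT_VOWEL = "a"  # vowel implied by a bare consonant


def to_phones(word):
    """Convert a Telugu word into a list of phone symbols."""
    phones = []
    i = 0
    while i < len(word):
        ch = word[i]
        if ch in CONSONANTS:
            phones.append(CONSONANTS[ch])
            nxt = word[i + 1] if i + 1 < len(word) else ""
            if nxt in VOWEL_SIGNS:
                phones.append(VOWEL_SIGNS[nxt])  # explicit vowel sign
                i += 1
            elif nxt == VIRAMA:
                i += 1                           # no vowel after this consonant
            else:
                phones.append(INHERENT_VOWEL)
        elif ch in INDEPENDENT_VOWELS:
            phones.append(INDEPENDENT_VOWELS[ch])
        else:
            raise ValueError(f"character {ch!r} is not in the phone set")
        i += 1
    return phones


def dictionary_lines(vocabulary):
    """One 'word phone phone ...' line per unique word, sorted."""
    return [f"{word} {' '.join(to_phones(word))}" for word in sorted(set(vocabulary))]


if __name__ == "__main__":
    # "kaalu" and "amma", written with escapes to keep the code points explicit
    for line in dictionary_lines(["\u0C15\u0C3E\u0C32\u0C41", "\u0C05\u0C2E\u0C4D\u0C2E"]):
        print(line)
```

A real phone set would need to cover the full script, including conjuncts, the anusvara and other signs that this sketch rejects with an error.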

Text-To-Speech

  • Collection of audio data and corresponding text.
  • Text in UTF-8 format and its transliteration to IT3 or Roman
  • Language parser for chunking transcription into phones
  • Automatic labeling of data
  • Question file preparation and acoustic models in case of HTS
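
As an illustration of the language parser task, the sketch below chunks a Roman transliteration into phones with a greedy longest-match scan. This is one simple way to do it, not necessarily the approach used in the talk, and the phone inventory shown is a hypothetical subset rather than the actual IT3 or global phone set.

```python
# Hypothetical subset of a Roman phone inventory (illustration only).
PHONE_SET = {"a", "aa", "i", "ii", "u", "k", "kh", "m", "l", "t", "th"}


def chunk_into_phones(text, phone_set=PHONE_SET):
    """Split a transliterated string into phones, preferring the longest match."""
    longest = max(len(phone) for phone in phone_set)
    phones = []
    pos = 0
    while pos < len(text):
        # Try the longest candidate first so "aa" wins over "a" + "a".
        for size in range(min(longest, len(text) - pos), 0, -1):
            candidate = text[pos:pos + size]
            if candidate in phone_set:
                phones.append(candidate)
                pos += size
                break
        else:
            raise ValueError(f"no phone matches {text[pos:]!r} at position {pos}")
    return phones
```

With this inventory, `chunk_into_phones("kaalu")` yields the phones k, aa, l, u. Greedy matching only works when the transliteration scheme is unambiguous, which is one reason the parser depends so heavily on the choice of transliteration.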

Challenges involved

  • Collection of a large amount of data and its corresponding text for ASR (availability of speakers for recording)
  • Preparation of global phone-set (so that the work can be easily extended to other languages)
  • Language parser development (for chunking into phones – depends on transliteration)
  • Automatic labeling of data (use of ehmm or HTKAlign)
  • Use of a noise-reduction or voice activity detection algorithm to enhance the system
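
To give a feel for the last challenge, here is a minimal energy-based voice activity detection sketch. The abstract does not say which algorithm is used, so this is a deliberately simple stand-in: it marks a frame as speech when its mean squared amplitude exceeds a fixed threshold. The frame size and threshold values are assumptions chosen for illustration.

```python
def frame_energies(samples, frame_size):
    """Mean squared amplitude of each full frame of samples."""
    energies = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        energies.append(sum(s * s for s in frame) / frame_size)
    return energies


def detect_speech_frames(samples, frame_size=160, threshold=0.01):
    """Return one boolean per frame: True when the frame energy exceeds the threshold."""
    return [energy > threshold for energy in frame_energies(samples, frame_size)]


def speech_segments(flags):
    """Group consecutive speech frames into (first_frame, last_frame) pairs."""
    segments = []
    start = None
    for index, is_speech in enumerate(flags):
        if is_speech and start is None:
            start = index
        elif not is_speech and start is not None:
            segments.append((start, index - 1))
            start = None
    if start is not None:
        segments.append((start, len(flags) - 1))
    return segments
```

Only the frames inside the returned segments would be passed on to the decoder. A fixed threshold is fragile in noisy recordings, so a practical system would likely adapt it to the background level or use a more robust detector.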

Visit the event page here: https://frab.fscons.org/en/fscons14/public/events/138

Watch this space for a copy of his presentation after the event.

Workshop: Building local communities and making them rock!

In this talk, Bhavi would like to showcase the work of the Ubuntu LoCo Council in handling and sustaining local communities.
More on the Ubuntu LoCo Council here: https://wiki.ubuntu.com/LoCoCouncil

Specifically, Bhavi would like to show how the Ubuntu LoCo Council handles teams:

  • To provide independent guidance for the LoCo Community.
  • To maintain the quality of governance in the LoCo community.
  • To assess and re-assess teams for verified state.
  • To provide input and feedback to other Ubuntu governance boards regarding the needs and achievements of the LoCo community.
  • To act as an independent, objective third party that resolves conflict in teams by mediating for a group or individuals.
  • How the council allocates resources.
  • How the council helps people create and sustain teams, and more.

Visit the event page here: https://frab.fscons.org/en/fscons14/public/events/139