Difference between revisions of "Voice control (sphinx+festival)"

Revision as of 22:29, 22 April 2014

Preliminary steps

We use CMU Pocket Sphinx, developed at Carnegie Mellon University, for speech recognition, and the Festival Speech Synthesis System, developed at the University of Edinburgh, for text-to-speech.

Currently the ROS package pocketsphinx is available for the groovy and hydro distributions. We can install it by running the following command in a terminal:

sudo apt-get install ros-hydro-pocketsphinx   # on groovy, use ros-groovy-pocketsphinx

On earlier ROS distributions we have to install the Ubuntu package for CMU Pocket Sphinx instead. We run the following command in a terminal to install it:

sudo apt-get install gstreamer0.10-pocketsphinx

On earlier ROS distributions we also have to download the ROS stack "rharmony", by the University of Albany, in order to integrate CMU Pocket Sphinx into ROS. We place it in the ROS workspace by running the following commands there:

svn checkout http://albany-ros-pkg.googlecode.com/svn/trunk/rharmony
rosmake --rosdep-install pocketsphinx
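
The choice between the two install paths above can be sketched as a small shell helper. This is only an illustration: the function name pocketsphinx_package is hypothetical, the default of "hydro" is an assumption for when ROS_DISTRO is not set, and any distribution other than groovy or hydro is treated as an earlier one. The script prints the install command instead of running it.

```shell
# Hypothetical helper: pick the PocketSphinx package for a ROS distribution.
# Rule from the text above: groovy and hydro have a ROS package,
# earlier distributions use the Ubuntu GStreamer package.
pocketsphinx_package() {
    case "$1" in
        groovy|hydro) echo "ros-$1-pocketsphinx" ;;
        *)            echo "gstreamer0.10-pocketsphinx" ;;
    esac
}

# ROS_DISTRO is normally exported by the ROS setup script;
# "hydro" is only an assumed fallback for this sketch.
distro="${ROS_DISTRO:-hydro}"
echo "sudo apt-get install $(pocketsphinx_package "$distro")"
```

On an earlier distribution the rharmony checkout and rosmake steps shown above are still needed after installing the package.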

The Festival Speech Synthesis System is integrated in the ROS package sound_play.

Speech recognition

First, we run the following command in a terminal within our workspace to create a package named "voz_fer" that will contain the voice control programs:

roscreate-pkg voz_fer pocketsphinx sound_play std_msgs roscpp

We have to select the vocabulary that we will use to control the robot. We use short sentences for the commands in order to reduce recognition mistakes. We create a file named "comandos_voz.txt" in the "config" folder of the created package with the content shown below (each sentence must be on its own line):

start speech
stop speech
go forward
go back
turn left
turn right
speed up
slow down
rotate left
rotate right
stop
point beer
point cocacola
fold arm
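
One way to create this file from a terminal is a here-document. This is a minimal sketch that assumes it is run from the root of the voz_fer package, so that the relative path config/comandos_voz.txt lands inside the package:

```shell
# Create the config folder if it does not exist yet
mkdir -p config

# Write one command per line; the quoted 'EOF' keeps the text literal
cat > config/comandos_voz.txt <<'EOF'
start speech
stop speech
go forward
go back
turn left
turn right
speed up
slow down
rotate left
rotate right
stop
point beer
point cocacola
fold arm
EOF

# Quick check: print how many commands were written
grep -c '' config/comandos_voz.txt
```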

We upload the created file "comandos_voz.txt" to the following link and compile it there in order to generate the recognition vocabulary:

Sphinx knowledge base tool


@@ Line 24: / Line 24: @@
 ==Speech recognition==
-<!--
+First, we run the following command in a terminal within our workspace to create a ''package'' named "voz_fer" that will contain the voice control programs:
-==Speech recognition==
+<syntaxhighlight>roscreate-pkg voz_fer pocketsphinx sound_play std_msgs roscpp</syntaxhighlight>
-We will start by creating the ''package'' that will contain our voice control programs. [[Fernando-TFM-ROS02#Creando un package|We will create a ''package'']] named "voz_fer" with the following dependencies: pocketsphinx sound_play std_msgs roscpp. Next we have to define the voice commands that we will use for control; they can be words or sentences (in English), and using sentences reduces recognition false positives. The selected vocabulary is the following, which we will save in the "config" directory of the created ''package'', in a file named "comandos_voz.txt", with each word or sentence on a new line:
+We have to select the vocabulary that we will use to control the robot. We use short sentences for the commands in order to reduce recognition mistakes. We create a file named "comandos_voz.txt" in the "config" folder of the created ''package'' with the content shown below (each sentence must be on its own line):
 <syntaxhighlight>
@@ Line 47: / Line 47: @@
 </syntaxhighlight>
-To generate this recognition vocabulary we have to upload our file comandos_voz.txt to the following link and compile it:
+We upload the created file "comandos_voz.txt" to the following link and compile it there in order to generate the recognition vocabulary:
 [http://www.speech.cs.cmu.edu/tools/lmtool-new.html Sphinx knowledge base tool]
+<!--
 We will download the generated files to the "config" directory of the created ''package'' and rename them to "comandos_voz.*". We are going to create a ''launcher'' to run the recognition program with the vocabulary we have created. In the "launch" directory of the ''package'' we will create a file named "comandos_voz.launch" with the following content:
