Voice control (sphinx+festival)
Revision as of 21:10, 24 April 2014
Preliminary steps
We use CMU Pocket Sphinx, developed at Carnegie Mellon University, for speech recognition, and the Festival Speech Synthesis System, developed at the University of Edinburgh, for text-to-speech.
Currently the ROS package pocketsphinx is available for the groovy and hydro distributions. We can install it by executing the following command in a terminal:
sudo apt-get install ros-"groovy or hydro"-pocketsphinx
For earlier ROS distributions, we instead have to install the Ubuntu package for CMU Pocket Sphinx. We will execute the following command in a terminal to install it:
sudo apt-get install gstreamer0.10-pocketsphinx
Also for earlier ROS distributions, we have to download the ROS stack "rharmony", by the University at Albany, in order to integrate CMU Pocket Sphinx into ROS. We will change to the ROS workspace and execute the following commands in a terminal:
svn checkout http://albany-ros-pkg.googlecode.com/svn/trunk/rharmony
rosmake --rosdep-install pocketsphinx
The Festival Speech Synthesis System is integrated into the ROS package sound_play.
Speech recognition
First, we will execute the following command in a terminal within our workspace in order to create a package named "voz_fer" that will contain the voice control programs:
roscreate-pkg voz_fer pocketsphinx sound_play std_msgs roscpp
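The following steps place files in "config" and "launch" folders inside the package. These folders are not created by roscreate-pkg, so as a small preparatory step (assuming we are in the package root, e.g. after roscd voz_fer) we can create them:

```shell
# From the root of the voz_fer package (e.g. after: roscd voz_fer).
# Create the folders used by the steps below.
mkdir -p config launch
```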
We have to select the vocabulary that we will use to control the robot. We use short sentences for the orders in order to reduce recognition mistakes. We will create a file named "comandos_voz.txt" within the "config" folder of the created package with the content shown below (each sentence on its own line):
start speech
stop speech
go forward
go back
turn left
turn right
speed up
slow down
rotate left
rotate right
stop
point beer
point cocacola
fold arm
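As a convenience only (not part of the original procedure), the corpus file above can be generated with a small hypothetical helper script, which guarantees one sentence per line as the vocabulary tool expects:

```python
# Hypothetical helper: write the command vocabulary to comandos_voz.txt,
# one sentence per line, in the format expected by the vocabulary tool.
COMMANDS = [
    "start speech", "stop speech", "go forward", "go back",
    "turn left", "turn right", "speed up", "slow down",
    "rotate left", "rotate right", "stop",
    "point beer", "point cocacola", "fold arm",
]

def write_corpus(path="comandos_voz.txt", commands=COMMANDS):
    """Write each command on its own line and return the file path."""
    with open(path, "w") as f:
        f.write("\n".join(commands) + "\n")
    return path
```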
We will upload and compile the created file "comandos_voz.txt" with the Sphinx knowledge base tool (http://www.speech.cs.cmu.edu/tools/lmtool-new.html) in order to generate the recognition vocabulary.
We will download the generated files, rename them (comandos_voz.*), and place them in the "config" folder of the package. We will create a file named "comandos_voz.launch" within the "launch" folder of the package in order to start CMU Pocket Sphinx (http://cmusphinx.sourceforge.net/) with our vocabulary:
<launch>
<node name="recognizer" pkg="pocketsphinx" type="recognizer.py" output="screen">
<param name="lm" value="$(find voz_fer)/config/comandos_voz.lm"/>
<param name="dict" value="$(find voz_fer)/config/comandos_voz.dic"/>
</node>
</launch>
We will execute the following command in a terminal in order to start the speech recognition:
roslaunch voz_fer comandos_voz.launch
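The recognition results are published on the topic "recognizer/output" as std_msgs/String messages, so we can watch what is recognized by running rostopic echo /recognizer/output in another terminal. As an illustration only, a control node could map each recognized sentence to an action with a lookup like the sketch below; the action strings are hypothetical placeholders, and a real node would subscribe to the topic and publish velocity commands instead:

```python
# Sketch of mapping sentences arriving on recognizer/output
# (std_msgs/String) to robot actions. Action names are hypothetical.
COMMAND_ACTIONS = {
    "go forward": "increase linear velocity",
    "go back":    "decrease linear velocity",
    "turn left":  "steer left",
    "turn right": "steer right",
    "stop":       "halt",
}

def handle_recognized(text):
    """Return the action for a recognized sentence, or None if unknown."""
    return COMMAND_ACTIONS.get(text.strip().lower())
```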