Asterisk Fuels Speech Technologies

If open-source telephony is going mainstream, as I recently wrote, then the subsets of open-source telephony are booming as well. Speech recognition, for example, is becoming more and more important in the open-source world. Case in point: a company called LumenVox has been working closely with Digium to provide speech recognition to Asterisk users.

Using a speech API (in the world of Asterisk it is called an AGI, or Asterisk Gateway Interface), Asterisk users now have access to LumenVox speech technology. In addition, Asterisk developers can use the familiar Asterisk Dial Plan environment for programming. Here is an example of using the Asterisk Dial Plan for pizza delivery.

Since October of 2006 there have been 450 developer purchases of the LumenVox Speech Starter Kit for Asterisk, according to Gerd Graumann, Director of Business Development, and Laura Kennedy, Director of Communications. The kit includes one port of LumenVox Speech Engine Lite, a Digium connector bridge and a speech tuner.

It should be noted that the Lite version allows up to 500 pronunciations or words per prompt. This could be enough for a corporate directory application with 30 people. You couldn’t support 500 people because each person needs several entries: first name, last name, whole name and nicknames.
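To make that arithmetic concrete, here is a rough sketch in Python of how a directory's entries might add up against the per-prompt limit. The 500 figure is the Lite limit mentioned above; the sample names, the nickname lists and the directory_phrases function are all hypothetical, and the way LumenVox actually counts pronunciations may well differ.

```python
# Lite per-prompt limit quoted in the article.
LITE_LIMIT = 500


def directory_phrases(people):
    """Return the distinct phrases a caller might say for a directory.

    `people` is a list of (first, last, nicknames) tuples. This shape is
    an assumption made purely for illustration.
    """
    phrases = set()
    for first, last, nicknames in people:
        phrases.add(first.lower())               # first name only
        phrases.add(last.lower())                # last name only
        phrases.add(f"{first} {last}".lower())   # whole name
        for nick in nicknames:
            phrases.add(nick.lower())            # nickname only
            phrases.add(f"{nick} {last}".lower())  # nickname + last name
    return phrases


# Hypothetical sample data, not taken from any real directory.
staff = [
    ("Robert", "Smith", ["Bob", "Rob"]),
    ("Elizabeth", "Jones", ["Liz", "Beth"]),
    ("William", "Smith", ["Bill"]),
]

phrases = directory_phrases(staff)
print(len(phrases), "phrases for", len(staff), "people")
print("Fits in Lite:", len(phrases) <= LITE_LIMIT)
```

Even in this toy case each person contributes several phrases, so the number of people a prompt can cover is a fraction of the raw limit. Alternate pronunciations of the same name would presumably count as well, which may be why the practical figure is closer to 30 people than to 500.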

The full engine allows 12,000 words or pronunciations per prompt. The Lite version obviously costs less. Once you decide which version makes the most sense for your application, you need to decide how many ports you need. Port prices range from $195 to $595 depending on volume and which version you need.

But what excites LumenVox and should be great for the whole industry is the work being done to integrate VXML into Asterisk. There are a few initiatives on the market allowing this to happen.

So going forward, all VXML applications should be able to run seamlessly on Asterisk, allowing a greater number of developers to write speech applications that run easily on an open-source communications system.

It should be noted that Sphinx is an open-source speech recognition engine, but as the execs at LumenVox tell me, they took Sphinx and added 30-35 years of development to make it commercial grade and easier to program with.

It is apparent that open-source is having a positive effect on development in a number of spaces, and speech recognition is just the latest space to feel this positive (dare I say turbocharged) effect.

If you are interested in open-source communications development, be sure to attend the Communications Developer Conference next week in Santa Clara, CA.