SVOX continues to be the hot voice provider in automotive these days. Here is the latest "Buzz":
SVOX Powers In-car Speech Output for the Infiniti Vehicle Line-up
Press Release
Keywords: SVOX, Infiniti, Infotainment, Navigation, Speech Output, TTS, text-to-speech
Zurich, August 29, 2007 – SVOX, the leading provider of embedded speech output solutions,
today announced that Infiniti leverages SVOX speech solutions to provide drivers with
enhanced convenience when using the in-vehicle navigation system.
Using the SVOX voice-activated user interface, the Infiniti Navigation System helps drivers to
keep their eyes on the road and contributes to safely reaching their destinations. Infiniti will offer
the SVOX-enabled Infiniti Navigation System on five models in the 2008 model year and the entire
Infiniti line-up by the 2009 model year.
Powered by SVOX Automotive, the industry’s leading embedded speech output solution, the
system interacts with the driver in a natural and intuitive way. Martin Reber, COO of SVOX,
said, “As traffic volumes continue to increase and car infotainment systems become more
sophisticated, speech technology provides the most convenient interface with a car’s navigation
system. The Infiniti navigation system includes one of the most comprehensive installations of
voice features we have seen to date.”
About SVOX AG
SVOX AG is the leading provider of embedded speech output solutions for the automotive and
mobile device industries. The SVOX Text-to-Speech system, the company’s signature product,
is part of a full product suite of small, fast and multilingual applications that enable computers
and other electronic devices to convert written text into natural-sounding and easily
comprehensible speech. SVOX’s focus on embedded speech allows for specifically optimized
solutions, and its software architecture provides customers with a speech engine that can be
easily tailored to their technical requirements and market needs.
SVOX AG
Phil Lichtenberg
Baslerstrasse 30
CH-8048 Zurich
Switzerland
Tel.: +41 43 544 0613
Fax: +41 43 544 0601
lichtenberg@svox.com
www.svox.com
Friday, August 31, 2007
To Infiniti... and Beyond
Thursday, August 30, 2007
Nuance and Free Directory Assistance
Looks like Nuance will bring their great technology for TTS and VR into the free directory assistance market.
These services are completely changing directory assistance, and really hurting the bottom line of phone companies who have long made a killing in the space. I tried Goog411 a while back, and it is fantastic.
Thursday, August 23, 2007
Latest NextUp.com Newsletter
How improved are IVRs?
Nice (and long) ComputerWorld article on progress with TTS/VR based phone systems.
Improved technology makes it less likely that you'll get caught in 'touch-tone hell'
August 22, 2007 (Computerworld) -- "Touch 1 for sales, touch 2 for customer service, touch 3 for ... "
Such recorded greetings, inviting a response via the caller's touch-tone telephone keypad, are generated by interactive voice response (IVR) systems, which for two decades have been the principal communications interface between the public and corporate America, supporting self-service applications -- or at least reducing the workload on live call agents.
But these days, IVR systems are changing, leaving less and less likelihood of callers being trapped in "touch-tone hell." More corporations are switching to speech recognition so that callers are greeted by a voice that invites them to simply state their business. Reacting to the words they recognize, these systems route the calls accordingly.
Such an open-ended greeting is called a natural language system, explained Lynda Smith, division manager at Nuance Communications Inc. in Burlington, Mass., which makes the "speech engine" used in many IVRs. (Simpler, menu-structured speech interfaces are called "directed dialog" systems.)
Smith divides speech-based IVRs into four tiers. The lowest tier prompts the user to "press or say 1," and might have a "grammar" (the repertoire of words and phrases it can respond to) of 250 words. Tier 2 would be similar but with a grammar of up to 2,500 utterances. Tier 3 would add a natural language system, and Tier 4 would be capable of handling an open-ended grammar, such as would be needed for a directory look-up application. Prices range from $100,000 to $1 million, she added. Much More...
Emotion and Robots
Robot development and the development of Text To Speech are somewhat married, as TTS will be a vital part of any human-like robot. We've already seen some work on adding more emotion to TTS. I thought this article was very interesting: it covers robots not only expressing emotions, but eventually actually feeling them.
Could robots' emotions help simplify things?
What if robots not only seemed emotional, but acted on their emotions too? This is the idea behind a project to give a robot called the iCAT (one of Time magazine's best inventions of 2005) "emotional logic", as outlined in this Technology Review article.
The robot itself is made by a team at Philips Research as a tool for experimenting with human-robot interactions. It features speech recognition and servomotors that generate a wide variety of facial expressions to simulate different emotions. See videos of iCAT in action here and here.
And now, Mehdi Dastani and colleagues at Utrecht University in the Netherlands are using the robot to test out 22 artificial emotions - including anger, hope, fear and joy - that determine its behaviour. More...
Wednesday, August 22, 2007
Neat Project, Morse to TTS
Not a real market for this kind of thing, but still an interesting innovation. Now if we can just go Voice->TTS, and maybe throw a translation in the middle of it, then we've got something.
Morse Code Translated to Text and Speech
So you think that text to speech is cool? What about Morse Code to Text to Speech! This is a great example of the ingenuity found in the Cornell University ECE 476 Microcontroller Design Final Projects.
“To implement our Morse Code system, we had to use both hardware and software. Since the Morse Code audio was that of a 750 Hz sine wave, we had to build a Schmitt Trigger to digitize the signal before sampling it. In our code, we used two state machines–one to detect the dots, dashes and spaces and another to determine the characters associated with the dots and dashes. To output the Morse Code, we used the Parallel D/A Direct Digital Synthesis (DDS) scheme presented on Professor Land’s website. To accomplish text-to-speech, we encoded the 100 most commonly used words in English (in addition to a few extras and a silence) and stored the compressed audio in dataflash. The audio is decompressed on the fly when the word is found in the table; otherwise, the system outputs a beep. All of these parts were essential for achieving our goal.” More...
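For the curious, here is a rough Python sketch of the two-stage decoding the students describe: one stage turns tone and silence durations into dots, dashes and gaps, and a second stage turns those into characters. This is not their microcontroller code. The function names, the timing thresholds (based on the standard 1/3/7-unit Morse timing) and the "<beep>" placeholder are all my own assumptions for illustration.

```python
# Sketch only: names and thresholds are assumptions, not the Cornell project's code.
MORSE = {
    ".-": "A", "-...": "B", "-.-.": "C", "-..": "D", ".": "E", "..-.": "F",
    "--.": "G", "....": "H", "..": "I", ".---": "J", "-.-": "K", ".-..": "L",
    "--": "M", "-.": "N", "---": "O", ".--.": "P", "--.-": "Q", ".-.": "R",
    "...": "S", "-": "T", "..-": "U", "...-": "V", ".--": "W", "-..-": "X",
    "-.--": "Y", "--..": "Z",
}

def classify(runs, unit):
    """Stage 1: map (is_tone, duration) runs to '.', '-', ' ' (letter gap), '/' (word gap)."""
    symbols = []
    for is_tone, duration in runs:
        units = duration / unit
        if is_tone:
            # A dot lasts about 1 unit, a dash about 3.
            symbols.append("." if units < 2 else "-")
        elif units >= 5:
            symbols.append("/")   # word gap, about 7 units
        elif units >= 2:
            symbols.append(" ")   # letter gap, about 3 units
        # Shorter silences separate elements inside one letter: emit nothing.
    return symbols

def decode(symbols):
    """Stage 2: collect dots and dashes, emit a character at each gap."""
    text, current = [], ""
    for s in list(symbols) + [" "]:   # trailing gap flushes the last letter
        if s in ".-":
            current += s
            continue
        if current:
            text.append(MORSE.get(current, "?"))
            current = ""
        if s == "/":
            text.append(" ")
    return "".join(text)

def speak(text, vocabulary):
    """Word-table lookup: known words are 'spoken', anything else becomes a beep."""
    return [w if w in vocabulary else "<beep>" for w in text.lower().split()]
```

Feeding the output of classify() into decode() gives the text, and speak() mimics the word-table lookup with a beep for anything outside the stored vocabulary.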
Tuesday, August 21, 2007
Cepstral Donates Open Source MRCP Stack to Telephony Industry
PITTSBURGH, PA -- 08/20/07 -- Cepstral LLC is proud to announce that it is providing the first open source Media Resource Control Protocol (MRCP) library, OpenMRCP, at no charge to telephony application developers today. Cepstral, a pioneer in Text-to-Speech (TTS) voice software, is donating OpenMRCP to the developer community in order to spur the development of telephony applications that interact with distributed speech resources such as TTS and Automatic Speech Recognition (ASR). MRCP is a standards-based communications protocol that allows telephony applications to communicate with speech resources, similar to the way TCP/IP allows browsers to communicate with web sites. More...
Fonix still hanging in
Fonix announced financial results this week. They've been an interesting company to watch. I originally heard of them when they came out with a TextAloud clone called iSpeak.
It wasn't a great product. It had some of the old DecTalk Voices and a couple of AT&T Voices. They spent a lot of money on it quickly, getting it into BestBuy and CompUSA, which turned out to be a big money loser. The product and any support for it disappeared pretty quickly. It has amazed me to watch companies try to copy TextAloud, throw a bunch of money and very poor execution and support at it, only to fail.
While they aren't making money yet, they continue to hang in there, finding niches with voice recognition on game consoles, and even another use for the old DecTalk voices with this new offering from a licensee at
http://www.satogo.com/
Thursday, August 16, 2007
Thai Students Score a Prize For Speech Software
A team of four Thai students beat out 10,000 competitors to win the $25,000 prize in the Microsoft 2007 Imagine Cup. Their project is text-to-speech software in which computers read aloud typed and handwritten commands. The software will allow people who can't read to interact with a PC. Imagine Cup judge Rand Morimoto has been blogging on the whole experience, from his video of the opening ceremonies to how contestants swilled free Cokes to keep themselves awake during the 24-hour, no-sleep phase of the competition.
I'm not sure I understand exactly what they did. I'll see if I can find the details. Did find a few details of projects:
Team inGest, from the National University of Ireland, showcased a solution aimed at teaching people how to communicate in sign language. Through the use of coloured gloves and a Web camera, the solution tracks signing gestures, analyses this in real-time against data collected from proficient signers, and provides feedback to the student.
The SMOR team, from Serbia, demonstrated a driving simulation solution for driver education purposes. While the student sits in the simulator, the instructor is able to change the physical environment, including adapting road or weather conditions; introducing more cars to simulate rush hour; and introducing reckless drivers to teach students how best to react.
The EN# 65 team, from Korea, demonstrated a solution aimed at people suffering from both hearing and eyesight loss. Delivered through gloves, the solution enables speech through a system similar to typing and provides feedback or “hearing” via vibrations to the top of the hand.
Thailand's team showcased a solution which translated text into pictures to help the illiterate to learn to read and maintain motivation. With the use of a Web cam, pages from books can be scanned in and presented visually. The system also provides verbal tutorials.
The Intoi solution, from Team Austria, provides educators with a digital flipchart solution. Using rear projection onto a surface similar to whiteboard, the pen-based system facilitates multiple users and an ability to import presentations, pictures and videos from the supporting PC.
The Jamaican ICAD team demonstrated CADI, an e-learning solution which facilitates collaboration in a distance education environment. The solution provides real-time translation across 12 languages.
Wednesday, August 15, 2007
Scribd and TTS
Interesting site: Scribd.com is a sort of free-for-all where you can upload any documents you want. You could think of it as a YouTube for documents. The interesting thing is they have a feature where you can download any of their documents as an MP3 file. Voices aren't as good as TextAloud, and texts have a ton of junk in them, but it is interesting.
They are also breaking pretty much every copyright law around, but check out
http://www.scribd.com/doc/99266/Stephen-Hawking-A-Brief-History-Of-Time
After you skip over the junk at the front, it is a TTS generated audio book.
Pittsburgh - Speech Capital?
Nice Cepstral mention in this article
Technically speaking
By Rob Amen
TRIBUNE-REVIEW
Tuesday, August 14, 2007
For a city with a bizarre dialect, Pittsburgh is one of the speech capitals of the world. That's according to Craig Campbell, CEO of Cepstral, a tech start-up on the South Side that specializes in transferring text to speech.
"As a region, academically we're one of the top in the world," Campbell said. "Pittsburgh is looking for centers of excellence. We've identified education, hospitals, medical, nanotechnology. Speech isn't in there, but it (should be)."
The city is home to many technology start-up companies such as Cepstral that develop innovative products in relative anonymity. Full Story...
Saturday, August 4, 2007
Apple Phone Show TextAloud mention
Text To Speech: Convert Text Into Spoken Audio - Apple Phone Show

by Liana Lehua
Time is such a valuable commodity. How would you like to optimize your time by launching your iPod and having your work, school or other documents read to you on your iPhone while you purchase an Apple iPhone Bluetooth Headset online, query for directions to your lunch meeting in Maps, and email your latest sales figures to your boss?
Big props to Andy Ihnatko, Apple Phone Show Guest Host, for this awesome tip that works on both Mac and Windows platforms. Mac users will benefit from an on-board app called Automator, while Windows users will need to purchase a third-party software like TextAloud2 ($29.95). The voices provided to read your text to you may sound too much like a computer. There are alternate voice options covered later.
SETUP - MAC
First, make sure the document you want to use is converted to plain text and that your document is saved with the .txt extension.
1. Open Automator.
2. Add an action by searching for and dragging “Get Contents of TextEdit Document” from the menu on the left to the blank box on the right.
3. Add action: “Text To Audio File” and complete the fields: System Voice, Save As, and Where.
4. Add action: “Rename Finder Items (Make Finder Item Names Sequential)”. In the first drop down box, select “Make Sequential”. Select “Add number to existing name”. Place the number “after name”, separated by “dash”.
5. Add action: “Import Audio File”. Select “AAC Encoder” and check the “Delete source files after encoding.”
6. Save the Automator workflow as “Text to Speech”. Go to File - Save as plug-in, and select Script Menu to save.
Now you are finished with Automator and only have a few more steps to complete. Continuing with the process:
7. Open the document in TextEdit.
8. If needed, make any modifications to the text at this time.
9. Select the Scripts menu located in your menu bar. It looks like a scroll or curly “S” and choose the “Text to Speech” workflow.
You will see the status of the conversion at work in your menu bar. When it’s complete, an audio file will automatically be added to your library in iTunes.
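If you would rather script the Mac side than click through Automator, here is a minimal Python sketch built on the `say` command that ships with macOS. It is not the workflow from the article: it only covers the text-to-audio and dash-numbered naming steps, leaves out the iTunes import and AAC encoding, and the helper names are made up for illustration.

```python
from pathlib import Path

def sequential_name(stem, index):
    """Mimic the 'add number after name, separated by dash' rename step."""
    return f"{stem}-{index}"

def build_say_command(text_file, out_dir, index=1, voice=None):
    """Build a macOS `say` command that reads text_file into an AIFF file.

    Hypothetical helper: only the -v, -f and -o options of `say` are used.
    """
    text_file = Path(text_file)
    out_file = Path(out_dir) / (sequential_name(text_file.stem, index) + ".aiff")
    cmd = ["say"]
    if voice:
        cmd += ["-v", voice]            # optional system voice
    cmd += ["-f", str(text_file)]       # read the text from this file
    cmd += ["-o", str(out_file)]        # write audio here instead of speaking
    return cmd

# To actually run it (macOS only):
#   import subprocess
#   subprocess.run(build_say_command("notes.txt", "."), check=True)
```

The resulting AIFF file would still need to be dragged into iTunes by hand, just like the last step of the Windows instructions.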
SETUP - WINDOWS
1. Download and install TextAloud2.
2. Open the TextAloud2 program.
3. Go to File - Open and select the file you want to convert to audio. You can change the file name that appears in the Title field if you choose. Remember where you save this file as you will need to navigate to it later.
4. Click the Speak To File button.
5. Choose where you would like to save the new audio file, and select OK to start the process. Wait for the process to complete.
6. When the process is complete, drag the audio file into your iTunes library.
NOTE: I did not test the Windows setup. There may be some slight variations between these instructions and your actual experience.
VOICES
If you would like a more natural sounding voice to read your documents to you, AbleReader is an alternative that is both Mac and Windows compatible and uses “AT&T Natural Voices” (16kHz audio). The female voice, Crystal, is a bit more smooth than her counterpart, Mike. Mac pricing is $49.95 for the downloaded version. Windows pricing begins at $35 for the downloaded version and offers additional add-ons, including the option to use “AT&T Natural Voices” with TextAloud. For Intel Mac users, you can save your money and use Automator. AbleReader doesn’t work on Intel-based Macs.
Not being a fan of printing things out and carrying paper, this has been a time and tree saver. I have used this method to convert text from several web pages and articles so I can have them available to listen to as I have time. Full Story...
Friday, August 3, 2007
More TextAloud and Proofreading
NovelMaker.com Recommends TextAloud for Writers Seeking Valuable Tools to Improve Their Work
From proofreading and catching typos, to listening to flow, dialogue, and more, TextAloud makes a powerful secret weapon for writers
Clemmons, NC (PRWEB) August 3, 2007 -- For many writers, their writings often don't truly come to life for them until they can listen to those words spoken aloud. This makes Text to Speech tools like NextUp's TextAloud especially valuable, as they enable the writer to listen to her words and not only catch typographical errors that traditional word processing spell-checkers might miss, but to also get a sense of flow and syntax, and to even get a realistic feel for spoken dialogue exchanges between different characters. Now NovelMaker.com (www.NovelMaker.com), an acclaimed new online community for writers, agents and editors, has honored TextAloud by recommending the program to its participants seeking the best proofreading tools for writers.
TextAloud is an award-winning program that converts text into spoken audio for listening on a PC, and can also save to audio files for the added option of easy playback on-the-go, from iPod (TM), to iPhone (R), to car and beyond. Thanks to an elegant and simple program interface, along with a wide array of superb premium voices in almost any imaginable age, gender, or accent, many writers have already discovered the value of TextAloud to proof or simply listen to their work. Now comes the endorsement of NovelMaker, sure to be a resource for thousands of writers worldwide.
Just launched in July 2007, NovelMaker.com is the world's first truly interactive community for fiction writers, readers, critics, literary agents, editors, and publishers. With NovelMaker, authors upload their manuscripts or works in progress to the site for free, and can upon completion see their completed books in the site's own literary library. Community members (authors, editors, agents and more) as well as readers will then offer suggestions, reviews, ratings and encouragement. The site recommends TextAloud as one of a short list of helpful tools in its Helpful Tools & Links section (http://www.novelmaker.com/links_tools.php?s=).
"TextAloud offers enormous assistance in the writing process by catching errors more conventional tools like spell-checkers may miss," comments Chris Olander, President and CEO of NovelMaker.com. "It also offers a great deal of creative potential, for use in testing dialogue or even characterization. TextAloud ultimately offers a multitude of fascinating uses sure to be helpful to writers who take part in the NovelMaker community."
"We're delighted to have been recommended by NovelMaker," comments NextUp President Rick Ellis. "TextAloud has often been considered a secret weapon by many writers who have contacted us, assisting them in truly listening to their works in a way that was not possible before. We hope that writers and editors in the NovelMaker community can use this recommendation to discover the power of Text to Speech for themselves."
TextAloud enables anyone to easily and affordably export documents, books, magazine articles, web content, even e-mails, into spoken audio. The program smoothly converts text into spoken audio for listening on a PC or laptop, and can also save text to audio files for playback on portables like the iPhone (R), iPod (TM), PocketPC (R), and a wide range of other players and devices. Highly popular with people of a variety of professions and walks of life, TextAloud is affordably priced, and is simple for anyone with a PC, laptop or portable.
About NovelMaker.com
NovelMaker.com is the world's first truly interactive community for fiction writers, readers, critics, literary agents, editors, and publishers. From authors to agents, every participant in NovelMaker has access to a large, interactive community to participate with them in the creation, and potential commercial success, of new works of fiction. For more information, please visit www.NovelMaker.com.
About TextAloud
TextAloud has been featured in The New York Times, PC Magazine, Writer's Digest, on CNN, and more. Hailed by critics and users alike, TextAloud is priced at just $29.95, and is compatible with systems using Windows (R) 98, NT, 2000, XP and Vista. The program is available for fast, safe and secure purchase via http://www.NextUp.com
Thursday, August 2, 2007
A unique form of Text To Speech
This is one of the neatest twists on TTS I've ever seen
http://www.sr.se/P1/src/sing/#
Type in your text, and it will sing it back to you. Not sure there are many practical uses for this, but it is certainly a very clever idea that someone put a ton of work into.
Wednesday, August 1, 2007
Neat Kurzweil Reader for the Blind
Blind have ally in device that turns text into words
James Gashel had an "a-ha" moment last year as he walked to a boarding gate at Baltimore/Washington International Thurgood Marshall Airport. He had just eaten at an airport restaurant. He had paid his bill in exact change and left a generous tip. Nothing unusual, except he didn't ask anyone to read to him.
Gashel has been blind for 50 years. Typically when he ate out, a waiter or a fellow diner would have to read the menu and the bill to him.
But last year at the airport, he used a portable machine instead.
"As I was walking to my boarding gate, I stopped and thought, 'You just had an experience that you've never had before,'" he said.
Since then, he has used the machine — the Kurzweil-National Federation of the Blind Reader — in many places and conducted thousands of transactions without anyone reading to him.
While he is independent and capable without the reader, the ability to handle printed text anytime and anywhere has simplified his daily schedule, he said.
"I find that to be life-changing," Gashel said.
The reader, unveiled last June, is the first hand-held device that translates text into speech, said Chris Danielsen, spokesman for the Baltimore-based National Federation of the Blind. Full Story...





