Friday, August 31, 2007

To Infiniti... and Beyond

SVOX continues to be the hot voice provider in automotive these days. Here is the latest "Buzz":

SVOX Powers In-car Speech Output for the Infiniti Vehicle Line-up

Press Release
SVOX Powers In-car Speech Output for the
Infiniti Vehicle Line-up
Keywords: SVOX, Infiniti, Infotainment, Navigation, Speech Output, TTS, text-to-speech
Zurich, August 29, 2007 – SVOX, the leading provider of embedded speech output solutions,
today announced that Infiniti leverages SVOX speech solutions to provide drivers with
enhanced convenience when using the in-vehicle navigation system.
Using the SVOX voice-activated user interface, the Infiniti Navigation System helps drivers to
keep their eyes on the road and contributes to safely reaching their destinations. Infiniti will offer
the SVOX-enabled Infiniti Navigation System on five models in the 2008 model year and the entire
Infiniti line-up by the 2009 model year.
Powered by SVOX Automotive, the industry’s leading embedded speech output solution, the
system interacts with the driver in a natural and intuitive way. Martin Reber, COO of SVOX,
said, “As traffic volumes continue to increase and car infotainment systems become more
sophisticated, speech technology provides the most convenient interface with a car’s navigation
system. The Infiniti navigation system includes one of the most comprehensive installations of
voice features we have seen to date.”
About SVOX AG
SVOX AG is the leading provider of embedded speech output solutions for the automotive and
mobile device industries. The SVOX Text-to-Speech system, the company’s signature product,
is part of a full product suite of small, fast and multilingual applications that enable computers
and other electronic devices to convert written text into natural-sounding and easily
comprehensible speech. SVOX’s focus on embedded speech allows for specifically optimized
solutions, and its software architecture provides customers with a speech engine that can be
easily tailored to their technical requirements and market needs.
SVOX AG
Phil Lichtenberg
Baslerstrasse 30
CH-8048 Zurich
Switzerland
Tel.: +41 43 544 0613
Fax: +41 43 544 0601
lichtenberg@svox.com
www.svox.com

Thursday, August 30, 2007

Nuance and Free Directory Assistance

Looks like Nuance will bring their great technology for TTS and VR into the free directory assistance market.

These services are completely changing directory assistance, and really hurting the bottom line of phone companies who have long made a killing in the space. I tried Goog411 a while back, and it is fantastic.

Thursday, August 23, 2007

Latest NextUp.com Newsletter



NextUp.com Newsletter August 2007

Hello from NextUp.com!

Congratulations to Mike N., winner of the NextUp.com golf shirt for last month's newsletter prize. This month one lucky winner will receive a SanDisk Sansa 1GB MP3 Player, so be sure to read on for details on how you can enter. Thanks for staying tuned!
TextAloud Advanced FAQ
As most of you know, we offer fantastic email support, probably the fastest responses on the net. But we've also built up some good documentation to help you get the most out of TextAloud. One of the great resources is the TextAloud manual at

http://www.nextup.com/manual.html

Also, we long ago learned that most of our users are smarter than us, and for really complex questions, users often come up with better solutions than we do, so we invite you all to drop by our online User Forum at

http://nextup.com/phpBB2/index.php

One additional resource is our Advanced FAQ, which is part of the forum
system where we answer some of the tougher questions we get, and reveal some really advanced features of TextAloud that most users do not know about. The Advanced FAQ is at

http://nextup.com/phpBB2/viewforum.php?f=11

We thought this month we'd highlight some of the more interesting items there that you might find helpful.

Common problems with getting HotKeys to work with other Windows Programs

http://nextup.com/phpBB2/viewtopic.php?t=3461

Forcing upper-case words to be spelled instead of pronounced

http://nextup.com/phpBB2/viewtopic.php?t=3348

How to backup pronunciation edits

http://nextup.com/phpBB2/viewtopic.php?t=1507

Older free voices not showing up in TextAloud

http://nextup.com/phpBB2/viewtopic.php?t=3284

How to suppress numbers in text from being spoken

http://nextup.com/phpBB2/viewtopic.php?t=3269

Skipping comments or certain text

http://nextup.com/phpBB2/viewtopic.php?t=3146

Using TextAloud to repeat the same text over and over

http://nextup.com/phpBB2/viewtopic.php?t=3048

Using TextAloud to read emails in TextAloud or Internet Mail

http://nextup.com/phpBB2/viewtopic.php?t=2818

Not speaking numbers at the start of lines

http://nextup.com/phpBB2/viewtopic.php?t=2202

How to transfer TextAloud audio files to iTunes/iPod

http://nextup.com/phpBB2/viewtopic.php?t=2009

Common problems when reading PDF Files

http://nextup.com/phpBB2/viewtopic.php?t=1778

Information on SAPI5 Speech control tags

http://nextup.com/phpBB2/viewtopic.php?t=1694

Pausing automatically at certain punctuation or carriage returns

http://nextup.com/phpBB2/viewtopic.php?t=1526

How to embed speed and volume tags within text

http://nextup.com/phpBB2/viewtopic.php?t=1515

Reading Microsoft Reader .lit files in TextAloud

http://nextup.com/phpBB2/viewtopic.php?t=1514

We hope these and the other online resources we provide help you get
more out of TextAloud. As always, if you have any questions or
suggestions for us, let us know.
Special Offer - Sale On Voices
It's our 8th year now with TextAloud and providing great voices to our customers, so we decided to celebrate with a sale. This special offer will only last for one week, so don't delay if you want to get a great deal on some outstanding voices.

Act now and you can purchase three high quality English voices for TextAloud for only $75. Normally the cost would be between $105 and $135, depending on which three voices you purchased, but for one week only if you buy three of these voices you will get them for only $25 each.
Go to the following page for complete details on this special offer and how you can order now

This deal is available only to TextAloud users and voices are for use with TextAloud and our other products. If you don't already have TextAloud, you'll be given a chance to buy it on the order page. The $75 price is for download only and a high speed Internet connection is required for that, or you can choose to pay extra charges for CD/shipping if you need that instead. If you want to substitute voices in languages other than English, we can do that but it is done manually and the cost is $85 for 3 voices. Details on all these things are on the above web page.


Partner Products from Acoustica
We've partnered with Acoustica, a leader in audio related software, to offer 4 unique products that may fit how you use audio you create in TextAloud. Free demo versions are available for download so if any of these sound like they might help you, please check them out at

http://www.nextup.com/acoustica.html

Acoustica MP3 CD Burner 4.5 - Burn Audio CDs or MP3 CDs with TextAloud audio files or music files.

Acoustica Audio Converter Pro - Easy audio file conversion with
right-click explorer integration to convert between MP3, WAV, WMA, CDA, or OGG files.

Acoustica MP3 Audio Mixer - Mix audio files to add background music to
TTS files, or speed up/slow down spoken audio files.

Acoustica CD/DVD Label Maker - Make attractive labels for your custom
created CDs.

Acoustica is a company we trust to treat our customers right, and we
invite you to try these products at

http://nextup.com/acoustica.html

Monthly Drawing - Back To School With TextAloud

We have tons of people enjoying the benefits of using TextAloud in combination with a portable audio device like the iPod, a PocketPC, flash memory MP3 players, or CD players. Students especially find TextAloud to be a great study tool that helps them deal with large amounts of reading, prepare for quizzes, and even improve their writing by serving as a proofreading aid.

It's back to school time so for this month's contest we're asking you to help us get the word out to students (and parents). The prize this month is a SanDisk Sansa 1GB MP3 Player. This is a neat little player with 1 GB of storage capacity, an FM tuner, voice recording and custom playback options. All you have to do to enter is tell a student or parent you know about TextAloud and direct them to the following page for more details
then for your entry, send us an email to
newsletterdrawing@nextup.com

We'll draw a winner from entries received by Friday, August 31 to win a SanDisk Sansa 1GB MP3 Player. Thanks and best of luck!
If you find our products useful, please share the news with your friends and at your place of business. We offer Volume Pricing with Site Licensing available to schools, organizations and businesses. If you'd like more information on this or have questions about any of our products, please don't hesitate to contact us at

support@nextup.com

Thank you for your interest in our company & products.

The NextUp.com Team
NextUp.com
The Power of Spoken Audio
http://www.NextUp.com

A Division of NextUp Technologies, LLC



How improved are IVRs?

Nice (and long) ComputerWorld article on progress with TTS/VR based phone systems.

Improved technology makes it less likely that you'll get caught in 'touch-tone hell'

August 22, 2007 (Computerworld) -- "Touch 1 for sales, touch 2 for customer service, touch 3 for ... "

Such recorded greetings, inviting a response via the caller's touch-tone telephone keypad, are generated by interactive voice response (IVR) systems, which for two decades have been the principal communications interface between the public and corporate America, supporting self-service applications -- or at least reducing the workload on live call agents.

But these days, IVR systems are changing, leaving less and less likelihood of callers being trapped in "touch-tone hell." More corporations are switching to speech recognition so that callers are greeted by a voice that invites them to simply state their business. Reacting to the words they recognize, these systems route the calls accordingly.

Such an open-ended greeting is called a natural language system, explained Lynda Smith, division manager at Nuance Communications Inc. in Burlington, Mass., which makes the "speech engine" used in many IVRs. (Simpler, menu-structured speech interfaces are called "directed dialog" systems.)

Smith divides speech-based IVRs into four tiers. The lowest tier prompts the user to "press or say 1," and might have a "grammar" (the repertoire of words and phrases it can respond to) of 250 words. Tier 2 would be similar but with a grammar of up to 2,500 utterances. Tier 3 would add a natural language system, and Tier 4 would be capable of handling an open-ended grammar, such as would be needed for a directory look-up application. Prices range from $100,000 to $1 million, she added. Much More...
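To make the "grammar" idea a bit more concrete, here is a minimal sketch of how a directed-dialog system might route a call once the speech engine has already turned the caller's words into text. Everything in it is an assumption for illustration: the phrases, the queue names, and the `route` function are made up, and a real IVR works on recognition results with confidence scores rather than a plain string.

```python
# Hypothetical directed-dialog grammar: each phrase the system can respond to
# maps to a destination. Phrases and queue names are illustrative only.
GRAMMAR = {
    "sales": "sales_queue",
    "customer service": "service_queue",
    "billing": "billing_queue",
}


def route(utterance, grammar=GRAMMAR, fallback="live_agent"):
    """Return the destination for the first grammar phrase found in the utterance.

    `utterance` is assumed to be text already produced by a speech recognizer.
    """
    text = utterance.lower()
    # Try longer phrases first so a multi-word phrase is matched before any
    # shorter phrase that might also appear in the same utterance.
    for phrase in sorted(grammar, key=len, reverse=True):
        if phrase in text:
            return grammar[phrase]
    # Nothing in the grammar matched: hand the caller to a person.
    return fallback
```

With this toy grammar, `route("I have a question about billing")` returns `"billing_queue"`, and anything the grammar does not cover falls through to `"live_agent"`. A bigger grammar just means a bigger table, which is roughly the difference between the lower tiers described above.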

Emotion and Robots

Robot development and the development of Text To Speech are somewhat married, as TTS will be a vital part of any human-like robot. We've already seen some work in adding more emotion to TTS. Thought this article was very interesting, on robots not only expressing emotions, but eventually actually feeling them.

Could robots' emotions help simplify things?

What if robots not only seemed emotional, but acted on their emotions too? This is the idea behind a project to give a robot called the iCAT (one of Time magazine's best inventions of 2005) "emotional logic", as outlined in this Technology Review article.

The robot itself is made by a team at Philips Research as a tool for experimenting with human-robot interactions. It features speech recognition and servomotors that generate a wide variety of facial expressions to simulate different emotions. See videos of iCAT in action here and here.

And now, Mehdi Dastani and colleagues at Utrecht University in the Netherlands are using the robot to test out 22 artificial emotions - including anger, hope, fear and joy - that determine its behaviour. More...

Wednesday, August 22, 2007

Vivee Ad

Kind of funny ad featuring new TTS mobile service.


Neat Project, Morse to TTS

Not a real market for this kind of thing, but still interesting innovation. Now if we can just go Voice->TTS, and maybe throw a translation in the middle of it, then we've got something.

Morse Code Translated to Text and Speech



So you think that text to speech is cool? What about Morse Code to Text to Speech! This is a great example of the ingenuity found in the Cornell University ECE 476 Microcontroller Design Final Projects.

“To implement our Morse Code system, we had to use both hardware and software. Since the Morse Code audio was that of a 750 Hz sine wave, we had to build a Schmidt Trigger to digitize the signal before sampling it. In our code, we used two state machines–one to detect the dots, dashes and spaces and another to determine the characters associated with the dots and dashes. To output the Morse Code, we used the Parallel D/A Direct Digital Synthesis (DDS) scheme presented on Professor Land’s website. To accomplish text-to-speech, we encoded the 100 most commonly used words in English (in addition to a few extras and a silence) and stored the compressed audio in dataflash. The audio is decompressed on the fly when the word is found in the table; otherwise, the system outputs a beep. All of these parts were essential for achieving our goal.” More...
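For anyone curious what the two state machines and the word table might look like, here is a rough Python sketch. It is not the students' code (theirs runs on a microcontroller); it assumes the signal has already been digitized into runs of tone-on and tone-off with a duration, and it assumes standard Morse timing (dash = 3 dot lengths, letter gap = 3, word gap = 7). The function names and the tiny `KNOWN_WORDS` set standing in for the 100-word table are made up for illustration.

```python
# International Morse code for the letters A-Z.
MORSE = {
    ".-": "A", "-...": "B", "-.-.": "C", "-..": "D", ".": "E", "..-.": "F",
    "--.": "G", "....": "H", "..": "I", ".---": "J", "-.-": "K", ".-..": "L",
    "--": "M", "-.": "N", "---": "O", ".--.": "P", "--.-": "Q", ".-.": "R",
    "...": "S", "-": "T", "..-": "U", "...-": "V", ".--": "W", "-..-": "X",
    "-.--": "Y", "--..": "Z",
}

# Stand-in for the table of stored words (assumption: contents are invented).
KNOWN_WORDS = {"THE", "AND", "TO"}


def classify(runs, unit):
    """Stage 1: turn (tone_on, duration) runs into '.', '-', ' ' or '/'.

    ' ' marks a gap between letters and '/' a gap between words.
    `unit` is the assumed length of one dot.
    """
    symbols = []
    for tone_on, duration in runs:
        units = duration / unit
        if tone_on:
            symbols.append("." if units < 2 else "-")
        elif units >= 5:
            symbols.append("/")
        elif units >= 2:
            symbols.append(" ")
        # Shorter silences only separate dots and dashes inside one letter.
    return symbols


def decode(symbols):
    """Stage 2: collect dots and dashes until a gap, then look up the letter."""
    text = []
    current = ""
    for symbol in symbols:
        if symbol in ".-":
            current += symbol
            continue
        if current:
            text.append(MORSE.get(current, "?"))
            current = ""
        if symbol == "/":
            text.append(" ")
    if current:
        text.append(MORSE.get(current, "?"))
    return "".join(text)


def speak(text):
    """Play a stored word if it is in the table, otherwise fall back to a beep."""
    return [("play", w) if w in KNOWN_WORDS else ("beep", w) for w in text.split()]
```

Feeding `classify` the runs for three short tones, three long tones, and three short tones (with a three-unit silence between letters) and passing the result to `decode` gives `"SOS"`. The last step mirrors the quoted description: words in the table get played, everything else gets a beep.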

Tuesday, August 21, 2007

Cepstral Donates Open Source MRCP Stack to Telephony Industry

PITTSBURGH, PA -- 08/20/07 -- Cepstral LLC is proud to announce that it is providing the first open source Media Resource Control Protocol (MRCP) library, OpenMRCP, at no charge to telephony application developers today. Cepstral, a pioneer in Text-to-Speech (TTS) voice software, is donating OpenMRCP to the developer community in order to spur the development of telephony applications that interact with distributed speech resources such as TTS and Automatic Speech Recognition (ASR). MRCP is a standards-based communications protocol that allows telephony applications to communicate with speech resources, similar to the way TCP/IP allows browsers to communicate with web sites. More...

Fonix still hanging in

Fonix announced financial results this week. They've been an interesting company to watch. I originally heard of them when they came out with a TextAloud clone called iSpeak.

It wasn't a great product. It had some of the old DecTalk Voices and a couple of AT&T Voices. They spent a lot of money on it quickly, getting it into BestBuy and CompUSA, which turned out to be a big money loser. The product and any support for it disappeared pretty quickly. It has amazed me to watch companies try to copy TextAloud, throw a bunch of money and very poor execution and support at it, only to fail.

While they aren't making money yet, they continue to hang in there, finding niches with voice recognition on game consoles, and even another use for the old DecTalk voices with this new offering from a licensee at
http://www.satogo.com/

Thursday, August 16, 2007

Thai Students Score a Prize For Speech Software

A team of four Thai students beat out 10,000 competitors to win the $25,000 prize in the Microsoft 2007 Imagine Cup. Their project is text-to-speech software in which computers read aloud typed and handwritten commands. The software will allow people who can't read to interact with a PC. Imagine Cup judge Rand Morimoto has been blogging on the whole experience — from his video of the opening ceremonies to how contestants swilled free Cokes to keep themselves awake during the 24-hour, no-sleep phase of the competition.


I'm not sure I understand exactly what they did. I'll see if I can find the details. Did find a few details of projects:

Team inGest, from the National University of Ireland, showcased a solution aimed at teaching people how to communicate in sign language. Through the use of coloured gloves and a Web camera, the solution tracks signing gestures, analyses this in real-time against data collected from proficient signers, and provides feedback to the student.

The SMOR team, from Serbia, demonstrated a driving simulation solution for driver education purposes. While the student sits in the simulator, the instructor is able to change the physical environment, including adapting road or weather conditions; introducing more cars to simulate rush hour; and introducing reckless drivers to teach students how best to react.

The EN# 65 team, from Korea, demonstrated a solution aimed at people suffering from both hearing and eyesight loss. Delivered through gloves, the solution enables speech through a system similar to typing and provides feedback or “hearing” via vibrations to the top of the hand.

Thailand's team showcased a solution which translated text into pictures to help the illiterate to learn to read and maintain motivation. With the use of a Web cam, pages from books can be scanned in and presented visually. The system also provides verbal tutorials.

The Intoi solution, from Team Austria, provides educators with a digital flipchart solution. Using rear projection onto a surface similar to whiteboard, the pen-based system facilitates multiple users and an ability to import presentations, pictures and videos from the supporting PC.

The Jamaican ICAD team demonstrated CADI, an e-learning solution which facilitates collaboration in a distance education environment. The solution provides real-time translation across 12 languages.

Wednesday, August 15, 2007

Scribd and TTS

Interesting site: Scribd.com is a sort of free-for-all where you can upload any documents you want. You could think of it as a YouTube for documents. The interesting thing is they have a feature where you can download any of their documents as an MP3 file. Voices aren't as good as TextAloud, and texts have a ton of junk in them, but it is interesting.

They are also breaking pretty much every copyright law around, but check out
http://www.scribd.com/doc/99266/Stephen-Hawking-A-Brief-History-Of-Time

After you skip over the junk at the front, it is a TTS generated audio book.

Pittsburgh - Speech Capital?

Nice Cepstral mention in this article


Technically speaking


By Rob Amen
TRIBUNE-REVIEW
Tuesday, August 14, 2007

For a city with a bizarre dialect, Pittsburgh is one of the speech capitals of the world.

That according to Craig Campbell, CEO of Cepstral, a tech start-up on the South Side that specializes in transferring text to speech.

"As a region, academically we're one of the top in the world," Campbell said. "Pittsburgh is looking for centers of excellence. We've identified education, hospitals, medical, nanotechnology. Speech isn't in there, but it (should be)."

The city is home to many technology start-up companies such as Cepstral that develop innovative products in relative anonymity. Full Story...


Saturday, August 4, 2007

Apple Phone Show TextAloud mention

Text To Speech: Convert Text Into Spoken Audio - Apple Phone Show


by Liana Lehua

Time is such a valuable commodity. How would you like to optimize your time by launching your iPod and having your work, school or other documents read to you on your iPhone while you purchase an Apple iPhone Bluetooth Headset online, query for directions to your lunch meeting in Maps, and email your latest sales figures to your boss?

Big props to Andy Ihnatko, Apple Phone Show Guest Host, for this awesome tip that works on both Mac and Windows platforms. Mac users will benefit from an on-board app called Automator, while Windows users will need to purchase a third-party software like TextAloud2 ($29.95). The voices provided to read your text to you may sound too much like a computer. There are alternate voice options covered later.


SETUP - MAC
Make sure the document you want to use is converted to plain text and that your document is saved with the .txt extension.
1. Open Automator.
2. Add an action by searching for and dragging “Get Contents of TextEdit Document” from the menu on the left to the blank box on the right.
3. Add action: “Text To Audio File” and complete the fields: System Voice, Save As, and Where.
4. Add action: “Rename Finder Items (Make Finder Item Names Sequential)”. In the first drop down box, select “Make Sequential”. Select “Add number to existing name”. Place number “after name”, and separated by “dash”.
5. Add action: “Import Audio File”. Select “AAC Encoder” and check the “Delete source files after encoding.”
6. Save the Automator workflow as “Text to Speech”. Go to File - Save as plug-in, and select Script Menu to save.

Now you are finished with Automator and only have a few more steps to complete. Continuing with the process:

7. Open the document in TextEdit.
8. If needed, make any modifications to the text at this time.
9. Select the Scripts menu located in your menu bar (it looks like a scroll or curly “S”) and choose the “Text to Speech” workflow.

You will see the status of the conversion at work in your menu bar. When it’s complete, an audio file will automatically be added to your library in iTunes.

SETUP - WINDOWS
1. Download and install TextAloud2.
2. Open the TextAloud2 program.
3. Go to File - Open and select the file you want to convert to audio. You can change the file name that appears in the Title field if you choose. Remember where you save this file as you will need to navigate to it later.
4. Click the Speak To File button.
5. Choose where you would like to save the new audio file, and select OK to start the process. Wait for the process to complete.
6. When the process is complete, drag the audio file into your iTunes library.

NOTE: I did not test the Windows setup. There may be some slight variations between these instructions and your actual experience.

VOICES

If you would like a more natural sounding voice to read your documents to you, AbleReader is an alternative that is both Mac and Windows compatible and uses “AT&T Natural Voices” (16kHz audio). The female voice, Crystal, is a bit more smooth than her counterpart, Mike. Mac pricing is $49.95 for the downloaded version. Windows pricing begins at $35 for the downloaded version and offers additional add-ons, including the option to use “AT&T Natural Voices” with TextAloud. For Intel Mac users, you can save your money and use Automator. AbleReader doesn’t work on Intel-based Macs.

Not being a fan of printing things out and carrying paper, this has been a time and tree saver. I have used this method to convert text from several web pages and articles so I can have them available to listen to as I have time. Full Story...

Friday, August 3, 2007

More TextAloud and Proofreading

NovelMaker.com Recommends TextAloud for Writers Seeking Valuable Tools to Improve Their Work

From proofreading and catching typos, to listening to flow, dialogue, and more, TextAloud makes a powerful secret weapon for writers

Clemmons, NC (PRWEB) August 3, 2007 -- For many writers, their writings often don't truly come to life for them until they can listen to those words spoken aloud. This makes Text to Speech tools like NextUp's TextAloud especially valuable, as they enable the writer to listen to her words and not only catch typographical errors that traditional word processing spell-checkers might miss, but to also get a sense of flow and syntax, and to even get a realistic feel for spoken dialogue exchanges between different characters. Now NovelMaker.com (www.NovelMaker.com), an acclaimed new online community for writers, agents and editors, has honored TextAloud by recommending the program to its participants seeking the best proofreading tools for writers.

TextAloud is an award-winning program that converts text into spoken audio for listening on a PC, and can also save to audio files for the added option of easy playback on-the-go, from iPod (TM), to iPhone (R), to car and beyond. Thanks to an elegant and simple program interface, along with a wide array of superb premium voices in almost any imaginable age, gender, or accent, many writers have already discovered the value of TextAloud to proof or simply listen to their work. Now comes the endorsement of NovelMaker, sure to be a resource for thousands of writers worldwide.

Just launched in July 2007, NovelMaker.com is the world's first truly interactive community for fiction writers, readers, critics, literary agents, editors, and publishers. With NovelMaker, authors upload their manuscripts or works in progress to the site for free, and can upon completion see their completed books in the site's own literary library. Community members (authors, editors, agents and more) as well as readers will then offer suggestions, reviews, ratings and encouragement. The site recommends TextAloud as one of a short list of helpful tools in its Helpful Tools & Links section (http://www.novelmaker.com/links_tools.php?s=).

"TextAloud offers enormous assistance in the writing process by catching errors more conventional tools like spell-checkers may miss," comments Chris Olander, President and CEO of NovelMaker.com. "It also offers a great deal of creative potential, for use in testing dialogue or even characterization. TextAloud ultimately offers a multitude of fascinating uses sure to be helpful to writers who take part in the NovelMaker community."

"We're delighted to have been recommended by NovelMaker," comments NextUp President Rick Ellis. "TextAloud has often been considered a secret weapon by many writers who have contacted us, assisting them in truly listening to their works in a way that was not possible before. We hope that writers and editors in the NovelMaker community can use this recommendation to discover the power of Text to Speech for themselves."

TextAloud enables anyone to easily and affordably export documents, books, magazine articles, web content, even e-mails, into spoken audio. The program smoothly converts text into spoken audio for listening on a PC or laptop, and can also save text to audio files for playback on portables like the iPhone (R), iPod (TM), PocketPC (R), and a wide range of other players and devices. Highly popular with people of a variety of professions and walks of life, TextAloud is affordably priced, and is simple for anyone with a PC, laptop or portable.

About NovelMaker.com

NovelMaker.com is the world's first truly interactive community for fiction writers, readers, critics, literary agents, editors, and publishers. From authors to agents, every participant in NovelMaker has access to a large, interactive community to participate with them in the creation, and potential commercial success, of new works of fiction. For more information, please visit www.NovelMaker.com.

About TextAloud

TextAloud has been featured in The New York Times, PC Magazine, Writer's Digest, on CNN, and more. Hailed by critics and users alike, TextAloud is priced at just $29.95, and is compatible with systems using Windows (R) 98, NT, 2000, XP and Vista. The program is available for fast, safe and secure purchase via http://www.NextUp.com

Thursday, August 2, 2007

A unique form of Text To Speech

This is one of the neatest twists on TTS I've ever seen
http://www.sr.se/P1/src/sing/#


Type in your text, and it will sing it back to you. Not sure there are many practical uses for this, but it is certainly a very clever idea that someone put a ton of work into.

Wednesday, August 1, 2007

Neat Kurzweil Reader for the Blind

Blind have ally in device that turns text into words

James Gashel had an "a-ha" moment last year as he walked to a boarding gate at Baltimore/Washington International Thurgood Marshall Airport.

He had just eaten at an airport restaurant. He had paid his bill in exact change and left a generous tip. Nothing unusual — except he didn't ask anyone to read to him.

Gashel has been blind for 50 years. Typically when he ate out, a waiter or a fellow diner would have to read the menu and the bill to him.

But last year at the airport, he used a portable machine instead.

"As I was walking to my boarding gate, I stopped and thought, 'You just had an experience that you've never had before,'" he said.

Since then, he has used the machine — the Kurzweil-National Federation of the Blind Reader — in many places and conducted thousands of transactions without anyone reading to him.

While he is independent and capable without the reader, the ability to handle printed text anytime and anywhere has simplified his daily schedule, he said.

"I find that to be life-changing," Gashel said.

The reader, unveiled last June, is the first hand-held device that translates text into speech, said Chris Danielsen, spokesman for the Baltimore-based National Federation of the Blind. Full Story...