Wednesday, November 28, 2007

SpeechTech TextAloud mention

Presenting the Bible, in MP3 or .Wav

While conducting missionary work in South Africa, Pieter Schutte encountered a problem common among English-speakers wishing to communicate with those whose primary language is not English: how to effectively get a message across in English, without confusion.

Schutte, who has been a missionary since the 1980s, struggled with getting Biblical passages and lessons across to Africans with poor English-language reading skills. After conducting an Internet search for a text-to-voice program, Schutte found the Web site for TextAloud, a division of NextUp Technologies that provides both TTS and STT software. Schutte purchased the software for $29.95 and began using it to convert Biblical passages, Bible studies, and sermons for non-native English speakers into audio formats.

According to Rick Ellis, president of NextUp, the software provides further text comprehension when converted into audio format. "A light bulb will go off (in the user's head) if you’re (reading and listening) simultaneously," he states. "To hear it as they’re looking at it; it unlocks the reading."

Although TextAloud is used predominantly as an assistive tool for people with various disabilities and for English language learners, Ellis says the company has kept track of unique uses of the software by customers like Schutte. Ellis states that over the company's eight years of operation, the software has come to be used by a wider array of users, including lawyers, court reporters, firefighters, and police officers.

The program is touted primarily as an educational or consumer product. It converts text to speech on a PC or laptop and supports files in text, HTML, and PDF formats. Schutte adds that he has found himself using the software to complete activities related to his personal life, but that his primary use of TextAloud is still for missionary work. Full Story...

SMS to TTS

This is one of those "Now why didn't I think of this?" ideas. Our phone rang the other day. Caller ID showed the cell phone of one of my daughter's friends. When I answered, I instantly knew it was TTS talking to me. The cool thing was, it was reading a text message (SMS) that her friend had typed on her cell phone.

Embarq Wireless Speaks to Landlines



Embarq wireless has introduced a new text-to-landline service for their wireless customers. The new service will automatically convert text messages sent to landlines into speech, with no extra fees other than standard messaging rates. The competitive advantage of Embarq’s new service is that they have built an entire dictionary of commonly used SMS acronyms and shorthand, which are translated by the system.

For instance, if you texted ‘LOL, Ricky, U R my BFF’ to my landline, I would get a recording that says, “Laugh out loud, Ricky, you are my best friend forever.”
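The acronym translation described above can be pictured as a dictionary lookup applied to each word before the text is handed to a TTS engine. The following is only a minimal sketch of that idea, not Embarq's actual system: the table contents and the function names `expand_sms` and `to_sentence` are assumptions made for illustration.

```python
import re

# Hypothetical shorthand table for illustration only; the real
# dictionary used by any carrier is not described in the article.
SMS_SHORTHAND = {
    "lol": "laugh out loud",
    "u": "you",
    "r": "are",
    "bff": "best friend forever",
}


def expand_sms(text, table=SMS_SHORTHAND):
    """Replace whole-word shorthand tokens with their spoken form."""
    def replace(match):
        word = match.group(0)
        # Look up case-insensitively; leave unknown words untouched.
        return table.get(word.lower(), word)

    # Match runs of letters, digits, and apostrophes as words so that
    # punctuation and spacing pass through unchanged.
    return re.sub(r"[A-Za-z0-9']+", replace, text)


def to_sentence(text):
    """Expand shorthand, then capitalize the first character."""
    expanded = expand_sms(text)
    return expanded[:1].upper() + expanded[1:]


if __name__ == "__main__":
    print(to_sentence("LOL, Ricky, U R my BFF"))
```

Matching whole words rather than substrings keeps a name like "Ricky" from being altered by the single-letter entry for "r". A production system would also have to handle ambiguous shorthand, which this sketch does not attempt.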

Sprint also has this service, though I have not seen it advertised at all. It is listed as a feature on their text messaging information page, however. Sprint’s service does not appear to have the cool acronym dictionary, and I have not been able to test the accuracy of either system. One thing I can say, however, is that with Embarq and Sprint, in addition to services such as Spinvox, the speech-to-text and text-to-speech arena is getting interesting.

TTS and Animation

Interesting, this one is in my back yard almost.

The IMS Voice-to-Animation Solution Selected to Power

RESEARCH TRIANGLE PARK, N.C.– MakeBeliever Productions, LLC has selected Interactive Multimedia Solutions, Inc.'s IMS V2A MDK™ (Voice-to-Animation Multimedia Developer’s Kit) product for the development of the company’s innovative online personalized video greeting solution. MakeBeliever is an Arkansas-based company that was recently founded to revolutionize and transform the online greeting card market with an unprecedented level of personalization in online video greetings. For the first time ever, consumers can create their own videos and have a real human character say virtually anything they want to anyone they want with an automated process.

“Being part of this next generation of online video greeting cards is exciting and just the type of project that truly demonstrates the capabilities of our core solution offering,” said Donovan Moxey, Ph.D., founder & CEO of IMS. “The IMS team has been involved in every aspect of this project from conception to launch, and we along with our strategic partners are very pleased with the final solution released with the launch of the MakeBeliever web site.”

The IMS V2A MDK™ is a developer kit that takes a recorded voice and automatically creates precise mouth, jaw, and lip position data. The audio files can be in any language, and recorded by a human, or generated by a text-to-speech (TTS) engine. The core voice-to-animation engine is ideally suited for English dialog, and because of its unique proprietary co-articulation algorithm, animation data can be generated for French, Italian, German, Spanish, Portuguese, Japanese, Mandarin Chinese and a number of other languages.

In order to assist MakeBeliever in bringing this solution to market, IMS leveraged several of its key strategic international relationships to provide web development, software integration, text-to-speech, texture mapping and other services.

“What is really innovative about the MakeBeliever offering is for the first time ever a consumer-oriented solution that incorporates 3D animation and video compositing on a grand scale is being provided without the need for a post-production environment,” said Moxey. “The MakeBeliever V-Greetings utilize key technology components to showcase the true value of an automated voice-to-animation engine as part of an integrated multimedia solution that focuses on personalization and user-generated content.”

“We are very pleased with the total solution and the role that the IMS team has played in helping to make our vision a reality,” says David Wallace, Co-Founder of MakeBeliever. “When we were looking for technology solutions for our product offering, we also wanted a company with innovative thinkers who would be dedicated and could execute; we definitely found this in the IMS team.”

IMS Products and Solutions have been used in a number of other consumer-oriented entertainment products and applications including the Spider-Man series of video games from Activision, Valve’s Half-Life2 game engine, and several game titles including “A Shark’s Tale” from Amaze Entertainment. MakeBeliever’s launch centers around charming Santa video greetings where Santa “knows” each child’s name, home state, favorite activities, nice deeds, what they want for Christmas and more.

“Early testing indicates that children have been mesmerized by the level of personalization in their greetings. Much of this believability is due to the technology and unique expertise IMS provided. It also has allowed us to keep our price below $10,” says MakeBeliever co-founder Ted Taggart.

“We are very excited to be part of this revolutionary change and look forward to continuing to team with MakeBeliever to expand this innovative technology to other applications and other languages,” said Moxey. “We think we are just getting started.”

About IMS
Interactive Multimedia Solutions, Inc. (IMS) is a multimedia software and solutions company focused on developing and delivering innovative software solutions that allow for the seamless integration of interactive animated talking digital avatars as part of multimedia solutions to be delivered via the web or as part of a DVD/CD-ROM. The software solutions developed by IMS can be applied across a number of markets including Online Entertainment, Gaming, Interactive Training, and Interactive Kiosk systems.
www.IMS3D.com

About MakeBeliever
MakeBeliever Productions is a multimedia company and the developer of a new online product category: V-Greetings (virtual greetings). Founded in 2007 by longtime friends and entrepreneurs Ted Taggart and David Wallace, MakeBeliever is changing the way people exchange online greetings by enabling customers to create realistic, personalized Flash videos that are delivered via e-mail. MakeBeliever fills a distinct void in the Web 2.0 marketplace, offering customers an unmatched opportunity to create video messages specifically tailored to each recipient.
www.makebeliever.com

Word plug-in to create Daisy Books using TTS

This should be great for many sight-impaired users and should also help spur more TTS voice sales.

Word Daisy chain to help poorly-sighted

New plug-in saves to format designed to facilitate navigation of documents and web-pages

Clive Akass, Personal Computer World 26 Nov 2007

A new plug-in for Word 2007 will help visually-impaired people navigate complex documents.

The free utility, available early next year, will allow documents based on Microsoft’s new Open XML format to be converted to one called Daisy XML, developed originally to facilitate the navigation of audio books.

It will initially be of most use to people who produce talking books, designed for PCs or specialist devices, in which speech is synchronised with large-print text, says Rob Longstaff, operations manager at the Royal National Institute for the Blind.

He explained that many partially-sighted people want to be able to see a large-print version of what is being read. “For instance, a student might need to know how to spell a word, and so needs to see as well as hear it.”

But, without the visual cues most of us take for granted, poorly sighted people find it hard to find their way round documents and have to rely on sound cues. Daisy, short for Digital Access Information System, is an industry standard that helps set these up by flagging features such as headers and footers that can act as navigation aids.

The new Word 2007 plug-in is a joint project between Microsoft and the Daisy organisation, a consortium of non-profit organisations and talking libraries.

Longstaff said it should speed up the production of talking books but in the longer term text-to-speech utilities that read letters and other documents could be made smart enough to take advantage of Daisy XML.

At least 8 million people in the UK are 'print-disabled' and rely on synthetic speech to navigate electronic text, according to Julia Howell, director of accessibility at digital design agency Fortune Cookie.

She said the Daisy plug-in will make it easier to make complex timetables and itineraries on travel sites more accessible to “anyone who chooses to navigate the web using sound.”

Wednesday, November 7, 2007

SVOX Pico

SVOX has a cool name for their new small-footprint embedded voices for phones.

SVOX announces SVOX Pico, a revolutionary new Hidden Markov Model-based Text-to-Speech Product for Mobile Phones


Seeking to catalyze large-scale adoption of cell phones, SVOX AG CEO Volker Jantzen today announced SVOX Pico, a revolutionary Hidden Markov Model (HMM)-based text-to-speech product, to help people and businesses better embrace mobile speech technology. SVOX Pico is the first dedicated handset solution to complement the growing success of SVOX speech technology in the mobile market and help even more people use the benefits of hands free mobile solutions.

"People expect to be able to do more and more with their cell phone," Volker Jantzen said. "We're building on our expertise across the globe to deliver speech user interface experiences that leverage the unique SVOX technology. With SVOX Pico we are opening new opportunities for cell phone users for a true hands-free, eyes-free access to information. SVOX puts an end to speech solutions that only lend themselves to one or two use cases. In contrast, SVOX Pico is designed to flexibly support a wide range of applications: navigation, location-based services, SMS, e-mail and screen reading as well as music content. Our TTS playback response time is very low and SVOX Pico produces voice output much faster than our competitors' TTS products. That's one of the reasons we are the navigation industry's most trusted speech solution partner."

Industry Shows Broad Support for SVOX Pico

"Success in the mobile space means integrating powerful speech solutions that enhance the cell phone user experience," said Eric Lehmann, CSO, SVOX. "By supporting mobile device companies in more than 20 languages, we are building upon our long and successful alliance with the mobile industry to provide people with a compelling embedded speech solution. Mobility is the future of business. The SVOX Pico platform as the core of a highly attractive user interface will enable this future. We are going to roll out dozens of new languages in order to serve our mobile industry customers better and keep up with their incredible growth rate. Our language portfolio will consist of over 40 languages in 2009."

Available in 2008, SVOX Pico breathes life into cell phones in 20 plus languages

Key benefits are natural, intelligible text-to-speech output supporting true hands-free, eyes-free user interaction with mobile devices. Low footprint (ca. 1 MB) and the modular SVOX software architecture support rapid integration and easy voice and language updates providing high quality TTS for the cost-sensitive mobile market. The unparalleled footprint / quality ratio is the breakthrough for speech technology in the mobile phone market.

Monday, November 5, 2007

TTS and Science Rap

In this hip-hop universe, science - not violence - is what they rap about



Hip-hop's evolution from genre to brand has included just about everything: clothing lines, movie roles, even NASCAR endorsements. It's easy to forget its origins are rooted in sharing a message, a lifestyle, and a story.

Under the guise of MC Hawking, Ken Leavitt-Lawrence has a story, too. The 36-year-old Web developer from Gloucester is a self-described geek with a penchant for "Star Wars" and Led Zeppelin. More important, he is part of a growing crop of "nerdcore" hip-hop artists garnering buzz for rapping about computer science, video games, and comic books.

Coined in 2000 by movement pioneer Damian Hess, a.k.a. MC Frontalot, who opens his fall tour with a show in Providence Thursday and then comes to Harpers Ferry in Allston Friday, nerdcore includes artists who have earned degrees from Ivy League schools, as well as those who haven't even graduated high school.

What connects them is a DIY approach and an underdog's passion for self-expression - whether to share the struggle of being bullied or of studying for PhD exams.

As MC Hawking, Leavitt-Lawrence uses a text-to-speech program to imitate researcher Stephen Hawking's digitized voice. He mixes self-produced beats and live instrumentation that he records at a studio a few miles away from his duplex. While he acknowledges a certain amount of parody to what he does, it's a craft he takes seriously. He says he spends hours researching such concepts as entropy and natural law to use for his rhymes.

In one song, Leavitt-Lawrence takes on creationists with the lyrical jab "upon blind faith they place reliance/ What we need more of is science."

With a sharp delivery, MC Hawking has skill behind the mike, but it's unlikely that will translate to Top 40 radio success. Because Leavitt-Lawrence's Hawking persona is confined to a wheelchair, he can't even take his act on the road. It's that idea - Stephen Hawking rapping - that reveals the genre's narrow if quirky appeal.

At 33, Hess is a nerdcore veteran - and among the rappers with the most crossover appeal. Marketing and releasing his music independently, Hess has sold about 7,000 albums in fewer than two years, just enough to pay for groceries and rent. With percussion-heavy beats and a confident flow, his music tackles everything from homophobia and the indie-music scene to getting older.

"The whole nerd narrative is alienation and powerlessness," he says from Brooklyn.

Hess recognizes nerdcore's potential - and also its limitations. "If [nerdcore was] to make record-company money, it'd have to become 15 or 20 times bigger if they marketed it carefully," he says. "When you're coming up through the Internet, you're not following the traditional model for music. For me, I just sat by a computer to see who would bite. I didn't do any touring until there were lots of fans e-mailing me all the time."

What may surprise many listeners is not where nerdcore differs from mainstream hip-hop, but where the two converge. In nerdcore, rap rivalries, a staple of mainstream hip-hop, are fought on message boards like rhymetorrents.com instead of face to face. Threats are also refracted through a nerd prism. "[In nerdcore], it's all about 'I can find your Social Security number; I'll find out where you live,' " says Dan Lamoureux, director of the upcoming documentary "Nerdcore for Life," which chronicles the stories of several nerdcore rappers. Full Story from Boston Globe...