I seem to have got TextAloud running on Linux with NeoSpeech US English premium voices.
The TextAloud icon is now sitting on my Linux desktop, and I can simply click it to start using this great application in Linux. An important point is that TextAloud integrates with the Linux environment, so I can now open and save documents directly in TextAloud on my Linux machine. Running TextAloud on Linux also lets me convert text directly to WAV and MP3 audio files, which are written flawlessly to the hard disk using the Linux file system.
I decided to post the steps that got TextAloud working on Linux for me, in the hope that other users looking to move TextAloud to Linux will find these guidelines helpful.
1. Install Wine. Wine is needed to run TextAloud on Linux. If Wine is not installed on your system, you will have to download and install the appropriate package. See the Wine User Guide for detailed instructions.
2. Configure Wine. The latest builds have a graphical interface and are easy to configure. Just run 'winecfg' in a terminal window to open the configuration applet.
A. I’d strongly recommend setting Wine to emulate Windows Me. I’ve tried other Windows versions on top of a Red Hat based Linux and found that Windows Me was the fastest and didn’t cause any issues with opening and saving files in TextAloud.
B. Make sure Wine is set to use the appropriate sound driver. Older Linux distributions (kernel 2.4) typically use the OSS driver, while 2.6 kernels have switched to ALSA. (In my setup ALSA worked fine.) Also set Acceleration to ‘Standard’ and uncheck the ‘Driver Emulation’ box under the audio settings tab.
3. Install SAPI5. You will need SAPI5 to use NeoSpeech premium voices with TextAloud on Linux. To check whether SAPI5 has already been installed, download, unzip and run the FixRegistry utility.
If FixRegistry reports a SAPI error, download and install SAPI5.
4. Install TextAloud. For some unknown reason, TextAloud v 223x might need to be reinstalled to work properly on Linux. It was not until I reinstalled TextAloud that its icon showed up on my Linux desktop and TextAloud started ok. I installed TextAloud using the default options suggested by the installer and then simply installed it a second time on top of the first installation.
5. Launch TextAloud to verify it works ok with the SAPI5 Microsoft voices and then exit. If you see the splash screen but TextAloud freezes at the initialization stage instead of starting, the most likely reason is that TextAloud can’t find the audio device. Try switching to a different audio driver in Wine.
6. Install NeoSpeech voices. Please note that the NeoSpeech temporary files take about 1 GB of disk space, so the system might ‘freeze’ for a minute or two during the install.
7. Launch TextAloud and enjoy using this great program with NeoSpeech premium voices on Linux.
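For anyone who would rather script the checks in steps 1 and 2, here is a minimal Python sketch. It assumes Wine is started with the `wine` command and that a Wine directory can be selected through the `WINEPREFIX` environment variable. The installer file name is made up for illustration, and treating every kernel from 2.6 onward as ALSA is my own simplification of the rule of thumb in step 2B.

```python
import os
import platform
import re
import shutil


def suggest_audio_driver(kernel_release):
    """Return "OSS" for kernels older than 2.6 and "ALSA" for 2.6 and later."""
    match = re.match(r"(\d+)\.(\d+)", kernel_release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {kernel_release!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return "OSS" if (major, minor) < (2, 6) else "ALSA"


def wine_command(windows_program, wine_prefix=None):
    """Build the argument list and environment for running a program under Wine."""
    env = dict(os.environ)
    if wine_prefix is not None:
        env["WINEPREFIX"] = wine_prefix  # selects which Wine directory is used
    return ["wine", windows_program], env


if __name__ == "__main__":
    if shutil.which("wine") is None:
        print("Wine not found on PATH; install it first (step 1).")
    else:
        print("Suggested audio driver:", suggest_audio_driver(platform.release()))
        # "textaloud_setup.exe" is a hypothetical installer file name.
        argv, _ = wine_command("textaloud_setup.exe")
        print("To run the installer:", " ".join(argv))
```

The script only prints a suggestion; the driver itself still has to be selected in 'winecfg' as described in step 2.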
Hopefully this info will be of some help to users who want to start using TextAloud in Linux. As it’s impossible to work out a universal scenario applicable to all Linux versions, I’d greatly appreciate any comments, corrections, and additions to this topic.
Tuesday, February 27, 2007
Monday, February 26, 2007
I did some checking on the idea of using the Windows desktop as a hands-free device for a cell phone. In other words, the PC acts like a headset, where audio output from the PC is sent to your cell phone mic, and audio input from the cell phone is directed to your PC speakers.
I found that the Widcomm Bluetooth stack includes a Headset service, and it was pretty simple to set up to work with Text-To-Speech output.
Assuming you have the Widcomm stack installed, here are the steps:
1. Enable bluetooth on the pc and cell phone.
2. On the cell phone, search for new devices, and locate your pc. Add that device.
3. Next, scan the device for services and look for "Headset". Select this service.
4. On your phone, mark your pc 'device' as handsfree.
Now when you make a phone call, your cell phone should automatically connect to the pc as if it were an ordinary headset.
The only other thing you need to do is choose "Select Audio Device" from the NextUp Talker Options menu, and set the audio output device. On my system the audio device is named "Bluetooth Audio". Once this is selected, TTS output from NextUp Talker is sent to your 'headset' microphone.
The real trick to this is figuring out what Bluetooth stack you're using, and getting the Widcomm stack installed if you happen to be using the Microsoft stack. If you're using a stack other than Microsoft or Widcomm, you'll need to find out if the stack includes a headset service.
Some helpful links:
Finding Your Bluetooth PC Stack
Updating the Bluetooth stack on your XP Computer
This second link talks about bluetooth headsets in general, and covers how to switch from Microsoft to the Widcomm stack.
The interview with Paige was awesome; everything worked. She will now edit the recording before putting it “on air”. I’ll announce the details once I know them.
For now, I would like to share how I, with cerebral palsy and a significant speech impairment, was able to give my first radio show! It is actually mind-boggling that this “non-verbal” red-head was able to do this.
Here are the steps taken to accomplish this feat:
- Paige sent me her questions ahead of time.
- I typed my responses into Microsoft Word.
- I copied each individual response into my text-to-speech software TextAloud and tweaked the text so that my computerized voice Kate reads it as accurately as possible.
- I saved each response as a separate wave file.
- I created one PowerPoint slide with links to each wave file; that way each response is only one mouse click away.
- In Paige’s online room used for recording, when it was time to give a response, I hit the microphone button and then the appropriate link in PowerPoint.
- Voila…my first radio interview!
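The same one-click idea can be sketched without PowerPoint. The Python below is not what was used for the interview; it is a hypothetical alternative that builds a plain HTML page with one link per saved wave file. The file names and the page title are invented, and the page is assumed to be saved in the same folder as the wave files.

```python
from html import escape
from pathlib import Path
from urllib.parse import quote


def build_response_page(wave_files, title="Interview responses"):
    """Return an HTML page with one link per recorded response, in order."""
    items = []
    for number, wave_file in enumerate(wave_files, start=1):
        name = Path(wave_file).name  # link by file name only
        items.append(f'<li><a href="{quote(name)}">Response {number}</a></li>')
    return (
        f"<html><head><title>{escape(title)}</title></head><body>\n"
        "<ol>\n" + "\n".join(items) + "\n</ol>\n</body></html>\n"
    )


if __name__ == "__main__":
    # Hypothetical file names; each link plays one prepared answer.
    print(build_response_page(["response01.wav", "response02.wav"]))
```

Opening the resulting page in a browser gives the same effect as the slide: each prepared answer is one click away.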
Friday, February 23, 2007
Build Your Own Talking Voice With VoiceForge(TM) by Cepstral
Self-Directed Voice-Banking Tools for Creating Synthetic TTS Voices
Distribution Source : Market Wire
Date : Tuesday, February 20, 2007
PITTSBURGH, PA -- (Market Wire - Feb 20, 2007) -- Cepstral LLC announced the release of VoiceForge(TM), a web 2.0 product that can turn a set of recorded audio prompts into a Text-to-Speech (TTS) voice capable of saying anything. With VoiceForge(TM), companies or actors can capture or "bank" their voices on their own. Once a voice is synthetically forged, it can be used to speak dynamic information for Entertainment, Telephony, Navigation, Education, or Reminder applications.
VoiceForge(TM) is a novel suite of web tools that gives clients the ability to create their own voices rapidly and inexpensively by themselves. Furthermore, the client retains all intellectual property rights to his/her voice creations. The final voice database is a plug-in that synthesizes using Cepstral's core engine. Cepstral's core engine runs on all platforms from the smallest cell phone devices to large distributed systems as well as PC, Mac, and Linux desktops offering unparalleled flexibility with respect to distribution once a voice is finished.
"As an industry, we make voices that are safe, but not necessarily exciting," said Cepstral CEO Craig Campbell. "With VoiceForge(TM), clients can now create unique high-quality TTS voices that keep pace with consumer and business demand for branded, celebrity, ethnic, and even cartoon personalities. To cite but one example of the need to improve voice diversity, there are currently no African American TTS voices available," added Mr. Campbell.
VoiceForge(TM) may help spur a new layer of speech services as companies take on voice building to serve the specific needs of their vertical markets. In the 1990s, Cepstral's founders released an open source TTS engine, Festival. Today, Cepstral's tools are proprietary, but there exists an ecosystem of "master voice builders" who have toiled under the complex old tools and welcome newer-better-faster ones. One such partner is Silex Creations who uses VoiceForge(TM) to offer professional voice creation in conjunction with their audio and voice manipulation technology. "The VoiceForge system has been fast and intuitive. Within days we can hear voices. We are now in a position where we can commercially apply our experience and help clients bring truly exciting speech products to market," said François Lanctôt, president of Silex Creations.
VoiceForge(TM) is a breakthrough for any company, entertainer, or brand manager interested in voice banking their talent, preserving a celebrity voice for an estate, or extending a franchise to include dynamic features such as VoIP announcements, SMS-to-Voice, Text-to-Podcast, custom ring tones, etc. Clients have the option to bring the tools in-house, or contract services through third-party experts like Silex.
About Cepstral LLC
Cepstral is a speech technology company based in Pittsburgh, PA, USA, which provides speech technologies and services for the spoken delivery of information. We build high quality, natural sounding voices for hand-held, desktop, and server applications. Cepstral: We Build Voices.
Thursday, February 22, 2007
According to a survey conducted by text-to-speech software manufacturer NextUp.com, three out of four iPod and portable consumers also use their devices for listening to text outside the home or office, with 60 percent listening in their car.
The company manufactures TextAloud software utilizing voice synthesis to "speak aloud" documents, Web pages and e-mails for playback by MP3 players.
Rick Ellis, president of NextUp.com, said most people view their commute to work as wasted time, but with the use of portable audio devices like iPods, a whole new world for the commuter has opened up.
According to Novi Mayor David Landry, 183,000 vehicles travel Interstate 275 each day, and 153,000 vehicles pass through Interstate 96.
"The world passes through Novi," he said during his recent state of the city address.
And traffic backups come hand-in-hand with long or congested commutes, causing many drivers to reach for their MP3 players for solace.
Spoken Translation(TM) Unveils World's First Software for Reliable Translation of Extensive Written or Spoken Conversations
Converser is a system for two-way translated communication between limited-English-speaking patients and English-speaking caregivers. The system allows people who do not speak the same language to hold broad health-related conversations in real time, without a human interpreter. It addresses a major pain-point in healthcare organizations: low budgets for patient communication and interpreting services. Converser gives medical institutions a translation solution that not only significantly reduces costs but improves overall patient safety, helping to eliminate numerous grave errors made by non-professional human interpreters.
Converser represents a fundamental advance in Machine Translation (MT) technology. No other system on the market today can provide reliable, bi-directional, real-time, wide-ranging translation via multiple interface modalities including speech recognition.
To improve translation accuracy and enhance the user experience, Converser provides reverse (or back-) translations and permits verification and selection of word definitions to ensure that the translation is "what you mean to say." Never before has a commercial product for conversational translation enabled a user to verify in real time that the translation is accurate, and, if not, to correct it on the spot. By allowing even monolingual users to monitor and guide translations as they happen, Converser promotes understanding of and trust in its translations, even in wide-ranging conversations. Monolinguals are thus empowered in multilingual settings, achieving an unprecedented degree of control. Other software products usable for real-time translation (e.g. free online translation services like http://babelfish.altavista.com) provide no such control or confidence.
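The back-translation idea can be illustrated with a toy example. The Python sketch below is not Converser's method or API; it uses two tiny, invented word lists and word-for-word lookup purely to show the round trip: translate, translate back, and let the user compare the result with what was meant.

```python
# Toy word lists, invented for this illustration only.
EN_TO_ES = {"the": "el", "doctor": "médico", "arrives": "llega", "today": "hoy"}
ES_TO_EN = {"el": "the", "médico": "doctor", "llega": "arrives", "hoy": "today"}


def translate(sentence, lexicon):
    """Translate word by word; unknown words are marked so the user can see them."""
    words = sentence.lower().split()
    return " ".join(lexicon.get(word, f"[{word}?]") for word in words)


def back_translation_check(sentence, forward, backward):
    """Return (translation, back_translation, matches_original)."""
    translation = translate(sentence, forward)
    back = translate(translation, backward)
    return translation, back, back == sentence.lower()


if __name__ == "__main__":
    for text in ("The doctor arrives today", "The nurse arrives today"):
        translation, back, ok = back_translation_check(text, EN_TO_ES, ES_TO_EN)
        print(f"{text!r} -> {translation!r} -> {back!r} (matches: {ok})")
```

A real system would of course handle grammar and word senses; the point here is only that a monolingual user can judge the back-translation without reading the target language.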
Monitoring human translators is also impractical, although human translation errors have been a significant issue in healthcare institutions. Studies have shown that non-professional medical interpreters risk patient safety and increase liability. According to a study published in Pediatrics, the leading journal for illnesses affecting children, an average of 31 interpreter errors occurred on each of the 13 doctor visits studied.
Some of the mistakes were minor, such as omission of a word that did not significantly change a doctor's meaning, but 63% were considered serious enough to have medical consequences. In these cases, the incorrect translation changed the description of an illness to the doctor, misstated diagnostic or treatment options, or affected a parent's understanding of a child's condition or the need for follow-up visits or referrals.
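To put the study figures in perspective, a quick back-of-the-envelope calculation helps. The Python sketch below uses only the numbers quoted above; because the per-visit average is a rounded figure, the totals are approximate rather than the study's exact counts.

```python
def estimate_errors(visits, average_errors_per_visit, serious_share):
    """Return (approximate total errors, approximate serious errors)."""
    total = visits * average_errors_per_visit
    serious = round(total * serious_share)
    return total, serious


if __name__ == "__main__":
    total, serious = estimate_errors(13, 31, 0.63)
    # Roughly 403 errors in all, of which roughly 254 were considered serious.
    print(total, serious)
```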
The problem is a serious one. According to a 2004 study of 200 state hospitals, roughly 51 percent of California hospital patients who needed translation did not receive it. San Mateo Medical Center spokesman Dave Hook said in an Examiner.com article last summer that an estimated 35 percent, or about 23,400, of the Medical Center's annual patients speak a language other than English. He added that this number is growing. A significant number of medical errors have occurred nationwide because people have misinterpreted medical information.
In the U.S., the increasing number of patients with limited English proficiency (LEP) has recently been attracting considerable attention in federal and state legislatures. Language barriers impact the quality of care, service utilization, patient satisfaction, health outcomes, legal liability, and hospital admissions, and have resulted in excessive costs within the healthcare industry.
As a result, the Department of Health and Human Services (HHS) Office of Civil Rights and Office of Minority Health has mandated that any entities receiving federal funds, including healthcare organizations, "must offer and provide language assistance services, including bilingual staff and interpreter services, at no cost to each patient/consumer with limited English proficiency, at all points of contact, in a timely manner during all hours of operation."(1) The 6,003 hospitals (2003 http://www.USNews.com) and 836,156 physicians in the United States (2001 http://www.ama.com) are expected to absorb hundreds of millions of dollars to comply. Converser for Healthcare will directly address this demand and provide institutions with a reliable alternative which is lower in cost than any other solution on the market today.
Benefits of the Converser for Healthcare translation system: Converser is
-- A cost-efficient alternative to human translators, interpreters, and transcribers.
-- Highly reliable. It is the first broad-coverage translation product that allows a user to check accuracy and easily correct errors in real time, using Spoken Translation's unique Meaning Cues(TM) technology.
-- A private, consistent, and verifiable solution for translation in an environment where mistranslation could result in medical reporting errors and incorrect patient diagnoses.
-- An around-the-clock system that can be used anywhere, anytime.
-- Capable of broader, more reliable translation results than other automatic translation solutions on the market today.
Availability & Pricing:
Converser for Healthcare is available starting in March 2007 for English to Spanish, with other languages planned for release later this year. Chinese is planned for the healthcare market, while German and Japanese are currently under development for other markets. Converser can run on Tablet PCs or laptops (full-size or ultra-portable), and release is planned for numerous handheld devices. Converser uses Nuance's RealSpeak text-to-speech engine, which accurately pronounces translated text.
The list price is slated for $1,499. In North America, Spoken Translation will sell Converser for Healthcare through VARs (value added resellers), direct sales, and government contracts. In North America and worldwide, sales through OEMs are also planned.
For further information about reselling or purchasing Converser, please call 1-866-SPOKENT.
Monday, February 19, 2007
Samantha US English Female 22 kHz Voice Sample **New**: MP3 File | WMA File | WAV File
Sangeeta Indian Accent English 22 kHz Voice Sample **New**: MP3 File | WMA File | WAV File
Yannick German Male 22 kHz Voice Sample **New**: MP3 File | WMA File | WAV File
Alexandros Greek Male 22 kHz Voice Sample **New**: MP3 File | WMA File | WAV File
Especially excited about Alexandros as he is our first Greek voice.
Purchase is via
Sunday, February 18, 2007
We continue to receive a lot of requests for TextAloud on PocketPC. While we still don't have this ready, we have found a partner with a decent PocketPC product. It uses the NeoSpeech voices Kate and Paul. There isn't a downloadable trial version, but it is available for purchase. You can read more about it through the affiliate links below.
If you do purchase and we later come up with a TextAloud for PocketPC, we will give you a free upgrade to it.
To read more, see screenshots and purchase, use the links below:
Windows Mobile PocketPC Version
Windows Mobile SmartPhone Version
If you do purchase, be sure to email us and let us know what you think.
Kevin Lenzo has a unique background in academia, the open source community, and now as the founder of Cepstral, a text-to-speech (TTS) company seeking to interact with the open source community to build a commercial product. This gives him a panoramic view of both the potential and the problems involved in implementing voice technology most effectively.
Lenzo asserts that the key speech technology is not speech recognition, but text-to-speech. Speech output is of paramount importance, not speech input. He uses the example of a car radio to illustrate that buttons can be just as effective in controlling what you hear, with greater privacy, and without the inevitable occasional failures associated with speech recognition.
Lenzo presents a long list of possible applications of TTS, including hands-free in-car navigation systems, location-based weather reporting, remote network monitoring, and just-in-time broadcasting. He contrasts the latter with packaged podcasts that can end up relaying stale information. In all his examples, he sees it as crucial that devices are driven by user needs rather than the needs of the service provider, so that such applications can evolve into what he terms "an external brain" that, in a sense, controls the user. This may sound almost threatening on first hearing, but Lenzo welcomes devices, such as location-aware warehouse systems, that can guide and inform as you perform other tasks.
There are other areas in which TTS can be extremely useful. Lenzo is involved in a project in Kenya to provide speech services via phone in areas where computers are extremely rare. With literacy and language problems, TTS can provide accessibility to information that traditional computing cannot.
After bemoaning the problems involved in porting VoiceXML across different platforms, Lenzo ends his presentation with a plea for a vendor-independent cross-platform API for speech components.
More from ITConversations...
Friday, February 16, 2007
Within TextAloud, on the main menu, there are two looping items on the Speak menu:
Speak->Loop Speak Current Article
Speak->Loop Speak All Articles
With either of these, text will be repeated over and over until Stop is pressed.
This could certainly be an interesting way to learn material.
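The looping behaviour is easy to picture in code. This Python sketch is not how TextAloud is implemented; it simply shows a loop that keeps speaking a list of articles until a stop flag is set. The `speak` callback and the flag are stand-ins for a real speech engine and the Stop button.

```python
import threading


def loop_speak(articles, speak, stop_event):
    """Speak the articles in order, over and over, until stop_event is set.

    Returns the number of passes started over the list.
    """
    if not articles:
        return 0  # nothing to speak, so do not spin forever
    passes = 0
    while not stop_event.is_set():
        for article in articles:
            if stop_event.is_set():
                break  # Stop was pressed between two articles
            speak(article)
        passes += 1
    return passes


if __name__ == "__main__":
    stop = threading.Event()
    spoken = []

    def fake_speak(text):
        # Stand-in for a speech engine: record the text, stop after five items.
        spoken.append(text)
        if len(spoken) == 5:
            stop.set()

    loop_speak(["first article", "second article"], fake_speak, stop)
    print(spoken)
```

Passing a single article corresponds to Loop Speak Current Article, and passing the whole list corresponds to Loop Speak All Articles.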
Thursday, February 15, 2007
Text To Speech to the rescue. Below are some quotes from writers who use it.
Praise from Writers who use TextAloud:
"Everything I write gets the run-through with TextAloud now," comments fiction writer Tom Hannon. "It is as important as running a spell-check." It's not uncommon, when proofreading from a screen or printed page, to miss sentence fragments or improper word choices. However, "with TextAloud, you can hear when something doesn't sound right," he adds. "It's so easy to use. It is the greatest writing tool since the word processor."
"I use TextAloud every day when I am writing," comments multi-published fiction author Kathryn Caskie, whose latest book, A Lady's Guide to Rakes, was released on September 1, 2005 for Warner Books. "In fact, when I was recently out of town and didn't have access to my favorite voice 'Audrey,' I wasn't nearly as productive."
Caskie finds the program valuable not only for its ambiance (listening to British-accented voice 'Audrey' helps her to get into the feel of the English characters in her historical romance in progress), but also for its value as a proofreading tool. "Most writers read their work aloud to themselves in order to make sure the dialogue sounds natural, as well as to catch typos and to ensure that their prose flows. However, by the time many authors get to this point, they've likely read any given passage several times. So, too often, we read what 'should' be on that page, not necessarily what actually appears. With TextAloud, I can just sit back with a cup of coffee and listen to my book read back to me. I follow along on the computer screen and correct any typos, and will also pause instantly if a particular line of dialogue doesn't sing."
She even uses TextAloud to save her chapters in MP3 format, loading them onto her iPod for later listening at her convenience, whether in the car or at a child's soccer game. "I've used TextAloud for two published books so far," she comments. "I love the program, and rave about it to all of my author friends."
Book Reviewer and fantasy novelist Sherryl King-Wilds values TextAloud's usefulness when editing her novels, as well as in writing her book reviews, and has high praise for the program's edit function in particular. "The ability to highlight a certain section of text and hear it without having to listen to endless blocks of 'wordage' gives me flexibility and saves me much-needed time," she says.
In her fiction writing, as well as her book reviews and articles for the webzine Fantasy Novel Review, the program alerts King-Wilds to awkward language - or worse, "the horrible typo," and allows her to correct such mistakes instantly. "I can fix things then and there," she comments, "without having to worry so much about my editor's flowing red ink pen." For her, TextAloud has proven to be "a time saver, an editor, a partner - even a preservationist of sanity" when deadlines approached.
So I wanted to give a quick run-through of how to set up and use the TextAloud Proofread hotkey for proofreading. Here is the FAQ response I usually give:
One great way to improve your writing is with better proofreading. Spell checkers catch the typos but don't help much with missing or wrong words, or with bad grammar. Whether you write for a living or just want your emails to be mistake-free, using TextAloud to listen to text will help catch most mistakes. To make this process even easier, in TextAloud 2.0 we've created a special proofreading process that makes catching mistakes and correcting them very simple.
You start by setting a proofread hotkey. Go to Options->Hotkey Setup on the TextAloud main menu, and select a hotkey combination for Proofreading. Hotkeys are special key combinations that you can press within any program, and as long as TextAloud is running (even if the window isn't displayed), TextAloud will take an action. You want the hotkey to be an obscure combination so that it will be unique and not in use by other programs. We recommend Control-Alt-Shift-P for Proofread. Once this is set, click OK. Now you can minimize the TextAloud window.
After typing in your document or email, highlight a paragraph of text with your mouse, then hold down the Shift, Control, and Alt keys, and press P. The TextAloud Proofread window will pop up. The text of the paragraph will be spoken, with each word highlighted. If you hear a mistake, simply click anywhere on the proofread window and speaking will stop, returning you to your original document to make the correction. You simply repeat this process through the document until it is mistake-free. Based on the feedback we've heard so far, this easy process can quickly lead to mistake-free writing.
From the TextAloud Manual:
An often overlooked use for TextAloud is to help proofread. Spell checkers within word processors and email clients help correct many common typing and spelling errors, but do little to correct other common problems such as typing the wrong word, leaving out words, or poorly constructed sentences. Proofreading the old-fashioned way is often ineffective too, because our brains are so adept at reading that we will often not catch mistakes. But hearing our own written words spoken back to us in another voice will almost always alert us to mistakes.
To assist with this proofreading task, TextAloud has a special Proofread HotKey and Popup Proofreading window. Via Options->HotKey Setup you can set a keyboard combination that will activate the proofreading window. Choose something obscure to ensure other programs will not be using the combination. We suggest using Control-Shift-Alt-P, but you can experiment with any combination that works for you.
The theory behind using the Proofreading function is that you need to hear small sections of text at a time, while watching the words being highlighted. Since you typically aren’t typing this text within TextAloud, but within your email or word processing program, you need a way to quickly return to the text to make any corrections. This means that using the TextAloud main window could become cumbersome. So instead, with the Proofread HotKey, a popup window will show you the text. If you see a mistake, simply click on the window; speaking will stop and you will be returned to the program you are writing in to make corrections.
To demonstrate this process, assume you are typing a document in a word processor such as Microsoft Word. Once the document or a section of the document is complete and you are ready to proofread, return to the top and highlight a paragraph. Next, hit the Proofread HotKey combination (Control-Alt-Shift-P for example). The TextAloud Proofreading Window will appear.
Text from the highlighted paragraph will immediately begin speaking as words are highlighted. You can customize the size of the window, voice and speed used, as well as Font and Colors used for the text. These settings will be remembered for future use. Most users will increase speed to slightly faster than normal listening since this is text they are already familiar with.
If while listening and watching the text you find a mistake, simply click anywhere within the text area of this window and speaking will stop and the window disappears, returning you to your word processor. Correct the mistake you found, and repeat the process. If no mistake is found, when the paragraph is complete, the window disappears and you are ready to repeat the process with the next paragraph, until the document is completed. This process will greatly cut down on mistakes in your writing.
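The paragraph-by-paragraph routine described above can be summarized in a short Python sketch. This is only an illustration of the workflow, not TextAloud code: the function names are invented, square brackets stand in for on-screen highlighting, and the `heard_mistake` callback stands in for the listener clicking the window.

```python
def split_paragraphs(document):
    """Split a document into non-empty paragraphs separated by blank lines."""
    return [part.strip() for part in document.split("\n\n") if part.strip()]


def highlight_steps(paragraph):
    """Yield the paragraph once per word, with the current word in [brackets]."""
    words = paragraph.split()
    for index in range(len(words)):
        shown = words[:index] + [f"[{words[index]}]"] + words[index + 1:]
        yield " ".join(shown)


def proofread(document, heard_mistake):
    """Walk the paragraphs in order and stop at the first mistake.

    Returns the index of the paragraph in which heard_mistake(word) was true,
    or None if the whole document was read without interruption.
    """
    for number, paragraph in enumerate(split_paragraphs(document)):
        for word in paragraph.split():
            if heard_mistake(word):  # the listener clicks; go fix this paragraph
                return number
    return None


if __name__ == "__main__":
    text = "This is fine.\n\nThis has teh typo."
    for step in highlight_steps("This is fine."):
        print(step)
    print("Fix paragraph:", proofread(text, lambda word: word == "teh"))
```

After correcting the flagged paragraph, you would run the same loop again, just as the manual suggests repeating the hotkey on each paragraph until the document is clean.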
Wednesday, February 14, 2007
MUMBAI: Imagine you are in a foreign country where you don't speak the language, and you need to decipher a confusing train schedule in a hurry. Wouldn't it be handy to be able to talk into a device, asking questions about departures and ticket prices, and have your queries translated into spoken word in the native language of train officials? Thanks to IBM's experiments with translation and speech technology, the spoken language gap for travellers, and others who might need a personal translator in their pockets, may be bridged now.
Revealing some innovative research initiatives that are under way, IBM innovation and technology executive vice-president Nicholas Donofrio says: "We have been working on speech technology for nearly 35 years now. As opposed to our earlier efforts where we were solely focussed on perfect translation, the ones that can stand up in a court of law, or face up to financial scrutiny, this time around we focused on its use in other parlances where perfection does not matter. A technology was born that can offer translation service in real time." Although several companies, including IBM, produce software that provides text-to-speech translation, so-called speech-to-speech translation engines have always remained on the horizon.
The prototype of the IBM software, dubbed Multilingual Automatic Speech-to-Speech Technology or MASTOR, "lets someone speak to me in say Hindi or Chinese, and the receiver understands by way of MASTOR what is being said in English in real time. Maybe a few words would be changed, give or take a few prepositions and adjectives here and there. But the whole idea now is to have speech-to-speech language translation even without the perfection of the language or grammatical skills on the part of the technology," explains Mr Donofrio.
IBM's earlier attempts at speech translation include ViaVoice, which gave voice to handheld devices like PDAs, and Phraselator. Now, after having tested it with the US armed forces in the Iraq war, IBM has commercial plans for its newest technological breakthrough. It intends to explore market opportunities where language translation technologies are in high demand, including medical facilities, law enforcement, banking and travel. "We are planning to talk to telcos and get them interested in offering the service," the innovation head for IBM adds. The company also plans to offer the service to first-time care givers such as the fire department.
The technology also brings good news to gamers around the world. IBM has plans for the technology for gaming companies. "The technology has uses in massively multiplayer online games aka MMOGs. Imagine you are playing World of Warcraft with players from different countries. How can you converse in a common language? Well with MASTOR, you could be playing from Korea, China or Italy, all at the same time and everybody could understand each other," he explains. For internet users, IBM also has plans to bring this technology to use in e-mails. "The service is not just for voice, it works for text as well," says Mr Donofrio.
Monday, February 12, 2007
NeoSpeech led in turn to the discovery of TextAloud, where NeoSpeech and other compatible voices are sold. While a TextAloud demo succeeded in getting the Microsoft Korean voice to work, the quality was predictably robotic - though my girlfriend thought at first it was reading with a North Korean accent - make of this what you will... Having satisfied myself that a combination of TextAloud and the NeoSpeech Korean 'Yumi' voice was the best, I bought them. Purchasing was fairly painless but the voice file was a 550 MB download which took a while.
Create audio books for iPod with TextAloud make your own MP3 audio
TextAloud is the text to speech tool that enables you to create your very own audio books, ready to Create your own audio books. In today’s busy world, audio books are an excellent way of being
We live in such a fast paced world. Everything that we do is getting quicker and quicker. Waiting a minute or two for something 10 years ago is equivalent to waiting 2 or 3 seconds today. Don't you think that is true? And we are now masters at multitasking. Even men multitask and that used to be thought of as a woman's talent. I found a new-to-me gadget that helps us to multitask called text to speech. More specifically it is a tool called TextAloud. It is really rather amazing! It can read your email, web pages, reports and lots more aloud to you on your PC. Imagine being able to dust and have your email read aloud to you, or cook supper while listening to a report for work. You can also have them saved as MP3 or Windows Media files. Then you can just take your reading with you and listen on your iPod or PocketPC. Isn't that amazing? More...
I have found something good!
As an increasingly frequent writer, my concern about putting out well-written material has given me pause for thought. Some recently read, self-published books that I found to be horrific literary adventures have only added to my concern. Given that I write a blog, I can't very well blame my editor, publisher or proof-reader.
I imagine that I could blame my wife (my unpaid editor)... but doing so would only confirm suspicions that my jackass writings are indeed written by... a jackass.
Well, the good thing I have found will at least let people know that I can, more frequently than not, put together the basics of grammar, punctuation and spelling.
The tool is called TextAloud, and it simply converts written words into words spoken aloud by my computer.
Sunday, February 11, 2007
Troubleshooting can be a difficult task, especially if you have not worked with a specific technology before. When it comes to troubleshooting text-to-speech problems, there are a few points that you should keep in mind.
- Use the Preview Text button from the Speech Properties dialog box to verify that the TTS engine is working.
- Open the Utility Manager to check the status of the Narrator program.
- If you do not hear any sound and you are using external speakers, make sure they are turned on.
- Check the Master Volume dialog box to make sure that muting is not enabled.
- Verify that the speakers are properly connected to the computer. You may need to check the documentation that came with the speakers for the proper procedure.
- Use Device Manager to check the status of the computer’s sound card. If necessary, reinstall or update the drivers for the device.
Friday, February 9, 2007
I want you to meet Mary. She has a monotone voice and speaks at a slow, steady pace. It's hard to listen to her for more than a few minutes at a time and yet I regularly submit to listening to her for several hours. Why, you ask? Because Mary is the voice in a computer program I have called TextAloud. (For more information see www.NextUp.com ) And one of the final steps on every manuscript is to let Mary read it to me. The purpose is not that I might enjoy the story. If I do, that's a bonus. What I want is to hear and catch typos, repeated words, things like using ever instead of even. And it's amazing how many little things I catch. Things that would normally be called line editing. But with Mary's help I can correct them before the manuscript goes to the editor. It's nice to know I'm sending a manuscript as clean as I can make it.
Thursday, February 8, 2007
These voices have come a long way over the last few years, and a few of them are very, very good. Cepstral's voices are known for a smaller footprint, using less memory and CPU power, and for being very quick at creating audio files. They also stand out for a couple of fun voices, Shouty and Whispery, which cost only $6.95 each.
The other unique thing about Cepstral is that you can download trial versions of their voices. The trial versions will work forever, but until purchased they play a short audio notice at the beginning of the audio.
You can download a trial of my favorite, Callie at
or the others via the Cepstral Store.
Samples and more info from our site below:
*Version 4.0 Cepstral Voices Now Available* Exciting new voices from Cepstral® are now available for only $29.99 each. These high quality voices take up less disk space (less than 50 MB on average) than most premium voices, do not use as much processor power, and are very fast when creating audio files. These SAPI5 compliant voices are supported by all NextUp.com products as well as most other speech products.
Wednesday, February 7, 2007
Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer, and can be implemented in software or hardware. A text-to-speech (TTS) system converts normal language text into speech; other systems render symbolic linguistic representations like phonetic transcriptions into speech.
Synthesized speech can also be created by concatenating pieces of recorded speech that are stored in a database. Systems differ in the size of the stored speech units; a system that stores phones or diphones provides the largest output range, but may lack clarity. For specific usage domains, the storage of entire words or sentences allows for high-quality output. Alternatively, a synthesizer can incorporate a model of the vocal tract and other human voice characteristics to create a completely "synthetic" voice output.
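As a rough illustration of the concatenation idea, here is a minimal Python sketch, assuming a unit database keyed by diphone name. The `sil` padding symbol, the `a-b` naming scheme, and the toy sample lists are all hypothetical; real systems store recorded waveforms and smooth the joins between units.

```python
def phones_to_diphones(phones):
    """Turn a phone sequence into the diphone units that span it.

    Assumption: the utterance is padded with a hypothetical 'sil'
    (silence) symbol at both ends, and units are named 'a-b'.
    """
    padded = ["sil"] + list(phones) + ["sil"]
    return [f"{a}-{b}" for a, b in zip(padded, padded[1:])]


def synthesize(phones, unit_db):
    """Concatenate the stored samples for each diphone, in order.

    unit_db maps a diphone name to a list of audio samples. Plain
    joining is used here; no smoothing is applied at the boundaries.
    """
    samples = []
    for unit in phones_to_diphones(phones):
        samples.extend(unit_db[unit])
    return samples


# Toy database: each "recording" is just a short list of numbers.
toy_db = {
    "sil-k": [0.0, 0.1],
    "k-ae": [0.2, 0.3],
    "ae-t": [0.4, 0.5],
    "t-sil": [0.6, 0.0],
}
audio = synthesize(["k", "ae", "t"], toy_db)
```

Storing larger units such as whole words would mean fewer joins per utterance, at the cost of a database that covers far fewer possible outputs.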
The quality of a speech synthesizer is judged by its similarity to the human voice, and by its ability to be understood. An intelligible text-to-speech program allows people with visual impairments or reading disabilities to listen to written works on a home computer. Many computer operating systems have included speech synthesizers since the early 1980s.
Overview of text processing
A text-to-speech system (or "engine") is composed of two parts: a front-end and a back-end. The front-end has two major tasks. First, it converts raw text containing symbols like numbers and abbreviations into the equivalent of written-out words. This process is often called text normalization, pre-processing, or tokenization. The front-end then assigns phonetic transcriptions to each word, and divides and marks the text into prosodic units, like phrases, clauses, and sentences. The process of assigning phonetic transcriptions to words is called text-to-phoneme or grapheme-to-phoneme conversion. Phonetic transcriptions and prosody information together make up the symbolic linguistic representation that is output by the front-end. The back-end, often referred to as the synthesizer, then converts the symbolic linguistic representation into sound.
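The first front-end task, text normalization, might look something like the following Python sketch. It is only a toy under stated assumptions: the abbreviation table is hypothetical, only the numbers 0 to 99 are spelled out, and no context is used to resolve ambiguous cases (for example, whether "St." means "street" or "saint").

```python
import re

# Hypothetical abbreviation table; real front-ends use much larger,
# context-sensitive rule sets.
ABBREVIATIONS = {"Dr.": "doctor", "St.": "street", "Mr.": "mister"}

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n):
    """Spell out an integer in the range 0..99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]} {ONES[ones]}"


def normalize(text):
    """Expand known abbreviations and spell out small numbers."""
    for abbr, word in ABBREVIATIONS.items():
        text = text.replace(abbr, word)

    def spell(match):
        n = int(match.group())
        # Numbers of 100 or more are left unchanged in this sketch.
        return number_to_words(n) if n < 100 else match.group()

    return re.sub(r"\d+", spell, text)


normalized = normalize("Mr. Smith lives at 42 Main St.")
```

The output of a step like this would then be passed to grapheme-to-phoneme conversion, which this sketch does not attempt.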
Monday, February 5, 2007
The Kate and Paul bundle of US English voices is $35, and they are fantastic. NeoSpeech is also unique in its focus on Asian voices. There is a nice interactive demo of the voices at
Some samples from our site
NextUp.com is pleased to now be able to offer new Text To Speech voices from NeoSpeech. Kate and Paul are US English voices, available in 16 kHz or 8 kHz versions, supporting SAPI5 Speech applications including all NextUp.com Products, most newer TTS programs from other companies, as well as TTS functions built into Windows XP.
Asian voices for Korean, Japanese, and Mandarin Chinese are available in 16 kHz SAPI5 versions.
Each voice requires between 300 and 650 MB of disk space, and is available on CD or via download. They support speed and pitch adjustments, and require a minimum of a Pentium II at 400 MHz with 128 MB RAM.
Listen to NeoSpeech Samples below:
The technology as created by VoiceWare is called VoiceText. Some of the info from their site on these voices: