
Re: mick post# 53

Friday, 11/18/2005 7:55:22 AM

PRODUCT REVIEW: Podcasts Converted to Text



Nov 17, 9:42 PM (ET)

By BRIAN BERGSTEIN

(AP) Suranga Chandratillake of Blinkx is photographed at his office in San Francisco, Thursday, Nov. 17,...







BOSTON (AP) - Suddenly the universe of downloadable audio files known as podcasts seems as enormous as the Internet. Name a topic - from the weather in Asuncion to the ZigBee wireless technology - and there is a podcast about it.

But while the Internet's vastness is accessible because of deep-probing search engines, comparably authoritative services for podcasts and other multimedia haven't really emerged.

That's because search programs are primed to catalog text. When they encounter an audio or video file, generally they determine the contents by reading the titles and other descriptive tags, known as "metadata," that creators voluntarily add.

It's useful, but much like examining only the first few lines of a Web site. Reading the whole thing is a lot better.




With that in mind, a few companies are trying to make search engines actually listen to big audio and video files. From there, speech-to-text software can generate written transcripts, which are searched in addition to metadata.
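The idea of searching transcripts in addition to metadata can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not how Blinkx, Podzinger or any other service is actually built: the episode records, field names and function names below are all hypothetical, and the transcripts are assumed to have already been produced by some speech-to-text step.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def build_index(episodes):
    """Map each word to the ids of episodes whose metadata or transcript contains it."""
    index = defaultdict(set)
    for episode_id, episode in episodes.items():
        for field in ("metadata", "transcript"):
            for word in tokenize(episode.get(field, "")):
                index[word].add(episode_id)
    return index

def search(index, query):
    """Return the ids of episodes that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Hypothetical sample data, invented for illustration.
episodes = {
    "ep1": {"metadata": "Weekly tech news show",
            "transcript": "today we talk about zigbee wireless sensors"},
    "ep2": {"metadata": "Weather report for Asuncion",
            "transcript": "expect rain through the weekend"},
}
index = build_index(episodes)
```

Here a search for "zigbee" finds the first episode only because the word appears in its transcript; a metadata-only index would miss it.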

Perhaps best known has been Blinkx Inc., an information-management startup that gets its speech-to-text software from Autonomy Corp.

Now comes BBN Technologies Inc., a defense contractor that developed elements of the Internet. After tinkering with speech-to-text programs it created for U.S. intelligence services, BBN has produced Podzinger, a Web service that mines the content of podcasts.

A third service, Podscope, from a broadcast-monitoring company called TV Eyes Inc., performs a similar trick, but with a twist. CEO David Ives says Podscope uses some voice-recognition technology but mainly scans for phonemes - the individual sounds that make up syllables - rather than full words.
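A rough sketch of what scanning for phonemes rather than words could look like follows. It is a toy under stated assumptions, not Podscope's method: the pronunciation table, the ARPAbet-style symbols and the timed phoneme stream are all hand-written for illustration, and a real engine would have to tolerate inexact matches.

```python
# Hypothetical pronunciation table using ARPAbet-style symbols.
PRONUNCIATIONS = {
    "osama": ["OW", "S", "AA", "M", "AH"],
    "laden": ["L", "AA", "D", "AH", "N"],
}

def query_to_phonemes(query):
    """Turn a text query into one flat phoneme sequence."""
    phonemes = []
    for word in query.lower().split():
        phonemes.extend(PRONUNCIATIONS[word])
    return phonemes

def find_phoneme_matches(stream, query_phonemes):
    """Return the start times at which the query sequence occurs.

    stream is a list of (phoneme, start_seconds) pairs, assumed to come
    from an audio recognizer that is not shown here.
    """
    if not query_phonemes:
        return []
    symbols = [phoneme for phoneme, _ in stream]
    n = len(query_phonemes)
    hits = []
    for i in range(len(symbols) - n + 1):
        if symbols[i:i + n] == query_phonemes:
            hits.append(stream[i][1])
    return hits

# Hypothetical phoneme stream for a short stretch of audio.
stream = [("DH", 0.0), ("AH", 0.1),
          ("OW", 12.0), ("S", 12.1), ("AA", 12.2), ("M", 12.3), ("AH", 12.4),
          ("B", 12.6), ("IH", 12.7), ("N", 12.8)]
```

Because the match is on sounds, the engine never has to commit to a spelling; it only reports the moments where the query's sound sequence appears.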

America Online Inc. is a big fan - it's due to begin using Podscope as its podcasting search engine this fall.


(AP) BBN Technologies Delta Division's president Alex Laats poses at company offices in Cambridge, Mass,...


I tried all three, and found BBN's Podzinger best at podcast searches because it offered the most user-friendly options.

- Podzinger lets you expand the links in search results to read a podcast's metadata, so you can quickly tell what kind of show it is. Podscope does, too; Blinkx does not.

- Podzinger lets you stream a podcast if you don't feel like taking on a time-consuming download. Podscope handles that too; Blinkx seems to do it only for video clips.

- The results displayed by Podzinger helpfully include segments of the transcript that contain the terms you were looking for. Then, by clicking on the transcript, you can instantly play a sample of the file from that moment.

That turns out to be the big differentiator, in my view.
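Jumping from a transcript excerpt to the matching moment in the audio is possible when each transcribed word carries a timestamp. The sketch below shows the idea under that assumption; the function name, the data layout and the sample words are hypothetical and are not taken from Podzinger.

```python
def find_snippets(timed_words, term, context=3):
    """Return (start_seconds, snippet) for each occurrence of term.

    timed_words is a list of (word, start_seconds) pairs, assumed to be
    the output of a speech-to-text step that records word timings.
    """
    term = term.lower()
    hits = []
    for i, (word, start) in enumerate(timed_words):
        if word.lower().strip(".,!?\"") == term:
            first = max(0, i - context)
            last = min(len(timed_words), i + context + 1)
            snippet = " ".join(w for w, _ in timed_words[first:last])
            hits.append((start, snippet))
    return hits

# Hypothetical timed transcript, invented for illustration.
timed_words = [("the", 61.0), ("senator", 61.3), ("mentioned", 61.9),
               ("podcasts", 62.5), ("during", 63.1), ("the", 63.4),
               ("hearing", 63.6)]
```

A player could show each snippet in the results and, when it is clicked, seek the audio to the returned start time.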


(AP) BBN Technologies Delta Division's president Alex Laats poses with an Apple iPod in front of a...


Podscope also lets you jump to moments in which it believes your term is mentioned. But you have to spend time listening to each snippet because the phoneme engine doesn't produce a transcript you can visually scan.

Blinkx shows a transcript, but you still have to cue up a clip from the beginning and find on your own the moment you think your subject might come up.

Blinkx appears to search the biggest pool of material - not only 45,000 podcasts but also millions of hours of TV broadcasts and homegrown video clips, which are displayed cleverly in thumbnail images alongside search results. This week Blinkx added lectures from Harvard, Princeton and other universities.

Podscope's podcast scope also is about 45,000, while Podzinger catalogs only about 11,000. But that should expand greatly, and incorporate video, as the site leaves beta mode.

To be sure, none of these sites has mastered audio recognition, a notoriously tricky beast. Computers still cannot consistently understand all the innumerable accents, mispronunciations and other nonstandard diction that colors human speech.

Even so, considering that the feds pay BBN a lot of money for real-time analysis of overseas broadcasts in Arabic and other languages, I found it funny that early in my test the phrase "Osama bin Laden" got no hits on Podzinger. Neither did "Usama bin Laden," the spelling often used by federal authorities.

When I shortened the search to "Osama," that brought up an episode of "MSNBC Countdown" in which host Keith Olbermann uttered the name of the terrorist mastermind - though Podzinger heard it as an Anglicized "Osama bin Lawton." I couldn't check whether the fault lay with a too-sharp pronunciation by Olbermann; the original material was gone from the Web.

To Podscope's credit, the same search returned a result in which a podcaster pronounced the name as "Oosama bin Layden." Nice catch.

Blinkx made a geopolitical gaffe by transcribing the following snippet from a Fox News broadcast about a political murder in Lebanon:

"... pro-Syrian President Emile Lahoud, citing a cell phone call Lahoud received minutes before the murder."

as:

"... pro-Syrian President Emile of food citing a cell phone colic who received minutes before the murder."

Do errors like that matter? To some degree. That Lebanon-Syria clip doesn't appear if you search for "Lahoud." But it does come up if you hunt for "colic."

You're not likely to encounter that non sequitur on sites that don't convert speech to text.

For example, I got precise results from Yahoo Inc.'s (YHOO) podcast search engine, which launched in October and claims tens of thousands of podcasts. It mines metadata and reviews written by listeners to raise the chance a search will yield a relevant result.

All nine results about "colic" were indeed related to babies' dreaded crying fits. (I got no hits for the term on Podscope and one on Podzinger.)

The good news for speech-to-text services is that they might improve with use. That's partly because the engines can learn better ways to determine words from their context.

Blinkx co-founder Suranga Chandratillake illustrates the process this way: If a podcast were made about the topics in this story, a computer probably would be right if it detected the phrase "recognize speech."

But in a podcast about last year's tsunami, the computer would do better to hear almost the same sounds as "wreck a nice beach."
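Chandratillake's example can be mimicked with a toy scorer that prefers whichever candidate transcription shares more words with the surrounding topic. This is a deliberately simplified sketch with hypothetical names and word lists; it is not Blinkx's algorithm, and production recognizers generally rely on statistical language models rather than plain word overlap.

```python
def pick_transcription(candidates, context_words):
    """Pick the candidate sharing the most words with the context.

    Ties go to the candidate listed first.
    """
    context = {word.lower() for word in context_words}

    def score(candidate):
        return sum(1 for word in candidate.lower().split() if word in context)

    return max(candidates, key=score)

# Two readings of nearly the same sounds, from the example above.
candidates = ["recognize speech", "wreck a nice beach"]

# Hypothetical topic vocabularies, invented for illustration.
search_story = ["speech", "software", "search", "recognize", "podcast"]
tsunami_story = ["tsunami", "beach", "wave", "wreck", "coast"]
```

With the first vocabulary the scorer picks "recognize speech"; with the second it picks "wreck a nice beach".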

---

On the Net:

http://www.blinkx.com

http://www.podzinger.com

http://podcasts.yahoo.com

http://www.podscope.com





Caspermick

