Re: Lobaltech

From: Alex Ramonsky (alex@ramonsky.com)
Date: Wed Feb 27 2002 - 17:09:32 MST


Yes, well, it's all hype. The advantages of NNs, as these guys almost say,
are that they can learn things without rules and that some learning
algorithms can use relatively unstructured data. This makes them ideal for
areas where we don't know what's going on (rules) or how to describe the
phenomena involved (data), such as meaning and consciousness.
The disadvantages of NNs are that we don't know how to design them (we can
build an infinite number of physically/logically different NNs, but we
don't know how this affects their properties) and we don't know what
they'll learn given a particular set of data: it's all trial and error. In
this respect, Star Trek's Data is not too far from the truth.
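
To make the trial-and-error point concrete, here is a toy sketch (mine, not
anything from the Lobaltech material). It trains the same tiny network on the
same XOR data twice, changing only the random starting weights. The network
size, learning rate, epoch count and the names train() and predict() are all
assumptions picked for illustration.

```python
import math
import random

# The same four XOR examples are used for every run.
XOR = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0),
       ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]


def sigmoid(x):
    # Numerically safe logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def train(seed, hidden=2, epochs=2000, lr=0.5):
    """Train a 2-input, `hidden`-unit, 1-output net; only `seed` varies."""
    rng = random.Random(seed)
    # w1[j] = [weight for x0, weight for x1, bias] of hidden unit j
    w1 = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(hidden)]
    # w2 = one weight per hidden unit, then the output bias
    w2 = [rng.uniform(-1, 1) for _ in range(hidden + 1)]
    for _ in range(epochs):
        for (x0, x1), target in XOR:
            h = [sigmoid(w[0] * x0 + w[1] * x1 + w[2]) for w in w1]
            y = sigmoid(sum(w2[j] * h[j] for j in range(hidden)) + w2[hidden])
            # Backpropagation for squared error, one example at a time.
            dy = (y - target) * y * (1.0 - y)
            for j in range(hidden):
                dh = dy * w2[j] * h[j] * (1.0 - h[j])
                w2[j] -= lr * dy * h[j]
                w1[j][0] -= lr * dh * x0
                w1[j][1] -= lr * dh * x1
                w1[j][2] -= lr * dh
            w2[hidden] -= lr * dy
    return w1, w2


def predict(weights, x0, x1):
    w1, w2 = weights
    h = [sigmoid(w[0] * x0 + w[1] * x1 + w[2]) for w in w1]
    return sigmoid(sum(w2[j] * h[j] for j in range(len(h))) + w2[-1])


if __name__ == "__main__":
    for seed in (0, 1, 2):
        net = train(seed)
        outputs = [round(predict(net, a, b), 2) for (a, b), _ in XOR]
        print(seed, outputs)
```

Each seed ends up with its own set of weights, and nothing in the setup tells
you beforehand which runs will come out useful: you run it and look.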
Lots of people have applied NNs to speech synthesis and recognition, with
some success. The current vogue in ASR is for hybrid HMM+NN systems. NN
applications in ASR are quite promising, because there are many complex
low-level phenomena which we can't describe with rules but for which we can
say what the right answer is (e.g. 10**n waveforms of the utterance "fish"
by different people, all different, but all signifying the same thing).
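
A rough sketch of the "right answers instead of rules" idea, for what it's
worth. The two-number "features" below are invented stand-ins for whatever
would really be extracted from a waveform, and a single perceptron is about the
simplest learner that would do; make_examples(), train_perceptron() and
predict() are hypothetical names, not part of any ASR toolkit.

```python
import random


def make_examples(n, rng):
    """Invented 2-D features: every example differs, labels are known."""
    data = []
    for _ in range(n):
        is_fish = rng.random() < 0.5
        centre = 2.0 if is_fish else -2.0
        features = (centre + rng.uniform(-1, 1), centre + rng.uniform(-1, 1))
        data.append((features, 1 if is_fish else -1))
    return data


def train_perceptron(data, max_epochs=100):
    """Classic perceptron rule: adjust the weights only after a mistake."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for (x0, x1), label in data:
            if label * (w[0] * x0 + w[1] * x1 + b) <= 0:
                w[0] += label * x0
                w[1] += label * x1
                b += label
                mistakes += 1
        if mistakes == 0:
            break  # every training example is now classified correctly
    return w, b


def predict(w, b, features):
    score = w[0] * features[0] + w[1] * features[1] + b
    return 1 if score > 0 else -1


rng = random.Random(0)
examples = make_examples(200, rng)  # 200 different "utterances"
w, b = train_perceptron(examples)
```

No rule saying what makes something "fish" is ever written down; the weights
are nudged until the labelled examples come out right. Real waveforms are
nowhere near this cleanly separable, which is part of why people reach for
bigger networks.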
Applications of NNs to TTS are rarer, mainly because most of the problems
can be solved by rule and those which cannot are beyond our ability to
describe in enough detail for an NN application (e.g. the meaning of the
text, the emotional content of the speech, the details of the acoustic
transitions between sounds, ...).
There are a few things we could consider doing with NNs: augmenting our
ASR, adding meaning/emotion to TTS, or even investigating the contextual
effects on acoustic transitions.
What thinkest thou?
Ramonsky


