From: Alex Ramonsky (alex@ramonsky.com)
Date: Wed Feb 27 2002 - 17:09:32 MST
Yes, well, it's all hype. The advantages of NNs, as these guys almost say,
are that they can learn things without rules and that some learning
algorithms can use relatively unstructured data. This makes them ideal for
areas where we don't know what's going on (rules) or how to describe the
phenomena involved (data), such as meaning and consciousness.
The disadvantages of NNs are that we don't know how to design them (we can
build an infinite number of physically/logically different NNs, but we
don't know how this affects their properties) and we don't know what
they'll learn given a particular set of data: it's all trial and error. In
this respect, Star Trek's Data is not too far from the truth.
Lots of people have applied NNs to speech synth/rec, with some success.
The current vogue in ASR is for hybrid HMM+NN systems. NN applications in
ASR are quite promising, because there are many complex low-level
phenomena which we can't describe with rules but for which we can say what
the right answer is (e.g. 10**n waveforms of the utterance "fish" by
different people, all different, but all signifying the same thing).
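To make that concrete, here's a rough sketch in Python/numpy of the kind of
supervised mapping an NN learns: lots of different inputs, one right answer.
The "acoustic features" and the two word classes are entirely made up for the
illustration; this is nothing like a production ASR front end, just the bare
idea of a network learning a many-to-one mapping from examples.

# Toy illustration: a tiny one-hidden-layer network learns to map many
# *different* feature vectors -- standing in for different people's
# renditions of "fish" vs. "dish" -- onto the right label.
# All data here is synthetic; nothing comes from a real system.
import numpy as np

rng = np.random.default_rng(0)

DIM = 12           # pretend: 12 acoustic features per utterance
N_PER_CLASS = 200

# Each class is noisy variation around its own prototype vector.
proto_fish = rng.normal(size=DIM)
proto_dish = rng.normal(size=DIM)
X = np.vstack([proto_fish + 0.5 * rng.normal(size=(N_PER_CLASS, DIM)),
               proto_dish + 0.5 * rng.normal(size=(N_PER_CLASS, DIM))])
y = np.concatenate([np.zeros(N_PER_CLASS), np.ones(N_PER_CLASS)])

# One hidden layer, sigmoid output, trained by plain gradient descent.
HIDDEN = 8
W1 = rng.normal(scale=0.1, size=(DIM, HIDDEN))
b1 = np.zeros(HIDDEN)
W2 = rng.normal(scale=0.1, size=HIDDEN)
b2 = 0.0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.1
for step in range(2000):
    h = np.tanh(X @ W1 + b1)          # hidden activations
    p = sigmoid(h @ W2 + b2)          # predicted P(label = 1)
    # Gradients of the mean cross-entropy loss.
    d_out = (p - y) / len(y)
    grad_W2 = h.T @ d_out
    grad_b2 = d_out.sum()
    d_hidden = np.outer(d_out, W2) * (1 - h ** 2)
    grad_W1 = X.T @ d_hidden
    grad_b1 = d_hidden.sum(axis=0)
    W1 -= lr * grad_W1; b1 -= lr * grad_b1
    W2 -= lr * grad_W2; b2 -= lr * grad_b2

print("training accuracy:", ((p > 0.5) == y).mean())

In the hybrid HMM+NN systems mentioned above, a network like this (far
larger, and fed frame-by-frame features) typically supplies the per-frame
phone probabilities, while the HMM handles the time alignment.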
Applications of NNs to TTS are rarer, mainly because most of the problems
can be solved by rule and those which cannot are beyond our ability to
describe in enough detail for an NN application (e.g. the meaning of the
text, the emotional content of the speech, the details of the acoustic
transitions between sounds, ...).
There are a few things we could consider doing with NNs - augmenting our
ASR, adding meaning/emotion to TTS, or even investigating the contextual
effects on acoustic transitions.
What thinkest thou?
Ramonsky