From: hal@finney.org
Date: Thu Jun 21 2001 - 14:28:24 MDT
Robin writes, regarding the Pratt report on CYC:
> Immediately after mentioning that report, the article Eugene cited says:
>
> >But in other tests, Cyc blew away the competition as decisively as Eurisko's
> >space cruisers. In July 1998, the Pentagon put Cyc and a dozen other AI
> >systems through their analytical paces, giving each team a package of 300
> >pages of abstruse data to program in their systems and following up with a
> >series of complicated strategic queries. Cyc scored better than all the
> >other systems put together, according to the company, leading the Pentagon
> >to make it the core of a new experimental program aimed at developing large
> >knowledge bases.
>
> It would be nice to learn more about this competition. But assuming it
> was managed in good faith, this seems strong evidence that Cyc is much
> better than the competition.
I should have read the Times article more carefully. I get the Times
and was reading it over breakfast, but I didn't get to the very end,
where it mentions this result. (However, it misstated the Pratt visit
as being in 1984, when it was of course in 1994.)
Actually I think this competition result is consistent with what Pratt
saw. CYC does a very good job of putting together facts when it knows
them. As long as the "ontology" (the set of concepts and words that is
input to the system) covers the area in question, it can find
inconsistencies and cross-correlate different databases. It apparently
excels at that, and it sounds like this is exactly what the Pentagon
test required: input 300 pages of data, then query the system on that
data. This is precisely what CYC (and, equally importantly, its
trainers) have been practicing for 20 years now.
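To make that concrete, here's roughly the kind of job I mean, as a toy
Python sketch. The predicates and facts are invented for illustration;
CYC's real representation (CycL) is far richer than these triples.

    # Two "databases" sharing an ontology of terms. Cross-correlation
    # follows shared terms across sources; an inconsistency is the same
    # single-valued property asserted with different values.
    db_a = {("UnitX", "locatedIn", "RegionY"),
            ("UnitX", "hasStrength", "battalion")}
    db_b = {("RegionY", "controlledBy", "FactionZ"),
            ("UnitX", "hasStrength", "company")}

    for s, p, o in db_a:
        for s2, p2, o2 in db_b:
            if o == s2:                         # chain through a shared term
                print("link:", s, p, o, "--", p2, o2)
            if (s, p) == (s2, p2) and o != o2:  # conflicting values
                print("conflict:", s, p, "=", o, "vs", o2)

Mechanical work like that scales well with database size, which would
explain why CYC did so well on this kind of test.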
But that's not really a test of common sense. What Pratt saw was
something else. Despite all the years of training, CYC's knowledge was
still extremely spotty. It knew that bread was food, but it didn't know
that bread wasn't drink. In fact it didn't know that food and drink are
normally mutually exclusive. It knew that lack of food caused hunger,
and that people could die of starvation, but it didn't know that hunger
and starvation were related.
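In knowledge-representation terms, CYC had the membership facts but not
the disjointness axiom that would license the negative conclusion. A toy
sketch of that gap (my own encoding, not CycL):

    # Membership facts are present, but nothing says Food and Drink
    # are mutually exclusive, so "bread isn't drink" can't be derived.
    isa = {("bread", "Food"), ("water", "Drink")}
    disjoint = set()          # the missing axiom: ("Food", "Drink")

    def known_not_isa(x, cat):
        # Deniable only if x belongs to a category disjoint from cat.
        return any((x, c) in isa and
                   ((c, cat) in disjoint or (cat, c) in disjoint)
                   for c in {c for _, c in isa})

    print(known_not_isa("bread", "Drink"))  # False: no way to tell
    disjoint.add(("Food", "Drink"))         # someone enters the axiom
    print(known_not_isa("bread", "Drink"))  # True: now it follows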
It knew the relative sizes of all the planets, but not their actual
sizes. It knew that the earth had an atmosphere, but not that the sky
was blue.
Seemingly it all depended on whether someone had actually sat down and
put a particular fact into CYC. At the time at least, only a small
number of the facts had powerful logical conclusions embedded in them
(male -> masculine). Most of them just existed statically, and the only
connections were that the same ontological concept might be used in more
than one fact. That's the impression I got from the Pratt report, anyway.
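Put as a toy sketch (again my own simplification, with made-up
predicates): a rare fact like male -> masculine carries a rule that
actually fires, while most facts just sit inert, linked only by reusing
the same terms.

    # Toy forward chaining over triples. Only one fact pattern carries
    # an inference rule; the other facts are static.
    facts = {("Fred", "isa", "Male"),
             ("bread", "isa", "Food"),
             ("hunger", "causedBy", "lackOfFood")}

    # Rule: anything that isa Male has gender Masculine.
    rules = [(("isa", "Male"), ("gender", "Masculine"))]

    changed = True
    while changed:
        changed = False
        for (p, o), (cp, co) in rules:
            for fs, fp, fo in list(facts):
                new = (fs, cp, co)
                if (fp, fo) == (p, o) and new not in facts:
                    facts.add(new)
                    changed = True

    print(("Fred", "gender", "Masculine") in facts)  # True: rule fired
    # Nothing connects hunger to starvation unless someone adds
    # that fact or rule too.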
The database is now 1.4 million assertions, three times bigger than
when Pratt was there (which puts it at roughly half a million assertions
then). It's not clear that this comparison is strictly fair, because
shortly before Pratt's visit they had shrunk the database by a factor
of 4 by eliminating redundancies. But assuming that the database was
reasonably compact at the time of Pratt's visit, it seems unlikely that
the current knowledge base covers much more than 3 times as much.
Would a three-fold improvement in coverage have made a qualitative
difference in what Pratt saw? I am skeptical. CYC's coverage seemed
far less than 1/3 of what Lenat's optimistic reports had led Pratt
to expect. He had a list of questions prepared, but they never really
got to it; the questions were obviously far beyond what the system
could be expected to handle.
> Surely Cyc is not as good as many people
> expect it to be, but it may be far better than they have any right to
> expect it to be. Cyc is probably still a long way off from getting "most"
> of common sense knowledge, but it is also probably much farther along than
> the other efforts.
Even if it is farther along, the relevant question is whether it is far
enough along to be useful at applying common-sense knowledge. That's
been the hard nut which CYC has hoped to crack all along. Clearly that
was not true in 1994, with half a million rules. Maybe things will be
different today.
> It is all well and good for Vaughan Pratt to call for
> more objective measures of progress, but if he thinks it so important maybe
> he should roll up his sleeves and create such better measures.
I felt that Pratt was politely suggesting that Lenat should stop making
claims about CYC that he can't back up objectively. This report was
a pretty big black eye for the project, and AFAIK they haven't invited
anyone else in for an outsider's analysis since then; at least, I
couldn't find anything on the web. I think it's a bad sign that they
didn't invite him (or someone else) back a couple of years later and
say, now look, see how much better it's doing.
> If Cyc
> does eventually succeed, all the other AI researchers should be called
> to account for what they were doing instead of helping to improve Cyc.
Well, you can hardly blame researchers for trying many paths to the
truth, or criticize them when one of the paths works out much better
than the others. If we always knew in advance which project would work
we wouldn't need to do science in the way we do.
In any case I agree that the public release will be a big step forward.
I'd love to be positively surprised.
What I'd really like would be if one of those clever chatbot programs
could be integrated with CYC somehow so that it was not only good
at fooling you, but could make some really strong leaps of judgement
occasionally. That could pack quite a punch.
Hal