Re: COMP: Moore's Law

From: Eugene Leitl (eugene.leitl@lrz.uni-muenchen.de)
Date: Fri Jun 11 1999 - 15:19:40 MDT


mark@unicorn.com writes:
 
> Yet every attempt to do this I remember ended up running a lot slower than
> dedicated hardware; firstly because they had to keep reconfiguring the
> chip to do different things, which took a long time, and secondly because
> they couldn't run as fast as dedicated chips.
 
That's because all current reconfigurable architectures are based on
FPGAs, which are certainly not the best way to do it. Suboptimal as
they are, we're going to get FPGA areas in DSPs/consumer devices
before very long.
 
> Perhaps... but there's a big difference between what's theoretically better
> and what's practically better. So far there's no good reason for believing
> that this kind of hardware is really better.
 
Well, one cannot really argue with the physics of computation. Not in
the long run. So we're going to have reversible, reconfigurable
computing before very long.
 
> Yet we've seen probably a million-fold improvement in computing performance
> in that time, and probably a thousand-fold reduction in cost. What more
> would a 'revolution' have given us?
 
What we've got is iteration after iteration of the computational
mill/dumb storage paradigm: first the mechanical, then the
electromechanical, then the vacuum tube/electromagnetic, then the
semiconductor/electromagnetic/optical incarnation. The speed has
increased; the architecture has remained the same. Whether punching
cards or typing in a text editor, whether hardwiring the program or
writing to FPGA cells, there is no qualitative difference. I cannot
help but think that the one billion transistors in a current PC could
be utilized much more efficiently. Considering that that billion
constitutes a small fraction of all processed silicon (the defective
rest goes into the scrap bin), I think we can do much better for the
money.
 
> >Of course. The essence of Wintel's success. What I don't understand is
> >why after all these years people are still buying it, hook and sinker.
>
> a) it's cheap.

Economies of scale are not specific to architectures. And perpetuating
a braindead architecture is much more damaging in the long run. The
market is irrational and short-term, though.

> b) it runs all your old software.

Which would more properly be addressed with emulation. Why doesn't
emulation work very well today? Because the enhancement iterations are
gradual and don't give the new hardware enough headroom to run the
previous architecture snappily.
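
To see where the headroom goes, here is a toy instruction-dispatch
loop of the kind a software emulator runs. The miniature ISA is made
up purely for illustration: every guest instruction costs a host
fetch, decode and dispatch, so the host must be several times faster
than the guest just to break even.

/* Toy guest ISA (not any real one): each emulated instruction
   costs a fetch, a decode and a dispatch on the host. */
#include <stdint.h>
#include <stdio.h>

enum { OP_HALT, OP_LOADI, OP_ADD, OP_PRINT };

int main(void)
{
    /* guest program: r0 = 2; r1 = 3; r0 += r1; print r0 */
    uint8_t code[] = { OP_LOADI, 0, 2, OP_LOADI, 1, 3,
                       OP_ADD, 0, 1, OP_PRINT, 0, OP_HALT };
    int32_t reg[4] = { 0 };
    uint8_t *pc = code;

    for (;;) {
        switch (*pc++) {              /* fetch + decode + dispatch */
        case OP_LOADI: { int r = *pc++; reg[r] = *pc++; break; }
        case OP_ADD:   { int a = *pc++, b = *pc++; reg[a] += reg[b]; break; }
        case OP_PRINT: printf("%d\n", reg[*pc++]); break;
        case OP_HALT:  return 0;
        }
    }
}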

> c) it mostly does the job.
 
Which is arguable. It certainly doesn't do mine very well.

> However, that's changing now; the K7 looks like it could give the P-III a

K7 is just a minor variation on the CPU motif, really. As is Alpha.

> real run for its money, and Windows is making up a large fraction of the
> cost of cheap PCs. Plus open source software greatly simplifies the process

Actually, a Californian company is selling a complete K6-2/350, 32
MByte RAM, 4 GByte EIDE, CDROM etc. computer, sans CRT and OS, for
$299. For $30 more they'll install RedHat on it for you. I hear it
sells extremely well.

> of changing CPU architectures; just recompile and you can run all the
> software you used to run.
 
Open Source is nice. But how much of it is written in rigorous OOP
fashion, from a mosaic of tiny objects? Using threaded code? How much
of it is written using asynchronous OO message passing? It's hard
enough to make people think in the MPI/Beowulf way (see the sketch
below). Linux sucks. Beowulf is the wrong way to do it. But it at
least gets you started.
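
For the record, this is the style I mean. A minimal sketch using the
standard nonblocking MPI_Isend/MPI_Irecv calls; rank 0 ships a value
to rank 1, and both sides could overlap computation with the
communication instead of blocking on it.

/* Minimal asynchronous message passing in the MPI/Beowulf style. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, value = 0;
    MPI_Request req;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;
        MPI_Isend(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req);
        /* ... useful work could overlap the send here ... */
        MPI_Wait(&req, &status);
    } else if (rank == 1) {
        MPI_Irecv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &req);
        /* ... useful work could overlap the receive here ... */
        MPI_Wait(&req, &status);
        printf("rank 1 got %d\n", value);
    }

    MPI_Finalize();
    return 0;
}

Compile with mpicc and run with mpirun -np 2 to see the transfer.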

> >Why paying for one expensive, legacy-ballast CPU and invest in nine
> >others hideously complex designs (possibly more complex than the CPU
> >itself), each requiring individual resources on the fab if you
> >could churn out ~500-1000 CPUs for roughly $500 production costs?
>
> Last I checked, a Z80 was a dollar or two a chip. Why aren't we all running
> massively parallel Z80 machines? Perhaps because building a machine with
> 500 CPUs will be much more expensive than buying them and writing software
> to do anything useful on them will be a monumental task?
 
Um, why aren't we using abacuses? Your Z80 comparison is about as
meaningless as that question. A valid comparison would be a MISC CPU,
like the i21. A 32 bit version of it would have ~30 kTransistors, and
should outperform your PII 200, in some cases a PII 400. Scaled to a
1..2 kBit bus width with embedded RAM, it would make an interesting
comparison to a quad-Xeon.
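
For flavor, here is a toy stack machine in the MISC spirit: a handful
of zero-operand primitives on a data stack is essentially all the
decode logic such a core needs, which is why the transistor count is
so low. The opcodes below are made up, not the actual i21 instruction
set.

/* Toy MISC-style stack machine; hypothetical opcodes. */
#include <stdint.h>
#include <stdio.h>

enum { HALT, LIT, DUP, ADD, OUT };

int main(void)
{
    int32_t stack[64], *sp = stack;                    /* data stack */
    uint8_t code[] = { LIT, 21, DUP, ADD, OUT, HALT }; /* 21+21 */
    uint8_t *pc = code;

    for (;;) {
        switch (*pc++) {
        case LIT:  *sp++ = *pc++;            break;  /* push literal */
        case DUP:  sp[0] = sp[-1]; sp++;     break;  /* duplicate top */
        case ADD:  sp[-2] += sp[-1]; sp--;   break;  /* add top two */
        case OUT:  printf("%d\n", *--sp);    break;  /* pop and print */
        case HALT: return 0;
        }
    }
}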
 
And of course a 500 CPU machine, if mass-produced, would be less
expensive than your desktop. There is no packaging, since you go
WSI. The die yield is effectively quantitative, because you adjust the
grain size to get >80% good dies per wafer. The wafer yield is
100%. The testing is trivial: software does it. There is only one kind
of chip to produce. There is no motherboard. Etc.
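
A back-of-envelope illustration of the grain sizing, using the
standard Poisson yield model Y = exp(-D*A). The defect density below
is an assumed figure for illustration, not a real process number:

/* Grain sizing with the Poisson yield model Y = exp(-D * A).
   The defect density is an assumption. */
#include <math.h>
#include <stdio.h>

int main(void)
{
    double defects_per_cm2 = 0.5;                   /* assumed D */
    double areas[] = { 0.1, 0.25, 0.5, 1.0, 2.0 };  /* die area, cm^2 */

    for (int i = 0; i < 5; i++) {
        double y = exp(-defects_per_cm2 * areas[i]);
        printf("die area %.2f cm^2 -> yield %.0f%%\n",
               areas[i], 100.0 * y);
    }
    return 0;
}

At D = 0.5/cm^2, a 1 cm^2 die yields ~61%, while a 0.25 cm^2 grain
yields ~88%: shrink the grain and the >80% target falls out.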

> >You can implement a pretty frisky 32 bit CPU core plus networking
> >in ~30 kTransistors, and I guess have semiquantitive die yield assuming
> >1 MBit grains.
>
> But what good will it do for me? I used to work with Transputers, which
> were going to lead to these massively parallel computers built from cheap
> CPUs. Didn't happen, because there were few areas where massively parallel
> CPUs had benefits over a single monolithic CPU.
 
There are no monolithic supercomputers in existence. They are all
consumer-class CPUs glued together with custom networking. Currently
we've got Alpha Beowulfs with Myrinet beating the crap out of current
SGI/Cray machines in several instances, in terms of absolute
performance. Clustering rules supreme. Want to bet on when the first
consumer PC ships with clustering integrated in one case?
 
> >Engines look very differently if you simultaneously operate on
> >an entire screen line, or do things the voxel way.
>
> I find discussion of voxel rendering pretty bizarre from someone who
> complains about my regarding 32MB as a 'reasonable amount' of memory for
> a graphics chip. Reasonable voxel rendering is likely to need gigabytes
> of RAM, not megabytes.
 
Who says you need to keep the entire voxel set in one grain? There is
certainly no known memory which lets you process several GBytes at 100
Hz. With few-MBit grains, it's easy. Voxels love embarrassingly
parallel fine-grain architectures.
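
The numbers behind that claim, with assumed figures (a 4 GByte voxel
set swept at 100 Hz, split over 1 MByte grains):

/* Aggregate vs. per-grain bandwidth; all figures are assumptions. */
#include <stdio.h>

int main(void)
{
    double voxelset_bytes = 4.0e9;   /* assumed 4 GByte voxel set */
    double rate_hz        = 100.0;   /* full sweeps per second */
    double grains         = 4096.0;  /* assumed 1 MByte grains */

    double aggregate = voxelset_bytes * rate_hz;     /* bytes/s */
    printf("aggregate: %.0f GByte/s\n", aggregate / 1e9);
    printf("per grain: %.1f MByte/s\n", aggregate / grains / 1e6);
    return 0;
}

400 GByte/s aggregate is hopeless for any single memory bus, but under
100 MByte/s per grain is trivial for embedded RAM.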
 
> >The reason's why we don't have WSI yet are mostly not technical. It is
> >because people don't want to learn.
>
> Anamartic were doing wafer-scale integration of memory chips more than
> a decade ago; from what I remember, they needed a tremendous amount of

There was no viable embedded RAM process as late as last year. Most of
the industry doesn't have access to one. Sony isn't producing the
Playstation 2 in quantity yet. Anamartic didn't have a ghost of a
chance of succeeding. Embedded RAM alone takes several G$ to develop,
WSI plus embedded RAM at least that much again. If you have the
hardware, you still need the nanokernel OS and the development
environment to support it, and hordes of programmers to train.

> work to test each wafer and work out how to link up the chips which worked
> and avoid the chips which didn't. This is the kind of practical issue

You don't avoid dead dies in hardware; that's a software problem.
Choose a die on the wafer at random, touch down a testing pin, and
boot it from the link. The picokernel tests the dies, forks off clones
all over the wafer in a few ms (redundant links and integrated routers
route around dead dies), runs tests and collects the results. The
testing machine gathers the results and computes the wafer quality
from the fraction of good dies reported. If you don't get a good die
after a few random tries, the whole wafer is sour, which indicates
major production trouble. Next wafer.
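
In code, the whole acceptance loop is tiny. A sketch, with the two
tester primitives stubbed out since their real interfaces are
hardware-specific; every name and number below is assumed for
illustration:

/* Sketch of the wafer-test loop described above.  The "tester"
   functions are simulated stubs; on real hardware they would drive
   the test pin and collect the clones' reports. */
#include <stdio.h>
#include <stdlib.h>

#define DIES_PER_WAFER 1000
#define MAX_TRIES      5
#define GOOD_FRACTION  0.80

static int touch_down_and_boot(int die)   /* stub: ~90% of dies boot */
{
    (void)die;
    return rand() % 100 < 90;
}

static int count_good_dies(void)          /* stub: clones report back */
{
    return 850 + rand() % 100;
}

int main(void)
{
    for (int attempt = 0; attempt < MAX_TRIES; attempt++) {
        int die = rand() % DIES_PER_WAFER;   /* random entry die */
        if (!touch_down_and_boot(die))
            continue;                        /* dead entry die, retry */
        /* picokernel forks clones across the wafer, routing around
           dead dies; collect the fraction that report back good */
        int good = count_good_dies();
        printf("wafer quality: %d/%d -> %s\n", good, DIES_PER_WAFER,
               good >= GOOD_FRACTION * DIES_PER_WAFER ? "ship" : "reject");
        return 0;
    }
    printf("no bootable die after %d tries: wafer is sour\n", MAX_TRIES);
    return 1;
}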

> which theoreticians just gloss over, and then 'can't understand' why
> people don't accept their theories.
 
I understand the collective inertia of millions of programmers, the
whole of the semiconductor industry, and especially the market.
However, people tend to mistake these very real issues for purely
technical ones, which they are not. Amdahl's law is not Scripture.
 
> The reason we still use monolithic chips is not that people are afraid of
> trying other solutions, but because we have tried those other solutions,
> and so far they've failed. That may change, but I don't see any good evidence
> of that.
 
We _haven't_ really tried other solutions, and currently we're not
attempting to. That will have to change when current technology
saturates, which should be soon enough. Perhaps sooner, if we
understand the mechanisms underlying our inertia.
 
> Mark


