From: Harvey Newstrom (mail@HarveyNewstrom.com)
Date: Wed Apr 04 2001 - 15:50:35 MDT
Jim Fehlinger wrote,
> I see your point, but you are overstating the case to an almost ludicrous
> degree.
Probably. My point was that we can't predict everything accurately. I
agree that for most cases, there is no need to.
> I'm responsible for debugging and maintaining one corner
> of an enormously complex ERP system with thousands of users.
> I've seen all the situations you mention (though I've never seen
> anything that was ultimately blamed on sunspots!)
I have seen satellites knocked out by solar flares and sunspot activity, due
to the interference.
> , and all that stuff has to be sorted through and ruled out
> systematically when there's a problem. That's what programmers and system
> administrators and database administrators and network administrators and
> hardware techs get paid the big bucks for. Generally, one **doesn't** get
> away with claiming a bug is not reproducible
Oh, I agree totally. One has to prove that the blame is accurate or else
it's just a guess. True debugging is ruling out possibilities and narrowing
in until the exact cause is found. Just trying things until something works
is the brute-force approach of an unskilled debugger.
I certainly didn't mean to imply that bugs could not be debugged after the
fact. I was objecting from the opposite direction. I was trying to say
that it is impossible to predict computers well enough to eliminate
unforeseen bugs. (At least we haven't been able to do it yet.)
> One sine qua non of doing computer programming
> for a living (never mind being a brilliant one -- just being able
> to do the job at all!) is to absolutely (and I mean to the marrow
> of your bones)
> get out of the habit of blaming ghosts in the machine. You have to get a
> really, really firm grip on the probabilities --
I agree totally.
> and the probability that
> something was caused by sunspots, or some random, irreproducible glitch
> is always next to nothing
Depends on your field. I have debugged radiation-induced communications
failures in satellites. Sunspots and solar flares definitely will change
transmission throughput characteristics. A system that requires 99% of
available bandwidth throughput is not robust enough to operate in space
during periods of high sunspot activity.
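
To make the headroom argument concrete, here is a minimal sketch in Python. The function name, the nominal capacity, and the degradation figures are all hypothetical, chosen only to illustrate the arithmetic; they are not measurements from any real satellite link.

```python
def link_is_sufficient(nominal_bps, required_fraction, degradation):
    """Return True if the degraded link still carries the required load.

    nominal_bps       -- capacity under ideal conditions (hypothetical figure)
    required_fraction -- share of nominal capacity the system needs (0.99 = 99%)
    degradation       -- fraction of capacity lost to interference (assumed)
    """
    required_bps = nominal_bps * required_fraction
    available_bps = nominal_bps * (1.0 - degradation)
    return available_bps >= required_bps


NOMINAL_BPS = 1_000_000  # hypothetical nominal capacity

# A system that needs 99% of capacity has only 1% of headroom,
# so a 2% loss is already enough to break it.
quiet_day = link_is_sufficient(NOMINAL_BPS, 0.99, 0.005)
flare_day = link_is_sufficient(NOMINAL_BPS, 0.99, 0.02)

# A system that needs only 70% survives the same 2% loss.
relaxed_design = link_is_sufficient(NOMINAL_BPS, 0.70, 0.02)
```

The point is only that the margin, not the nominal capacity, decides whether the system survives a disturbance.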
> You betcher bank account balance computers are predictable!
Sorry, no. I have extensive experience programming bank computers. The
methods for handling fractional pennies on interest or fees vary widely
between banks. Direct deposits are not predictable by time and are
routinely done up to 24 hours in advance just to make sure. Transmission
lines are often down, so updates sometimes don't occur when they should.
Most banks hold funds deposited after a certain time until the next day.
Unfortunately, the ATM clock is not usually exactly synchronized with the
bank's internal computer. Deposit something a few seconds before the
deadline, and you have no way of knowing when it will be credited to your
account. You never know if checks through the mail have been processed or
not. The balance shown on your ATM may or may not include deposited checks
with holds on them that are not yet "available". I have run into many such
timing errors where bank computers cannot absolutely confirm that money is
or is not in an account at any specific moment. Banks have staff members
routinely correcting accounts each month due to minor timing errors that the
customers complain about.
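
Two of the effects described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the balance, the rate, the 15:00 cutoff, and the five-second clock skew are all hypothetical and are not taken from any real bank system.

```python
from datetime import datetime, timedelta
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_EVEN, ROUND_HALF_UP

CENT = Decimal("0.01")


def monthly_interest(balance, annual_rate, rounding):
    """One month of simple interest, rounded to whole cents under a given policy."""
    raw = balance * annual_rate / Decimal(12)
    return raw.quantize(CENT, rounding=rounding)


def posting_day(atm_time, atm_skew, cutoff):
    """Decide when a deposit is credited, as judged by the bank's host clock.

    atm_skew is how far the ATM clock runs ahead (+) or behind (-) of the host.
    """
    host_time = atm_time - atm_skew
    return "same day" if host_time < cutoff else "next day"


# Fractional pennies: the raw interest here is exactly 3.125, so the
# result depends entirely on which rounding policy the bank chose.
balance = Decimal("1000.00")
rate = Decimal("0.0375")
truncated = monthly_interest(balance, rate, ROUND_DOWN)       # 3.12
half_up = monthly_interest(balance, rate, ROUND_HALF_UP)      # 3.13
bankers = monthly_interest(balance, rate, ROUND_HALF_EVEN)    # 3.12

# Cutoff timing: the ATM display reads 14:59:57, three seconds before
# the (assumed) 15:00:00 cutoff.
cutoff = datetime(2001, 4, 4, 15, 0, 0)
atm_display = datetime(2001, 4, 4, 14, 59, 57)
in_sync = posting_day(atm_display, timedelta(seconds=0), cutoff)
atm_slow = posting_day(atm_display, timedelta(seconds=-5), cutoff)
```

With the clocks in sync the deposit posts the same day; with the ATM running five seconds slow, the host sees 15:00:02 and holds the deposit until the next day, even though the customer saw a time before the deadline.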
I know this may sound unreasonable, but if you actually get into writing
low-level device drivers or actually write code to handle bank systems, you
find that things are never as clear cut as they seem at first. I am not
arguing just to be stubborn. I actually have seen these kinds of errors
while developing, testing and debugging bank systems. The more I debug
computers, the more I am convinced that they are not perfectly predictable.
This is not due to vaguely blaming the ghost in the machine. I actually get
traces or resolve the history and determine that things happen differently
in different rare circumstances that were unforeseen by the programmer.
Software is rarely robust enough to recover completely transparently. Even
the best software can't undo previous history that has already occurred but
shouldn't have.
-- Harvey Newstrom <http://HarveyNewstrom.com> <http://Newstaff.com>