The Art of Doing Science and Engineering: Learning to Learn
Book Front Matter
- Identifies the book as Richard W. Hamming's The Art of Doing Science and Engineering: Learning to Learn.
- Provides publication and copyright details for the 2005 Taylor & Francis e-Library edition, originally published by Gordon and Breach Science Publishers.
- Lists bibliographic information, ISBNs, publisher locations, and reproduction restrictions.
- Includes the table of contents, showing chapters on computing history, AI, coding and information theory, digital filters, simulation, fiber optics, mathematics, creativity, systems engineering, and research.
The Art of Doing Science and Engineering
Learning to Learn
Richard W. Hamming
U.S. Naval Postgraduate School
Monterey, California
GORDON AND BREACH SCIENCE PUBLISHERS
Australia • Canada • China • France • Germany • India •
Japan • Luxembourg • Malaysia • The Netherlands •
Russia • Singapore • Switzerland • Thailand •
United Kingdom
This edition published in the Taylor & Francis e-Library, 2005.
"To purchase your own copy of this or any of Taylor & Francis or Routledge's collection of thousands of eBooks please go to
www.eBookstore.tandf.co.uk."
Copyright © 1997 OPA (Overseas Publishers Association)
Amsterdam B.V. Published in The Netherlands under license
by Gordon and Breach Science Publishers.
All rights reserved.
No part of this book may be reproduced or utilized in any
form or by any means, electronic or mechanical, including
photocopying and recording, or by any information storage
or retrieval system, without permission in writing from the
publisher. Printed in India.
Amsteldijk 166
1st Floor
1079 LH Amsterdam
The Netherlands
British Library Cataloguing in Publication Data
Hamming, R.W. (Richard Wesley), 1915–
The art of doing science and engineering: learning to
learn
1. Science 2. Engineering
I. Title
500
ISBN 0-203-45071-X Master e-book ISBN
ISBN 0-203-45913-X (Adobe eReader Format)
ISBN 90-5699-501-4 (Print Edition)
CONTENTS
Preface
Introduction
1 Orientation
2 Foundations of the Digital (Discrete) Revolution
3 History of Computer–Hardware
4 History of Computer–Software
5 History of Computer Applications
6 Limits of Computer Applications–AI–I
7 Limits of Computer Applications–AI–II
8 Limits of Computer Applications–AI–III
9 n-Dimensional Space
10 Coding Theory–I
11 Coding Theory–II
12 Error Correcting Codes
13 Information Theory
14 Digital Filters–I
15 Digital Filters–II
16 Digital Filters–III
17 Digital Filters–IV
18 Simulation–I
19 Simulation–II
20 Simulation–III
21 Fiber Optics
22 Computer Aided Instruction–CAI
23 Mathematics
24 Quantum Mechanics
25 Creativity
26 Experts
27 Unreliable Data
28 Systems Engineering
29 You Get What You Measure
30 You and Your Research
Index
PREFACE
The Art of Scientific Style
- The author aims to teach the 'style' of thinking in science and engineering, treating it as an art form rather than a purely technical discipline.
- Instruction follows the methods of art teachers, using a loose, rambling lecture format that emphasizes suggestions and multiple approaches over rigid rules.
- The 'story' approach is utilized to demonstrate how individual preparation allows researchers to capitalize on 'luck' and achieve great results.
- Education should focus on preparing students for their own future rather than the teacher's past, despite the inherent difficulty of predicting what is to come.
- The course serves as a non-technical complement to graduate studies, focusing on the intangible qualities that distinguish great practitioners from average ones.
Teachers should prepare the student for the student's future, not for the teacher's past.
After many years of pressure and encouragement from friends, I decided to write up the graduate course in engineering I teach at the U.S. Naval Postgraduate School in Monterey, California. At first I concentrated on all the details I thought should be tightened up, rather than leave the material as a series of somewhat disconnected lectures. In class the lectures often followed the interests of the students, and many of the later lectures were suggested topics in which they expressed an interest. Also, the lectures changed from year to year as various areas developed. Since engineering depends so heavily these days on the corresponding sciences, I often use the terms interchangeably.
After more thought I decided that since I was trying to teach "style" of thinking in science and engineering, and "style" is an art, I should therefore copy the methods of teaching used for the other arts, once the fundamentals have been learned. How to be a great painter cannot be taught in words; one learns by trying many different approaches that seem to surround the subject. Art teachers usually let the advanced student paint, and then make suggestions on how they would have done it, or what might also be tried, more or less as the points arise in the student's head, which is where the learning is supposed to occur! In this series of lectures I try to communicate to students what cannot be said in words: the essence of style in science and engineering. I have adopted a loose organization with some repetition since this often occurs in the lectures. There are, therefore, digressions and stories (with some told in two different places), all in the somewhat rambling, informal style typical of lectures.
I have used the "story" approach, often emphasizing the initial part of the discovery, because I firmly believe in Pasteur's remark, "Luck favors the prepared mind." In this way I can illustrate how the individual's preparation before encountering the problem can often lead to recognition, formulation, and solution. Great results in science and engineering are "bunched" in the same person too often for success to be a matter of random luck.
Teachers should prepare the student for the student's future, not for the teacher's past. Most teachers rarely discuss the important topic of the future of their field, and when this is pointed out they usually reply: "No one can know the future." It seems to me the difficulty of knowing the future does not absolve the teacher from seriously trying to help the student to be ready for it when it comes. It is obvious the experience of an individual is not necessarily that of a class of individuals; therefore, any one person's projection into the future is apt to be somewhat personal and will not be universally accepted. This does not justify reverting to impersonal surveys and losing the impact of the personal story.
Since my classes are almost all carefully selected navy, marine, army, air force, and coast guard students with very few civilians, and, interestingly enough, about 15% very highly selected foreign military, the students face a highly technical future; hence the importance of preparing them for their future and not just our past.
The year 2020 seems a convenient date to center the preparation for their future, a sort of 20/20 foresight, as it were. As graduate students working toward a master's degree, they have the basics well in hand. That leaves me the task of adding "style" to their education, which in practice is usually the difference between an average person and a great one. The school has allowed me great latitude in trying to teach a completely non-technical course; this course "complements" the more technical ones. As a result, my opening words, occasionally repeated, are: "There is really no technical content in the course, though I will, of course, refer to a great deal of it, and hopefully it will generally be a good review of the fundamentals of what you have learned. Do not think it is the content of the course; it is only illustrative material. Style of thinking is the center of the course."
The Art of Thinking
- The course focuses on the 'style of thinking' rather than specific content, treating information as illustrative material for broader cognitive skills.
- The core philosophy, 'Learning to Learn,' is presented as the primary tool for students to adapt to rapid technological and professional changes.
- The author utilizes personal anecdotes, including spectacular failures, to communicate the 'art' of discovery that cannot be easily codified in words.
- Mathematics is used as a foundational tool to expose the weaknesses of current beliefs and to provide deep insights into future scientific directions.
- The text acknowledges that the future of science and engineering will be increasingly mathematical, yet general concepts remain accessible through verbal descriptions.
- The course serves as a repository for essential knowledge and perspectives that do not fit into the standard academic curriculum.
Apparently an "art", which almost by definition cannot be put into words, is probably best communicated by approaching it from many sides and doing so repeatedly.
The subtitle of this book, Learning to Learn, is the main solution I offer to help students cope with the rapid changes they will have to endure in their fields. The course centers around how to look at and think about knowledge, and it supplies some historical perspectives that might be useful.
This course is mainly personal experiences I have had and digested, at least to some extent. Naturally one tends to remember one's successes and forget lesser events, but I recount a number of my spectacular failures as clear examples of what to avoid. I have found that the personal story is far, far more effective than the impersonal one; hence there is necessarily an aura of "bragging" in the book that is unavoidable.
Let me repeat what I earlier indicated. Apparently an "art", which almost by definition cannot be put into words, is probably best communicated by approaching it from many sides and doing so repeatedly, hoping thereby students will finally master enough of the art, or if you wish, style, to significantly increase their future contributions to society. A totally different description of the course is: it covers all kinds of things that could not find their proper place in the standard curriculum.
The casual reader should not be put off by the mathematics; it is only "window dressing" used to illustrate and connect up with earlier learned material. Usually the underlying ideas can be grasped from the words alone.
It is customary to thank various people and institutions for help in producing a book. Thanks obviously go to AT&T Bell Laboratories, Murray Hill, New Jersey, and to the U.S. Naval Postgraduate School, especially the Department of Electrical and Computer Engineering, for making this book possible.
INTRODUCTION
This book is concerned more with the future and less with the past of science and engineering. Of course future predictions are uncertain and usually based on the past; but the past is also much more uncertain, or even falsely reported, than is usually recognized. Thus we are forced to imagine what the future will probably be. This course has been called "Hamming on Hamming" since it draws heavily on my own past experiences, observations, and wide reading.
There is a great deal of mathematics in the early part because almost surely the future of science and engineering will be more mathematical than the past, and also I need to establish the nature of the foundations of our beliefs and their uncertainties. Only then can I show the weaknesses of our current beliefs and indicate future directions to be considered.
If you find the mathematics difficult, skip those early parts. Later sections will be understandable provided you are willing to forgo the deep insights mathematics gives into the weaknesses of our current beliefs. General results are always stated in words, so the content will still be there but in a slightly diluted form.
1
Orientation
Style Over Technical Content
- The course prioritizes the development of a 'style of thinking' over specific technical training or content.
- Style is defined as a quality that cannot be taught through words alone, requiring examples and personal experience to grasp.
- The author challenges the scientific tradition of impersonal delivery, opting for first-person accounts to ensure the material has a lasting impact.
- Studying successes is presented as more efficient than studying failures because there are countless ways to be wrong but few ways to be right.
- The instructor acts as a coach rather than a lecturer, emphasizing that the student must perform the mental 'running' to achieve any benefit.
I am, as it were, only a coach. I cannot run the mile for you; at best I can discuss styles and criticize yours.
The purpose of this course is to prepare you for your technical future. There is really no technical content in the course, though I will, of course, refer to a great deal of it, and hopefully it will generally be a good review of the fundamentals you have learned. Do not think the technical content is the course; it is only illustrative material. Style of thinking is the center of the course. I am concerned with educating and not training you.
I will examine, criticize, and display styles of thinking. To illustrate the points of style I will often use technical knowledge most of you know, but, again, it will be, I hope, in the form of a useful review which concentrates on the fundamentals. You should regard this as a course which complements the many technical courses you have learned. Many of the things I will talk about are things which I believe you ought to know but which simply do not fit into courses in the standard curriculum. The course exists because the department of Electrical and Computer Engineering of the Naval Postgraduate School recognizes the need for both a general education and the specialized technical training your future demands.
The course is concerned with "style", and almost by definition style cannot be taught in the normal manner by using words. I can only approach the topic through particular examples, which I hope are well within your grasp, though the examples come mainly from my 30 years in the mathematics department of the Research Division of Bell Telephone Laboratories (before it was broken up). It also comes from years of study of the work of others.
The belief anything can be "talked about" in words was certainly held by the early Greek philosophers, Socrates (469–399), Plato (427–347), and Aristotle (384–322). This attitude ignored the current mystery cults of the time who asserted you had to "experience" some things which could not be communicated in words. Examples might be the gods, truth, justice, the arts, beauty, and love. Your scientific training has emphasized the role of words, along with a strong belief in reductionism, hence to emphasize the possible limitations of language I shall take up the topic in several places in the book. I have already said "style" is such a topic.
I have found that, to be effective in this course, I must use mainly first hand knowledge, which implies I break a standard taboo and talk about myself in the first person, instead of the traditional impersonal way of science. You must forgive me in this matter, as there seems to be no other approach which will be as effective. If I do not use direct experience then the material will probably sound to you like merely pious words and have little impact on your minds, and it is your minds I must change if I am to be effective.
This talking about first person experiences will give a flavor of "bragging", though I include a number of my serious errors to partially balance things. Vicarious learning from the experiences of others saves making errors yourself, but I regard the study of successes as being basically more important than the study of failures. As I will several times say, there are so many ways of being wrong and so few of being right, studying successes is more efficient, and furthermore when your turn comes you will know how to succeed rather than how to fail!
I am, as it were, only a coach. I cannot run the mile for you; at best I can discuss styles and criticize yours. You know you must run the mile if the athletics course is to be of benefit to you; hence you must think carefully about what you hear or read in this book if it is to be effective in changing you, which must obviously be the purpose of any course. Again, you will get out of this course only as much as you put in, and if you put in little effort beyond sitting in the class or reading the book, then it is simply a waste of your time. You must also mull things over, compare what I say with your own experiences, talk with others, and make some of the points part of your way of doing things.
Style and Meta-Education
- Developing a personal style requires synthesizing the fundamentals of masters with one's own native abilities to adapt for the future.
- True leadership in a field comes from selecting and adapting traits rather than merely following or copying the past.
- Education is defined as knowing what, when, and why to do things, whereas training focuses on the technical how.
- The concept of 'meta-education' involves rising above standard learning to examine the process of education and contribution itself.
- Scientific knowledge and the population of scientists have historically doubled every 17 years, creating an exponential growth environment.
Either you will be a leader, or a follower, and my goal is for you to be a leader.
Since the subject matter is "style", I will use the comparison with teaching painting. Having learned the fundamentals of painting, you then study under a master you accept as being a great painter; but you know you must forge your own style out of the elements of various earlier painters plus your native abilities. You must also adapt your style to fit the future, since merely copying the past will not be enough if you aspire to future greatness, a matter I assume, and will talk about often in the book. I will show you my style as best I can, but, again, you must take those elements of it which seem to fit you, and you must finally create your own style. Either you will be a leader, or a follower, and my goal is for you to be a leader. You cannot adopt every trait I discuss in what I have observed in myself and others; you must select and adapt, and make them your own if the course is to be effective.
Even more difficult than what to select is that what is a successful style in one age may not be appropriate to the next age! My predecessors at Bell Telephone Laboratories used one style; four of us who came in all at about the same time, and had about the same chronological age, found our own styles and as a result we rather completely transformed the overall style of the Mathematics Department, as well as many parts of the whole Laboratories. We privately called ourselves "The four young Turks", and many years later I found top management had called us the same!
I return to the topic of education. You all recognize there is a significant difference between education and training.
Education is what, when, and why to do things. Training is how to do it.
Either one without the other is not of much use. You need to know both what to do and how to do it. I have already compared mental and physical training and said to a great extent in both you get out of it what you put into it; all the coach can do is suggest styles and criticize a bit now and then. Because of the usual size of these classes, or because you are reading the book, there can be little direct criticism of your thinking by me, and you simply have to do it internally and between yourselves in conversations, and apply the things I say to your own experiences. You might think education should precede training, but the kind of educating I am trying to do must be based on your past experiences and technical knowledge. Hence this inversion of what might seem to be reasonable. In a real sense I am engaged in "meta-education": the topic of the course is education itself and hence our discussions must rise above it ("meta-education"), just as metaphysics was supposed to be above physics in Aristotle's time (actually "follow", "transcend" is the translation of "meta").
This book is aimed at your future, and we must examine what is likely to be the state of technology (Science and Engineering) at the time of your greatest contributions. It is well known that since about Isaac Newton's time (1642–1727) knowledge of the type we are concerned with has about doubled every 17 years. First, this may be measured by the books published (a classic observation is libraries must double their holdings every 17 years if they are to maintain their relative position). Second, when I went to Bell Telephone Laboratories in 1946 they were trying to decrease the size of the staff from WW-II size down to about 5500. Yet during the 30 years I was there I observed a fairly steady doubling of the number of employees every 17 years, regardless of the administration having hiring freezes now and then, and such things. Third, the growth of the number of scientists generally has similarly been exponential, and it is said currently almost 90% of the scientists who ever lived are now alive! It is hard to believe in your future there will be a dramatic decrease in these expected rates of growth, hence you face, even more than I did, the constant need to learn new things.
The Growth of Knowledge
- A dramatic decrease in the expected growth rates is unlikely, so future professionals face a constant need to learn new things.
- Great scientists and engineers frequently use back-of-the-envelope calculations to test the compatibility of different data points.
- The model assumes that the growth of knowledge is directly proportional to the number of scientists currently alive.
- Mathematical modeling suggests that if knowledge doubles every 17 years, it is consistent with the claim that 90% of all scientists are currently alive.
- Initial estimations use concrete numbers to gain an intuitive 'feel' before moving to more complex parametric equations.
I have frequently observed great scientists and engineers do this much more often than "the run of the mill" people, hence it requires illustration.
Here I make a digression to illustrate what is often called "back of the envelope calculations". I have frequently observed great scientists and engineers do this much more often than "the run of the mill" people, hence it requires illustration. I will take the above two statements, knowledge doubles every 17 years, and 90% of the scientists who ever lived are now alive, and ask to what extent they are compatible. The model of the growth of knowledge and the growth of scientists assumed are both exponential, with the growth of knowledge being proportional to the number of scientists alive. We begin by assuming the number of scientists at any time t is
$$y(t) = a e^{bt}$$
and the amount of knowledge produced annually has a constant k of proportionality to the number of scientists alive. Assuming we begin at minus infinity in time (the error is small and you can adjust it to Newton's time if you wish), we have the formula
$$\frac{\int_{-\infty}^{T-17} k a e^{bt}\,dt}{\int_{-\infty}^{T} k a e^{bt}\,dt} = e^{-17b} = \frac{1}{2}$$
hence we know b. Now to the other statement. If we allow the lifetime of a scientist to be 55 years (it seems likely that the statement meant living and not practicing, but excluding childhood) then we have
$$\frac{\int_{T-55}^{T} a e^{bt}\,dt}{\int_{-\infty}^{T} a e^{bt}\,dt} = 1 - e^{-55b} = 1 - 2^{-55/17} \approx 0.894$$
which is very close to 90%.
Typically the first back of the envelope calculations use, as we did, definite numbers where one has a feel for things, and then we repeat the calculations with parameters so you can adjust things to fit the data better and understand the general case. Let the doubling period be D, and the lifetime of a scientist be L. The first equation now becomes
$$b = \frac{\ln 2}{D}$$
and the second becomes:
$$1 - e^{-Lb} = 1 - 2^{-L/D} = 0.9, \qquad \frac{L}{D} = \frac{\log 10}{\log 2} = 3.3219\ldots$$
The Power of Back-of-the-Envelope Calculations
- Back-of-the-envelope calculations allow scientists to quickly verify the validity of quantitative claims and identify overlooked variables.
- Engaging in rapid modeling helps internalize results and maintains the mental agility required for more complex future applications.
- The rapid growth of new knowledge is compounded by the high rate of obsolescence, with technical knowledge estimated to have a 15-year half-life.
- The transition from vacuum tubes to transistors serves as a primary example of how quickly specialized expertise can become irrelevant.
- The exponential growth of information means that a child entering college may face up to eight times the amount of knowledge their parent encountered.
I found it very valuable at the physics table I used to eat with; I sometimes cleared up misconceptions at the time they were being formed, thus advancing matters significantly.
With D=17 years we have 17×3.3219=56.47… years for the lifetime of a scientist, which is close to the 55 we assumed. We can play with the ratio of L/D until we find a slightly closer fit to the data (which was approximate, though I believe more in the 17 years for doubling than I do in the 90%). Back of the envelope computing indicates the two remarks are reasonably compatible. Notice the relationship applies for all time so long as the assumed simple relationships hold.
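The compatibility check can also be run numerically. What follows is a minimal sketch, not part of the original lectures. It assumes the same simple model: the number of scientists grows exponentially with doubling period D, and each scientist is counted for a lifetime of L years, so the fraction now alive works out to 1 - 2^(-L/D). The function names are illustrative only.

```python
import math


def fraction_alive(lifetime_years: float, doubling_years: float) -> float:
    """Fraction of all scientists who ever lived that are alive now.

    Assumes steady exponential growth from the indefinite past,
    which gives 1 - 2**(-L/D).
    """
    return 1.0 - 2.0 ** (-lifetime_years / doubling_years)


def lifetime_for_fraction(fraction: float, doubling_years: float) -> float:
    """Lifetime L for which the alive fraction equals `fraction`.

    Inverts the formula above: L = D * log2(1 / (1 - fraction)).
    """
    return doubling_years * math.log2(1.0 / (1.0 - fraction))


if __name__ == "__main__":
    # Definite numbers first: D = 17 years, L = 55 years.
    print(f"fraction alive: {fraction_alive(55, 17):.3f}")
    # Then the parametric question: which L gives exactly 90%?
    print(f"lifetime for 90%: {lifetime_for_fraction(0.90, 17):.2f} years")
```

Changing the two arguments is the programmatic version of playing with the ratio L/D until the fit to the data improves.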
The reason back of the envelope calculations are widely used by great scientists is clearly revealed: you get a good feeling for the truth or falsity of what was claimed, as well as realize which factors you were inclined not to think about, such as exactly what was meant by the lifetime of a scientist. Having done the calculation you are much more likely to retain the results in your mind. Furthermore, such calculations keep the ability to model situations fresh and ready for more important applications as they arise. Thus I recommend when you hear quantitative remarks such as the above you turn to a quick modeling to see if you believe what is being said, especially when given in the public media like the press and TV. Very often you find what is being said is nonsense: either no definite statement is made which you can model, or if you can set up the model then the results of the model do not agree with what was said. I found it very valuable at the physics table I used to eat with; I sometimes cleared up misconceptions at the time they were being formed, thus advancing matters significantly.
Added to the problem of the growth of new knowledge is the obsolescence of old knowledge. It is claimed by many the half-life of the technical knowledge you just learned in school is about 15 years; in 15 years half of it will be obsolete (either we have gone in other directions or have replaced it with new material). For example, having taught myself a bit about vacuum tubes (because at Bell Telephone Laboratories they were at that time obviously important) I soon found myself helping, in the form of computing, the development of transistors, which obsoleted my just learned knowledge!
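The half-life claim can be turned into a quick estimate of how much of what you learned is still current after a given number of years. This is only a sketch. The text gives the 15-year figure, while the exponential decay shape (suggested by the word "half-life") and the function name are assumptions made here for illustration.

```python
def fraction_still_current(years: float, half_life_years: float = 15.0) -> float:
    """Fraction of freshly learned technical knowledge not yet obsolete.

    Assumes simple exponential decay with the given half-life.
    """
    return 0.5 ** (years / half_life_years)


if __name__ == "__main__":
    for years in (15, 30, 45):
        print(f"after {years} years: {fraction_still_current(years):.3f}")
```

Under these assumptions half remains after 15 years, a quarter after 30, and an eighth after 45, which is roughly the span of a working career.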
To bring the meaning of this doubling down to your own life, suppose you have a child when you are x years old. That child will face, when it is in college, about y times the amount you faced.

y (factor of increase)    x (years)
2                         17
3                         27
4                         34
5                         39
6                         44
7                         48
8                         51
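The table appears to follow directly from the 17-year doubling period: growth by a factor of y takes x = 17 × log2(y) years, rounded to the nearest year. The sketch below regenerates it under that assumption. The names are illustrative and not from the original text.

```python
import math

DOUBLING_YEARS = 17  # doubling period quoted in the text


def years_for_factor(factor: float, doubling_years: float = DOUBLING_YEARS) -> float:
    """Years needed for knowledge to grow by `factor` under steady doubling."""
    return doubling_years * math.log2(factor)


def growth_table(factors=range(2, 9)):
    """Return (factor, years rounded to the nearest year) pairs."""
    return [(y, round(years_for_factor(y))) for y in factors]


if __name__ == "__main__":
    print(f"{'factor':>6} {'years':>6}")
    for y, x in growth_table():
        print(f"{y:>6} {x:>6}")
```

Replacing DOUBLING_YEARS with another value shows how sensitive the table is to the assumed doubling period.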
The Knowledge Explosion
- The volume of technical knowledge is doubling at an exponential rate, affecting everything from mathematics to personal lifestyle choices.
- Future generations will face a staggering mass of information, with technical knowledge expected to quadruple within a single career span.
- To avoid obsolescence, professionals must prioritize mastering fundamentals rather than trying to memorize every fleeting technical detail.
- The ability to rapidly learn and adapt to entirely new fields is a critical survival skill for engineers and scientists.
- Fundamentals can be identified by their longevity and their ability to serve as the logical foundation from which an entire field can be derived.
If you were at times awed by the mass of knowledge you faced when you went to college, or even now, think of your children's troubles when they are there!
This doubling is not just in theorems of mathematics and technical results, but in musical recordings of Beethoven's Ninth, of where to go skiing, of TV programs to watch or not to watch. If you were at times awed by the mass of knowledge you faced when you went to college, or even now, think of your children's troubles when they are there! The technical knowledge involved in your life will quadruple in 34 years, and many of you will then be near the high point of your career. Pick your estimated years to retirement and then look in the left-hand column for the probable factor of increase over the present current knowledge when you finally quit!
What is my answer to this dilemma? One answer is you must concentrate on fundamentals, at least what you think at the time are fundamentals, and also develop the ability to learn new fields of knowledge when they arise so you will not be left behind, as so many good engineers are in the long run. In the position I found myself in at the Laboratories, where I was the only one locally who seemed (at least to me) to have a firm grasp on computing, I was forced to learn numerical analysis, computers, pretty much all of the physical sciences at least enough to cope with the many different computing problems which arose and whose solution could benefit the Labs, as well as a lot of the social and some of the biological sciences. Thus I am a veteran of learning enough to get along without at the same time devoting all my effort to learning new topics and thereby not contributing my share to the total effort of the organization. The early days of learning had to be done while I was developing and running a computing center. You will face similar problems in your career as it progresses, and, at times, face problems which seem to overwhelm you.
How are you to recognize "fundamentals"? One test is they have lasted a long time. Another test is from the fundamentals all the rest of the field can be derived by using the standard methods in the field.
I need to discuss science vs. engineering. Put glibly:
Science, Engineering, and Future Prediction
- Science is defined by exploring the unknown, while engineering relies on applying the known, though the two fields are increasingly merging.
- The rapid pace of progress necessitates lifelong self-teaching, as much of the knowledge required for a career is created after formal education ends.
- Predicting the future is notoriously difficult, with methods ranging from simple linear extrapolation to complex historical analysis.
- Human factors like ego, inertia, and organizational rules often influence the evolution of technology more than physical limitations.
- Long-term predictions are frequently pessimistic because people fail to grasp the power of geometric growth and compounding knowledge.
- The field of Artificial Intelligence serves as a notable exception where long-term predictions have been consistently over-optimistic.
In science if you know what you are doing you should not be doing it. In engineering if you do not know what you are doing you should not be doing it.
In science if you know what you are doing you should not be doing it.
In engineering if you do not know what you are doing you should not be doing it.
Of course, you seldom, if ever, see either pure state. All of engineering involves some creativity to cover the parts not known, and almost all of science includes some practical engineering to translate the abstractions into practice. Much of present science rests on engineering tools, and as time goes on, engineering seems to involve more and more of the science part. Many of the large scientific projects involve very serious engineering problems; the two fields are growing together! Among other reasons for this situation is almost surely we are going forward at an accelerated pace, and now there is not time to allow us the leisure which comes from separating the two fields. Furthermore, both the science and the engineering you will need for your future will more and more often be created after you left school. Sorry! But you will simply have to actively master on your own the many new emerging fields as they arise, without having the luxury of being passively taught.
It should be noted that engineering is not just applied science, which is a distinct third field (though it is
not often recognized as such) which lies between science and engineering.
I read somewhere there are 76 different methods of predicting the future, but the very number suggests there
is no reliable method which is widely accepted. The most trivial method is to predict tomorrow will be
exactly the same as today, which at times is a good bet. The next level of sophistication is to use the current
rates of change and to suppose they will stay the same: linear prediction in the variable used. Which variable
you use can, of course, strongly affect the prediction made! Both methods are not much good for long-term
predictions, however.
History is often used as a long-term guide; some people believe history repeats itself and others believe
exactly the opposite! It is obvious:
The past was once the future and the future will become the past.
In any case I will often use history as a background for the extrapolations I make. I believe the best
predictions are based on understanding the fundamental forces involved, and this is what I depend on
mainly. Often it is not physical limitations which control but rather it is human-made laws, habits, and
organizational rules, regulations, personal egos, and inertia, which dominate the evolution to the future. You
have not been trained along these lines as much as I believe you should have been, and hence I must be careful
to include them whenever the topics arise.
There is a saying, "Short term predictions are always optimistic and long term predictions are always
pessimistic". The reason, so it is claimed, the second part is true is that for most people the geometric growth
due to the compounding of knowledge is hard to grasp. For example, for money a mere 6% annual growth
doubles the money in about 12 years! In 48 years the growth is a factor of 16. An example of the truth of
this claim that most long-term predictions are low is the growth of the computer field in speed, in density of
components, in drop in price, etc., as well as the spread of computers into the many corners of life. But the
field of Artificial Intelligence (AI) provides a very good counterexample. Almost all the leaders in the field
made long-term predictions which have almost never come true, and are not likely to do so within your
lifetime, though many will in the fullness of time.
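The compounding arithmetic in the 6% example is easy to check. The following is a minimal sketch, assuming growth compounded once per year; the function names are illustrative and not from the text. It also prints a straight-line extrapolation of the same rate for contrast, which is one way to see why long-term guesses tend to come out low.

```python
import math


def growth_factor(rate: float, years: int) -> float:
    """Factor by which a quantity grows at a fixed annual rate, compounded yearly."""
    return (1.0 + rate) ** years


def doubling_time(rate: float) -> float:
    """Years needed to double at a fixed annual rate, compounded yearly."""
    return math.log(2.0) / math.log(1.0 + rate)


if __name__ == "__main__":
    rate = 0.06  # the 6% annual growth used in the text
    print(f"doubling time: {doubling_time(rate):.1f} years")
    for years in (12, 24, 36, 48):
        compounded = growth_factor(rate, years)
        linear = 1.0 + rate * years  # naive straight-line extrapolation
        print(f"{years:2d} years: compounded x{compounded:.2f}, linear x{linear:.2f}")
```

At 48 years the compounded factor is a little over 16, while the straight-line guess is under 4.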
I shall use history as a guide many times in spite of Henry Ford, Sr. saying, "History is Bunk". Probably
Ford's points were:
1. History is seldom reported at all accurately, and I have found no two reports of what happened at Los
Alamos during WW-II seem to agree.
2. Due to the pace of progress the future is rather disconnected from the past; the presence of the modern
Vision and Future Determinism
- Historians often present the past as a series of inevitable trends while viewing the future as a realm of infinite possibility.
- There are four ways to resolve the contradiction between past determinism and future potential, including acknowledging the power of individual choice.
- Human biological evolution and social institutions are likely to constrain the future more than the rapid pace of technological advancement.
- Unforeseen inventions can disrupt even the most rigorous predictions, making foresight a difficult but necessary endeavor.
- The 'drunken sailor' analogy illustrates that having a consistent vision allows for linear progress rather than a random walk.
- The primary differentiator between high achievers and others is the possession of a vision versus merely reacting to current events.
In a lifetime of many, many independent choices, small and large, a career with a vision will get you a distance proportional to n, while no vision will get you only the distance square root of n.
computer is an example of the great differences which have arisen.
Reading some historians you get the impression the past was determined by big trends, but you also have
the feeling the future has great possibilities. You can handle this apparent contradiction in at least four ways:
1. You can simply ignore it.
2. You can admit it.
3. You can decide the past was a lot less determined than historians usually indicate and individual
choices can make large differences at times. Alexander the Great, Napoleon, and Hitler had great
effects on the physical side of life, while Pythagoras, Plato, Aristotle, Newton, Maxwell, and Einstein
are examples on the mental side.
4. You can decide the future is less open ended than you would like to believe, and there is really less
choice than there appears to be.
It is probable the future will be more limited by the slow evolution of the human animal and the
corresponding human laws, social institutions, and organizations than it will be by the rapid evolution of
technology.
In spite of the difficulty of predicting the future and that:
Unforeseen technological inventions can completely upset the most careful predictions,
you must try to foresee the future you will face. To illustrate the importance of this point of trying to foresee
the future I often use a standard story.
It is well known the drunken sailor who staggers to the left or right with n independent random steps will,
on the average, end up about $\sqrt{n}$ steps from the origin. But if there is a pretty girl in one direction, then his
steps will tend to go in that direction and he will go a distance proportional to n. In a lifetime of many, many
independent choices, small and large, a career with a vision will get you a distance proportional to n, while
no vision will get you only the distance $\sqrt{n}$. In a sense, the main difference between those who go far and
those who do not is some people have a vision and the others do not and therefore can only react to the
current events as they happen.
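The drunken sailor story can be tried numerically. What follows is a small simulation sketch, not anything from the text: it assumes a one-dimensional walk with unit steps, models "vision" as a modest bias (a 60% chance of stepping one way, an arbitrary illustrative value), and measures the root-mean-square distance from the origin over many trials.

```python
import math
import random


def walk(n_steps: int, p_right: float, rng: random.Random) -> int:
    """Final position after n_steps of +1 (with probability p_right) or -1."""
    return sum(1 if rng.random() < p_right else -1 for _ in range(n_steps))


def rms_distance(n_steps: int, p_right: float, trials: int, seed: int = 0) -> float:
    """Root-mean-square distance from the origin over many independent walks."""
    rng = random.Random(seed)
    total = sum(walk(n_steps, p_right, rng) ** 2 for _ in range(trials))
    return math.sqrt(total / trials)


if __name__ == "__main__":
    for n in (100, 400, 1600):
        no_vision = rms_distance(n, 0.5, trials=1000)  # unbiased: grows like sqrt(n)
        vision = rms_distance(n, 0.6, trials=1000)     # biased: grows roughly like n
        print(f"n={n:5d}  sqrt(n)={math.sqrt(n):6.1f}  "
              f"no vision={no_vision:7.1f}  vision={vision:7.1f}")
```

Quadrupling n should roughly double the unbiased distance but roughly quadruple the biased one, and even a weak bias dominates once n is large.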
The Necessity of Vision
- The primary goal of the course is to compel students to create a detailed vision of their future career, as drifting is the primary obstacle to greatness.
- A successful vision requires balancing what is scientifically possible, what is likely to happen through engineering, and what is ethically desirable.
- The author advocates for 'Friday afternoon' thinking, dedicating 10% of one's time to imagining future scientific and social shifts.
- Standard education fragments knowledge into departments, but professional success requires recognizing the homogeneity and unity of all information.
- Computers will dominate the future of technical life due to their inherent advantages in speed, reliability, and freedom from human boredom.
No vision, not much of a future.
One of the main tasks of this course is to start you on the path of creating in some detail your vision of
your future. If I fail in this I fail in the whole course. You will probably object that if you try to get a vision
now it is likely to be wrong, and my reply is that from observation I have seen the accuracy of the vision
matters less than you might suppose; getting anywhere is better than drifting, there are potentially many
paths to greatness for you, and just which path you go on, so long as it takes you to greatness, is none of my
business. You must, as in the case of forging your personal style, find your vision of your future career, and
then follow it as best you can.
No vision, not much of a future.
To what extent history does or does not repeat itself is a moot question. But it is one of the few guides you
have, hence history will often play a large role in my discussions; I am trying to provide you with some
perspective as a possible guide to create your vision of your future. The other main tool I have used is an
active imagination in trying to see what will happen. For many years I devoted about 10% of my time
(Friday afternoons) to trying to understand what would happen in the future of computing, both as a
scientific tool and as shaper of the social world of work and play. In forming your plan for your future you
need to distinguish three different questions:
What is possible?
What is likely to happen?
What is desirable to have happen?
In a sense the first is Science: what is possible. The second is Engineering: what are the human factors
which choose the one future that does happen from the ensemble of all possible futures. The third is ethics,
morals, or whatever other word you wish to apply to value judgments. It is important to examine all three
questions, and in so far as the second differs from the third, you will probably have an idea of how to alter
things to make the more desirable future occur, rather than let the inevitable happen and suffer the
consequences. Again, you can see why having a vision is what tends to separate the leaders from the
followers.
The standard process of organizing knowledge by departments, and subdepartments, and further breaking
it up into separate courses, tends to conceal the homogeneity of knowledge, and at the same time to omit
much which falls between the courses. The optimization of the individual courses in turn means a lot of
important things in Engineering practice are skipped since they do not appear to be essential to any one
course. One of the functions of this book is to mention and illustrate many of these missed topics which are
important in the practice of Science and Engineering. Another goal of the course is to show the essential
unity of all knowledge rather than the fragments which appear as the individual topics are taught. In your
future anything and everything you know might be useful, but if you believe the problem is in one area you
are not apt to use information that is relevant but which occurred in another course.
The course will center around computers. It is not merely because I spent much of my career in Computer
Science and Engineering, rather it seems to me computers will dominate your technical lives. I will repeat a
number of times in the book the following facts: Computers when compared to Humans have the
advantages:
Economics: far cheaper, and getting more so
Speed: far, far faster
Accuracy: far more accurate (precise)
Reliability: far ahead (many have error correction built into them)
Rapidity of control: many current airplanes are unstable and require rapid computer control to make them practical
Freedom from boredom: an overwhelming advantage
Bandwidth in and out: again overwhelming
Ease of retraining: change programs, not unlearn and then learn the new thing consuming hours and hours of
human time and effort
Excellence and Digital Foundations
- Machines offer distinct management advantages over humans in hostile environments because they lack personal needs, egos, and social complications.
- The author argues that a life dedicated to achieving excellence and making significant contributions is more rewarding than one of mere comfort.
- True fulfillment is found in the struggle toward a goal rather than the achievement itself, echoing the Socratic ideal of the examined life.
- The technological landscape is completing a total shift from continuous (analog) signaling to discrete (digital) pulse-based systems.
- Digital signaling is superior to analog because it prevents the compounding of errors and noise during the amplification process.
It has often been observed the true gain is in the struggle and not in the achievementâa life without a struggle on your part to make yourself excellent is hardly a life worth living.
Hostile environments: outer space, underwater, high radiation fields, warfare, manufacturing situations that are
unhealthful, etc.
Personnel problems: they tend to dominate management of humans but not of machines; with machines there
are no pensions, personal squabbles, unions, personal leave, egos, deaths of relatives,
recreation, etc.
I need not list the advantages of humans over computers; almost every one of you has already objected to
this list and has in your mind started to cite the advantages on the other side.
Lastly, in a sense, this is a religious course: I am preaching the message that, with apparently only one
life to live on this earth, you ought to try to make significant contributions to humanity rather than just get
along through life comfortably; that the life of trying to achieve excellence in some area is in itself a
worthy goal for your life. It has often been observed the true gain is in the struggle and not in the
achievement; a life without a struggle on your part to make yourself excellent is hardly a life worth living.
This, it must be observed, is an opinion and not a fact, but it is based on observing many people's lives and
speculating on their total happiness rather than the moment to moment pleasures they enjoyed. Again, this
opinion of their happiness must be my own interpretation as no one can know another's life. Many reports
by people who have written about the "good life" agree with the above opinion. Notice I leave it to you to
pick your goals of excellence, but claim only a life without such a goal is not really living but it is merely
existing, in my opinion. In ancient Greece Socrates (469-399) said:
The unexamined life is not worth living.
2
Foundations of the Digital (Discrete) Revolution
We are approaching the end of the revolution of going from signaling with continuous signals to signaling
with discrete pulses, and we are now probably moving from using pulses to using solitons as the basis for
our discrete signaling. Many signals occur in Nature in a continuous form (if you disregard the apparent
discrete structure of things built out of molecules and electrons). Telephone voice transmission, musical
sounds, heights and weights of people, distance covered, velocities, densities, etc. are examples of
continuous signals. At present we usually convert the continuous signal almost immediately to a sampled
discrete signal; the sampling being usually at equally spaced intervals in time and the amount of the signal
being quantized to a comparatively few levels. Quantization is a topic we will ignore in these chapters,
though it is important in some situations, especially in large scale computations with numbers.
Why has this revolution happened?
1. In continuous signaling (transmission) you often have to amplify the signal to compensate for natural
losses along the way. Any error made at one stage, before or during amplification, is naturally amplified by
the next stage. For example, the telephone company in sending a voice across the continent might have a
total amplification factor of $10^{120}$. At first $10^{120}$ seems to be very large so we do a quick back of the envelope
The Digital Information Revolution
- Digital signaling uses repeaters rather than amplifiers to automatically remove noise, allowing for high-fidelity transmission without requiring exquisite hardware accuracy.
- The transition from analog to digital computation enables deeper and more accurate processing, though analog systems remain useful for simple, low-accuracy tasks.
- Integrated circuits revolutionized computing by eliminating problematic soldered joints and increasing speed through high component density.
- Interconnection costs scale dramatically by orders of magnitude, from fractions of a cent on-chip to dollars between frames.
- Society is shifting from a material-based economy to an information-service economy, with a projected 75% of the workforce handling information by 2020.
- Information differs from material goods because it is organized rather than consumed, despite being stored in physical forms like books or films.
Noise introduced at one spot, if not too much to make the pulse detection wrong at the next repeater, is automatically removed.
modeling to see if it is reasonable. Consider the system in more detail. Suppose each amplifier has a gain of
100, and they are spaced every 50 miles. The actual path of the signal may well be over 3000 miles, hence
some 60 amplifiers, hence the above factor does seem reasonable now we have seen how it can arise. It
should be evident such amplifiers had to be built with exquisite accuracy if the system was to be suitable for
human use.
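The back of the envelope estimate can be written out directly. This is only a sketch of the arithmetic given above (gain of 100, 50 mile spacing, a 3000 mile path); the 1% gain error in the last step is an illustrative assumption of mine, added to suggest why the amplifiers needed such accuracy.

```python
import math

path_miles = 3000          # approximate length of the signal path
spacing_miles = 50         # distance between amplifiers
gain_per_amplifier = 100   # gain of each amplifier

n_amplifiers = path_miles // spacing_miles
total_gain = gain_per_amplifier ** n_amplifiers  # exact integer arithmetic
exponent = n_amplifiers * math.log10(gain_per_amplifier)

print(f"{n_amplifiers} amplifiers, total gain = 10^{exponent:.0f}")

# Assumed for illustration: every amplifier's gain is 1% too high.
# The relative error in the total gain compounds across the whole chain.
relative_error = 1.01 ** n_amplifiers
print(f"a 1% gain error per stage multiplies the output by {relative_error:.2f}")
```

Sixty stages of gain 100 give exactly the factor quoted, and a 1% error per stage compounds to nearly a factor of two at the far end.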
Compare this to discrete signaling. At each stage we do not amplify the signal, but rather we use the
incoming pulse to gate, or not, a standard source of pulses; we actually use repeaters, not amplifiers. Noise
introduced at one spot, if not too much to make the pulse detection wrong at the next repeater, is
automatically removed. Thus with remarkable fidelity we can transmit a voice signal if we use digital
signaling, and furthermore the equipment need not be built extremely accurately. We can use, if necessary,
error detecting and error correcting codes to further defeat the noise. We will examine these codes later,
Chapters 10-12. Along with this we have developed the area of digital filters which are often much more
versatile, compact, and cheaper than are analog filters, Chapters 14-17. We should note here transmission
through space (typically signaling) is the same as transmission through time (storage).
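The difference between amplifying and repeating can be seen in a toy simulation. The sketch below is a deliberately simplified model and rests on assumptions not in the text: pulses are sent as +1 or -1, each hop adds independent Gaussian noise of standard deviation 0.2, and the analog chain is taken to compensate its losses exactly so the noise simply accumulates. The repeater thresholds the incoming level and emits a fresh clean pulse.

```python
import random


def transmit(bits, stages, sigma, regenerate, rng):
    """Send +/-1 pulses through `stages` noisy hops and return the detected bits."""
    received = []
    for bit in bits:
        level = 1.0 if bit else -1.0
        for _ in range(stages):
            level += rng.gauss(0.0, sigma)  # noise picked up on this hop
            if regenerate:
                # Repeater: decide which pulse arrived, then emit a clean one.
                level = 1.0 if level > 0.0 else -1.0
        received.append(level > 0.0)
    return received


def compare(n_bits=2000, stages=60, sigma=0.2, seed=1):
    """Return (analog_errors, digital_errors) for the same random bit stream."""
    rng = random.Random(seed)
    bits = [rng.random() < 0.5 for _ in range(n_bits)]
    analog = transmit(bits, stages, sigma, False, rng)
    digital = transmit(bits, stages, sigma, True, rng)
    analog_errors = sum(s != r for s, r in zip(bits, analog))
    digital_errors = sum(s != r for s, r in zip(bits, digital))
    return analog_errors, digital_errors


if __name__ == "__main__":
    analog_errors, digital_errors = compare()
    print(f"analog chain errors:  {analog_errors} of 2000")
    print(f"repeater chain errors: {digital_errors} of 2000")
```

With these assumed numbers the accumulated noise in the analog chain is larger than the pulse itself, while each repeater only has to survive the noise of a single hop.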
Digital computers can take advantage of these features and carry out very deep and accurate
computations which are beyond the reach of analog computation. Analog computers have probably passed
their peak of importance, but should not be dismissed lightly. They have some features which, so long as
great accuracy or deep computations are not required, make them ideal in some situations.
2. The invention and development of transistors and the integrated circuits, ICs, has greatly helped the
digital revolution. Before ICs the problem of soldered joints dominated the building of a large computer,
and ICs did away with most of this problem, though soldered joints are still troublesome. Furthermore, the
high density of components in an IC means lower cost and higher speeds of computing (the parts must be
close to each other since otherwise the time of transmission of signals will significantly slow down the
speed of computation). The steady decrease of both the voltage and current levels has contributed to the
partial solving of heat dissipation.
It was estimated in 1992 that interconnection costs were approximately:
Interconnection on the chip: $10^{-5}$ dollars = 0.001 cent
Interchip: $10^{-2}$ dollars = 1 cent
Interboard: $10^{-1}$ dollars = 10 cents
Interframe: $10^{0}$ dollars = 100 cents
3. Society is steadily moving from a material goods society to an information service society. At the time of
the American Revolution, say 1780 or so, over 90% of the people were essentially farmers; now farmers
are a very small percent of workers. Similarly, before WW-II most workers were in factories; now less
than half are there. In 1993, there were more people in Government (excluding the military), than there
were in manufacturing! What will the situation be in 2020? As a guess I would say less than 25% of the
people in the civilian work force will be handling things, the rest will be handling information in some form
or other. In making a movie or a TV program you are making not so much a thing, though of course it does
have a material form, as you are organizing information. Information is, of course, stored in a material
form, say a book (the essence of a book is information), but information is not a material good to be consumed
like food, a house, clothes, an automobile, or an airplane ride for transportation.
The information revolution arises from the above three items plus their synergistic interaction, though the
following items also contribute.
4. The computers make it possible for robots to do many things, including much of the present
The Evolution of Mechanization
- Robotic control will likely evolve beyond standard von Neumann computing to include neural networks and fuzzy logic.
- Robots in manufacturing prioritize tighter quality control, lower costs, and the creation of fundamentally different products.
- Successful mechanization requires an imaginative redesign of the product rather than a literal imitation of hand-crafted versions.
- Large-scale organizational success depends on a flexible 'give-and-take' approach to process transformation.
- Field maintenance must be integrated into the initial design phase of complex systems to prevent it from dominating long-term costs.
It has rarely proved practical to produce exactly the same product by machines as we produced by hand.
manufacturing. Evidently computers will play a dominant role in robot operation, though one must
be careful not to claim the standard von Neumann type of computer will be the sole control mechanism,
rather probably the current neural net computers, fuzzy set logic, and variations will do much of the control.
Setting aside the child's view of a robot as a machine resembling a human, but rather thinking of it as a
device for handling and controlling things in the material world, robots used in manufacturing do the
following:
A. Produce a better product under tighter control limits.
B. Produce usually a cheaper product.
C. Produce a different product.
This last point needs careful emphasis.
When we first passed from hand accounting to machine accounting we found it necessary, for
economical reasons if no other, to somewhat alter the accounting system. Similarly, when we passed from
strict hand fabrication to machine fabrication we passed from mainly screws and bolts to rivets and
welding.
It has rarely proved practical to produce exactly the same product by machines as we produced by
hand.
Indeed, one of the major items in the conversion from hand to machine production is the imaginative
redesign of an equivalent product. Thus in thinking of mechanizing a large organization, it won't work if
you try to keep things in detail exactly the same, rather there must be a larger give-and-take if there is to be
a significant success. You must get the essentials of the job in mind and then design the mechanization to do
that job rather than trying to mechanize the current version, if you want a significant success in the long
run.
I need to stress this point; mechanization requires you produce an equivalent product, not identically the
same one. Furthermore, in any design it is now essential to consider field maintenance since in the long run
it often dominates all other costs. The more complex the designed system the more field maintenance must
be central to the final design. Only when field maintenance is part of the original design can it be safely
controlled; it is not wise to try to graft it on later. This applies to both mechanical things and to human
organizations.
5. The effects of computers on Science have been very large, and will probably continue as time goes on.
The Rise of Simulation
- The author's experience at Los Alamos proved that large-scale computing is essential when physical experiments are impossible or dangerous.
- A massive shift has occurred from performing 90% of experiments in physical labs to performing over 90% via computer simulation.
- Simulations offer greater flexibility and lower costs, allowing researchers to test scenarios that cannot be replicated in a physical environment.
- There is a growing risk of returning to 'Middle Age scholasticism' by trusting computer models more than the actual behavior of Nature.
- Computers enable the engineering of unstable systems, such as high-speed aircraft, by providing stabilization speeds beyond human capability.
- Modern engineering has shifted from the limitation of 'what can we do' to the ethical and creative choice of 'what do we want to do.'
We are now looking more and more in books and less and less at Nature! There is clearly a risk we will go too far occasionallyâand I expect this will happen frequently in the future.
My first experience in large scale computing was in the design of the original atomic bomb at Los Alamos.
There was no possibility of a small scale experiment (either you have a critical mass or you do not), and
hence computing seemed at that time to be the only practical approach. We simulated, on primitive IBM
accounting machines, various proposed designs, and they gradually came down to a design to test in the
desert at Alamogordo, NM.
From that one experience, on thinking it over carefully and what it meant, I realized computers would
allow the simulation of many different kinds of experiments. I put that vision into practice at Bell
Telephone Laboratories for many years. Somewhere in the mid-to-late 1950s in an address to the President
and V.P.s of Bell Telephone Laboratories I said, "At present we are doing 1 out of 10 experiments on the
computers and 9 in the labs, but before I leave it will be 9 out of 10 on the machines". They did not believe
me then, as they were sure real observations were the key to experiments and I was just a wild theoretician
from the mathematics department, but you all realize by now we do somewhere between 90% to 99% of
our experiments on the machines and the rest in the labs. And this trend will go on! It is so much cheaper to
do simulations than real experiments, so much more flexible in testing, and we can even do things which
cannot be done in any lab, that it is inevitable the trend will continue for some time. Again, the product was
changed!
But you were all taught about the evils of the Middle Age scholasticism: people deciding what would
happen by reading in the books of Aristotle (384-322) rather than looking at Nature. This was Galileo's
(1564-1642) great point which started the modern scientific revolution: look at Nature not in books! But what
was I saying above? We are now looking more and more in books and less and less at Nature! There is
clearly a risk we will go too far occasionally, and I expect this will happen frequently in the future. We must
not forget, in all the enthusiasm for computer simulations, occasionally we must look at Nature as She is.
6. Computers have also greatly affected Engineering. Not only can we design and build far more complex
things than we could by hand, we can explore many more alternate designs. We also now use computers to
control situations such as on the modern high speed airplane where we build unstable designs and then use
high speed detection and computers to stabilize them since the unaided pilot simply cannot fly them directly.
Similarly, we can now do unstable experiments in the laboratories using a fast computer to control the
instability. The result will be that the experiment will measure something very accurately right on the edge
of stability.
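The idea of stabilizing an unstable design by fast computer control can be illustrated with a toy model. This is a hedged sketch only, not a model of any real aircraft: it assumes a one-variable discrete-time system x[k+1] = a*x[k] + u[k] with a = 1.1 (so any small disturbance grows by itself), and a hypothetical proportional controller u[k] = -g*x[k] applied at every step.

```python
def simulate(a: float, feedback_gain: float, x0: float, steps: int) -> list:
    """Iterate x[k+1] = a*x[k] + u[k] with the control u[k] = -feedback_gain*x[k]."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        u = -feedback_gain * x  # the computer measures x and corrects every step
        xs.append(a * x + u)
    return xs


if __name__ == "__main__":
    a = 1.1  # assumed open-loop growth per step (greater than 1 means unstable)
    uncontrolled = simulate(a, feedback_gain=0.0, x0=0.01, steps=50)
    controlled = simulate(a, feedback_gain=0.6, x0=0.01, steps=50)
    print(f"after 50 steps, uncontrolled: {uncontrolled[-1]:.3f}")
    print(f"after 50 steps, controlled:   {controlled[-1]:.3e}")
```

With the feedback in place the effective growth factor per step is a - g = 0.5, so the disturbance dies out instead of growing; the correction must be applied at every step, which is why the speed of the computer matters.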
As noted above, Engineering is coming closer to Science, and hence the role of simulation in unexplored
situations is rapidly increasing in Engineering as well as Science. It is also true computers are now often an
essential component of a good design.
In the past Engineering has been dominated to a great extent by "what can we do", but now "what do we
want to do" looms greater since we now have the power to design almost anything we want. More than ever
before, Engineering is a matter of choice and balance rather than just doing what can be done. And more
and more it is the human factors which will determine good design, a topic which needs your serious
attention at all times.
7. The effects on society are also large. The most obvious illustration is computers have given top
The Curse of Micromanagement
- Top management frequently fails to resist the urge to micromanage, even when publicly claiming to decentralize.
- Micromanagement prevents lower-level managers from gaining the decision-making experience necessary for future leadership roles.
- Centralized planning often fails because it lacks the 'local view' and specific details known only to those on the front lines.
- The 'Not Invented Here' (NIH) syndrome flourishes in centrally controlled systems, stifling grassroots innovation.
- A counter-trend is emerging where small, independent organizations form loose associations to maintain autonomy and efficiency.
- Computers have historically enabled micromanagement, but they are now also transforming entertainment and personal life in ways yet to be fully realized.
The people at the bottom do not have the larger, global view, but at the top they do not have the local view of all the details, many of which can often be very important, so either extreme gets poor results.
management the power to micromanage their organization, and top management has shown little or no
ability to resist using this power. You can regularly read in the papers some big corporation is
decentralizing, but when you follow it for several years you see they merely intended to do so, but did not.
Among other evils of micromanagement is that lower management does not get the chance to make
responsible decisions and learn from their mistakes, but rather, because the older people finally retire,
lower management then finds itself as top management without having had many real experiences in
management!
Furthermore, central planning has been repeatedly shown to give poor results (consider the Russian
experiment for example or our own bureaucracy). The persons on the spot usually have better knowledge
than can those at the top and hence can often (not always) make better decisions if things are not
micromanaged. The people at the bottom do not have the larger, global view, but at the top they do not have
the local view of all the details, many of which can often be very important, so either extreme gets poor
results.
Next, an idea which arises in the field, based on the direct experience of the people doing the job, cannot
get going in a centrally controlled system since the managers did not think of it themselves. The not
invented here (NIH) syndrome is one of the major curses of our society, and computers with their ability to
encourage micromanagement are a significant factor.
There is slowly coming, but apparently definitely, a counter trend to micromanagement. Loose
connections between small, somewhat independent organizations, are gradually arising. Thus in the
brokerage business one company has set itself up to sell its services to other small subscribers, for example,
computer and legal services. This leaves the brokerage decisions of their customers to the local
management people who are close to the front line of activity. Similarly, in the pharmaceutical area some
loosely related companies carry out their work and intertrade among themselves as they see fit. I believe
you can expect to see much more of this loose association between small organizations as a defense against
micromanagement from the top which occurs so often in big organizations. There has always been some
independence of subdivisions in organizations, but the power to micromanage from the top has apparently
destroyed the conventional lines and autonomy of decision making, and I doubt the ability of most top
managements to resist for long the power to micromanage. I also doubt many large companies will be able
to give up micromanagement; most will probably be replaced in the long run by smaller organizations
without the cost (overhead) and errors of top management. Thus computers are affecting the very structure
of how Society does its business, and for the moment apparently for the worse in this area.
8. Computers have already invaded the entertainment field. An informal survey indicates the average
American spends far more time watching TV than in eating; again an information field is taking precedence
over the vital material field of eating! Many commercials and some programs are now either partially or
completely computer produced.
How far machines will go in changing society is a matter of speculation, which opens doors to topics
that would cause trouble if discussed openly! Hence I must leave it to your imaginations as to what, using
computers on chips, can be done in such areas as sex, marriage, sports, games, "travel in the comforts of
home via virtual realities", and other human activities.
Evolution of the Computer Revolution
- Computers have rapidly transitioned from simple number crunching to complex symbol manipulation, decision-making, and operational control.
- Modern warfare, exemplified by the Gulf War, has shifted into a domain where information dominance is the primary factor for success.
- The increasing role of machines in decision-making suggests that traditional human roles in business and the military are becoming obsolete.
- Future leaders must critically re-evaluate past doctrines and 'rethink everything' to adapt to a world saturated by artificial intelligence.
- Technological and field growth typically follows an 'S' shaped curve, starting slowly before rising rapidly and eventually hitting natural limits.
- Mathematical modeling of growth requires accounting for finite limits, as unlimited exponential growth is physically impossible in a finite universe.
I believe computers will be almost everywhere since I once saw a sign which read, "The battle field is no place for the human being".
Computers began mainly in the number crunching field but passed rapidly on to information retrieval (say airline reservation systems); word processing, which is spreading everywhere; symbol manipulation, as is done by many programs such as those which can do analytic integration in the calculus far better and cheaper than can the students; and logical and decision areas, where many companies use such programs to control their operations from moment to moment. The future computer invasion of traditional fields remains to be seen and will be discussed later under the heading of artificial intelligence (AI), Chapters 6-8.
9. In the military it is easy to observe (in the Gulf War for example) the central role of information, and the failure to use the information about one's own situation killed many of our own people! Clearly that war was one of information above all else, and it is probably one indicator of the future. I need not tell you such things since you are all aware, or should be, of this trend. It is up to you to try to foresee the situation in the year 2020 when you are at the peak of your careers. I believe computers will be almost everywhere since I once saw a sign which read, "The battle field is no place for the human being". Similarly for situations requiring constant decision making. The many advantages of machines over humans were listed near the end of the last chapter and it is hard to get around these advantages, though they are certainly not everything. Clearly the role of humans will be quite different from what it has traditionally been, but many of you will insist on old theories you were taught long ago as if they would be automatically true in the long future. It will be the same in business: much of what is now taught is based on the past, and has ignored the computer revolution and our responses to some of the evils the revolution has brought; the gains are generally clear to management, the evils are less so.
How much the trends, predicted in part 6 above, toward and away from micromanagement will apply widely is again a topic best left to you, but you will be a fool if you do not give it your deep and constant attention. I suggest you must rethink everything you ever learned on the subject, question every successful doctrine from the past, and finally decide for yourself its future applicability. The Buddha told his disciples, "Believe nothing, no matter where you read it, or who said it, no matter if I have said it, unless it agrees with your own reason and your own common sense". I say the same to you: you must assume the responsibility for what you believe.
I now pass on to a topic that is often neglected, the rate of evolution of some special field, which I will treat as another example of "back of the envelope computation". The growth of most, but by no means all, fields follows an "S" shaped curve. Things begin slowly, then rise rapidly, and later flatten off as they hit some natural limits.
The simplest model of growth assumes the rate of growth is proportional to the current size, something like compound interest, unrestrained bacterial and human population growth, as well as many other examples. The corresponding differential equation is

$$\frac{dy}{dt} = ky$$

whose solution is, of course,

$$y(t) = Ae^{kt}$$

But this growth is unlimited and all things must have limits, even knowledge itself since it must be recorded in some form and we are (currently) told the universe is finite! Hence we must include a limiting factor in the differential equation. Let L be the upper limit. Then the next simplest growth equation seems to be

$$\frac{dy}{dt} = ky(L - y)$$

At this point we, of course, reduce it to a standard form that eliminates the constants. Set y = Lz and x = kLt; then we have

$$\frac{dz}{dx} = z(1 - z)$$

as the reduced form for the growth problem, where the saturation level is now 1. Separation of variables plus partial fractions yields:

$$\ln\frac{z}{1 - z} = x + C, \qquad z = \frac{A}{A + e^{-x}}, \quad A = e^{C}$$
S-Curves and Computing Limits
- The growth of systems often follows an 'S' curve, where initial conditions determine the starting point but the fundamental shape remains constant.
- Physical constraints, such as the speed of light and heat dissipation, suggest that single-processor computer performance is approaching a saturation point.
- The shift toward highly parallel processing indicates that the industry is feeling the upper limits of the current technological growth curve.
- New innovations can trigger a transition to a new 'S' curve, effectively launching a new cycle of growth from the saturation level of the previous one.
- Future electrical engineering will shift from fundamental circuit design to the strategic selection and programming of off-the-shelf integrated chips.
- Using general-purpose chips is often superior to custom designs because the broader user base helps identify errors and reduces individual design costs.
Often a new innovation will set the growth of a field onto a new "S" curve which takes off from around the saturation level of the old one.
A is, of course, determined by the initial conditions, where you put t (or x) = 0. You see immediately the "S"
shape of the curve: at t = -∞, z = 0; at t = 0, z = A/(A+1); and at t = +∞, z = 1.
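As a quick numerical check of the reduced logistic curve, the following minimal sketch evaluates z(x) = A/(A + e^(-x)) and compares a centered finite-difference slope with z(1 - z). The function names and the sample value A = 0.25 are illustrative assumptions, not taken from the text.

```python
import math

def logistic(x, A):
    """Reduced logistic solution z(x) = A / (A + exp(-x))."""
    return A / (A + math.exp(-x))

def slope_residual(x, A, h=1e-5):
    """Centered-difference dz/dx minus the right-hand side z(1 - z)."""
    z = logistic(x, A)
    dz = (logistic(x + h, A) - logistic(x - h, A)) / (2 * h)
    return dz - z * (1 - z)

A = 0.25  # illustrative choice; it only fixes where x = 0 sits on the curve
for x in (-5.0, 0.0, 5.0):
    print(f"x={x:+.1f}  z={logistic(x, A):.6f}  residual={slope_residual(x, A):.2e}")
```

Changing A slides the same curve along the x-axis; the residual should stay near zero for any positive A.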
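A more flexible model for the growth is (in the reduced variables)

$$\frac{dz}{dx} = z^{a}(1 - z)^{b}, \qquad a, b > 0$$

This is again a variables separable equation, and also yields to numerical integration if you wish. We can analytically find the steepest slope by differentiating the right hand side and equating to 0. We get

$$a z^{a-1}(1 - z)^{b} - b z^{a}(1 - z)^{b-1} = 0, \quad\text{that is,}\quad a(1 - z) = bz$$

Hence at the place

$$z = \frac{a}{a + b}$$

we have the maximum slope

$$\frac{a^{a} b^{b}}{(a + b)^{a+b}}$$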
A direction field sketch (Figure 2.I) will often indicate the nature of the solution, and is particularly easy to do as the slope depends only on z and not on x: the isoclines are horizontal lines, so the solution can be slid along the x-axis without changing the "shape" of the solution. For a given a and b there is really only one shape, and the initial conditions determine where you look, not what you look at. When the differential equation has coefficients which do not depend on the independent variable then you have this kind of effect.

Figure 2.I
In the special case of a = b we have
maximum slope = 1/2^(2a).
The curve will in this case be odd symmetric about the point where z = 1/2.
In the further special case of a = b = 1/2 we get the solution

$$z = \frac{1 + \sin x}{2}, \qquad -\frac{\pi}{2} \le x \le \frac{\pi}{2}$$

Here we see the solution curve has a finite range. For larger exponents a and b we have clearly an infinite
range.
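The location and size of the steepest slope can be checked by brute force. The sketch below, with illustrative function names and sample exponents that are assumptions rather than anything in the text, compares the analytic values z = a/(a+b) and a^a b^b/(a+b)^(a+b) with a simple scan of the right hand side over 0 < z < 1.

```python
def growth_rate(z, a, b):
    """Right-hand side of dz/dx = z**a * (1 - z)**b."""
    return z**a * (1 - z)**b

def steepest_point(a, b):
    """Analytic location and value of the maximum slope."""
    z_star = a / (a + b)
    return z_star, a**a * b**b / (a + b)**(a + b)

def steepest_point_grid(a, b, n=100_000):
    """Brute-force check: scan z over a uniform grid strictly inside (0, 1)."""
    best = max(range(1, n), key=lambda i: growth_rate(i / n, a, b))
    return best / n, growth_rate(best / n, a, b)

for a, b in [(1, 1), (2, 1), (0.5, 0.5)]:
    print((a, b), steepest_point(a, b), steepest_point_grid(a, b))
```

For a = b the analytic value reduces to 1/2^(2a), which is the special case quoted above.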
As an application of the above, consider that the rate of increase in computer operations per second has been fairly constant for many years; thus we are clearly on the almost straight line part of the "S" curve. (More on this in the next chapter.) In this case we can more or less know the saturation point for the von Neumann, single processor, type of computer. Since we believe (1) the world is made out of molecules, and (2) the evidence from the two relativity theories, special and general, gives a maximum speed of useful signaling, there are definite limits to what can be done with a single processor. The trend to highly parallel processors is the indication we are feeling the upper saturation limit of the "S" curve for single processor computers. There is also the nasty problem of heat dissipation to be considered. We will discuss this matter in more detail in the next chapter.
Again we see how a simple model, while not very exact in detail, suggests the nature of the situation. Whether parallel processing fits into this picture, or is an independent curve, is not clear at this moment. Often a new innovation will set the growth of a field onto a new "S" curve which takes off from around the saturation level of the old one (Figure 2.II). You may want to explore models which do not have a hard upper saturation limit but rather finally grow logarithmically; they are sometimes more appropriate.
It is evident Electrical Engineering in the future is going to be, to a large extent, a matter of: (1) selecting chips off the shelf or from a catalog, (2) putting the chips together in a suitable manner to get what you want, and (3) writing the corresponding programs. Awareness of the chips and circuit boards which are currently available will be an essential part of Engineering, much as the Vacuum Tube Catalog was in the old days.
Figure 2.II

As a last observation in this area let me talk about special purpose IC chips. It is immensely ego gratifying to have special purpose chips for your special job, but there are very high costs associated with them. First, of course, is the design cost. Then there is the "trouble shooting" of the chip. If instead you find a general purpose chip, which may possibly cost a bit more, then you gain the following advantages:
1. Other users of the chip will help find the errors, or other weaknesses, if there are any.
General Chips and Ancient Computing
- General purpose chips benefit from a community of users who contribute to documentation and continuous upgrades.
- The rapid pace of technological progress means systems often become obsolete before they are fully operational.
- Relying on special purpose chips can trap a designer in an outdated architecture, whereas general chips allow for flexible software updates.
- The history of computing traces back to primitive tools like pebbles and bone markings used for tracking lunar phases.
- Ancient structures like Stonehenge demonstrate that early civilizations possessed significant astronomical and computational sophistication.
You will hardly get a system installed and working before there are significant improvements which you can adapt by mere program changes.
2. Other users will help write the manuals needed to use it.
3. Other users, including the manufacturer, will suggest upgrades of the chip, hence you can expect a steady stream of improved chips with little or no effort on your part.
4. Inventory will not be a serious problem.
5. Since, as I have repeatedly said, technical progress is going on at an increasing rate, it follows technological obsolescence will be much more rapid in the future than it is now. You will hardly get a system installed and working before there are significant improvements which you can adapt by mere program changes, if you have used general purpose chips and good programming methods rather than your special purpose chip, which will almost certainly tie you down to your first design.
Hence beware of special purpose chips, though many times they are essential!
3
History of Computers – Hardware
The history of computing probably began with primitive man using pebbles to compute the sum of two amounts. Marshack (of Harvard) found that what had been believed to be mere scratches on old bones from cave man days were in fact carefully scribed lines apparently connected with the moon's phases. The famous Stonehenge on the Salisbury plain in England had three building stages, 1900-1700, 1700-1500, and 1500-1400 B.C., and was apparently closely connected with astronomical observations, indicating considerable astronomical sophistication. Work in archeoastronomy has revealed many primitive peoples had
Evolution of Computing Tools
- Ancient civilizations like China, India, and Mexico developed sophisticated astronomical observatories long before modern technology.
- The transition from Roman numerals to Arabic numerals in the 1400s was a pivotal shift for pure computing despite initial legal resistance.
- The invention of logarithms and the slide rule marked a major advancement in analog computing, becoming the standard badge of the engineering profession.
- The development of the differential analyzer and electronic analog computers during WWII allowed for complex military calculations like missile trajectories.
- Early digital computing evolved from 'Napier's bones' to mechanical desk calculators, including a lost machine designed for Kepler in 1623.
- The history of computing is split between the analog path of physical lengths and voltages and the digital path of discrete numbers.
Slide rules in the 1930s and 1940s were standard equipment of the engineer, usually carried in a leather case fastened to the belt as a badge of one's group on the campus.
considerable knowledge about astronomical events. China, India, and Mexico were prominent in this matter, and we still have their structures which we call observatories, though we have too little understanding of how they were used. Our western plains have many traces of astronomical observatories which were used by the Indians.
The sand pan and the abacus are instruments more closely connected with computing, and the arrival of the Arabic numerals from India meant a great step forward in the area of pure computing. Great resistance to the adoption of the Arabic numerals (not in their original Arabic form) was encountered from officialdom, even to the extent of making them illegal, but in time (the 1400s) the practicalities and economic advantages triumphed over the more clumsy Roman (and earlier Greek) use of letters of the alphabet as symbols for the numbers.
The invention of logarithms by Napier (1550-1617) was the next great step. From it came the slide rule, which has the numbers on the parts as lengths proportional to the logs of the numbers, hence adding two lengths means multiplying the two numbers. This analog device, the slide rule, was another significant step forward, but in the area of analog not digital computers. I once used a very elaborate slide rule in the form of a cylinder 6-8 inches in diameter and about two feet long, with many, many suitable scales on both the outer and inner cylinders, and equipped with a magnifying glass to make the reading of the scales more accurate.
Slide rules in the 1930s and 1940s were standard equipment of the engineer, usually carried in a leather case fastened to the belt as a badge of one's group on the campus. The standard engineer's slide rule was a "10 inch loglog decitrig slide rule", meaning the scales were 10 inches long, included loglog scales, square and cubing scales, as well as numerous trigonometric scales in decimal parts of the degree. They are no longer
manufactured!
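The slide rule principle, adding lengths proportional to logarithms in order to multiply, can be sketched in a few lines. The function name below is hypothetical, and floating point stands in for the physical scales, so the result is only as exact as the arithmetic (a real slide rule gives about three significant figures).

```python
import math

def slide_rule_multiply(x, y):
    """Multiply two positive numbers by adding lengths proportional to their logs."""
    length_x = math.log10(x)   # distance of x along the fixed logarithmic scale
    length_y = math.log10(y)   # distance of y along the sliding scale
    return 10 ** (length_x + length_y)  # read the product at the combined length

print(slide_rule_multiply(2.0, 3.0))
```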
Continuing along the analog path, the next important step was the differential analyzer, which at first had mechanical integrators of the analog form. The earliest successful ones were made around 1930 by Vannevar Bush of MIT. The later RDA #2, while still analog and basically mechanical, had a great deal of electronic interconnections. I used it for some time (1947-1948) in computing Nike guided missile trajectories in the earliest design stages.
During WW-II the electronic analog computers came into military field use. They used condensers as integrators in place of the earlier mechanical wheels and balls (hence they could only integrate with respect to time). They meant a large, practical step forward, and I used one such machine at Bell Telephone Laboratories for many years. It was constructed from parts of some old M9 gun directors. Indeed, we used parts of some later condemned M9s to build a second computer to be used either independently or with the first one to expand its capacity to do larger problems.
Returning to digital computing, Napier also designed "Napier's bones", which were typically ivory rods with numbers which enabled one to multiply numbers easily; these are digital and not to be confused with the analog slide rule.
From the Napier bones probably came the more modern desk calculators. Schickard wrote (Dec. 20, 1623) to Kepler (of astronomical fame) that a fire in his lab burned up the machine he was building for Kepler. An examination of his records and sketches indicates it would do the four basic operations of arithmetic, provided you have some charity as to just what multiplication and division are in such a machine. Pascal
Evolution of Mechanical Computing
- Early computing history traces from Pascal's tax-assessing adder to Leibniz's unreliable machines for multiplication and division.
- Charles Babbage designed the difference engine for error-free table printing and the analytical engine, which prefigured modern computer architecture.
- The transition to practical desk calculators like the Comptometer and Friden led to the formation of human computing groups in major laboratories.
- Herman Hollerith revolutionized data processing by introducing punched cards to solve the logistical crisis of the 1890 US Census.
- Mechanical IBM 601 punches were eventually utilized at Los Alamos to perform the complex calculations required for the first atomic bombs.
Babbage insisted the printing be done by the machine to prevent any human errors creeping in.
(1623-1662), who was born that same year, is often credited with the invention of the desk computer, but his would only add and subtract; only those operations were needed to aid his tax assessing father. Leibnitz (of calculus fame) also tinkered with computers and included multiplication and division, though his machines were not reliable.
Babbage (1791-1871) is the next great name in the digital field, and he is often considered to be the father of modern computing. His first design was the difference engine, based on the simple idea that a polynomial can be evaluated at successive, equally spaced, values by using only a sequence of additions and subtractions, and since locally most functions can be represented by a suitable polynomial this could provide "machine made tables" (Babbage insisted the printing be done by the machine to prevent any human errors creeping in). The English Government gave him financial support, but he never completed one. A Swedish father and son (Scheutz) did make several which worked, and Babbage congratulated them on their success. One of their machines was sold to the Albany observatory, New York, and was used
to make some astronomical tables.
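The idea behind the difference engine can be illustrated with a short sketch: once the leading finite differences are set up, each new table entry needs only additions. The function names and the sample polynomial are illustrative assumptions; seeding the registers here calls the polynomial directly, which on the real machine would have been done by hand.

```python
def initial_differences(values):
    """Leading entries of the forward-difference table built from seed values."""
    diffs, row = [], list(values)
    while row:
        diffs.append(row[0])
        row = [b - a for a, b in zip(row, row[1:])]
    return diffs

def tabulate(poly, degree, count, start=0, step=1):
    """Tabulate poly at equally spaced points using only additions after seeding."""
    seeds = [poly(start + i * step) for i in range(degree + 1)]
    diffs = initial_differences(seeds)
    table = []
    for _ in range(count):
        table.append(diffs[0])
        # Each register absorbs the one below it, as the engine's columns would.
        for i in range(degree):
            diffs[i] += diffs[i + 1]
    return table

p = lambda x: x * x + x + 41   # illustrative polynomial
print(tabulate(p, degree=2, count=6))  # [41, 43, 47, 53, 61, 71]
```

For a polynomial of degree n the n-th difference is constant, which is why n + 1 registers suffice.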
As has happened so often in the field of computing, Babbage had not finished with the difference engine before he conceived of the much more powerful analytical engine, which is not far from the current von Neumann design of a computer. He never got it to work; a group in England constructed (1992) a machine from his working drawings and successfully operated it as he had designed it to work!
The next major practical stage was the Comptometer, which was merely an adding device; but repeated additions, along with shifting, are equivalent to multiplication, and it was very widely used for many, many
years.
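Multiplication by repeated addition with shifting can be sketched as follows. This is a minimal illustration for non-negative integers with a hypothetical function name, not a model of any particular machine's mechanism.

```python
def multiply_by_repeated_addition(multiplicand, multiplier):
    """Decimal shift-and-add multiplication for non-negative integers."""
    accumulator = 0
    additions = 0
    shifted = multiplicand              # multiplicand at the current decimal position
    while multiplier > 0:
        digit = multiplier % 10         # current multiplier digit
        for _ in range(digit):          # add once per unit of the digit
            accumulator += shifted
            additions += 1
        shifted *= 10                   # shift one decimal place
        multiplier //= 10
    return accumulator, additions

print(multiply_by_repeated_addition(347, 26))  # (9022, 8)
```

Shifting keeps the work proportional to the sum of the multiplier's digits (8 additions here) rather than to the multiplier itself (26).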
From this came a sequence of more modern desk calculators, the Millionaire, then the Marchant, the Friden, and the Monroe. At first they were hand controlled and hand powered, but gradually some of the control was built in, mainly by mechanical levers. Beginning around 1937 they gradually acquired electric motors to do much of the power part of the computing. Before 1944 at least one had the operation of square root incorporated into the machine (still mechanical levers intricately organized). Such hand machines were the basis of computing groups of people running them to provide computing power. For example, when I came to the Bell Telephone Laboratories in 1946 there were four such groups in the Labs, typically about six to ten girls in a group: a small group in the Mathematics department, a larger one in the network department, one in switching, and one in quality control.
Punched card computing began because one far seeing person saw that the Federal census, which by law must be done every 10 years, was taking so much time the next one (1890) would not be done before the following one started unless they turned to machine methods. Hollerith took on the job and constructed the first punched card machines, and with succeeding censuses he built more powerful machines to keep up with both the increased population and the increased number of questions asked on the census. In 1928 IBM began to use cards with rectangular holes so electric brushes could easily detect the presence or absence of a hole on a card at a given place. Powers, who also left the census group, kept the card form with round holes which were designed to be detected by mechanical rods as "fingers".
Around 1935 IBM built the 601 mechanical punch, which did multiplications and could include two additions to the product at the same time. It became one of the mainstays of computing; there were about 1500 of them on rental and they averaged perhaps a multiplication per 2 or 3 seconds. These, along with some special triple product and division machines, were used at Los Alamos to compute the designs for the first atomic bombs.
The Dawn of Electronic Computing
- Early relay computers by Stibitz, Zuse, and Aiken introduced concepts like remote terminals, time-sharing, and multiprocessing.
- The ENIAC, delivered in 1946, marked the start of the electronic age despite its massive size and cumbersome plug-board wiring.
- Mauchly and Eckert's 1946 course on computer design catalyzed the creation of various machines, including the EDSAC and the Maniac series.
- John von Neumann is often credited with the concept of internal programming due to his report on the EDVAC project.
- Early experts drastically underestimated the market for computers, believing 18 machines would saturate the entire demand.
- The failure to predict the computer revolution stemmed from an inability to imagine entirely new applications beyond current tasks.
I well recall a group of us, after a session on the IBM 701 at a meeting where they talked about the proposed 18 machines, all believed this would saturate the market for many years!
In the mechanical, meaning relay, area George Stibitz built (1939) the complex number computer and exhibited it at Dartmouth (1940) when the main frame was in New York, thus an early remote terminal machine; and since it normally had three input stations in different locations in the Labs it was, if you are kind, a "time shared machine".
Konrad Zuse in Germany, and Howard Aiken at Harvard, like Stibitz, each produced a series of relay computers of increasing complexity. Stibitz's Model 5 had two computers in the same machine and could share a job when necessary, a multiprocessor machine if you wish. Of the three men probably Zuse was the greatest, considering both the difficulties he had to contend with and his later contributions to the software side of computing.
It is usually claimed the electronic computer age began with the ENIAC, built for the U.S. Army and delivered in 1946. It had about 18,000 vacuum tubes, was physically huge, and as originally designed it was wired much like the IBM plug boards, but its interconnections to describe any particular problem ran around the entire machine room! So long as it was used, as it was originally intended, to compute ballistic trajectories, this defect was not serious. Ultimately, like the later IBM CPC, it was cleverly rearranged by the users to act as if it were programmed from instructions (numbers on the ballistic tables) rather than from wiring the interconnections.
Mauchly and Eckert, who built the ENIAC, found, just as Babbage had, that before the completion of their first machine they already envisioned a larger, internally programmed, machine, the EDVAC. Von Neumann, as a consultant to the project, wrote up the report, and as a consequence internal programming is often credited to him, though so far as I know he never either claimed or denied that attribution. In the summer of 1946, Mauchly and Eckert gave a course, open to all, on how to design and build electronic computers, and as a result many of the attendees went off to build their own; Wilkes, of Cambridge, England, was the first to get one running usefully, the EDSAC.
At first each machine was a one-of-a-kind, though many were copied from (but often completed before) the Institute for Advanced Studies machine under von Neumann's direction, because the engineering of that machine was apparently held up. As a result, many of the so-called copies, like the Maniac-I (1952), which was named to get rid of the idiotic naming of machines and built under the direction of N.C. Metropolis, were finished before the Institute machine. It, and the Maniac-II (1955), were built at Los Alamos, while the Maniac-III (1959) was built at the University of Chicago. The Federal government, especially through the military, supported most of the early machines, and great credit is due to them for helping start the Computer Revolution.
The first commercial production of electronic computers was under Mauchly and Eckert again, and since the company they formed was merged with another, their machines were finally called UNIVACS. Especially noted was the one for the Census Bureau. IBM came in a bit late with 18 (20 if you count secret cryptographic users) IBM 701s. I well recall a group of us, after a session on the IBM 701 at a meeting where they talked about the proposed 18 machines, all believed this would saturate the market for many years! Our error was simply that we thought only of the kinds of things we were currently doing, and did not think in the directions of entirely new applications of machines. The best experts at the time were flatly wrong! And not by a small amount either! Nor for the last time!
Let me turn to some comparisons:
Hand calculators          1/20 ops. per sec.
Relay machines            1 op. per sec. typically
Magnetic drum machines    15-1000 ops. per sec., depending somewhat on fixed or floating point
The Scale of Computing Speed
- The author contrasts the exponential growth of computer speeds from the 701 model to 1990s machines, predicting another hundredfold increase.
- To humanize these speeds, the author notes that a modern machine performs more operations in three seconds than there are seconds in a human lifetime.
- Physical constraints like the speed of light dictate that high-speed components must be placed extremely close together to avoid signal lag.
- At the femtosecond scale, light only travels across approximately 300 atoms, necessitating microscopic hardware architecture.
- Heat dissipation remains a critical barrier, as increasing component density and state-change frequency threaten to melt the hardware.
- The transition to lower voltages is a necessary strategy to compensate for the thermal energy generated by dense, high-speed circuits.
Thus in 3 seconds a machine doing 10^9 floating point operations per second (flops) will do more operations than there are seconds in your whole lifetime, and almost certainly get them all correct!
701 type                  1000 ops. per sec.
Current (1990)            10^9 ops. per sec. (around the fastest of the von Neumann type)
The changes in speed, and corresponding storage capacities, that I have had to live through should give you
some idea as to what you will have to endure in your careers. Even for von Neumann type machines there is
probably another factor of speed of around 100 before reaching the saturation speed.
Since such numbers are actually beyond most human experience I need to introduce a human dimension to the speeds you will hear about. First, notation (the parentheses contain the standard symbol):
milli (m)   10^-3     kilo (K)   10^3
micro (µ)   10^-6     mega (M)   10^6
nano (n)    10^-9     giga (G)   10^9
pico (p)    10^-12    tera (T)   10^12
femto (f)   10^-15
atto (a)    10^-18
Now to the human dimensions. In one day there are 60×60×24 = 86,400 seconds. In one year there are close to 3.15×10^7 seconds, and in 100 years, probably greater than your lifetime, there are about 3.15×10^9 seconds. Thus in 3 seconds a machine doing 10^9 floating point operations per second (flops) will do
more operations than there are seconds in your whole lifetime, and almost certainly get them all correct!
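The arithmetic is easy to reproduce. The sketch below assumes a 365.25 day year (an assumption, since the text only says "close to 3.15×10^7 seconds") and asks how long a lifetime would have to be, in seconds, to match 3 seconds of work at 10^9 operations per second.

```python
SECONDS_PER_DAY = 60 * 60 * 24
SECONDS_PER_YEAR = SECONDS_PER_DAY * 365.25   # assumed year length
SECONDS_PER_CENTURY = SECONDS_PER_YEAR * 100

flops = 1e9                     # operations per second, as in the text
ops_in_3_seconds = 3 * flops
years_equivalent = ops_in_3_seconds / SECONDS_PER_YEAR

print(SECONDS_PER_DAY)                    # 86400
print(f"{SECONDS_PER_YEAR:.3e}")          # about 3.156e+07
print(f"{SECONDS_PER_CENTURY:.3e}")       # about 3.156e+09
print(f"{years_equivalent:.1f} years")    # about 95 years
```

So the comparison holds for any lifetime shorter than roughly 95 years.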
For another approach to human dimensions, the velocity of light in a vacuum is about 3×10^10 cm/sec (along a wire it is about 7/10 as fast). Thus in a nanosecond light goes 30 cm, about one foot. At a picosecond the distance is, of course, about 1/100 of an inch. These represent the distances a signal can go (at best) in an IC. Thus at some of the pulse rates we now use the parts must be very close to each other, close in human dimensions, or else much of the potential speed will be lost in going between parts. Also we can no longer use lumped circuit analysis.
How about natural dimensions of length instead of human dimensions? Well, atoms come in various sizes running generally around 1 to 3 angstroms (an angstrom is 10^-8 cm) and in a crystal are spaced around 10 angstroms apart, typically, though there are exceptions. In 1 femtosecond light can go across about 300
atoms. Therefore the parts in a very fast computer must be small and very close together!
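These distances follow directly from the rounded figures quoted above. The sketch below uses those figures (3×10^10 cm/sec in vacuum, 10 angstroms between atoms); the constant and function names are illustrative.

```python
C_CM_PER_S = 3e10            # speed of light in vacuum, as rounded in the text
ANGSTROM_CM = 1e-8
ATOM_SPACING_ANGSTROM = 10   # typical crystal spacing quoted in the text

def light_distance_cm(seconds):
    """Distance light travels in vacuum in the given time, in centimeters."""
    return C_CM_PER_S * seconds

def atoms_crossed(seconds):
    """Number of atomic spacings light crosses in the given time."""
    return light_distance_cm(seconds) / (ATOM_SPACING_ANGSTROM * ANGSTROM_CM)

for name, t in [("nanosecond", 1e-9), ("picosecond", 1e-12), ("femtosecond", 1e-15)]:
    d = light_distance_cm(t)
    print(f"{name}: {d:.3g} cm = {d / 2.54:.3g} in, about {atoms_crossed(t):.3g} atoms")
```

Along a wire the distances would be about 7/10 of these values.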
If you think of a transistor using impurities, and the impurities run around 1 in a million typically, then you would probably not believe a transistor with 1 impure atom, but maybe, if you lower the temperature to reduce background noise, 1000 impurities is within your imagination, thus making the solid state device at least around 1000 atoms on a side. With interconnections at times running at least 10 device distances you see why you feel getting below 100,000 atoms distance between some interconnected devices is really pushing things (3 picoseconds).
Then there is heat dissipation. While there has been talk of thermodynamically reversible computers, so far it has only been talk and published papers, and heat still matters. The more parts per unit area, and the faster the rate of state change, the more the heat generated in a small area which must be gotten rid of before things melt. To partially compensate we have been going to lower and lower voltages, and are now going to 2
The Saturation of Single Processors
- Engineers are exploring diamond and other crystal structures to manage heat conduction as integrated circuits reach physical limits.
- Computer architecture is shifting toward parallel processing, pipelines, and cache memories to bypass the speed limitations of single arithmetic units.
- The growth of single-processor computer speed follows an 'S' curve, moving from rapid linear growth toward an inevitable saturation point.
- A lack of a standard parallel architecture leads to fragmented efforts and competing designs with varying strategies for speed.
- The author reflects on the exponential growth of computing demand at Bell Labs, which doubled every 15 to 20 months for years.
- A shift in perspective suggests that the true value of computing lies in generating insight rather than merely increasing the volume of numerical operations.
The purpose of computing is insight, not numbers.
or 3 volts operating the IC. The possibility of giving the base of the chip a diamond layer is currently being examined, since diamond is a very good heat conductor, much better than copper. There is now a reasonable possibility for a similar crystal structure, possibly less expensive than diamond, with very good heat conduction properties.
To speed up computers we have gone to 2, to 4, and even more arithmetic units in the same computer, and have also devised pipelines and cache memories. These are all small steps towards highly parallel computers.
Thus you see the handwriting on the wall for the single processor machine: we are approaching saturation. Hence the fascination with highly parallel machines. Unfortunately there is as yet no single general structure for them, but rather many, many competing designs, all generally requiring different strategies to exploit their potential speeds and having different advantages and disadvantages. It is not likely a single design will emerge for a standard parallel computer architecture, hence there will be trouble and dissipation in efforts to pursue the various promising directions.
From a chart drawn up long ago by Los Alamos (LANL) using the data of the fastest current computer on the market at a given time they found the equation for the number of operations per second was

[equation missing in source]

and it fitted the data fairly well. Here time begins at 1943. In 1987 the extrapolated value predicted (by about 20 years!) was about 3×10^8 and was on target. The limiting asymptote is 3.576×10^9 for the von Neumann type computer with a single processor.
Here, in the history of the growth of computers, you see a realization of the "S" type growth curve; the very slow start, the rapid rise, the long stretch of almost linear growth in the rate, and then the facing of the inevitable saturation.
Again, to reduce things to human size. When I first got digital computing really going inside Bell Telephone Laboratories I began by renting computers outside for so many hours the head of the Mathematics department figured out for himself it would be cheaper to get me one inside, a deliberate plot on my part to avoid arguing with him as I thought it useless and would only produce more resistance on his part to digital computers. Once a boss says "no!" it is very hard to get a different decision, so don't let them say "No!" to a proposal. I found in my early years I was doubling the number of computations per year about every 15 months. Some years later I was reduced to doubling the amount about every 18 months. The department head kept telling me I could not go on at that rate forever, and my polite reply was always, "You are right, of course, but you just watch me double the amount of computing every 18-20 months!" Because the machines available kept up the corresponding rate, I and my successors were able for many years to double the amount of computing done. We lived on the almost straight line part of the "S" curve all those years.
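The doubling times quoted above can be turned into yearly and ten-year growth factors with a little arithmetic. The following is only an illustrative sketch: the function names are hypothetical, and the ten-year horizon is an assumption chosen for the example, not a figure from the text.

```python
def annual_growth_factor(doubling_months: float) -> float:
    """Factor by which the amount of computing grows in 12 months."""
    return 2.0 ** (12.0 / doubling_months)


def growth_over_years(doubling_months: float, years: float) -> float:
    """Total growth factor after `years` at a fixed doubling time."""
    return 2.0 ** (12.0 * years / doubling_months)


if __name__ == "__main__":
    # Doubling times mentioned in the text: 15, 18, and 20 months.
    for months in (15, 18, 20):
        per_year = annual_growth_factor(months)
        decade = growth_over_years(months, 10)  # 10 years is an assumed horizon
        print(f"doubling every {months} months: "
              f"x{per_year:.2f} per year, x{decade:.0f} per decade")
```

A 15-month doubling time compounds to a factor of 256 in ten years; at 20 months the factor is 64. Either way the growth is far from something a department could absorb without the machines themselves getting faster.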
However, let me observe in all honesty to the Department Head, it was remarks by him which made me realize it was not the number of operations done that mattered, it was, as it were, the number of micro-Nobel prizes I computed that mattered. Thus the motto of a book I published in 1961:
The purpose of computing is insight, not numbers.
A good friend of mine revised it to:
The Machine View
- Computers are fundamentally constructed from binary devices, including two-state storage units and gates that either block or pass signals.
- The basic machine cycle consists of fetching an instruction from a specific address, decoding it, executing it, and incrementing the address register.
- At the hardware level, a computer possesses no global knowledge or inherent meaning; it simply reacts to bits according to other bits.
- The author draws a parallel between the mindless operation of computer gates and the Democritean view of humans as merely atoms and void.
- Adopting a strictly mechanical view of the computer is essential for debugging, as it requires assuming the machine has no free will or self-awareness.
We see the machine does not know where it has been, nor where it is going to go; it has at best only a myopic view of simply repeating the same cycle endlessly.
The purpose of computing numbers is not yet in sight.
It is necessary now to turn to some of the details of how for many years computers were constructed. The smallest parts we will examine are two state devices for storing bits of information, and gates which either let a signal go through or block it. Both are binary devices, and in the current state of knowledge they provide the easiest, fastest methods of computing we know.
From such parts we construct combinations which enable us to store longer arrays of bits; these arrays are often called number registers. The logical control is just a combination of storage units including gates. We build an adder out of such devices, as well as every larger unit of a computer.
Going to the still larger units we have the machine consisting of: (1) a storage device, (2) a central control, (3) an ALU, meaning Arithmetic and Logic Unit. There is in the central control a single register which we will call the Current Address Register (CAR). It holds the address of where the next instruction is to be found, Figure 3.I.
The cycle of the computer is:
1. Get the address of the next instruction from the CAR.
2. Go to that address in storage and get that instruction.
3. Decode and obey that instruction.
4. Add 1 to the CAR address, and start in again.
We see the machine does not know where it has been, nor where it is going to go; it has at best only a myopic view of simply repeating the same cycle endlessly. Below this level the individual gates and two-way storage devices do not know any meaning; they simply react to what they are supposed to do. They too have no global knowledge of what is going on, nor any meaning to attach to any bit, whether storage or gating.
There are some instructions which, depending on some state of the machine, put the address of their instruction into the CAR (and 1 is not added in such cases), and then the machine, in starting its cycle, simply finds an address which is not the immediate successor in storage of the previous instruction, but the location inserted into the CAR.
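The four-step cycle, together with the transfer instructions just described, can be sketched as a toy simulator. This is a minimal illustration, not a model of any particular machine: the single accumulator, the operation names (LOAD, ADD, STORE, JUMP, JUMP_IF_NEG, HALT), and the sample program are all assumptions made for the example.

```python
def run(storage, max_steps=10_000):
    """Run a toy stored-program machine and return the final accumulator."""
    car = 0  # Current Address Register
    acc = 0  # single accumulator (an assumption of this sketch)
    for _ in range(max_steps):
        op, addr = storage[car]          # steps 1-2: consult CAR, fetch instruction
        if op == "HALT":                 # step 3: decode and obey
            return acc
        if op == "JUMP" or (op == "JUMP_IF_NEG" and acc < 0):
            car = addr                   # transfer: address goes into the CAR
            continue                     # and 1 is NOT added
        if op == "LOAD":
            acc = storage[addr]
        elif op == "ADD":
            acc += storage[addr]
        elif op == "STORE":
            storage[addr] = acc
        elif op != "JUMP_IF_NEG":        # an untaken conditional jump just falls through
            raise ValueError(f"unknown operation {op!r} at address {car}")
        car += 1                         # step 4: add 1 to the CAR, start again
    raise RuntimeError("step limit reached")


# Instructions and data share one storage; the machine attaches no meaning to either.
PROGRAM = [
    ("LOAD", 10), ("ADD", 11), ("STORE", 10),   # count = count - 1
    ("JUMP_IF_NEG", 8),                         # leave the loop when count < 0
    ("LOAD", 12), ("ADD", 13), ("STORE", 12),   # total = total + step
    ("JUMP", 0),
    ("LOAD", 12), ("HALT", 0),
    3, -1, 0, 5,                                # count, minus one, total, step
]

if __name__ == "__main__":
    print(run(list(PROGRAM)))
```

Nothing in `run` knows that the program adds 5 three times; it only fetches, decodes, obeys, and increments, which is the point of the machine view.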
I am reviewing this so you will be clear the machine processes bits of information according to other bits, and as far as the machine is concerned there is no meaning to anything which happens; it is we who attach meaning to the bits. The machine is a "machine" in the classical sense; it does what it does and nothing else (unless it malfunctions). There are, of course, real time interrupts, and other ways new bits get into the machine, but to the machine they are only bits.
Figure 3.I
But before we leave the topic, recall in ancient Greece Democritus (460?-362?) observed: "All is atoms and void". He thus expressed the view of many physicists today, the world, including you and me, is made of molecules, and we exist in a radiant energy field. There is nothing more! Are we machines? Many of you do not wish to settle for this, but feel there is more to you than just a lot of molecules banging against one another mindlessly, which we see is one view of a computer. We will examine this point in Chapters 6-8 under the title of Artificial Intelligence (AI).
There is value in the machine view of a computer, that it is just a collection of storage devices and gates processing bits, and nothing more. This view is useful, at times, when debugging (finding errors) in a program; indeed it is what you must assume when you try to debug. You assume the machine obeys the instructions one at a time, and does nothing more; it has no "free will" or any of the other attributes such as the self-awareness and self-consciousness we often associate with humans.
How different are we in practice from the machines? We would all like to think we are different from machines, but are we essentially? It is a touchy point for most people, and the emotional and religious aspects tend to dominate most arguments. We will return to this point in Chapters 6-8 on AI when we have
Evolution of Computer Software
- The text transitions from discussing hardware history to the foundational developments in software.
- It emphasizes that understanding software requires a baseline knowledge of the hardware it runs on.
- The section serves as a structural bridge between physical machine components and logical instructions.
- It highlights the historical progression of programming and operating systems.
- The narrative focuses on how software evolved to manage increasingly complex hardware architectures.
History of Computers - Software
more background to discuss it reasonably.
4
History of Computers - Software
Evolution of Computer Control
- Early computing relied on manual control and physical plug boards to direct data flow and operations.
- The transition to relay machines introduced punched paper tapes, which were physically difficult to manage and prone to mechanical errors.
- Internal programming emerged as storage became available, though its true origin is debated between von Neumann and the Mauchly-Eckert team.
- Early programmers faced the immense complexity of 'minimum latency coding,' manually calculating data placement to sync with rotating storage hardware.
- The development of the SOAP program marked a milestone in self-optimization, where a program could be used to improve its own efficiency.
- Initial coding was performed in absolute binary, requiring programmers to write every instruction and memory address in raw machine code.
Paper tapes are a curse when doing one-shot problems: they are messy, and gluing them to make corrections, as well as loops, is troublesome (because, among other things, the glue tends to get into the reading fingers of the machine!).
As I indicated in the last chapter, in the early days of computing the control part was all done by hand. The
slow desk computers were at first controlled by hand, for example multiplication was done by repeated additions, with column shifting after each digit of the multiplier. Division was similarly done by repeated
subtractions. In time electric motors were applied both for power and later for more automatic control over
multiplication and division. The punch card machines were controlled by plug board wiring to tell the machine where to find the information, what to do with it, and where to put the answers on the cards (or on
the printed sheet of a tabulator), but some of the control might also come from the cards themselves,
typically X and Y punches (other digits could, at times, control what happened). A plug board was specially
wired for each job to be done, and in an accounting office the wired boards were usually saved and used
again each week, or month, as they were needed in the cycle of accounting.
When we came to the relay machines, after Stibitz's first Complex Number Computer, they were mainly controlled by punched paper tapes. Paper tapes are a curse when doing one-shot problems: they are messy, and gluing them to make corrections, as well as loops, is troublesome (because, among other things, the glue tends to get into the reading fingers of the machine!). With very little internal storage in the early days the programs could not be economically stored in the machines (though I am inclined to believe the designers considered it).
The ENIAC was at first (1945-1946) controlled by wiring as if it were a gigantic plugboard, but in time Nick Metropolis and Dick Clippinger converted it to a machine that was programmed from the ballistic tables, which were huge racks of dials into which decimal digits of the program could be set via the knobs of the decimal switches.
Internal programming became a reality when storage was reasonably available, and, while it is commonly attributed to von Neumann, he was only a consultant to Mauchly and Eckert and their team. According to Harry Huskey internal programming was frequently discussed by them before von Neumann began the consulting. The first at all widely available discussion (after Lady Lovelace wrote and published a few programs for the proposed Babbage analytical engine) was the von Neumann Army reports, which were widely circulated but never published in any refereed place.
The early codes were one address mainly, meaning each instruction contained an instruction part and the address where the number was to be found or sent to. We also had two address codes, typically for rotating drum machines, so the next instruction would be immediately available once the previous one was completed; the same applied to mercury delay lines, and other storage devices which were serially available. Such coding was called minimum latency coding, and you can imagine the trouble the programmer had in computing where to put the next instruction and numbers (to avoid delays and conflicts as best possible), let alone in locating programming errors (bugs). In time a program named SOAP (symbolic optimizing assembly program) was available to do this optimizing using the IBM 650 machine itself. There were also three and four address codes, but I will ignore them here.
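The placement problem behind minimum latency coding can be sketched with a very simplified drum model. Everything here is an assumption made for illustration: one track of 50 word positions, one word passing the head per word-time, and a fixed execution time per instruction. It is not SOAP's actual algorithm, and the function names are hypothetical.

```python
DRUM_WORDS = 50  # word positions per revolution on one track (assumed)


def ready_position(current_addr: int, exec_time: int) -> int:
    """Drum position under the head when the current instruction finishes.

    Reading the instruction takes one word-time, then it runs for
    `exec_time` word-times while the drum keeps turning.
    """
    return (current_addr + 1 + exec_time) % DRUM_WORDS


def wait_time(current_addr: int, exec_time: int, next_addr: int) -> int:
    """Word-times lost waiting for `next_addr` to rotate under the head."""
    return (next_addr - ready_position(current_addr, exec_time)) % DRUM_WORDS


def best_next_address(current_addr: int, exec_time: int, occupied: set) -> int:
    """First free position at or after the moment execution finishes."""
    start = ready_position(current_addr, exec_time)
    for delay in range(DRUM_WORDS):
        candidate = (start + delay) % DRUM_WORDS
        if candidate not in occupied:
            return candidate
    raise RuntimeError("drum track is full")


if __name__ == "__main__":
    # Placing the next instruction in the "obvious" following address
    # costs almost a full revolution in this model.
    print(wait_time(10, 4, 11))
    print(best_next_address(10, 4, occupied={15, 16}))
```

In this model the naive choice of the next sequential address is nearly the worst one, which is why each instruction in a two address code carried the address of its successor.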
An interesting story about SOAP is a copy of the program, call it program A, was both loaded into the machine as a program, and processed as data. The output of this was program B. Then B was loaded into the 650 and A was run as data to produce a new B program. The difference between the two running times to produce program B indicated how much the optimization of the SOAP program (by SOAP itself) produced. An early example of self-compiling as it were.
In the beginning we programmed in absolute binary, meaning we wrote the actual address where things were in binary, and wrote the instruction part also in binary!
The Rise of Symbolic Programming
- Early programmers used octal and hexadecimal systems to manage binary code, requiring them to memorize complex addition and multiplication tables.
- Correcting errors in absolute binary code led to a 'can of spaghetti' structure because inserting instructions required manually updating every address in the program.
- The development of relocatable programs and mathematical libraries allowed for the first instances of reusable software, moving away from fixed storage locations.
- The introduction of symbolic names (like ADD) and symbolic addresses was initially met with fierce resistance from 'heroic' programmers who viewed it as a waste of machine capacity.
- Despite the clear efficiency of Symbolic Assembly Programs (SAP), many veteran programmers dismissed them as 'sissy stuff' and preferred the labor-intensive absolute method.
- The transition to modern programming was delayed for years by a culture that valued manual control over the logical benefits of automation and abstraction.
As a result the control path of the program through storage soon took on the appearance of a can of spaghetti.
There were two trends to escape this, octal, where you simply group the binary digits in sets of three, and hexadecimal where you take four digits at a time, and had to use A, B, C, D, E, F for the representation of other numbers beyond 9 (and you had, of course, to learn the multiplication and addition tables to 15).
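The grouping itself is mechanical, as the following short sketch shows. The function name `group_bits` is hypothetical, and the sample word is just an arbitrary 8-bit pattern.

```python
def group_bits(bits: str, size: int) -> str:
    """Rewrite a binary string by grouping its digits `size` at a time.

    size=3 gives octal, size=4 gives hexadecimal. The string is padded
    with zeros on the left so the grouping starts from the right-hand end.
    """
    digits = "0123456789ABCDEF"
    padded_length = -(-len(bits) // size) * size   # round up to a multiple of size
    padded = bits.zfill(padded_length)
    groups = [padded[i:i + size] for i in range(0, len(padded), size)]
    return "".join(digits[int(group, 2)] for group in groups)


if __name__ == "__main__":
    word = "01100101"
    print(group_bits(word, 3))  # groups 001 100 101
    print(group_bits(word, 4))  # groups 0110 0101
```

The same eight bits read as 145 in octal and 65 in hexadecimal; either is easier to copy by hand without error than the raw binary.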
If, in fixing up an error, you wanted to insert some omitted instructions then you took the immediately preceding instruction and replaced it by a transfer to some empty space. There you put in the instruction you just wrote over, added the instructions you wanted to insert, and then followed by a transfer back to the main program. Thus the program soon became a sequence of jumps of the control to strange places. When, as almost always happens, there were errors in the corrections you then used the same trick again, using some other available space. As a result the control path of the program through storage soon took on the appearance of a can of spaghetti. Why not simply insert them in the run of instructions? Because then you would have to go over the entire program and change all the addresses which referred to any of the moved instructions! Anything but that!
We very soon got the idea of reusable software, as it is now called. Indeed Babbage had the idea. We wrote mathematical libraries to reuse blocks of code. But an absolute address library meant each time the library routine was used it had to occupy the same locations in storage. When the complete library became too large we had to go to relocatable programs. The necessary programming tricks were in the von Neumann reports, which were never formally published.
The first published book devoted to programming was by Wilkes, Wheeler, and Gill, and applied to the Cambridge, England EDSAC (1951). I, among others, learned a lot from it, as you will hear in a few minutes.
Someone got the idea a short piece of program could be written which would read in the symbolic names of the operations (like ADD) and translate them at input time to the binary representations used inside the machine (say 01100101). This was soon followed by the idea of using symbolic addresses, a real heresy for the old time programmers. You do not now see much of the old heroic absolute programming (unless you fool with a hand held programmable computer and try to get it to do more than the designer and builder ever intended).
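A minimal sketch of such a translator follows. It is not any historical assembler: the mnemonics, the numeric operation codes, the `DATA` pseudo-operation, and the word layout (operation code times 1000 plus address) are all assumptions chosen to keep the example short.

```python
# Hypothetical operation codes; a real machine defines its own.
OPCODES = {"LOAD": 1, "ADD": 2, "STORE": 3, "JUMP": 4, "HALT": 9}


def assemble(source: str) -> list:
    """Translate symbolic lines into numeric words.

    Pass 1 records the address of every label (symbolic address).
    Pass 2 replaces operation names and labels with numbers.
    """
    symbols, parsed = {}, []
    for raw in source.splitlines():
        text = raw.split("#", 1)[0].strip()      # drop comments and blanks
        if not text:
            continue
        if ":" in text:
            label, text = text.split(":", 1)
            symbols[label.strip()] = len(parsed)  # address = position in program
        parsed.append(text.split())

    words = []
    for fields in parsed:
        mnemonic = fields[0]
        operand = fields[1] if len(fields) > 1 else "0"
        value = symbols[operand] if operand in symbols else int(operand)
        if mnemonic == "DATA":                    # a storage word holding a constant
            words.append(value)
        else:
            words.append(OPCODES[mnemonic] * 1000 + value)
    return words


SOURCE = """
start:  LOAD  x
        ADD   y
        STORE sum
        HALT
x:      DATA  2
y:      DATA  3
sum:    DATA  0
"""

if __name__ == "__main__":
    print(assemble(SOURCE))
```

Inserting a line into `SOURCE` shifts every later address, but the translator recomputes them all on the next run, which is exactly the labor the absolute programmers were doing by hand or avoiding with jumps to empty space.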
I once spent a full year, with the help of a lady programmer from Bell Telephone Laboratories, on one big problem coding in absolute binary for the IBM 701, which used all the 32K registers then available. After that experience I vowed never again would I ask anyone to do such labor. Having heard about a symbolic system from Poughkeepsie, IBM, I asked her to send for it and to use it on the next problem, which she did. As I expected, she reported it was much easier. So we told everyone about the new method, meaning about 100 people, who were also eating at the IBM cafeteria near where the machine was. About half were IBM people and half were, like us, outsiders renting time. To my knowledge only one person (yes, only one) of all the 100 showed any interest!
Finally, a more complete, and more useful, Symbolic Assembly Program (SAP) was devised, after more years than you are apt to believe, during which most programmers continued their heroic absolute binary programming. At the time SAP first appeared I would guess about 1% of the older programmers were interested in it; using SAP was "sissy stuff", and a real programmer would not stoop to wasting machine capacity to do the assembly. Yes! Programmers wanted no part of it, though when pressed they had to admit their old methods used more machine time in locating and fixing up errors than the SAP program ever used. One of the main complaints was when using a symbolic system you do not know where anything was in
The Resistance to FORTRAN
- Early programmers resisted symbolic mapping and FORTRAN, preferring to work in absolute binary addresses despite the inefficiency.
- Professional groups, including programmers, doctors, and lawyers, often fail to apply their own expertise to their own work habits.
- Using FORTRAN allowed the author's team to produce ten times more output than peers who viewed high-level languages as being 'for sissies.'
- The evolution of software is characterized by a transition from absolute to virtual machines, buffering the user from hardware complexities.
- The success of FORTRAN was largely due to its psychological appeal, as it translated familiar mathematical formulas rather than requiring new ways of thinking.
- The development of monitor systems was necessary to stop the massive waste of expensive machine and human time during operation.
Third, even if it did work, no respectable programmer would use it; it was only for sissies!
storage, though in the early days we supplied a mapping of symbolic to actual storage, and believe it or not they later lovingly pored over such sheets rather than realize they did not need to know that information if they stuck to operating within the system. No! When correcting errors they preferred to do it in absolute binary addresses.
FORTRAN, meaning FORmula TRANslation, was proposed by Backus and friends, and again was opposed by almost all programmers. First, it was said it could not be done. Second, if it could be done, it would be too wasteful of machine time and capacity. Third, even if it did work, no respectable programmer would use it; it was only for sissies!
The use of FORTRAN, like the earlier symbolic programming, was very slow to be taken up by the professionals. And this is typical of almost all professional groups. Doctors clearly do not follow the advice they give to others, and they also have a high proportion of drug addicts. Lawyers often do not leave decent wills when they die. Almost all professionals are slow to use their own expertise for their own work. The situation is nicely summarized by the old saying, "The shoe maker's children go without shoes". Consider how in the future, when you are a great expert, you will avoid this typical error!
With FORTRAN available and running, I told my programmer to do the next problem in FORTRAN, get her errors out of it, let me test it to see it was doing the right problem, and then she could, if she wished, rewrite the inner loop in machine language to speed things up and save machine time. As a result we were able, with about the same amount of effort on our part, to produce almost 10 times as much as the others were doing. But to them programming in FORTRAN was not for real programmers!
Physically the management of the IBM 701, at IBM Headquarters in NYC where we rented time, was terrible. It was a sheer waste of machine time (at that time $300 per hour was a lot) as well as human time. As a result I refused later to order a big machine until I had figured out how to have a monitor system, which someone else finally built for our first IBM 709, and later modified for the IBM 7094.
Again, monitors, often called "the system" these days, like all the earlier steps I have mentioned, should be obvious to anyone who is involved in using the machines from day to day; but most users seem too busy to think or observe how bad things are and how much the computer could do to make things significantly easier and cheaper. To see the obvious it often takes an outsider, or else someone like me who is thoughtful and wonders what he is doing and why it is all necessary. Even when told, the old timers will persist in the ways they learned, probably out of pride for their past and an unwillingness to admit there are better ways than those they were using for so long.
One way of describing what happened in the history of software is that we were slowly going from absolute to virtual machines. First, we got rid of the actual code instructions, then the actual addresses, then in FORTRAN the necessity of learning a lot of the insides of these complicated machines and how they worked. We were buffering the user from the machine itself. Fairly early at Bell Telephone Laboratories we built some devices to make the tape units virtual, machine independent. When, and only when, you have a totally virtual machine will you have the ability to transfer software from one machine to another without almost endless trouble and errors.
FORTRAN was successful far beyond anyone's expectations because of the psychological fact it was just what its name implied: FORmula TRANslation of the things one had always done in school; it did not require learning a new set of ways of thinking.
Logic vs Psychology in Programming
- The failure of Algol demonstrates that logically perfect languages often fail because they are not 'humane' or psychologically intuitive for human users.
- Problem Oriented Languages (POLs) failed to gain dominance due to high learning costs and the inability to handle cross-disciplinary problems.
- LISP emerged almost by accident when a student realized the theoretical elements could be used to write a self-compiling system.
- The author argues that the creators of new fields, including von Neumann and Einstein, rarely understand the full implications of their inventions as well as their followers do.
- Early computing pioneers often failed to grasp the generality of tools like interpreters or the fact that computers are symbol manipulators rather than just number crunchers.
It has been said in physics no creator of any significant thing ever understood what he had done.
Algol, around 1958-1960, was backed by many worldwide computer organizations, including the ACM. It was an attempt by the theoreticians to greatly improve FORTRAN. But being logicians, they produced a logical, not a humane, psychological language and of course, as you know, it failed in the long run. It was, among other things, stated in a Boolean logical form which is not comprehensible to mere mortals (and often not even to the logicians themselves!). Many other logically designed languages which were supposed to replace the pedestrian FORTRAN have come and gone, while FORTRAN (somewhat modified to be sure) remains a widely used language, indicating clearly the power of psychologically designed languages over logically designed languages.
This was the beginning of a great hope for special languages, POLs they were called, meaning Problem Oriented Languages. There is some merit in this idea, but the great enthusiasm faded because too many problems involved more than one special field, and the languages were usually incompatible. Furthermore, in the long run, they were too costly in the learning phase for humans to master all of the various ones they might need.
In about 1962 the LISP language began. Various rumors floated around as to how actually it came about; the probable truth is something like this: John McCarthy suggested the elements of the language for theoretical purposes, the suggestion was taken up and significantly elaborated by others, and when some student observed he could write a compiler for it in LISP, using the simple trick of self-compiling, all were astounded, including, apparently, McCarthy himself. But he urged the student to try, and magically almost overnight they moved from theory to a real operating LISP compiler!
Let me digress, and discuss my experiences with the IBM 650. It was a two address drum machine, and operated in fixed decimal point. I knew from my past experiences in research floating point was necessary (von Neumann to the contrary) and I needed index registers which were not in the machine as delivered. IBM would some day supply the floating point subroutines, so they said, but that was not enough for me. I had reviewed for a Journal the EDSAC book on programming, and there in Appendix D was a peculiar program written to get a large program into a small storage. It was an interpreter. But if it was in Appendix D did they see the importance? I doubt it! Furthermore, in the second edition it was still in Appendix D, apparently unrecognized by them for what it was.
This raises, as I wished to, the ugly point of when is something understood? Yes, they wrote one, and used it, but did they understand the generality of interpreters and compilers? I believe not. Similarly, when around that time a number of us realized computers were actually symbol manipulators and not just number crunchers, we went around giving talks, and I saw people nod their heads sagely when I said it, but I also realized most of them did not understand. Of course you can say Turing's original paper (1937) clearly showed computers were symbol manipulating machines, but on carefully rereading the von Neumann reports you would not guess the authors did, though there is one combinatorial program and a sorting routine.
History tends to be charitable in this matter. It gives credit for understanding what something means when we first do it. But there is a wise saying, "Almost everyone who opens up a new field does not really understand it the way the followers do". The evidence for this is, unfortunately, all too good. It has been said in physics no creator of any significant thing ever understood what he had done. I never found Einstein on the special relativity theory as clear as some later commentators. And at least one friend of mine has said, behind my back, "Hamming doesn't seem to understand error correcting codes!" He is probably right; I do not understand what I invented as clearly as he does.
The Inventor's Limited Vision
- Creators often struggle to see the full potential of their inventions because they are blinded by the difficulties of the development process.
- The author outlines four foundational rules for language design: ease of learning, use, debugging, and subroutine integration.
- Effective programming requires a hybrid approach, alternating between top-down philosophical design and bottom-up efficiency checks.
- The IBM 650 was transformed from a two-address fixed-point machine into a three-address floating-point system for the user.
- Historical perspective suggests that even revolutionary figures like Newton are often more tied to the past than the future they help create.
Please remember, the inventor often has a very limited view of what he invented, and some others (you?) can see much more.
The reason this happens so often is the creators have to fight through so many dark difficulties, and wade through so much misunderstanding and confusion, they cannot see the light as others can, now the door is open and the path made easy. Please remember, the inventor often has a very limited view of what he invented, and some others (you?) can see much more. But also remember this when you are the author of some brilliant new thing; in time the same will probably be true of you. It has been said Newton was the last of the ancients and not the first of the moderns, though he was very significant in making our modern world.
Returning to the IBM 650 and me. I started out (1956 or so) with the following four rules for designing a
language:
1. Easy to learn.
2. Easy to use.
3. Easy to debug (find and correct errors).
4. Easy to use subroutines.
The last is something which need not bother you as in those days we made a distinction between "open" and "closed" subroutines which is hard to explain now!
You might claim I was doing top-down programming, but I immediately wrote out the details of the inner loop to check that it could be done efficiently (bottom-up programming) and only then did I resume my top-down, philosophical approach. Thus, while I believe in top-down programming as a good approach, I clearly recognize bottom-up programming is also needed at times.
I made the two address, fixed point decimal machine look like a three address floating point machine; that was my goal: A op. B=C. I used the ten decimal digits of the machine (it was a decimal machine so far as the user was concerned) in the form

A address   Op.   B address   C address
   xxx       x       xxx         xxx
The Birth of Synthetic Languages
- The author details the creation of a four-step loop on an IBM 650 to interpret a custom three-address language.
- By mapping subroutines to specific instruction numbers, the programmer defines the meaning and behavior of the synthetic language.
- This process demonstrates the practical application of Turing's Universal Turing Machine, allowing one machine to simulate any other.
- The system utilized a memory-partitioning strategy to provide 'designed-in security,' preventing user programs from overwriting the software system.
- The author critiques the tendency of programmers to design 'logical' languages like APL that are powerful but psychologically unfit for human use.
It goes on top of the machine's language, making the machine into any other machine you want.
How was it done? Easy! I wrote out in my mind the following loop, Figure 4.I. First, we needed a Current
Address Register, CAR, and so I assigned one of the 2000 computer registers of the IBM 650 to do this
duty. Then we wrote a program to do the four steps of the last chapter. (1) Use the CAR to find where to go
for the next instruction of the program you wrote (written in my language, of course). (2) Then take the
instruction apart, and store the three addresses, A, B, and C, in suitable places in the 650 storage. (3) Then
add a fixed constant to the operation (Op.) of the instruction and go to that address. There, for each
instruction, would be a subroutine which described the corresponding operation. You might think I had,
therefore, only ten possible operations, but there are only four three-address operations, addition, subtraction,
multiplication, and division, so I used the 0 instruction to mean "go to the B address and find further details
of what is wanted". Each subroutine when it was finished transferred the control to a given place in the loop.
(4) We then added 1 to the contents of the CAR register, cleaned up some details, and started in again, as
does the original machine in its own internal operation. Of course the transfer instructions, the 7 instructions
as I recall, all put an address into the CAR and transferred to a place in the loop beyond the addition of 1 to
the contents of the CAR register.
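The four-step loop can be sketched as a small Python interpreter. It is a minimal sketch, not a reconstruction of the original 650 code. Several things are assumptions made for illustration and are not stated in the text: which digit stands for which arithmetic operation, that a transfer takes its target from the C field, that extended operation 0 means "stop", and that data cells hold ordinary Python numbers.

```python
import operator

MEMORY_SIZE = 2000  # the text mentions 2000 registers on the IBM 650

# Hypothetical digit assignments; the text names the four operations
# but does not say which digit stood for which.
ARITHMETIC = {1: operator.add, 2: operator.sub,
              3: operator.mul, 4: operator.truediv}
EXTENDED = 0   # "go to the B address and find further details"
TRANSFER = 7   # "the 7 instructions, as I recall"


def run(memory, start=0, max_steps=10_000):
    """Interpret three-address words in `memory` until a stop is reached."""
    car = start                                 # Current Address Register
    for _ in range(max_steps):
        word = memory[car]                      # (1) fetch via the CAR
        a, rest = divmod(word, 10**7)           # (2) take the instruction apart
        op, rest = divmod(rest, 10**6)
        b, c = divmod(rest, 10**3)
        if op == EXTENDED:                      # (3) dispatch on the operation digit
            if b == 0:                          # assumption: extended code 0 stops
                return memory
            raise NotImplementedError(f"extended operation {b}")
        if op == TRANSFER:                      # assumption: target is the C field
            car = c                             # re-enter the loop past the +1 step
            continue
        if op not in ARITHMETIC:
            raise ValueError(f"undefined operation digit {op}")
        memory[c] = ARITHMETIC[op](memory[a], memory[b])
        car += 1                                # (4) advance the CAR, go round again
    raise RuntimeError("step limit reached without a stop instruction")


memory = [0] * MEMORY_SIZE
memory[100], memory[101] = 1.5, 2.5
memory[0] = 1001101102   # contents of 100 + contents of 101 -> 102
memory[1] = 1023102103   # contents of 102 * contents of 102 -> 103
memory[2] = 0            # extended operation 0: stop
run(memory)
print(memory[102], memory[103])   # 4.0 16.0
```

The dictionary lookup plays the role of "add a fixed constant to the operation and go to that address": the operation digit selects the routine, and the routine alone gives the digit its meaning.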
An examination of the process shows whatever meaning you want to attach to the instructions must come
from the subroutines which are written corresponding to the instruction numbers. Those subroutines define
the meaning of the language. In this simple case each instruction had its own meaning independent of any
other instruction, but it is clearly easy to make some instructions set switches, flags, or other bits so some
later instructions on consulting them will be interpreted in one of several different ways. Thus you see how
it is you can devise any language you want, provided you can uniquely define it in some definite manner. It
goes on top of the machine's language, making the machine into any other machine you want. Of course
this is exactly what Turing proved with his Universal Turing Machine, but as noted above, it was not clearly
understood until we had done it a number of times.
The software system I built was placed in the storage registers 1000 to 1999. Thus any program in the
synthetic language, having only 3 decimal digits per address, could only refer to addresses 000 to 999, and
could not refer to, and alter, any register in the software and thus ruin it: a designed-in security protection
of the software system from the user.
I have gone through this in some detail since we commonly write a language above the machine language,
and may write several more still higher languages, one on top of the other, until we get the kind of language
we want to use in expressing our problems to the machine. If you use an interpreter at each stage, then, of
course, it will be somewhat inefficient. The use of a compiler at the top will mean the highest language is
translated into one of the lower languages once and for all, though you may still want an interpreter at some
level. It also means, as in the EDSAC case, usually a great compression of programming effort and storage.
I want to point out again the difference between writing a logical and a psychological language.
Unfortunately, programmers, being logically oriented, and rarely humanly oriented, tend to write and extol
logical languages. Perhaps the supreme example of this is APL. Logically APL is a great language and to
this day it has its ardent devotees, but it is also not fit for normal humans to use. In this language there is a game
of "one liners"; one line of code is given and you are asked what it means. Even experts in the language have
been known to stumble badly on some of them.
Redundancy and Human Error
- The APL programming language lacks redundancy, meaning a single character change can fundamentally alter a program's logic.
- Human communication relies on high redundancy levels, approximately 60% for speech and 40% for writing, to ensure clarity.
- Written and spoken languages differ significantly in structure, making it notoriously difficult to write authentic-sounding dialogue.
- Low redundancy in systems leads to undetected errors because humans are inherently unreliable processors of information.
- Spoken language requires higher redundancy to overcome acoustic noise and the inability of the listener to pause or back-scan.
- English orthography and phonetics demonstrate how written language provides more visual cues for disambiguation than spoken sounds.
Almost no one can write dialog so that it sounds right, and when it sounds right it is still not the spoken language.
A change of a single letter in APL can completely alter the meaning, hence the language has almost no
redundancy. But humans are unreliable and require redundancy; our spoken language tends to be around
60% redundant, while the written language is around 40%. You probably think the written and spoken
languages are the same, but you are wrong. To see this difference, try writing dialog and then read how it
sounds. Almost no one can write dialog so that it sounds right, and when it sounds right it is still not the
spoken language.
The human animal is not reliable, as I keep insisting, so low redundancy means lots of undetected errors,
while high redundancy tends to catch the errors. The spoken language goes over an acoustic channel with
all its noise and must be caught on the fly as it is spoken; the written language is printed, and you can pause,
back scan, and do other things to uncover the author's meaning. Notice in English more often different
words have the same sounds ("there" and "their" for example) than words have the same spelling but
different sounds ("record" as a noun or a verb, and "tear" as in tear in the eye, vs. tear in a dress). Thus you
Figure 4.I
The Engineering Efficiency of Language
- Programming languages should be judged by how well they fit the human animal rather than the convenience of computer experts.
- The ideal future of computing involves the domain expert writing code directly, eliminating the 'human interface' of a separate programmer.
- The ADA language is criticized as a 'hacking job' that lacks psychological design, leading developers to write in FORTRAN and convert to ADA only for compliance.
- There is a profound lack of research into the 'engineering efficiency' of languages, including optimal redundancy and structural density for human-machine communication.
- Software problems will persist until we understand how natural languages evolved to suit human communication and apply those lessons to artificial languages.
- The failure of the Japanese 'fifth generation' project highlights the difficulty of using AI to bridge the gap between machines and human problem solvers.
What I wanted to know was how the job of communication can be efficiently accomplished when we have the power to design the language, and when only one end of the language is humans, with all their faults, and the other is a machine with high reliability to do what it is told to do, but nothing else.
should judge a language by how well it fits the human animal as it is (and remember I include how they are
trained in school), or else you must be prepared to do a lot of training to handle the new type of language you
are going to use. That a language is easy for the computer expert does not mean it is necessarily easy for the
non-expert, and it is likely non-experts will do the bulk of the programming (coding if you wish) in the near
future.
What is wanted in the long run, of course, is that the man with the problem does the actual writing of the code
with no human interface, as we all too often have these days, between the person who knows the problem
and the person who knows the programming language. This date is unfortunately too far off to do much
good immediately, but I would think by the year 2020 it would be fairly universal practice for the expert in
the field of application to do the actual program preparation rather than have experts in computers (and
ignorant of the field of application) do the program preparation.
Unfortunately, at least in my opinion, the ADA language was designed by experts, and it shows all the
non-humane features you can expect from them. It is, in my opinion, a typical Computer Science hacking
job: do not try to understand what you are doing, just get it running. As a result of this poor psychological
design, a private survey by me of knowledgeable people suggests that although a Government contract may
specify the programming be in ADA, probably over 90% will be done in FORTRAN, debugged, tested, and
then painfully, by hand, be converted to a poor ADA program, with a high probability of errors!
The fundamentals of language are not understood to this day. Somewhere in the early 1950s I took the
then local natural language expert (in the public eye) to visit the IBM 701 and then to lunch, and at dessert
time I said, "Professor Pei, would you please discuss with us the engineering efficiencies of languages". He
simply could not grasp the question and kept telling us how this particular language put the plurals in the
middle of words, how that language had one feature and not another, etc. What I wanted to know was how
the job of communication can be efficiently accomplished when we have the power to design the language,
and when only one end of the language is humans, with all their faults, and the other is a machine with high
reliability to do what it is told to do, but nothing else. I wanted to know what redundancy I should have for
such languages, the density of irregular and regular verbs, the ratio of synonyms to antonyms, why we have
the number of them that we do, how to compress efficiently the communication channel and still leave
usable human redundancy, etc. As I said, he could not hear the question concerning the engineering
efficiency of languages, and I have not noticed many studies on it since. But until we genuinely understand
such things (assuming, as seems reasonable, the current natural languages through long evolution are
reasonably suited to the job they do for humans) we will not know how to design artificial languages for
human-machine communication. Hence I expect a lot of trouble until we do understand human
communication via natural languages. Of course, the problem of human-machine communication is significantly
different from human-human communication, but in which ways and how much seems to be not known nor even
sought for.
Until we better understand languages of communication involving humans as they are (or can be easily
trained) then it is unlikely many of our software problems will vanish.
Some time ago there was the prominent "fifth generation" of computers the Japanese planned to use,
along with AI, to get a better interface between the machine and the human problem solvers. Great claims were
made for both the machines and the languages. The result, so far, is the machines came out as advertised,
and they are back to the drawing boards on the use of AI to aid in programming.
Programming as Novel Writing
- The 'software problem' persists because we lack a fundamental understanding of how language communicates meaning between humans and machines.
- Programming is currently more akin to creative novel writing than classical engineering, as different programmers produce vastly different solutions to the same problem.
- While utility programs may eventually be engineered, general software development remains a highly creative process resistant to rigid engineering controls.
- The most effective but often ignored method for improving software productivity is simply thinking deeply about the entire problem and its maintenance before writing code.
- Rigorous programming models often fail because the programming process itself is frequently how the actual problem is discovered and defined.
- Higher-level languages and modern tools have significantly improved productivity, with estimates suggesting a 90-fold increase over 30 years.
But you do not expect novelists to "engineer the production of novels". The question arises, "Is programming closer to novel writing than it is to classical engineering?" I suggest yes!
It came out as I predicted at
that time (for Los Alamos), since I did not see the Japanese were trying to understand the basics of language
in the above engineering sense. There are many things we can do to reduce "the software problem", as it is
called, but it will take some basic understanding of language as it is used to communicate understanding
between humans, and between humans and machines, before we will have a really decent solution to this
costly problem. It simply will not go away.
You read constantly about "engineering the production of software", both for the efficiency of production
and for the reliability of the product. But you do not expect novelists to "engineer the production of novels".
The question arises, "Is programming closer to novel writing than it is to classical engineering?" I suggest
yes! Given the problem of getting a man into outer space both the Russians and the Americans did it pretty
much the same way, all things considered, and allowing for some espionage. They were both limited by the
same firm laws of physics. But give two novelists the problem of writing on "the greatness and misery of
man", and you will probably get two very different novels (without saying just how to measure this). Give
the same complex problem to two modern programmers and you will, I claim, get two rather different
programs. Hence my belief current programming practice is closer to novel writing than it is to engineering.
The novelists are bound only by their imaginations, which is somewhat as the programmers are when they are
writing software. Both activities have a large creative component, and while you would like to make
programming resemble engineering, it will take a lot of time to get there, and maybe you really, in the long
run, do not want to do it! Maybe it just sounds good. You will have to think about it many times in the
coming years; you might as well start now and discount propaganda you hear, as well as all the wishful
thinking which goes on in the area! The software of the utility programs of computers has been done often
enough, and is so limited in scope, so it might reasonably be expected to become "engineered", but the general
software preparation is not likely to be under "engineering control" for many, many years.
There are many proposals on how to improve the productivity of the individual programmer as well as
groups of programmers. I have already mentioned top-down and bottom-up; there are others such as head
programmer, lead programmer, proving the program is correct in a mathematical sense, and the waterfall
model of programming, to name but a few. While each has some merit I have faith in only one which is
almost never mentioned: think before you write the program, it might be called. Before you start, think
carefully about the whole thing including what will be your acceptance test that it is right, as well as how later
field maintenance will be done. Getting it right the first time is much better than fixing it up later!
One trouble with much of programming is simply that often there is not a well defined job to be done,
rather the programming process itself will gradually discover what the problem is! The desire that you be
given a well defined problem before you start programming often does not match reality, and hence a lot of
the current proposals to "solve the programming problem" will fall to the ground if adopted rigorously.
The use of higher level languages has meant a lot of progress. One estimate of the improvement in 30
years is:
Comparison                    Ratio     Cumulative
Assembler : machine code      2:1       ×2
C language : assembler        3:1       ×6
Time share : batch            1.5:1     ×9
UNIX : monitor                1.5:1     ×12
System QA : debugging         2:1       ×24
Prototyping : top-down        1.3:1     ×30
C++ : C                       2:1       ×60
Reuse : redo                  1.5:1     ×90
The Human Bottleneck in Software
- Programmer productivity has improved by only 16% annually over 30 years, a rate dwarfed by the exponential speed-up of computer hardware.
- The vast disparity in individual talent suggests it is more efficient to pay low-performing programmers to stay home than to let them interfere with elite talent.
- Neural networks offer a potential solution to the 'programming problem' by learning from feedback rather than requiring explicit, detailed instructions.
- Software development is compared to literary writing, suggesting that clear thinking is a fundamental trait that may not be easily taught in a classroom.
- Experience does not necessarily improve a programmer's skill; like bureaucratic writing, long-term habits may actually degrade the quality of their work.
In practice you may actually be better off to pay the worst to stay home and not get in the way of the more capable (and I am serious)!
so we apparently have made a factor of about 90 in the total productivity of programmers in 30 years (a
mere 16% annual rate of improvement!). This is one person's guess, and it is at least plausible. But compared with
the speed up of machines it is like nothing at all! People wish humans could be similarly speeded up, but the
fundamental bottleneck is the human animal as it is, and not as we wish it were.
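The 16% figure follows from treating the factor of 90 as compound growth over 30 years, that is, from taking the 30th root of 90. A short check, in which the function name `annual_rate` is only an illustrative choice:

```python
def annual_rate(total_factor: float, years: float) -> float:
    """Constant yearly growth rate that compounds to total_factor over years."""
    return total_factor ** (1.0 / years) - 1.0


rate = annual_rate(90, 30)
print(f"{rate:.1%}")   # 16.2%
```

So a 90-fold gain spread over three decades is indeed only about 16% a year.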
Many studies have shown programmers differ in productivity, from worst to best, by much more than a
factor of 10. From this I long ago concluded the best policy is to pay your good programmers very well but
regularly fire the poorer ones, if you can get away with it! One way is, of course, to hire them on contract
rather than as regularly employed people, but that is increasingly against the law which seems to want to
guarantee even the worst have some employment. In practice you may actually be better off to pay the
worst to stay home and not get in the way of the more capable (and I am serious)!
Digital computers are now being used extensively to simulate neural nets, and similar devices are
creeping into the computing field. A neural net, in case you are unfamiliar with them, can learn to get
results when you give it a series of inputs and acceptable outputs, without ever saying how to produce the
results. They can classify objects into classes which are reasonable, again without being told what classes
are to be used or found. They learn with simple feedback which uses the information that the result
computed from an input is not acceptable. In a way they represent a solution to "the programming
problem": once they are built they are really not programmed at all, but still they can solve a wide variety
of problems satisfactorily. They are a coming field which I shall have to skip in this book, but they will
probably play a large part in the future of computers. In a sense they are a "hard wired" computer (it may be
merely a program) to solve a wide class of problems when a few parameters are chosen and a lot of data is
supplied.
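To make "learning from inputs and acceptable outputs" concrete, here is a minimal sketch of the simplest such device, a single perceptron. It is an illustrative stand-in chosen for brevity, not a method taken from the text. It covers only the supervised case, not classification without given classes, and the names `train_perceptron` and `predict` are hypothetical.

```python
def predict(weights, bias, inputs):
    """Fire (1) when the weighted sum of the inputs exceeds zero."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def train_perceptron(samples, epochs=20, rate=1.0):
    """Adjust weights using only the feedback 'this result was wrong'."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)   # 0 when acceptable
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias


# Inputs with acceptable outputs for logical AND; no rule for AND is ever written.
and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_samples)
print([predict(weights, bias, x) for x, _ in and_samples])   # [0, 0, 0, 1]
```

The same general routine would learn logical OR just by being handed different samples, which is the sense in which the net is "not programmed at all".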
Another view of neural nets is they represent a fairly general class of stable feedback systems. You pick
the kind and amount of feedback you think is appropriate, and then the neural net's feedback system
converges to the desired solution. Again, it avoids a lot of detailed programming since, at least in a
simulated neural net on a computer, by once writing out a very general piece of program you then have
available a broad class of problems already programmed and the programmer hardly does more than give a
calling sequence.
What other very general pieces of programming can be similarly done is not now known; you can think
about it as one possible solution to the "programming problem".
In the chapter on hardware I carefully discussed some of the limits: the size of molecules, the velocity of
light, and the removal of heat. I should summarize correspondingly the less firm limits of software.
I made the comparison of writing software with the act of literary writing; both seem to depend
fundamentally on clear thinking. Can good programming be taught? If we look at the corresponding
teaching of "creative writing" courses we find most students of such courses do not become great writers,
and most great writers in the past did not take creative writing courses! Hence it is dubious that great
programmers can be trained easily.
Does experience help? Do bureaucrats after years of writing reports and instructions get better? I have no
real data but I suspect with time they get worse! The habitual use of "governmentese" over the years probably
seeps into their writing style and makes them worse. I suspect the same for programmers! Neither years of
experience nor the number of languages used is any reason for thinking the programmer is getting better
from these experiences. An examination of books on programming suggests most of the authors are not
good programmers!
The Duty of Communication
- The author argues that scientific discovery is incomplete without successful communication in multiple formats.
- A scientist's duty encompasses writing papers, delivering prepared public talks, and mastering impromptu speaking.
- The author recounts overcoming a paralyzing fear of public speaking that threatened his professional growth in the 1950s.
- The text emphasizes that technical material is often best conveyed through the structure of personal anecdotes.
- The author asserts that his pessimistic predictions are backed by years of programming evidence rather than wishful thinking.
On thinking this over very seriously, I came to the conclusion I could not afford to be crippled that way and still become a great scientist.
The results I picture are not nice, but all you have to oppose it is wishful thinking; I have evidence of
years and years of programming on my side!
5
History of Computer Application
As you have probably noticed, I am using the technical material to hang together a number of anecdotes,
hence I shall begin this time with a story of how this, and the two preceding chapters, came about. By the
1950s I had found I was frightened when giving public talks to large audiences, this in spite of having taught
classes in college for many years. On thinking this over very seriously, I came to the conclusion I could not
afford to be crippled that way and still become a great scientist; the duty of a scientist is not only to find new
things, but to communicate them successfully in at least three forms:
writing papers and books
prepared public talks
impromptu talks
Mastering the Art of Speaking
- The author identifies public speaking as a critical career skill and commits to overcoming stage fright through deliberate practice.
- To maximize practice opportunities, the author designed a talk specifically tailored to what the audience wanted to hear rather than personal preference.
- A distinction is made between scientific communication and mere entertainment, emphasizing that truth must be the priority even when engaging an audience.
- The chosen topic, 'The History of Computing to the Year 2000,' forced the author to stay intellectually current and anticipate future trends.
- The author argues that a degree of stage fright is beneficial because excitement is contagious and prevents the audience from falling asleep.
- Beyond giving talks, the author began studying the delivery styles of others to identify what makes a presentation effective or ineffective.
Your excitement tends to be communicated to the audience, and if you seem to be perfectly relaxed then the audience also relaxes and may fall asleep!
Lacking any one of these would be a serious drag on my career. How to learn to give public talks without
being so afraid was my problem. The answer was obviously by practice, and while other things might help,
practice was a necessary thing to do.
Shortly after I had realized this it happened I was asked to give an evening talk to a group of computer
people who were IBM customers learning some aspect of the use of IBM machines. As a user I had been
through such a course myself and knew typically the training period was for a week during working hours.
To supply entertainment in the evenings IBM usually arranged a social get-together the first evening, a
theater party on some other evening, and a general talk about computers on still another evening, and it
was obvious to me I was being asked to do the latter.
I immediately accepted the offer because here was a chance to practice giving talks as I had just told
myself I must do. I soon decided I should give a talk which was so good I would be asked to give other talks
and hence get more practice. At first I thought I would give a talk on a topic dear to my heart, but I soon
realized if I wanted to be invited back I had best give a talk the audience wanted to hear, which is often a
very, very different thing. What would they want to hear, especially as I did not know exactly the course
they were taking and hence the abilities of people? I hit on the general interest topic, The History of
Computing to the Year 2000 (this at around 1960). Even I was interested in the topic, and wondered what I
would say! Furthermore, and this is important, in preparing the talk I would be preparing myself for the future.
In saying, "What do they want to hear?" I am not speaking as a politician but as a scientist who should
tell the truth as they see it. A scientist should not give talks merely to entertain, since the object of the talk is
usually scientific information transmission from the speaker to the audience. That does not imply the talk
must be dull. There is a fine, but definite, line between scientific communication and entertainment, and the
scientist should always stay on the right side of that line.
My first talk concentrated on the hardware, and I dealt with the limitations of it including, as I mentioned
in Chapter 3, the three relevant laws of Nature: the size of molecules, the speed of light, and the problem of
heat dissipation. I included lovely colored VuGraphs with overlays of the quantum mechanical limitations,
including the uncertainty principle effects. The talk was successful since the IBM person who had asked me
to give the talk said afterwards how much the audience had liked it. I casually said I had enjoyed it too, and
would be glad to come into NYC almost any evening they cared, provided they warned me well in advance,
and I would give it again, and they accepted. It was the first of a series of talks which went on for many
years, about two or three times a year; I got a lot of practice and learned not to be too scared. You should
always feel some excitement when you give a talk since even the best actors and actresses usually have some
stage fright. Your excitement tends to be communicated to the audience, and if you seem to be perfectly
relaxed then the audience also relaxes and may fall asleep!
The talk also kept me up to date, made me keep an eye out for trends in computing, and generally paid
off to me in intellectual ways as well as getting me to be a more polished speaker. It was not all just luck; I
made a lot of it by trying to understand, below the surface level, what was going on. I began, at any lecture I
attended anywhere, to pay attention not only to what was said, but to the style in which it was said, and
whether it was an effective or a noneffective talk. Those talks which were merely funny I tended to ignore,
though I studied the style of joke telling closely. An after dinner speech requires, generally, three good
The Evolution of Computing Economics
- The author transitioned from focusing on hardware and software to realizing that economics and applications are the primary drivers of computer evolution.
- Early computing was dominated by 'number crunching' because only those requiring hard numerical data could justify the high costs of the era.
- Historically, the most difficult problems were solved on the most primitive equipment to prove the technology's viability before it was applied to routine tasks.
- Innovation faces a natural barrier of resistance, requiring proof of success in 'heroic tasks' before being accepted for more useful, everyday applications.
- The author shifted toward the 'mass production of a variable product,' organizing systems to handle a high volume of diverse, unpredictable small problems.
Yes, we did some of the hardest problems on the most primitive equipment; it was necessary to do this in order to prove machines could do things which could not be done otherwise.
jokes; one at the beginning, one in the middle, and a closing one so that they will at least remember one
joke; all jokes of course told well. I had to find my own style of joke telling, and I practiced it by telling
jokes to secretaries.
After giving the talk a few times I realized, of course, it was not just the hardware, but also the software
which would limit the evolution of computing as we approached the year 2000 (Chapter 4, which I just gave you).
Finally, after a long time, I began to realize it was the economics, the applications, which probably would
dominate the evolution of computers. Much, but by no means all, of what would happen had to be
economically sound. Hence this chapter.
Computing began with simple arithmetic, went through a great many astronomical applications, and came
to number crunching. But it should be noted Raymond Lull (1235?–1315), sometimes written Lully, a
Spanish theologian and philosopher, built a logic machine! It was this that Swift satirized in his Gulliver's
Travels when Gulliver was on the island of Laputa, and I have the impression Laputa corresponds to
Majorca where Lull flourished.
In the early years of modern computing, say around the 1940s and 1950s, "number crunching" dominated the
scene since people who wanted hard, firm numbers were the only ones with enough money to afford the
price (in those days) of computing. As computing costs came down the kinds of things we could do
economically on computers broadened to include many other things than number crunching. We had
realized all along these other activities were possible, it was just they were uneconomical at that time.
Another aspect of my experiences in computing was also typical. At Los Alamos we computed the
solutions of partial differential equations (atomic bomb behavior) on primitive equipment. At Bell
Telephone Laboratories at first I solved partial differential equations on relay computers; indeed I even
solved a partial differential-integral equation! Later, with much better machines available, I progressed to
ordinary differential equations in the form of trajectories for missiles. Then still later I published several
papers on how to do simple integration. Then I progressed to a paper on function evaluation, and finally one
paper on how numbers combine! Yes, we did some of the hardest problems on the most primitive equipment;
it was necessary to do this in order to prove machines could do things which could not be done otherwise.
Then, and only then, could we turn to the economical solutions of problems which could be done only
laboriously by hand! And to do this we needed to develop the basic theories of numerical analysis and
practical computing suitable for machines rather than for hand calculations.
This is typical of many situations. It is first necessary to prove beyond any doubt the new thing, device,
method, or whatever it is, can cope with heroic tasks before it can get into the system to do the more
routine, and in the long run, more useful tasks. Any innovation is always against such a barrier, so do not get
discouraged when you find your new idea is stoutly, and perhaps foolishly, resisted. By realizing the
magnitude of the actual task you can then decide if it is worth your efforts to continue, or if you should go
do something else you can accomplish and not fritter away your efforts needlessly against the forces of
inertia and stupidity.
In the early evolution of computers I soon turned to the problem of doing many small problems on a big machine. I realized, in a very real sense, I was in the mass production of a variable product: I should organize things so I could cope with most of the problems which would arise in the next year, while at the same time not knowing what, in detail, they would be.
Mass Production of Variety
- Computers have enabled the mass production of variable products, allowing for customization without the traditional costs of excessive standardization.
- The author demonstrates that investing a year into building software tools can yield more productivity than solving individual problems sequentially.
- Computer applications follow an S-curve growth pattern where specific fields like science or engineering eventually saturate, but new fields emerge to maintain overall growth.
- The historical progression of computing at Bell Labs moved from scientific research to engineering, military applications, and eventually symbol manipulation like word processing.
- Future growth in computing power consumption will likely be driven by pattern recognition, virtual reality, and artificial intelligence.
- For software tools to be viable in a rapidly changing field, they must provide a return on investment in the near future.
They enable us to deal with variety without excessive standardization, and hence we can evolve more rapidly to a desired future!
It was then I realized computers have opened the door much more generally to the mass production of a variable product, regardless of what it is: numbers, words, word processing, making furniture, weaving, or what have you. They enable us to deal with variety without excessive standardization, and hence we can evolve more rapidly to a desired future! You see it at the moment applied to computers themselves! Computers, with some guidance from humans, design their own chips, and computers are assembled, more or less, automatically from standard parts; you say what things you want in your computer and the particular computer is then made. Some computer manufacturers are now using almost total machine assembly of the parts with almost no human intervention.
It was the attitude that I was in the mass production of a variable product, with all its advantages and disadvantages, which caused me to approach the IBM 650 as I told you about in the last chapter. By spending about 1 man year in total effort over a period of 6 months, I found at the end of the year I had more work done than if I had approached each problem one at a time! The creation of the software tool paid off within one year! In such a rapidly changing field as computer software, if the payoff is not in the near future then it is doubtful it will ever pay off.
I have ignored my experiences outside of science and engineering; for example I did one very large business problem for AT&T using a UNIVAC-I in NYC, and one of these days I will get to a lesson I learned then.
Let me discuss the applications of computers in a more quantitative way. Naturally, since I was in the Research Division of Bell Telephone Laboratories, initially the problems were mainly scientific, but being in Bell Telephone Laboratories we soon got to engineering problems. First (Figure 5.I), following only the growth of the purely scientific problems, you get a curve which rises exponentially (note the vertical log scale), but you soon see the upper part of the S-curve, the flattening off to more moderate growth rates. After all, given the kind of problem I was solving for them at that time, and the total number of scientists employed in Bell Telephone Laboratories, there had to be a limit to what they could propose and consume. As you know they began much more slowly to propose far larger problems, so scientific computing is still a large component of the use of computers, but not the major one in most installations.
The engineering computing soon came along, and it rose along much the same shape, but was larger and was added on top of the earlier scientific curve. Then, at least at Bell Telephone Laboratories, I found an even larger military work load, and finally as we shifted to symbol manipulations in the form of word processing, compiling time for the higher level languages, and other things, there was a similar increase. Thus while each kind of work load seemed to slowly approach saturation in its turn, the net effect of all of them was to maintain a rather constant growth rate.
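The stacking of S-curves described above can be checked numerically. The following Python fragment is only an illustrative sketch: the logistic form, the four capacities (each ten times the previous), and the five-unit spacing between waves are assumptions chosen for the demonstration, not figures from the text.

```python
import math

def logistic(t, capacity, midpoint, rate=1.0):
    """One S-curve: near-exponential rise early, saturating at `capacity`."""
    return capacity / (1.0 + math.exp(-rate * (t - midpoint)))

# Hypothetical work loads (name, capacity, midpoint); all numbers are assumed.
WAVES = [
    ("scientific", 1.0, 5.0),
    ("engineering", 10.0, 10.0),
    ("military", 100.0, 15.0),
    ("symbol manipulation", 1000.0, 20.0),
]

def scientific_only(t):
    """The first wave taken by itself."""
    return logistic(t, 1.0, 5.0)

def total_load(t):
    """Sum of all waves, each added on top of the earlier ones."""
    return sum(logistic(t, cap, mid) for _, cap, mid in WAVES)

def log10_growth(f, t, dt=5.0):
    """Decades of growth of f over the interval [t, t + dt]."""
    return math.log10(f(t + dt)) - math.log10(f(t))

if __name__ == "__main__":
    for t in (5, 10, 15):
        print(f"t={t:2d}  scientific alone: {log10_growth(scientific_only, t):.2f}"
              f"  total: {log10_growth(total_load, t):.2f} decades per interval")
```

Under these assumed numbers the scientific curve alone flattens almost completely, while the total keeps growing by about one decade per interval, which is a straight line on a log scale. The constancy depends on the assumption that each new wave is larger than the last by a fixed factor and arrives at regular intervals.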
What will come along to sustain this straight line logarithmic growth curve and prevent the inevitable flattening out of the S-curve of applications? The next big area is, I believe, pattern recognition. I doubt our ability to cope with the most general problem of pattern recognition, because for one thing it implies too much, but in areas like speech recognition, radar pattern recognition, picture analysis and redrawing, work load scheduling in factories and offices, analysis of data for statisticians, creation of virtual images, and such, we can consume a very large amount of computer power. Virtual reality computing will become a large consumer of computing power, and its obvious economic value assures us this will happen, both in the practical needs and in amusement areas. Beyond these is, I believe, Artificial Intelligence, which will finally
The Rise of Interactive Computing
- The author recounts an early experiment attaching a small SDS 910 computer to a Brookhaven cyclotron to provide real-time data feedback.
- Despite corporate concerns regarding the longevity of the computer manufacturer, the project proceeded and proved highly successful.
- The small computer effectively doubled the productivity of the massive cyclotron by allowing scientists to monitor data as it was gathered.
- Real-time visualization on an oscilloscope enabled researchers to abort and adjust flawed experiments immediately rather than waiting for completion.
- This success at Brookhaven led Bell Telephone Laboratories to integrate small computers into labs for both data reduction and experimental control.
- The shift toward interactive computing transformed the machine from a passive calculator into an active driver of experimental parameters.
I believed then, as I do now, that cheap, small SDS 910 machine at least doubled the effective productivity of the huge, expensive cyclotron!
get to the point where the delivery of what they have to offer will justify the price in computing effort, and will hence be another source of problem solving.
We early began interactive computing. My introduction was via a scientist named Jack Kane. He had, for that time, the wild idea of attaching a small Scientific Data Systems (SDS) 910 computer to the Brookhaven cyclotron where we used a lot of time. My V.P. asked me if Jack could do it, and when I examined the question (and Jack) closely I said I thought he could. I was then asked, "Would the manufacturing company making the machine stay in business?", since the V.P. had no desire to get some unsupported machine. That cost me much more effort in other directions, and I finally made an appointment with the President of SDS to have a face to face talk in his office out in Los Angeles. I came away believing, but more on that at a later date. So we did it, and I believed then, as I do now, that cheap, small SDS 910 machine at least doubled the effective productivity of the huge, expensive cyclotron! It was certainly one of the first computers which during a cyclotron run gathered, reduced, and displayed the gathered data on the face of a small oscilloscope (which Jack put together and made operate in a few days). This enabled us to abort many runs which were not quite right; say the specimen was not exactly in the middle of the beam, there was an effect near the edge of the spectrum and hence we had better redesign the experiment, something funny was going on, and we would need more detail here or there: all reasons to stop and modify rather than run to the end and then find the trouble.
This one experience led us at Bell Telephone Laboratories to start putting small computers into laboratories, at first merely to gather, reduce, and display the data, but soon to drive the experiment. It is often easier to let the machine program the shape of the electrical driving voltages to the experiment, via a
Figure 5.I
The Realities of Shared Databases
- Computers often change the nature of experiments and problems rather than just automating existing tasks.
- Boeing's attempt at a centralized design tape failed because engineers could not perform optimization studies against a constantly shifting baseline.
- In practice, teams must freeze a copy of a database to ensure that improvements are due to their own parameter changes rather than external updates.
- Real-time data access in corporate settings can create conflict and inconsistency, such as two executives presenting different figures based on different retrieval times.
- Scientific databases face social and technical hurdles, including prestige-driven conflicts over whose measurements are officially recorded.
- Most high-level decisions and optimizations should not be sensitive to minute-by-minute data fluctuations.
You simply cannot use a constantly changing data base for an optimization study.
standard digital to analog converter, than it is to build special circuits to do it. This enormously increased the range of possible experiments, and introduced the practicality of having interactive experiments. Again, we got the machine in under one pretext, but its presence in the long run changed both the problem and what the computer was actually used for. When you successfully use a computer you usually do an equivalent job, not the same old one. Again you see the presence of the computer, in the long run, changed the nature of many of the experiments we did.
Boeing (in Seattle) later had a somewhat similar idea, namely they would keep the current status of a proposed plane design on a tape and everyone would use that tape, hence in the design of any particular plane all the parts of the vast company would be attuned to each other's work. It did not work out as the bosses thought it would, and as they probably thought it did! I know, because I was doing a high level, two week snooping job for the Boeing top brass under the guise of doing a routine inspection of the computer center for a lower level group!
The reason it did not work as planned is simple. If the current status of the design is on the tape (currently discs) and if you use the data during a study of, say, wing area, shape, and profile, then when you make a change in your parameters and you find an improvement it might have been due to a change someone else inserted into the common design and not to the change you made, which might have actually made things worse! Hence what happened in practice was each group, when making an optimization study, made a copy of the current tape, and used it without any updates from any other area. Only when they finally decided on their new design did they insert the changes, and of course they had to verify their new design meshed with the new designs of the others. You simply cannot use a constantly changing data base for an optimization study.
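The confounding effect just described is easy to reproduce in miniature. The Python sketch below is an illustration only: the `drag` figure of merit, the two design parameters, and all the numbers are hypothetical, invented to show why a study needs a frozen copy of the shared design.

```python
import copy

def drag(design):
    # Hypothetical figure of merit (lower is better); not a real aerodynamic model.
    return 0.5 * design["wing_area"] + 2.0 * design["fuselage_width"]

def score(baseline, wing_area):
    # Evaluate one trial wing area against the given baseline design.
    trial = dict(baseline, wing_area=wing_area)
    return drag(trial)

# The common design that everyone reads and updates (the shared "tape").
shared = {"wing_area": 100.0, "fuselage_width": 10.0}
# The private copy taken at the start of the optimization study.
frozen = copy.deepcopy(shared)

before_live = score(shared, 100.0)
before_frozen = score(frozen, 100.0)

# Meanwhile another group inserts its own improvement into the shared design.
shared["fuselage_width"] = 8.0

# Our trial change (a larger wing) actually makes this figure of merit worse.
after_live = score(shared, 102.0)
after_frozen = score(frozen, 102.0)

print("against the live design:  ", before_live, "->", after_live)
print("against the frozen design:", before_frozen, "->", after_frozen)
```

Against the live design the trial looks like an improvement, but only because of the other group's change; against the frozen copy the same trial is correctly seen to be worse. Merging the final result back into the shared design, and checking it against the others' changes, remains a separate step.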
This brings me to the topic of data bases. Computers were to be the savior in this area, and they are still occasionally invoked as if they would be. Certainly the airlines with their reservation systems are a good example of what can be done with computers; just think what a mess it would be when done by hand with all its many human errors, let alone the size of the troubles. The airlines now keep many data bases, including the weather. The weather and current airport delays are used to design the flight profile for each flight just before takeoff, and possibly change it during flight in view of later information.
Company managers always seem to have the idea if only they knew the current state of the company in every detail then they could manage things better. So nothing will do but they must have a data base of all the company's activities, always up to the moment. This has its difficulties as indicated above. But another thing; suppose you and I are both V.P.s of a company and for a Monday morning meeting we want exactly the same figures. You get yours from a program run on Friday afternoon, while I, being wiser and knowing over the weekend much information comes in from the outlying branches, wait until Sunday night and prepare mine. Clearly there could be significant differences in our two reports, even though we both used the same program to prepare them! That is simply intolerable in practice. Furthermore, most important reports and decisions should not be time sensitive to up to the minute data!
How about a scientific data base? For example, whose measurement gets in? There is prestige in getting yours in, of course, so there will be hot, expensive, irritating conflicts of interest in that area. How will such conflicts be resolved? Only at high costs! Again, when you are making optimization studies you have the above problem; was it a change made in some physical constant you did not know happened which made the new model better than the old model?
The Rise of General Purpose Chips
- The shift from hardware-specific manufacturing to software programming allowed for mass production of variable products using the same general purpose computer.
- The Intel 4004 four-bit chip revolutionized the industry by replacing complex manufacturing jobs with flexible programming tasks.
- General purpose computers have become universal and invisible, controlling everything from stoplights and elevators to automobiles and washing machines.
- Choosing a special-purpose chip over a general-purpose one is often driven by ego rather than economic or practical logic.
- General purpose chips benefit from a shared ecosystem of bug fixes, manuals, and upgrades that are maintained by the wider market.
- Excess capacity in general purpose chips is essential for handling the inevitable future expansion of a project's original requirements.
One of the main reasons is there is a great ego satisfaction in having your own special chip and not one of the common herd.
How will you keep the state of changes available to all the users? It is not sufficient to do it so the users must read all your publications every time they use the machine, and since they will not keep up to date errors will be made. Blaming the users will not undo the errors!
I began mainly talking about general purpose computers, but I gradually took up discussing the use of a general purpose computer as a special purpose device to control things, such as the cyclotron and laboratory equipment. One of the main steps happened when someone in the business of making integrated circuits for people noted that instead of making a special chip for each of several customers, he could make a four bit general purpose computer and then program it for each special job (INTEL 4004). He replaced a complex manufacturing job with a programming job, though of course the chip still had to be made, but now it would be a large run of the same four bit chips. Again this is the trend I noted earlier, going from hardware to software to gain the mass production of a variable product, always using the same general purpose computer. The four bit chip was soon expanded to 8 bit chips, then 16, etc., so now some chips have 64 bit computers on them!
You tend not to realize the number of computers you interact with in the course of a day. Stop-and-go lights, elevators, washing machines, telephones which now have a lot of computers in them (as opposed to my youth when there was always a cheerful operator at the end of every line waiting to be helpful and get the phone number you wanted), answering machines, and automobiles controlled by computers under the hood are all examples of their expanding range of application; you have only to watch and note the universality of computers in your life. Of course they will further increase as time goes on; the same simple general purpose computer can do so many special purpose jobs it is seldom that a special purpose chip is wanted.
You see many more special purpose chips around than there need be. One of the main reasons is there is a great ego satisfaction in having your own special chip and not one of the common herd. (I am repeating part of Chapter 2.) Before you make this mistake and use a special purpose chip in any equipment ask yourself a number of questions. Let me repeat the earlier arguments. Do you want to be alone with your special chip? How big a stock pile of them will you need in inventory? Do you really want to have a single, or a few, suppliers rather than being able to buy them on the open market? Will not the total cost be significantly higher in the long run?
If you have a general purpose chip then all the users will tend to contribute, not only in finding flaws but in having the manufacturer very willing to correct them; otherwise you will have to produce your own manuals, diagnostics, etc., and at the same time what others learn about their chips will seldom help you with your special one. Furthermore, with a general purpose chip, upgrades of the chip, which you can expect will sort of be taken care of mainly by others, will be available to you free of effort on your part. There will inevitably be a need for you to upgrade yours because you will soon want to do more than the original plan called for. In meeting this new need a general purpose chip with some excess capacity for the inevitable future expansion is much easier to handle.
I need not give you a list of the applications of computers in your business. You should know better than I do your rapidly increasing use of computers, not only in the field but throughout your whole organization, from top to bottom, from far behind the actual manufacturing up to the actual production front. You should also be well aware of the steadily increasing rate of changes, upgrades, and the flexibility a general purpose
The Future of Computer Applications
- The potential for symbol-manipulating devices to adapt to changing environments is still in its infancy.
- Innovation should strive for transformative 'great new things' rather than mere ten percent incremental improvements.
- Successful careers require analyzing the specific conditions that lead to project success versus guaranteed failure.
- Effective automation involves redesigning tasks for machines rather than simply replicating human processes.
- Future-proofing and realistic field maintenance are critical components of sustainable system design.
- The next frontier of computing applications lies in the exploration of Artificial Intelligence and its inherent limitations.
I have no objections to 10% improvements of established things, but from you I also look for the great new things which make so much difference to your organization that history remembers them for at least a few years.
symbol manipulating device gives to the whole organization to meet the constantly changing demands of the operating environment. The range of possible applications has only begun, and many new applications need to be done, perhaps by you. I have no objections to 10% improvements of established things, but from you I also look for the great new things which make so much difference to your organization that history remembers them for at least a few years.
As you go on in your careers you should examine the applications which succeed and those which fail; try to learn how to distinguish between them; try to understand the situations which produce successes and those which almost guarantee failure. Realize, as a general rule, it is not the same job you should do with a machine, but rather an equivalent one, and do it so that future, flexible, expansion can be easily added (if you do succeed). And always also remember to give serious thought to the field maintenance as it will actually be done in the field, which is generally not as you wish it would be done!
The use of computers in society has not reached its end, and there is room for many new, important
applications. They are easier to find than most people think!
In the two previous chapters I ended with some remarks on the possible limitations of their topics, hardware and software. Hence I need to discuss some possible limitations of applications. This I will do in the next few chapters under the general title of Artificial Intelligence, AI.
6
Artificial Intelligence-I
The Limits of Machine Intelligence
- Computers manipulate symbols rather than 'information,' as the latter is a fuzzy concept that cannot be strictly defined for programming.
- Early research by Newell and Simon shifted from solving puzzles to modeling the human reasoning patterns used to solve them.
- The General Problem Solver (GPS) failed to scale, leading to a massive increase in the number of rules required for rule-based logic systems.
- Expert Systems face significant hurdles because experts often rely on subconscious patterns that they cannot consciously articulate.
- The success of rule-based logic appears inconsistent, suggesting that some human knowledge may be fundamentally impossible to translate into instructions.
Among other troubles with this idea is in many fields, especially in medicine, the world famous experts are in fact not much better than the beginners!
Having examined the history of computer applications we are naturally attracted to an examination of their future limits, not in computing capacity but rather what kinds of things computers can and perhaps cannot do. Before we get too far I need to remind you computers manipulate symbols, not information; we are simply unable to say, let alone write a program for, what we mean by the word "information". We each believe we know what the word means, but hard thought on your part will convince you it is a fuzzy concept at best; you cannot give a definition which can be converted into a program.
Although Babbage and Lady (Ada) Lovelace both considered slightly some of the limitations of computers, the exploration of the limits of computers really began in the late 1940s and early 1950s by, among others, Newell and Simon at RAND. For example they looked at puzzle solving, such as the classic cannibals and missionaries problem. Could machines solve them? And how would they do it? They examined the protocols people used as they solved such problems, and tried to write a program which would produce similar results. You should not expect exactly the same result as generally no two people reported exactly the same steps in the same order of their thought processes; rather the program was to produce a similar looking pattern of reasoning. Thus they tried to model the way people did such puzzles and examine how well the model produced results resembling human results, rather than just solve the problem.
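To see the contrast between "just solving the problem" and modeling human reasoning, it helps to look at what a plain machine solution is like. The Python sketch below is a brute-force breadth-first search over the states of the classic puzzle (three missionaries, three cannibals, a boat holding two). It is not the Newell and Simon approach, which tried to reproduce human protocols; the state encoding and function names are assumptions made for this illustration.

```python
from collections import deque

TOTAL = 3  # missionaries and cannibals on each side at the start
BOAT_LOADS = [(1, 0), (2, 0), (0, 1), (0, 2), (1, 1)]  # (missionaries, cannibals)

def is_safe(m_left, c_left):
    """True if the counts are valid and missionaries are never outnumbered."""
    if not (0 <= m_left <= TOTAL and 0 <= c_left <= TOTAL):
        return False
    m_right, c_right = TOTAL - m_left, TOTAL - c_left
    left_ok = m_left == 0 or m_left >= c_left
    right_ok = m_right == 0 or m_right >= c_right
    return left_ok and right_ok

def solve():
    """Return a shortest list of states (m_left, c_left, boat_on_left)."""
    start, goal = (TOTAL, TOTAL, True), (0, 0, False)
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        m, c, boat_left = state
        sign = -1 if boat_left else 1  # the boat carries people away from its bank
        for dm, dc in BOAT_LOADS:
            nxt = (m + sign * dm, c + sign * dc, not boat_left)
            if nxt not in parent and is_safe(nxt[0], nxt[1]):
                parent[nxt] = state
                queue.append(nxt)
    return None

if __name__ == "__main__":
    path = solve()
    print(len(path) - 1, "crossings")
    for step in path:
        print(step)
```

The search finds a shortest solution by exhausting the small state space, with no resemblance to the hesitant, backtracking protocols people report. That gap is exactly what the protocol studies were trying to close.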
They also started the General Problem Solver (GPS) with the idea that given about 5 general rules for solving problems they could then give the details of the particular area of a problem and the computer program would solve the problem. It didn't work too well, though very valuable by-products did come from their work, such as list processing. To continue with this problem solving approach: after their initial attack on general problem solving (which certainly promised to alleviate the programming problem to a fair extent) it was dropped, more or less, for a decade, and when revived the proposal was that about 50 general rules would be needed. When that did not work, there was another decade and the proposal had 500 general rules; then another decade, now under the title of rule based logic, and they are sometimes at 5000 rules, and I have even heard of 50,000 as the number of rules for some areas.
There is now a whole area known as Expert Systems. The idea is you talk with some experts in a field, extract their rules, put these rules into a program, and then you have an expert! Among other troubles with this idea is that in many fields, especially in medicine, the world famous experts are in fact not much better than the beginners! It has been measured in many different studies! Another trouble is experts seem to use their subconscious and they can only report their conscious experience in making a diagnosis. It has been estimated it takes about 10 years of intensive work in a field to become an expert, and in this time many, many patterns are apparently laid down in the mind from which the expert then makes a subconscious initial choice of how to approach the problem as well as the subsequent steps to be used.
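The mechanical part of the Expert Systems idea, putting extracted rules into a program, can be sketched in a few lines. The Python fragment below is a minimal forward-chaining loop; the rule format and the car-trouble rules are hypothetical examples invented for illustration, not taken from any real expert system.

```python
def forward_chain(facts, rules):
    """Fire rules whose conditions are all known until nothing new is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conclusion not in known and set(conditions) <= known:
                known.add(conclusion)
                changed = True
    return known

# Hypothetical rules of the form (conditions, conclusion).
RULES = [
    (("engine cranks", "no spark"), "ignition fault"),
    (("ignition fault",), "check spark plugs"),
    (("engine silent", "lights dim"), "battery flat"),
    (("battery flat",), "charge battery"),
]

if __name__ == "__main__":
    derived = forward_chain({"engine silent", "lights dim"}, RULES)
    print(sorted(derived))
```

The loop itself is trivial; the difficulty lies in the rules. Whatever the expert does subconsciously never appears in the list of rules, and the program can only be as good as what was consciously reported.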
In some areas rule based logic has had spectacular successes, and in some apparently similar areas there
were plain failures, which indicates success depends on a large element of luck; they still do not have a firm
basic understanding of when the method of rule based logic will or will not work, nor how well it will
work.
In Chapter 1, I already brought up the topic that perhaps everything we "know" cannot be put into words (instructions); cannot in the sense of impossible and not in the sense we are stupid or ignorant. Some of the features of Expert Systems we have found certainly strengthen this opinion.
After quite a few years the field of the limits of intellectual performance by machines acquired the
The Ambiguity of AI
- The term Artificial Intelligence is described as a 'dubious title' due to its lack of a singular, concrete definition.
- AI is framed not as a fixed technology but as a conceptual variant on a deeper philosophical or technical question.
- The text suggests that the nomenclature of AI may be misleading or overly broad in its current application.
- The lack of a unified meaning complicates the public and academic understanding of what these systems actually represent.
dubious title of Artificial Intelligence (AI), which does not have a single meaning. First, it is a variant on the
question,
The Limits of Machine Thinking
- Leaders must avoid the binary trap of believing or disbelieving in machine thought, as both extremes lead to strategic failure.
- The question of AI is better framed as identifying which human burdens machines can relieve, particularly on the intellectual side of life.
- Autonomous intelligence is a physical necessity for remote exploration, such as Mars rovers, where signal delays make human control impossible.
- Modern technology, like unstable high-speed aircraft, already requires machines to handle millisecond-level stabilization that exceeds human capability.
- Defining 'machine' is philosophically difficult, especially when considering the potential integration of organic components or neural networks.
Thus you cannot afford to either believe or disbelieve; you must come to your own terms with the vexing problem, 'To what extent can machines think?'
Can Machines Think?
While this is a more restricted definition than is artificial intelligence, it has a sharper focus and is a good substitute in the popular mind. This question is important to you because if you believe computers cannot think then as a prospective leader you will be slow to use computers to advance the field by your efforts, but if you believe of course computers can think then you are very apt to fall into a first class failure! Thus you cannot afford to either believe or disbelieve; you must come to your own terms with the vexing problem, "To what extent can machines think?"
Note, first, it really is mis-stated; the question seems to be more, "Can we write programs which will produce 'thinking' from a von Neumann type machine?" The reason for the hedge is there are arguments that modern neural nets, when not simulated on a digital computer, might be able to do what no digital computer can do. But then again they might not. It is a problem we will look into at a later stage when we have more technical facts available.
While the problem of AI can be viewed as, "Which of all the things humans do can machines also do?" I would prefer to ask the question in another form, "Of all of life's burdens, which are those machines can relieve, or significantly ease, for us?" Note while you tend to automatically think of the material side of life, pacemakers are machines connected directly to the human nervous system and help keep many people alive. People who say they do not want their life to depend on a machine seem quite conveniently to forget this. It seems to me in the long run it is on the intellectual side of life that machines can most contribute to the quality of life.
Why is the topic of artificial intelligence important? Let me take a specific example of the need for AI. Without defining things more sharply (and without defining either thinking or what a machine is there can be no real proof one way or the other), I believe very likely in the future we will have vehicles exploring the surface of Mars. The distance between Earth and Mars at times may be so large the signaling time round trip could be 20 or more minutes. In the exploration process the vehicle must, therefore, have a fair degree of local control. When having passed between two rocks, turned a bit, and then found the ground under the front wheels was falling away, you will want prompt, "sensible" action on the part of the vehicle. Simple, obvious things like backing up will be inadequate to save it from destruction, and there is not time to get advice from Earth; hence some degree of "intelligence" should be programmed into the machine.
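The round trip figure follows directly from the speed of light. The short Python calculation below is a sketch under stated assumptions: the closest and farthest Earth-Mars distances are rounded, approximate values supplied here for illustration (they are not given in the text), and the signal is assumed to travel in a straight line at the vacuum speed of light.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # exact by definition of the metre

# Approximate Earth-Mars distances (assumed round figures, in km).
CLOSEST_KM = 55e6
FARTHEST_KM = 400e6

def round_trip_minutes(distance_km):
    """Time for a signal to go out and a reply to come back, in minutes."""
    return 2 * distance_km / SPEED_OF_LIGHT_KM_S / 60

def distance_for_round_trip_km(minutes):
    """One-way distance at which the round trip takes the given time."""
    return minutes * 60 * SPEED_OF_LIGHT_KM_S / 2

if __name__ == "__main__":
    print(f"closest:  {round_trip_minutes(CLOSEST_KM):.1f} min")
    print(f"farthest: {round_trip_minutes(FARTHEST_KM):.1f} min")
    print(f"20 min round trip at {distance_for_round_trip_km(20) / 1e6:.0f} million km")
```

With these assumed distances the round trip ranges from roughly six minutes to well over forty, so a delay of 20 or more minutes holds once the planets are about 180 million km apart or farther. A vehicle whose front wheels are slipping cannot wait that long for instructions.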
This is not an isolated situation; it is increasingly typical as we use computer driven machines to do more and more things at higher and higher speeds. You cannot have a human backup, often because of the boredom factor which humans suffer from. They say piloting a plane is hours of boredom and seconds of sheer panic, not something humans were designed to cope with, though they manage to a reasonable degree. Speed of response is often essential. To repeat an example, our current fastest planes are basically unstable and have computers to stabilize them, millisecond by millisecond, which no human pilot could handle; the human can only supply the strategy in the large and leave the details in the small to the machine.
I earlier remarked on the need to get at least some understanding of what we mean by "a machine" and by "thinking". We were discussing these things at Bell Telephone Laboratories in the late 1940s and someone said a machine could not have organic parts, upon which I said the definition excluded any wooden parts! The first definition was retracted, but to be nasty I suggested in time we might learn how to remove a large part of a frog's nervous system and keep it alive. If we found how to use it for a storage mechanism, would it be a machine or not?
Defining Machine Thinking
- The definition of thinking is often biased by human exceptionalism, such as the Jesuit engineer's claim that thinking is exclusively what machines cannot do.
- The Turing Test attempts to define thinking through behavioral indistinguishability, though it bypasses the fundamental nature of the process.
- The author suggests that thinking might not be a binary 'yes-no' state but rather a matter of degree, moving away from the search for a 'smallest' thinking program.
- The history of chemistry serves as a parallel, where the 'vitalistic' belief that organic compounds required a life force was eventually overturned by laboratory synthesis.
- Religious and philosophical resistance to machine intelligence often stems from the discomfort of humans potentially creating entities in their own image.
- Historical attempts to quantify the human soul through physical measurements like weight have consistently failed to provide empirical evidence of a distinct vital essence.
As to the soul, in the Late Middle Ages some people, wanting to know when the soul departed from the dead body, put a dying man on a scale and watched for the sudden change in weight, but all they saw was a slow loss as the body decayed.
If we used it as a content-addressable storage, how would you feel about it being a "machine"?
In the same discussion, on the thinking side, a Jesuit-trained engineer gave the definition, "Thinking is what humans can do and machines cannot do". Well, that solves the problem once and for all, apparently. But do you like the definition? Is it really fair? As we pointed out to him then, if we start with some obvious difference at present then with improved machines and better programming we may be able to reduce the difference, and it is not clear in the long run there would be any difference left.
Clearly we need to define "thinking". Most people want the definition of thinking to be such that they can think but stones, trees, and such things cannot think. But people vary in the extent to which they will or will not include the higher levels of animals. People often make the mistake of saying, "Thinking is what Newton and Einstein did", but by that definition most of us cannot think, and usually we do not like that conclusion! Turing, in coping with the question, in a sense evaded it and made the claim that if at the end of one teletype line there was a human and at the end of another teletype line there was a suitably programmed machine, and if the average human could not tell the difference, then that was a proof of "thinking" on the part of the machine (program).
The Turing test is a popular approach, but it flies in the face of the standard scientific method which starts with the easier problems before facing the harder ones. Thus I soon raised the question with myself, "What is the smallest or close to the smallest program I would believe could think?" Clearly if the program were divided into two parts then neither piece could think. I tried thinking about it each night as I put my head on the pillow to sleep, and after a year of considering the problem and getting nowhere I decided it was the wrong question! Perhaps "thinking" is not a yes-no thing, but maybe it is a matter of degree.
Let me digress and discuss some of the history of chemistry. It was long believed organic compounds could only be made by living things; there was a vitalistic aspect in living things but not in inanimate things such as stones and rocks. But in 1828 a chemist named Wöhler synthesized urea, a standard by-product of humans. This was the beginning of making organic compounds in test tubes. Still, apparently even as late as 1850, the majority of chemists were holding to the vitalistic theory that only living things could make organic compounds. Well, you know from that attitude we have gone to the other extreme and now most chemists believe in principle any compound the body can make can also be made in the lab, but of course there is no proof of this, nor could there ever be. The situation is they have an increasing ability to make organic compounds, and see no reason they cannot make any compound that can exist in Nature as well as many which do not. Chemists have passed from the vitalistic theory of chemistry to the opposite extreme of a non-vitalistic theory of chemistry.
Religion unfortunately enters into discussions of the problem of machine thinking, and hence we have both vitalistic and non-vitalistic theories of "machines vs. humans". For the Christian religions their Bible says, "God made Man in His image". If we can in turn create machines in our image then we are in some sense the equal of God, and this is a bit embarrassing! Most religions, one way or the other, make man into more than a collection of molecules; indeed man is often distinguished from the rest of the animal world by such things as a soul, or some other property. As to the soul, in the Late Middle Ages some people, wanting to know when the soul departed from the dead body, put a dying man on a scale and watched for the sudden change in weight, but all they saw was a slow loss as the body decayed; apparently the soul, which they
The Stalemate of Artificial Intelligence
- The debate over AI hinges on whether humans possess a unique, non-material essence or are simply a collection of molecules in a radiant energy field.
- Skeptics often define 'thinking' as a moving target, specifically excluding any task a machine has already proven capable of performing.
- Hard AI proponents argue that human consciousness is a matter of programming and that current failures are due to human ignorance rather than essential limitations.
- The subjective nature of self-awareness creates a stalemate, as machines can claim to have souls without providing any verifiable proof of their internal state.
- Games like chess and 3D tic-tac-toe serve as primary AI testing grounds because their rules are unambiguous and success is clearly defined.
- Historically, AI leaders have made extravagant, unfulfilled predictions, yet their work continues to produce startling results in well-defined problem spaces.
Such people are forced, like the above mentioned Jesuit trained engineer, to make the definition of thinking to be what machines cannot do.
were sure the man had, did not have material weight.
Even if you believe in evolution, still there can be a moment when God, or the gods, stepped in and gave man special properties which distinguish him from the rest of living things. This belief in an essential difference between man and the rest of the world is what makes many people believe machines can never, unless we ourselves become like the gods, be the same as a human in such details as thinking, for example. Such people are forced, like the above-mentioned Jesuit-trained engineer, to make the definition of thinking to be what machines cannot do. Usually it is not so honestly stated as he did; rather it is disguised somehow behind a facade of words, but the intention is the same!
Physics regards you as a collection of molecules in a radiant energy field and there is, in strict physics, nothing else. Democritus (b. around 460 B.C.) said in ancient Greek times, "All is atoms and void". This is the stance of the hard AI people; there is no essential difference between machines and humans, hence by suitable programming machines can do anything humans can do. Their failure to produce thinking in significant detail is, they believe, merely the failure of programmers to understand what they are doing, and not an essential limitation.
At the other extreme of the AI scale, some of us, when considering our own feelings, believe we have self-awareness and self-consciousness, though we are not able to give satisfactory tests to prove these things exist. I can get a machine to print out, "I have a soul", or "I am self-aware", or "I have self-consciousness", and you would not be impressed with such statements from a machine. But from humans you are inclined to give greater credence to such remarks, based on the belief that you, by introspection, feel you have such properties (things), and you have learned by long experience in life other humans are similar to you, though clearly racism still exists which asserts there are differences, me being always the better person!
We are at a stalemate at this point in the discussion of AI; we can each assert as much as we please, but it proves nothing at all to most people. So let us turn to the record of AI successes and failures.
AI people have always made extravagant claims which have not been borne out, not even closely in most cases. Newell and Simon in 1958 predicted in 10 years the next world champion in chess would be a computer program. Unfortunately similar, as yet unrealized, claims have been made by most of the AI leaders in the public eye. Still, startling results have been produced.
I must again digress, this time to point out why game playing has such a prominent role in AI research. The rules of a game are clear beyond argument, and success or failure are also; in short, the problem is well defined in any reasonable sense. It is not that we particularly want machines to play games, but they provide a very good testing ground of our ideas on how to get started in AI.
Chess, from the beginning, was regarded as a very good test since it was widely believed at that time chess requires thinking beyond any doubt. Shannon proposed a way of writing chess playing programs (we call them chess playing machines but it is really mainly a matter of programming). Los Alamos, with a primitive MANIAC machine, tried 6×6 chess boards, dropping the two bishops on each side, and got moderate results. We will return to the history of chess playing programs later.
Let us examine how one might write a program for the much simpler game of three dimensional tic-tac-toe. We set aside simple two dimensional tic-tac-toe since it has a known strategy for getting a draw, and there is no possibility of a win against a prudent player. Games which have a known strategy of playing simply are not exhibiting thinking, so we believe at the moment.
As you examine the 4×4×4 cube there are 64 squares, and 76 straight lines through them. Any one line is
Heuristics of 4x4x4 Tic-Tac-Toe
- The 4x4x4 tic-tac-toe cube contains 16 'hot spots' consisting of corners and center locations that share a geometric duality.
- Randomness is essential in early game strategy to prevent opponents from systematically exploiting predictable patterns.
- Game logic follows a hierarchy of immediate win conditions, defensive blocking, and the creation or prevention of forks.
- Winning often depends on 'forcing moves' that maintain the initiative and compel the opponent into defensive positions.
- The transition from defensive play to an offensive sequence is a critical, non-exact science where timing determines victory.
- Computer game programs rely on heuristics (plausible but non-guaranteed rules) to navigate complex decision spaces.
Thus when to go on the attack is a touchy matter; too soon and you lose the initiative, too late and the opponent starts and wins.
a win if you can get all four of the positions filled with your pieces. You next note the 8 corner locations, and the 8 center locations, all have more lines through them than the others; indeed there is an inversion of the cube such that the center points go to the corners and the corners go to the center while preserving all straight lines; hence a duality which can be exploited if you wish.
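The counts quoted above can be checked mechanically. The following is a minimal sketch, not anything taken from the original program: it enumerates the straight lines of the 4×4×4 cube, counts how many pass through each cell, and tests one candidate for the inversion. The coordinate swap used for the inversion (0 with 1, 2 with 3 along every axis) is an assumption about the map the text has in mind, and all names are illustrative.

```python
from itertools import product

N = 4  # edge length of the cube

def winning_lines(n=N):
    """Return every straight line of n cells in an n x n x n cube."""
    directions = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]
    lines = set()
    for start in product(range(n), repeat=3):
        for d in directions:
            cells = [tuple(s + k * step for s, step in zip(start, d)) for k in range(n)]
            if all(0 <= c < n for cell in cells for c in cell):
                # frozenset: a line walked from either end is counted once
                lines.add(frozenset(cells))
    return lines

LINES = winning_lines()
CELLS = list(product(range(N), repeat=3))

# Number of lines through each cell; the maximum marks the "hot" spots.
through = {cell: sum(cell in line for line in LINES) for cell in CELLS}
most = max(through.values())
HOT = [cell for cell in CELLS if through[cell] == most]

# Assumed form of the inversion: swap 0<->1 and 2<->3 in every coordinate.
SWAP = {0: 1, 1: 0, 2: 3, 3: 2}

def invert(cell):
    return tuple(SWAP[c] for c in cell)

CORNERS = [c for c in CELLS if all(x in (0, 3) for x in c)]
lines_preserved = all(frozenset(invert(c) for c in line) in LINES for line in LINES)
corners_to_centers = all(all(x in (1, 2) for x in invert(c)) for c in CORNERS)

print(len(CELLS), len(LINES), len(HOT))   # cells, lines, hot spots
print(lines_preserved, corners_to_centers)
```

Under these assumptions the enumeration gives 64 cells, 76 lines, and 16 hot spots (the 8 corners and the 8 central cells), and the swap sends every line to a line while exchanging corners with centers.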
For a program to play 4×4×4 tic-tac-toe it is first necessary to pick legal moves. Then in the opening moves you tend to place your pieces on these "hot" spots, and you use a random strategy, since if you play a standard game then the opponent can slowly explore it until a weakness is uncovered which can be systematically exploited. This use of randomness, when there are essentially indifferent moves, is a central part of all game playing programs.
We next formulate some rules to be applied sequentially.
1. If you have 3 men on a line and it is still "open" then play it and win.
2. If you have no immediate win, and if the opponent has 3 men on a line, then you must block it.
3. If you have a fork (Figure 6.I), take it since then on the next move you have a win, as the opponent cannot win in one move.
4. If the opponent has a fork you must block it.
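The four rules, applied in order, might be sketched as follows. This is an illustration under stated assumptions rather than the program the text describes: the board is a hypothetical dictionary from cell to player mark, a "fork" is taken to mean a move that creates two or more simultaneous winning threats, blocking a fork is simplified to occupying the forking cell, and when no rule fires the sketch simply picks a random empty hot spot.

```python
import random
from itertools import product

N = 4
CELLS = list(product(range(N), repeat=3))

def all_lines():
    """All straight lines of N cells, each as a sorted tuple of cells."""
    dirs = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]
    found = set()
    for start in CELLS:
        for d in dirs:
            cells = [tuple(s + k * e for s, e in zip(start, d)) for k in range(N)]
            if all(0 <= c < N for cell in cells for c in cell):
                found.add(frozenset(cells))
    return [tuple(sorted(line)) for line in found]

LINES = all_lines()
# Corners and central cells lie on 7 lines each; every other cell lies on 4.
HOT = [c for c in CELLS if sum(c in line for line in LINES) == 7]

def completing_moves(board, player):
    """Empty cells that finish an open line already holding 3 of player's men."""
    moves = set()
    for line in LINES:
        marks = [board.get(c) for c in line]
        if marks.count(player) == 3 and marks.count(None) == 1:
            moves.add(line[marks.index(None)])
    return moves

def fork_moves(board, player):
    """Empty cells where a move would create two or more winning threats."""
    forks = set()
    for cell in CELLS:
        if cell in board:
            continue
        trial = dict(board)
        trial[cell] = player
        if len(completing_moves(trial, player)) >= 2:
            forks.add(cell)
    return forks

def choose_move(board, me, opponent, rng=random):
    """Apply rules 1 to 4 in order, then fall back to a random hot spot."""
    for candidates in (
        completing_moves(board, me),        # rule 1: win now
        completing_moves(board, opponent),  # rule 2: block an immediate loss
        fork_moves(board, me),              # rule 3: take a fork
        fork_moves(board, opponent),        # rule 4: block the opponent's fork
    ):
        if candidates:
            return rng.choice(sorted(candidates))
    empty_hot = [c for c in HOT if c not in board]
    return rng.choice(empty_hot or [c for c in CELLS if c not in board])

# Two open lines with two X's each, meeting at the empty cell (3, 0, 0).
board = {(0, 0, 0): "X", (1, 0, 0): "X", (3, 1, 0): "X", (3, 2, 0): "X"}
print(choose_move(board, "X", "O"))
```

The random choice among equally ranked candidates reflects the remark above that randomness among indifferent moves keeps an opponent from exploring and exploiting a fixed pattern of play.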
After this there are apparently no definite rules to follow in making your next move. Hence you begin to look for "forcing moves", ones which will get you to some place where you have a winning combination. Thus 2 pieces on an "open" line means you can place a third and the opponent will be forced to block the line (but you must be careful that the blocking move does not produce three in a line for the opponent and force you to go on the defensive). In the process of making several forcing moves you may be able to create a fork, and then you have a win! But these rules are vague. Forcing moves which are on "hot" places and where the opponent's defense must be on "cool" places seem to favor you, but do not guarantee a win.
In starting a sequence of forcing moves, if you lose the initiative, then almost certainly the opponent can start a sequence of forcing moves on you and gain a win. Thus when to go on the attack is a touchy matter; too soon and you lose the initiative, too late and the opponent starts and wins. It is not possible, so far as I know, to give an exact rule of when to do so.
This is the standard structure of a program to play a game on a computer. Programs must first check the move is legal before any other step, but this is a minor detail. Then there is usually a set of more or less formal rules to be obeyed, followed by some much vaguer rules. Thus a game program has a lot of heuristics in it (heuristic: to invent or discover), moves which are plausible and likely to lead you to a win, but are not guaranteed to do so.
Figure 6.I
Learning and Machine Intelligence
- Arthur Samuel's checker program demonstrated early machine learning by iteratively optimizing its own parameters through self-play.
- The program eventually surpassed its creator's skill level and defeated a state champion, challenging the notion of human superiority in strategy.
- The author draws a provocative parallel between a machine's programmed learning and a student's education in subjects like Euclidean geometry.
- The definition of 'learning' is often shifted by critics to exclude any process that can be explained mechanically or algorithmically.
- The text suggests that human intelligence may itself be a form of complex programming shaped by biological inheritance and chance events.
- True understanding of Artificial Intelligence requires moving beyond philosophical debate to practical experimentation and programming.
If you deny the machine learns from experience because you claim the program was told (by the human programmer) how to improve its performance, then is not the situation much the same with you, except you are born with a somewhat larger initial program compared to the machine when it leaves the manufacturer's hands?
Early in the field of AI Art Samuel, then at IBM, wrote a checker playing program, checkers being thought to be easier than chess which had proved to be a real stumbling block. The formula he wrote for playing checkers had a large number of rather arbitrary parameters in the weighting functions for making decisions, such as for control of the center, passed pieces, kings, mobility, pinned pieces, etc. Samuel made a copy of the program and then slightly altered one (or more) of these parameters. Then he made one formula play, say, ten games against the other, and the formula which won the most games was clearly (actually only probably) the better program. The machine went on perturbing the same parameters until it came to a local optimum, whereupon it shifted to other parameters. Thus it went around and around, repeatedly using the same parameters, gradually emerging with a significantly better checker playing program, certainly much better than was Samuel himself. The program even beat a Connecticut State checker champion!
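The perturb-and-play-off loop can be illustrated with a toy model. This sketch does not play checkers and is not Samuel's code: playing strength is replaced by a made-up function of three weights, a "game" is a noisy coin flip biased toward the stronger formula, and the target weights, step size, and round count are all assumptions chosen only to make the loop visible.

```python
import math
import random

TARGET = [0.8, 0.3, 0.5]  # hypothetical ideal weights, hidden from the learner

def strength(weights):
    """Toy stand-in for playing skill: closer to TARGET is stronger."""
    return -sum((w - t) ** 2 for w, t in zip(weights, TARGET))

def play_game(a, b, rng):
    """One noisy game: True if formula a beats formula b."""
    edge = strength(a) - strength(b)
    p_a_wins = 1.0 / (1.0 + math.exp(-20.0 * edge))
    return rng.random() < p_a_wins

def wins_match(challenger, incumbent, rng, games=10):
    """The challenger replaces the incumbent only by winning most of the games."""
    wins = sum(play_game(challenger, incumbent, rng) for _ in range(games))
    return wins > games / 2

def learn(weights, rng, rounds=200, step=0.05):
    """Cycle through the parameters, keeping any perturbation that wins its match."""
    weights = list(weights)
    for _ in range(rounds):
        for i in range(len(weights)):
            challenger = list(weights)
            challenger[i] += rng.choice((-step, step))
            if wins_match(challenger, weights, rng):
                weights = challenger
    return weights

rng = random.Random(0)  # fixed seed so the run is repeatable
start = [0.0, 0.0, 0.0]
final = learn(start, rng)
print(strength(start), strength(final))
```

Because each match is only ten noisy games, the winner is "actually only probably" the better formula, exactly as the text notes; the loop still drifts toward better weights because good changes win matches more often than bad ones.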
Is it not fair to say, "The program learned from experience"? Your immediate objection is there was a program telling the machine how to learn. But when you take a course in Euclidean geometry is not the teacher putting a similar learning program into you? Poorly, to be sure, but is that not, in a real sense, what a course in geometry is all about? You enter the course and cannot do problems; the teacher puts into you a program and at the end of the course you can solve such problems. Think it over carefully. If you deny the machine learns from experience because you claim the program was told (by the human programmer) how to improve its performance, then is not the situation much the same with you, except you are born with a somewhat larger initial program compared to the machine when it leaves the manufacturer's hands? Are you sure you are not merely "programmed" in life by what chance events happen to you?
We are beginning to find not only is intelligence not adequately defined so arguments can be settled scientifically, but a lot of other associated words like computer, learning, information, ideas, decisions (hardly a mere branching of a program, though branch points are often called decision points to make the programmers feel more important), and expert behavior are all a bit fuzzy in our minds when we get down to the level of testing them via a program in a computer. Science has traditionally appealed to experimental evidence and not idle words, and so far science seems to have been more effective than philosophy in improving our way of life. The future can, of course, be different.
In this chapter we have "set the stage" for a further discussion of AI. We have also claimed it is not a topic you can afford to ignore. Although there seem to be no hard, factual results, and perhaps there can never be since the very words are ill-defined and are open to modification and various interpretations, still you must come to grips with it. In particular, when a program is written which does meet some earlier specification for a reasonable test of computer learning, originality, creativity, or intelligence, then it is promptly seen by many people the test had a mechanical solution. This is true even if random numbers are involved, and given the same test twice the machine will get a solution which differs slightly from the earlier one, much as humans seldom play exactly the same game of chess twice in a row. What is a reasonable, practical test of machine learning? Or are you going to claim, as the earlier cited Jesuit-trained engineer did, by definition learning, creativity, originality, and intelligence are what machines cannot do? Or are you going to try to hide this blatant statement and conceal it in some devious fashion which does not really alter the situation?
In a sense you will never really grasp the whole problem of AI until you get inside and try your hand at
Confronting Artificial Intelligence Biases
- The perception of machine learning often shifts from 'impossible' to 'clever cheating' once the underlying mechanism is revealed.
- Progress in understanding computer potential requires a rigorous, formal critique of one's own internal beliefs.
- Students typically approach AI with strong biases, either for or against, which must be dismantled to achieve objectivity.
- The author argues that the computer revolution is in its infancy and will inevitably transform all organizations.
- Holding onto false beliefs about machine intelligence prevents individuals from participating meaningfully in future societal shifts.
Before the checker playing program which learned was exposed in simple detail, you probably thought machines could not learn from experience; now you may feel what was done was not learning but clever cheating.
finding what you mean and what machines can do. Before the checker playing program which learned was exposed in simple detail, you probably thought machines could not learn from experience; now you may feel what was done was not learning but clever cheating, though clearly the program modified its behavior depending on its experiences. You must struggle with your own beliefs if you are to make any progress in understanding the possibilities and limitations of computers in the intellectual area. To do this adequately you must formalize your beliefs and then criticize them severely, arguing one side against the other, until you have a fair idea of the strengths and weaknesses of both sides. Most students start out anti-AI; some are pro-AI; and if you are either one of these then you must try to undo your biases in this important matter. In the next chapter we will supply more surprising data on what machines have done, but you must make up your own mind on this important topic. False beliefs will mean you will not participate significantly in the inevitable and extensive computerization of your organization and society generally. In many senses the computer revolution has only begun!
7
Artificial Intelligence-II
Intellectual Machines and Emergent Thinking
- The author distinguishes between mechanical automation and artificial intelligence, focusing on the computer's role in intellectual rather than physical tasks.
- A comparison of biological and mechanical structures reveals that engineering often achieves natural goals through entirely different mechanisms, such as fixed wings versus flapping.
- The concept of emergence suggests that complex effects like friction or thinking may simply be artifacts of large-scale organization rather than inherent properties of individual parts.
- The speed of electronic signaling in computers vastly outpaces the biological nervous system, yet the challenge of 'thinking' remains a software problem rather than just a hardware scale issue.
- An early AI geometry program demonstrated 'originality' by discovering an elegant proof for isosceles triangles that bypassed traditional human constructions.
Perhaps it is not a separate thing, it is just an artifact of largeness.
In this book we are more concerned with the aid computers can give us in the intellectual areas than in the more mechanical areas, for example, manufacturing. In the mechanical area computers have enabled us to make better, preferable, and cheaper products, and in some areas they have been essential, such as space flights to the moon which could hardly be done without the aid of computers. AI can be viewed as complementary to robotics; it is mainly concerned with the intellectual side of the human rather than the physical side, though obviously both are closely connected in most projects.
Let us start again and return to the elements of machines and humans. Both are built out of atoms and molecules. Both have organized basic parts; the machine has, among other things, two-state devices both for storage and for gates, while humans are built of cells. Both have larger structures: arithmetic units, storage, control, and I/O for machines, and humans have bones, muscles, organs, blood vessels, nervous system, etc.
But let us note some things carefully. From large organizations new effects can arise. For example we believe there is no friction between molecules, but most large structures show this effect; it is an effect which arises from the organization of smaller parts which do not show the effect.
We should also note often when we engineer some device to do the same as Nature does, we do it differently. For example, we have airplanes which, generally, use fixed wings (or rotors), while birds mainly flap their wings. But we also do a different thing: we fly much higher and certainly much faster than birds can. Nature never invented the wheel, though we use wheels in many, many ways. Our nervous system is comparatively slow and signals with a velocity of around a few hundred meters per second, while computers signal at around 186,000 miles per second.
A third thing to note, before continuing with what AI has accomplished, is the human brain has many, many components in the form of nerves interconnected with each other. We want to have the definition of "thinking" to be something the human brain can do. With past failures to program a machine to think, the excuse is often given that the machine was not big enough, fast enough, etc. Some people conclude from this if we build a big enough machine then automatically it will be able to think! Remember, it seems to be more the problem of writing the program than it is building a machine, unless you believe, as with friction, enough small parts will produce a new effect: thinking from non-thinking parts. Perhaps that is all thinking really is! Perhaps it is not a separate thing, it is just an artifact of largeness. One cannot flatly deny this as we have to admit we do not know what thinking really is.
Returning again to the past accomplishments of AI. There was a geometry proving routine which proved theorems in classical school geometry much as you did when you took such a course. The famous theorem, "If two sides of a triangle are equal then the base angles are also equal", was given to the program, Figure 7.I. You would probably bisect the top angle, and go on to prove the two parts are congruent triangles, hence corresponding angles are equal. A few of you might bisect the third side, and draw the line to the opposite angle, again getting two congruent triangles. The proof the machine produced used no constructions but compared triangle ABC with triangle CBA, and then proved the self-congruence, hence equal angles.
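The self-congruence argument can be written out in a few lines. Since Figure 7.I is not reproduced here, the labeling is an assumption: B is taken as the apex, so the two equal sides are AB and CB and the base angles sit at A and C.

```latex
\begin{align*}
&\text{Given: } AB = CB \text{ in } \triangle ABC. \\
&\text{Compare } \triangle ABC \text{ with } \triangle CBA
  \text{ under } A \mapsto C,\; B \mapsto B,\; C \mapsto A: \\
&\qquad AB = CB, \qquad \angle ABC = \angle CBA, \qquad BC = BA. \\
&\text{By side-angle-side, } \triangle ABC \cong \triangle CBA,
  \text{ hence } \angle BAC = \angle BCA.
\end{align*}
```

No auxiliary line is drawn; the triangle is simply matched against its own mirror image, which is what makes the proof shorter than either bisection argument.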
Anyone looking at that proof will admit it is elegant, correct, and surprising. Indeed, the people who wrote the geometry proving program did not know it, nor was it widely known, though it is a footnote in my copy of Euclid. One is inclined to say the program showed "originality". The result was the program apparently showed "novelty" not put into the program by the designers; the program showed "creativity"; and all those sorts of good things.
The Paradox of Machine Intelligence
- The author argues that human education is essentially an inefficient process of 'loading a program' into a person, whereas machine programming is clean and permanent.
- Samuel's checker program and geometry theorem provers demonstrate behaviors that would be labeled as 'originality' or 'creativity' if performed by humans.
- A psychological paradox exists where the moment a program is written to perform a task, humans dismiss that task as a mere 'rote routine.'
- The 'hard AI' perspective posits that humans are biological machines, implying all intellectual feats can eventually be replicated by technology.
- The debate over machine consciousness hinges on whether the universe is governed solely by physics or if mysterious, unknown forces influence human thought.
- The author notes that physics currently cannot account for the vast majority of the universe, such as dark matter, complicating claims of scientific completeness.
Thus we have the paradox; the existence of the program automatically turns you against believing it is other than a rote process.
A bit of thinking will show the programmers gave the instructions in the program to first try to prove the given theorem, and then when stuck try drawing auxiliary lines. If that had been the way you were taught to do geometry then more of you would have found the above elegant proof. So, in a sense, it was programmed in. But, as I said before, what was the course in geometry you were taught except trying to load a program into you? Inefficiently, to be sure. That is the way with humans, but with machines it is clean, you just put the program in once and for all, and you do not need to endlessly repeat and repeat, and still have things forgotten!
Did Samuel's checker playing program show originality when it made surprising moves and defeated the State Checker Champion? If not, can you show you have originality? Just what is the test you will use to separate you from a computer program?
One can claim the checker playing program "learned" and the geometry theorem proving program showed "creativity", "originality", or whatever you care to call it. They are but a pair of examples of many similar programs which have been written. The difficulty in convincing you the programs have the claimed properties is simply once a program exists to do something you immediately regard what is done as involving nothing other than a rote routine, even when random numbers obtained from the real world are included in the program. Thus we have the paradox; the existence of the program automatically turns you against believing it is other than a rote process. With this attitude, of course, the machine can never demonstrate it is more than a "machine" in the classical sense; there is no way it can demonstrate, for example, it can "think".
The hard AI people claim man is only a machine and nothing else, and hence anything humans can do in the intellectual area can be copied by a machine. As noted above, most readers, when shown some result from a machine, automatically believe it cannot be the human trait that was claimed. Two questions immediately arise. One, is this fair? Two, how sure are you that you are not just a collection of molecules in a radiant energy field and hence the whole world is merely molecule bouncing against molecule? If you believe in other (unnamed, mysterious) forces how do they affect the motion of the molecules, and if they cannot affect the motion then how can they affect the real world? Is physics complete in its description of the universe, or are there unknown (to them) forces? It is a hard choice to have to make. [Aside: At the moment (1994) it is believed that 90% to 99% of the Universe is the so-called dark matter of which physics knows nothing except its gravitational attraction.]
Figure 7.I
The Digital Revolution in Music
- Digital music is created by sampling sound frequencies and quantizing amplitudes into numerical data that a computer can process.
- Computers can simulate any existing instrument by programming specific frequency combinations, attack, and decay patterns.
- Algorithmic composition allows computers to generate music by applying formal rules and using random numbers for creative choices.
- Digital technology represents the technical ceiling of music production, shifting the challenge from what is possible to what is worth producing.
- The feedback loop for composers is drastically shortened, allowing them to hear and refine their work instantly rather than waiting years for an orchestra.
- Conductors and producers gain absolute control over every millisecond and tonal fraction, removing the limitations of human performance.
It is now clearly a matter of what sounds are worth producing, not what can be done.
We now shift to some actual applications of computers in more cultural situations. Early in the Computer Revolution I watched Max Mathews and John R. Pierce at Bell Telephone Laboratories deal with music from computers. It will be clear later, if you do not know it now, once you decide how high a frequency you want to reproduce then the sampling rate is determined. Humans can hear up to about 18,000 cycles per second at best and then only when young; adults use a telephone at less than 8000 cycles per second and can generally recognize a voice almost at once. The quantizing of the sound track which represents the music (and no matter how many musical instruments there are there is a single sound track amplitude) does not introduce much further distortion. Hence, so the reasoning went, we can have the computer compute the height of a sound track at each time interval, put the number out as a voltage, pass it through a smoothing filter, and have the corresponding "music". A pure tone is easy, just a sine curve. Combinations of frequencies determine the sound of a single instrument, with its "attack" (meaning how the frequencies grow in amplitude as the note starts, and the decay later on), and other features. With a number of different instruments programmed, you can then supply the notes and have the sound of the music written out on the tape for later playing. You do not have to compute the numbers in real time; the computer can go as slowly as needed, and not even at a constant rate, but when the numbers are put on the tape and played at a uniform rate then you get the "music".
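The recipe of computing the height of the sound track at each time interval can be sketched in a few lines. This is only an illustration: the 40,000 samples per second (a little more than twice the 18,000 cycles per second quoted above, as the sampling theorem requires), the 16-bit quantization, the harmonic amplitudes, and the linear-attack, exponential-decay envelope are all assumed values, and the voltage output and smoothing filter are left out.

```python
import math

SAMPLE_RATE = 40_000  # samples per second (assumed; above twice 18,000 Hz)
LEVELS = 2 ** 16      # quantization levels (assumed 16-bit sound track)

def envelope(t, attack_s=0.02, decay_rate=4.0):
    """Amplitude at time t: linear rise during the attack, then exponential decay."""
    rise = min(1.0, t / attack_s)
    return rise * math.exp(-decay_rate * t)

def note(freq_hz, duration_s, harmonics=(1.0, 0.5, 0.25)):
    """Sample one note: a weighted sum of sine harmonics shaped by the envelope."""
    total = sum(harmonics)
    samples = []
    for i in range(int(SAMPLE_RATE * duration_s)):
        t = i / SAMPLE_RATE
        wave = sum(a * math.sin(2 * math.pi * freq_hz * (k + 1) * t)
                   for k, a in enumerate(harmonics)) / total
        samples.append(envelope(t) * wave)  # stays within [-1, 1]
    return samples

def quantize(samples, levels=LEVELS):
    """Map amplitudes in [-1, 1] onto signed integer levels."""
    half = levels // 2
    return [max(-half, min(half - 1, round(s * (half - 1)))) for s in samples]

track = quantize(note(440.0, 0.5))  # half a second of a 440 Hz tone
print(len(track), min(track), max(track))
```

Nothing here runs in real time; the list of numbers is computed at whatever pace the machine manages and only becomes sound when played back at the uniform rate `SAMPLE_RATE`. Changing the `harmonics` tuple and the envelope parameters is the toy equivalent of programming a different instrument.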
But why supply the notes? Why not have the computer also "compose"? There are, after all, many "rules of composition". And so they did, using the rules, and when there were choices they used random numbers to decide the next notes. At present we have both computer composed and computer played music; you hear a lot of it in commercials over radio and TV. It is cheaper, more controlled, and can make sounds which no musical instrument at present can make. Indeed, any sound which can appear on a sound track can be produced by a computer.
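A toy version of "rules plus random numbers" might look like the following. The three rules used (start on the tonic, move at most two scale steps at a time, end on the tonic) are invented for this sketch and are not the rules the Bell Laboratories programs used; the names `SCALE` and `compose` are likewise hypothetical.

```python
import random

SCALE = ["C", "D", "E", "F", "G", "A", "B"]

def compose(length, seed=None, max_step=2):
    """Pick notes at random among the choices the (made-up) rules allow."""
    if length < 2:
        raise ValueError("need at least a first and a last note")
    rng = random.Random(seed)
    index = 0                                   # rule: start on the tonic
    melody = [SCALE[index]]
    while len(melody) < length - 1:
        # rule: move, but by no more than max_step scale degrees
        choices = [j for j in range(len(SCALE))
                   if j != index and abs(j - index) <= max_step]
        index = rng.choice(choices)             # random numbers decide
        melody.append(SCALE[index])
    melody.append(SCALE[0])                     # rule: end on the tonic
    return melody

melody = compose(8, seed=1)
```

Fixing the seed makes a run repeatable; changing it gives a different melody that still obeys the same rules.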
Thus in a sense, computers are the ultimate in music. Except for the trivial details (of sampling rate and number of levels of quantization, which could be increased if you wanted to pay the price), the composers now have available any sound which can exist, at any rates, in any combinations, tempos, and intensities they please. Indeed, at present the "highest quality recording of music" is digital. There can be no future significant technical improvements. It is now clearly a matter of what sounds are worth producing, not what can be done. Many people now have digitally recorded music players and they are regarded as being far better than the older analog machines.
The machine also provides the composer with more immediate feedback to hear what was composed. Before this, the composer had often to wait years and years until fame reached out and the music composed earlier was first heard in real life rather than only in the imagination. Hence the composer can now develop a style at a much more rapid pace. From reading an issue of a journal devoted to computer music I get the impression a fairly elaborate computer setup is common equipment for today's composers of music, there are many languages for them to use, and they are using a wide variety of approaches to creating music in a combined human-machine effort.
The conductor of music now also has much more control. In the past the conductor, when making a recording, tried to get the best from the musicians, and often several takes were spliced to get the best recording they could, including "mixing" of the various microphone recordings. Now the conductor can get exactly what is wanted, down to the millisecond timing, fraction of a tone, and other qualities of the individual instruments being simulated.
AI and Human Potential
- Computers are shifting human focus from the world of physical things toward the world of abstract ideas.
- The author advocates for human-machine collaboration rather than competition, viewing machines as tools to free humans from routine labor.
- There is significant skepticism regarding the percentage of the population capable of transitioning from manual labor to complex programming.
- Job displacement primarily affects lower-level roles, while new opportunities emerge at higher levels of cognitive complexity.
- The difficulty in automating algebra stems from the lack of explicit, logical rules for concepts like simplification that humans handle intuitively.
- The 'new math' movement illustrates the absurdity of trying to define simple mathematical expressions through rigid, non-intuitive rules.
However, I have long publicly doubted you could take many coal miners and make them into useful programmers.
All the all too human musicians do not have to be perfect at the same time during a passage.
Here you see again the effects of computers and how they are pushing us from the world of things into
the world of ideas, and how they are supplementing and extending what humans can do.
This is the type of AI that I am interested in: what the human and machine can do together, and not the competition which can arise. Of course robots will displace many humans doing routine jobs. In a very real sense, machines can best do routine jobs, thus freeing humans for more humane jobs. Unfortunately, many humans at present are not equipped to compete with machines; they are unable to do much more than routine jobs. There is a widespread belief (hope?) humans can compete, once they are given proper training. However, I have long publicly doubted you could take many coal miners and make them into useful programmers. I have my reservations on the fraction of the human population who can be made into programmers in the classical sense; if you call getting money from a bank dispensing machine "programming", or the dialing of a telephone number (both of which apply the human input to an elaborate program which is then executed much like an interpreter acts on your program input), then of course most people can be made into programmers. But if you mean the more classical activity of careful analysis of a situation and then the detailed specification as to what is to be done, then I say there are doubts as to what fraction of the population can compete with computers, even with nice interactive prompting menus.
Computers have both displaced so many people from jobs, and also made so many new jobs, it is hopeless to try to answer which is the larger number. But it is clear that on the average it is the lower level jobs which are disappearing and the higher level jobs which are appearing. Again, one would like to believe most people can be trained in the future to the higher level jobs, but that is a hope without any real evidence.
Besides games, geometry, and music we have algebra manipulating programs; they tend to be more "directed" programs than "self-standing" programs, that is, they depend on humans for guidance at various stages of the manipulation. At first it is curious we could build a self-standing geometry program but apparently cannot do the same easily for algebra. Simplification is one of the troubles. You may not have noticed, when you took an algebra course and you were told "to simplify an expression", you were probably not given an explicit rule for "simplification", and if you were then the rule was obviously ridiculous. For example, at least one version of the "new math" said
is not simplified but
Machine Diagnosis and Legal Liability
- The definition of simplification is context-dependent and varies based on the intended next step in a process.
- Computer-assisted synthesis in chemistry allows for rapid exploration of costs, yields, and reaction times.
- Machines are increasingly replacing unreliable human analysis in medical measurements due to superior speed and consistency.
- While machines can store vast knowledge of rare diseases, legal liability remains a primary barrier to replacing human doctors.
- The legal system forgives human error under 'due prudence' but lacks a clear framework for suing a machine or its programmer.
- Rising medical costs are driven by the increasing complexity of treatments rather than the efficiency gains provided by computers.
But with a machine error whom do you sue? The machine? The programmer? The experts who were used to get the rules?
is simplified!
We constantly use the word "simplify", but its meaning depends on what you are going to do next, and there is no uniform definition. Thus, if in the calculus you are going to integrate next, you break things up into small pieces, but at other times you try to combine the parts into nice product or quotient expressions.
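As a small worked illustration (the particular fraction is chosen here, not taken from the text), the same expression has two "simplified" forms depending on the next step. If you are going to integrate, the broken-up form on the right of the first equation is the useful one:

```latex
\frac{1}{x^{2}-1} = \frac{1}{2}\left(\frac{1}{x-1} - \frac{1}{x+1}\right),
\qquad
\int \frac{dx}{x^{2}-1} = \frac{1}{2}\ln\left|\frac{x-1}{x+1}\right| + C .
```

If instead you are going to cancel against another factor or read off where the expression blows up, the single quotient on the left is the simpler form.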
A similar "guidance by human" interacting program has been developed for the synthesis of chemical compounds. It has been quite useful as it gives: (1) the possible routes to the synthesis, (2) the costs, (3) the times of the reactions along the way, and (4) the effective yields. Thus the programmer using it can explore many various ways of synthesizing a new compound, or re-explore old ones to find new methods now that the costs of the materials and processes have changed from what they were some years ago.
Much of the medical measurement of blood samples, etc. has gone to machine analysis rather than using unreliable humans looking through microscopes. It is faster, more reliable, and more cost effective in most cases. We could go further in medicine and do medical diagnosis by machines, thus replacing doctors. Indeed, in this case it is apt to be the machine which is prompting the doctor during the diagnosis! There have long been on the market self-diagnosis kits for some diseases. That is nothing new. It is merely the going farther and prescribing the treatment that bothers people.
We know doctors are human and hence unreliable, and often in the case of rare diseases the doctor may never have seen a case before, but a machine does not forget and can be loaded with all the relevant diseases. Hence from the symptoms the program can either diagnose or call for further tests to establish the probable disease. With probabilities programmed in (which can adjust rapidly for current epidemics), machines can probably do better in the long run than can the average, or even better than the average, doctor; and it is the average doctors who must be the ones to treat most people! The very best doctors can personally treat (unaided by machines) only very few of the whole population.
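One way to read "probabilities programmed in" is as a Bayes' rule calculation, where an epidemic simply changes the prior probability of a disease. The sketch below assumes symptoms are independent given the disease, and every number in it is invented purely for illustration; it is not a description of any real diagnostic program, and the names `diagnose` and `recommend` are hypothetical.

```python
def diagnose(priors, likelihoods, symptoms, default=0.01):
    """Posterior probability of each disease given the observed symptoms.

    Assumes symptoms are independent given the disease (a naive Bayes
    simplification made for this sketch only).
    """
    scores = {}
    for disease, prior in priors.items():
        score = prior
        for symptom in symptoms:
            score *= likelihoods[disease].get(symptom, default)
        scores[disease] = score
    total = sum(scores.values())
    return {disease: score / total for disease, score in scores.items()}

def recommend(posterior, threshold=0.8):
    """Name the probable disease, or call for further tests if unsure."""
    disease, p = max(posterior.items(), key=lambda item: item[1])
    return disease if p >= threshold else "further tests"

# Invented numbers, for illustration only.
likelihoods = {"flu": {"fever": 0.9}, "cold": {"fever": 0.1}}
normal = diagnose({"flu": 0.1, "cold": 0.9}, likelihoods, ["fever"])
epidemic = diagnose({"flu": 0.5, "cold": 0.5}, likelihoods, ["fever"])
```

With the ordinary priors the single symptom leaves the two candidates evenly matched, so the sketch calls for further tests; raising the flu prior, as during an epidemic, tips the same symptom toward a diagnosis.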
One major trouble is, among others, the legal problem. With human doctors, so long as they show "due prudence" (in the legal language), then if they make a mistake the law forgives them; they are after all only human (to err is human). But with a machine error whom do you sue? The machine? The programmer? The experts who were used to get the rules? Those who formulated the rules in more detail? Those who organized them into some order? Or those who programmed these rules? With a machine you can prove by detailed analysis of the program, as you cannot prove with the human doctor, that there was a mistake, a wrong diagnosis. Hence my prediction is you will find a lot of computer assisted diagnosis made by doctors, but for a long time there will be a human doctor at the end between you and the machine. We will slowly get personal programs which will let you know a lot more about how to diagnose yourself, but there will be legal troubles with such programs. For example, I doubt you will have the authority to prescribe the needed drugs without a human doctor to sign the order. You, perhaps, have already noted all the computer programs you buy explicitly absolve the sellers from any, and I mean any, responsibility for the product they sell! Often the legal problems of new applications are the main difficulty, not the engineering!
If you have gone to a modern hospital you have seen the invasion of computers; the field of medicine has been very aggressive in using computers to do a better, and better, job. Better in cost reduction, accuracy, and speed. Because medical costs have risen dramatically in recent years you might not think so, but it is the elaboration of the medical field which has brought the costly effects that dominate the gains in lower costs the computers provide.
Computers in Specialized Labor
- Computers have become essential in healthcare for managing administrative red tape and monitoring patients with a vigilance that human nurses cannot match alone.
- Early symbolic manipulation programs in mathematics, such as Slagle's 1961 integration program, demonstrated that machines could compete with MIT engineers in abstract calculus.
- The complexity of modern integrated circuits, containing over a million transistors, has reached a point where human design is impossible without computer-driven automation.
- While robots excel in controlled production environments, they struggle with nonroutine situations where unexpected obstacles can lead to disaster.
- Future robotic applications, such as naval damage control, prioritize machine endurance in hostile environments where human life would otherwise be at risk.
No human mind could go reliably through the layout of more than a million transistors on a chip; it would be a hopeless task.
The computers do the billing, scheduling, and record keeping for the mechanics of the hospital, and even private doctors are turning to computers to assist them in their work. To some extent the Federal bureaucracy is forcing them to do so to cope with the red tape surrounding the field.
In many hospitals computers monitor patients in the emergency ward, and sometimes in other places when necessary. The machines are free from boredom, rapid in response, and will alert a local nurse to do something promptly. Unaided by computers it is doubtful full time nurses could equal the combination of computer and nurse.
In Mathematics, one of the earliest programs (1953) which did symbol manipulation was a formal differentiation program to find higher derivatives. It was written so they could find the first 20 terms of a power series of a complicated function. As you ought to know from the calculus, differentiation is a simple formal process with comparatively few rules. At the time you took the course it must have seemed to be much more than that, but you were probably confusing the differentiation with the later necessary simplification and other manipulations of the derivatives. Another very early abstract symbol manipulation program was coordinate changing, needed for guided missiles, radars, etc. There is an extra degree of freedom in all radars so the target cannot fly over the end of an axis of rotation and force the radar to slew 180° to track it. Hence coordinate transformations can be a bit messier than you might think.
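To see how few rules formal differentiation needs, here is a minimal sketch, not a reconstruction of the 1953 program. Expressions are nested tuples such as `("*", 2, "x")`; that representation and the names `diff`, `evaluate`, and `nth_derivative` are assumptions made for the example, and only sums, products, and constant powers are handled.

```python
def diff(expr, var="x"):
    """Differentiate a nested-tuple expression with respect to var."""
    if isinstance(expr, (int, float)):
        return 0                                # constant rule
    if isinstance(expr, str):
        return 1 if expr == var else 0          # variable rule
    op, a, b = expr
    if op == "+":                               # sum rule
        return ("+", diff(a, var), diff(b, var))
    if op == "*":                               # product rule
        return ("+", ("*", diff(a, var), b), ("*", a, diff(b, var)))
    if op == "^":                               # power rule, constant exponent b
        return ("*", ("*", b, ("^", a, b - 1)), diff(a, var))
    raise ValueError(f"unknown operator: {op!r}")

def evaluate(expr, x):
    """Numerically evaluate a single-variable expression at x."""
    if isinstance(expr, (int, float)):
        return expr
    if isinstance(expr, str):
        return x
    op, a, b = expr
    left, right = evaluate(a, x), evaluate(b, x)
    if op == "+":
        return left + right
    if op == "*":
        return left * right
    return left ** right                        # op == "^"

def nth_derivative(expr, n, var="x"):
    """Apply diff n times to get a higher derivative."""
    for _ in range(n):
        expr = diff(expr, var)
    return expr

f = ("+", ("^", "x", 3), ("*", 2, "x"))         # x^3 + 2x
first = diff(f)
second = nth_derivative(f, 2)
```

The derivatives come out correct but cluttered with terms like `("*", 0, "x")`: the differentiation itself is the easy part, and the tidying up afterward is the separate simplification problem the text describes.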
Slagle, a blind scientist, wrote (in a thesis at MIT, 1961) a program which would do analytical integration much as you did in the calculus course. It could compete with the average undergraduate engineer at MIT, in both the range of integrals it could do and in the cost of doing them. Since then we have had much improvement, and there is supposed to be a program based on the famous Risch algorithm that is supposed to find any integral which can be done in closed form, but after years of waiting and waiting I have not seen it. There are, they tell me, integration programs which will get the closed form answer or else prove it cannot exist.
In the form of robots the computers have invaded production lines of hard goods as well as drugs, etc. Computers are now assembled by robots which are driven by computers, and the integrated circuit chips the computers are built of are designed mainly by computers with some direction from humans. No human mind could go reliably through the layout of more than a million transistors on a chip; it would be a hopeless task. The design programs clearly have some degree of artificial intelligence. In restricted areas, where there can be no surprises, robots are fairly effective, but where unexpected things can happen then simple robots are often in serious trouble. A routine response to nonroutine situations can spell disaster.
An obvious observation for the Navy, for example: if on a ship you are going to have mobile robots (and you need not have all of your robots mobile) then running on rails from the ceiling will mean things which fall to the deck will not necessarily give trouble when both the robot and the ship are in violent motion. That is another example of what I have been repeatedly saying: when you go to machines you do an equivalent job, not the same one. Things are bound to be on the deck where they are not supposed to be, having fallen there by accident, by carelessness, by battle damage, etc., and having to step over, or around, them is not as easy for a robot as for a human.
Another obvious area for mobile robots is in damage control. Robots can stand a much more hostile environment, such as a fire, than can humans, even when humans are clothed in asbestos suits. If in doing the job rapidly some of the robots are destroyed it is not the same as dead humans.
Machine Intelligence and Human Insight
- The advancement of chess-playing machines relies on massive computational volume rather than mimicking human psychological processes or insight.
- The original goal of using computers to study human thought has been largely abandoned in favor of simply winning games through brute force.
- Artificial intelligence produces psychological novelty, where programmers are surprised by outcomes, even if the machine follows strict logical rules.
- The concept of logical novelty is questioned, suggesting that human discoveries may also be the result of past experiences rather than true originality.
- The 'monkeys and typewriters' theory illustrates the idea that a random source could theoretically produce all known knowledge given infinite time.
That, at least, is what they think they think; what the human mind actually does when playing chess is another matter!
The Navy now has remote controlled mine sweepers because when you lose a ship you do not lose a human crew. We regularly use robot control when doing deep sea diving, and we have unmanned bombers these days.
Returning to chess as played by machines. The programs have been getting steadily more effective and it appears to be merely a matter of time until machines can beat the world chess champion. But in the past the path to better programs has been mainly through the detailed examination of possible moves projected forward many steps rather than by understanding how humans play chess. The computers are now examining millions of board positions per second, while humans typically examine maybe 50 to 100 at most before making a move; so they report when they are supposed to be cooperating with the psychologists. That, at least, is what they think they think; what the human mind actually does when playing chess is another matter! We really do not know!
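The "detailed examination of possible moves projected forward many steps" can be shown on a game far smaller than chess. The sketch below uses a simple take-away game chosen for this illustration (players alternately remove one to three stones, and whoever takes the last stone wins) and searches the whole game tree with no insight at all; the function name and the scoring convention are assumptions of the sketch, not anything from a chess program.

```python
def best_move(stones, max_take=3):
    """Exhaustive game-tree search.

    Returns (score, move, positions_examined), where score is +1 if the
    player to move can force a win and -1 otherwise.
    """
    examined = 0

    def search(n):
        nonlocal examined
        examined += 1
        if n == 0:
            # No stones left: the previous player took the last one and won.
            return -1, None
        best_score, best = -2, None
        for take in range(1, min(max_take, n) + 1):
            score, _ = search(n - take)
            score = -score                      # opponent's gain is our loss
            if score > best_score:
                best_score, best = score, take
        return best_score, best

    score, move = search(stones)
    return score, move, examined

score, move, examined = best_move(10)
```

Even for ten stones the search visits hundreds of positions to find what a person sees at once (leave your opponent a multiple of four), which is the contrast between volume of computation and insight in miniature.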
In other games machines have been more successful. For example, I am told a backgammon playing program beat all the winners of a contest held recently in Italy. But some simple games, like the game of Go, simple in the rules only, remain hard to program a machine to play a first class game.
To summarize, in many games and related activities machines have been programmed to play very well, in some few games only poorly. But often the way the machine plays may be said "to solve the problem by volume of computations", rather than by insight, whatever "insight" means! We started to play games on computers to study the human thought processes and not to win the game; the goal has been perverted to win, and never mind the insight into the human mind and how it works.
Let me repeat myself: artificial intelligence is not a subject you can afford to ignore; your attitude will put you in the front or the rear of the applications of machines in your field, but also may lead you into a really great fiasco!
This is probably the place to introduce a nice distinction between logical and psychological novelty. Machines do not produce logical novelty when working properly, but they certainly produce psychological novelty; programmers are constantly being surprised by what the program they wrote actually does! But can you as a human produce logical novelty? A careful examination of people's reports on their great discoveries often shows they were led by past experiences to finding the result they did. Circumstances led them to success; psychological but not logical novelty. Are you not prepared by past experiences to do what you do, to make the discoveries you do? Is logical novelty actually possible?
Do not be fooled into thinking that psychological novelty is trivial. Once the postulates, definitions, and the logic are given, then all the rest of mathematics is merely psychologically novel; at that level there is in all of mathematics technically no logical novelty!
There is a common belief that if we appeal to a random source for making decisions then we escape the vicious circle of molecule banging against molecule, but from whence comes this external random source except the material world of molecules?
There is also the standard claim a truly random source contains all knowledge. This is based on a variant of the monkeys and the typewriters story. Ideally you have a group of monkeys sitting at typewriters and at random times they hit random keys. It is claimed in time one of them will type all the books in the British Museum in the order in which they are on the shelves! This is based on the argument that sooner or later a monkey will hit the right first key; indeed in infinite time this will happen infinitely often. Among this infinite number of times there will be some (an infinite number) in which the next key is hit correctly. And so it goes; in the fullness of infinite time the exact sequence of key strokes will occur.
The Paradox of Machine Thought
- Knowledge theoretically exists within random noise, but the inability to recognize it makes filtering information impossible.
- The debate over free will is unresolved because no experiment can prove its existence, yet we deny it to others through environmental determinism.
- Thinking may be defined by the process rather than the result, suggesting that routine tasks are conditioned responses rather than true thought.
- The 'Hard AI' perspective focuses solely on results, which allows humans to maintain a sense of superiority until machines match their output.
- Humans experience a conflict between wanting machines to think for utility and fearing the loss of self-importance if they do.
- The threat of machines surpassing human professionals like doctors creates a deep existential anxiety about our own value.
The logic of the situation is inescapable; the reality is hardly believable!
This is the basis for the claim that all of knowledge resides in a truly random source, and you can get it easily if you can write a program to recognize "information". For example, sooner or later the next theory of physics will occur in the random stream of noise, and if you can recognize it you will have filtered it out of the stream of random numbers! The logic of the situation is inescapable; the reality is hardly believable! The times to wait are simply too long, and in truth you cannot always recognize "information" even when you see it.
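A rough calculation shows how long "too long" is. The sketch assumes a 27-key typewriter (26 letters plus the space bar) and a source producing a billion random keys per second; both figures are assumptions made here, and the estimate is only an order of magnitude, since any particular window of n keys matches the target with probability 27 to the power minus n.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600

def log10_expected_keystrokes(text_length, alphabet_size=27):
    """log10 of alphabet_size ** text_length.

    A given window of text_length random keys matches the target with
    probability alphabet_size ** -text_length, so the wait is of this order.
    """
    return text_length * math.log10(alphabet_size)

def log10_years_to_wait(text_length, keys_per_second=1e9, alphabet_size=27):
    """log10 of the waiting time in years at the assumed typing rate."""
    return (log10_expected_keystrokes(text_length, alphabet_size)
            - math.log10(keys_per_second)
            - math.log10(SECONDS_PER_YEAR))

phrase = "to be or not to be"                   # 18 characters
keystrokes_exponent = log10_expected_keystrokes(len(phrase))
years_exponent = log10_years_to_wait(len(phrase))
```

Under these assumptions a single 18-character phrase already takes on the order of a billion years to appear, and each extra character multiplies the wait by 27; a whole library is far out of reach.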
There is an old claim that "free will" is a myth: in a given circumstance, you being you as you are at the moment, you can only do as you do. The argument sounds cogent, though it flies in the face of your belief you have free will. To settle the question, what experiment would you do? There seems to be no satisfactory experiment which can be done. The truth is we constantly alternate between the two positions in our behavior. A teacher has to believe if only the right words were said then the student would have to understand. And you behave similarly when raising a child. Yet the feeling of having free will is deep in us and we are reluctant to give it up for ourselves, but we are often willing to deny it to others!
As another example of the tacit belief in the lack of free will in others, consider that when there is a high rate of crime in some neighborhood of a city many people believe the way to cure it is to change the environment; hence the people will have to change and the crime rate will go down!
These are merely more examples to get you involved with the question of, "Can machines think?"
Finally, perhaps thinking should be measured not by what you do but how you do it. When I watch a child learning how to multiply two, say three digit, numbers, then I have the feeling the child is thinking; when I do the same multiplication I feel I am more doing "conditioned responses"; when a computer does the same multiplication I do not feel the machine is thinking at all. In the words of the old song, "It ain't what you do, it's the way that you do it". In the area of thinking maybe we have confused what is done with the way it is done, and this may be the source of much of our confusion in AI.
The hard AI people will accept only what is done as a measure of success, and this has carried over into many other people's minds without carefully examining the facts. This belief, "the results are the measure of thinking", allows many people to believe they can "think" and machines cannot, since machines have not as yet produced the required results.
The situation with respect to computers and thought is awkward. We would like to believe, and at the same time not believe, machines can "think". We want to believe because machines could then help us so much in our mental world; we want to not believe to preserve our feeling of self-importance. The machines can defeat us in so many ways (speed, accuracy, reliability, cost, rapidity of control, freedom from boredom, bandwidth in and out, ease of forgetting old and learning new things, hostile environments, and personnel problems) that we would like to feel superior in some way to them; they are, after all, our own creations! For example, if machine programs could do a significantly better job than the current crop of doctors, where would that leave them? And by extension where would we be left?
Two of the main sticky points are: (1) if a machine does it then it must be an algorithm and cannot be
The Limits of AI
- The fundamental gap between physical molecular movement and the emergence of self-awareness remains an unsolved mystery.
- Current discussions on AI are hampered by a lack of clear definitions for terms like thinking and consciousness.
- The recursive nature of using language to analyze language processing creates inherent uncertainty in the field.
- AI should not be dismissed despite the false claims of experts, as its limits remain an open and vital question for the future.
- Thinking may be a matter of degree or a specific process of execution rather than a binary state of being.
- Defining what evidence would be required to change one's mind is essential for an objective evaluation of machine intelligence.
We simply do not know what we are talking about; the very words are not defined, nor do they seem definable in the near future.
thinking, and (2) on the other hand how do we escape the molecule banging against molecule we apparently are; by what forces do our thinking, our self-awareness, and our self-consciousness affect the paths of the molecules?
In two previous chapters I closed with estimates of the limits of both hardware and software, but in these two chapters on AI I can do very little. We simply do not know what we are talking about; the very words are not defined, nor do they seem definable in the near future. We have also had to use language to talk about language processing by computers, and the recursiveness of this makes things more difficult and less sure. Thus the limits of applications, which I have taken to be the general topic of AI, remain an open question, but one which is important for your future career. Thus AI requires your careful thought and should not be dismissed lightly just because many experts make obviously false claims.
8
Artificial Intelligence III
I suggest you pause and have two discussions with yourself, the first on the topic,
Can Machines Think?
and review why it is important to come to your own evaluation of what machines can and cannot do in
your future. Consider the following list of observations:
1. Just because computers have not yet been programmed to think does not mean they cannot think; it may mean programmers are stupid!
2. Just because you want to believe that machines can think does not mean they can; it may only be wishful thinking!
3. Art Samuel's checker program "learned" from experience, so machines can apparently learn from experience.
4. The new proof in the isosceles triangle theorem showed "originality", perhaps as much as you have ever done!
5. Try to imagine the shortest, or close to the shortest, program you believe could think. No subpiece could think by definition.
6. Remember "logical" and "psychological" novelty.
7. Whatever your opinion is, what evidence would you accept you are wrong?
8. Thinking may be a matter of degree and not a yes/no thing.
9. Consider thinking may be the way something is done rather than what is done which determines
Man-Machine Symbiosis and Resistance
- AI research historically focuses on the outcomes of tasks rather than the internal processes of how they are achieved.
- Human resistance to machine control is often hypocritical, as people already rely on computers for life-critical functions like pacemakers and flight stabilization.
- The argument that machines cannot do what humans do ignores the reality that machines already perform many tasks that are impossible for humans.
- Religious beliefs often underpin the conviction that humans are unique, yet these arguments are rarely articulated clearly in secular or diverse settings.
- The focus should shift from human-machine conflict to the potential of man-machine combinations, moving past ego-driven superiority.
- Machines offer distinct advantages over human experts, including speed, accuracy, freedom from boredom, and ease of retraining.
It is the combination of man-machine which is important, and not the supposed conflict which arises from their all too human egos.
whether it occurs or not. AI has traditionally stuck to the "what is done" and seldom considered the "how it is done".
You could begin your discussion with my observation that whichever position you adopt there is the other side, and I do not care what you believe so long as you have good reasons and can explain them clearly. That is my task, to make you think on this awkward topic, and not to give any answers.
Year after year such discussions are generally quite hostile to machines, though it is getting less so every year. They often start with remarks such as, "I would not want to have my life depend on a machine", to which the reply is, "You are opposed to using pacemakers to keep people alive?" Modern pilots cannot control their airplanes but must depend on machines to stabilize them. In the emergency ward of modern hospitals you are automatically connected to a computer which monitors your vital signs and under many circumstances will call a nurse long before any human could note and do anything. The plain fact is your life is often controlled by machines and sometimes they are essential to your life; you just do not like to be reminded of it.
"I do not want machines to control my life." Then you do not want stop and go lights at intersections! See above for some other answers. Often humans can cooperate with a machine far better than with other humans!
"Machines can never do things humans can do." I observe in return machines can do things no human can do. And in any case, how sure are you, for any clearly prespecified thing machines (programs) apparently cannot now do, that in time they still could not do it better than humans can? (Perhaps "clearly specified" means you can write a program!) And in any case how relevant are these supposed differences to your career?
People are generally sure they are more than a machine, but usually can give no real argument as to why there is a difference, unless they appeal to their religion, and with foreign students of very different faiths around they are reluctant to do so; though obviously most (though not all) religions share the belief man is different, in one way or another, from the rest of life on Earth.
Another level of objections to the use of computers is in the area of experts. People are sure the machine can never compete, ignoring all the advantages the machines have (see end of Chapter 1). These are: economics, speed, accuracy, reliability, rapidity of control, freedom from boredom, bandwidth in and out, ease of retraining, hostile environments, and personnel problems. They always seem to cling to their supposed superiority rather than try to find places where machines can improve matters! It is difficult to get people to look at machines as a good thing to use whenever they will work; they keep their feelings people are somehow superior in some area, and of course there are such areas, but at present they are seldom where you first think they are. It is the combination of man-machine which is important, and not the supposed conflict which arises from their all too human egos.
A second useful discussion is on the topic:
Thinking About Future Applications
- The author emphasizes the importance of sensitizing oneself to future technological possibilities rather than just reviewing past or present applications.
- There is a noted difficulty in getting experts to aggressively reimagine how their own specific fields could be transformed by computers.
- The author suggests that people might be less inhibited and more creative when applying computer logic to areas outside their narrow specialties.
- Readers are encouraged to confront the 'awkward' topic of machine intelligence and develop a clear vision for their personal futures.
- The text advocates for a dialectical approach to belief, where one must argue against their own certainties to achieve true clarity.
- The primary goal of the author is not to dictate belief, but to force the reader to articulate and defend their own positions.
I have some times wondered whether it might be better if I asked people to apply computers to other areas of application than their own narrow speciality; perhaps they would be less inhibited there!
Future applications of computers to their area of expertise.
All too often people report on past and present applications, which is good, but not on the topic whose purpose is to sensitize you to future possibilities you might exploit. It is hard to get people to aggressively think about how things in their own area might be done differently. I have sometimes wondered whether it might be better if I asked people to apply computers to other areas of application than their own narrow speciality; perhaps they would be less inhibited there!
Since the purpose, as stated above, is to get the reader to think more carefully on the awkward topics of machines "thinking" and their vision of their personal future, you the reader should take your own opinions and try first to express them clearly, and then examine them with counter arguments, back and forth, until you are fairly clear as to what you believe and why you believe it. It is none of the author's business in this matter what you believe, but it is the author's business to get you to think and articulate your position clearly. For readers of the book I suggest instead of reading the next pages you stop and discuss with yourself, or possibly friends, these nasty problems; the surer you are of one side the more you should probably argue the other side!
9
n-Dimensional space
Designing in High Dimensional Space
- The author reflects on a career at Bell Labs, realizing that complex engineering design problems actually occur in n-dimensional space where each parameter represents a dimension.
- Human intuition is often limited to two dimensions; even in a three-dimensional world, life forms like fish or airplanes must congregate in specific areas to ensure encounters.
- Mathematical constructs of n-dimensional space are essential for understanding the behavior of systems with many independent variables.
- The Pythagorean theorem naturally extends into higher dimensions, where the square of the diagonal equals the sum of the squares of all mutually perpendicular sides.
- To understand the 'size' of restricted design spaces, one must calculate the volume of n-dimensional spheres using tools like Stirling's approximation.
You think you live in three dimensions, but in many respects you live in a two dimensional space.
When I became a professor, after 30 years of active research at Bell Telephone Laboratories, mainly in the Mathematics Research Department, I recalled professors are supposed to think and digest past experiences. So I put my feet up on the desk and began to consider my past. In the early years I had been mainly in computing so naturally I was involved in many large projects which required computing. Thinking about how things worked out on several of the large engineering systems I was partially involved in, I began, now I had some distance from them, to see they had some common elements. Slowly I began to realize the design problems all took place in a space of n-dimensions, where n is the number of independent parameters. Yes, we build three dimensional objects, but their design is in a high dimensional space, 1 dimension for each design parameter.
I also need high dimensional spaces so later proofs will become intuitively obvious to you without filling in the details rigorously. Hence we will discuss n-dimensional space now.
You think you live in three dimensions, but in many respects you live in a two dimensional space. For example, in the random walk of life, if you meet a person you then have a reasonable chance of meeting that person again. But in a world of three dimensions you do not! Consider the fish in the sea who potentially live in three dimensions. They go along the surface, or on the bottom, reducing things to two dimensions, or they go in schools, or they assemble at one place at the same time, such as a river mouth, a beach, the Sargasso Sea, etc. They cannot expect to find a mate if they wander the open ocean in three dimensions. Again, if you want airplanes to hit each other, you assemble them near an airport, put them in two dimensional levels of flight, or send them in a group; truly random flight would have fewer accidents than we now have!
n-dimensional space is a mathematical construct which we must investigate if we are to understand what happens to us when we wander there during a design problem. In two dimensions we have Pythagoras' theorem: for a right triangle the square of the hypotenuse equals the sum of the squares of the other two sides. In three dimensions we ask for the length of the diagonal of a rectangular block, Figure 9.I. To find it we first draw a diagonal on one face, apply Pythagoras' theorem, and then take it as one side with the other side the third dimension, which is at right angles, and again from the Pythagorean theorem we get the square of the diagonal is the sum of the squares of the three perpendicular sides. It is obvious from this proof, and the necessary symmetry of the formula, as you go to higher and higher dimensions you will still have the square of the diagonal as the sum of the squares of the individual mutually perpendicular sides
$$D^2 = x_1^2 + x_2^2 + \cdots + x_n^2$$
where the x_i are the lengths of the sides of the rectangular block in n-dimensions.
Continuing with the geometric approach, planes in the space will be simply linear combinations of the x_i, and a sphere about a point will be all points which are at the fixed distance (the radius) from the given point.
We need the volume of the n-dimensional sphere to get an idea of the size of a piece of restricted space. But first we need the Stirling approximation for n!, which I will derive so you will see most of the details and be convinced what is coming later is true, rather than taking it on hearsay.
A product like n! is hard to handle, so we take the log of n! which becomes
$$\ln n! = \sum_{k=1}^{n} \ln k$$
where, of course, ln is the logarithm to the base e. Sums remind us that they are related to integrals, so we start with the integral
$$\int_1^n \ln x \, dx$$
We apply integration by parts (since we recognize the ln x arose from integrating an algebraic function and hence it will be removed in the next step). Pick U = ln x, dV = dx, then
$$\int_1^n \ln x \, dx = \Big[ x \ln x \Big]_1^n - \int_1^n dx = n \ln n - n + 1$$
On the other hand, if we apply the trapezoid rule to the integral of ln x we will get, Figure 9.II,
$$\int_1^n \ln x \, dx \approx \tfrac{1}{2} \ln 1 + \ln 2 + \ln 3 + \cdots + \ln (n-1) + \tfrac{1}{2} \ln n$$
Since ln 1 = 0, adding (1/2
Stirling's Formula and Hyperspheres
- The text derives Stirling's formula as an approximation for factorials, noting that while the ratio of the approximation to the true value approaches 1, the absolute difference grows with n.
- The gamma function is introduced as a continuous extension of the factorial function for all positive real numbers using an integral definition.
- A mathematical 'trick' involving polar coordinates and the product of integrals is used to evaluate the gamma function of 1/2 as the square root of pi.
- The volume of an n-dimensional sphere is defined by a constant Cn multiplied by the radius to the power of n.
- Calculations reveal a counterintuitive geometric property: the volume coefficient Cn peaks at dimension 5 and then decreases toward zero as dimensions increase.
- For a unit radius, the volume of an n-dimensional hypersphere eventually vanishes as the number of dimensions approaches infinity.
Note as the numbers get larger and larger the ratio approaches 1 but the differences get greater and greater!
) ln n to both terms we get, finally,
$$\sum_{k=1}^{n} \ln k \approx n \ln n - n + 1 + \tfrac{1}{2} \ln n$$
Undo the logs by taking the exponential of both sides
$$n! \approx C \, n^n e^{-n} \sqrt{n}$$
where C is some constant (not far from e) independent of n, since we are approximating an integral by the trapezoid rule and the error in the trapezoid approximation increases more and more slowly as n grows larger and larger, and C is the limiting value. This is the first form of Stirling's formula. We will not waste time deriving the limiting, at infinity, value of the constant C which turns out to be
$$C = \sqrt{2\pi} = 2.5066\ldots$$
(e = 2.71828…). Thus we finally have the usual Stirling's formula for the factorial
$$n! \approx \sqrt{2\pi n} \; n^n e^{-n}$$
Figure 9.I
The following table shows the quality of the Stirling approximation to n!

 n    Stirling         True         Stirling/True
 1    0.92214          1            0.92214
 2    1.91900          2            0.95950
 3    5.83621          6            0.97270
 4    23.50518         24           0.97942
 5    118.01916        120          0.98349
 6    710.07818        720          0.98622
 7    4,980.3958       5,040        0.98817
 8    39,902.3958      40,320       0.98964
 9    359,536.87       362,880      0.99079
10    3,598,695.6      3,628,800    0.99170

Note as the numbers get larger and larger the ratio approaches 1 but the differences get greater and greater! If you consider the two functions
$$f(n) = n + \sqrt{n}, \qquad g(n) = n$$
then the limit of the ratio f(n)/g(n), as n approaches infinity, is 1, but as in the table the difference
$$f(n) - g(n) = \sqrt{n}$$
grows larger and larger as n increases.
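The table above can be recomputed with a few lines of Python. This is only a sketch: the names `stirling` and `stirling_rows` are illustrative choices, and the formula used is the usual form sqrt(2*pi*n) * n^n * e^(-n).

```python
import math

def stirling(n):
    """Stirling's approximation sqrt(2*pi*n) * n**n * exp(-n) to n!."""
    return math.sqrt(2 * math.pi * n) * n**n * math.exp(-n)

def stirling_rows(max_n=10):
    """Return (n, approximation, true value, ratio, difference) for n = 1..max_n."""
    rows = []
    for n in range(1, max_n + 1):
        true = math.factorial(n)
        approx = stirling(n)
        rows.append((n, approx, true, approx / true, true - approx))
    return rows

for n, approx, true, ratio, diff in stirling_rows():
    # The ratio column creeps toward 1 while the difference column keeps growing.
    print(f"{n:2d} {approx:16.5f} {true:10d} {ratio:.5f} {diff:14.5f}")
```

Printing the difference next to the ratio shows both behaviors at once: the relative error shrinks while the absolute error grows.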
We need to extend the factorial function to all positive real numbers, hence we introduce the gamma function in the form of an integral
$$\Gamma(n) = \int_0^\infty x^{n-1} e^{-x} \, dx$$
which converges for all n>0. For n>1 we again integrate by parts, this time using dV = e^{-x} dx and U = x^{n-1}. At the two limits the integrated part is zero, and we have the reduction formula
$$\Gamma(n) = (n-1) \, \Gamma(n-1)$$
with Γ(1) = 1.
Figure 9.II
Thus the gamma function takes on the values (n−1)! at the positive integers n, and it provides a natural way of extending the factorial to all positive numbers since the integral exists whenever n>0.
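We will need
$$\Gamma(1/2) = \int_0^\infty x^{-1/2} e^{-x} \, dx$$
Set x = t^2, hence dx = 2t dt, and we have (using symmetry in the last step)
$$\Gamma(1/2) = \int_0^\infty \frac{1}{t} e^{-t^2} \, 2t \, dt = 2 \int_0^\infty e^{-t^2} \, dt = \int_{-\infty}^{\infty} e^{-t^2} \, dt$$
We now use a standard trick to evaluate this integral. We take the product of two of the integrals, one with x and one with y as their variables.
$$\Gamma^2(1/2) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-(x^2 + y^2)} \, dx \, dy$$
The x^2 + y^2 suggests polar coordinates, so we convert
$$\Gamma^2(1/2) = \int_0^{2\pi} \int_0^\infty e^{-r^2} \, r \, dr \, d\theta$$
The angle integration is easy, the exponential is now also easy, and we get, finally,
$$\Gamma^2(1/2) = \pi$$
Thus
$$\Gamma(1/2) = \int_{-\infty}^{\infty} e^{-t^2} \, dt = \sqrt{\pi}$$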
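As a numerical sanity check on this result, the sketch below estimates the integral after the substitution x = t^2 with a plain midpoint rule and compares it with the square root of pi. The function name, the cutoff at t = 10, and the step count are all assumptions made for illustration; `math.gamma` is Python's built-in gamma function.

```python
import math

def gamma_half_numeric(upper=10.0, steps=100_000):
    """Midpoint-rule estimate of 2 * integral of exp(-t**2) for t in (0, upper).

    After the substitution x = t**2 this integral equals Gamma(1/2);
    the tail beyond t = 10 is negligibly small.
    """
    h = upper / steps
    return 2.0 * h * sum(math.exp(-((i + 0.5) * h) ** 2) for i in range(steps))

estimate = gamma_half_numeric()
print(estimate, math.sqrt(math.pi), math.gamma(0.5))

# At the positive integers the gamma function should reproduce (n-1)!.
for n in range(1, 8):
    print(n, math.gamma(n), math.factorial(n - 1))
```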
We now turn to the volume of an n-dimensional sphere (or hypersphere if you wish). Clearly the volume of a cube in n dimensions and of side x is x^n. A little reflection and you will believe the formula for the volume of an n-dimensional sphere must have the form
$$V_n(r) = C_n r^n$$
where C_n is a suitable constant. In the case n=2 the constant is π, in the case n=1, it is 2 (when you think about it). In three dimensions we have C_3 = 4π/3.
We start with the same trick as we used for the gamma function of 1/2, except this time we take the product of n of the integrals, each with a different x_i. Thinking of the volume of a sphere we see it is the sum of shells, and each element of the sum has a volume which is the corresponding shell area multiplied by the thickness, dr. For a sphere the value for the surface area can be obtained by differentiating the volume of the sphere with respect to the radius r,
$$\frac{dV_n(r)}{dr} = n C_n r^{n-1}$$
and hence the elements of volume are
$$dV = n C_n r^{n-1} \, dr$$
We have, therefore, on setting r^2 = t
$$\pi^{n/2} = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} e^{-(x_1^2 + \cdots + x_n^2)} \, dx_1 \cdots dx_n = n C_n \int_0^\infty e^{-r^2} r^{n-1} \, dr = \frac{n C_n}{2} \int_0^\infty e^{-t} t^{n/2 - 1} \, dt = \frac{n C_n}{2} \, \Gamma(n/2)$$
from which we get
$$C_n = \frac{\pi^{n/2}}{\Gamma(n/2 + 1)}$$
It is easy to see
$$C_n = \frac{2\pi}{n} \, C_{n-2}$$
and we can compute the following table.
Dimension n    Coefficient C_n
 1             2           = 2.00000…
 2             π           = 3.14159…
 3             4π/3        = 4.18879…
 4             π^2/2       = 4.93480…
 5             8π^2/15     = 5.26379…
 6             π^3/6       = 5.16771…
 7             16π^3/105   = 4.72477…
 8             π^4/24      = 4.05871…
 9             32π^4/945   = 3.29850…
10             π^5/120     = 2.55016…
2k             π^k/k!      → 0
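The coefficients in the table above can be checked numerically. The sketch below assumes the standard closed form C_n = pi^(n/2) / Gamma(n/2 + 1); the function name `sphere_coefficient` and the range of dimensions examined are illustrative choices.

```python
import math

def sphere_coefficient(n):
    """C_n = pi**(n/2) / Gamma(n/2 + 1), so that the volume is C_n * r**n."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

coefficients = {n: sphere_coefficient(n) for n in range(1, 31)}

# Dimension at which the coefficient is largest.
peak = max(coefficients, key=coefficients.get)

for n in range(1, 11):
    print(f"{n:2d} {coefficients[n]:.5f}")
print("peak at n =", peak, " C_30 =", coefficients[30])
```

Extending the range past n = 10 makes the decay toward zero visible rather than merely asserted.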
Thus we see the coefficient C_n increases up to n=5 and then decreases towards 0. For spheres of unit radius this means the volume of the sphere approaches 0 as n increases. If the radius is r, then we have for the volume, and using n=2k for convenience (since the actual numbers vary smoothly as n increases and the odd dimensional spaces are messier to compute),
$$C_{2k} \, r^{2k} = \frac{\pi^k r^{2k}}{k!}$$
Geometry of High Dimensions
- As the number of dimensions increases, the volume of a sphere of any radius eventually shrinks toward zero.
- In high-dimensional spaces, almost all the volume of a sphere is concentrated in a thin shell near its surface.
- Optimal designs in high-dimensional engineering are typically found on the surface of the feasible region rather than the interior.
- Standard calculus optimization methods are often inappropriate for high-dimensional spaces where extremes are the norm.
- The diagonal of an n-dimensional cube becomes increasingly perpendicular to every coordinate axis as dimensions grow.
- In a 10-dimensional space, there are 1,024 diagonal lines that are all simultaneously almost perpendicular to the axes.
As we say, the volume is almost all on the surface.
No matter how large the radius, r, increasing the number of dimensions, n, will ultimately produce a sphere of arbitrarily small volume.
Next we look at the relative amount of the volume close to the surface of an n-dimensional sphere. Let the radius of the sphere be r, and the inner radius of the shell be r(1−ε), then the relative volume of the shell is
$$\frac{C_n r^n - C_n r^n (1-\varepsilon)^n}{C_n r^n} = 1 - (1-\varepsilon)^n$$
For large n, no matter how thin the shell is (relative to the radius), almost all the volume is in the shell and there is almost nothing inside. As we say, the volume is almost all on the surface. Even in 3 dimensions the unit sphere has 7/8-ths of its volume within 1/2 of the surface. In n-dimensions there is 1 − 1/2^n within 1/2 of the radius from the surface.
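A minimal sketch of the shell computation, assuming the relative shell volume is 1 − (1 − ε)^n as argued above; the name `shell_fraction` and the sample values of n and ε are illustrative.

```python
def shell_fraction(n, eps):
    """Fraction of an n-dimensional sphere's volume within eps * r of its surface."""
    return 1.0 - (1.0 - eps) ** n

# The three dimensional case quoted in the text: half the radius holds 7/8 of the volume.
print(shell_fraction(3, 0.5))

# A shell only 1% of the radius thick, in increasingly many dimensions.
for n in (3, 10, 100, 1000):
    print(n, shell_fraction(n, 0.01))
```

Even a shell one percent of the radius thick holds most of the volume once n reaches the hundreds.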
This has importance in design; it means almost surely the optimal design will be on the surface and will not be inside as you might think from taking the calculus and doing optimizations in that course. The calculus methods are usually inappropriate for finding the optimum in high dimensional spaces. This is not strange at all; generally speaking the best design is pushing one or more of the parameters to their extreme, so obviously you are on the surface of the feasible region of design!
Next we turn to looking at the diagonal of an n-dimensional cube, say the vector from the origin to the point (1,1,…,1). The cosine of the angle between this line and any axis is given by definition as the ratio of the component along the axis, which is clearly 1, to the length of the line which is √n. Hence
$$\cos\theta = \frac{1}{\sqrt{n}}$$
Therefore, for large n the diagonal is almost perpendicular to every coordinate axis!
If we use the points with coordinates (±1, ±1,…, ±1) then there are 2^n such diagonal lines which are all almost perpendicular to the coordinate axes. For n=10, for example, this amounts to 1024 such almost perpendicular lines.
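To get a feeling for how quickly the diagonal turns away from the axes, the following sketch evaluates the angle whose cosine is 1/sqrt(n). The function name `diagonal_angle_degrees` is an illustrative choice.

```python
import math

def diagonal_angle_degrees(n):
    """Angle between the diagonal (1, 1, ..., 1) and any one coordinate axis."""
    return math.degrees(math.acos(1.0 / math.sqrt(n)))

for n in (2, 3, 10, 100, 10_000):
    print(f"n = {n:6d}  angle = {diagonal_angle_degrees(n):.3f} degrees")

# Number of sign patterns (+-1, ..., +-1) in 10 dimensions.
print(2 ** 10)
```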
I need the angle between two lines, and while you may remember it is the vector dot product, I propose to derive it again to bring more understanding about what is going on. [Aside: I have found it very valuable in important situations to review all the basic derivations involved so I have a firm feeling for what is going on.] Take two points x and y with their corresponding coordinates x_i and y_i, Figure 9.III. Then applying the law of cosines in the plane of the three points x, y, and the origin we have
$$C^2 = X^2 + Y^2 - 2XY \cos\theta$$
where X and Y are the lengths of the lines to the points x and y, and C is the distance between them. But the C comes from using the differences of the coordinates in each direction
$$C^2 = \sum_{i=1}^{n} (x_i - y_i)^2 = X^2 + Y^2 - 2 \sum_{i=1}^{n} x_i y_i$$
Comparing the two expressions we see
$$\cos\theta = \frac{\sum_{i=1}^{n} x_i y_i}{XY}$$
Figure 9.III
The Paradoxes of N-Dimensions
- In high-dimensional space, random vectors are almost surely almost perpendicular to one another, defying standard linear algebra intuition.
- While there are only n mutually perpendicular axes in n-dimensions, there are 2^n other directions that are nearly perpendicular to those axes.
- A geometric construction of packed spheres in an n-dimensional cube reveals that the radius of a central inner sphere grows with the number of dimensions.
- By the 10th dimension, the central sphere, despite being contained by the inner surfaces of the corner spheres, actually reaches outside the surrounding cube.
- The author argues that raw human intuition is poorly suited for high-dimensional spaces where complex design problems occur.
- These phenomena are grounded in classical Euclidean space using the Pythagorean distance formula, also known as the L2 norm.
Yes, the sphere is convex, yes it touches each of the 1024 packed spheres on the inside, yet it reaches outside the cube!
We now apply this formula to two lines drawn from the origin to random points of the form
$$(\pm 1, \pm 1, \ldots, \pm 1)$$
The dot product of these factors, taken at random, is again random ±1's and these are to be added n times, while the length of each is again √n, hence (note the n in the denominator)
$$\cos\theta = \frac{1}{n} \sum_{k=1}^{n} (\pm 1)$$
and by the weak law of large numbers this approaches 0 for increasing n, almost surely. But there are 2^n different such random vectors, and given any one fixed vector then any other of these 2^n random vectors is almost surely almost perpendicular to it! n-dimensions is indeed vast!
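A small simulation makes the claim tangible. The sketch below draws pairs of random sign vectors and averages the absolute value of the cosine between them; the function names, the trial counts, and the fixed seed are assumptions made so the run is repeatable, not part of the argument.

```python
import random

def random_sign_vector(n, rng):
    """A vector of n independent entries, each +1 or -1 with equal chance."""
    return [rng.choice((-1, 1)) for _ in range(n)]

def cosine_between(x, y):
    """cos(theta) = sum(x_i * y_i) / (|x| |y|); both lengths are sqrt(n) here."""
    return sum(a * b for a, b in zip(x, y)) / len(x)

def mean_abs_cosine(n, trials=100, seed=0):
    """Average |cos(theta)| over pairs of random sign vectors in n dimensions."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        x = random_sign_vector(n, rng)
        y = random_sign_vector(n, rng)
        total += abs(cosine_between(x, y))
    return total / trials

for n in (10, 100, 1000):
    print(n, mean_abs_cosine(n))
```

The average shrinks roughly like one over the square root of n, which is the rate the law of large numbers suggests.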
In linear algebra and other courses you learned to find the set of perpendicular axes and then represent everything in terms of these coordinates, but you see in n-dimensions there are, after you find the n mutually perpendicular coordinate directions, 2^n other directions which are almost perpendicular to those you have found! The theory and practice of linear algebra are quite different!
Lastly, to further convince you your intuitions about high dimensional spaces are not very good, I will produce another paradox which I will need in later chapters. We begin with a 4×4 square and divide it into four 2×2 squares, in each of which we draw a circle of unit radius, Figure 9.IV. Next we draw a circle about the center of the square with radius just touching the four circles on their insides. Its radius must be, from Figure 9.IV,
$$r_2 = \sqrt{2} - 1 = 0.414\ldots$$
Now in three dimensions you will have a 4×4×4 cube, and 8 spheres of unit radius. The inner sphere, which touches each outer sphere along the line to its center, will have a radius of
$$r_3 = \sqrt{3} - 1 = 0.732\ldots$$
Think of why this must be larger than for two dimensions.
Going to n dimensions, you have a 4×4×…×4 cube, and 2^n spheres, one in each of the corners, and with each touching its n adjacent neighbors. The inner sphere, touching on the inside all of the spheres, will have a radius of
$$r_n = \sqrt{n} - 1$$
Examine this carefully! Are you sure of it? If not, why not? Where will you object to the reasoning?
Once satisfied it is correct we apply it to the case of n=10 dimensions. You have for the radius of the inner sphere
$$r_{10} = \sqrt{10} - 1 = 3.162\ldots - 1 = 2.162\ldots > 2$$
and in 10 dimensions the inner sphere reaches outside the surrounding cube! Yes, the sphere is convex, yes it touches each of the 1024 packed spheres on the inside, yet it reaches outside the cube!
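The radii can be tabulated directly. This sketch takes the inner radius to be sqrt(n) − 1, that is, the distance sqrt(n) from the center of the cube to the center of a corner sphere, less that sphere's unit radius; the names `inner_radius`, `HALF_SIDE`, and `first_escape` are illustrative.

```python
import math

HALF_SIDE = 2.0  # the 4 x 4 x ... x 4 cube extends 2 units from its center

def inner_radius(n):
    """Radius of the central sphere touching all 2**n unit spheres from inside."""
    return math.sqrt(n) - 1.0

for n in range(2, 13):
    status = "outside the cube" if inner_radius(n) > HALF_SIDE else "inside"
    print(f"n = {n:2d}  radius = {inner_radius(n):.4f}  {status}")

# Smallest dimension in which the inner sphere pokes through the cube's faces.
first_escape = next(n for n in range(1, 100) if inner_radius(n) > HALF_SIDE)
print(first_escape)
```

At n = 9 the inner sphere just touches the faces of the cube, and at n = 10 it passes through them.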
So much for your raw intuition about n-dimensional space, but remember the n-dimensional space is where the design of complex objects generally takes place. You had better get an improved feeling for n-dimensional space by thinking about the things just presented, until you begin to see how they can be true, indeed why they must be true. Else you will be in trouble the next time you get into a complex design problem. Perhaps you should calculate the radii of the various dimensions, as well as go back to the angles between the diagonals and the axes, and see how it can happen.
It is now necessary to note carefully, I have done all this in the classical Euclidean space using the Pythagorean distance where the sum of squares of the differences of the coordinates is the distance between the points squared. Mathematicians call this distance L
Metrics and Distance Functions
- The L1 metric, or Hamming distance, measures distance as the sum of coordinate differences, resembling travel on a city grid.
- The L∞ metric, or Chebyshev distance, defines distance as the maximum coordinate difference between two points regardless of other traits.
- Geometric shapes like circles and spheres change drastically depending on the metric used, appearing as squares or cubes in L1 and L∞ spaces.
- All valid metrics must satisfy four fundamental conditions: non-negativity, identity, symmetry, and the triangle inequality.
- While L2 is standard for physical measurements, L1 and L∞ are often more appropriate for intellectual judgments and pattern identification in AI.
- Real-world design spaces are often a 'messy' mixture of different metrics rather than a uniform Euclidean environment.
In this space a circle in two dimensions looks like a square standing on a point, Figure 9.V.
2.
The space L1 uses not the sum of the squares, but rather the sum of the distances, much as you must do in traveling in a city with a rectangular grid of streets. It is the sum of the differences between the two locations that tells you how far you must go. In the computing field this is often called the "Hamming distance" for reasons which will appear in a later chapter. In this space a circle in two dimensions looks like a square standing on a point, Figure 9.V. In three dimensions it is like a cube standing on a point, etc. Now you can better see how it is in the circle paradox above the inner sphere can get outside the cube.
There is a third, commonly used, metric (they are all metrics, that is, distance functions), called L∞, or Chebyshev distance. Here we have the distance is the maximum coordinate difference, regardless of any other differences, Figure 9.VI. In this space a circle is a square, a three dimensional sphere is a cube, and you see in this case the inner circle in the circle paradox has 0 radius in all dimensions.
These are all examples of a metric, a measure of distance. The conventional conditions on a metric D(x,y) between two points x and y are:
1. D(x,y) ≥ 0 (non-negative),
2. D(x,y) = 0 if and only if x = y (identity),
3. D(x,y) = D(y,x) (symmetry),
4. D(x,y) + D(y,z) ≥ D(x,z) (triangle inequality).
Figure 9.IV
Figure 9.V
Figure 9.VI
It is left to you to verify the three metrics, L∞, L2 and L1 (Chebyshev, Pythagoras, and Hamming), all satisfy these conditions.
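A proof is the real verification, but a numerical spot check is a useful first step. The sketch below implements the three distances and tests the four conditions on a handful of random integer points; the names `l1`, `l2`, `linf`, and `check_metric`, the tolerance, and the sample points are all assumptions for illustration. Passing the check is evidence, not proof.

```python
import math
import random

def l1(x, y):
    """Sum of the absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, y))

def l2(x, y):
    """Pythagorean (Euclidean) distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def linf(x, y):
    """Largest absolute coordinate difference (Chebyshev)."""
    return max(abs(a - b) for a, b in zip(x, y))

def check_metric(dist, points, tol=1e-9):
    """Spot-check the four metric conditions on every pair and triple of points."""
    for x in points:
        for y in points:
            d = dist(x, y)
            if d < 0:                          # 1. non-negative
                return False
            if (d == 0) != (x == y):           # 2. zero exactly when x equals y
                return False
            if abs(d - dist(y, x)) > tol:      # 3. symmetry
                return False
            for z in points:                   # 4. triangle inequality
                if d + dist(y, z) < dist(x, z) - tol:
                    return False
    return True

rng = random.Random(1)
sample = [tuple(rng.randint(-5, 5) for _ in range(4)) for _ in range(12)]
for name, dist in (("L1", l1), ("L2", l2), ("Linf", linf)):
    print(name, check_metric(dist, sample))
```

For contrast, the squared Euclidean distance fails the triangle inequality on the three collinear points 0, 1, 2, so not every plausible looking measure of separation is a metric.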
The truth is, in complex design, for various coordinates we may use any of the three metrics, all mixed up together, so the design space is not as portrayed above, but is a mess of bits and pieces. The L2 metric is connected with least squares, obviously, and the other two, L∞ and L1, are more like comparisons. In making comparisons in real life, you generally use either the maximum difference, L∞, in any one trait as sufficient to distinguish two things, or sometimes, as in strings of bits, it is the number of differences which matters, and the sum of the squares does not enter, hence the L1 distance is used. This is increasingly true, for example, in pattern identification in AI.
Unfortunately, the above is all too true, and it is seldom pointed out to you. They never told me a thing about it! I will need many of the results in later chapters, but in general, after this exposure, you should be better prepared than you were for complex design and for carefully examining the space in which the design occurs, as I have tried to do here. Messy as it is, fundamentally it is where the design occurs and where you must search for an acceptable design.
Since L1 and L∞ are not familiar let me expand the remarks on the three metrics. L2 is the natural distance function to use in physical and geometric situations including the data reduction from physical measurements. Thus you find least squares, L2, throughout physics. But when the subject matter is intellectual judgments then the other two distance functions are generally preferable, and this is slowly coming into use, though we still find the Chi square test, which is obviously a measure for L2, used widely when some other suitable test should be used.
10
Coding Theory - I
Information Representation and Transmission
- The meaning of a symbol in a computer is not inherent but is defined entirely by how it is processed.
- Information representation is simplified by treating transmission through space and storage through time as the same problem.
- A general theory of information is achieved by abstracting away the specific nature of the source, whether it be music, math, or dance.
- The 'meaning' associated with symbols is excluded from the technical theory to ensure its broad applicability across different fields.
- The standard model of information systems begins with a source that generates a sequence of symbols for processing.
It is the abstraction from details that gives the breadth of application.
Having looked at computers and how they operate, we now turn to the problem of the representation of information: how do we represent the information we want to process? Recall any meaning a symbol may have depends on how it is processed; there is no inherent meaning to the bits the machine uses. In the synthetic language mentioned in Chapter 4 on the history of software, the breaking up of the instructions was pretty much the same for every code instruction and this is true for most languages; the "meaning" of any instruction is defined by the corresponding subroutine.
To simplify the problem of the representation of information we will, at present, examine only the problem of the transmission of information from here to there. This is exactly the same as transmission from now to then, storage. Transmission through time or through space are the same problem. The standard model of the system is given in Figure 10.I.
Starting at the left hand side of Figure 10.I we have a source of information. We do not discuss what the source is. It may be a string of: alphabetical symbols, numbers, mathematical formulas, musical notes of a score, the symbols now used to represent dance movements; whatever the source is and whatever "meaning" is associated with the symbols is not part of the theory. We postulate only a source of information, and by doing only that, and no more, we have a powerful, general theory which can be widely applicable. It is the abstraction from details that gives the breadth of application.
Figure 10.I
Foundations of Information Theory
- Claude Shannon insisted on the term 'information' despite the theory focusing primarily on strings of symbols rather than meaning.
- The encoding process is split into source encoding, which adapts to the data, and channel encoding, which adapts to the transmission medium.
- Information theory uniquely assumes the presence of noise and errors from the start, unlike classical physics or quantum mechanics.
- The concept of transmission applies equally to sending data through space or through time, which is defined as storage.
- Variable length codes, like Morse code, increase efficiency by assigning shorter symbols to more frequent data points.
- A fundamental requirement for any code is the ability to uniquely decode a stream of symbols in the absence of noise.
Recall, again, sending through space is the same as sending through time, namely storage.
When in the late 1940s C.E. Shannon created Information Theory there was a general belief he should call it Communication Theory, but he insisted on the word "information", and it is exactly that word which has been the constant source of both interest and of disappointment in the theory. One wants to have a theory of "information" but it is simply a theory of strings of symbols. Again, all we suppose is there is such a source, and we are going to encode it for transmission.
The encoder is broken into two parts. The first half is called the source encoding, which as its name implies is adapted to the source, various sources having possibly different kinds of encodings.
The second half of the encoding process is called channel encoding and it is adapted to the channel over which the encoded symbols are to be sent. Thus the second half of the encoding process is tuned to the channel. In this fashion, with the common interface, we can have a wide variety of sources encoded first to the common interface, and then the message is further encoded to adapt it to the particular channel being used.
Next, going to the right in Figure 10.I, the channel is supposed to have "random noise added". All the noise in the system is incorporated here. It is assumed the encoder can uniquely recognize the incoming symbols without any error, and it will be assumed the decoder similarly functions without error. These are idealizations, but for many practical purposes they are close to reality.
Next, the decoding is done in two stages, channel to standard, and then standard to the source code. Finally it is sent on to the sink, to its destination. Again, we do not ask what the sink does with it.
As stated before, the system resembles transmission, for example a telephone message from me to you, radio, or TV programs, and other things such as a number in a register of a computer being sent to another place. Recall, again, sending through space is the same as sending through time, namely storage. If you have information and want it later, you encode it for storage and store it. Later when you want it, it is decoded. Among encoding systems is the identity, no change in the representation.
The fundamental difference between this kind of a theory and the usual theory in physics is the assumption at the start there is "noise", errors will arise in any equipment. Even in quantum mechanics the noise appears at a later stage as an uncertainty principle, not as an initial assumption; and in any case the "noise" in Information Theory is not at all the same as the uncertainty in Q.M.
We will, for convenience only, assume we are using the binary form for the representation in the system. Other forms can be similarly handled, but the generality is not worth the extra notation.
We begin by assuming the coded symbols we use are of variable length, much as the classical Morse code of dots and dashes, where the common letters are short and the rare ones are long. This produces an efficiency in the code, but it should be noted Morse code is a ternary code, not binary, since there are spaces as well as dots and dashes. If all the code symbols are of the same length we will call it a block code.
The first obvious property we want is the ability to uniquely decode a message if there is no noise added; at least it seems to be a desirable property, though in some situations it could be ignored to a small extent. What is sent is a stream of symbols which looks to the receiver like a string of 0's and 1's. We call two adjacent symbols a second extension, three a third extension, and in general if we send n symbols the receiver sees the n-th extension of the basic code symbols. Not knowing n, you the receiver, must break the
Principles of Unique Decodability
- Unique decodability is essential for ensuring a receiver can reconstruct the original message from a stream of symbols without ambiguity.
- Instantaneous decodability, where no symbol is a prefix of another, allows for immediate processing of digits without waiting for the end of a message.
- The inclusion of an 'exit' or 'escape' symbol is a critical but often overlooked design element for terminating a decoding process.
- The efficiency of a code is measured by its average length, calculated by weighting the length of each symbol by its probability of occurrence.
- Optimal code design is inherently dependent on the frequency of symbols; different probability distributions favor different tree structures.
- McMillan's Theorem suggests that requiring instantaneous decodability does not impose a practical cost on code efficiency.
You have to wait until you get to the end of the message before you can start the decoding process!
stream up into units which can be translated, and you want, as we said above, to be able at the receiving end,meaning you again, to make this decomposition of the stream uniquely in order to recover the originalmessage I, at the sending end, sent to you.
I will use small alphabets of symbols to be encoded for illustrations; usually the alphabet is much larger.
Typically natural language alphabets run from 16 to 36 letters, both upper and lower case, along with
numbers and numerous punctuation symbols. For example, ASCII has 128 = 2^7 symbols in its alphabet.
Let us examine one special code of four symbols, s1, s2, s3, s4.
If you receive
what will you do? Is it
You cannot tell; the code is not uniquely decodable, and hence is unsatisfactory. On the other hand the code
is uniquely decodable. Let us take a random string and see what you would do to decode it. You would
construct a decoding tree of the form shown in Figure 10.II. The string
can be broken up into the symbols
by merely following the decoding tree using the rule:
Each time you come to a branch point (node) you read the next symbol, and when you come to a leaf
of the tree you emit the corresponding symbol and return to the start.
The reason why this tree can exist is that no symbol is the prefix of any other, so you always know when
you have come to the end of the current symbol.
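The contrast between an ambiguous code and a prefix-free one can be sketched in a few lines of Python. The code tables are not reproduced in this text, so both dictionaries below are illustrative assumptions; the prefix-free one is chosen to agree with the later remark about reversing the digits end for end. The helper names `parsings` and `decode_prefix` are hypothetical.

```python
def parsings(bits, code):
    """List every way of splitting `bits` into code words of `code`."""
    if not bits:
        return [[]]
    results = []
    for name, word in code.items():
        if bits.startswith(word):
            for rest in parsings(bits[len(word):], code):
                results.append([name] + rest)
    return results


def decode_prefix(bits, code):
    """Follow the decoding tree: emit a symbol at each leaf, then restart."""
    inverse = {word: name for name, word in code.items()}
    decoded, current = [], ""
    for bit in bits:
        current += bit               # move one branch down the tree
        if current in inverse:       # reached a leaf
            decoded.append(inverse[current])
            current = ""             # return to the start
    if current:
        raise ValueError("stream ended in the middle of a code word")
    return decoded


# Assumed example codes (not taken from the original tables).
AMBIGUOUS = {"s1": "0", "s2": "00", "s3": "01", "s4": "11"}
PREFIX_FREE = {"s1": "0", "s2": "10", "s3": "110", "s4": "111"}

print(parsings("0011", AMBIGUOUS))            # more than one parsing
print(decode_prefix("110100111", PREFIX_FREE))
```

With the assumed ambiguous code the string 0011 splits in two ways, while the prefix-free code is decoded in a single pass that looks at each digit once.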
There are several things to note. First, the decoding is a straightforward process in which each digit is
examined only once. Second, in practice you usually include a symbol which is an exit from the decoding
process and is needed at the end of message. Failure to allow for an escape symbol is a common error in the
design of codes. You may, of course, never expect to exit from a decoding mode, in which case the exit symbol
is not needed.
Figure 10.II
The next topic is instantaneous decodable codes. To see what this is, consider the above code with the
digits reversed end for end.
Now consider receiving 011111…111. The only way you can decode this is to start at the final end and
group by three's until you see how many 1's are left to go with the first 0; only then can you decode the first symbol. Yes, it is uniquely decodable, but not instantaneously! You have to wait until you get to the
end of the message before you can start the decoding process! It will turn out (McMillan's Theorem)
instantaneous decodability costs nothing in practice, hence we will stick to instantaneously uniquely decodable codes.
We now turn to two examples of encoding the same symbols, si:
which will have the decoding tree shown in Figure 10.III.
The second encoding is the same source, but we have:
with the tree shown in Figure 10.IV.
The most obvious measure of "goodness" of a code is its average length for some ensemble of messages.
For this we need to compute the code length li of each symbol multiplied by its corresponding probability pi
of occurring, and then add these products over the whole code. Thus the formula for the average code length
L is, for an alphabet of q symbols,
$$L = \sum_{i=1}^{q} p_i l_i$$
where the pi are the probabilities of the symbols si and the li are the corresponding lengths of the encoded
symbols. For an efficient code this number L should be as small as possible. If p1=1/2, p2=1/4, p3=1/8, p4=1/16,
and p5=1/16, then for code #1 we get
and for code #2
and hence the given probabilities will favor the first code.
If most of the code words are of the same probability of occurring then the second encoding will have a
smaller average code length than the first encoding. Let all the pi=1/5. The code #1 has
while code #2 has
thus favoring the second code. Clearly the designing of a "good" code must depend on the frequencies of
the symbols occurring.
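The dependence of the average length on the probabilities can be checked with a short calculation. The sketch below uses two hypothetical sets of code word lengths for five symbols, one "long" and one "bushy"; they are assumptions for illustration and are not necessarily the lengths in the original tables.

```python
from fractions import Fraction


def average_length(probabilities, lengths):
    """Average code length: the sum of p_i * l_i over the alphabet."""
    return sum(p * l for p, l in zip(probabilities, lengths))


# Assumed code word lengths (illustrative only).
LONG_LENGTHS = [1, 2, 3, 4, 4]
BUSHY_LENGTHS = [2, 2, 2, 3, 3]

skewed = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8),
          Fraction(1, 16), Fraction(1, 16)]
uniform = [Fraction(1, 5)] * 5

for name, probs in (("skewed", skewed), ("uniform", uniform)):
    print(name,
          average_length(probs, LONG_LENGTHS),
          average_length(probs, BUSHY_LENGTHS))
```

Under these assumptions the skewed probabilities give 15/8 for the long lengths against 17/8 for the bushy ones, while equal probabilities give 14/5 against 12/5, so the preferred code changes with the symbol frequencies.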
Figure 10.III
We now turn to the Kraft inequality which gives a limit on the lengths li of the code symbols of a code. In
the base 2, the Kraft inequality is
$$K = \sum_{i=1}^{q} 2^{-l_i} \le 1$$
The Kraft Inequality and McMillan's Theorem
- The Kraft inequality establishes a mathematical constraint on the lengths of symbols in uniquely decodable codes, preventing an excess of short symbols.
- McMillan's Theorem extends this inequality to non-instantaneous codes, proving that instantaneous decodability costs nothing in terms of efficiency.
- A code can be uniquely decodable only if the sum of 2^(-li) over its symbol lengths li is less than or equal to one.
- If the Kraft sum is strictly less than one, the code has excess signaling capacity that could be used to shorten average code lengths.
- Meeting the Kraft inequality does not guarantee a specific code is decodable, but rather that a decodable code with those specific lengths can exist.
When examined closely this inequality says there cannot be too many short symbols or else the sum will be too large.
When examined closely this inequality says there cannot be too many short symbols or else the sum will be
too large.
To prove the Kraft inequality for any instantaneously uniquely decodable code we simply draw the decoding
tree, which of course exists, and apply mathematical induction. If the tree has one or two leaves as shown in
Figure 10.V then there is no doubt the inequality is true. Next, if there are more than two leaves we
decompose the tree of length m (for the induction step) into two trees, and by the induction suppose the
inequality applies to each branch of length m-1 or less. By induction the inequality applies to each branch,
giving K' and K'' for their sums. Now when we join the two trees each length increases by 1, hence each
term in the sum gets another factor of 2 in the denominator, and we have
$$K = \tfrac{1}{2}K' + \tfrac{1}{2}K'' \le \tfrac{1}{2} + \tfrac{1}{2} = 1$$
and the theorem is proved.
Figure 10.IV
Figure 10.V
Next we consider the proof of McMillan's Theorem: the Kraft inequality applies to non-instantaneous
codes provided they are uniquely decodable. The proof depends on the fact for any number K > 1 some n-th
power will exceed any linear function of n, when n is made large enough. We start with the Kraft inequality
raised to the n-th power (which gives the n-th extension) and expand the sum
$$K^n = \left(\sum_{i=1}^{q} 2^{-l_i}\right)^{n} = \sum_{k=n}^{nl} N_k\, 2^{-k}$$
where Nk is the number of symbols of length k, and the sum starts from the minimum length of the n-th
extension of the symbols, which is n, and ends with the maximum length nl, where l is the maximum length
of any single code symbol. But from the unique decodability it must be that Nk ≤ 2^k. The sum becomes
$$K^n \le \sum_{k=n}^{nl} 2^{k}\, 2^{-k} = nl - n + 1$$
If K were > 1 then we could find an n so large the inequality would be false, hence we see K ≤ 1, and
McMillan's Theorem is proved.
Since we now see, as we said we would show, instantaneous decodability costs us nothing, we will stick
to instantaneously decodable codes and ignore merely uniquely decodable codes; their generality buys us nothing.
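Let us take a few examples to illustrate the Kraft inequality. Can there exist a uniquely decodable code
with lengths 1, 3, 3, 3? Yes, since
$$\frac{1}{2} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} = \frac{7}{8} \le 1$$
How about lengths 1, 2, 2, 3? We have
$$\frac{1}{2} + \frac{1}{4} + \frac{1}{4} + \frac{1}{8} = \frac{9}{8} > 1$$
hence no! There are too many short lengths.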
Comma codes are codes where each symbol is a string of 1's followed by a 0, except the last symbol
which is all 1's. As a special case we have:
We have the Kraft sum
and we have exactly met the condition. It is easy to see the general comma code meets the Kraft inequality
with exact equality.
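The Kraft sums in these examples are easy to verify mechanically. The following is a minimal sketch using exact fractions; the function names `kraft_sum` and `comma_code_lengths` are hypothetical, and the comma code of q symbols is taken to have lengths 1, 2, ..., q-1, q-1 as described above.

```python
from fractions import Fraction


def kraft_sum(lengths):
    """Sum of 2**(-l) over the code word lengths, kept as an exact fraction."""
    return sum(Fraction(1, 2 ** l) for l in lengths)


def comma_code_lengths(q):
    """Lengths of a comma code with q symbols: 1, 2, ..., q-1, q-1."""
    return list(range(1, q)) + [q - 1]


print(kraft_sum([1, 3, 3, 3]))   # a code with these lengths can exist
print(kraft_sum([1, 2, 2, 3]))   # too many short lengths
print([kraft_sum(comma_code_lengths(q)) for q in range(2, 8)])
```

The first sum is 7/8, the second is 9/8, and every comma code tried meets the inequality with exact equality.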
If the Kraft sum is less than 1 then there is excess signaling capacity since another symbol could be
included, or some existing one shortened and thus the average code length would be less.
Note if the Kraft inequality is met that does not mean the code is uniquely decodable, only there exists a
code with those symbol lengths which is uniquely decodable. If you assign binary numbers in numerical
order, each having the right length li in bits, then you will find a uniquely decodable code. For example,
given the lengths 2, 2, 3, 3, 4, 4, 4, 4 we have for Kraft's inequality
$$\frac{1}{4} + \frac{1}{4} + \frac{1}{8} + \frac{1}{8} + \frac{1}{16} + \frac{1}{16} + \frac{1}{16} + \frac{1}{16} = 1$$
hence an instantaneously decodable code can exist. We pick the symbols in increasing order of numerical
size, with the binary point imagined on the left, as follows, and watch carefully the corresponding
lengths li:
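The assignment of binary numbers in numerical order can be sketched as a small routine. This is one reading of the procedure described above, not a reproduction of the original table; the function name `code_from_lengths` is hypothetical.

```python
from fractions import Fraction


def code_from_lengths(lengths):
    """Assign code words in increasing numerical order for the given lengths."""
    if sum(Fraction(1, 2 ** l) for l in lengths) > 1:
        raise ValueError("Kraft inequality violated: no such code exists")
    words = []
    value, previous = 0, 0
    for length in sorted(lengths):
        value <<= (length - previous)        # pad with 0's to the new length
        words.append(format(value, "0{}b".format(length)))
        value += 1                           # next binary number in order
        previous = length
    return words


print(code_from_lengths([2, 2, 3, 3, 4, 4, 4, 4]))
```

For the lengths 2, 2, 3, 3, 4, 4, 4, 4 this yields 00, 01, 100, 101, 1100, 1101, 1110, 1111, and no word is a prefix of another.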
Meaning and Huffman Coding
- The transmission of ideas is distinct from the specific words used, as meaning is often reconstructed by the receiver using internal context.
- Organizational communication is frequently distorted by 'channel noise' where subordinates hear what they expect rather than what is actually said.
- Efficient coding theory aims to minimize the average message length based on the statistical probability of symbol occurrence.
- Huffman coding requires that symbols with higher probabilities be assigned shorter code lengths to achieve mathematical optimality.
- A minimum length code must utilize every decision node in its tree structure to avoid wasted capacity and ensure unique decodability.
This inability of the receiver to "hear what is said" by a person in a higher management position but to hear only what they expect to hear, is, of course, a serious problem in every large organization.
I feel it necessary to point out how things are actually done by us when we communicate ideas. Thus I want,
at this time, to get an idea from my head into yours. I emit some words from which you are supposed to get
the idea. But if you later try to transmit this idea to a friend you will emit, almost certainly, different words.
In a real sense, the "meaning" is not contained in the specific words I use since you will probably use
different words to communicate the same idea. Apparently different words can convey the same
"information". But if you say you do not understand the message then usually a different set of words is
used by the source in a second or even third presentation of the idea. Thus, again in some sense, the
"meaning" is not contained in the actual words I use, but you supply a great deal of surrounding information
when you make the translation from my words to your idea of what I said inside your head.
We have learned to "tune" the words we use to fit the person on the receiving end; we, to some extent,
select according to what we think is the channel noise, though clearly this does not match the model I am
using above since there is significant noise in the decoding process, shall we say. This inability of the
receiver to "hear what is said" by a person in a higher management position but to hear only what they
expect to hear, is, of course, a serious problem in every large organization, and is something you should be
keenly aware of as you rise towards the top of the organization. Thus the representation of information in
the formal theory we have given is mirrored only partly in life as we live it, but it does show a fair degree of
relevance outside the formal bounds of computer usage where it is highly applicable.
11
Coding Theory II
Two things should be clear from the previous chapter. First, we want the average length L of the message
sent to be as small as we can make it (to save the use of facilities). Second, it must be a statistical theory
since we cannot know the messages which are to be sent, but we can know some of the statistics by using
past messages plus the inference the future will probably be like the past. For the simplest theory, which is all we can discuss here, we will need the probabilities of the individual symbols occurring in a message.
How to get these is not part of the theory; they can be obtained by inspection of past experience, or imaginative guessing about the future use of the proposed system you are designing.
Thus we want an instantaneous uniquely decodable code for a given set of input symbols, si, along with
their probabilities, pi. What lengths li should we assign (realizing we must obey the Kraft inequality), to
attain the minimum average code length? Huffman solved this code design problem.
Huffman first showed the following running inequalities must be true for a minimum length code. If the pi
are in descending order then the li must be in ascending order
$$p_1 \ge p_2 \ge \cdots \ge p_q, \qquad l_1 \le l_2 \le \cdots \le l_q$$
For suppose the pi are in this order but at least one pair of the li are not, say pj > pk while lj > lk. Consider the effect of interchanging
the symbols attached to the two which are not in order. Before the interchange the two terms contributed to
the average code length L an amount
$$p_j l_j + p_k l_k$$
and after the interchange the terms would contribute
$$p_j l_k + p_k l_j$$
All the other terms in the sum L will be the same. The difference, new minus old, can be written as
$$(p_j - p_k)(l_k - l_j)$$
One of these two factors was assumed to be negative, hence upon interchanging the two symbols we would
observe a decrease in the average code length L. Thus for a minimum length code we must have the two
running inequalities.
Next Huffman observed an instantaneous decodable code has a decision tree, and every decision node
should have two exits, or else it is wasted effort, hence there are two longest symbols which have the same length.
To illustrate Huffman coding we use the classic example. Let p(s1)=0.4, p(s2)=0.2, p(s3)=0.2, p(s4)=0.1,
The Logic of Huffman Encoding
- Huffman encoding functions by iteratively merging the two least frequent symbols until only two remain.
- The process is reversed to assign binary digits, adding a 0 or 1 to distinguish previously merged symbols.
- The resulting code is mathematically guaranteed to have the minimum average length for a given probability distribution.
- Huffman codes are not unique; arbitrary choices in bit assignment and symbol ordering can create 'long' or 'bushy' decoding trees.
- Optimizing symbol placement in the tree can reduce the variability of code lengths without changing the average length.
- Practical application of this method can reduce data storage requirements by more than half in certain scenarios.
The average length of the two codes is the same, but the codes, and the decoding trees are different; the first is "long" and the second is "bushy", and the second will have less variability than the first one.
and p(s5)=0.1. We have it displayed in the attached Figure 11.I. Huffman then argued on the basis of the
above he could combine (merge) the two least frequent symbols (which must have the same length) into one
symbol having the combined probability with common bits up to the last bit which is dropped, thus having
one fewer code symbols. Repeating this again and again he would come down to a system with only two
symbols, for which he knew how to assign a code representation, namely one symbol 0 and one symbol 1.
Now in going backwards to undo the merging steps, we would need at each stage to split the symbol
which arose from the combining of two symbols, keeping the same leading bits but adding to one symbol a 0, and to the other a 1. In this way he would arrive at a minimum L code, see again Figure 11.I. For if there
were another code with smaller length L' then doing the forward steps, which changes the average code
length by a fixed amount, he would arrive finally at two symbols with an average code length less than 1, which is impossible. Hence the Huffman encoding gives a code with minimum length. See Figure 11.II for
the corresponding decoding tree.
The code is not unique. In the first place at each step of the backing up process the assigning of the 0 and
the 1 is an arbitrary matter to which symbol each goes. Second, if at any stage there are two symbols of the
same probability then it is indifferent which is put above the other. This can result, sometimes, in very
different appearing codes, but both codes will have the same average code length.
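The merging process can be sketched in Python. This is a minimal illustration under stated assumptions, not the tabular layout of the figures: it keeps the pending probabilities in a heap instead of a sorted column, and it attaches the 0 and 1 while merging rather than in a separate backward pass. The names `huffman_code` and `average_length` are hypothetical, and at least two symbols are assumed.

```python
import heapq
from fractions import Fraction
from itertools import count


def huffman_code(probabilities):
    """Return {symbol: bit string} for a dict {symbol: probability}."""
    tie = count()  # tie-breaker so the heap never compares the dict payloads
    heap = [(p, next(tie), {sym: ""}) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, group0 = heapq.heappop(heap)   # least probable
        p1, _, group1 = heapq.heappop(heap)   # next least probable
        merged = {s: "0" + w for s, w in group0.items()}
        merged.update({s: "1" + w for s, w in group1.items()})
        heapq.heappush(heap, (p0 + p1, next(tie), merged))
    return heap[0][2]


def average_length(probabilities, code):
    return sum(p * len(code[s]) for s, p in probabilities.items())


classic = {"s1": Fraction(4, 10), "s2": Fraction(2, 10), "s3": Fraction(2, 10),
           "s4": Fraction(1, 10), "s5": Fraction(1, 10)}
code = huffman_code(classic)
print(code, average_length(classic, code))
```

For the classic example the average length comes out as 11/5, that is 2.2. How ties between equal probabilities are broken changes the individual code words, as noted above, but not this average.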
Figure 11.I
Figure 11.II
If we put the combined terms as high as possible we get Figure 11.III with the corresponding decoding
tree Figure 11.IV. The average length of the two codes is the same, but the codes, and the decoding trees are
different; the first is "long" and the second is "bushy", and the second will have less variability than the first
one.
We now do a second example so you will be sure how Huffman encoding works since it is natural to want
to use the shortest average code length you can when designing an encoding system. For example you may
have a lot of data to put into a backup store, and encoding it into the appropriate Huffman code has been
known at times to save more than half the expected storage space! Let p(s1)=1/3, p(s2)=1/5, p(s3)=1/6, p(s4)=1/10,
p(s5)=1/12, p(s6)=1/20, p(s7)=1/30 and p(s8)=1/30. First we check that the total probability is 1. The
common denominator of the fractions is 60. Hence we have the total probability
$$\frac{20 + 12 + 10 + 6 + 5 + 3 + 2 + 2}{60} = \frac{60}{60} = 1$$
Figure 11.III
Figure 11.IV
This second example is illustrated in Figure 11.V where we have dropped the 60 in the denominators of the
probabilities since only the relative sizes matter. What is the average code length per symbol? We compute
$$L = \frac{2(20) + 2(12) + 3(10) + 3(6) + 3(5) + 4(3) + 5(2) + 5(2)}{60} = \frac{159}{60} = 2.65$$
For a block code of eight symbols each symbol would be of length 3 and the average would be 3, which is
more than 2.65.
Automating Huffman Coding Efficiency
- Huffman coding is a mechanical process easily automated by computer programs through iterative probability summation and symbol splitting.
- The system can autonomously sample data, estimate probabilities, and transmit both the decoding tree and encoded data without human intervention.
- Practical implementation requires an escape symbol with low probability to signal the end of the decoding process.
- Huffman coding is most effective when symbol probabilities are highly varied, potentially resulting in a comma code structure.
- If symbol probabilities are uniform, Huffman coding offers little to no advantage over standard block encoding.
- The technique has been applied to computer instruction sets where certain operations occur much more frequently than others.
Indeed, you can write a program which will sample the data to be stored and find estimates of the probabilities, find the Huffman code, do the encoding, and send first the decoding algorithm (tree) and then the encoded data, all without human interference or thought!
Note how mechanical the process is for a machine to do. Each forward stage for a Huffman code is a
repetition of the same process: combine the two lowest probabilities, place the new sum in its proper place
in the array, and mark it. In the backward process, take the marked symbol and split it. These are simple
programs to write for a computer, hence a computer program can find the Huffman code once it is given the
si and their probabilities pi. Recall in practice you want to assign an escape symbol of very small probability
so you can get out of the decoding process at the end of the message. Indeed, you can write a program
which will sample the data to be stored and find estimates of the probabilities (small errors make only small
changes in L), find the Huffman code, do the encoding, and send first the decoding algorithm (tree) and then
the encoded data, all without human interference or thought! At the decoding end you already have received
the decoding tree. Thus once written as a library program, you can use it whenever you think it will be
useful.
Huffman codes have even been used in some computers on the instruction part of instructions, since
instructions have very different probabilities of being used. We need, therefore, to look at the gain in average
code length L we can expect from Huffman encoding over simple block encoding which uses symbols all of
the same length.
If all the probabilities are the same and there are exactly 2^k symbols, then an examination of the Huffman
process will show you will get a standard block code with each symbol of the same length. If you do not
have exactly 2^k symbols then some symbols will be shortened, but it is difficult to say whether many will be
shortened by one bit, or some may be shortened by 2 or more bits. In any case, the value of L will be the
same, and not much less than that for the corresponding block code.
Figure 11.V
On the other hand, if each pi is greater than (2/3) (sum of all the probabilities that follow except the last)
then you will get a comma code, one which has one symbol of length 1 (0), one symbol of length 2, (10),
etc, down to the last where at the end you will have two symbols of the same length, (q-1), (1111…10) and
(1111…11). For this the value of L can be much less than the corresponding block code.
Rule: Huffman coding pays off when the probabilities of the symbols are very different, and does not pay
Huffman Variance and Parity
- When Huffman coding probabilities are equal, the order chosen affects the variance of the resulting code lengths.
- Placing new probabilities as high as possible in the table minimizes variance, ensuring more consistent message lengths.
- Channel encoding addresses the problem of noise by adding redundancy to detect or correct bit errors.
- A single parity check bit allows for the detection of an odd number of errors within a block of transmitted data.
- The mathematical model for channel noise assumes 'white noise,' where errors are independent and equally likely in any position.
A sensible criterion is to minimize the variance of the code so that messages of the same length in the original symbols will have pretty much the same lengths in the encoded message.
off much when they are all rather equal.
When two equal probabilities arise in the Huffman process they can be put in any order, and hence the
codes may be very different, though the average code length in both cases will be the same L. It is natural to
ask which order you should choose when two probabilities are equal. A sensible criterion is to minimize the variance of the code so that messages of the same length in the original symbols will have pretty much the same lengths in the encoded message; you do not want a short original message to be encoded into a very
long encoded message by chance. The simple rule is to put any new probability, when inserting it into the table, as high as it can go. Indeed, if you put it above a symbol with a slightly higher probability you usually greatly reduce the variance and at the same time only slightly increase L; thus it is a good strategy to use.
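The effect on the variance can be seen numerically. The sketch below takes the classic probabilities 0.4, 0.2, 0.2, 0.1, 0.1 and compares two assumed sets of code word lengths, a "long" one (1, 2, 3, 4, 4) and a "bushy" one (2, 2, 2, 3, 3); these lengths are assumptions standing in for the figures, which are not reproduced here.

```python
from fractions import Fraction


def mean_and_variance(probabilities, lengths):
    """Mean code length L and the variance of the length about L."""
    mean = sum(p * l for p, l in zip(probabilities, lengths))
    variance = sum(p * (l - mean) ** 2 for p, l in zip(probabilities, lengths))
    return mean, variance


probs = [Fraction(4, 10), Fraction(2, 10), Fraction(2, 10),
         Fraction(1, 10), Fraction(1, 10)]

LONG_LENGTHS = [1, 2, 3, 4, 4]    # assumed lengths for the "long" tree
BUSHY_LENGTHS = [2, 2, 2, 3, 3]   # assumed lengths for the "bushy" tree

print(mean_and_variance(probs, LONG_LENGTHS))
print(mean_and_variance(probs, BUSHY_LENGTHS))
```

Both give the same average length 11/5, but the variance falls from 34/25 (1.36) for the long lengths to 4/25 (0.16) for the bushy ones.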
Having done all we are going to do about source encoding (though we have by no means exhausted the
topic) we turn to channel encoding where the noise is modeled. The channel, by supposition, has noise,
meaning some of the bits are changed in transmission (or storage). What can we do?
Error detection of a single error is easy. To a block of (n-1) bits we attach an n-th bit which is set so that
the total n bits has an even number of 1's (an odd number if you prefer, but we will stick to an even number
in the theory). It is called an even (odd) parity check, or more simply a parity check.
Thus if all the messages I send to you have this property, then at the receiving end you can check to
see if the condition is met. If the parity check is not met then you know at least one error has happened, indeed you know an odd number of errors has occurred. If the parity does check then either the message is
correct, or else there are an even number of errors. Since it is prudent to use systems where the probability of
an error in any position is low, then the probability of multiple errors must be much lower.
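An even parity check takes only a few lines. This is a minimal sketch with bits held in a Python list; the function names are hypothetical.

```python
def add_parity(bits):
    """Append one bit so the whole block has an even number of 1's."""
    return bits + [sum(bits) % 2]


def parity_ok(block):
    """True when the block has an even number of 1's."""
    return sum(block) % 2 == 0


block = add_parity([1, 0, 1, 1, 0, 0, 1])
one_error = block.copy()
one_error[2] ^= 1                     # a single bit changed in transit
two_errors = one_error.copy()
two_errors[5] ^= 1                    # a second bit changed as well

print(parity_ok(block), parity_ok(one_error), parity_ok(two_errors))
```

The single error is caught, while the double error restores even parity and passes the check, which is the limitation just described.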
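For mathematical tractability we make the assumption the channel has white noise, meaning: (1) each
position in the block of n bits has the same probability of an error as any other position, and (2) the errors in
various positions are uncorrelated, meaning independent. Under these hypotheses, with p the probability of an error in any one position, the probabilities of errors
are:
$$P(\text{no error}) = (1-p)^n$$
$$P(\text{one error}) = n\,p\,(1-p)^{n-1}$$
$$P(\text{two errors}) = \frac{n(n-1)}{2}\,p^2\,(1-p)^{n-2}$$
$$P(k \text{ errors}) = \binom{n}{k}\,p^k\,(1-p)^{n-k}$$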
Engineering Error Detection Codes
- The probability of undetected errors depends on the engineering balance between block length and redundancy.
- Single error detection is effective for retransmission unless the source data itself is corrupted.
- Historical relay computers used 2-out-of-5 codes to represent decimal digits and catch hardware failures.
- Error detection codes serve a vital maintenance role by identifying the exact moment and location of a machine failure.
- Human data entry errors often involve transposing adjacent characters, requiring more robust codes than simple parity.
- Weighted codes were developed to handle complex human error patterns in alphanumeric naming systems.
Any error was caught by the machine almost in the act of its being committed, and hence pointed the maintenance people correctly rather than having them fool around with this and that part, misadjusting the good parts in their effort to find the failing part.
From this if, as is usually true, p is small with respect to the block length n (meaning the product np is
small), then multiple errors are much less likely to happen than single errors. It is an engineering judgment
of how long to make n for a given probability of error p. If n is small then you have a higher redundancy
(the ratio of the number of bits sent to the minimum number of bits possible, n/(n-1)) than with a larger n,
but if np is large then you have a low redundancy but a higher probability of an undetected error. You must
make an engineering judgment on how you are going to balance these two effects.
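The trade-off between redundancy and undetected errors can be tabulated under the white noise assumption. The sketch below is illustrative only: the block lengths and the error probability 1/1000 are arbitrary choices, and the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def error_count_probability(n, p, k):
    """Probability of exactly k errors in a block of n independent bits."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def undetected_error_probability(n, p):
    """A single parity check misses every nonzero even number of errors."""
    return sum(error_count_probability(n, p, k) for k in range(2, n + 1, 2))


def redundancy(n):
    """Bits sent divided by the minimum number of bits possible."""
    return Fraction(n, n - 1)


p = Fraction(1, 1000)                 # assumed per-bit error probability
for n in (8, 64, 512):
    print(n, float(redundancy(n)), float(undetected_error_probability(n, p)))
```

Longer blocks lower the redundancy toward 1 but raise the chance that an even number of errors slips through.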
When you find a single error you can ask for a retransmission and expect to get it right the second time,
and if not then on the third time, etc. However, if the message in storage is wrong, then you will call for
retransmissions until another error occurs and you will probably have two errors which will pass undetected
in this scheme of single error detection. Hence the use of repeated retransmission should depend on the
expected nature of the error.
Such codes have been widely used, even in the relay days. The telephone company in its central offices,
and in many of the early relay computers, used a 2-out-of-5 code, meaning two and only two out of the five relays were to be "up". This code was used to represent a decimal digit, since C(5,2)=10. If not exactly 2
relays were up then it was an error, and a repeat was used. There was also a 3-out-of-7 code in use,
obviously an odd parity check code.
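The count C(5,2)=10 and the error-catching property are easy to confirm by enumeration. The sketch below lists the ten patterns but does not claim any particular assignment of decimal digits to patterns, since the text does not give one; the function names are hypothetical.

```python
from itertools import combinations


def two_out_of_five():
    """All 5-relay patterns with exactly two relays up (1 = up)."""
    return [tuple(1 if i in up else 0 for i in range(5))
            for up in combinations(range(5), 2)]


def is_valid(word):
    """A word is valid only when exactly two of its five relays are up."""
    return len(word) == 5 and sum(word) == 2


words = two_out_of_five()
print(len(words))
for word in words:
    print(word)
```

Any single relay failure changes the number of relays up to 1 or 3, so it is caught at once.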
I first met these 2-out-of-5 codes while using the Model 5 relay computer at Bell Tel Labs, and I was
impressed that not only did they help to get the right answer, but more important, in my opinion, they enabled the
maintenance people to maintain the machine. Any error was caught by the machine almost in the act of its being
committed, and hence pointed the maintenance people correctly rather than having them fool around with this
and that part, misadjusting the good parts in their effort to find the failing part.
Going out of time sequence, but still in idea sequence, I was once asked by AT&T how to code things
when humans were using an alphabet of 26 letters, ten decimal digits, plus a "space". This is typical of
inventory naming, parts naming, and many other naming of things, including the naming of buildings. I knew from telephone dialing error data, as well as long experience in hand computing, humans have a
strong tendency to interchange adjacent digits, a 67 is apt to become a 76, as well as change isolated ones,
(usually doubling the wrong digit, for example a 556 is likely to emerge as 566). Thus single error detecting
is not enough. I got two very bright people into a conference room with me, and posed the question. Suggestion after suggestion I rejected as not good enough until one of them, Ed Gilbert, suggested a
weighted code. In particular he suggested assigning the numbers (values) 0, 1, 2, …, 36 to the symbols 0, 1,
…, 9, A, B, …, Z, space. Next he computed not the sum of the values but if the k-th symbol has the value
(labeled for convenience) sk then for a message of n symbols we compute
$$\sum_{k=1}^{n} k\, s_k \quad (\text{modulo } 37)$$
Weighted Parity and Human Error
- The text describes a weighted parity check using modulo 37 arithmetic to detect single-symbol errors and symbol transpositions.
- Using a prime number as a modulus is essential for the mathematical integrity of the error-detection scheme.
- The ISBN system on books uses a similar weighted code, employing the symbol 'X' to represent the value 10 because its modulus is 11.
- Implementing these codes at the point of data entry allows for immediate error correction before incorrect information propagates through a system.
- Coding theory can be applied to man-machine interfaces to minimize keystrokes by adapting menus to individual user habits, similar to Huffman encoding.
The dashes are merely for decorative effect and are not used in the code at all.
"modulo" meaning divide this weighted sum by 37 and take only the remainder. To encode a message of n
symbols leave the first symbol, k=1, blank and whatever the remainder is, which is less than 37, subtract it
from 37 and use the corresponding symbol as a check symbol, which is to be put in the first position. Thus
the total message, with the check symbol in the first position, will have a check sum of exactly 0. When you
examine the interchange of any two different symbols, as well as the change of any single symbol, you see
it will destroy the weighted parity check, modulo 37 (provided the two interchanged symbols are not exactly
37 symbols apart!). Without going into the details, it is essential the modulus be a prime number, which 37
is.
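The weighted check can be sketched as follows. The symbol values 0 to 36 follow the assignment described above; the function names and the sample part names are hypothetical, and messages are assumed to be shorter than 37 symbols. A remainder of 0 is mapped to the check value 0 rather than 37.

```python
import string

ALPHABET = string.digits + string.ascii_uppercase + " "   # values 0..36
MODULUS = 37


def weighted_sum(message):
    """Sum of k times the value of the k-th symbol, modulo 37 (k from 1)."""
    return sum(k * ALPHABET.index(ch)
               for k, ch in enumerate(message, start=1)) % MODULUS


def add_check_symbol(body):
    """Put the check symbol in position 1; the body fills positions 2..n."""
    partial = sum(k * ALPHABET.index(ch)
                  for k, ch in enumerate(body, start=2)) % MODULUS
    return ALPHABET[(MODULUS - partial) % MODULUS] + body


def is_valid(message):
    return weighted_sum(message) == 0


coded = add_check_symbol("PART 67")
swapped = coded[:-2] + coded[-1] + coded[-2]      # 67 keyed in as 76
print(is_valid(coded), is_valid(swapped))
```

The correctly coded name checks to 0, while the interchange of the two adjacent digits does not.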
To get such a weighted sum of the symbols (actually their values) you can avoid multiplication and use
only addition and subtraction if you wish. Put the numbers in order in a column, and compute the running
sum then compute the running sum of the running sum modulo 37, and then complement this with respect to
37, and you have the check symbol. As an illustration using w, x, y, z.
symbols   sum        sum of sums
w         w          w
x         w+x        2w+x
y         w+x+y      3w+2x+y
z         w+x+y+z    4w+3x+2y+z = weighted check sum
At the receiving end you subtract the modulus repeatedly until you get either a 0 (correct symbol) or a
negative number (wrong symbol).
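The running sum of running sums can be written with additions and subtractions only. The sketch below follows the w, x, y, z illustration, in which the first symbol entered ends up with the largest weight; the function name and the sample values are hypothetical, and every value is assumed to lie between 0 and 36.

```python
def sum_of_sums_mod(values, modulus=37):
    """Weighted check sum using only addition and subtraction."""
    running = 0    # running sum of the values, kept below the modulus
    total = 0      # running sum of the running sums, kept below the modulus
    for value in values:
        running += value
        if running >= modulus:
            running -= modulus
        total += running
        if total >= modulus:
            total -= modulus
    return total


w, x, y, z = 3, 10, 36, 7
print(sum_of_sums_mod([w, x, y, z]), (4 * w + 3 * x + 2 * y + z) % 37)
```

Both expressions give the same remainder, so no multiplication is needed to form the weighted sum.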
If you were to use this encoding, for example, for inventory parts names, then the first time a wrong part
name came to a computer, say at transmission time, if not before (perhaps at order preparation time), the
error will be caught; you will not have to wait until the order gets to supply headquarters to be later told that
there is no such part or else they have sent the wrong part! Before it leaves your location it will be caught and hence is quite easily corrected at that time. Trivial? Yes! Effective against human errors (as contrasted
with the earlier white noise), yes!
Indeed, you see such a code on your books these days with their ISBN numbers. It is the same code
except they use only 10 decimal digits, and 10, not being a prime number, they had to introduce an 11-th symbol, labeled X, which might at times arise in the parity check; indeed, about every 11-th book you have
will have an X for the parity check number as the final symbol of its ISBN number. The dashes are merely
for decorative effect and are not used in the code at all. Check it for yourself on your text books. Many
other large organizations could use such codes to good effect, if they wanted to make the effort.
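The check can be tried on the two ISBNs printed in the front matter of this edition. The sketch below uses the ten-symbol ISBN rule, in which the sum of position times value must be divisible by 11 and X stands for 10 in the final position; the function name is hypothetical.

```python
def isbn10_is_valid(isbn):
    """Weighted check modulo 11; dashes and spaces are ignored."""
    chars = [c for c in isbn.upper() if c not in "- "]
    if len(chars) != 10:
        return False
    total = 0
    for position, ch in enumerate(chars, start=1):
        if ch == "X" and position == 10:
            value = 10                # X is allowed only as the check symbol
        elif ch.isdigit():
            value = int(ch)
        else:
            return False
        total += position * value
    return total % 11 == 0


print(isbn10_is_valid("0-203-45071-X"))   # as printed in the front matter
print(isbn10_is_valid("0-203-45913-X"))   # as printed in the front matter
print(isbn10_is_valid("0-203-54071-X"))   # two adjacent digits interchanged
```

Both printed numbers pass, and interchanging the adjacent digits 4 and 5 is caught.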
I have repeatedly indicated I believe the future will be increasingly concerned with information in the
form of symbols, and less concerned with material things, hence the theory of encoding (representing) information in convenient codes is a non-trivial topic. The above material gave a simple error detecting
code for machine-like situations, as well as a weighted code for human use. They are but two examples of
what coding theory can contribute to an organization in places where machine and human errors can occur.
When you think about the man-machine interface one of the things you would like is to have the human
make comparatively few key strokes: Huffman encoding in a disguise! Evidently, given the probabilities of your taking the various branches in the program menus, you can design a way of minimizing your total key strokes if you wish. Thus the same set of menus can be adjusted to the work habits of different people rather than presenting the same face to all. In a broader sense than this, "automatic programming" in the higher
level languages is an attempt to achieve something like Huffman encoding so that the problems you want to solve require comparatively few key strokes, and the ones you do not want are the others.
12
Error Correcting Codes
The Genesis of Hamming Codes
- Richard Hamming explores the dual nature of scientific discovery, focusing on both the technical development of error-correcting codes and the psychological process of invention.
- The author warns that retrospective accounts of discovery are inherently limited, as the conscious mind cannot fully trace the 'magic' of the unconscious work.
- The breakthrough was triggered by intense frustration when a relay computer repeatedly failed over weekends, wasting Hamming's limited machine time and forcing him to apologize to colleagues.
- Hamming argues that significant breakthroughs rarely come from calm research; instead, they require the emotional stress and involvement of a 'prepared mind' facing a crisis.
- The technical solution emerged from Hamming's deep familiarity with parity checks and the realization that a rectangular arrangement of bits could pinpoint an error's coordinates.
- By applying parity checks to both rows and columns, the machine could not only detect that an error occurred but identify its exact location for automatic correction.
I was angry to say the least, and said, 'If the machine can locate there is an error, why can it not locate where it is, and then fix it by simply changing the bit to the opposite state?'
There are two subject matters in this chapter; the first is the ostensible topic, error correcting codes, and the other is how the process of discovery sometimes goes. You all know I am the official discoverer of the Hamming error correcting codes. Thus I am presumably in a position to describe how they were found. But you should beware of any reports of this kind. It is true at that time I was already very interested in the process of discovery, believing in many cases the method of discovery is more important than what is discovered. I knew enough not to think about the process when doing research, just as athletes do not think about style when they engage in sports, but they practice the style until it is more or less automatic. I had thus established the habit, after something of great or small importance was discovered, of going back and trying to trace the steps by which it apparently happened. But do not be deceived; at best I can give the conscious part, and a bit of the upper subconscious part, but we simply do not know how the unconscious works its magic.
I was using the Model 5 relay computer in NYC in preparation for delivering it to Aberdeen Proving Grounds, along with some required software programs (mainly mathematical routines). When an error was detected by the 2-out-of-5 block codes the machine would, when unattended, repeat the step, up to three times, before dropping it and picking up the next problem in the hope the defective equipment would not be involved in the new problem. Being at that time low man on the totem pole, as they say, I got free machine time only over the weekends, meaning from Friday at around 5:00 P.M. to Monday morning around 8:00 A.M., which is a lot of time! Thus I would load up the input tape with a large number of problems and promise my friends, back at Murray Hill, NJ, where the Research Department was located, I would deliver them the answers on Tuesday. Well, one weekend, just after we left on a Friday night, the machine failed completely and I got essentially nothing on Monday. I had to apologize to my friends and promised them the answers on the next Tuesday. Alas! The same thing happened again! I was angry to say the least, and said, "If the machine can locate there is an error, why can it not locate where it is, and then fix it by simply changing the bit to the opposite state?" (The actual language used was perhaps a bit stronger!)
Notice first this essential step happened only because there was a great deal of emotional stress on me at the moment, and this is characteristic of most great discoveries. Working calmly will let you elaborate and extend things, but the breakthroughs generally come only after great frustration and emotional involvement. The calm, cool, uninvolved researcher seldom makes really great, new steps.
Back to the story. I knew from previous discussions that of course you could build three copies of a machine, include comparing circuits, and use the majority vote; hence error correcting machines could exist. But at what cost! Surely there were better methods. I also knew, as discussed in the last chapter, a great deal about parity checks; I had examined their fundamentals very carefully.
Another aside. Pasteur says, "Luck favors the prepared mind". You see I was prepared by the immediately previous work I had done. I had become more than acquainted with the 2-out-of-5 codes, I had understood them fundamentally, and had worked out and understood the general implications of a parity check.
After some thought I recognized if I arranged the message bits of any message symbol in a rectangle, and put parity checks on each row and each column, then the two failing parity checks would give me the coordinates of the single error, and this would include the corner added parity bit (which could be set consistently if I used even parities), Figure 12.I. The redundancy, the ratio of what you use to the minimum amount needed, is, for an m by n rectangle of transmitted bits carrying $(m-1)(n-1)$ message bits,

$$\frac{mn}{(m-1)(n-1)} = 1 + \frac{1}{m-1} + \frac{1}{n-1} + \frac{1}{(m-1)(n-1)}$$
The Evolution of Error Correction
- The author explores the limitations of rectangular parity codes, noting that double errors can lead to unresolvable ambiguities in identifying error locations.
- A sudden realization during a commute led to the development of triangular and then multi-dimensional cubic parity checks to improve redundancy efficiency.
- By extending the logic to an n-dimensional cube, the author discovered that a 2x2x2...x2 configuration provides the most favorable ratio of parity checks to data bits.
- The breakthrough involved using the 'syndrome' of an error as a binary number that explicitly names the position of the error within the message.
- The design of these codes relies on assigning parity checks to specific bit positions based on their binary representation, a novel approach in the late 1940s.
My smugness vanished immediately! Did I have the best code this time?
It is obvious to anyone who ever took the calculus the closer the rectangle is to a square the lower is the redundancy for the same amount of message. And of course big m's and n's would be better than small ones, but then the risk of a double error might be too great; again an engineering judgment. Note if two errors occurred then you would have: (1) if they were not in the same column and not in the same row, then just two failing rows and two failing columns would occur and you could not know which diagonal pair caused them; and (2) if two were in the same row (or column) then you would have only the columns (or rows) but not the rows (columns).
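The rectangular scheme and its failure on double errors can be tried out with a short sketch. It assumes even parity and a small message block whose size is chosen only for illustration; the helper names `encode_rect` and `failing_checks` are inventions of this example, not anything from the original machine.

```python
def encode_rect(message_rows: list[list[int]]) -> list[list[int]]:
    """Append an even-parity bit to every row, then a parity row for the columns."""
    rows = [row + [sum(row) % 2] for row in message_rows]
    parity_row = [sum(col) % 2 for col in zip(*rows)]  # includes the corner bit
    return rows + [parity_row]


def failing_checks(block: list[list[int]]) -> tuple[list[int], list[int]]:
    """Return the indices of the rows and of the columns whose parity fails."""
    bad_rows = [i for i, row in enumerate(block) if sum(row) % 2]
    bad_cols = [j for j, col in enumerate(zip(*block)) if sum(col) % 2]
    return bad_rows, bad_cols


block = encode_rect([[1, 0, 1], [0, 1, 1]])   # 2x3 message, 3x4 when encoded

block[1][2] ^= 1                               # a single error
print(failing_checks(block))                   # ([1], [2]): the coordinates
r, c = failing_checks(block)
block[r[0]][c[0]] ^= 1                         # flip the located bit back
print(failing_checks(block))                   # ([], [])

block[0][0] ^= 1                               # two errors on a diagonal
block[1][2] ^= 1
print(failing_checks(block))                   # ([0, 1], [0, 2]): ambiguous
```

In the last case the same two rows and two columns would fail if the errors were at the other diagonal pair, so the failing checks alone cannot say which pair to flip.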
We now move to some weeks later. To get to NYC I would go a bit early to the Murray Hill, NJ location where I worked and get a ride on the company mail delivery car. Well, riding through north Jersey in the early morning is not a great sight, so I was, as I had the habit of doing, reviewing successes so I would have the style in hand automatically; in particular I was reviewing in my mind the rectangular codes. Suddenly, and I can give no reason for it, I realized if I took a triangle and put the parity checks along the diagonal, with each parity check checking both the row and column it was in, then I would have a more favorable redundancy, Figure 12.II.
My smugness vanished immediately! Did I have the best code this time? After a few miles of thought on the matter (remember there were no distractions in the north Jersey scenery), I realized a cube of information bits, with parity checks across the entire planes and the parity check bit on the axes, for all three axes, would give me the three coordinates of the error at the cost of $3n-2$ parity checks for the whole $n^3$ encoded message. Better! But was it best? No! Being a mathematician I promptly realized a 4-dimensional cube (I did not have to arrange them that way, only interwire them that way) would be better. So an even higher dimensional cube would be still better. It was soon obvious (say five miles) a 2×2×2×…×2 cube, with n+1 parity checks, would be the best, apparently!
But having burnt my fingers once, I was not about to settle for what looked good; I had made that mistake before! Could I prove it was best? How to make a proof? One obvious approach was to try a counting argument. I had n+1 parity checks, whose result was a string of n+1 bits, a binary number of length n+1 bits, and this could represent any of $2^{n+1}$ things. But I needed only $2^n+1$ things, the $2^n$ points in the cube plus the one result the message was correct. I was off by almost a factor of 2. Alas! I arrived at the door of the company, had to sign in, and go to a conference so I had to let the idea rest.
Figure 12.I
When I got back to the idea after some days of distractions (after all I was supposed to be contributing to the team effort of the company), I finally decided a good approach would be to use the syndrome of the error as a binary number which named the place of the error, with, of course, all 0's being the correct answer (an easier test than for all 1's on most computers). Notice familiarity with the binary system, which was not common then (1947–1948), repeatedly played a prominent role in my thinking. It pays to know more than just what is needed at the moment!
How do you design this particular case of an error correcting code? Easy! Write out the positions in the binary code:
1      1
2     10
3     11
4    100
5    101
6    110
7    111
8   1000
9   1001
It is now obvious the parity check on the right hand side of the syndrome must involve all positions which have a 1 in the right hand column; the second digit from the right must involve the numbers which have a 1 in the second column, etc. Therefore you have:
Parity check #1   1, 3, 5, 7, 9, 11, 13, 15, …
Parity check #2   2, 3, 6, 7, 10, 11, 14, 15, …
Parity check #3   4, 5, 6, 7, 12, 13, 14, 15, …
Parity check #4   8, 9, 10, 11, 12, 13, 14, 15, …
Figure 12.II
The Logic of Hamming Codes
- Hamming codes use parity checks to generate a binary syndrome that identifies the exact position of a single bit error.
- The code structure is flexible, allowing for the interchange of columns or bit values without losing the essential error-correcting properties.
- Adding a single global parity check enables the system to detect double errors, even if it can only correct a single error.
- The efficiency of the code improves with message length, as the number of required parity bits grows logarithmically relative to the message size.
- Error correction can be visualized geometrically as movement between vertices on an n-dimensional cube using the L1 metric.
- Engineering these codes requires balancing the risk of uncorrectable double errors against the overhead cost of redundancy.
If it seems magical, then think of the all 0 message, which will have all 0 checks, and then think of a single digit changing and you will see as the position of the error is moved around then the syndrome binary number will change correspondingly.
Thus if any error occurs in some position, those parity checks, and only those, will fail and give 1's in the syndrome, and this will produce exactly the binary representation of the position of the error. It is that simple!
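The rule for assigning positions to parity checks can be written down in a few lines. The sketch below is only an illustration; the function name `check_positions` and the choice of 15 positions are assumptions made here. Parity check number k covers every position whose binary representation has a 1 in the k-th digit from the right, and the checks that cover a given position spell out that position in binary.

```python
def check_positions(k: int, n_positions: int) -> list[int]:
    """Positions covered by parity check k (k = 1 is the rightmost syndrome digit)."""
    return [pos for pos in range(1, n_positions + 1) if (pos >> (k - 1)) & 1]


for k in range(1, 5):
    print(f"Parity check #{k}:", check_positions(k, 15))

# A single error at position 11 makes exactly the checks covering 11 fail;
# reading those failures as binary digits gives back the position.
error_at = 11
syndrome = sum(1 << (k - 1) for k in range(1, 5)
               if error_at in check_positions(k, 15))
print(syndrome)  # 11
```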
To see the code in operation suppose we confine ourselves to 4 message and 3 check positions. These numbers satisfy the condition

$$2^3 \ge 4 + 3 + 1$$

which is clearly a necessary condition, and the equality is sufficient. We pick as the positions for the checking bits (so the setting of the parity check will be easy), the check positions 1, 2, and 4. The message positions are therefore 3, 5, 6, 7. Let the message be 1001.
We (1) write the message on the top line, (2) encode on the next line, (3) insert an error at position 6 on the next line, and (4) on the next three lines compute the three parity checks.

1 2 3 4 5 6 7    position
    1   0 0 1    message
0 0 1 1 0 0 1    encoded message
0 0 1 1 0 1 1    message with error

You apply the parity checks to the received message.

check #1 (positions 1, 3, 5, 7):  0+1+0+1 is even, giving 0
check #2 (positions 2, 3, 6, 7):  0+1+1+1 is odd, giving 1
check #3 (positions 4, 5, 6, 7):  1+0+1+1 is odd, giving 1

Reading the checks from #3 down to #1 gives the binary number 110 → 6; hence change the digit in position 6, and drop the check positions 1, 2 and 4, and you have the original message, 1001.
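The worked example can be replayed mechanically. The following is a minimal sketch, assuming even parity and the position numbering used in the example (check bits at positions 1, 2, 4; message bits at 3, 5, 6, 7); the function names `encode`, `syndrome`, and `correct` are inventions of this illustration.

```python
CHECK_POSITIONS = (1, 2, 4)
MESSAGE_POSITIONS = (3, 5, 6, 7)


def encode(message: list[int]) -> list[int]:
    """Place 4 message bits and set the 3 check bits for even parity."""
    word = [0] * 8                      # index 0 is unused so indices match positions
    for pos, bit in zip(MESSAGE_POSITIONS, message):
        word[pos] = bit
    for c in CHECK_POSITIONS:
        # Check c covers every position with that binary digit set.
        word[c] = sum(word[p] for p in range(1, 8) if p & c and p != c) % 2
    return word[1:]


def syndrome(received: list[int]) -> int:
    """Return the failing checks read as a binary number (0 means no error seen)."""
    return sum(c for c in CHECK_POSITIONS
               if sum(received[p - 1] for p in range(1, 8) if p & c) % 2)


def correct(received: list[int]) -> list[int]:
    """Flip the bit named by the syndrome and return the 4 message bits."""
    fixed = list(received)
    s = syndrome(fixed)
    if s:
        fixed[s - 1] ^= 1
    return [fixed[p - 1] for p in MESSAGE_POSITIONS]


sent = encode([1, 0, 0, 1])
print(sent)                 # [0, 0, 1, 1, 0, 0, 1]
received = list(sent)
received[6 - 1] ^= 1        # error at position 6
print(syndrome(received))   # 6
print(correct(received))    # [1, 0, 0, 1]
```

Note that `correct` never looks at what the message is; it acts only on the syndrome, which depends on the error alone.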
If it seems magical, then think of the all 0 message, which will have all 0 checks, and then think of a single digit changing and you will see as the position of the error is moved around then the syndrome binary number will change correspondingly and will always exactly match the position of the error. Next, note the sum of any two correct messages is still a correct message (the parity checks are additive modulo 2 hence the proper messages form an additive group modulo 2). A correct message will give all zeros, and hence the sum of a correct message plus an error in one position will give the position of the error regardless of the message being sent. The parity checks concentrate on the error and ignore the message.
Now it is immediately evident any interchange of any two or more of the columns, once agreed upon at each end of the channel, will have no essential effect; the code will be equivalent. Similarly, the interchanging of 0 and 1 in any column (complementing that particular position) will not be an essentially different code. The particular (so called) Hamming code is merely a cute arrangement, and in practice you might want the check bits to all come at the end of the message rather than being scattered in the middle of it.
How about a double error? If we want to catch (but not be able to correct) a double error we simply add a single new parity check over the whole message we are sending. Let us see what will then happen at your end.
old syndrome   new parity check   meaning
000            0                  right answer
000            1                  new parity check wrong
xxx            1                  old parity check works
xxx            0                  must be a double error
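The four cases in the table translate directly into a decision rule. The sketch below is illustrative only: `classify` is a hypothetical helper, `syndrome` is the old syndrome read as an integer, and `overall_parity_fails` is 1 when the new overall parity check fails.

```python
def classify(syndrome: int, overall_parity_fails: int) -> str:
    """Interpret the old syndrome together with the added overall parity check."""
    if syndrome == 0 and not overall_parity_fails:
        return "right answer"
    if syndrome == 0 and overall_parity_fails:
        return "error in the new parity check bit itself"
    if overall_parity_fails:
        return f"single error at position {syndrome}; correct it"
    return "double error detected; cannot correct"


for s, p in [(0, 0), (0, 1), (6, 1), (6, 0)]:
    print(f"{s:03b} {p}: {classify(s, p)}")
```

The key observation is that a single error always upsets the overall parity while a double error leaves it undisturbed, so a nonzero syndrome with a passing overall check cannot be a single error.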
A single error correcting plus double error detecting code is often a good balance. Of course, the redundancy in the short message of 4 bits, with now 4 bits of check, is bad, but the number of parity bits rises roughly like the log of the message length. Too long a message and you risk a double uncorrectable error (which in a single error correcting code you will "correct" into a third error), too short a message and the cost in redundancy is too high. Again an engineering judgment depending on the particular situation.
From analytic geometry you learned the value of using the alternate algebraic and geometric views. A natural representation of a string of bits is to use an n-dimensional cube, each string being a vertex of the cube. Given this picture and finally noting any error in the message moves the message along one edge, two errors along two edges, etc., I slowly realized I was to operate in the space of $L_1$. The distance between symbols is the number of positions in which they differ. Thus we have a metric in the space and it satisfies the standard conditions for a distance (see Chapter 10 where it is identified as the standard $L_1$ distance):
1. D(x,y) ≥ 0 (non-negative)
Geometry of Error Correction
- The text establishes the mathematical foundation of distance using identity, symmetry, and the triangle inequality.
- A sphere in n-dimensional space is defined as the set of all vertices at a fixed distance from a central code point.
- Error correction is achieved by ensuring that spheres of a certain radius around code points do not overlap.
- A minimum distance of 3 between code points allows for single error correction, while a distance of 5 allows for double error correction.
- The relationship between minimum distance and error handling is formalized into a general rule for k-error correction.
It is obvious if the centers of these spheres are code points, and only these points, then at the receiving end any single error in a message will result in a non-code point and you can recognize where the error came from.
2. D(x,y) = 0 if and only if x = y (identity)
3. D(x,y) = D(y,x) (symmetry)
4. D(x,y) + D(y,z) ≥ D(x,z) (triangle inequality)
Thus I had to take seriously what I had learned as an abstraction of the Pythagorean distance function.
With a distance we can define a sphere as all points (vertices, as that is all there is in the space of vertices), at a fixed distance from the center. For example, in the 3-dimensional cube which can be easily sketched, Figure 12.III, the points (0,0,1), (0,1,0), and (1,0,0) are all unit distance from (0,0,0), while the points (1,1,0), (1,0,1), and (0,1,1) are all two units away, and finally the point (1,1,1) is three units away from the origin.
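The distance and the spheres just described are easy to compute. This sketch is only an illustration; the names `distance` and `sphere` are chosen here, and a sphere is taken, as in the text, to be the set of vertices at exactly the given distance from the center.

```python
from itertools import product


def distance(x: tuple[int, ...], y: tuple[int, ...]) -> int:
    """Number of positions in which two bit strings differ."""
    return sum(a != b for a, b in zip(x, y))


def sphere(center: tuple[int, ...], radius: int) -> list[tuple[int, ...]]:
    """All vertices of the cube at exactly the given distance from the center."""
    n = len(center)
    return [v for v in product((0, 1), repeat=n) if distance(center, v) == radius]


origin = (0, 0, 0)
for r in range(4):
    print(r, sphere(origin, r))
```

Running it lists one point at distance 0, three at distance 1, three at distance 2, and the single point (1, 1, 1) at distance 3, matching the sketch of the 3-dimensional cube.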
We now go to n-dimensions, and draw a sphere of unit radius about each point and suppose that the spheres do not overlap. It is obvious if the centers of these spheres are code points, and only these points, then at the receiving end any single error in a message will result in a non-code point and you can recognize where the error came from; it will be in the sphere about the point I sent to you, or equivalently in a sphere of radius 1 about the point you received. Hence we have an error correcting code. The minimum distance between code points is 3. If we use non-overlapping spheres of radius 2 then a double error can be corrected because the received point will be nearer to the original code point than any other point; double error correction, minimum distance of 5. The following table gives the equivalence of the minimum distance between code points and the correctability of errors:
min. distance   meaning
1               unique decoding
2               single error detecting
3               single error correcting
4               1 error correct and 2 error detect
5               double error correcting
2k+1            k error correction
2k+2            k error correction and k+1 error detection
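The pattern in the table can be stated as a small rule of thumb. The function below is a sketch with an invented name, `capability`; it assumes the decoder corrects as many errors as the minimum distance allows and spends any leftover distance on detection.

```python
def capability(min_distance: int) -> tuple[int, int]:
    """Return (errors correctable, errors detectable) for a minimum distance."""
    correct = (min_distance - 1) // 2
    # An even distance 2k+2 corrects k errors and detects one more, k+1.
    detect = correct + 1 if min_distance % 2 == 0 else correct
    return correct, detect


for d in range(1, 7):
    print(d, capability(d))
```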
Figure 12.III
The Utility of Error Correction
- Error-correcting codes are mathematically defined by finding sets of code points in n-dimensional space with specific minimum distances.
- There is a direct trade-off between error correction and error detection; sacrificing one correction allows for two additional detections.
- The theoretical upper bound for code points is determined by dividing the total space volume by the volume of a sphere of radius k.
- Beyond ensuring accuracy, these codes drastically reduce the cost and expertise required for field maintenance and initial equipment installation.
- High-level error correction is essential for deep-space communication where low power and high noise make traditional transmission impossible.
- Real-world implementation at the first electronic central office proved that self-checking systems allow for faster, more reliable complex system deployment.
When, during initial installation, any unit is set up and running properly, and you then turned your back on it to get the next part going, if the one you were neglecting developed a flaw, it told you so!
Thus finding an error correcting code is the same as finding a set of code points in the n-dimensional space which has the required minimum distance between legal messages since the above conditions are both necessary and sufficient. It should also be clear some error correction can be exchanged for more detection; give up one error correction and you get two more in error detection.
I earlier showed how to design codes to meet the conditions in the cases where the minimum distance is 1, 2, 3, or 4. Codes for higher minimum distances are not so easily found, and we will not go farther in that direction. It is easy to give an upper bound on how large the higher distance codes can be. It is obvious the number of points in a sphere of radius k is (C(n,k) is a binomial coefficient)

$$C(n,0) + C(n,1) + C(n,2) + \cdots + C(n,k)$$

Hence if we divide the size of the volume of the whole space, $2^n$, by the volume of a sphere then the quotient is an upper bound on the number of non-overlapping spheres, code points, in the corresponding space. To get an extra error detection we simply, as before, add an overall parity check, thus increasing the minimum distance, which before was 2k+1, to 2k+2 (since any two points at the minimum distance will have the overall parity check set differently thus increasing the minimum distance by 1).
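The bound is simple to evaluate. The sketch below uses Python's `math.comb` for the binomial coefficient; the function names are chosen for this illustration. Here the volume counts every point within distance k of the center, which is what the sum of binomial coefficients measures.

```python
from math import comb


def sphere_volume(n: int, k: int) -> int:
    """Number of vertices of the n-cube within distance k of a given vertex."""
    return sum(comb(n, i) for i in range(k + 1))


def max_code_points(n: int, k: int) -> int:
    """Upper bound on the number of code points of a k-error correcting code."""
    return 2 ** n // sphere_volume(n, k)


print(max_code_points(7, 1))    # 16
print(max_code_points(15, 1))   # 2048
print(max_code_points(23, 3))   # 4096
```

For n = 7 and k = 1 the bound is 16, which is exactly the number of messages carried by the code with 4 message positions and 3 check positions, so in that case the spheres fill the whole space with nothing left over.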
Let us summarize where we are. We see by proper code design we can build a system from unreliable parts and get a much more reliable machine, and we see just how much we must pay in equipment, though we have not examined the cost in speed of computing if we build a computer with that level of error correcting into it. But I have previously stressed the other gain, namely field maintenance, and I want to mention it again and again. The more elaborate the equipment is, and we are obviously going in that direction, the more field maintenance is vital, and error correcting codes not only mean in the field the equipment will give (probably) the right answers, but it can be maintained successfully by low level experts.
The use of error detecting and error correcting codes is rising steadily in our society. In sending messages from the space vehicles we sent to the outer planets, we often have a mere 20 watts or less of power (possibly as low as 5 watts), and had to use codes which corrected hundreds of errors in a single block of message, the correction being done here on earth, of course. When you are not prepared to overcome the noise, as in the above situation, or in cases of "deliberate jamming", then such codes are the only known answer to the situation.
In the late summer of 1961 I was driving across the country from my sabbatical at Stanford, Cal. to Bell Telephone Laboratories in NJ and I made an appointment to stop at Morris, Illinois where the telephone company was installing the first electronic central office which was not an experimental one. I knew it used Hamming codes extensively, and I was, of course, welcomed. They told me they had never had a field installation go in so easily as this one did. I said to myself, "Of course, that is what I have been preaching for the past 10 years". When, during initial installation, any unit is set up and running properly (and you sort of know it is because of the self-checking and correcting properties), and you then turned your back on it to get the next part going, if the one you were neglecting developed a flaw, it told you so! The ease of initial installation, as well as later maintenance, was being verified right before their eyes! I cannot say too loudly, error correction not only gets the right answer when running, it can by proper design also contribute significantly to field installation and field maintenance; and the more elaborate the equipment the more essential these two things are.
I now want to turn to the other part of the chapter. I have carefully told you a good deal of what I faced at each stage in discovering the error correcting codes, and what I did.
Preparation and Future Greatness
- Success is often attributed to luck, but it is primarily the result of a mind prepared to recognize and act on opportunities.
- The author argues that most people merely react to surface phenomena rather than thinking deeply about the underlying causes of events.
- Greatness is not a fixed trait but a style of thinking and acting that can be trained through the study of successful predecessors.
- A significant challenge to achieving greatness is that the requirements for success shift from one generation to the next.
- Relying on a 'random walk' of decisions is far less effective than having a personal vision for the future and using imagination to anticipate change.
- The author asserts that the possibility of greatness is more common and achievable than most people realize if they refuse to be 'janitors' of their profession.
Of course as you go through life you do not know what you are preparing yourself for; only you want to do significant things and not spend the whole of your life being a 'janitor of science' or whatever your profession is.
I did it for two reasons. First, I wanted to be honest with you and show you how easy it is, if you will follow Pasteur's rule, "Luck favors the prepared mind", to succeed by merely preparing yourself to succeed. Yes, there were elements of luck in the discovery; but there were many other people in much the same situation, and they did not do it! Why me? Luck, to be sure, but also I was preparing myself by trying to understand what was going on, more than the other people around who were merely reacting to things as they happened, and not thinking deeply as to what was behind the surface phenomena.
I now challenge you. What I wrote in a few pages was done in the course of a total of about three to six months, mainly working at odd moments while carrying on my main duties to the company. (Patent rights delayed the publication for more than a year.) Does anyone dare to say they, in my position, could not have done it? Yes, you are just as capable as I was to have done it, if you had been there and you had prepared yourself as well!
Of course as you go through life you do not know what you are preparing yourself for; only you want to do significant things and not spend the whole of your life being a "janitor of science" or whatever your profession is. Of course luck plays a prominent role. But so far as I can see, life presents you with many, many opportunities for doing great things (define them as you will) and the prepared person usually hits one or more successes, and the unprepared person will miss almost every time.
The above opinion is not based on this one experience, or merely on my own experiences; it is the result of studying the lives of many great scientists. I wanted to be a scientist hence I studied them, and I looked into discoveries which happened where I was and asked questions of those who did them. This opinion is also based on common sense. You establish in yourself the style of doing great things, and then when opportunity comes you almost automatically respond with greatness in your actions. You have trained yourself to think and act in the proper ways.
There is one nasty thing to be mentioned, however: what it takes to be great in one age is not what is required in the next one. Thus you, in preparing yourself for future greatness (and the possibility of greatness is more common and easy to achieve than you think, since it is not common to recognize greatness when it happens under one's nose), have to think of the nature of the future you will live in. The past is a partial guide, and about the only one you have besides history is the constant use of your own imagination. Again, a random walk of random decisions will not get you anywhere near as far as those taken with your own vision of what your future should be.
I have both told and shown you how to be great; now you have no excuse for not doing so!
13
Information Theory
The Origins of Information Theory
- Claude Shannon chose the name 'Information Theory' over the more accurate 'Communication Theory' for its greater public impact.
- Shannon defined information as a measure of 'surprise,' where the amount of information is inversely related to the probability of an event.
- The mathematical foundation of information requires a continuous function where the information from independent events is additive.
- The Cauchy functional equation proves that the logarithm is the only continuous solution that satisfies the requirements for measuring information.
- In this framework, information is measured in bits, with a base-2 logarithm making a single binary choice equal to exactly one bit.
Shannon identified information with surprise. He chose the negative of the log of the probability of an event as the amount of information you get when the event of probability p happens.
Information Theory was created by C.E. Shannon in the late 1940s. The management of Bell Telephone Labs wanted him to call it "Communication Theory" as that is a far more accurate name, but for obvious publicity reasons "Information Theory" has a much greater impact; this Shannon chose and so it is known to this day. The title suggests the theory deals with information, and therefore it must be important since we are entering more and more deeply into the information age. Hence I shall go through a few main results, not with rigorous proofs of complete generality, but rather intuitive proofs of special cases, so you will understand what information theory is and what it can and cannot do for you.
First, what is "information"? Shannon identified information with surprise. He chose the negative of the log of the probability of an event as the amount of information you get when the event of probability p happens. For example, if I tell you it is smoggy in Los Angeles then p is near 1 and that is not much information, but if I tell you it is raining in Monterey in June then that is surprising and represents more information. Because log 1 = 0 the certain event contains no information.
In more detail, Shannon believed the measure of the amount of information should be a continuous function of the probability p of the event, and for independent events it should be additive: what you learn from each independent event when added together should be the amount you learn from the combined event. As an example, the outcome of the roll of a die and the toss of a coin are generally regarded as independent events. In mathematical symbols, if I(p) is the amount of information you have for event of probability p, then for event x of probability $p_1$ and for the independent event y of probability $p_2$, you will get for the event of both x and y

$$I(p_1 p_2) = I(p_1) + I(p_2)$$

This is the Cauchy functional equation, true for all $p_1$ and $p_2$.
To solve this functional equation suppose

$$p_1 = p_2 = p$$

then this gives

$$I(p^2) = 2I(p)$$

If $p_1 = p^2$ and $p_2 = p$, then

$$I(p^3) = 3I(p)$$

etc. Extending this process you can show, via the standard method used for exponents, for all rational numbers m/n,

$$I(p^{m/n}) = \frac{m}{n} I(p)$$

From the assumed continuity of the information measure it follows the log is the only continuous solution to the Cauchy functional equation.
In information theory it is customary to take the base of the log system as 2, so a binary choice is exactly 1 bit of information. Hence information is measured by the formula

$$I(p) = \log_2 \frac{1}{p} = -\log_2 p$$
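A quick numerical check of these properties, as a sketch: the function name `information` and the sample probabilities are assumptions made for illustration, with the information taken as the base 2 log of 1/p.

```python
from math import isclose, log2


def information(p: float) -> float:
    """Amount of information, in bits, of an event of probability p (0 < p <= 1)."""
    return log2(1 / p)


print(information(1.0))     # 0.0: the certain event carries no information
print(information(0.5))     # 1.0: a binary choice is exactly one bit

# Additivity for independent events, e.g. a die roll (1/6) and a coin toss (1/2).
p1, p2 = 1 / 6, 1 / 2
print(isclose(information(p1 * p2), information(p1) + information(p2)))  # True
```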
The Distortion of Information Theory
- The mathematical definition of information is based on surprise rather than the common human understanding of the word.
- Information is a relative measure that depends entirely on the observer's prior state of knowledge.
- The term 'Information Theory' is arguably a misnomer that should have been called 'Communication Theory' to avoid conceptual confusion.
- The use of the word 'entropy' in this context provides an 'aura of importance' that may not be physically justified.
- Gibbs' inequality proves that maximum entropy occurs when all symbols in a distribution have equal probability.
- The Kraft inequality and pseudoprobabilities allow for the mathematical bounding of uniquely decodable codes.
The same mathematical form does not imply the same interpretation of the symbols!
Let us pause and examine what has happened so far. First, we have not defined "information", we merely gave a formula for measuring the amount. Second, the measure depends on surprise, and while it does match, to a reasonable degree, the situation with machines, say the telephone system, radio, television, computers, and such, it simply does not represent the normal human attitude towards information. Third, it is a relative measure, it depends on the state of your knowledge. If you are looking at a stream of "random numbers" from a random source then you think each number comes as a surprise, but if you know the formula for computing the "random numbers" then the next number contains no surprise at all, hence contains no information! Thus, while the definition Shannon made for information is appropriate in many respects for machines, it does not seem to fit the human use of the word. This is the reason it should have been called "Communication Theory", and not "Information Theory". It is too late to undo the definition (which produced so much of its initial popularity, and still makes people think it handles "information") so we have to live with it, but you should clearly realize how much it distorts the common view of information and deals with something else, which Shannon took to be surprise.
This is a point which needs to be examined whenever any definition is offered. How far does the
proposed definition, for example Shannon's definition of information, agree with the original concepts you
had, and how far does it differ? Almost no definition is exactly congruent with your earlier intuitive
concept, but in the long run it is the definition which determines the meaning of the concept; hence the
formalization of something via sharp definitions always produces some distortion.
Given an alphabet of q symbols with probabilities p_i then the average amount of information (the
expected value), in the system is
$$H(P) = \sum_{i=1}^{q} p_i \log_2 \frac{1}{p_i}$$
This is called the entropy of the system with the probability distribution {p_i}. The name "entropy" is used
because the same mathematical form arises in thermodynamics and in statistical mechanics, and hence the
word "entropy" gives an aura of importance which is not justified in the long run. The same mathematical
form does not imply the same interpretation of the symbols!
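As a small illustration (not from the original text), the amount of information and the entropy can be computed directly. The function names `surprise_bits` and `entropy_bits` are hypothetical, and base 2 logs are assumed, as in the convention stated above.

```python
import math

def surprise_bits(p):
    """Information, in bits, of an event of probability p: log2(1/p)."""
    return math.log2(1.0 / p)

def entropy_bits(probs):
    """Average information of a distribution; symbols of probability 0 contribute nothing."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

fair_coin = [0.5, 0.5]
biased_coin = [0.9, 0.1]

print(surprise_bits(0.5))         # information in one binary choice
print(entropy_bits(fair_coin))    # average information of a fair coin
print(entropy_bits(biased_coin))  # a biased coin surprises us less on average
```

The biased coin has lower entropy than the fair one: the more predictable the source, the less surprise, and hence the less "information" in this sense.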
The entropy of a probability distribution plays a central role in coding theory. One of the important
results is Gibbs' inequality for two different probability distributions, p_i and q_i. We have to prove
$$\sum_{i=1}^{q} p_i \ln \frac{q_i}{p_i} \le 0$$
The proof rests on the obvious picture, Figure 13.I, that
$$\ln x \le x - 1$$
and equality occurs only at x=1. Apply the inequality to each term in the sum on the left hand side
$$\sum_{i=1}^{q} p_i \ln \frac{q_i}{p_i} \le \sum_{i=1}^{q} p_i \left( \frac{q_i}{p_i} - 1 \right) = \sum_{i=1}^{q} q_i - \sum_{i=1}^{q} p_i = 1 - 1 = 0$$
If there are q symbols in the signaling system then picking the q_i=1/q we get from Gibbs' inequality, by
transposing the q terms,
$$\sum_{i=1}^{q} p_i \ln \frac{1}{p_i} \le \ln q$$
This says in a probability distribution if all the q symbols are of equal probability, 1/q, then the maximum
entropy (here measured with natural logs) is exactly ln q, otherwise the inequality holds.
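A quick numerical check of Gibbs' inequality and of the maximum entropy result may help. This is only a sketch: the function names and the example distribution are hypothetical, and natural logs are used to match ln q.

```python
import math

def gibbs_sum(p, q):
    """Sum of p_i * ln(q_i / p_i), the quantity Gibbs' inequality says is never positive."""
    return sum(pi * math.log(qi / pi) for pi, qi in zip(p, q) if pi > 0)

def entropy_nats(p):
    """Entropy measured with natural logs."""
    return sum(pi * math.log(1.0 / pi) for pi in p if pi > 0)

p = [0.5, 0.25, 0.125, 0.125]   # an arbitrary example distribution
uniform = [0.25] * 4            # q_i = 1/q with q = 4 symbols

print(gibbs_sum(p, uniform))          # not positive
print(gibbs_sum(p, p))                # equality when the two distributions agree
print(entropy_nats(p), math.log(4))   # entropy stays below ln q
print(entropy_nats(uniform))          # the uniform distribution reaches ln q
```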
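Given a uniquely decodable code, we have the Kraft inequality
$$K = \sum_{i=1}^{q} \frac{1}{2^{l_i}} \le 1$$
where l_i is the length of the code word for the i-th symbol. Now if we define the pseudoprobabilities
$$Q_i = \frac{2^{-l_i}}{K}$$
where of course Σ[Q_i]=1, it follows from the Gibbs' inequality,
$$\sum_{i=1}^{q} p_i \log_2 \frac{Q_i}{p_i} \le 0$$
after some algebra (remember that K≤1 so we can drop the log term and perhaps strengthen the inequality
further),
$$H(P) \le \sum_{i=1}^{q} p_i l_i = L$$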
Shannon's Noiseless Coding Theorem
- The entropy of a source acts as a fundamental lower bound for the average code length in any symbol-to-symbol encoding.
- Channel capacity is defined as the maximum amount of information that can be reliably transmitted through a channel, maximized over all possible encodings.
- In a binary symmetric channel with white noise, the capacity is determined by the probability of error per bit sent.
- Reliable transmission is achieved by encoding long streams of n bits, where n is large enough to narrow the distribution of expected errors.
- The receiver uses a sphere of radius slightly larger than the expected number of errors to decode the message, with errors occurring if multiple code points fall within that sphere.
- As n increases, the probability of a received message falling outside the sender's error sphere becomes arbitrarily small.
If n is large enough then there is an arbitrarily small probability of there occurring a received message point bj which falls outside this sphere.
Thus the entropy is a lower bound for any encoding, symbol to symbol, for the average code length L. This
is the noiseless coding theorem of Shannon.
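The bound can be checked on a small example. The sketch below assumes a hypothetical four-symbol source and the code word lengths 1, 2, 3, 3 (for instance the code words 0, 10, 110, 111); none of these numbers come from the text.

```python
import math

def kraft_sum(lengths):
    """K = sum of 2^(-l_i) over the code word lengths."""
    return sum(2.0 ** -l for l in lengths)

def average_length(probs, lengths):
    """L = sum of p_i * l_i, the average code length."""
    return sum(p * l for p, l in zip(probs, lengths))

def entropy_bits(probs):
    """Entropy in bits of the source distribution."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

probs = [0.4, 0.3, 0.2, 0.1]   # hypothetical symbol probabilities
lengths = [1, 2, 3, 3]         # hypothetical code word lengths

K = kraft_sum(lengths)
H = entropy_bits(probs)
L = average_length(probs, lengths)
print(K, H, L)   # K does not exceed 1, and H does not exceed L
```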
We now turn to the main theorem on the bounds on signaling systems which use encoding of a bit stream
of independent bits and go symbol to symbol in the presence of noise, meaning there is a probability a bit of
information is correct, P>1/2, and the corresponding probability Q=1-P it is altered when it is transmitted.
For convenience assume the errors are independent and are the same for each bit sent, which is called
"white noise".
We will encode a long stream of n bits into one encoded message, the n-th extension of a one bit code,
where the n is to be determined as the theory progresses. We regard the message of n bits as a point in an
n-dimensional space. Since we have an n-th extension, for simplicity we will assume each message has the
same probability of occurring, and we will assume there are M messages (M also to be determined later),
hence the probability of each initial message is
$$\frac{1}{M}$$
Figure 13.I
We next examine the idea of the channel capacity. Without going into details the channel capacity is
defined as the maximum amount of information which can be sent through the channel reliably, maximized
over all possible encodings, hence there is no argument that more information can be sent reliably than the
channel capacity permits. It can be proved for the binary symmetric channel (which we are using) the
capacity C, per bit sent, is given by
$$C = 1 + P \log_2 P + Q \log_2 Q$$
where, as before, P is the probability of no error in any bit sent. For the n independent bits sent we will have
the channel capacity
$$nC = n \left( 1 + P \log_2 P + Q \log_2 Q \right)$$
If we are to be near channel capacity then we must send almost that amount of information for each of the
symbols a_i, i=1,..., M, and all of probability 1/M, and we must have
$$I(a_i) = n(C - e_1), \qquad e_1 > 0$$
when we send any one of the M equally likely messages a_i. We have, therefore
$$M = 2^{n(C - e_1)}$$
With n bits we expect to have nQ errors. In practice we will have, for a given message of n bits sent,
approximately nQ errors in the received message. For large n the relative spread (spread = width, the
square root of the variance nPQ, taken relative to n) of the distribution of the number of errors will be
increasingly narrow as n increases.
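To put numbers on the two ideas above, here is a minimal sketch. It assumes the standard capacity of the binary symmetric channel, one minus the entropy of the error probability Q, and the binomial standard deviation of the number of errors; the function names are hypothetical.

```python
import math

def binary_entropy(q):
    """Entropy in bits of a binary choice with probabilities q and 1 - q."""
    if q in (0.0, 1.0):
        return 0.0
    return -q * math.log2(q) - (1.0 - q) * math.log2(1.0 - q)

def bsc_capacity(p_correct):
    """Capacity per bit sent of a binary symmetric channel with P = p_correct."""
    return 1.0 - binary_entropy(1.0 - p_correct)

def relative_spread(n, q):
    """Standard deviation of the number of errors, sqrt(n P Q), divided by n."""
    return math.sqrt(n * q * (1.0 - q)) / n

print(bsc_capacity(0.9))      # capacity when one bit in ten is altered
for n in (100, 10_000, 1_000_000):
    print(n, relative_spread(n, 0.1))   # shrinks like 1/sqrt(n)
```

The relative spread falls off as the inverse square root of n, which is why a sphere only slightly larger than nQ catches almost all received messages when n is large.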
From the sender's point of view I take the message a_i to be sent and draw a sphere about it of radius
$$r = n(Q + e_2), \qquad e_2 > 0$$
which is slightly larger by ne_2 than the expected number of errors, nQ, Figure 13.II. If n is large enough then
there is an arbitrarily small probability of there occurring a received message point b_j which falls outside
this sphere. Sketching the situation as seen by me, the sender, we have along any radius from the chosen
signal, a_i, to the received message, b_j, the probability of an error is (almost) a normal distribution,
peaking up at nQ, and with any given e_2 there is an n so large the probability of the received point, b_j, falling
outside my sphere is as small as you please.
Now looking at it from your end, Figure 13.III, as the receiver, there is a sphere S(r) of the same radius r
about the received point, b_j, in the space, such that if the received message, b_j, is inside my sphere then the
original message a_i sent by me is inside your sphere.
Figure 13.II
How can an error arise? An error can occur according to the following table:

case   a_i in S(r)   another in S(r)   meaning
1      yes           yes               error
2      yes           no                no error
3      no            yes               error
4      no            no                error

Here we see that if there is at least one other original message point in the sphere about your received point
then it is an error since you cannot decide which one it is. The sent message is correct only if the sent point
is in the sphere and there is no other code point in it.
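We have, therefore, the mathematical equation for a probability P_E of an error, if the message sent is a_i,
$$P_E = P\{a_i \notin S(r)\} + P\{a_i \in S(r)\} \, P\{\text{at least one other code point is in } S(r)\}$$
We can drop the first factor in the second term by setting it equal to 1, thus making an inequality
$$P_E \le P\{a_i \notin S(r)\} + P\{\text{at least one other code point is in } S(r)\}$$
But using the obvious fact
$$P\{A \cup B\} \le P\{A\} + P\{B\}$$
applied repeatedly to the last term on the right, we have
$$P_E \le P\{a_i \notin S(r)\} + \sum_{a_k \ne a_i} P\{a_k \in S(r)\}$$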
Shannon's Random Coding Method
- Shannon addressed the lack of existing error-correcting codes by proposing a random encoding process using coin tosses for each bit.
- The proof relies on averaging the probability of error over the set of all possible code books rather than analyzing a single specific code.
- By increasing the message length n, the probability of duplicate or dangerously close code points can be reduced below any arbitrary threshold.
- The volume of the error sphere is estimated using Stirling's formula and dominated by a geometric progression to simplify the bound.
- The entropy function H(s) naturally emerges from the binomial identities used to calculate the probability of a point falling within the error sphere.
- The final derivation demonstrates that reliable communication is possible if the message length is sufficiently large, even with random selection.
Not knowing how to encode, error correcting codes not having been invented as yet, Shannon chose a random encoding.
By making n large enough the first term can be made as small as we please, say less than some number d. We
have, therefore,
$$P_E \le d + \sum_{a_k \ne a_i} P\{a_k \in S(r)\}$$
Figure 13.III
We now examine how we can make the code book for the encoding of the M messages, each of n bits. Not
knowing how to encode, error correcting codes not having been invented as yet, Shannon chose a random
encoding. Toss a penny for each bit of the n bits of a message in the code book, and repeat for all M
messages. There are nM tosses hence
$$2^{nM}$$
possible code books, all books being of the same probability 1/2^{nM}. Of course the random process of making
the code book means there is a chance there will be duplicates, and there may be code points which are
close to each other and hence will be a source of probable errors. What we have to prove is this does not
occur with a probability above any positive small level of error you care to pick, provided n is made large
enough.
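The random code book is easy to try out on a small scale. The following is only a sketch under stated assumptions: all names are hypothetical, the radius is taken as the integer part of n(Q+e_2), and decoding follows the table above, succeeding only when the sent code word is the single code word in the sphere about the received point. It estimates the error rate for one random code book and does not average over all books as the proof does.

```python
import random

def random_code_book(M, n, rng):
    """Toss a penny for each of the n bits of each of the M messages."""
    return [tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(M)]

def send(word, Q, rng):
    """Binary symmetric channel: each bit is altered independently with probability Q."""
    return tuple(b ^ (rng.random() < Q) for b in word)

def hamming_distance(a, b):
    """Number of positions in which two words differ."""
    return sum(x != y for x, y in zip(a, b))

def decode(received, book, r):
    """Index of the unique code word within distance r of the received word, else None."""
    inside = [i for i, w in enumerate(book) if hamming_distance(received, w) <= r]
    return inside[0] if len(inside) == 1 else None

def error_rate(M, n, Q, e2, trials, seed=0):
    """Fraction of trials decoded wrongly (or not at all) for one random code book."""
    rng = random.Random(seed)
    book = random_code_book(M, n, rng)
    r = int(n * (Q + e2))
    errors = 0
    for _ in range(trials):
        i = rng.randrange(M)
        if decode(send(book[i], Q, rng), book, r) != i:
            errors += 1
    return errors / trials

# Same number of messages and same noise, with longer and longer code words.
for n in (16, 32, 64):
    print(n, error_rate(M=8, n=n, Q=0.05, e2=0.1, trials=500, seed=1))
```

Holding M fixed while n grows keeps the rate far below channel capacity, so this only illustrates the mechanism; the theorem is about letting M grow as fast as 2^{n(C-e_1)}.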
The decisive step is that Shannon averaged over all possible code books to find the average error! We
will use the symbol Av[.] to mean average over the set of all possible random code books. Averaging over
the constant d of course gives the constant, and we have, since for the average each term is the same as any
other term in the sum,
$$\operatorname{Av}[P_E] \le d + (M-1) \operatorname{Av}[P\{a_k \in S(r)\}]$$
which can be increased (M-1 goes to M),
$$\operatorname{Av}[P_E] \le d + M \operatorname{Av}[P\{a_k \in S(r)\}]$$
For any particular message, when we average over all code books, the encoding runs through all possible
values, hence the average probability that a point is in the sphere is the ratio of the volume of the sphere to
the total volume of the space. The volume of the sphere is
$$\sum_{k=0}^{ns} \binom{n}{k} = 1 + \binom{n}{1} + \binom{n}{2} + \cdots + \binom{n}{ns}$$
where s=Q+e_2<1/2, and ns is supposed to be an integer.
The largest term in this sum is the last (on the right). We first estimate the size of it, via Stirling's formula
for the factorials. We then look at the rate of fall off to the next term before it, note this rate increases as we
go to the left, and hence we can: (1) dominate the sum by a geometric progression with this initial rate, then
(2) extend the geometric progression from ns terms to an infinite number, (3) sum the infinite geometric
progression (all standard algebra of no great importance) and we finally get (4) the bound (for n large
enough)
$$\sum_{k=0}^{ns} \binom{n}{k} \le 2^{nH(s)}$$
Note how the entropy H(s) has appeared in a binomial identity.
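The appearance of the entropy in the volume of the sphere can be checked numerically. The sketch below assumes the bound in its standard form, that the sum of the binomial coefficients up to ns is at most 2^{nH(s)} for s no larger than 1/2, and compares the two on a log scale; the names are hypothetical.

```python
import math

def sphere_volume(n, r):
    """Number of points of the n-dimensional binary space within distance r of a point."""
    return sum(math.comb(n, k) for k in range(r + 1))

def binary_entropy(s):
    """H(s) in bits."""
    if s in (0.0, 1.0):
        return 0.0
    return -s * math.log2(s) - (1.0 - s) * math.log2(1.0 - s)

for n in (20, 100, 500):
    r = n // 5          # radius ns with s = 1/5
    s = r / n
    log_volume = math.log2(sphere_volume(n, r))
    print(n, log_volume / n, binary_entropy(s))
```

As n grows the volume's exponent per dimension, log2(volume)/n, creeps up towards H(s) without passing it.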
We have now to assemble the parts, note the Taylor series expansion of H(s)=H(Q+e_2) gives a bound
when you take only the first derivative term and neglect all others, to get the final expression
$$\operatorname{Av}[P_E] \le d + 2^{-n(e_1 - e_3)}$$
where
$$e_3 = e_2 \log_2 \frac{P}{Q}$$
Shannon's Noisy Coding Theorem
- Shannon's theorem proves that information can be sent at rates near channel capacity with arbitrarily small error by using sufficiently large block lengths.
- The proof relies on the existence of at least one suitable encoding system within the average of all possible random codes.
- A major practical critique is that the required block length 'n' must be extremely large, leading to significant delays and massive codebooks.
- Error-correcting codes avoid these massive codebooks by using computable, regular methods, though they often trade off some proximity to channel capacity.
- The theorem's relevance is demonstrated in deep-space satellites, which use elaborate encoding of long bit strings to overcome low power and vast distances.
- The geometry of n-dimensional space allows for dense packing of signal spheres with minimal overlap, facilitating high-efficiency error correction.
How large is this n? Very, very large indeed if you want to be both close to channel capacity and reasonably sure you are right!
All we have to do now is pick an e_2 so e_3<e_1 and the last term will get as small as you please with
sufficiently large n. Hence the average error of P_E can be made as small as you please while still being as
close to channel capacity C as you please.
If the average over all codes has a suitably small error, then at least one code must be suitable; hence
there exists at least one suitable encoding system. This is Shannon's important result, the "noisy coding
theorem", though let it be noted he proved it in much greater generality than the simple binary symmetric
channel I used. The mathematics is more difficult in the general case, but the ideas are not so much
different, hence the very particular case used suffices to show you the true nature of the theorem.
Let us critique the result. Again and again we said, "For sufficiently large n". How large is this n? Very,
very large indeed if you want to be both close to channel capacity and reasonably sure you are right! So
large, in fact, you would probably have to wait a very long time to accumulate a message of that many bits
before encoding it, let alone the size of the random code books (which being random cannot be represented
in a significantly shorter form than the complete listing of all Mn bits, both n and M being very large).
Error correcting codes escape this waiting for a very long message and then encoding it via a very large
encoding book, along with the corresponding large decoding book, because they avoid code books and adopt
regular (computable) methods. In the simple theory they tend to lose the ability to come very near to the
channel capacity and still keep an arbitrarily low error rate but when a large number of errors are corrected
by the code they can do well. Put into other words, if you provide a capacity for some level of error
correction then for efficiency you must use this ability most of the time or else you are wasting capacity,
and this implies a high number of errors corrected in each message sent.
But the theorem is not useless! It does show, in so far as it is relevant, efficient encoding schemes must
have very elaborate encodings of very long strings of bits of information. We see this accomplished in the
satellites which passed the outer planets; they corrected more and more errors per block as they got farther
and farther from both the Earth and the Sun (which for some satellites supplied the solar power of about 5 watts
at most, others used atomic power sources of about the same power). They had to use high error correcting
codes to be effective, given the low power of the source, their small dish size, the limited size of the
receiving dishes on Earth as seen from their position in space, and the enormous distances the signal had to
travel.
We return to the n-dimensional space which we used in the proof. In the discussion of n-dimensional
space we showed almost all the volume of a sphere lay near the outer surface; thus for the very slightly
(relatively) enlarged sphere about the received signal it is almost certain the original sent signal lies in it.
Thus the error correction of an arbitrarily large number of errors, nQ, with arbitrarily close to no errors after
decoding is not surprising. What is more surprising is the M spheres can be packed with almost no overlap,
again an overlap as small as you please. Insight as to why this is possible comes from a closer examination
of the channel capacity than we have gone into, but you saw for the Hamming error correcting codes the
spheres had no overlap. The many almost orthogonal directions in n-dimensional space indicate why we
can pack the M spheres into the space with little overlap. By allowing a slight, arbitrarily small amount, of
overlap which can lead to only a very few errors in your decoding you can get this dense packing. Hamming
guaranteed a certain level; Shannon only a probably small error but as close to the channel capacity as you
wish, which Hamming codes do not do.
The Limits of Definitions
- Information theory serves as a guide for efficient machine-like communication but lacks relevance for human meaning.
- The application of information theory to biological inheritance remains an open question regarding its machine-like nature.
- Initial definitions often distort reality and dictate results rather than uncovering objective truths.
- IQ testing is a circular process where the definition is calibrated to produce a desired normal distribution.
- The 'softer sciences' are increasingly prone to applying definitions under conditions for which they were never intended.
- Researchers must scrutinize definitions to ensure their findings are not merely tautologies created by their own tools.
Their conclusion arose from the tool used and not from reality.
Information theory does not tell you much about how to design, but it does point the way towards
efficient designs. It is a valuable tool for engineering communication systems between machine-like things,
but as noted before it is not really relevant to human communication of information. The extent to which
biological inheritance, for example, is machine-like, and hence you can apply information theory to the
genes, and to what extent it is not and hence the application is irrelevant, is simply not known at present. So
we have to try, and the success will show the machine-like character, while the failure will point towards
other aspects of information which are important.
We now abstract what we have learned. We have seen all initial definitions, to a larger or smaller extent,
should get the essence of our prior beliefs, but they always have some degree of distortion and hence
non-applicability to things as we thought they were. It is traditional to accept, in the long run, the definition we
use actually defines the thing defined; but of course it only tells us how to handle things, and in no way actually
tells us any meaning. The postulational approach, so strongly favored in mathematical circles, leaves much
to be desired in practice.
We will now take up an example where a definition still bothers us, namely IQ. It is as circular as you
could wish. A test is made up which is supposed to measure "intelligence", it is revised to make it as
consistent internally as we can, and then it is declared, when calibrated by a simple method, to measure
"intelligence" which is now normally distributed (via the calibration curve). All definitions should be
inspected, not only when first proposed, but much later when you see how they are going to enter into the
conclusions drawn. To what extent were the definitions framed as they were to get the desired result? How
often were the definitions framed under one condition and are now being applied under quite different
conditions? All too often these are true! And it will probably be more and more true as we go farther and
farther into the softer sciences, which is inevitable during your lifetime.
Thus one purpose of this presentation of information theory, besides its usefulness, is to sensitize you to
this danger, or if you prefer, how to use it to get what you want! It has long been recognized the initial
definitions determine what you find, much more than most people care to believe. The initial definitions
need your careful attention in any new situation, and they are worth reviewing in fields in which you have
long worked so you can understand the extent the results are a tautology and not real results at all.
There is the famous story by Eddington about some people who went fishing in the sea with a net. Upon
examining the size of the fish they had caught they decided there was a minimum size to the fish in the sea!
Their conclusion arose from the tool used and not from reality.
14
Digital Filters - I
The Genesis of Digital Filters
- Hamming identifies a recurring pattern of technological obsolescence where engineers are left behind during shifts from relays to electronics and analog to digital systems.
- The author warns his Vice President that the transition to total digital transmission risks creating a massive economic and social loss of human capital.
- A brief hallway conversation results in a direct mandate from leadership for Hamming to solve the problem himself by creating educational resources.
- Despite initial disinterest in the subject, Hamming feels a social responsibility to prevent the waste of talent and begins a collaboration with expert Jim Kaiser.
- The project evolves from a joint effort into a solo book as Hamming takes over the writing process to ensure the material is actually produced.
- The resulting book on digital filters went through three editions, illustrating how personal initiative and social concern can drive technical education.
He looked me square in the eye and said, "Yes Hamming, you should." and walked off!
Now that we have examined computers and how they represent information let us turn to how computers
process information. We can, of course, only examine a very few of the things they do, and will concentrate
on basics per usual.
Much of what computers process are signals from various sources, and we have already discussed why
they are often in the form of a stream of numbers from an equally spaced sampling system. Linear
processing, which is the only one I have time for in this book, implies digital filters. To illustrate "style" and
how things actually happen in real life I propose to tell you first how I became involved in them, and then
how I proceeded.
First, I never went to the office of my Vice President, W.O. Baker; we only met in passing in the halls
and we usually stopped to talk a few, very few, minutes. One time, around 1973-1974, when I met him in a
hall I said to him when I came to Bell Telephone Laboratories in 1946 I had noticed the Laboratories were
gradually passing from relay to electronic central offices, but a large number of people would not convert to
oscilloscopes and the newer electronic technology and they were moved to a different location to get them
out of the way. To him they represented a serious economic loss but to me they were a social loss since they
were disgruntled to say the least because they were passed by (though it was their own fault). I went on to
say I had seen the same thing happen when we went from the earlier analog computers (on which Bell
Telephone Laboratories had many experts because they had developed much of the technology during
WW-II) to the more modern digital computers; that we again left a large number of engineers behind, and again
they were both an economic and a social loss. I then observed we both knew the telephone company was
going to total digital transmission about as fast as they could, and this time we would leave behind a very
much larger number of disgruntled engineers. Hence, I concluded, we should do something now about the
situation, such as get adequate elementary books and other training devices to ease more of them into the
future and leave fewer behind. He looked me square in the eye and said, "Yes Hamming, you should." and
walked off! Furthermore, he went on encouraging me, via John Tukey with whom he often spoke, so I knew
he was watching my efforts.
What to do? In the first place I thought I knew very little about digital filters, and, furthermore, I was not
really interested in them. But does one wisely ignore one's V.P. plus the cogency of one's own observations?
No! The implied social waste was too high for me to contemplate comfortably.
So I turned to a friend, Jim Kaiser (J.F. Kaiser), who was one of the world's experts in digital filters at
that time, and suggested he should stop his current research and write a book on digital filters; book writing
to summarize his work was a natural stage in the development of a scientist. After some pressure he agreed
to write the book, so I was saved, so I thought. But monitoring what he was doing revealed he was writing
nothing. To rescue my plan I offered, if he would educate me over lunches in the restaurant (you get more
time to think there than in the cafeteria), to help write the book jointly (mainly the first part), and we could
call it Kaiser and Hamming. Agreed!
As time went on I was getting a good education from him, and I got my first part of the book going but he
was still writing nothing. So one day I said, "If you don't write more we will end up calling it Hamming and
Kaiser." and he agreed. Still later when I had about completed all the writing and he had still written
nothing, I said I could thank him in the preface, but it should be called Hamming, and he agreed, and we
are still good friends! That is how the book on Digital Filters I wrote came to be, and I saw it ultimately
through three editions, always with good advice from Kaiser.
Foundations of Digital Filters
- The author reflects on the long-term professional success gained from teaching short courses on digital filters globally while still writing the textbook.
- Learning new subjects is framed as a career necessity for those who wish to remain leaders rather than followers in their fields.
- Dissatisfied with the explanations of electrical engineers, the author sought a fundamental mathematical reason for the dominance of Fourier series.
- The complex exponentials are identified as the essential tool because they are the eigenfunctions of both time-invariant and linear systems.
- The Nyquist sampling theorem provides a third justification, ensuring that band-limited signals can be perfectly reconstructed from discrete samples.
- The concept of 'aliasing' is introduced to describe how high frequencies masquerade as lower ones when sampling rates are insufficient.
Doing what needed to be done, though I did not want to do it, paid off handsomely in the long run.
The book also took me many places which were interesting since I gave short, one week courses on it
for many years. The short courses began while I was still writing it because I needed feedback and had
suggested to UCLA Extension Division I give it as a short course, to which they agreed. That led to years of
giving it at UCLA, once in each of Paris, London, and Cambridge, England, as well as many other places in
the USA and at least twice in Canada. Doing what needed to be done, though I did not want to do it, paid
off handsomely in the long run.
Now, to the more important part, how I went about learning the new subject of digital filters. Learning a
new subject is something you will have to do many times in your career if you are to be a leader and not be
left behind as a follower by newer developments. It soon became clear to me digital filter theory was
dominated by Fourier series, about which theoretically I had learned in college, and actually I had had a lot
of further education during the signal processing I had done for John Tukey, who was a professor from
Princeton, a genius, and a one or two day a week employee of Bell Telephone Laboratories. For about ten
years I was his computing arm much of the time.
Being a mathematician I knew, as all of you do, any complete set of functions will do about as well as
any other set at representing arbitrary functions. Why, then, the exclusive use of the Fourier series? I asked
various Electrical Engineers and got no satisfactory answers. One engineer said alternating currents were
sinusoidal, hence we used sinusoids, to which I replied it made no sense to me. So much for the usual
residual education of the typical Electrical Engineer after they have left school!
So I had to think of basics, just as I told you I had done when using an error detecting computer. What is
really going on? I suppose many of you know what we want is a time invariant representation of signals
since there is usually no natural origin of time. Hence we are led to the trigonometric functions (the
eigenfunctions of translation), in the form of both Fourier series and Fourier integrals, as the tool for
representing things.
Second, linear systems, which is what we want at this stage, also have the same eigenfunctions, the
complex exponentials which are equivalent to the real trigonometric functions. Hence a simple rule: If you
have either a time invariant system, or a linear system, then you should use the complex exponentials.
On further digging into the matter I found yet a third reason for using them in the field of digital filters.
There is a theorem, often called "Nyquist's sampling theorem" (though it was known long before and even
published by Whittaker in a form you can hardly realize what it is saying even when you know Nyquist's
theorem), which says, if you have a band limited signal and sample at equal spaces at a rate of at least two
samples in each cycle of the highest frequency, then the original signal can be reconstructed from the samples.
Hence the sampling process loses no information when we replace the continuous signal with the equally
spaced samples, provided the samples cover the whole real line. The sampling rate is often known as "the
Nyquist rate" after Harry Nyquist, also of servo stability fame as well as other things. If you sample a
nonbandlimited function, then the higher frequencies are "aliased" into lower ones, a word devised by Tukey
to describe the fact that a single high frequency will appear later as a single low frequency in the Nyquist
band. The same is not true for any other set of functions, say powers of t. Under equal spaced sampling and
reconstruction a single high power of t will go into a polynomial (many terms) of lower powers of t.
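Aliasing is easy to see numerically. The sketch below assumes time is measured in units of one sample, so the Nyquist frequency is 1/2 cycle per sample; the frequencies 0.9 and 0.1 and the function name are hypothetical choices made for the illustration.

```python
import math

def sample_cosine(freq, count):
    """Samples of cos(2*pi*freq*n) at the integers n = 0, 1, ..., count - 1."""
    return [math.cos(2.0 * math.pi * freq * n) for n in range(count)]

high = sample_cosine(0.9, 16)   # above the Nyquist frequency of 0.5
low = sample_cosine(0.1, 16)    # its alias inside the Nyquist band, 1 - 0.9

# Largest difference between the two sets of samples: rounding error only.
print(max(abs(h - l) for h, l in zip(high, low)))
```

Once the samples have been taken the two frequencies cannot be told apart: a single high frequency has become a single low frequency.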
Thus there are three good reasons for the Fourier functions: (1) time invariance, (2) linearity, and (3) the
reconstruction of the original function from the equally spaced samples is simple and easy to understand.
Signals and Linear Systems
- The reconstruction of original functions from equally spaced samples is best analyzed using Fourier functions and complex exponents.
- Complex exponentials serve as the eigenfunctions of linear, time-invariant, equally spaced sampled systems.
- The transfer function in electrical engineering is fundamentally the set of eigenvalues corresponding to these eigenfunctions.
- A digital signal is defined as an equally spaced sequence of measurements, typically resulting from sampling and quantizing continuous natural signals.
- Equally spaced sampling inevitably leads to aliasing, where frequencies above the Nyquist limit are perceived as lower frequencies.
Lo, and behold, the famous transfer function is exactly the eigenvalues of the corresponding eigenfunctions!
Therefore we are going to analyse the signals in terms of the Fourier functions, and I need not discuss
with electrical engineers why we usually use the complex exponents as the frequencies instead of the real
trigonometric functions. We have a linear operation and when we put a signal (a stream of numbers) into the
filter then out comes another stream of numbers. It is natural, if not from your linear algebra course, then
from other things such as a course in differential equations, to ask what functions go in and come out
exactly the same except for scale? Well, as noted above, they are the complex exponentials; they are the
eigenfunctions of linear, time invariant, equally spaced sampled systems.
Lo, and behold, the famous transfer function is exactly the eigenvalues of the corresponding
eigenfunctions! Upon asking various Electrical Engineers what the transfer function was no one has ever
told me that! Yes, when pointed out to them it is the same idea they have to agree, but the fact it is the same
idea never seemed to have crossed their minds! The same, simple idea, in two or more different disguises in
their minds, and they knew of no connection between them! Get down to the basics every time!
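The eigenfunction property can be verified directly. This is a minimal sketch, assuming a hypothetical three-term smoothing filter and one sample per unit time: a complex exponential goes in, and the same exponential comes out multiplied by a single complex number, the value of the transfer function at that frequency.

```python
import cmath

def fir_filter(coeffs, x, n):
    """Output y[n] = sum over k of c_k * x[n - k], with the input x a function of the integer n."""
    return sum(c * x(n - k) for k, c in enumerate(coeffs))

def transfer_function(coeffs, omega):
    """Eigenvalue belonging to exp(i*omega*n): sum over k of c_k * exp(-i*omega*k)."""
    return sum(c * cmath.exp(-1j * omega * k) for k, c in enumerate(coeffs))

coeffs = [0.25, 0.5, 0.25]   # hypothetical smoothing filter
omega = 0.7                  # hypothetical angular frequency, radians per sample

def x(n):
    """The input signal, a complex exponential."""
    return cmath.exp(1j * omega * n)

eigenvalue = transfer_function(coeffs, omega)
for n in range(-2, 3):
    # Output divided by input is the same number for every n.
    print(n, fir_filter(coeffs, x, n) / x(n), eigenvalue)
```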
We begin our discussion with, "What is a signal?" Nature supplies many signals which are continuous,
and which we therefore sample at equal spacing and further digitize (quantize). Usually the signals are a
function of time, but any experiment in a lab which uses equally spaced voltages, for example, and records
the corresponding responses, is also a digital signal. A digital signal is, therefore, an equally spaced
sequence of measurements in the form of numbers, and we get out of the digital filter another equally spaced
set of numbers. One can, and at times must, process nonequally spaced data, but I shall ignore them here.
The quantization of the signal into one of several levels of output often has surprisingly small effect. You
have all seen pictures quantized to two, four, eight, and more levels, and even the two level picture is
usually recognizable. I will ignore quantization here as it is usually a small effect, though at times it is very
important.
The consequence of equally spaced sampling is aliasing, a frequency above the Nyquist frequency
(which has two samples in the cycle) will be aliased in to a lower frequency. This is a simple consequence of
the trigonometric identity
Aliasing and Sampling Fundamentals
- Aliasing occurs when signals are sampled at a rate lower than the Nyquist rate, causing higher frequencies to fold into the fundamental interval.
- Complex exponential notation is preferred over real trigonometric functions because it avoids multiple eigenvalues and covers a symmetric frequency band.
- Standardizing notation by scaling time to one unit per sample simplifies analysis across diverse fields of application.
- Once sampling occurs, higher frequencies are permanently aliased into the lower band and effectively cease to exist as separate entities.
- The author suggests that while the Nyquist rate requires two samples per cycle, practical constraints like finite data points may require up to eight samples per cycle.
- Aliasing is an inherent property of the sampling process itself, independent of any subsequent signal processing.
I have found it convenient to think once the samples have been taken then all the frequencies are in the Nyquist band, and hence we do not need to draw periodic extensions of anything since the other frequencies no longer exist in the signal.
where a is the positive remainder after removing the integer number of rotations, k (we always use rotations
in discussing results, and use radians while applying the calculus, just as we use base 10 logs and base e
logs), and n is the step number. If a > 1/2, then we can write the above as
The aliased band, therefore, is less than 1/2 a rotation, plus or minus. If we use the two real trigonometric
functions, sin and cos, we have a pair of eigenfunctions for each frequency, and the band is from 0 to 1/2 a
rotation, but when we use the complex exponential notation then we have one eigenfunction for each
frequency, but now the band reaches from -1/2 to 1/2 rotations. This avoidance of the multiple eigenvalues
is part of the reason the complex frequencies are so much easier to handle than are the real sine and cosine
functions. The maximum sampling rate for which aliasing does not occur is two samples in the cycle, and is
called the Nyquist rate. From the samples the original signal cannot be determined to within the aliased
frequencies; only the basic frequencies that fall in the fundamental interval of unaliased frequencies (-1/2 to
1/2) can be determined uniquely. The signals from the various aliased frequencies go to a single frequency
in the band and are algebraically added; that is what we see once the sampling has been done. Hence
addition or cancellation may occur during the aliasing, and we cannot know from the aliased signal what we
originally had. At the maximum sampling rate one cannot tell the result from 1, hence the unaliased
frequencies must be within the band.
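The folding described above can be checked numerically. The sketch below assumes unit sample spacing, so frequencies are measured in rotations per sample and a frequency k + a is indistinguishable from a at the integer sample points, since exp(2πi(k + a)n) = exp(2πian) for integer k and n. The helper name `alias_to_band` and the test frequencies are illustrative choices, not taken from the text.

```python
import numpy as np

def alias_to_band(f):
    """Fold a frequency f (rotations per sample) into the band [-1/2, 1/2)."""
    return (f + 0.5) % 1.0 - 0.5

n = np.arange(16)                  # sample (step) numbers
f_high = 3.2                       # k = 3 whole rotations plus a = 0.2
f_low = alias_to_band(f_high)      # folds to 0.2

# Sampled at the integers, the two complex exponentials are indistinguishable.
high = np.exp(2j * np.pi * f_high * n)
low = np.exp(2j * np.pi * f_low * n)

# With a > 1/2 the alias lands at a negative frequency: 0.7 folds to -0.3.
f_neg = alias_to_band(0.7)
```

Once the samples are taken, nothing in `high` records that it came from 3.2 rather than 0.2 rotations per sample; only the folded frequency is left.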
We shall stretch (compress) time so we can take the sampling rate to be one per unit time, because this
makes things much easier and brings experiences from the milli and micro second range to those which may
take days or even years between samples. It is always wise to adopt a standard notation and framework of
thinking of diverse things; one field of application may suggest things to do in the other. I have found it of
great value to do so whenever possible: remove the extraneous scale factors and get to the basic
expressions. (But then I was originally trained as a mathematician.)
Aliasing is the fundamental effect of sampling and has nothing to do with how the signals are processed.
I have found it convenient to think once the samples have been taken then all the frequencies are in the
Nyquist band, and hence we do not need to draw periodic extensions of anything since the other frequencies
no longer exist in the signal; once the sampling has occurred the higher frequencies have been aliased into
the lower band, and do not exist up there any more. A significant savings in thinking! The act of sampling
produces the aliased signal we must use.
I now turn to three stories which use only the ideas of sampling and aliasing. In the first story I was trying
to compute the numerical solution to a system of 28 ordinary differential equations and I had to know the
sampling rate to use (the step size of the solution is the sampling rate you are using), since if it were half as
large as expected then the computing bill would be about twice as much. For the most popular and practical
methods of numerical solution the mathematical theory bases the step size on the fifth derivative. Who
could know the bound? No one! But viewed as sampling, then the aliasing begins at two samples for the
highest frequency present, provided you have data from minus to plus infinity. Having only a short range of
at most five points of data I intuitively figured I would need about twice the rate, or 4 samples per cycle.
And finally, having only data on one side, perhaps another factor of 2; in all 8 samples per cycle.
I next did two things: (1) developed the theory, and (2) ran numerical tests on the simple differential equation
The Power of Sampling Fundamentals
- Hamming illustrates how a firm grasp of the sampling theorem can solve complex engineering problems, such as debugging a missile simulation from across the continent.
- He demonstrates that understanding aliasing allows for the removal of unnecessary hardware by using the sampling act itself to modulate high-frequency signals.
- The text emphasizes that mastering fundamentals enables engineers to perform 'fancy' tasks and innovate beyond standard instructions.
- The discussion transitions into nonrecursive filters, explaining their origins in telephone multiplexing and the use of trigonometric identities for frequency shifting.
- A practical example of filter design is introduced using the least squares method to fit a straight line to data points for smoothing noise.
Debugging a large program across the continent based on the sampling theorem!
They both showed at around 7 samples per cycle you are on the edge of accuracy (per step) and at 10 you
are very safe. So I explained the situation to them and asked them for the highest frequencies in the
expected solution. They saw the justice of my request, and after some days they said I had to worry about the
frequencies up to 10 cycles per second and they would worry about those above. They were right, and the
answers were satisfactory. The sampling theorem in action!
The second story involves a remark, made to me casually in the halls of Bell Telephone Laboratories, that
a certain West Coast subcontractor was having trouble with the simulation of a Nike missile launch, and
was using 1/1000 to 1/10,000 of a second spacing. I laughed immediately, and said there must be some
mistake, 70 to 100 samples would be enough for the model they were using. It turned out they had a binary
number 7 positions to the left, 128 times too large! Debugging a large program across the continent based on
the sampling theorem!
The third story is that a group at the Naval Postgraduate School was modulating a very high frequency signal
down to where they could afford to sample, according to the sampling theorem as they understood it. But I
realized if they cleverly sampled the high frequency then the sampling act itself would modulate (alias) it
down. After some days of argument, they removed the rack of frequency lowering equipment, and the rest of
the equipment ran better! Again, I needed only a firm understanding of the aliasing effects due to sampling.
It is another example of why you need to know the fundamentals very well; the fancy parts then follow
easily and you can do things that they never told you about.
Sampling is fundamental to the way we currently process data when we use digital computers.
And now we understand what a signal is, and what sampling does to a signal, we can safely turn to more of
the details of processing signals.
We will first discuss nonrecursive filters, whose purpose is to pass some frequencies and stop others. The
problem first arose in the telephone company when they had the idea if one voice message had all its
frequencies moved up (modulated) to beyond the range of another then the two signals could be added and
sent over the same wires, and at the other end filtered out and separated, and the higher one reduced
(demodulated) back to its original frequencies. This shifting is simply multiplying by a sinusoidal function,
and selecting one band (single sideband modulation) of the two frequencies which emerge according to the
following trigonometric identity (this time we use real functions)
There is nothing mysterious about the frequency shifting (modulation) of a signal; it is at most a variant of a
trigonometric identity.
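For reference, the standard product-to-sum identity behind this kind of frequency shifting can be written as below. The symbols f_1 (a frequency in the voice signal) and f_2 (the frequency of the multiplying sinusoid) are labels assumed here for illustration, and the text's own displayed form may differ.

```latex
\cos(2\pi f_1 t)\,\cos(2\pi f_2 t)
  = \tfrac{1}{2}\Big[\cos\big(2\pi (f_1 + f_2)\, t\big) + \cos\big(2\pi (f_1 - f_2)\, t\big)\Big]
```

Multiplying by the sinusoid produces the sum and the difference frequencies; single sideband modulation keeps one of the two.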
The nonrecursive filters we will consider first are mainly of the smoothing type where the input is the
values u(t) = u(n) = u_n and the output is y_n
with C_j = C_{-j} (the coefficients are symmetric about the middle value C_0).
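As a minimal sketch of such a filter, assuming the output is the weighted sum y_n = sum over j from -k to k of C_j u_{n-j}, the code below applies a symmetric coefficient set to a stream of numbers. The function name `nonrecursive_filter`, the sample values, and the choice to emit only outputs whose window fits entirely inside the data are assumptions made for illustration.

```python
import numpy as np

def nonrecursive_filter(u, c):
    """Apply y_n = sum_j C_j * u_{n-j}, j = -k..k, with c = [C_-k, ..., C_0, ..., C_k].

    Only positions where the whole window lies inside the data are returned,
    so the output is 2k samples shorter than the input.
    """
    u = np.asarray(u, dtype=float)
    c = np.asarray(c, dtype=float)
    if len(c) % 2 == 0:
        raise ValueError("expected an odd number of coefficients (2k + 1)")
    # np.convolve forms sum_j c[j] * u[n - j]; 'valid' drops the ragged ends.
    return np.convolve(u, c, mode="valid")

# Example: a symmetric 3-term smoother applied to a short stream of numbers.
u = [1.0, 2.0, 4.0, 8.0, 16.0]
y = nonrecursive_filter(u, [0.25, 0.5, 0.25])
```

Each output depends only on input values, never on earlier outputs, which is what makes the filter nonrecursive.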
I need to remind you about least squares as it plays a fundamental role in what we are going to do, hence
I will design a smoothing filter to show you how filters can arise. Suppose we have a signal with "noise"
added and want to smooth it, remove the noise. We will assume it seems reasonable to you to fit a straight line
to 5 consecutive points of the data in a least squares sense, and then take the middle value on the line as the
"smoothed value of the function" at that point.
For mathematical convenience we pick the 5 points at t = -2, -1, 0, 1, 2 and fit the straight line, Figure 14.I,
Least squares says we should minimize the sum of the squares of the differences between the data and the
points on the line, that is, minimize
What are the parameters to use in the differentiation to find the minimum? They are the a and the b, not the
t (now the discrete variable k), and u. The line depends on the parameters a and b, and this is often a
Figure 14.I
Digital Filters and Smoothing Formulas
- The text derives smoothing formulas by applying least squares minimization to linear and quadratic functions over a data window.
- A simple linear fit over five points results in a uniform running average, while a quadratic fit introduces non-uniform coefficients including negative values.
- Digital filters can be conceptualized as a 'window' through which data is viewed, with the coefficients defining the window's shape and transmission properties.
- Smoothing formulas exhibit central symmetry in their coefficients, whereas differentiating formulas exhibit odd symmetry.
- Any non-recursive digital filter can be decomposed into the sum of a smoothing filter and a differentiating filter.
- The transfer function of a smoothing filter is represented by a Fourier cosine expansion, linking filter design to the mathematics of Fourier series.
Do not let that worry you as we were speaking of a window in a metaphorical way and hence negative transmission is possible.
stumbling block for the student; the parameters of the equation are the variables for minimization! Hence on
differentiating with respect to a and b, and equating the derivatives to zero to get the minimum, we have
In this case we need only a, the value of the line at the midpoint, hence using (some of the sums are for later
use),
from the top equation we have
which is simply the average of the five adjacent values. When you think about how to carry out the
computation for a, the smoothed value, think of the data in a vertical column, Figure 14.II, with the
coefficients each 1/5, as a running weighting of the data; then you can think of it as a window through which
you look at the data, with the "shape" of the window being the coefficients of the filter, this case of
smoothing being uniform in size.
Had we used 2k+1 symmetrically placed points we would still have obtained a running average of the
data points as the smoothed value which is supposed to eliminate the noise.
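This result can be checked numerically. The sketch below assumes the line has the form u = a + bt fitted at t = -2, ..., 2, and uses NumPy's least squares solver; the data values are made up for illustration only.

```python
import numpy as np

t = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
u = np.array([3.1, 4.7, 4.2, 6.0, 5.5])   # made-up noisy data, illustration only

# Design matrix for the straight line a + b*t; its columns multiply a and b.
A = np.column_stack([np.ones_like(t), t])
(a, b), *_ = np.linalg.lstsq(A, u, rcond=None)

# The same answer seen as filter weights: the row of the pseudo-inverse that
# produces a is the "window" of coefficients, each equal to 1/5.
weights = np.linalg.pinv(A)[0]
```

Because the sample points are symmetric about 0, the slope b does not enter the midpoint value, and a is just the mean of the five values.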
Suppose instead of a straight line we had smoothed by fitting a quadratic, Figure 14.III,
Setting up the sum of the squares of the differences and differentiating this time with respect to a, b and c we get:
Again we need only a. Rewriting the first and third equations (the middle one does not involve a), and
inserting the known sums
from above, we have
To eliminate c, which we do not need, we multiply the top equation by 17 and the lower equation by -5, and
add to get
and this time our "smoothing window" does not have uniform coefficients, but has some with negative
values. Do not let that worry you as we were speaking of a window in a metaphorical way and hence
negative transmission is possible.
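The elimination can be confirmed numerically. Over k = -2, ..., 2 the sums of 1, k^2 and k^4 are 5, 10 and 34, so the two equations involving a are 5a + 10c = sum of u_k and 10a + 34c = sum of k^2 u_k; multiplying by 17 and -5 and adding gives 35a = sum of (17 - 5k^2) u_k. The sketch below, which assumes the parabola is written a + bk + ck^2, recovers the same weights from the pseudo-inverse of the design matrix.

```python
import numpy as np

k = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])

# Design matrix for the parabola a + b*k + c*k**2.
A = np.column_stack([np.ones_like(k), k, k**2])

# Row 0 of the pseudo-inverse gives a as a weighted sum of the five data values.
weights = np.linalg.pinv(A)[0]
scaled = weights * 35          # expected: -3, 12, 17, 12, -3
```

The two end weights are the negative ones, which is the "negative transmission" of the window.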
If we now shift these two least squares derived smoothing formulas to their proper places about the point
n we would have
We now ask what will come out if we put in a pure eigenfunction. We know because the equations are
linear they should give the eigenfunction back, but multiplied by the eigenvalue corresponding to the
eigenfunction's frequency, the transfer function value at that frequency. Taking the top of the two
smoothing formulas we have
Hence the eigenvalue at the frequency ω (the transfer function) is, by elementary trigonometry,
In the parabolic smoothing case we will get
These are easily sketched along with the 2k+1 smoothing by straight line curves, Figure 14.IV.
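The eigenvalue can also be measured directly by feeding a complex exponential through each formula. The sketch below assumes the two smoothing formulas are the 5-point running average and the 5-point parabolic smoother with weights (-3, 12, 17, 12, -3)/35, and compares the measured multipliers with the cosine forms obtained by pairing the symmetric terms. The function name `transfer` is illustrative.

```python
import numpy as np

def transfer(coeffs, omega):
    """Eigenvalue of a symmetric filter C_-k..C_k at angular frequency omega.

    Feed in exp(i*omega*n) and read off the multiplier at n = 0:
    H(omega) = sum_j C_j * exp(i*omega*j); the sign of j is immaterial
    because C_j = C_-j.
    """
    coeffs = np.asarray(coeffs, dtype=float)
    k = (len(coeffs) - 1) // 2
    j = np.arange(-k, k + 1)
    return np.sum(coeffs * np.exp(1j * omega * j))

line5 = np.ones(5) / 5
parab5 = np.array([-3, 12, 17, 12, -3]) / 35

omega = np.linspace(0.0, np.pi, 7)
H_line = np.array([transfer(line5, w) for w in omega])
H_parab = np.array([transfer(parab5, w) for w in omega])

# Closed forms from pairing the symmetric terms (cosines only, hence real).
H_line_closed = (1 + 2 * np.cos(omega) + 2 * np.cos(2 * omega)) / 5
H_parab_closed = (17 + 24 * np.cos(omega) - 6 * np.cos(2 * omega)) / 35
```

Both transfer functions equal 1 at zero frequency, so a constant passes through either smoother unchanged.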
Figure 14.II
Figure 14.III
Smoothing formulas have central symmetry in their coefficients, while differentiating formulas have odd
symmetry. From the obvious formula
we see any formula is the sum of an odd and an even function, hence any nonrecursive digital filter is the
sum of a smoothing filter and a differentiating filter. When we have mastered these two special cases we
have the general case in hand.
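The decomposition referred to here is presumably the standard split of a function into its even and odd parts, which in symbols reads as follows (the notation f(t) is assumed for illustration):

```latex
f(t) = \frac{f(t) + f(-t)}{2} + \frac{f(t) - f(-t)}{2}
```

Applied to a set of filter coefficients indexed by j, the first term gives the symmetric (smoothing) part and the second the antisymmetric (differentiating) part.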
For smoothing formulas we see the eigenvalue curve (the transfer function) is a Fourier expansion in
cosines, while for the differentiation formula it will be an expansion in sines. Thus we are led, given a
transfer function you want to achieve, to the problem of Fourier expansions of a given function.
Now to a brief recapitulation of Fourier series. If we assume that the arbitrary function f(t) is represented
we use the orthogonality conditions (they can be found by elementary trigonometry and simple
integrations):
we get
and because we used an a0/2 for the first coefficient the same formula for ak holds for the case k=0. In the
complex notation it is, of course, much simpler.
Next we need to prove the fit of any orthogonal set of functions gives the least squares fit. Let the set of
orthogonal functions be {f_k(t)} with weight function w(t) ≥ 0. Orthogonality means
As above the formal expansion will give the coefficients
where the
when the functions are real, and in the case of complex functions we multiply through by the complex
conjugate function.
Figure 14.IV
Digital Filters and Innovation
- Orthogonal function fits are mathematically equivalent to least squares fits, providing a reliable method for finite approximations.
- Bessel's inequality serves as a practical guide for determining the necessary number of terms in a Fourier expansion.
- The history of computing shows that viewing new technology as merely an extension of the old prevents significant innovation.
- A change in magnitude, such as speed or cost, often creates fundamentally new effects rather than just incremental improvements.
- Digital filters were initially misunderstood as simple variants of analog filters rather than a distinct field of study.
- Simple smoothing filters can be combined to effectively remove high-frequency noise from a stream of numbers.
Those who claimed there was no essential difference never made any significant contributions to the development of computers.
Now consider the least squares fit of a complete set of orthogonal functions using the coefficients
(capitals) C_k. We have
to minimize. Differentiate with respect to C_m. You get
and we see from a rearrangement the C_k = c_k. Hence all orthogonal function fits are least squares fits,
regardless of the set of orthogonal functions used.
If we keep track of the inequality we find we will have, in the general case, Bessel's inequality
for the number of coefficients taken in the sum, and this provides a running test for when you have taken
enough terms in a finite approximation. In practice this has proven to be a very useful guide to how many
terms to take in a Fourier expansion.
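The steps of this argument can be sketched in symbols as follows, assuming real functions. The symbol λ_k for the normalizing integral, the unspecified limits of integration, and the upper limit N of the sum are notational assumptions; the text's own displayed formulas may differ in form.

```latex
% Orthogonality and the formal expansion coefficients
\int w(t)\, f_j(t)\, f_k(t)\, dt =
  \begin{cases} 0, & j \ne k \\ \lambda_k, & j = k \end{cases}
\qquad
c_k = \frac{1}{\lambda_k} \int w(t)\, f(t)\, f_k(t)\, dt

% Least squares: differentiate the squared error with respect to C_m
\frac{\partial}{\partial C_m} \int w(t) \Big[ f(t) - \sum_{k} C_k f_k(t) \Big]^2 dt
  = -2 \int w(t) \Big[ f(t) - \sum_{k} C_k f_k(t) \Big] f_m(t)\, dt = 0
\;\Longrightarrow\;
\int w(t)\, f(t)\, f_m(t)\, dt = C_m \lambda_m
\;\Longrightarrow\; C_m = c_m

% The minimized error is nonnegative, which gives Bessel's inequality
\int w(t)\, f(t)^2\, dt - \sum_{k=0}^{N} \lambda_k c_k^2 \;\ge\; 0
```

The left side of the last line is the remaining squared error after N + 1 terms, so watching the running sum approach the integral of w f^2 shows how much is still unaccounted for.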
15
Digital Filters - II
When digital filters first arose they were viewed merely as a variant of the classical analog filters; people
did not see them as essentially new and different. This is exactly the same mistake which was made
endlessly by people in the early days of computers. I was told repeatedly, until I was sick of hearing it,
computers were nothing more than large, fast desk calculators. "Anything you can do by a machine you can
do by hand", so they said. This simply ignores the speed, accuracy, reliability, and lower costs of the
machines vs. humans. Typically a single order of magnitude change (a factor of 10) produces fundamentally
new effects, and computers are many, many times faster than hand computations. Those who claimed there
was no essential difference never made any significant contributions to the development of computers.
Those who did make significant contributions viewed computers as something new to be studied on their
own merits and not as merely more of the same old desk calculators, perhaps souped up a bit.
This is a common, endlessly made, mistake; people always want to think that something new is just like
the past (they like to be comfortable in their minds as well as their bodies) and hence they prevent
themselves from making any significant contribution to the new field being created under their noses. Not
everything which is said to be new really is new, and it is hard to decide in some cases when something is
new, yet the all too common reaction of "It's nothing new" is stupid. When something is claimed to be
new, do not be too hasty to think it is just the past slightly improved; it may be a great opportunity for you
to do significant things. But again it may be nothing new.
The earliest digital filter I used, in the early days of primitive computers, was one which smoothed first
by 3's and then by 5's. Looking at the formula for smoothing, the smoothing by 3's has the transfer function
which is easy to draw, Figure 15.I. The smoothing by 5's is the same except that the 3/2 becomes a 5/2 and
is again easy to draw, Figure 15.I. One filter followed by the other is obviously their product (each
multiplies the input eigenfunction by the transfer function at that frequency), and you see there will be three
zeros in the interval, and the terminal value will be 1/15 in magnitude. An examination will show the upper half of the
frequencies were fairly well removed by this very simple program for computing a running sum of 3
numbers, followed by a running sum of 5; as is common in computing practice the divisors were left to
the very end where they were allowed for by one multiplication, by 1/15.
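A short numerical check of the cascade follows. It assumes the transfer function of a centered running mean of m points has the closed form sin(mω/2) / (m sin(ω/2)), which is consistent with the remark that the 3/2 becomes a 5/2; the function name `running_mean_transfer` and the chosen test frequencies are illustrative.

```python
import numpy as np

def running_mean_transfer(m, omega):
    """Transfer function of a centered running mean of m (odd) points.

    Closed form sin(m*omega/2) / (m*sin(omega/2)); the value at omega = 0 is 1.
    """
    omega = np.asarray(omega, dtype=float)
    num = np.sin(m * omega / 2)
    den = m * np.sin(omega / 2)
    out = np.ones_like(omega)
    nz = den != 0
    out[nz] = num[nz] / den[nz]
    return out

f = np.array([0.0, 1 / 5, 1 / 3, 2 / 5, 1 / 2])   # rotations per sample
omega = 2 * np.pi * f

# One filter after the other multiplies the two transfer functions.
H = running_mean_transfer(3, omega) * running_mean_transfer(5, omega)

# The same cascade as one coefficient set: running sum of 3, then of 5,
# with the single division by 15 left to the end.
coeffs = np.convolve(np.ones(3), np.ones(5)) / 15
```

The three zeros fall at f = 1/5, 1/3 and 2/5. At the Nyquist frequency f = 1/2 the product has magnitude 1/15; with this closed form its sign is negative, since the 3-point factor is -1/3 there and the 5-point factor is 1/5.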
Now you may wonder how, in all its detail, a digital filter removes frequencies from a stream of numbers,
and even students who have had courses in digital filters may not be at all clear how the miracle happens.
Hence I propose, before going further, to design a very simple digital filter and show you the inner working
on actual numbers.
I propose to design a simple filter with just two coefficients, and hence I can meet exactly two conditions
on the transfer function. When doing theory we use the angular frequency ω, but in practice we use
rotations f, and the relationship is ω = 2πf.
Designing Simple Digital Filters
- The author establishes a digital filter with two specific constraints: a transfer function of 1 at frequency 1/6 and 0 at frequency 1/3.
- A simple filter form is derived using two coefficients, a and b, applied to the eigenfunction exp{2πifn}.
- The resulting smoothing filter formula defines the output as the sum of three consecutive inputs divided by two, with the output placed opposite (aligned with) the middle input value.
- Sample data is generated using cosine waves at the specified frequencies to test the filter's performance.
- The final test signal is a composite of both frequencies, designed to demonstrate the filter's ability to isolate specific signals.
- This process illustrates the fundamental mathematical relationship between frequency response requirements and filter coefficients.
In words, the output of the filter is the sum of three consecutive inputs divided by 2, and the output is opposite the middle input value.
Let the first condition on the digital filter be at f=1/6 the transfer function is exactly 1 (this frequency is to
get through the filter unaltered), and the second condition at f=1/3 it is to be zero (this frequency is to be
stopped completely). My simple filter has the form, with the two coefficients a and b,
Substituting in the eigenfunction exp{2πifn} we will get the transfer function, and using n=0 for
convenience,
The solution is
and the smoothing filter is simply
In words, the output of the filter is the sum of three consecutive inputs divided by 2, and the output is
opposite the middle input value. [It is the earlier smoothing by 3's except for the coefficient 1/2.]
Now to produce some sample data for the input to the filter. At the frequency f=1/6 we use a cosine at
that frequency taking the values of the cosine at equally spaced values n=0,1,..., while for the second column of
data we use the second frequency f=1/3, and finally in the third column is the sum of the two other columns,
a signal composed of the two frequencies in equal amounts.
Figure 15.I
Digital Filters and Gibbs' Phenomena
- Digital filters operate by decomposing signals into frequencies and multiplying them by specific eigenvalues defined by a transfer function.
- Ideal filters require a sharp cutoff between passed and stopped frequencies, which mathematically implies a discontinuous transfer function.
- Approximating these discontinuities with a finite Fourier series leads to the Gibbs' phenomenon, characterized by a persistent overshoot.
- The overshoot does not diminish toward zero as more terms are added, maintaining a limit of approximately 8.949%.
- The phenomenon was famously identified by Josiah Willard Gibbs after the physicist Albert Michelson suspected his mechanical analyzer was malfunctioning.
- This historical episode illustrates that scientific discovery often favors those prepared to investigate anomalies rather than dismissing them as equipment error.
When Michelson did this he observed an overshoot and asked the local mathematicians why it happened. They all said it was his equipment, and yet he was well known as a very careful experimenter.
Let us run the data through the filter. We compute, according to the filter formula, the sum of three
consecutive numbers in a column and then divide their sum by 2. Doing this on the first column you will see
each time the filter is shifted down one line it reproduces the input function (with a multiplier of 1). Try the
filter on the second column and you will find every output is exactly 0, the input function multiplied by its
eigenvalue 0. The third column, which is the sum of the first two columns, should pass the first and stop the
second frequency, and you get out exactly the first column. You can try the 0 frequency input and you
should get exactly 3/2 for every value; if you try f=1/4 you should get the input multiplied by 1/2 (the value
of the transfer function at f=1/4).
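The experiment is easy to reproduce. The sketch below assumes the filter has the symmetric form a u_{n-1} + b u_n + a u_{n+1}, whose transfer function is b + 2a cos(2πf); the two conditions then read a + b = 1 at f = 1/6 and b - a = 0 at f = 1/3, giving a = b = 1/2. The column length of 14 samples and the names `simple_filter` and `H` are arbitrary choices for illustration.

```python
import numpy as np

n = np.arange(14)
col1 = np.cos(2 * np.pi * n / 6)      # frequency f = 1/6, to be passed unaltered
col2 = np.cos(2 * np.pi * n / 3)      # frequency f = 1/3, to be stopped
col3 = col1 + col2                    # equal mixture of the two

def simple_filter(u):
    """y_n = (u_{n-1} + u_n + u_{n+1}) / 2, output placed opposite the middle input."""
    u = np.asarray(u, dtype=float)
    return (u[:-2] + u[1:-1] + u[2:]) / 2

def H(f):
    """Transfer function b + 2a*cos(2*pi*f) with a = b = 1/2."""
    return 0.5 + np.cos(2 * np.pi * f)

out1 = simple_filter(col1)            # reproduces col1 at the interior points
out2 = simple_filter(col2)            # all zeros
out3 = simple_filter(col3)            # col1 again: the f = 1/3 part is removed
```

The outputs are two samples shorter than the inputs because the first and last rows have no complete window of three.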
You have just seen a digital filter in action. The filter decomposes the input signal into all its frequencies,
multiplies each frequency by its corresponding eigenvalue (the transfer function), and then adds all
the terms together to give the output. The simple linear formula of the filter does all this!
We now return to the problem of designing a filter. What we often want ideally is a transfer function
which has a sharp cutoff between the frequencies it passes exactly (with eigenvalues 1), and those which it
stops (with eigenvalues 0). As you know, a Fourier series can represent such a discontinuous function, but it
will take an infinite number of terms. However, we have only a modest number available if we want a
practical filter; 2k+1 terms in the smoothing filter gives only k+1 free coefficients, and hence only k+1
arbitrary conditions can be met by the corresponding sum of cosines.
If we simply expand the desired transfer function into a sum of cosines and then truncate it we will get a
least squares approximation to the transfer function. But at a discontinuity the least squares fit is not what
you probably think it is.
To understand what we will see at a discontinuity we must investigate the Gibbs' phenomena. We first
recall a theorem: If a series of continuous functions converges uniformly in a closed interval then the limit
function is continuous. But the limit function we want to approximate is not continuous, it has a jump
(discontinuity) between the pass and stop bands of frequencies. No matter how many terms in the series we
take, since there cannot be a uniform convergence, we can expect(?) to see a significant overshoot in the
neighborhood of the singularity. As we take more terms the size of the overshoot will not approach 0.
Another story. Michelson, of Michelson-Morley fame, built an analog machine to find the coefficients of
a Fourier series out to 75 terms. The machine could also, because of the duality of the function and the
coefficients, go from the coefficients back to the function. When Michelson did this he observed an
overshoot and asked the local mathematicians why it happened. They all said it was his equipment, and yet
he was well known as a very careful experimenter. Only Gibbs, of Yale, listened and looked into the matter.
The simplest direct approach is to expand a standard discontinuity, say the function
into a Fourier series of a finite number of terms, rearrange things, and then find the location of the first
maximum and finally the corresponding height of the function there. One finds, Figure 15.II, an overshoot of
0.08949, or 8.949% overshoot, in the limit as the number of terms in the Fourier series approaches infinity.
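The size of the overshoot can be reproduced numerically. The sketch below assumes the standard discontinuity is a unit step on (-π, π), equal to 0 for t < 0 and 1 for t > 0, whose Fourier series is 1/2 + (2/π) times the sum over odd k of sin(kt)/k; the function the text actually uses may differ, but the overshoot as a fraction of the jump does not depend on that choice. The function name and grid sizes are illustrative.

```python
import numpy as np

def step_partial_sum(t, n_max):
    """Fourier partial sum, harmonics up to n_max, of a unit step on (-pi, pi):
    f(t) = 0 for t < 0 and 1 for t > 0, i.e. 1/2 + (2/pi) * sum_{odd k} sin(k t)/k.
    """
    k = np.arange(1, n_max + 1, 2)                 # odd harmonics only
    return 0.5 + (2 / np.pi) * np.sum(np.sin(np.outer(t, k)) / k, axis=1)

# Fine grids just to the right of the jump, wide enough to hold the first maximum.
t_wide = np.linspace(0.0, 0.2, 4001)
t_narrow = np.linspace(0.0, 0.02, 4001)

overshoot_99 = step_partial_sum(t_wide, 99).max() - 1.0
overshoot_999 = step_partial_sum(t_narrow, 999).max() - 1.0
```

Going from harmonics up to 99 to harmonics up to 999 moves the peak ten times closer to the jump but leaves its height essentially unchanged, which is the point of the phenomenon.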
Many people had the opportunity to discover (really rediscover) the Gibbs' phenomena, and it was Gibbs
who made the effort. It is another example of what I maintain, there are opportunities all around and few
people reach for them. As Pasteur said, "Luck favors the prepared mind". This time the person who was
prepared to listen and help a first class scientist in his troubles got the fame.
I remarked it was rediscovered. Yes.
Fourier Convergence and Lanczos Windows
- Cauchy's early textbooks contained a fundamental contradiction regarding the convergence of continuous functions that was eventually resolved by the concept of uniform convergence.
- The rate of convergence for Fourier series is directly observable from the function's smoothness on the real line, unlike Taylor series which are governed by complex singularities.
- Gibbs' phenomenon causes a persistent overshoot at discontinuities in least squares fits, regardless of how many terms are added to the series.
- Lanczos proposed a 'sigma factor' window that averages the output function to significantly reduce, though not entirely eliminate, these ripples.
- Adjusting the transition value to one-half in discrete cases provides an additional mathematical factor that improves the transfer function's behavior.
Thus the rate of convergence is directly observable from the function along the real line, which is not true for the Taylor series whose convergence is controlled by singularities which may lie in the complex plane.
In the 1850s two statements in Cauchy's textbooks, (1) a
convergent series of continuous functions converged to a continuous function (it was so stated in his book!),
and (2) the Fourier expansion of a discontinuous function (also in his book), flatly contradicted each other.
Some people looked into the matter and found they needed the concept of uniform convergence. Yes, the
overshoot of the Gibbs' phenomena occurs for any series of continuous functions, not just to the Fourier
series, and was known to some people, but it had not diffused into common usage. For the general set of
orthogonal functions the amount of overshoot depends upon where in the interval the discontinuity occurs,
which differs from the Fourier functions where the amount of the overshoot is independent of where the
discontinuity occurs.
We need to remind you of another feature of the Fourier series. If the function exists (for practical
purposes) then the coefficients fall off like 1/n. If the function is continuous, Figure 15.III (the two extreme
end values must be the same) and the derivative exists then the coefficients fall off like 1/n^2; if the first
derivative is continuous and the second derivative exists then they fall off like 1/n^3; if the second derivative
is continuous and the third derivative exists then 1/n^4, etc. Thus the rate of convergence is directly
observable from the function along the real line, which is not true for the Taylor series whose convergence
is controlled by singularities which may lie in the complex plane.
Now we return to our design of a smoothing digital filter using the Fourier series to get the leading terms.
We see the least squares fit has trouble at any discontinuity; there is a nasty overshoot in the transfer
function for any finite number of terms, no matter how far out we go.
Figure 15.II
To remove this overshoot we first examine Lanczos' window, also called a "box car", or a "rectangular"
window. Lanczos reasoned if he averaged the output function over an interval of the length of a period of
the highest frequency present, then this averaging would greatly reduce the ripples. To see this in detail we
take the Fourier series expansion truncated at the N-th harmonic, and integrate about a point t in a
symmetric interval of length 1/N of the whole interval. Set up the integral for the averaging,
We now do the integrations
apply a little trigonometry for the difference of sines and cosines from the two limits
and you come out with the original coefficients multiplied by the so-called sigma factors
Examining these numbers as a function of k (N being fixed as the number of terms you are
keeping in the Fourier series), you will find at k=0 the sigma factor is 1, and the sigma factors fall off until at
k=N they are 0. Thus they are another example of a window. The effect of Lanczos' window is to reduce
the overshoot to about 0.01189 (by about a factor of 7), and the first minimum to 0.00473 (by about a factor
of 10), which is a significant but not complete reduction of Gibbs' phenomenon.
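The effect can be checked on the same kind of discontinuity. The sketch below assumes the sigma factors have the usual form sin(πk/N) / (πk/N), which is what averaging sin(kt) over an interval of length 2π/N produces, and applies them to the series of a unit step on (-π, π), namely 1/2 + (2/π) times the sum over odd k of sin(kt)/k. The step function, the value N = 1000, and the function names are assumptions for illustration.

```python
import numpy as np

def sigma_factors(N):
    """Sigma factors sin(pi*k/N) / (pi*k/N) for k = 0..N (np.sinc includes the pi)."""
    k = np.arange(N + 1)
    return np.sinc(k / N)

def smoothed_step_sum(t, N):
    """Unit-step partial sum through harmonic N, each term times its sigma factor."""
    sig = sigma_factors(N)
    k = np.arange(1, N + 1, 2)                     # odd harmonics only
    terms = sig[k] * np.sin(np.outer(t, k)) / k
    return 0.5 + (2 / np.pi) * np.sum(terms, axis=1)

N = 1000
t = np.linspace(0.0, 0.02, 4001)                   # fine grid just right of the jump
sig = sigma_factors(N)
overshoot_sigma = smoothed_step_sum(t, N).max() - 1.0
```

The factors start at 1 for k = 0 and fall steadily to 0 at k = N, so the highest harmonics, which carry most of the ripple, are the ones most strongly attenuated.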
But back to my adventures in the matter. I knew, as you do, at the discontinuity the truncated Fourier
expansion takes on the midvalue of the two limits, one from each side. Thinking about the finite, discrete
case, I reasoned instead of taking all 1 values in the pass band and 0 values in the stop band, I should take 1/2 at the transition value. Lo, and behold, the transfer function becomes
and now has an extra factor (back in the rotational notation)
Figure 15.III
Windows and Convolution Theory
- The author identifies a modified trigonometric series that outperforms the Lanczos filter by vanishing at the Nyquist frequency.
- Personal discovery of these series led to a deeper investigation into the possibilities of 'windows' in signal processing.
- The mathematical relationship between filtering and convolution is established as the multiplication of corresponding functions.
- A digital filter is formally defined as the convolution of one array of coefficients by another.
- The act of recording finite data is modeled as looking through a rectangular window, which results in a convolution of the original coefficients.
- The resulting frequency response of a rectangular window mimics the typical diffraction patterns found in optics.
Multiplication on one side is convolution on the other side of the equation.
and the N+1 in the sine term goes to N as well as the denominator N+1 going to N. Clearly this transfer function
is nicer than Lanczos' as a low pass filter since it vanishes at the Nyquist frequency, and further dampens
all the higher frequencies. I looked around in books on trigonometric series and found it in only one,
Zygmund's two volume work, where it was called the modified series. The extra "being prepared" did not
necessarily pay off this time in a great result, but having found it myself I naturally reasoned that by using even
more modification of the coefficients of the Fourier series (how much and where remained to be found), I might
do even better. In short, I saw more clearly what "windows" were, and was slowly led to a closer
examination of their possibilities.
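The vanishing at the Nyquist frequency is easy to confirm. This is a small sketch under one reading of the modification: a smoothing filter with 2N+1 equal weights, compared with the same filter whose two end weights are 1/2. The closed form in `modified_closed_form` is derived here by summing the geometric progression; it is not copied from the missing display, and all names are illustrative.

```python
import math

def transfer(weights, omega):
    """Transfer function of a symmetric smoothing filter, normalized to 1 at omega = 0."""
    N = (len(weights) - 1) // 2
    total = sum(w * math.cos((j - N) * omega) for j, w in enumerate(weights))
    return total / sum(weights)

def modified_closed_form(N, omega):
    """sin(N w) cos(w/2) / (2N sin(w/2)); the cos(w/2) is the extra factor."""
    return math.sin(N * omega) * math.cos(omega / 2) / (2 * N * math.sin(omega / 2))

N = 5
boxcar = [1.0] * (2 * N + 1)
modified = [0.5] + [1.0] * (2 * N - 1) + [0.5]    # only the two end weights change

nyquist_boxcar = transfer(boxcar, math.pi)        # does not vanish
nyquist_modified = transfer(modified, math.pi)    # vanishes to rounding
print(nyquist_boxcar, nyquist_modified)
```

The factor cos(ω/2) is zero at ω=π, which is why the modified filter passes nothing at the Nyquist frequency.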
A still third approach to the important Gibbs' phenomena is via the problem of combining Fourier series.
Let g(x) be (and we are using the neutral variable x for a good reason)
and another function be
The sum and difference of g(x) and h(x) are clearly the corresponding series with the sum or difference of
the coefficients.
The product is another matter. Evidently we will have again a sum of exponentials, and setting n=k+m
we will have the coefficients as indicated
The coefficient of exp {inx}, which is a sum of terms, is called the convolution of the original arrays of
coefficients.
In the case where there are only a few nonzero coefficients in the c_k coefficient array, for example, say
symmetrically placed about 0, we will have for the coefficient
and this we recognize as the original definition of a digital filter! Thus a filter is the convolution of one
array by another, which in turn is merely the multiplication of the corresponding functions! Multiplication on one side is convolution on the other side of the equation.
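A short numerical check of this statement follows. It is a sketch that assumes g(x) has coefficients c_k and h(x) has coefficients d_m on the exponentials exp{ikx} and exp{imx} (the displayed definitions are missing here, so the letter d and the sample values are assumptions). The coefficient of exp{inx} in the product is then the sum over k of c_k d_{n-k}.

```python
import cmath

def evaluate(coeffs, first_index, x):
    """Sum of coeffs[j] * exp(i * (first_index + j) * x)."""
    return sum(c * cmath.exp(1j * (first_index + j) * x)
               for j, c in enumerate(coeffs))

def convolve(c, d):
    """out[n] = sum over k of c[k] * d[n - k] (indices relative to each array's start)."""
    out = [0j] * (len(c) + len(d) - 1)
    for k, ck in enumerate(c):
        for m, dm in enumerate(d):
            out[k + m] += ck * dm
    return out

c = [0.25, 0.5, 0.25]            # c_k for k = -1, 0, 1
d = [1.0, -2.0, 0.5, 3.0]        # d_m for m = 0, 1, 2, 3
product_coeffs = convolve(c, d)  # n = k + m runs from -1 to 4

x = 0.7
lhs = evaluate(c, -1, x) * evaluate(d, 0, x)     # multiply the functions
rhs = evaluate(product_coeffs, -1, x)            # series with convolved coefficients
print(abs(lhs - rhs))
```

The two sides agree to rounding at any x, which is the statement that multiplying the functions convolves the coefficient arrays.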
As an example of the use of this observation, suppose, as often occurs, there is potentially an infinite
array of data, but we can record only a finite number of them (for example, turning on or off a telescope while looking at the stars). This function u_n is being looked at through the rectangular window of all 0's
outside a range of (2N+1) 1's: the value 1 where we observe, and the value 0 where we do not observe.
When we try to compute the Fourier expansion of the original array from the observed data we must convolve the original coefficients by the coefficients of the window array
Generally we want a window of unit area, so we need, finally, to divide by (2N+1). The array is a geometric
progression with the starting value of exp {-iNx}, and constant ratio of exp {ix},
At x=0 this takes on the value 1, and otherwise oscillates rapidly due to the sine function in the numerator,
and decays slowly due to the increase of the sine in the denominator (the range in x is (-π, π)). Thus we have
the typical diffraction pattern of optics.
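The sum of that geometric progression can be checked directly. The sketch below compares the term-by-term sum with the closed form sin((N+1/2)x) / ((2N+1) sin(x/2)), which is the standard result for this progression and is assumed here to be what the missing display contained; the function names are illustrative.

```python
import cmath
import math

def kernel_direct(N, x):
    """(1/(2N+1)) * sum of exp(i k x) for k = -N..N, summed term by term."""
    total = sum(cmath.exp(1j * k * x) for k in range(-N, N + 1))
    return total.real / (2 * N + 1)      # imaginary parts cancel by symmetry

def kernel_closed(N, x):
    """sin((N + 1/2) x) / ((2N + 1) sin(x / 2)), valid for x not a multiple of 2 pi."""
    return math.sin((N + 0.5) * x) / ((2 * N + 1) * math.sin(x / 2))

N = 8
for x in (0.3, 1.7, 3.0):
    print(x, kernel_direct(N, x), kernel_closed(N, x))
print(kernel_direct(N, 0.0))             # the value at x = 0 is 1
```

The numerator oscillates rapidly while the denominator grows slowly across the range, which gives the diffraction-like pattern of a main lobe and decaying side lobes.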
In the continuous case, before sampling, the situation is much the same but the rectangular window we
look through has the transform of the general form (ignoring all details)
Windowing and Spectral Leakage
- Gibbs' phenomena is re-examined as the convolution of a step function with a sinc-like function, illustrating the inherent overshoot in signal processing.
- The modification of Lanczos' window coefficients demonstrates that smoothing discontinuities in the window shape leads to more rapid convergence.
- The von Hann window, or raised cosine, provides greater smoothness than the Lanczos window but still suffers from side lobes that allow spectral leakage.
- The Hamming window was specifically engineered as a 'raised cosine on a platform' to minimize the maximum side lobe height.
- Choosing between windows involves a trade-off between total mean square leakage and the suppression of specific strong interference lines.
The Hamming window was devised to make the maximum side lobe a minimum.
and the convolution of a step function (a discontinuity) with it will, upon inspection, be Gibbs' phenomena, Figure 15.II. Thus we see Gibbs' phenomena overshoot in another light.
Some rather difficult trigonometric manipulation will directly convince you that whether we sample the
function and then limit the range of observations, or limit the range and then sample, we will end up with
the same result; theory will tell you the same thing.
The simple modification of the discrete Lanczos' window by changing only the outer two coefficients
from 1 to 1/2 produced a much better window. Lanczos' window with its sigma factors modified all the coefficients, but its shape had a corner at the ends, and this means, due to periodicity, there are two
discontinuities in the first derivative of the window shape, hence slow convergence. If we reason using weights on the coefficients of the raw Fourier series of the form of a raised cosine
then we will have something like Lanczos' window but now there will be greater smoothness, hence more
rapid convergence.
Writing this out in the exponential form we find the weights on the exponentials are
This is the von Hann window; smoothing in the domain of the data with these weights is equivalent to windowing (multiplying) in the frequency domain. Actually I had rediscovered the von Hann window in the early days of our work in power spectra, and later John Tukey found von Hann had used it long, long before
in connection with economics. An examination of what it does to the signal shows it tails off rapidly, but has some side lobes through which other parts of the spectrum "leak in".
We were at times dealing with a spectrum which had a strong line in it, and when looking elsewhere in
the spectrum through the von Hann window its side lobes might let in a lot of power. The Hamming window was devised to make the maximum side lobe a minimum. The cost is there is much more total leakage in the
mean square sense, but a single strong line is kept under control. If you call the von Hann window a "raised cosine" with weights
then the Hamming window is a "raised cosine on a platform" with weights
Windows and Collaborative Science
- The Hamming window is often used due to its mysterious aura, though the von Hann window is frequently superior for general applications.
- Hamming recounts how John Tukey named the 'hamming' window after him, illustrating how fame often comes from the recognition of peers.
- The era of the isolated individual scientist is ending, making teamwork and cooperation essential for modern complex projects.
- Hamming advises helping others and letting them take the lead on publications to avoid the appearance of stealing ideas and to build professional goodwill.
- The systematic design of nonrecursive filters begins with an ideal filter model, such as a low pass, high pass, or differentiator.
- Differentiators are particularly sensitive to high-frequency noise because the differentiation process multiplies the signal by its frequency.
The Hamming window has a mysterious, hence popular, aura about it with its peculiar coefficients, but it was designed to do a particular job and is not a universal solution to all problems.
(Figure 15.IV). Actually the weights depend on N, the length of data, but so slightly these constants are
regularly used for all cases. The Hamming window has a mysterious, hence popular, aura about it with its
peculiar coefficients, but it was designed to do a particular job and is not a universal solution to all
problems. Most of the time the von Hann window is preferable. There are in the literature possibly 100 various windows, each having some special merit, and none having all the advantages you may want.
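The difference in side lobes between the two windows can be seen with a few lines of arithmetic. This is a minimal sketch, assuming the von Hann weights are the raised cosine 0.5 + 0.5 cos(πk/N) and the Hamming weights are the widely used 0.54 + 0.46 cos(πk/N), that is, the same raised cosine sitting on a platform of height 0.08. These constants are assumed here because the displayed weights are missing from this text; the function names and N=16 are illustrative.

```python
import math

def raised_cosine(N, platform):
    """w_k = platform + (1 - platform) * (1 + cos(pi k / N)) / 2 for k = -N..N."""
    return [platform + (1 - platform) * 0.5 * (1 + math.cos(math.pi * k / N))
            for k in range(-N, N + 1)]

def peak_side_lobe_db(w, grid=4000):
    """Height of the largest side lobe of the window transform, relative to its peak."""
    N = (len(w) - 1) // 2
    mags = [abs(sum(wk * math.cos((j - N) * math.pi * i / grid)
                    for j, wk in enumerate(w)))
            for i in range(grid + 1)]
    i = 0
    while mags[i + 1] < mags[i]:         # walk down the main lobe to its first minimum
        i += 1
    return 20 * math.log10(max(mags[i:]) / mags[0])

hann_db = peak_side_lobe_db(raised_cosine(16, 0.0))
hamming_db = peak_side_lobe_db(raised_cosine(16, 0.08))
print(f"von Hann peak side lobe: {hann_db:.1f} dB")
print(f"Hamming peak side lobe:  {hamming_db:.1f} dB")
```

The platform lowers the largest side lobe, which is the job the Hamming window was designed to do; the price, not measured in this sketch, is that the remote side lobes no longer fall away as quickly.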
To make you a true insider in this matter I must tell you yet another story. I used to tease John Tukey that you
are famous only when your name is spelled with a lower case letter, such as watt, ampere, volt, fourier (sometimes), and such. When Tukey first wrote up his work on Power Spectra, he phoned me from Princeton and asked if he could use my name on the Hamming window. After some protesting on the matter, I agreed with his request. The book came out with the name "hamming"! There I am!
It must be your friends, in some sense, who make you famous by quoting and citing you, and it pays, so I
claim, to be helpful to others as they try to do their work. They may in time give you credit for the work, which is better than trying to claim it yourself. Cooperation is essential in these days of complex projects; the day of the individual worker is dying fast. Team work is more and more essential, and hence learning to work in a team, indeed possibly seeking out places where you can help others, is a good idea. In any case
the fun of working with good people on important problems is more pleasure than the resulting fame. And the choice of important problems means generally management will be willing to supply all the assistance you need.
In my many years of doing computing at Bell Telephone Laboratories I was very careful never to write
up a result which involved any of the physics of the situation lest I get a reputation for "stealing other's ideas".
Instead I let them write up the results, and if they wanted me to be a co-author, fine! Teamwork implies a
very careful consideration for others and their contributions, and they may see their contributions in a
different light than you do!
Figure 15.IV
16
Digital Filters-III
We are now ready to consider the systematic design of nonrecursive filters. The design method is based on
Figure 16.I, which has 6 parts. On the upper left is a sketch of the ideal filter you wish to have. It can be
a low pass, a high pass, a band pass, a band stop, a notch filter, or even a differentiator. For other than differentiator filters you usually want either 0 or 1 as the height in the various intervals, while for the
differentiator you want iω since the derivative of the eigenfunction is
d/dt exp {iωt} = iω exp {iωt},
hence the desired eigenvalues are the coefficient iω. For a differentiator there is likely to be a cutoff at some
frequency because, as you can see, differentiation magnifies, multiplies by ω, and is larger at the high
frequencies, which is where the noise usually is, Figure 16.II. See also Figure 15.II.
Figure 16.I
Digital Filter Design Methods
- The design of digital filters begins with computing Fourier coefficients for a desired transfer function using complex exponentials.
- Truncating an infinite Fourier series to a finite number of terms introduces the Gibbs' effect, which causes unwanted oscillations.
- Windowing techniques are applied to the truncated coefficients to mitigate the Gibbs' effect and refine the filter's performance.
- Traditional filter design is often a trial-and-error process involving the manual selection of term counts and window shapes.
- The Kaiser design method automates this by calculating the necessary number of terms and window parameters based on specified tolerance and transition width.
- Final filter coefficients are derived by multiplying original Fourier coefficients by window weights, often involving Bessel functions for precision.
It is a "trial and error" design method.
The coefficients of the corresponding formal Fourier series are easily computed since the integrands of
their expressions are straightforward (using integration by parts when you have a derivative). Suppose we
represent the series in the form of the complex exponentials. Then the coefficients of the filter are just the
Fourier coefficients of the corresponding exponential terms. On the upper right of Figure 16.I we have a sketch
of the coefficients symbolically (they are, of course, complex numbers).
Next, we must truncate the infinite Fourier series to 2N+1 terms (meaning use a rectangular window),
shown just below in Figure 16.I with the corresponding Fourier representation on the left showing Gibbs'
effect.
Third, we then choose a window to remove the worst of this Gibbs' effect. The windowed coefficients are
shown on the lower right, with the corresponding final digital filter on the lower left. In practice, you should
round off the filter coefficients before evaluating the transfer function so their effect will be seen.
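The three steps can be sketched for the simplest case, a low pass filter. The example below is illustrative only: it assumes an ideal low pass with cutoff π/2, whose Fourier coefficients are c_0 = cutoff/π and c_k = sin(k cutoff)/(πk), truncates them at N=20, and uses a raised cosine as the window. The cutoff, N, the window choice, and all names are assumptions made for the sketch.

```python
import math

def lowpass_coeffs(N, cutoff):
    """Step 1: Fourier coefficients c_0..c_N of the ideal low pass (c_{-k} = c_k)."""
    return [cutoff / math.pi if k == 0 else math.sin(k * cutoff) / (math.pi * k)
            for k in range(N + 1)]

def transfer(c, omega):
    """Transfer function of the symmetric filter with coefficients c_{-N}..c_N."""
    return c[0] + 2 * sum(ck * math.cos(k * omega) for k, ck in enumerate(c) if k > 0)

N, cutoff = 20, math.pi / 2
c = lowpass_coeffs(N, cutoff)                     # step 2: truncation at N
window = [0.5 * (1 + math.cos(math.pi * k / N)) for k in range(N + 1)]
cw = [ck * wk for ck, wk in zip(c, window)]       # step 3: windowed coefficients

grid = [math.pi * i / 2000 for i in range(2001)]
peak_raw = max(transfer(c, w) for w in grid) - 1.0   # Gibbs overshoot above the ideal 1
peak_win = max(transfer(cw, w) for w in grid) - 1.0
print(f"overshoot, truncated only: {peak_raw:.4f}")
print(f"overshoot, windowed:       {peak_win:.4f}")
```

The window removes most of the overshoot at the cost of a wider transition between pass band and stop band, which is the trade the designer has to judge.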
In the method as sketched above, you must choose both the N, the number of terms to be kept, and the
particular window shape, and if what you get does not suit you then you must make new choices. It is a
"trial and error" design method.
J.F. Kaiser has given a design method which finds both the N and the member of a family of windows to
do the job. You have to specify two things beyond the shape: the vertical distance you are willing to tolerate missing the ideal, labeled δ, and the transition width between the pass and stop bands, labeled ΔF,
Figure 16.III.
For a band pass filter, with f_p as the band pass and f_s as the band stop frequencies, the sequence of design
formulas is:
If N is too big you stop and reconsider your design. Otherwise you go ahead and compute in turn:
Figure 16.II
Figure 16.III
(this is plotted in Figure 16.IV). The original Fourier coefficients for a band pass filter are given by:
These coefficients are to be multiplied by the corresponding weights w_k of the window
where
I0(x) is the Bessel function of order 0 with pure imaginary argument (the modified Bessel function of the first kind). For computing it you will need comparatively few
terms as there is an n! squared in the denominator and hence the series converges rapidly.
I0(x) is best computed recursively; for a given x the successive terms of the series are given by
For a low pass or a high pass one of the two frequencies f_p or f_s has the limit possible for it. For a band stop
filter there are slight changes in the formulas for the coefficients c_k.
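Since the displayed formulas are missing from this text, the following sketch falls back on the commonly published forms of Kaiser's window and should be read as an assumption rather than a transcription: the series I0(x) = sum over n of ((x/2)^n / n!)^2, with each term obtained from the previous one by multiplying by (x/(2n))^2; the weights w_k = I0(α sqrt(1 - (k/N)^2)) / I0(α); and the usual empirical rule for α in terms of the attenuation A in decibels. All function names are illustrative.

```python
import math

def bessel_i0(x, tol=1e-12):
    """I0(x) by its power series; each term is the previous one times (x/(2n))^2."""
    term, total, n = 1.0, 1.0, 0
    while term > tol * total:
        n += 1
        term *= (x / (2 * n)) ** 2
        total += term
    return total

def kaiser_alpha(A):
    """Commonly published empirical rule for the window parameter (assumed here)."""
    if A <= 21:
        return 0.0
    if A < 50:
        return 0.5842 * (A - 21) ** 0.4 + 0.07886 * (A - 21)
    return 0.1102 * (A - 8.7)

def kaiser_weights(N, alpha):
    """w_k = I0(alpha * sqrt(1 - (k/N)^2)) / I0(alpha) for k = -N..N."""
    return [bessel_i0(alpha * math.sqrt(1 - (k / N) ** 2)) / bessel_i0(alpha)
            for k in range(-N, N + 1)]

weights = kaiser_weights(10, kaiser_alpha(40.0))
print(weights[10], weights[0])     # center weight 1, end weight 1 / I0(alpha)
```

With α=0 every weight is 1, and as α grows the end weights 1/I0(α) shrink, so the family runs from a rectangular window toward shapes resembling a raised cosine on a platform.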
Let us examine Kaiser's window coefficients, the w_k:
Kaiser Windows and FFT Origins
- The Kaiser window provides a flexible mathematical framework for filter design that replaces guesswork with specific parameters.
- James Kaiser developed his formulas through a combination of theoretical insight and experimental computer trials, using I0 functions to approximate prolate spheroidal functions.
- The Kaiser method is computationally efficient enough for handheld devices but can occasionally fail when ripples from multiple edges combine.
- The Fast Fourier Transform (FFT) reduced computational complexity from N-squared to N log N, revolutionizing science and engineering.
- The author recounts a personal anecdote about missing the opportunity to develop the FFT because he mistakenly labeled it a 'bad idea' based on outdated hardware limitations.
All I remembered was it was one of Tukey's few bad ideas; I completely forgot why it was bad, namely because of the equipment I had at the time.
As we examine these numbers we see they have, for α>0, something like the shape of a raised cosine
and resemble the von Hann and Hamming windows. There is a "platform" when A>21. For A<21, α=0,
all the w_k=1, and it is a Lanczos' type window. As A increases the platform gradually appears. Thus the
Kaiser window has properties like many of the more popular ones, and the particular window you use is
determined from your specifications via his window rather than by guess or prejudice.
How did Kaiser find the formulas? To some extent by trial and error. He first assumed he had a single
discontinuity and he ran a large number of cases on a computer to see both the rise time ΔF and the ripple
height δ. With a fair amount of thinking, plus a touch of genius, he noted that, as a function of A, as A
increases we pass from a Lanczos' window (A<21) to a platform of increasing height, 1/I0(α). Ideally he
wanted a prolate spheroidal function but he noted they are accurately approximated, for his values, by the
I0(x). He plotted the results and approximated the functions. I asked him how he got the exponent 0.4. He
replied he tried 0.5 and it was too large, and 0.4, being the next natural choice, seemed to fit very well. It is
a good example of using what one knows plus the computer as an experimental tool, even in theoretical research, to get very useful results.
Kaiser's method will fail you once in a while because there will be more than one edge (indeed, there is
the symmetric image of an edge on the negative part of the frequency line) and the ripples from different edges may by chance combine and make the filter ripples go beyond the designated amount. In this case,
which seldom arises, you simply repeat the design with a smaller tolerance. The whole program is easily
accommodated on a primitive hand held programmable computer like the TI-59, let alone on a modern PC.
Figure 16.IV
We next turn to the finite Fourier series. It is a remarkable fact the Fourier functions are orthogonal, not
only over a line segment, but for any discrete set of equally spaced points. Hence the theory will go much the same, except there can be only as many coefficients determined in the Fourier series as there are points.
In the case of 2N points, the common case, there is one term of the highest frequency only, the cosine term
(the sine term would be identically zero at the sample points). The coefficients are determined as sums of
the data points multiplied by the appropriate Fourier functions. The resulting representation will, within
roundoff, reproduce the original data.
To compute an expansion it would look like 2N terms each with 2N multiplications and additions, hence
something like (2N)^2 operations of multiplication and addition. But using both: (1) the addition and
subtraction of terms with the same multiplier before doing the multiplications, and (2) producing higher frequencies by multiplying lower ones, the Fast Fourier Transform (FFT) has emerged requiring about N log
N operations. This reduction in computing effort has greatly transformed whole areas of science and
engineering: what was once impossible in both time and cost is now routinely done.
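To make the saving concrete, here is a sketch of the textbook recursive radix-2 form of the FFT next to the direct sum. It is one common way to organize the computation, not necessarily the arrangement in the original paper, and it assumes the number of points is a power of two. The function names and sample data are illustrative.

```python
import cmath

def direct_dft(x):
    """The direct sum: n coefficients, each needing n multiplications and additions."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])     # two half-size transforms
    out = [0j] * n
    for k in range(n // 2):
        product = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + product             # the same product is used twice,
        out[k + n // 2] = even[k] - product    # once added and once subtracted
    return out

data = [float((7 * n) % 5) for n in range(16)]
fast = fft(data)
slow = direct_dft(data)
print(max(abs(a - b) for a, b in zip(fast, slow)))
```

Each level of the recursion costs about n operations and there are log2(n) levels, which is where the n log n count comes from.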
Now for another story from life. You have all heard about the Fast Fourier Transform, and the Tukey-Cooley
paper. It is sometimes called the Tukey-Cooley transform, or algorithm. Tukey had suggested to me,
sort of, the basic ideas of the FFT. I had at that time an IBM Card Programmed Calculator (CPC), and the
"butterfly" operation meant it was completely impracticable to do with the equipment I had. Some years
later I had an internally programmed IBM 650 and he remarked on it again. All I remembered was it was
one of Tukey's few bad ideas; I completely forgot why it was bad, namely because of the equipment I had
at the time. So I did not do the FFT, though a book I had already published (1961) shows I knew all the facts necessary, and could have done it easily!
The Logic of Impossibility
- Understanding the specific constraints that make a task impossible is as important as knowing it cannot be done.
- Retaining the underlying reasoning allows for future re-evaluation when circumstances change.
- A shift in technology or environment may invalidate the original reason for failure.
- Deep knowledge of limitations prevents permanent dismissal of potentially viable ideas.
- Strategic memory of 'why' fosters innovation by identifying when barriers have been removed.
Moral: when you know something cannot be done, also remember the essential reason why, so later,
The Pitfalls of Spectral Analysis
- The author warns against the common intellectual error of assuming a task is impossible based on outdated circumstances or past failures.
- Power spectra are essential tools for analyzing 'black boxes,' historically enabling breakthroughs like Bohr's model of the atom by focusing on signal properties rather than time origins.
- The act of sampling a continuous signal inherently alters it, convolving the original data with a window function that smears pure spectral lines.
- Using the Fast Fourier Transform (FFT) forces a continuous spectrum into a discrete line spectrum, imposing a false periodicity that may not exist in the original signal.
- Standard data processing techniques, such as removing the mean or linear trends, can introduce significant discontinuities and distortions into the resulting spectrum.
We force all nonharmonic frequencies into harmonic ones; we force a continuous spectrum to be a line spectrum!
when the circumstances have changed, you will not say, "It can't be done". Think of my error! How much
more stupid can anyone be? Fortunately for my ego, it is a common mistake (and I have done it more than once) but due to my goof on the FFT I am very sensitive to it now. I also note when others do it, which is all too often! Please remember the story of how stupid I was and what I missed, and not make that mistake
yourself. When you decide something is not possible, don't say at a later date it is still impossible without first reviewing all the details of why you originally were right in saying it couldn't be done.
I must now turn to the delicate topic of power spectra. The power spectrum is the sum of the squares of the two
coefficients of a given frequency in the real domain, or the square of the absolute value in the complex
notation. An examination of it will convince you this quantity does not depend on the origin of the time, but only on the signal itself, contrary to the dependence of the coefficients on the location of the origin. The spectrum has played a very important role in the history of science and engineering. It was the spectral lines
which opened the black box of the atom and allowed Bohr to see inside. The newer Quantum Mechanics, starting around 1925, modified things slightly to be sure, but the spectrum was still the key. We also regularly analyse black boxes by examining the spectrum of the input and the spectrum of the output, along with correlations, to get an understanding of the insides; not that there is always a unique insides, but generally we get enough clues to formulate a new theory.
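The independence from the time origin is simple to verify for a periodic record. The sketch below, with an arbitrary two-frequency signal and a shift of 7 samples chosen purely for illustration, compares the complex coefficients and their squared absolute values before and after the origin is moved.

```python
import cmath
import math

def dft(x):
    """Complex Fourier coefficients of one period of equally spaced samples."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * math.pi * j * k / n) for j in range(n)) / n
            for k in range(n)]

n = 32
signal = [math.sin(2 * math.pi * 3 * j / n) + 0.5 * math.cos(2 * math.pi * 5 * j / n + 0.4)
          for j in range(n)]
shifted = signal[7:] + signal[:7]          # same periodic signal, new time origin

coeffs, coeffs_shifted = dft(signal), dft(shifted)
power = [abs(c) ** 2 for c in coeffs]
power_shifted = [abs(c) ** 2 for c in coeffs_shifted]

coefficient_change = max(abs(a - b) for a, b in zip(coeffs, coeffs_shifted))
power_change = max(abs(a - b) for a, b in zip(power, power_shifted))
print(coefficient_change, power_change)
```

Moving the origin multiplies each coefficient by a phase factor of absolute value 1, so the coefficients change while the power at each frequency does not.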
Let us analyse carefully what we do and its implications, because what we do to a great extent controls
what we can see. There is, usually, in our imaginations at least, a continuous signal. This is usually endless,
and we take a sample in time of length 2L. This is the same as multiplying the signal by a Lanczos'
window, a box car if you prefer. This means the original signal is convolved with the corresponding function of the form (sin x)/x, Figure 16.V; the longer the signal the narrower the (sin x)/x loops
are. Each pure spectral line is smeared out into its (sin x)/x shape.
Next we sample at equal spaces in time, and all the higher frequencies are aliased into lower frequencies.
It is obvious that interchanging these two operations, sampling and then limiting the range, will give the
same results, and as I said earlier I once carefully worked out all the algebraic details to convince myself
what I thought had to be true from theory was indeed true in practice.
Then we use the FFT, which is only a cute, accurate way of getting the coefficients of a finite Fourier
series. But when we assume the finite Fourier series representation we are making the function periodic,
and the period is exactly the sampling interval size times the number of samples we take! This period has generally nothing to do with the periods in the original signal. We force all nonharmonic frequencies into
harmonic ones; we force a continuous spectrum to be a line spectrum! This forcing is not a local effect,
but as you can easily compute, a nonharmonic frequency goes into all the other frequencies, most strongly into the adjacent ones of course, but nontrivially into more remote frequencies.
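The computation referred to can be done in a few lines. The sketch compares a cosine with exactly 10 periods in the record against one with 10.5 periods; the record length of 64 samples, the two frequencies, and the threshold used to count occupied lines are all arbitrary choices for illustration.

```python
import cmath
import math

def power_spectrum(cycles, n=64):
    """Power in harmonics 0..n/2 of a cosine with `cycles` periods in the record."""
    x = [math.cos(2 * math.pi * cycles * j / n) for j in range(n)]
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x[j] * cmath.exp(-2j * math.pi * j * k / n) for j in range(n)) / n
        spectrum.append(abs(coeff) ** 2)
    return spectrum

def occupied_lines(spectrum, floor=1e-6):
    """Number of harmonics carrying more than `floor` of power."""
    return sum(1 for p in spectrum if p > floor)

harmonic = power_spectrum(10.0)
nonharmonic = power_spectrum(10.5)
strongest = max(range(len(nonharmonic)), key=lambda k: nonharmonic[k])
print(occupied_lines(harmonic), occupied_lines(nonharmonic), strongest)
```

The harmonic cosine occupies a single line, while the nonharmonic one puts power into lines across the spectrum, most of it into the two lines on either side of its true frequency.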
I have glossed over the standard statistical trick of removing the mean, either for convenience, or because
of calibration reasons. This reduces the amount of the zero frequency in the spectrum to 0, and produces a
significant discontinuity in the spectrum. If you later use a window, you merely smear this around to adjacent frequencies. In processing data for Tukey I regularly removed linear trend lines and even trend
parabolas from some data on the flight of an airplane or a missile, and then analyzed the remainder. But the
spectrum of a sum of two signals is not the sum of the spectra, not by a long shot!
Aliasing Noise and Linear Limits
- Algebraic addition of frequencies during function summation can lead to false results and slow coefficient decay due to discontinuities.
- The sampling process aliases high-frequency noise into lower frequencies, often resulting in a flat 'white noise' spectrum.
- Over-sampling allows for the use of low-pass filters to remove noise that exists beyond the signal's frequency range.
- Fourier analysis of stock markets only proves the unpredictability of future prices when restricted to simple linear predictors.
- Numerical integration methods, such as predictor-corrector formulas, are effectively recursive digital filters that can produce unbounded outputs.
- Physical environments dictate error growth; for example, the lack of air drag on the moon leads to quadratic error growth in position calculations.
A little knowledge is a dangerous thing, especially if you lack the fundamentals!
When you add two functions the individual frequencies are added algebraically, and they may happen to reinforce or cancel
each other, and hence give entirely false results! No one I know has any reasonable reply to my objections
here (we still do it partly because we do not know what else to do) but the trend line has a big
discontinuity at the end (remember we are assuming that the functions are all periodic) and hence its coefficients fall off like 1/k, which is not rapid at all!
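A deliberately extreme example shows how far the spectrum of a sum can be from the sum of the spectra. In this sketch the two signals are cosines at the same frequency with opposite signs, so their coefficients nearly cancel; the amplitudes and the record length are arbitrary illustrative choices.

```python
import cmath
import math

def power_at(x, k):
    """Squared absolute value of the k-th complex Fourier coefficient of x."""
    n = len(x)
    coeff = sum(x[j] * cmath.exp(-2j * math.pi * j * k / n) for j in range(n)) / n
    return abs(coeff) ** 2

n, k = 32, 4
a = [math.cos(2 * math.pi * k * j / n) for j in range(n)]
b = [-0.8 * math.cos(2 * math.pi * k * j / n) for j in range(n)]
total = [aj + bj for aj, bj in zip(a, b)]

sum_of_spectra = power_at(a, k) + power_at(b, k)   # 0.25 + 0.16
spectrum_of_sum = power_at(total, k)               # only 0.01
print(sum_of_spectra, spectrum_of_sum)
```

The coefficients add algebraically before they are squared, so the cross term between the two signals decides whether the powers reinforce or cancel.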
Figure 16.V
Let us turn to theory. Every spectrum of real noise falls off reasonably rapidly as you go to infinite
frequencies, or else it would have infinite energy, Figure 16.VI. But the sampling process aliases the higher
frequencies into lower ones, and the folding, as indicated, tends to produce a flat spectrum; remember the
frequencies when aliased are algebraically added. Hence we tend to see a flat spectrum for noise, and if it is
flat then we call it white noise. The signal, usually, is mainly in the lower frequencies. This is true for
several reasons, including that "over sampling" (sampling more often than is required from the Nyquist theorem) means we can get some averaging to reduce the instrumental errors. Thus the typical spectrum will look as shown in Figure 16.VI. Hence the prevalence of low pass filters to remove the
noise. No linear method can separate the signal from the noise at the same frequencies, but those beyond the
signal can be so removed by a low pass filter. Therefore, when we "over sample" we have a chance to
remove more of the noise by a low pass filter.
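The folding of high frequencies onto low ones can be seen with two cosines. In this sketch frequencies are measured in cycles per sample, so the Nyquist frequency is 0.5; the values 0.1, 0.9, and 1.1 are chosen only for illustration.

```python
import math

def sampled_cosine(frequency, count=16):
    """cos(2 pi f n) at the integer sample times n = 0..count-1."""
    return [math.cos(2 * math.pi * frequency * n) for n in range(count)]

low = sampled_cosine(0.1)
folded = sampled_cosine(0.9)       # 1 - 0.1: above the Nyquist frequency
wrapped = sampled_cosine(1.1)      # 1 + 0.1: further above it

fold_difference = max(abs(a - b) for a, b in zip(low, folded))
wrap_difference = max(abs(a - b) for a, b in zip(low, wrapped))
print(fold_difference, wrap_difference)
```

At the sample points the three cosines are indistinguishable, so whatever noise power lies at 0.9 and 1.1 is added to what is seen at 0.1.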
Remember, there is the implicit understanding we are processing a linear system. The old stock market
Fourier analysis which revealed there was only white noise was interpreted to mean there was no way of
predicting the future prices of the stocks, and this is correct only if you intend to use simple linear
predictors. It says nothing about the practical use of nonlinear predictors, however. Once again we have a
widespread misinterpretation of a result because of a lack of understanding of the basics behind the
mathematical tool, and only knowing the tool itself. A little knowledge is a dangerous thing, especially if
you lack the fundamentals!
I carefully said in the opening talk on digital filters I thought at that time I knew nothing about them.
What I did not know was, because I was then ignorant of recursive digital filter design, I had effectively
created it when I examined closely the theory of predictor-corrector methods of numerically solving ordinary
differential equations. The corrector is practically a recursive digital filter!
While doing the study on how to integrate a system of ordinary differential equations numerically I was
unhampered by any preconceived ideas about digital filters, and I soon realized a bounded input, in the
words of the filter experts, could produce, if you were integrating, an unbounded output, which they said
was unstable, but clearly it is just what you must have if you are to integrate; even a constant will produce a
linear growth in the output. Indeed, when later I faced integrating trajectories down to the surface of the
moon where there is no air, hence no drag, hence no first derivatives explicitly in the equations, and wanted
to take advantage of this by using a suitable formula for numerical integration, I found I had to have a
quadratic error growth; a small roundoff error in the computation of the acceleration would not be corrected
and would lead to a quadratic error in position: an error in the acceleration produces a quadratic growth in
position. That is the nature of the problem, unlike on earth where the air drag provides some feedback correction to the wrong value of the acceleration and hence some correction to the error in the position.
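The two growth rates can be illustrated with the simplest possible integrator, a running sum, which is itself a recursive filter since each output is the previous output plus the new input. This is only a sketch with an arbitrary constant error of 1e-6 per step; it is not one of the predictor-corrector formulas discussed above.

```python
def running_sum(u):
    """y_n = y_{n-1} + u_n: the crudest integrator, with the output fed back."""
    y, out = 0.0, []
    for value in u:
        y += value
        out.append(y)
    return out

steps = 1000
acceleration_error = [1e-6] * steps                   # a bounded, constant input
velocity_error = running_sum(acceleration_error)      # grows linearly
position_error = running_sum(velocity_error)          # grows quadratically

growth_ratio = position_error[999] / position_error[499]
print(velocity_error[-1], position_error[-1], growth_ratio)
```

Doubling the number of steps doubles the velocity error and roughly quadruples the position error, with nothing in the equations to pull it back.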
Figure 16.VI
Recursive Digital Filters
- Digital filter stability is defined by the absence of exponential growth from bounded inputs, differing from classical analog criteria.
- Recursive filters utilize feedback by incorporating previous output values into the current calculation, which necessitates constant stability monitoring.
- The use of past values is often a constraint of real-time processing, though non-real-time data allows for more accurate two-sided prediction.
- A recursive filter is mathematically equivalent to a linear difference equation with constant coefficients where the signal acts as a forcing function.
- In steady-state operation, a linear filter only outputs the input frequency, though phase shifts and transient frequencies may occur.
In picture processing, a recursive digital filter which used only data from one side of the point being processed would be foolish since it would not use some of the available, relevant information.
Thus I have to this day the attitude that stability in digital filters means "not exponential growth" from bounded
inputs, but allows polynomial growth, and this is not the standard stability criterion derived from classical analog filters, where if it were not bounded you would melt things down; and anyway they had never really thought hard about integration as a filter process.
We will take up this important topic of recursive filters, which are necessary for integration, in the next
chapter.
17
Digital Filters-IV
We now turn to recursive filters which have the form
From this formula it will be seen we have values on only one side of the current value n, and we use both
old and the current signal values, u_n, and old values of the outputs, y_n. This is classical, and arises because we
are often processing a signal in real time and do not have access to future values of the signal.
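In symbols, one form consistent with this description is the following. The coefficient names $a_j$, $b_j$ and the limits $N$, $M$ are assumptions introduced here for illustration, not taken from the text.

```latex
y_n = \sum_{j=0}^{N} a_j\, u_{n-j} \;+\; \sum_{j=1}^{M} b_j\, y_{n-j}
```

The first sum uses the current and old signal values; the second sum starts at $j = 1$ and feeds old outputs back in, so every index lies on one side of $n$.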
But considering basics, we see that if we did have "future values" then a two-sided prediction would probably
be much more accurate. We would then, in computing the y_n values, face a system of simultaneous linear
equations, nothing to be feared in these days of cheap computing. We will set aside this observation,
noting only that often these days we record the signal on a tape or other media, and later process it in the lab,
and therefore we have the future available now. Again, in picture processing, a recursive digital filter which
used only data from one side of the point being processed would be foolish since it would not use some
of the available, relevant information.
The next thing we see is that the use of old output as new input means we have feedback, and this
automatically means questions of stability. It is a condition we must watch at all times in the design of a
recursive filter; it will restrict what we can do. Stability here means the effects of the initial conditions do
not dominate the results.
Being a linear system, we see that whatever pure frequency we put into the filter, when in the steady state only
that frequency can emerge, though it may be phase shifted. The transients, however, can have other
frequencies which arise from the solution of the homogeneous difference equation. The fact is we are
solving a difference equation with constant coefficients with the u_n terms forming the "forcing function";
that is exactly what a recursive filter is, and nothing else.
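A minimal sketch of the difference-equation view, assuming zero initial conditions and coefficient lists named `a` (feedforward) and `b` (feedback); these names and the function `recursive_filter` are illustrative, not from the text. Note that here `b[0]` multiplies the previous output. Feeding in a single impulse shows the homogeneous solution on its own: with feedback 0.5 it dies out geometrically, with feedback 2.0 it grows exponentially.

```python
def recursive_filter(a, b, u):
    """Apply y[n] = sum_j a[j]*u[n-j] + sum_j b[j]*y[n-1-j] with zero initial conditions.

    a: coefficients on the current and old inputs (a[0] multiplies u[n]).
    b: coefficients on old outputs (b[0] multiplies y[n-1]); this is the feedback.
    """
    y = []
    for n in range(len(u)):
        acc = 0.0
        for j, aj in enumerate(a):
            if n - j >= 0:
                acc += aj * u[n - j]
        for j, bj in enumerate(b):
            if n - 1 - j >= 0:
                acc += bj * y[n - 1 - j]
        y.append(acc)
    return y


impulse = [1.0] + [0.0] * 7
decaying = recursive_filter([1.0], [0.5], impulse)  # stable: geometric decay
growing = recursive_filter([1.0], [2.0], impulse)   # unstable: exponential growth
print(decaying)
print(growing)
```

The input `u` plays the role of the forcing function; everything the filter does after the impulse has passed comes from the feedback terms alone.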
We therefore assume for the steady state (which ignores the transients)
(with the A's possibly complex to allow for the phase shift), and this leads, on solving for the ratio A_O/A_I,
to the transfer function
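The algebra can be sketched as follows, assuming the recursive form $y_n = \sum_{j=0}^{N} a_j u_{n-j} + \sum_{j=1}^{M} b_j y_{n-j}$ (the coefficient names are assumptions made here) and reading the amplitudes as $A_I$ for the input and $A_O$ for the output. Substitute a pure frequency for both signal and output, cancel the common factor $e^{i\omega n}$, and solve for the ratio of amplitudes.

```latex
u_n = A_I\, e^{i\omega n}, \qquad y_n = A_O\, e^{i\omega n}

A_O\, e^{i\omega n} = A_I \sum_{j=0}^{N} a_j\, e^{i\omega (n-j)} + A_O \sum_{j=1}^{M} b_j\, e^{i\omega (n-j)}

H(\omega) = \frac{A_O}{A_I}
          = \frac{\sum_{j=0}^{N} a_j\, e^{-i\omega j}}{1 - \sum_{j=1}^{M} b_j\, e^{-i\omega j}}
```

With no feedback terms the denominator is 1 and only the numerator polynomial remains, which is the non-recursive case.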
Feedback and System Instability
- Recursive filters are represented as rational functions in a complex variable rather than simple polynomials.
- The design of recursive filters currently lacks a systematic theory, relying instead on specialized methods like Butterworth, Chebyshev, and elliptic filters.
- Feedback systems are prone to instability, often caused by a delay between an action and the detection of its result.
- The author uses a hotel shower analogy to illustrate how delayed feedback leads to 'hunting' and oscillation around a target temperature.
- Recursive digital filters share theoretical roots with predictor-corrector formulas used in solving ordinary differential equations.
- Stability is more complex in differential equations because feedback paths can be both linear and nonlinear.
I found myself, in spite of many experiences, in the same classic hunting situation of instability.
This is a rational function in the complex variable exp{iωt} = z rather than, as before with non-recursive
filters, a polynomial in z. There is a theory of Fourier series representation of a function; there is not as yet a
theory of the representation of a function as the ratio of two Fourier series (though I see no reason why there
cannot be such a theory). Hence the design methods are at present not systematic (as Kaiser did for the non-recursive
filter design theory), but rather a collection of trick methods. Thus we have Butterworth, two
types of Chebyshev (depending on having the equal ripples in the pass or the stop band), and elliptic filters
(whose name comes from the fact that elliptic functions are used) which are equal ripple in both.
I will only talk about the topic of feedback. To make the problem of feedback graphic I will tell you a
story about myself. One time long ago I was host of a series of six half-hour TV programs about computers and computing, and it was made mainly in San Francisco. I found myself out there frequently,
and I got in the habit of staying always in the same room in the same hotel. It is nice to be familiar with the
details of your room when you are tired late at night or when you may have to get up in the middle of the night, hence the desire for the same room.
Well, the plumber had put nice, large diameter pipes in the shower, Figure 17.I. As a result in the
morning when I started my shower it was too cold, so I turned up the hot water knob, still too cool, so more, still too cool, and more, and then when it was the right temperature I got in. But of course it got hotter and hotter as the water which was admitted earlier finally got up the pipe and I had to get out, and try again to find a suitable adjustment of the knob. The delay in the hot water getting to me was the trouble. I found
myself, in spite of many experiences, in the same classic hunting situation of instability. You can either view
my response as being too strong (I was too violent in my actions), or else the detection of the signal as being too much delayed (I was too hasty in getting into the tub). Same effect in the long run! Instability! I never really got to accept the large delay I had to cope with, hence I daily had a minor trouble first thing in the morning! In this graphic example you see the essence of instability.
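The shower story can be imitated with a toy model, offered only as a sketch: the bather nudges the knob in proportion to the error felt, but the water felt now left the knob `delay` steps ago. The update rule, the numbers, and the names `shower` and `worst_late_error` are all assumptions made for illustration. The same correction strength that settles at once with no delay hunts with growing swings when the delay is three steps, while a gentler correction copes with the delay.

```python
def shower(gain, delay, steps=60, target=38.0, start=15.0):
    """Knob history when corrections are based on water that left the knob `delay` steps ago."""
    knob = [start] * (delay + 1)              # the pipe initially holds water at `start`
    for _ in range(steps):
        felt = knob[-1 - delay]               # temperature arriving at the shower head now
        knob.append(knob[-1] + gain * (target - felt))
    return knob


def worst_late_error(history, target=38.0, tail=10):
    """Largest distance from the target over the last `tail` steps."""
    return max(abs(k - target) for k in history[-tail:])


print(worst_late_error(shower(gain=0.8, delay=0)))  # strong response, no delay: settles
print(worst_late_error(shower(gain=0.8, delay=3)))  # same response, delayed signal: hunting grows
print(worst_late_error(shower(gain=0.2, delay=3)))  # gentler response: settles despite the delay
```

Either cure from the story works in the model: weaken the response or shorten the delay.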
I will not go on to the design of recursive digital filters here, only note I had effectively developed the
theory myself in coping with corrector formulas for numerically solving ordinary differential equations.
The form of the corrector in a predictor-corrector method is
We see the u_j of the recursive filter are now the derivatives y_n' of the output and come from the differential
equation. In the standard nonrecursive filter there are no feedback paths; the y_n that are computed do not appear
later in the right hand side. In the differential equation formula they appear both in this feedback path and
also, through the derivative terms, they form another, usually nonlinear, feedback path. Hence stability is a more
difficult topic for differential equations than it is for recursive filters.
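A commonly used general form of such a corrector, written here with assumed coefficient names $a_j$, $b_j$ and step size $h$ (the text's own notation may differ), is:

```latex
y_{n+1} = \sum_{j=0}^{N} a_j\, y_{n-j} \;+\; h \sum_{j=-1}^{N} b_j\, y'_{n-j},
\qquad y' = f(x, y)
```

The first sum is the linear feedback path of a recursive filter. The derivative terms take the place of the input signal, but because each $y'$ is computed from $y$ through $f$, they close a second loop, and that loop is nonlinear whenever $f$ is.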
Figure 17.I
Challenging Expertise and Choosing Problems
- The author identifies a counter-example to the common claim that all recursive filters must have an infinite impulse response.
- He argues that experts often repeat inherited knowledge without questioning its validity in current contexts.
- A chance encounter leads the author to a difficult problem involving the differentiation of ragged radioactive spectrum data.
- The author insists on visiting the physicist's laboratory to evaluate the researcher's competence before committing to the project.
- The author emphasizes the importance of vetting the significance of a problem before dedicating time and effort to it.
- The ultimate moral is to prioritize working on problems that have the potential for significant impact.
If you will only ask yourself, "Is what I am being told really true?" it is amazing how much you can find is, or borders on, being false, even in a well developed field!
These recursive filters are often called "infinite impulse response filters" (IIR) because a single
disturbance will echo around the feedback loop, which even if the filter is stable will die out only like a
geometric progression. Being me, of course I asked myself if all recursive filters had to have this property,
and soon found a counterexample. True, it is not the kind of filter you would normally design, but it
showed their claim was superficial. If you will only ask yourself, "Is what I am being told really true?" it is amazing how much you can find is, or borders on, being false, even in a well developed field!
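The text does not say which counterexample was found. One well-known filter with this property, offered here as an illustration and not necessarily the author's, is the running sum written recursively: it feeds back its previous output, yet its response to a single impulse stops exactly after `width` samples. The function name is hypothetical.

```python
def recursive_running_sum(u, width):
    """y[n] = y[n-1] + u[n] - u[n-width]: uses feedback, yet only `width` inputs ever matter."""
    y = []
    previous = 0.0
    for n, value in enumerate(u):
        dropped = u[n - width] if n >= width else 0.0
        previous = previous + value - dropped
        y.append(previous)
    return y


impulse = [1.0] + [0.0] * 9
response = recursive_running_sum(impulse, width=4)
print(response)  # ones for the first four samples, then exact zeros
```

The subtraction of the oldest input cancels the echo in the loop, so the impulse response is finite even though the form is recursive.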
In Chapter 26 I will take up the problem of dealing with the expert. Here you see a simple example of
what happens all too often. The experts were told something in class when they were students first learning things, and at the time they did not question it. It becomes an accepted fact, which they repeat and never
really examine to see if what they are saying is true or not, especially in their current situation.
Let me now turn to another story. A lady in the Mathematics Department at Bell Telephone Laboratories
was square dancing with a physicist one weekend at a party, and on Monday morning in the hallway she casually mentioned to me a problem he had. He was measuring the number of counts in a radioactive
experiment at each of, as I remember, 256 energy levels. It is called "the spectrum of the process". His
problem was he needed the derivative of the data.
Well, you know: (a) the number of nuclear counts at a given energy is bound to produce a ragged curve,
and (b) differentiating this to get the local slope is going to be a very difficult thing to do. The more I thought about her casual remark the more I felt he needed real guidance, meaning me! I looked him up in the Bell Telephone Laboratories phone book and explained my interest and how I got it. He immediately wanted to come up to my office, but I was obdurate and insisted on meeting in his laboratory. He tried using his office, and I stuck to the lab. Why? Because I wanted to size up his abilities and decide if I thought his problem was worth my time and effort, since it promised to be a tough nut to crack. He passed the lab test with flying colors; he was clearly a very competent experimenter. He was at about the limit of what he could do:
a week's run to get the data and a lot of shielding was around the radioactive source, hence not much we could do to get better data. Furthermore, I was soon convinced, although I knew little about the details, his experiment was important to physics as well as to Bell Telephone Laboratories. So I took on the problem.
Moral: To the extent you can choose, then work on problems you think will be important.
The Curse of Expertise
- The author highlights how specialized expertise can limit a professional's ability to apply their skills to non-traditional contexts, such as treating energy as a time variable.
- By modeling theoretical expectations against synthetic data, the team identified that the signal occupied only 5% of the Nyquist interval, allowing for 95% noise removal.
- The author emphasizes the importance of 'degrees of freedom' in data processing, correcting the physicist's attempt to dishonestly adjust filter cutoffs mid-run.
- A successful collaboration resulted in a classic paper after the author persuaded the physicist to use square roots of counts to achieve equal variances.
- The narrative argues for the increasing necessity of 'generalists' who can bridge narrow specializations and maintain a broader, honest view of scientific problems.
- Digital filtering is often associated with time signals, but its future utility lies in diverse, special-purpose studies across various independent variables.
The curse of the expert with their limited view of what they can do.
Obviously it was a smoothing problem, and Kaiser was just teaching me the facts, so what better to do
than take the experimentalist to Kaiser and get Kaiser to design the appropriate differentiating filter?
Trouble immediately! Kaiser had always thought of a signal as a function of time, and the square of the area under the curve as the energy, but here the energy was the independent variable! I had repeated trouble with
Kaiser over this point until I bluntly said, "All right, his energy is time and the measurements, the counts, is
the voltage". Only then could Kaiser do it. The curse of the expert with their limited view of what they can
do. I remind you Kaiser is a very able man, yet his expertise, as so often happens to the expert, limited his
view. Will you in your turn do better? I am hoping such stories as this one will help you avoid that pitfall.
As I earlier observed, it is usually the signal which is in the lower part of the Nyquist interval of the
spectrum and the noise is pretty well across the whole of the Nyquist interval, so we needed to find the
cutoff edge between the physicist's meaningful signal and the flat white noise. How to find it? First, I
extracted from the physicist the theoretical model he had in his mind, which was a lot of narrow spectral
lines of gaussian shape on top of a broad gaussian shape (I suspected Cauchy shapes, but did not argue with him as the difference would be too small, given the kind of data we had). So we modeled it, and he created
some synthetic data from the model. A quick spectral analysis, via an FFT, gave the signal confined to the lowest 1/20 of the Nyquist interval. Second, we processed a run of his experimental data and found the same location for the edge! What luck! (Perhaps the luck should be attributed to the excellence of the
experimenter.) For once theory and practice agreed! We would be able to remove about 95% of the noise.
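The first step, locating where a model's spectrum ends, can be sketched as below. Everything specific is an assumption for illustration: the line positions, widths and heights of the synthetic model are hypothetical, the edge is taken as the bin below which 99.9% of the power lies, and a direct DFT stands in for the FFT (it gives the same numbers, only more slowly).

```python
import cmath
import math


def power_spectrum(x):
    """Power at frequency bins 0..N/2, computed with a direct DFT."""
    n = len(x)
    spectrum = []
    for k in range(n // 2 + 1):
        total = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(total) ** 2)
    return spectrum


def cutoff_bin(power, fraction=0.999):
    """Smallest bin k such that bins 0..k hold at least `fraction` of the total power."""
    goal = fraction * sum(power)
    running = 0.0
    for k, p in enumerate(power):
        running += p
        if running >= goal:
            return k
    return len(power) - 1


def gaussian(t, center, sigma, height):
    return height * math.exp(-0.5 * ((t - center) / sigma) ** 2)


# Hypothetical model: two narrow lines riding on one broad hump, 256 energy channels.
model = [gaussian(t, 128, 60, 1.0) + gaussian(t, 90, 12, 0.5) + gaussian(t, 170, 12, 0.5)
         for t in range(256)]
mean_level = sum(model) / len(model)
centered = [v - mean_level for v in model]      # remove the constant level first
edge = cutoff_bin(power_spectrum(centered))
print("signal edge at", edge / (len(model) // 2), "of the Nyquist interval")
```

Running the same measurement on a real data run and comparing the two edges is the check described above; flat noise beyond the edge can then be filtered out.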
Kaiser finally wrote for him a program which would, given the cutoff edge position wherever the
experimenter chose to put it, design the corresponding filter. The program: (1) designed the corresponding differentiating filter, (2) wrote the program to compute the smoothed output, and then (3) processed the data through this filter without any interference from the physicist.
I later caught the physicist adjusting the cutoff edge for different parts of the energy data on the same run,
and had to remind him there was such a thing as "degrees of freedom", and what he was doing was not
honest data processing. I had much more trouble, once things were going well, persuading him that to get the most out of his expensive data he should actually work in the square roots of the counts, as they had equal variances. But he finally saw the light and did so. He and Kaiser wrote a classic paper in the area, as it
opened the door on a new range of things which could be done.
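Why square roots equalize the variances can be sketched as follows, under the usual assumption (not stated in the text) that the count $X$ in each channel is Poisson distributed with mean $\lambda$, and using a first-order expansion of $g(X)$ about $\lambda$.

```latex
\operatorname{Var}(X) = \lambda, \qquad
\operatorname{Var}\bigl(g(X)\bigr) \approx g'(\lambda)^2 \operatorname{Var}(X)

g(x) = \sqrt{x} \;\Rightarrow\; g'(\lambda) = \frac{1}{2\sqrt{\lambda}}
\;\Rightarrow\; \operatorname{Var}\bigl(\sqrt{X}\bigr) \approx \frac{1}{4\lambda}\,\lambda = \frac{1}{4}
```

The raw counts have a variance that changes from channel to channel with the count itself; the square roots have roughly the same variance everywhere, so a single linear filter treats all channels fairly. The approximation is good when the counts are not small.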
My contribution? Mainly, first identifying the problem, next getting the right people together, then
monitoring Kaiser to keep him straight on the fact that filtering need not have exclusively to do with time signals, and finally, reminding them of what they knew from statistics (or should have known and probably did not).
It seems to me from my experience this role is increasingly needed as people get to be more highly
specialized and narrower and narrower in their knowledge. Someone has to keep the larger view and see to
it things are done honestly. I think I came by this role from a long education in the hands of John Tukey, plus a good basic grounding in the form of the universal tool of Science, namely Mathematics. I will
talk in Chapter 23 about the nature of Mathematics.
Most signal processing is indeed done on time signals. But most digital filters will probably be designed
for small, special purpose studies, not necessarily signals in time. This is where I ask for your future
attention.
Digital Filtering for Management
- Digital filters are essential tools for top-level managers to identify long-term trends within noisy organizational data.
- Applying low-level filtering to non-standard datasets often yields greater gains than using them for traditional engineering tasks like radar reduction.
- Fourier analysis assumes a linear model and can lead to massive financial waste when applied to highly nonlinear phenomena.
- The running median filter is a powerful nonlinear tool that smooths local noise while preserving sharp discontinuities in a system.
- Every linear theory, including digital signal processing, is governed by an inherent uncertainty principle similar to that of Quantum Mechanics.
- Managers must be wary of using intellectual tools like Fourier analysis simply because they do not know what else to do.
When this was pointed out to them, their reply seemed to be they did not know what else to do, so they persisted in doing the wrong thing!
Suppose when you are in charge of things at the top, you are interested in some data which shows past records of relative expenses of manpower to equipment. It is bound to be noisy data, but you would like
to understand, in a basic sense, what is going on in the organization: what long term trends are happening so slowly people hardly sense them as they happen, but which nevertheless are fundamental to understand
if you are to manage well. You will need a digital filter to smooth the data to get a glimpse of the trend, if it
exists. You do not want to find a trend when it does not exist, but if it does you want to know pretty much what it has been, so you can project what it is likely to be in the near future. Indeed, you might want to observe, if the data will support it, any change in the slope of the trend. Some signals, such as the ratio of
fire power to tonnage of the corresponding ship, need not involve time at all, but will tell you something about the current state of the Navy. You can, of course, also study the relationship as a function of time.
I suggest strongly that at the top of your career you will be able to use a lot of low level digital filtering of
signals, whether in time or not, so you will be better able to manage things. Hence, I claim, you will
probably design many more filters for such odd jobs than you will for radar data reduction and such
standard things. It is usually in the new applications of knowledge where you can expect to find the greatest gains.
Let me supply some warnings against the misuse of intellectual tools, and I will talk in Chapter 27 on topics
closer to statistics than I have time for now. Fourier analysis implies linearity of the underlying model. You can use it on slightly nonlinear situations, but often elaborate Fourier analyses have failed because the underlying phenomenon was too nonlinear. I have seen millions of dollars go down that drain when it was
fairly obvious to the outsider the nonlinearities would vitiate all the linear analysis they could do using the Fourier function approach. When this was pointed out to them, their reply seemed to be they did not know what else to do, so they persisted in doing the wrong thing! I am not exaggerating here.
How about nonlinear filters? The possibilities are endless, and must, of course, depend on the particular
problem you have on hand. I will take up only one, the running median filter. Given a set of data you
compute the running median as the output. Consider how it will work in practice. First, you see it will tend
to smooth out any local noise; the median will be near the average, which is the straight line least squares fit
used for local smoothing. But at a discontinuity, Figure 17.II, say we picture a flat level curve and then a
drop to another flat curve, what will the filter do? With an odd number of terms in the median filter, you see the output will stay up until you have more than half of the points on the lower level, whereupon it will jump
to the lower level. It will follow the discontinuity fairly well, and will not try to smooth it out completely! For some situations that is the kind of filtering you want. Remove the noise locally, but do not lose the sudden changes in the state of the system being studied.
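A minimal sketch of a running median, assuming a centered window of odd width that is simply truncated at the two ends of the data (the end treatment and the function name are choices made here, not taken from the text). The first example is the flat level followed by a drop; the second is a single wild point on a flat level.

```python
from statistics import median


def running_median(data, width=5):
    """Median over a centered window of odd `width`; the window is cut short at the ends."""
    if width % 2 == 0:
        raise ValueError("width must be odd")
    half = width // 2
    out = []
    for i in range(len(data)):
        window = data[max(0, i - half): i + half + 1]
        out.append(median(window))
    return out


step = [10, 10, 10, 10, 10, 2, 2, 2, 2, 2]
print(running_median(step, width=5))    # the drop stays sharp and in the same place

spike = [10, 10, 10, 50, 10, 10, 10]
print(running_median(spike, width=3))   # the isolated wild point is removed
```

A running average of the same width would smear the step over several points and would only dilute the spike; the median does neither.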
I repeat, Fourier analysis is linear, and there exist many nonlinear filters, but the theory is not well
developed beyond the running median. Kalman filters are another example of the use of partially nonlinear filters, the nonlinear part being in the "adapting" itself to the signal.
One final basic observation I made as I tried to learn digital filters. One day in examining a book on
Fourier integrals, I found there is a theorem which states the variability of the function times the variability
of its transform must exceed a certain constant. I said to myself, "What else is this than the famous
uncertainty principle of Quantum Mechanics?" Yes, every linear theory must have an uncertainty principle
Linearity and Simulation Risks
- The uncertainty principle in quantum mechanics may be a mathematical artifact of assuming linear time invariance rather than an inherent physical reality.
- The Eddington fisherman story illustrates how our tools and methodologies predetermine the limits of what we can observe.
- Scientific leadership requires a delicate balance between doubting established rules and accepting them to avoid paralysis.
- The shift from physical experimentation to computer simulation risks a return to scholasticism where textbooks are favored over reality.
- Simulations are increasingly preferred because they are cheaper, faster, and capable of modeling scenarios impossible to recreate in a laboratory.
It is as if you put on blue tinted glasses; everywhere you look you must see things with a bluish tint!
involving conjugate variables. Once you adopt the linear approach, and QM claims absolute additivity of
the eigenstates, then you must find an uncertainty principle. Linear time invariance leads automatically to
the eigenfunctions e^{iωt}. They lead immediately to the Fourier integral, and Fourier integrals have the
uncertainty principle. It is as if you put on blue tinted glasses; everywhere you look you must see things with a bluish tint! You are therefore not sure the famous uncertainty principle of QM is really there or not; it may be only the effect of the assumed linearity. More than most people want to believe, what we see depends on how we approach the problem! Too often we see what we want to see, and therefore you need to consciously adopt a scientific attitude of doubting your own beliefs.
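One standard way to state the Fourier theorem referred to, in notation assumed here rather than taken from the text: for a function $f(t)$ with transform $F(\omega)$, let $\sigma_t$ and $\sigma_\omega$ be the root-mean-square widths of $|f|^2$ and $|F|^2$ about their means $\bar t$ and $\bar\omega$.

```latex
\sigma_t^2 = \frac{\int (t-\bar t)^2\, |f(t)|^2\, dt}{\int |f(t)|^2\, dt}, \qquad
\sigma_\omega^2 = \frac{\int (\omega-\bar\omega)^2\, |F(\omega)|^2\, d\omega}{\int |F(\omega)|^2\, d\omega}, \qquad
\sigma_t\, \sigma_\omega \ge \tfrac{1}{2}
```

Equality holds only for gaussian shapes. Nothing in the statement refers to physics; it is a property of the Fourier transform pair alone.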
To illustrate this I will repeat the Eddington story of the fishermen. They used a net for fishing, and when
they examined the size of the fish they had caught they decided there is a minimum size to the fish in the
sea.
In closing, if you do not, now and then, doubt accepted rules it is unlikely you will be a leader into new
areas; if you doubt too much you will be paralyzed and will do nothing. When to doubt, when to examine the basics, when to think for yourself, and when to go on and accept things as they are, is a matter of style, and
I can give no simple formula on how to decide. You must learn from your own study of life. Big advances usually come from significant changes in the underlying beliefs of a field. As our state of knowledge advances the balances between aspects of doing research change. Similarly, when you are young then
serendipity has probably a long time to pay off, but when you are old it has little time and you should concentrate more on what is at hand.
Figure 17.II
18
Simulation - I
A major use of computers these days, after writing and text editing, graphics, program compilation, etc. is
simulation.
A simulation is the answer to the question: "What if...?"
What if we do this? What if this is what happened?
More than 9 out of 10 experiments are done on computers these days. I have already mentioned my
serious worries that we are depending on simulation more and more, and are looking at reality less and less, and
hence seem to be approaching the old scholastic attitude that what is in the textbooks is reality and does not need
constant experimental checks. I will not dwell on this point further now.
We use computers to do simulations because they are:
1. cheaper,
2. faster,
3. often better,
4. can do what you cannot do in the lab.
Simulations Versus Laboratory Experiments
- Programming simulations is often cheaper and faster than maintaining laboratory equipment, which suffers from both physical and intellectual 'shelf life.'
- Simulations can provide more accurate readings in dynamic situations and explore wider variable ranges than physical setups allow.
- A simulation can model scenarios where physical experimentation is impossible, such as the design of the first atomic bomb where critical mass is binary.
- Effective simulation requires deep domain expertise to determine which physical factors are vital and which can be safely ignored.
- The economic viability of a simulation depends on highly repetitive computational tasks that justify the initial cost of programming.
Intellectual shelf life is often more insidious than is physical shelf life.
On points 1 and 2, as expensive and slow as programming is, with all its errors and other faults, it is generally much cheaper and faster than getting laboratory equipment to work. Furthermore, in recent years
expensive, top quality laboratory equipment has been purchased and then you often find in less than 10 years it must be scrapped as being obsolete. All of the above remarks do not apply when a situation is constantly recurring and the lab testing equipment is in constant use. But let lab equipment lie idle for some
time, and suddenly it will not work properly! This is called "shelf life", but it is sometimes the "shelf life"
of the skills in using it rather than the "shelf life" of the equipment itself! I have seen it all too often in my
direct experience. Intellectual shelf life is often more insidious than is physical shelf life.
On point 3, very often we can get more accurate readings from a simulation than we can get from a direct
measurement in the real world. Field measurements, or even laboratory measurements, are often hard to get
accurately in dynamic situations. Furthermore, in a simulation we can often run over much wider ranges of
the independent variables than we can do with any one lab setup.
On point 4, perhaps most important of all, a simulation can do what no experiment can do.
I will illustrate these points with specific stories using simulations I have been involved in so you can
understand what simulations can do for you. I will also indicate some of the details so those who have had only a little experience with simulations will have a better feeling for how you go about doing one; it is not feasible to actually carry out a big simulation in class, as they often take years to complete.
The first large computation I was involved with was at Los Alamos during WW-II when we were
designing the first atomic bomb. There is no possibility of a small scale experiment: either you have a
critical mass or you do not.
Without going into classified details, you will recall one of the two designs was spherically symmetric
and was based on implosion, Figure 18.I. They divided the material and space into many concentric shells.
They then wrote the equations for the forces on each shell (both sides of it) as well as the equation of state
which gives, among other things, the density of the material from the pressures on it. Next they broke time
up into intervals of 10^-8 seconds (shakes, from a shake of a lamb's tail, I suppose). Then for each time
interval we calculated, using the computers, where each shell would go and what it would do during that
time, subject to the forces on it. There was, of course, a special treatment for the shock wave front from the
outer explosive material as it went through the region. But the rules were all, in principle, well known to
experts in the corresponding fields. The pressures were such that there had to be a lot of guessing that things would
be much the same outside the realms of past testing, but a little physics theory gave some assurances.
This already illustrates a main point I want to make. It is necessary to have a great deal of special
knowledge in the field of application. Indeed, I tend to regard many of the courses you have taken, and will take, as merely supplying the corresponding expert knowledge. I want to emphasize this obvious necessity for expert knowledge; all too often I have seen experts in simulation ignore this elementary fact and think
they could safely do simulations on their own. Only an expert in the field of application can know if what
you have failed to include is vital to the accuracy of the simulation, or if it can safely be ignored.
Another main point is that in most simulations there has to be a highly repetitive part, done again and
again from the same piece of programming, or else you cannot afford to do the initial programming! The
same computations were done for each shell and then for each time interval: a great deal of repetition!
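The repetitive structure can be sketched with a deliberately toy model. This is not the Los Alamos computation: the linear spring forces, unit masses, and the names `step_shells` and `simulate` are assumptions chosen only to show one small rule applied to every shell, inside a loop over every time interval.

```python
def step_shells(radius, velocity, dt, stiffness=1.0, rest=1.0):
    """Advance every shell one time interval with the same rule (toy linear forces, unit masses)."""
    n = len(radius)
    force = [0.0] * n
    for i in range(n):
        if i > 0:          # force from the inner neighbour
            force[i] -= stiffness * ((radius[i] - radius[i - 1]) - rest)
        if i < n - 1:      # force from the outer neighbour
            force[i] += stiffness * ((radius[i + 1] - radius[i]) - rest)
    new_velocity = [v + f * dt for v, f in zip(velocity, force)]
    new_radius = [r + v * dt for r, v in zip(radius, new_velocity)]
    return new_radius, new_velocity


def simulate(radius, velocity, dt, steps):
    """The same piece of code, run again for each time interval."""
    for _ in range(steps):
        radius, velocity = step_shells(radius, velocity, dt)
    return radius, velocity


# Outer shell pushed inward by 0.2 from its rest spacing; watch the disturbance travel inward.
final_radius, final_velocity = simulate([1.0, 2.0, 3.0, 3.8], [0.0] * 4, dt=0.01, steps=200)
print(final_radius)
```

A real calculation would replace the spring rule with the equation of state and add the special treatment of the shock front, but the two nested loops would remain.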
Simulation and Machine Power
- The inherent power of modern machines often exceeds our ability to program them efficiently.
- Effective simulation design requires identifying and exploiting repetitive patterns within a problem.
- Weather prediction serves as a primary example of complex simulation through atmospheric modeling.
- Atmospheric simulations divide the air into discrete blocks with specific initial conditions.
- Key variables for these blocks include temperature, pressure, moisture, and velocity.
In many situations, the power of the machine itself so far exceeds our powers to program it that it is wise to look early and constantly for the repetitive parts of a proposed simulation.
In many situations, the power of the machine itself so far exceeds our powers to program it that it is wise to look
early and constantly for the repetitive parts of a proposed simulation, and when possible cast the simulation
in the corresponding form.
A very similar simulation to the atomic bomb arises in weather prediction. There the atmosphere is
broken up into large blocks of air, and the relevant conditions for cloud cover, albedo, temperature, pressure,
moisture, velocity, etc. must be initially assigned to each block, Figure 18.II. Then using conventional
Figure 18.I
Stability and Simulation Limits
- Simulations are highly feasible when a system exhibits stability and resistance to small changes, but become difficult when outcomes are sensitive to minor details.
- The 'butterfly effect' illustrates how small perturbations in weather systems can lead to drastically different short-term results despite long-term seasonal stability.
- Identifying whether stability or instability dominates a problem is crucial before committing significant time and resources to a simulation.
- Practical experience with the NIKE missile system showed that simulations can lead to counter-intuitive design improvements, such as favoring vertical launches.
- The author warns that mathematical models and constants used for small perturbations may lose accuracy when a simulation leads to large-scale structural changes.
- Prudence is required when promising results from simulations, as some problems are practically impossible to handle due to inherent instabilities.
Indeed, it is claimed whether or not a butterfly flaps its wings in Japan can determine whether or not a storm will hit this country and how severe it will be.
physics for the atmosphere, we trace where each block goes in a short time interval, along with the relevant
changes. It is the same kind of step by step evolution as before.
However, there is a significant difference between the two problems, the bomb and the weather
prediction. For the bomb small differences in what happened along the way did not greatly affect the overall
performance, but as you know the weather is quite sensitive to small changes. Indeed, it is claimed whether
or not a butterfly flaps its wings in Japan can determine whether or not a storm will hit this country and how
severe it will be.
This is a fundamental theme I must dwell on. When the simulation has a great deal of stability, meaning
resistance to small changes in its overall behavior, then a simulation is quite feasible; but when small changes in some details can produce greatly different outcomes then a simulation is a difficult thing to carry
out accurately. Of course, there is long term stability in the weather; the seasons follow their appointed
rounds regardless of small details. Thus there are both short term (day to day) instabilities in the weather, and
longer term (year to year) stabilities as well. But the ice ages show there are also very long term instabilities
in the weather, with apparently even longer stabilities!
I have met a large number of this last kind of problem. It is often very hard to determine in advance
whether one or the other, stability or instability, will dominate a problem, and hence the possibility or impossibility of getting the desired answers. When you undertake a simulation, look closely at this aspect of the problem before you get too involved and then find, after a lot of work, money, and time, you cannot get
suitable answers to the problem. Thus there are situations which are easy to simulate, ones which you cannot
in a practical sense handle at all, and most of the others which fall between the two extremes. Be prudent in
what you promise you can do via simulations!
When I went to Bell Telephone Laboratories in 1946 I soon found myself in the early stages of the design of the earliest NIKE system of guided missiles. I was sent up to MIT to use their RDA #2 differential analyser, given the interconnections of the parts of the analyser, and much advice from others who knew a lot more than I did about how to run the simulations.
They had a slant launch in the original design, along with variational equations which would give me information to enable me to make sensible adjustments to the various components, such as wing size. I should point out, I suppose, the solution time for one trajectory was about 1/2 hour, and about half way through one trajectory I had to commit myself to the next trial shot. Thus I had lots of time to observe and to think hard as to why things went as they did. After a few days I gradually got a "feeling" for the missile behavior, why it did as it did under the different guidance rules I had to supply. As time went on I gradually realized a vertical launch was best in all cases; getting out of the dense lower air and into the thin air above was better than any other strategy. I could well afford the later induced drag when I had to give guidance orders to bend the trajectory over. In doing so, I found I was greatly reducing the size of the wings, and realized, at least fairly well, the equations and constants I had been given, for estimating the changes in the effects due to changes in the structure of the missile, could hardly be accurate over so large a range of perturbations
Figure 18.II
Simulation Insight and Missile Design
- The author argues that slow, primitive computing allowed for an 'intimate feeling' for the simulation that high-speed volume output might have obscured.
- Early simulations led to radical design changes, including a vertical launch system and reducing wing size by two-thirds to improve end-game maneuverability.
- A philosophy of starting with simple simulations is advocated to gain system-wide insights before adding complex, disguising details.
- Conflicting data from the only available supersonic wind tunnels highlighted the extreme uncertainty in early guided missile development.
- The author applied similar simulation principles to traveling wave tube design, optimizing energy transfer between electron beams and electromagnetic waves.
Volume output seems to me to be a poor substitute for acquiring an intimate feeling for the situation being simulated.
(though they had never told me the source of the equations, I inferred it). So I phoned down for advice and found I was right; I had better come home and get new equations.
With some delay due to other users wanting their time on the RDA #2, I was soon back and running again, but with a lot more wisdom and experience. Again, I developed a feeling for the behavior of the missile; I got to "feel" the forces on it as various programs of trajectory shaping were tried. Hanging over the output plotters as the solution slowly appeared gave me the time to absorb what was happening. I have often wondered what would have happened if I had had a modern, high speed computer. Would I ever have acquired the feeling for the missile, upon which so much depended in the final design? I often doubt hundreds more trajectories would have taught me as much; I simply do not know. But that is why I am suspicious, to this day, of getting too many solutions and not doing enough very careful thinking about what you have seen. Volume output seems to me to be a poor substitute for acquiring an intimate feeling for the situation being simulated.
The results of these first simulations were that we went to a vertical launch (which saved a lot of ground equipment in the form of a circular rail and other complications), made many other parts simpler, and seemed to have shrunk the wings to about 1/3 of the size I was initially given. I had found bigger wings, while giving greater maneuverability in principle, produced in practice so much drag in the early stages of the trajectory that the later slower velocity in fact gave less maneuverability in the "end game" of closing in on the target.
Of course these early simulations used a simple atmosphere of exponential decrease in density as you go up, and other simplifications, which in simulations done years later were all modified. This brings up another belief of mine: doing simple simulations at the early stages lets you get insights into the whole system which would be disguised in any full scale simulation. I strongly advise, when possible, to start with the simple simulation and evolve it to a more complete, more accurate, simulation later so the insights can arise early. Of course, at the end, as you freeze the final design, you must put in all the small effects which could matter in the final performance. But (1) start as simply as you can provided you include all the main effects, (2) get the insights, and then (3) evolve the simulation to the fully detailed one.
Guided missiles were some of the earliest explorations of supersonic flight, and there was another great unknown in the problem. The data from the only two supersonic wind tunnels we had access to flatly contradicted each other!
Guided missiles led naturally to space flight where I played a less basic part in the simulations, and more as an outside source of advice and initial planning of the mission profile, as it is called.
Another early simulation I recall was the travelling wave tube design. Again, on primitive relay equipment I had lots of time to mull over things, and I realized I could, as the computation evolved, know what shape to give other than the always assumed constant diameter pipe. To see how this happens, consider the basic design of a travelling wave tube. The idea is you send the input wave along a tightly wound spiral around a hollow pipe, and hence the effective velocity of the electromagnetic wave down the pipe is greatly reduced. We then send down the center of the pipe an electron beam. The beam has initially a greater velocity than the wave has to go along the helix. The interaction of the wave and the beam means the beam will be slowed down, meaning energy goes from the beam to the wave, meaning the wave is amplified! But, of course, there comes a place along the pipe when their velocities are about the same and further interactions will only spoil things.
The Power of Active Minds
- The author improved wave energy transfer by calculating ideal pipe tapers and identifying nonlinear components that invalidated linear approximations.
- An active mind can contribute significantly to a specialized field even as an amateur by paying close attention to small details.
- Mastering jargon is essential for communication but dangerous because it can block thinking outside of a restricted area.
- Jargon serves as an evolutionary defense mechanism, similar to caveman tribalism, used to exclude outsiders from a group.
- Intellectual activity is required to gain the benefits of specialized language while avoiding its inherent pitfalls.
- Mathematics is not always a universal language, as demonstrated by complex simulations involving simultaneous differential equations.
Jargon is both a necessity and a curse.
So I got the idea if I gradually expanded the diameter of the pipe then again the beam would be faster than the wave and still more energy would be transferred from the beam to the wave. Indeed, it was possible to compute at each cycle of computation the ideal taper for the signal.
I also had the nasty idea since I had found the equations were really local linearizations of more complex nonlinear equations, I could, at about every twentieth to fiftieth step, estimate the nonlinear component. I found to their amazement on some designs the estimated nonlinear component was larger than the computed linear component, thus vitiating the approximation and stopping the useless computations.
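The idea of periodically comparing the neglected nonlinear part with the computed linear part can be sketched in a few lines. This is an illustrative assumption about how such a check might look, not the travelling wave tube equations themselves: the function `f`, the expansion point, the step `dx`, and the tolerance `tol` are all hypothetical.

```python
import math


def linear_and_nonlinear_parts(f, x0, dx, h=1e-6):
    """Split the change f(x0 + dx) - f(x0) into its linearized part and the rest."""
    slope = (f(x0 + h) - f(x0 - h)) / (2.0 * h)  # central-difference derivative
    linear = slope * dx                          # what the linearized model computes
    nonlinear = f(x0 + dx) - f(x0) - linear      # what the linearized model ignores
    return linear, nonlinear


def linearization_ok(f, x0, dx, tol=0.1):
    """True when the ignored part is at most tol times the linear part."""
    linear, nonlinear = linear_and_nonlinear_parts(f, x0, dx)
    return abs(nonlinear) <= tol * abs(linear)


# Hypothetical example: exp(x) linearized about x0 = 0.
print(linearization_ok(math.exp, 0.0, 0.01))  # small excursion
print(linearization_ok(math.exp, 0.0, 3.0))   # large excursion
```

A check of this kind costs little when run only every twentieth to fiftieth step, and a failure is the signal to stop computing with the linear model.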
Why tell the story? Because it illustrates another point I want to make: an active mind can contribute to a simulation even when you are dealing with experts in a field where you are a strict amateur. You, with your hands on all the small details, have a chance to see what they have not seen, and to make significant contributions, as well as save machine time! Again, all too often I have seen things missed during the simulation by those running it, and hence these things were not likely to get to the users of the results.
One major step you must do, and I want to emphasize this, is to make the effort to master their jargon. Every field seems to have its special jargon, one which tends to obscure what is going on from the outsider, and also, at times, from the insiders! Beware of jargon; learn to recognize it for what it is, a special language to facilitate communication over a restricted area of things or events. But it also blocks thinking outside the original area which it was designed to cover. Jargon is both a necessity and a curse. You should realize you need to be active intellectually to gain the advantages of the jargon and to avoid the pitfalls, even in your own area of expertise!
During the long years of cave man evolution apparently people lived in groups of around 25 to 100 in size. People from outside the group were generally not welcome, though we think there was a lot of wife stealing going on. When the long years of cave man living are compared with the few of civilization (less than ten thousand years) we see we have been mainly selected by evolution to resent outsiders, and one of the ways of doing this is the use of special jargon languages. The thieves' argot, group slang, husband and wife's private language of words, gestures, and even a lift of an eyebrow, are all examples of this common use of a private language to exclude the outsider. Hence this instinctive use of jargon when an outsider comes around should be consciously resisted at all times; we now work in much larger units than those of cave man and we must try continually to overwrite this earlier design feature in us.
Mathematics is not always the unique language you wish it were. To illustrate this point recall I earlier mentioned some Navy Intercept simulations involving the equivalent of 28 simultaneous first order differential equations. I need to develop a story. Ignoring all but the essential part of the story, consider the problem of solving one differential equation
The Nuances of Simulation
- A shared understanding of mathematical symbols does not guarantee a shared interpretation of their physical application.
- The author emphasizes that domain experts must be involved in detailed programming to prevent catastrophic errors in simulation logic.
- Creative mathematical approximations, such as treating differential equations as impedance lines, allow complex systems to be modeled on limited hardware.
- Simulations are not restricted to time-dependent functions but can also model spatial stability, such as signal growth across relay stations.
- The concept of 'space stabilization' describes how a pulse might decay locally at each station but grow uncontrollably as it travels across a continent.
- Almost any mathematically describable situation can be simulated, provided one is cautious with unstable systems.
It is as good an example as I know of to illustrate the fact both of us understood exactly what the mathematical symbols meantâwe both had no doubtsâbut there was no agreement in our interpretations of them!
Figure 18.III. Keep this equation in mind as I talk about the real problem. I programmed the real problem of 28 simultaneous differential equations to get the solution and then limited certain values to 1, as if it were voltage limiting. Over the objections of the proposer, a friend of mine, I insisted he go through the raw, absolute binary coding of the problem with me, as I explained to him what was going on at each stage. I refused to compute until he did this, so he had no real choice! We got to the limiting stage in the program and he said, "Dick, that is fin limiting, not voltage limiting," meaning the limited value should be put in at each step and not at the end. It is as good an example as I know of to illustrate the fact both of us understood exactly what the mathematical symbols meant (we both had no doubts) but there was no agreement in our interpretations of them! Had we not caught the error I doubt any real, live experiments involving airplanes would have revealed the decrease in maneuverability which resulted from my interpretation. That is why, to this day, I insist a person with the intimate understanding of what is to be simulated must be involved in the detailed programming. If this is not done then you may face similar situations where both the proposer and the programmer know exactly what is meant, but their interpretations can be significantly different, giving rise to quite different results!
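The difference between the two readings of "limit the value to 1" is easy to reproduce on a toy problem. The sketch below is an assumption-laden illustration only: the equation dy/dt = 3 cos(t), the forward-Euler stepping, and the names `integrate` and `rate` are hypothetical and have nothing to do with the actual 28-equation intercept problem.

```python
import math


def clip(value, limit=1.0):
    """Restrict value to the interval [-limit, +limit]."""
    return max(-limit, min(limit, value))


def rate(t):
    # Hypothetical driving term; the unlimited solution is y = 3 sin(t).
    return 3.0 * math.cos(t)


def integrate(t_end, dt=0.001, limit_each_step=False):
    """Forward-Euler solution of dy/dt = rate(t) with y(0) = 0."""
    y, t = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        y += dt * rate(t)
        if limit_each_step:
            y = clip(y)  # the limited value is fed back into the next step
        t += dt
    return clip(y)       # limiting applied only to the final answer otherwise


limited_at_end = integrate(math.pi, limit_each_step=False)
limited_each_step = integrate(math.pi, limit_each_step=True)
print("limit applied at the end:   ", limited_at_end)
print("limit applied at every step:", limited_each_step)
```

Both runs implement "the same" mathematical statement, yet at t = pi one ends near zero and the other sits on the lower limit, because in the second run the state that is carried forward has itself been limited.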
You should not get the idea simulations are always of time dependent functions. One problem I was given to run on the differential analyser we had built out of old M9 gun director parts was to compute the probability distributions of blocking in the central office. Never mind they gave me an infinite system of interconnected linear differential equations, each one giving the probability distribution of that many calls on the central office as a function of the total load. Of course on a finite machine something must be done, and I had only 12 integrators, as I remember. I viewed it as an impedance line, and using the difference of the last two computed probabilities I assumed they were proportional to the difference of the next two (I used a reasonable constant of proportionality derived from the difference from the two earlier functions); thus the term from the next equation beyond what I was computing was reasonably supplied. The answers were quite popular with the switching department, and made an impression, I believe, on my boss who still had a low opinion of computing machines.
There were underwater simulations, especially of an acoustic array put down in the Bahamas by a friend of mine where, of course, in winter he often had to go to inspect things and take further measurements. There were numerous simulations of transistor design and behavior. There were simulations of the microwave "jump-jump" relay stations with their receiver horns, and the overall stability arising from a single blip at one end going through all the separate relay stations. It is perfectly possible while each station recovers promptly from the blip, nevertheless the size of the blip could grow as it crossed the continent. At each relay station there was stability in the sense the pulse died out in time, but there was also the question of the stability in space: did a random pulse grow indefinitely as it crossed the continent? For colorful reasons I named the problem "Space stabilization". We had to know the circumstances in which this could and could not happen; hence a simulation was necessary because, among other things, the shape of the blip changed as it went across the continent.
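Stability in time together with instability in space can be illustrated with a chain of identical stations. What follows is a minimal sketch under invented assumptions: each station is modelled as a two-tap filter `STATION = [0.6, 0.6]`, which is not a description of the actual relay equipment, and the blip is a single unit sample.

```python
STATION = [0.6, 0.6]  # hypothetical impulse response of one relay station


def convolve(signal, kernel):
    """Plain discrete convolution of two finite sequences."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def blip_after(stations, blip=(1.0,)):
    """Shape of the blip after it has passed through the given number of stations."""
    signal = list(blip)
    for _ in range(stations):
        signal = convolve(signal, STATION)
    return signal


def peak_after(stations):
    return max(abs(v) for v in blip_after(stations))


for n in (1, 5, 10, 20):
    print(n, "stations: peak", round(peak_after(n), 4), "length", len(blip_after(n)))
```

Each station's own response is over after two samples, so every station is stable in time, yet the peak of the blip grows with the number of stations and its shape spreads out as it travels, which is why the shape had to be tracked by simulation.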
I hope you see almost any situation you can describe by some sort of mathematical description can be simulated in principle. In practice you have to be very careful when simulating an unstable situation, though I will tell you in Chapter 20 about an extreme case I had to solve because it was important to the
The Reliability of Simulations
- The author emphasizes a personal commitment to finding solutions for important problems regardless of the difficulty.
- Faulty simulations are a major risk because they can lead researchers to abandon viable ideas or promote false conclusions.
- The 'Club of Rome' world simulation is cited as a famous failure where equations were biased toward catastrophe and contained computational errors.
- Reliability is often obscured by irrelevant metrics like the amount of manpower used or the speed of the computer involved.
- Validation and authentic representation are essential requirements for any simulation used in critical decision-making processes.
It turned out the equations they chose were designed to show a catastrophe no matter how you started or chose most of the coefficients!
Laboratories, and that meant, at least to me, I had to get the solution, no matter what excuses I gave myself it could not be done. There are always answers of some sort for important problems if you are determined to get them. They may not be perfect, but in desperation something is better than nothing, provided it is reliable!
Faulty simulations have caused people to abandon good ideas, and these occur all too often! However, one seldom sees them in the literature as they are very, very seldom reported. One famous faulty simulation which was widely reported (before the errors were noted by others) was a whole world simulation done by the so called "Club of Rome". It turned out the equations they chose were designed to show a catastrophe no matter how you started or chose most of the coefficients! But it also turned out when others finally got the equations and tried to repeat the computations the computations had serious errors! I will turn to this aspect of simulating things in the next chapter as it is a very serious matter to report things which either make people believe what they want to believe, and are not so, or which discourage people from pursuing their good ideas.
Figure 18.III
19
Simulation-II
We now take up the question of the reliability of a simulation. I can do no better than quote from the Summer Computer Simulation Conference of 1975,
"Computer based simulation is now in widespread use to analyse system models and evaluate theoretical solutions to observed problems. Since important decisions must rely on simulation, it is essential that its validity be tested, and that its advocates be able to describe the level of authentic representation which they achieved."
It is an unfortunate fact when you raise the question of the reliability of many simulations you are often
told about how much man power went into it, how large and fast the computer is, how important the
problem is, and such things, which are completely irrelevant to the question that was asked.
I would put the problem slightly differently:
Simulation Versus Reality
- The relevance and accuracy of a simulation must be established before any work begins to avoid misleading or erroneous results.
- A significant gap often exists between the reliability of a simulation and the reliability of the actual physical event it models.
- Simulation experts frequently fall into the trap of identifying their models with reality while ignoring independent checks.
- While simulations like flight trainers allow for safe practice of emergency scenarios, they risk omitting vital new interactions as technology evolves.
- The increasing lack of real-world experience among practitioners makes it harder to ensure that models include all essential details.
- The persistence of errors in long-standing computer programs serves as a warning against overconfidence in complex simulation data.
His refusal to reply, under repeated requests, was a clear admission my point went home, he himself knew the Director did not understand this difference but thought the report was the reliability of the actual shot.
Why should anyone believe the simulation is relevant?
Do not begin any simulation until you have given this question a great deal of thought and found appropriate answers. Often there are all kinds of reasons given as to why you should postpone trying to answer the question, but unless it is answered satisfactorily then all that you do will be a waste of effort, or even worse, either misleading, or even plain erroneous. The question covers both the accuracy of the modeling and the accuracy of the computations.
Let me inject another true story. It happened one evening after a technical meeting in Pasadena, California we all went to dinner together and I happened to sit next to a man who had talked about, and was responsible for, the early space flight simulation reliability. This was at the time when there had been about eight space shots. He said they never launched a flight until they had a more than 99 point something percent reliability, say 99.44% reliability. Being me I observed there had been something like eight space shots; one live simulation had killed the astronauts on the ground, and we had had one clear failure, so how could the reliability be that high? He claimed all sorts of things, but fortunately for me the man on his other side joined in the chase and we forced a reluctant admission from him that what he calculated was not the reliability of the flight, but only the reliability of the simulation. He further claimed everyone understood that. Me: "Including the Director who finally approves of the flight?" His refusal to reply, under repeated requests, was a clear admission my point went home; he himself knew the Director did not understand this difference but thought the report was the reliability of the actual shot.
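The arithmetic behind the dinner-table objection can be written out. This is a rough sketch under stated assumptions: the eight shots are treated as independent trials, both the ground accident and the one clear failure are counted as failures, and 99.44% is taken as the per-shot reliability; none of this modelling is from the original conversation, and the function name `prob_at_least` is hypothetical.

```python
from math import comb


def prob_at_least(failures, shots, reliability):
    """Binomial probability of seeing at least `failures` failures in `shots` trials."""
    p_fail = 1.0 - reliability
    p_fewer = sum(
        comb(shots, k) * p_fail**k * reliability ** (shots - k)
        for k in range(failures)
    )
    return 1.0 - p_fewer


# Chance of two or more failures in eight shots if each shot were 99.44% reliable.
chance = prob_at_least(2, 8, 0.9944)
print(f"{chance:.5f}")
```

Under these assumptions the observed record would be a very unlikely event, well under one chance in a hundred, which is what makes the quoted figure implausible as a statement about real flights rather than about the simulation.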
He later tried to excuse what he had done with things like, what else could he do, but I promptly pointed out a lot of things he could do to connect his simulation with reality much closer than he had. That was a Saturday night, and I am sure by Monday morning he was back to his old habits of identifying the simulation with reality and making little or no independent checks which were well within his grasp. That is what you can expect from simulation experts: they are concerned with the simulation and have little or no regard for reality, or even "observed reality".
Consider the extensive business simulations and war gaming which goes on these days. Are all the essentials incorporated correctly into the model, or are we training the people to do the wrong things? How relevant to reality are these gaming models? And many other models?
We have long had airplane pilot trainers which in many senses give much more useful training than can be given in real life. In the trainer we can subject the pilot to emergency situations we would not dare to do in reality, nor could we ever hope to produce the rich variety the trainer can. Clearly these trainers are very valuable assets. They are comparatively cheap, efficient in the use of the pilot's time, and are very flexible. In the current jargon, they are examples of "virtual reality".
But as time goes on, and planes of other types are developed, will the people then be as careful as they should be to get all the new interactions into the model, or will some small, but vital, interactions of the new plane be omitted by oversight, thus preparing the pilot to fail in these situations?
Here you can see the problem clearly. It is not that simulations are not essential these days, and in the near future, rather it is necessary for the current crop of people, who have had very little experience with reality, to realize they need to know enough so the simulations include all essential details. How will you convince yourself you have not made a mistake somewhere in the vast amount of detail? Remember how many computer programs, even after some years of field use, still have serious errors in them! In many situations
The Sloppiest Simulation
- The accuracy and reliability of simulations are critical because errors can lead to loss of life, equipment, or capital.
- To demonstrate analog computing at a 1955 open house, the author created a tennis simulation using classical mechanics and physical dials.
- The simulation was deemed 'sloppy' because the author relied on peer intuition for constants rather than rigorous empirical testing.
- During the demonstration, children quickly mastered the game's interface while adults consistently failed to grasp the mechanics.
- The experiment revealed a fundamental cognitive gap: younger minds possess a unique elasticity for new ideas that older minds often lack.
- Professionals must account for this mental rigidity when presenting innovative concepts to older decision-makers.
I noticed, after a while, not one adult ever got the idea of what was going on enough to play successfully, and almost every child did!
such errors can mean the difference between life or death for one or more people, let alone the loss of
valuable equipment, money, and time.
The relevant accuracy and reliability of simulations are a serious problem. There is, unfortunately, no silver bullet, no magic incantation you can perform, no panacea for this problem. All you have is yourself.
Let me now describe my sloppiest simulation. In the summer of 1955 Bell Telephone Laboratories decided to hold an open house so the people living nearby, as well as relatives and friends of employees, could learn a little about what the people who worked there did. I was then in charge of, for that time, a large analog differential analyzer, and I was expected to give demonstrations all day Saturday. Much of what we were doing at that time was trajectories of guided missiles, and I was not about to get into security trouble showing some sanitized versions. So I decided a tennis game, which clearly involves aerodynamics, trajectories, etc. would be an honest demonstration of what we did, and anyway I thought it would be a lot more appealing and interesting to the visiting people.
Using classical mechanics I set up the equations, incorporated the elastic bounce, and set up the machine to play one base line with the human player on the other; both the angle of the racket and the hardness with which you hit the ball were set by two dials conveniently placed. Remember, in those days (1955) there were not the game playing machines in many public places, hence the exhibit was a bit novel to the visitors. I then invited a smart physicist friend, who was also an avid tennis player, to inspect and tune up the constants for bounce (asphalt court) and air drag. When he was satisfied, then behind his back I asked another physicist to give me a similar opinion without letting him alter the constants. Thus I got a reasonable simulation of tennis without "spin" on the ball.
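A digital analogue of such a demonstration might look like the following. It is a minimal sketch only: the drag and bounce constants are invented placeholders of exactly the unverified kind described above, the integration is a simple fixed-step scheme, and the function name `fly` and its parameters (standing in for the two dials) are hypothetical.

```python
import math

GRAVITY = 9.81      # m/s^2
DRAG = 0.01         # hypothetical drag constant per unit mass, 1/m
RESTITUTION = 0.75  # hypothetical fraction of vertical speed kept after a bounce


def fly(speed, angle_deg, height=1.0, dt=0.001, bounces=2):
    """Return the horizontal positions where a spinless ball strikes the court."""
    angle = math.radians(angle_deg)
    x, y = 0.0, height
    vx, vy = speed * math.cos(angle), speed * math.sin(angle)
    hits = []
    while len(hits) < bounces:
        v = math.hypot(vx, vy)
        # Quadratic air drag opposes the velocity; gravity acts downward.
        vx += -DRAG * v * vx * dt
        vy += (-GRAVITY - DRAG * v * vy) * dt
        x += vx * dt
        y += vy * dt
        if y <= 0.0 and vy < 0.0:
            y = 0.0
            vy = -RESTITUTION * vy  # elastic bounce with some loss
            hits.append(x)
    return hits


# The two "dials": how hard the ball is hit and the launch angle.
print(fly(speed=20.0, angle_deg=10.0))
```

Turning this into a trustworthy model would require measuring the two constants rather than guessing them.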
Had it been other than a public amusement I would have done a lot more. I could have hung a tennis ball on a string in front of a variable strength fan and noted carefully the angle at which it hung for different wind velocities, thus getting at the drag, and included those for variously worn tennis balls. I could have dropped the balls and noted the rebound for different heights to test the linearity of the elastic constants. If it had been an important problem I could have filmed some games and tested whether I could reproduce the shots which had no spin on them. I did not do any of these things! It was not worth the cost. Hence it was my sloppiest simulation.
The major part of the story, however, is what happened! As the groups came by they were told what was going on by some assistants, and shown the display of the game as it developed on the plotting board outputs. Then we let them play the game against the machine, and I had programmed the simulation so the machine could lose. Watching the entire process from the background, human and machine, I noticed, after a while, not one adult ever got the idea of what was going on enough to play successfully, and almost every child did! Think that over! It speaks volumes about the elasticity of young minds and the rigidity of older minds! It is currently believed most old people cannot run VCRs but children can!
Remember this fact, older minds have more trouble adjusting to new ideas than do younger minds, since you will be showing new ideas, and even making formal presentations, to older people throughout much of your career. That your children could understand what you are showing is of little relevance to whether or not the audience to whom you are running the exhibition can. It was a terrible lesson I had to learn, and I have tried not to make that mistake again. Old people are not very quick to grasp new ideas; it is not that they are dumb, stupid, or anything else like that, it is simply that older minds are usually slow to adjust to radically new ideas.
The Limits of Economic Simulation
- Economic simulations lack the foundational reliability of hard sciences because economics lacks universal, non-tautological laws.
- Simpson's Paradox demonstrates how aggregating data can create the illusion of trends or biases that do not exist within individual subsets.
- A Berkeley graduate school study illustrates this paradox, where apparent gender discrimination vanished when data was analyzed by department.
- Successful simulations in fields like aerospace rely on knowing exactly when complex systems can be simplified into point masses without losing accuracy.
- The author argues that many ecological and social simulations are used as propaganda because they lack mathematically expressed rules for interactions.
- Reliable simulation requires both a deep understanding of background theory and access to real-world data for verification.
You are used to the idea combining data can obscure things, but that it can also create effects is less well known.
I have emphasized the necessity of having the underlying laws of whatever field you are simulating well under control. But there are no such laws of economics! The only law of economics that I believe in is Hamming's law, "You cannot consume what is not produced". There is not another single, reliable law in all of economics I know of which is not either a tautology in mathematics, or else it is sometimes false. Hence when you do simulations in economics you have not the reliability you have in the hard sciences.
Let me inject another story. Some years ago the following happened at U.C. Berkeley. About equal numbers of males and females applied to graduate school, but many more men were accepted than women. There was no reason to assume the men were better prepared on the average than were the women. Hence there was obvious discrimination in terms of the ideal model of fairness. The President of the University demanded to know which departments were guilty. A close examination showed no department was guilty! How could that be? Easy! Various departments have varying numbers of openings for the entering graduate school, and various ratios of men to women applying for them. Those with both many openings and many men applying are the hard sciences, including mathematics, and those with the low ratios of acceptance and many women applying, are the soft ones like literature, history, drama, social sciences, etc. Thus the discrimination, if you can say it occurred, arose because the men, at a younger age, were made to take mathematics which is the preparation for the hard sciences, and the women could or could not take mathematics as they chose. Those who avoided mathematics, physics, chemistry, engineering, and such, were simply not eligible to apply where the openings were readily available, but had to apply where there was a high probability of rejection. People have trouble adapting to such situations these days!
Here you see a not widely recognized phenomenon, but one which has been extensively examined in many of its appearances by statisticians; the combining of data can create effects not there in detail. You are used to the idea combining data can obscure things, but that it can also create effects is less well known. You need to be careful in your future that this does not happen to you: that you are accused, from amalgamated data, of what you are not guilty of. Simpson's paradox is a famous example where both subsamples can favor A over B and C, but the combined data favors B over A.
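A small worked example shows how amalgamation alone can manufacture the effect. The figures below are invented for illustration and are not the Berkeley data; they are merely chosen so that equal numbers of men and women apply and every department accepts both groups at the same rate.

```python
# (applied, accepted) for each department and group; all figures are invented.
APPLICATIONS = {
    "hard sciences": {"men": (800, 400), "women": (200, 100)},
    "humanities": {"men": (200, 20), "women": (800, 80)},
}


def department_rate(department, group):
    """Acceptance rate of one group within one department."""
    applied, accepted = APPLICATIONS[department][group]
    return accepted / applied


def overall_rate(group):
    """Acceptance rate of one group after pooling all departments."""
    applied = sum(dept[group][0] for dept in APPLICATIONS.values())
    accepted = sum(dept[group][1] for dept in APPLICATIONS.values())
    return accepted / applied


for department in APPLICATIONS:
    print(department,
          "men", department_rate(department, "men"),
          "women", department_rate(department, "women"))
print("overall men", overall_rate("men"), "women", overall_rate("women"))
```

Within each department the two rates are identical, yet the pooled figures show men accepted at more than twice the rate of women, purely because the two groups applied in different proportions to departments with different acceptance rates.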
Now you may say in the space flight simulations we combined data and at times made the whole vehicle into a point. Yes, we did, but we knew the laws of mechanics and knew when we could and could not do it. Thus, in midcourse corrections you get the vehicle pointed in exactly the right direction and then fire the retro or other rockets to get the corrections, and during such times you do not allow the people to move around in the vehicle as that can produce rotations and hence spoil the careful directing of the rockets. We thought we knew enough of the background theory, and we had had years of experience in the matter, so the combining of all the details into one point mass still gave reliable simulation results.
In many proposed areas of simulation there are neither such known experiences nor theory. Thus when I was occasionally asked to do some ecological simulation I quietly asked for the mathematically expressed rules for every possible interaction, for example given the amount of rain what growth of the trees would occur, what exactly were the constants, and also where I could get some real live data to compare some test runs. They soon got the idea and went elsewhere to get someone more willing to run very questionable simulations which would give the results they wanted and could use for their propaganda.
The Limits of Human Simulation
- Human knowledge of a simulation's predictions often leads to behavioral changes that invalidate the simulation's results.
- The stock market is inherently resistant to public strategies because widespread adoption would immediately ruin the strategy's effectiveness.
- Systemic corruption and insider trading further complicate financial modeling by creating an uneven playing field that resists automation.
- The 'method of scenarios' offers a viable alternative by projecting ranges of possible outcomes rather than making specific, rigid predictions.
- Reliability in simulation requires checking for missing vital effects, data stability, and internal conservation laws to ensure accuracy.
- Maintaining personal integrity is crucial to avoid becoming a tool for propaganda when conducting or presenting simulations.
In the stock market, if there were any widely known strategy for making lots of money, the very knowledge of it would ruin the strategy!
I suggest you keep your integrity and do not allow yourself to be used for other people's propaganda; you need to be wary when agreeing to do a simulation!
If these soft science situations are hard to simulate with much reliability, think of those in which humans by their knowledge of the simulation can alter their behavior and thus vitiate the simulation. In the insurance business the company is betting you will live a long time and you are betting you will die young. For an annuity the sides are reversed, in case you had not thought about that point. While, in principle, you can fool the insurance companies and commit suicide, it is not common, and the insurance companies are indeed careful about this point.
In the stock market, if there were any widely known strategy for making lots of money, the very knowledge of it would ruin the strategy! In this case people would alter their behavior to vitiate the predictions you made. Not that some legally permissible strategy could not exist (though I am pretty sure it would have to be a fairly nonlinear theory to do much good above the normal stock market rise) but it would have to be kept very private. The basic trouble is the stock market is crooked. The insiders have knowledge which according to the explicitly stated laws they may not act on, but they do so all the time! If you do not use inside information then you have little chance against those who do, and if you do act on inside information you are acting illegally! It is a bad business either way, and the insiders are resisting all attempts to automate the trading by machine which would eliminate some of the inside deals they now profit on. It is known they do but it is apparently not provable in court! Furthermore, false "inside information" is constantly circulated in the hopes the outsiders will think they are inside and act on it to the profit of the originators of the rumors.
Thus beware of any simulation of a situation which allows the human to use the output to alter their
behavior patterns for their own benefit, since they will do so whenever they can.
But all is not lost. We have devised the method of scenarios to cope with many difficult situations. In this method we do not attempt to predict what will actually happen, we merely give a number of possible projections. This is exactly what Spock did in his baby raising book. From the observations of many children in the past he assumed the future (early) behavior of children would not differ radically from these observations, and he predicted not what your specific child would do but only gave typical patterns with ranges of behavior, on such things as when babies begin to crawl, talk, say "no" to everything, etc. Spock predicted mainly the biological behavior and avoided as much as he could the cultural behavior of the child.
In some simulations the method of scenarios is the best we can do. Indeed, that is what I am doing in this set of chapters; the future I predict cannot be known in detail, but only in some kinds of scenarios of what is likely to happen, in my opinion. More on this topic in the next chapter.
I want to return to the problem of deciding how you can make realistic estimates of the reliability of your simulations, or those which are presented to you in the future. First, does the background field support the assumed laws to a high degree? How sure are you some small, but vital, effect is not missing? Is the input data reliable? Is the simulation stable or unstable? What cross checks against known past experience have you available for checking things? Can you produce any internal checks, such as a conservation of mass, or energy or angular momentum? Without redundancy, as you know from the talks on error correcting codes, there can be no check on the reliability.
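An internal conservation check of the kind just listed can be as simple as tracking a quantity the physics says is constant. The sketch below is only an illustration under stated assumptions: a frictionless unit oscillator (x'' = -x), two hypothetical integrators, and energy as the redundant quantity. The check needs no knowledge of the true trajectory to expose the poorer method.

```python
def energy(x, v):
    """Total energy of the unit oscillator x'' = -x (should stay constant)."""
    return 0.5 * (v * v + x * x)

def euler(x, v, h, steps):
    """Plain Euler steps: uses only the slope at the start of each interval."""
    for _ in range(steps):
        x, v = x + h * v, v - h * x
    return x, v

def leapfrog(x, v, h, steps):
    """Leapfrog (velocity Verlet) steps: half kick, drift, half kick."""
    for _ in range(steps):
        v_half = v - 0.5 * h * x
        x = x + h * v_half
        v = v_half - 0.5 * h * x
    return x, v

def relative_energy_drift(method, h=0.01, steps=10_000):
    """Relative change in energy after the run; the internal check."""
    x0, v0 = 1.0, 0.0
    x, v = method(x0, v0, h, steps)
    e0 = energy(x0, v0)
    return abs(energy(x, v) - e0) / e0

print("euler   ", relative_energy_drift(euler))
print("leapfrog", relative_energy_drift(leapfrog))
```

For this oscillator each Euler step multiplies the energy by exactly (1 + h^2), so the drift compounds to well over 100 percent across the run, while the leapfrog energy stays within a small bounded band. A passing check does not prove the simulation right, but a failing one proves something is wrong.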
I have not so far mentioned what at first will appear to be a trivial point; do the marks on the paper which
describe the problem get into the machine accurately?
Reliability and Digital Simulation
- Hamming describes automating the generation of differential equations to prevent human programming errors in complex chemical simulations.
- The goal was to keep chemists focused on chemistry rather than the mechanics of computing while maintaining their responsibility for the results.
- Reliability in simulation is paramount and cannot be assumed simply because a machine produces professional-looking output.
- The transition from analog to digital computing is discussed through the lens of the Nyquist sampling theorem.
- Practical digital solutions require seven to ten samples per highest frequency to account for one-sided sampling and to minimize aliasing.
- Designers must provide accurate estimates of signal frequency content to ensure the validity of the digital representation.
It is not something you can take for granted just because a big machine gives out nicely printed sheets, or displays nice, colorful pictures.
Programming errors are known to be all too
common.
Let me tell another story which illustrates this point: there are things one can do about this problem. One time the chemistry department was considering a contract to examine, for the Federal Government, the chemistry of the upper atmosphere immediately after an atomic bomb explosion. I was asked only to supply advice and guidance. Upon looking into the problem I found there would be in each case which was to be computed somewhere around 100 ordinary differential equations to be solved, depending on the particular chemical reactions they expected.
I did not think they could get the various sets of these equations into the machine correctly every time, so I said we would first write a program which would go from the punched cards, one card describing each particular reaction with all its relevant constants of interactions, to the equations themselves, thus insuring all the terms were there, with no errors such as the coefficients not being the same for the same reaction as it appears in different equations, etc. By hindsight it is an obvious thing to do; at the time it was a surprise to them, but it paid off in effort on their part. They had only to select those cards from the file they wanted to include in the particular simulation they were going to run, and the machine automated all the rest, including the spacing of the steps in the integration. My main idea, besides the ease and accuracy, was to keep their minds focused on what they were best able to do, chemistry, and not have them fussing with the machine with which they were not experts. They were, moreover, in charge of the actual computing. I made it easy to do the book-keeping and the mechanics of the computer, but I refused to relieve them of the thinking part.
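The idea of going from one card per reaction to the full set of equations can be sketched in a few lines. Everything specific here is an assumption made for illustration: the `Reaction` record, the species names A, B, C, the rate constants, and the use of simple mass-action kinetics; the original program's card format and rate laws are not described in the text.

```python
from collections import namedtuple

# One "card" per reaction: reactants, products, and a rate constant.
Reaction = namedtuple("Reaction", "reactants products rate")

def build_rhs(reactions, species):
    """Build f(concentrations) -> derivatives from the selected cards."""
    index = {name: i for i, name in enumerate(species)}

    def rhs(conc):
        dcdt = [0.0] * len(species)
        for r in reactions:
            # Assumed mass-action rate: constant times reactant concentrations.
            flux = r.rate
            for name in r.reactants:
                flux *= conc[index[name]]
            # The same flux is used in every equation the reaction touches,
            # so its coefficient cannot differ from one equation to another.
            for name in r.reactants:
                dcdt[index[name]] -= flux
            for name in r.products:
                dcdt[index[name]] += flux
        return dcdt

    return rhs

species = ["A", "B", "C"]
cards = [
    Reaction(("A", "B"), ("C",), 2.0),   # A + B -> C
    Reaction(("C",), ("A", "B"), 0.5),   # C -> A + B
]
rhs = build_rhs(cards, species)
print(rhs([1.0, 3.0, 4.0]))  # [-4.0, -4.0, 4.0]
```

Selecting a different subset of cards changes the equations without anyone retyping a coefficient, which is the point of the story: the bookkeeping is automated and the choice of reactions stays with the chemists.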
In summary, the reliability of a simulation, of which you will see many in your career since it is becoming increasingly common, is of vital importance. It is not something you can take for granted just because a big machine gives out nicely printed sheets, or displays nice, colorful pictures. You are responsible for your decisions, and cannot blame them on those who do the simulations, much as you wish you could. Reliability is a central question with no easy answers.
Let us return to the relationship of analog to digital computers. The point sometimes arises in these days of neural nets. The argument is made the analog machines can compute things which the digital version cannot. We need to look at this point more closely; it is really the same as was made years ago when the analog computers were being displaced by digital computers. In these chapters we now have the relevant knowledge to approach the topic carefully.
The basic fact is the Nyquist sampling theorem says it takes two samples per cycle of the highest frequency present in the signal (for the equally spaced points on the entire real line) to reproduce (within roundoff) the original signal. In practice most signals have a fairly sharp cutoff in the frequency band; with no cutoff there would be infinite energy in the signal!
In practice we use only a comparatively few samples in the digital solution and hence something like twice the number Nyquist requires is needed. Furthermore, usually we have samples on only one side and this produces another factor of two. Hence, something from seven to ten samples for the highest frequency are needed. And there is still a little aliasing of the higher frequencies into the band which is being treated (but this is seldom where the information in the signal lies). This can be checked both theoretically and experimentally.
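The aliasing mentioned above is easy to check experimentally. In this minimal sketch the 9 Hz and 1 Hz tones and the 10 Hz sampling rate are illustrative values, not taken from the text, and `alias_frequency` is a hypothetical helper name. A 9 Hz cosine sampled at 10 Hz gets barely more than one sample per cycle, and its samples are indistinguishable from those of a 1 Hz cosine.

```python
import math

def sample(freq_hz, rate_hz, n):
    """n equally spaced samples of cos(2*pi*f*t) taken at rate_hz."""
    return [math.cos(2 * math.pi * freq_hz * k / rate_hz) for k in range(n)]

def alias_frequency(freq_hz, rate_hz):
    """Frequency in [0, rate/2] whose samples match those of freq_hz."""
    f = freq_hz % rate_hz
    return min(f, rate_hz - f)

rate = 10.0
high = sample(9.0, rate, 20)   # far too few samples per cycle
low = sample(1.0, rate, 20)    # ten samples per cycle
max_gap = max(abs(a - b) for a, b in zip(high, low))

print(alias_frequency(9.0, rate))  # 1.0
print(max_gap)                     # roundoff-sized: the two sample sets agree
```

Once the samples are taken, nothing in the numbers can tell the two tones apart, which is why the estimate of the highest frequency present has to come from outside the computation.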
Sometimes the mathematician can accurately estimate the frequency content of the signal (possibly from the answer being computed), but usually you have to go to the designers and get their best estimates. A competent designer should be able to deliver such estimates, and if they cannot then you need to do a lot of
Analog Computing and Error Dynamics
- Digital machines offer superior precision and deep computation, whereas analog machines are limited by component accuracy to roughly one part in 10,000.
- Analog computers remain valuable for their real-time responsiveness and ability to integrate physical components without needing full mathematical descriptions.
- The 'Garbage In, Garbage Out' (GIGO) principle is challenged, as accurate inputs do not always guarantee accurate outputs in complex simulations.
- Direction fields in differential equations illustrate how small initial errors can either diverge into large discrepancies or converge into negligible ones.
- The accuracy of a solution is not absolute but depends on the specific trajectory and state of the function being computed.
Analog machines are generally ignored these days, so I feel I need to remind you they have a place in the arsenal of tools in the kit of the scientist and engineer.
exploring of the solutions to estimate this critical number, the sampling rate of the digital solution. The step by step solution of a problem is actually sampling the function, and you can use adaptive methods of step by step solution if you wish. You have much theory and some practice on your side.
For accuracy the digital machine can carry many digits, while analog machines are rarely better than one part in 10,000 per component, if that much. Thus analog machines cannot give very accurate answers, nor carry out "deep computations". But often the situation you are simulating has uncertainties of a similar size, and with care you can handle the accuracy problem.
With the passage of time we have developed wider band width analog computers, but we have used this to speed up the computations rather than use the implied band width of the circuits for accuracy. In any case, the fundamental accuracy of the analog parts limits what you can do with an analog machine. The old mechanical computers, like the RDA #2, took about half an hour per solution; the electrical computers derived from the gun directors, which still had some mechanical parts, took minutes; later all electronic ones took seconds, and now some of them can flash the solution on the screen as fast as you can supply input.
In spite of their relatively low accuracy analog computers are still valuable at times, especially when you can incorporate a part of the proposed device into the circuits so you do not have to find the proper mathematical description of it. Some of the faster analog computers can react to the change of a parameter, either in the initial conditions or in the equations themselves, and you can see on the screen the effect immediately. Thus you can get a "feel" for the problem easier than for the digital machines which generally take more time per solution and must have a full mathematical description. Analog machines are generally ignored these days, so I feel I need to remind you they have a place in the arsenal of tools in the kit of the scientist and engineer.
20
Simulation-III
I will continue the general trend of the last chapter, but center on the old expression "garbage in, garbage out", often abbreviated GIGO. The idea is if you put ill-determined numbers and equations (garbage) in then you can only get ill-determined results (garbage) out. By implication the converse is tacitly assumed: if what goes in is accurate then what comes out must be accurate. I shall show both of these assumptions can be false.
Because many simulations still involve differential equations we begin by considering the simplest first order differential equations of the form

    y' = f(x, y)

You recall a direction field is simply drawing at each point in the x-y plane a line element with the slope given by the differential equation, Figure 20.I. For example, the differential equation

    y' = x^2 + y^2

has the indicated direction field, Figure 20.II. On each of the concentric circles

    x^2 + y^2 = k

the slope is always the same, the slope depending on the value of k. These are called isoclines.
Figure 20.I
Looking at the following picture, Figure 20.III, the direction field of another differential equation, on the left you see a diverging direction field, and this means small changes in the initial starting values, or small errors in the computing, will soon produce large differences in the values in the middle of the trajectory. But on the right hand side the direction field is converging, meaning large differences in the middle will lead to small differences on the right end. In this single example you see both small errors can become large ones, and large ones can become small ones, and furthermore, small errors can become large and then again become small. Hence the accuracy of the solution depends on where you are talking about it, not any absolute
accuracy over all. The function behind all this is
whose differential equation is, upon differentiating,
Numerical Solutions and Error Accumulation
- Visualizing n-dimensional differential equations as expanding or contracting tubes can be misleading due to high-dimensional paradoxes.
- Numerical integration begins by moving along a local tangent line, which inherently introduces a small error by using 'the slope that was' rather than the interval's average.
- Predictor-corrector methods mitigate this by estimating a future slope and averaging it with the current one to refine the step forward.
- Step size is dynamically adjusted based on the discrepancy between predicted and corrected values to maintain local accuracy.
- Total accumulated error is distinct from local step error and is fundamentally driven by the convergence or divergence of the underlying direction field.
- Advanced numerical methods utilize higher-degree polynomials and treat the corrector as a recursive digital filter processing derivative data.
The four circle figure in two dimensions, leading to the n-dimensional paradox by ten dimensions, shows how tricky such imagining may become.
Probably in your mind, you have drawn a "tube" about the "true, exact solution" of the equation, and seen the tube expands first and then contracts. This is fine in two dimensions, but when I have a system of n such differential equations, 28 in the Navy intercept problem mentioned earlier, then these tubes about the true solutions are not exactly what you might think they were. The four circle figure in two dimensions, leading to the n-dimensional paradox by ten dimensions, Chapter 9, shows how tricky such imagining may become. This is simply another way of looking at what I said in an earlier chapter about stable and unstable problems; but this time I am being more specific to the extent I am using differential equations to illustrate matters.
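The expanding and then contracting tube can be seen numerically by following two neighboring solutions of one equation. The equation used here, y' = -2xy with solutions C*exp(-x^2), is an illustrative choice made for this sketch and is not claimed to be the one behind the figure; its direction field diverges for x < 0 and converges for x > 0 (for positive y). The starting values and the fourth-order Runge-Kutta stepper are likewise assumptions.

```python
def f(x, y):
    # Illustrative equation (an assumption): y' = -2xy.
    return -2.0 * x * y

def rk4_path(x, y, x_end, n):
    """Classical fourth-order Runge-Kutta; returns the list of (x, y) points."""
    h = (x_end - x) / n
    path = [(x, y)]
    for _ in range(n):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
        path.append((x, y))
    return path

a = rk4_path(-2.0, 0.0183, 2.0, 400)
b = rk4_path(-2.0, 0.0193, 2.0, 400)   # starting value differs by 0.001
gap = [abs(ya - yb) for (_, ya), (_, yb) in zip(a, b)]

# Gap at the start (x = -2), in the middle (x = 0), and at the end (x = 2).
print(gap[0], gap[200], gap[400])
```

The initial difference of 0.001 grows by a factor of roughly fifty by the middle of the interval and then shrinks back to about its starting size, so the same run is "inaccurate" in the middle and "accurate" at the end.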
Figure 20.II
Figure 20.III
How do we numerically solve a differential equation? Starting with only one first order ordinary differential equation of first degree, we imagine the direction field. Our problem is from the initial value, which we are given, we want to get to the next nearby point. If we take the local slope from the differential equation and move a small step forward along the tangent line then we will make only a small error, Figure 20.IV. Using that point we go to the next point, but as you see from the Figure we gradually depart from the true curve because we are always using "the slope that was", and not a typical slope in the interval. To avoid this we "predict" a value, use that value to evaluate the slope there (using the differential equation), and then use the average slope of both ends to estimate the average slope to use for the interval, Figure 20.V. Then using this average slope we move the step forward again, this time using a "corrector" formula. If the predicted and corrected values are "close" then we assume we are accurate enough, but if they are far apart then we must shorten the step size. If the difference is too small then we should increase the step size. Thus the traditional "predictor-corrector" methods have built into them an automatic mechanism for checking the step-by-step error; but this step-by-step error is, of course, not the whole accumulated error by any means! The accumulated error clearly depends on the convergence or divergence of the direction field.
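The straight-line predictor-corrector just described can be written out as a short sketch. The test equation y' = -y, the tolerance, the halving and doubling rule, and the factor of ten used to decide the difference is "too small" are all assumptions chosen for illustration; the text gives the idea, not these particular numbers.

```python
import math

def f(x, y):
    # Hypothetical test equation: y' = -y, exact solution exp(-x).
    return -y

def predictor_corrector(f, x, y, x_end, h=0.1, tol=1e-4):
    """Tangent-line predictor, average-slope corrector, adaptive step size."""
    steps = 0
    while x < x_end - 1e-12:
        h = min(h, x_end - x)                  # do not step past the end
        slope_here = f(x, y)
        predicted = y + h * slope_here         # move along "the slope that was"
        slope_there = f(x + h, predicted)      # slope at the predicted point
        corrected = y + h * 0.5 * (slope_here + slope_there)
        gap = abs(corrected - predicted)
        if gap > tol:
            h *= 0.5                           # far apart: shorten and retry
            continue
        x, y = x + h, corrected                # accept the corrected value
        steps += 1
        if gap < tol / 10:
            h *= 2.0                           # very close: lengthen the step
    return y, steps

y_end, steps = predictor_corrector(f, 0.0, 1.0, 2.0)
print(y_end, math.exp(-2.0), steps)
```

The difference between predicted and corrected values monitors only the error made in each step. How those local errors add up over the whole run still depends on whether the direction field spreads them apart or pulls them together.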
We used simple straight lines for both predicting and correcting. It is much more economical, and accurate, to use higher degree polynomials, and typically this means about fourth degree polynomials (Milne, Adams-Bashforth, Hamming, etc). Thus we must use several old values of the function and derivative to predict the next value, and then using this in the differential equation we get an estimated new slope, and with this slope plus using old values of the function and slope, we correct the value. A moment's thought and you see the corrector is just a recursive digital filter where the input data are the derivatives, and
Figure 20.IV
Figure 20.V
Numerical Analysis vs Filter Theory
- Differential equations are fundamentally recursive digital filters that compute numbers through difference equations.
- Numerical analysis traditionally uses polynomials to approximate trajectories, while filter theory uses frequencies as its basis.
- Polynomial methods can create small discontinuities in acceleration that disrupt the 'feel' of a simulation.
- Frequency-based approaches prioritize the sensory experience and response of a system over exact positional accuracy.
- The conflict between mathematical precision and engineering utility highlights a lack of knowledge regarding human sensory perception.
- Simulators must account for concealed factors like frequency response to ensure pilots are properly prepared for real-world physics.
In the frequency approach we will concentrate on getting the frequencies right and let the actual positions be what happen.
the output values are the positions. Stability and all we discussed there are relevant. As mentioned before, there is the extra feedback through the differential equation's predicted value which goes into the corrected slope. But both are simply solving a difference equation; recursive digital filters are simply this formula and nothing more. They are not just transfer functions as your course in digital filters might have made you think; plainly and simply, you are computing numbers coming from a difference equation. There is a difference however. In the filter you are strictly processing by a linear formula, but because in the differential equation there is the nonlinearity which arises from the evaluation of the derivative terms, it is not exactly the same as a digital filter.
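To make the filter view concrete, here is a minimal sketch using the simplest corrector, the trapezoid rule, written as the difference equation y[n] = y[n-1] + (h/2)(d[n] + d[n-1]). The input d is a stream of derivative samples and the output is the stream of positions. Feeding it samples of cos(x), the slope of sin(x), is an illustrative choice; the higher-order correctors named in the text have the same recursive form with more terms.

```python
import math

def trapezoid_filter(derivs, h, y0=0.0):
    """Recursive filter y[n] = y[n-1] + (h/2) * (d[n] + d[n-1])."""
    out = [y0]
    for n in range(1, len(derivs)):
        out.append(out[-1] + 0.5 * h * (derivs[n] + derivs[n - 1]))
    return out

h = 0.01
derivs = [math.cos(n * h) for n in range(315)]   # input: slopes of sin(x)
positions = trapezoid_filter(derivs, h)          # output: approximately sin(x)
worst = max(abs(p - math.sin(n * h)) for n, p in enumerate(positions))
print(worst)
```

Because the derivative samples are supplied from outside, this is a strictly linear filter; for a sinusoidal input of angular frequency w its gain is (h/2)cot(wh/2), compared with 1/w for exact integration. In solving a differential equation the derivative samples are themselves computed from the output, which is where the nonlinearity enters.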
If you have n differential equations then you are dealing with a vector with n components; you predict each component forward, evaluate each of the n derivatives, correct each predicted value, and finally take the step, or reject it if the error is too large in a sense you think fairly measures the local error. You tend to think about small errors as a "tube" surrounding the actual computed trajectory, but again you need to remember the four circle paradox: in a high dimension the "tubes" are not at all like you wish they were.
Now let me note a significant difference between the two approaches, numerical analysis and filter theory. The classical methods of numerical analysis, and still about the only ones you will find in the accepted texts, use polynomials to approximate functions, but the recursive filter uses frequencies as the basis for evaluating the formula! This is a different thing entirely!
To see this difference suppose we are to build a simulator for humans landing on Mars. The classical formulas will concentrate on the trajectory shape in terms of local polynomials, and the path will have small discontinuities in the acceleration as we move from interval to interval. In the frequency approach we will concentrate on getting the frequencies right and let the actual positions be what happen. Ideally the trajectories are the same; practically they can be quite different.
Which solution do you want? The more you think about it the more you realize the pilot in the trainer will want to get the "feel" of the landing vehicle, and this seems to mean the frequency response of the simulator should feel right to the pilot. If the position is a bit off, then the feedback control during landing can compensate for this, but if it feels wrong in the actual flight then the pilot is going to be bothered by the new experience which was not in the simulator. It has always seemed to me the simulator should prepare the pilots for the actual experience as best we can (we cannot fake out for long the lower gravity of Mars), so they will feel comfortable when the real event occurs, having experienced it many times in the trainer. Alas, we know far too little of what the pilot "feels" (senses). Does the pilot feel only the Fourier real frequencies, or maybe they also feel the decaying Laplace complex frequencies (or should we use wavelets?). Do different pilots feel the same kinds of things? We need to know more than we apparently now do about this important design criterion.
The above is the standard conflict between the mathematician's and engineer's approaches. Each has a different aim in solving the differential equations (and in many other problems), and hence they get different results out of their calculations. If you are involved in a simulation then you see there can be highly concealed matters which are important in practice, but which the mathematicians are unaware of and they will deny the effects matter. But looking at the two trajectories I have crudely drawn, Figure 20.VI, the top curve is accurate in position but the corners will give a very different "feel" than reality will, and the
Challenging the GIGO Myth
- The author argues that individuals with deep insight into a problem must immerse themselves in solution methods rather than relying on traditional approaches.
- During early Nike missile testing, unexplained mid-flight breakups threatened the project's timeline and design phase.
- Despite lacking accurate initial telemetry data, the author used simulation to identify a periodic energy transfer between pitch and yaw caused by missile rotation.
- The simulation succeeded because the guidance system's convergent direction field corrected for initial inaccuracies, proving that 'Garbage In' does not always result in 'Garbage Out'.
- This early success in accident simulation demonstrated that understanding the underlying system dynamics is more critical than having perfect starting data.
I had earlier realized the nature of the field trials being simulated was such that small deviations from the proposed trajectory would be corrected automatically by the guidance system!
second curve will be more wrong in position but more right in "feel". Again, you see why I believe the person with the insight into the problem must get deep inside the solution methods and not accept traditional methods of solution.
I now turn to another story about the early days of Nike guided missile testing. At this point they were field testing at White Sands what was called "the telephone pole tests". They were simply firings where the missile was to follow a preassigned trajectory, and at the last moment explode so the whole would not come down outside the range and do great damage, rather the parts would more gently fall to the ground in the range and supposedly do less harm. The object of the tests was to get realistic measurements of drag, lift, and other properties as functions of altitude and velocity, for purposes of settling the details of the design as well as for improving the design.
I found my friend back at the Labs wandering around the halls looking quite unhappy. Why? Because the first two of some six test shots had broken up in mid-flight and no one knew why. The delay meant the data to be gathered to enable us to go to the next stage of design was not available and hence the whole project was in serious trouble. I observed to him if he would give me the differential equations describing the flight I would put a girl on the job of hand calculating the solution (big computers were not readily available in the late 1940s). In about a week they delivered seven first order equations, and the girl was ready to start. But what are the starting conditions just before the trouble arose? (I did not in those days have the computing capacity to do the whole trajectory rapidly.) They did not know! The telemetered data was not clear just before the failure. I was not surprised, and it did not bother me much. So we used the guessed altitude, slope, velocity, angle of attack, etc., one for each of the seven variables of the trajectory; one condition for each equation. Thus I had garbage in. But I had earlier realized the nature of the field trials being simulated was such that small deviations from the proposed trajectory would be corrected automatically by the guidance system! I was dealing with a strongly convergent direction field.
We found both pitch and yaw were stable but as each one settled down it threw more energy into the other; thus there was not only the traditional stability oscillations in pitch and yaw, but due to the rotation of the missile about its long axis there was a periodic transfer of increasing energy between them. Once the computer curves for even a short length of the trajectory were shown everyone realized immediately they had forgotten the cross connection stability, and they knew how to correct it. Now we had the solution they could then also read the hashed up telemetered data from the trials and check the period of the transfer of energy was just about correct, meaning they had supplied the correct differential equations to be computed. I had little to do except to keep the girl on the desk calculator honest and on the job. My real contribution was: (1) the realization we could simulate what had happened, which is now routine in all accidents but was novel then, and (2) the recognition there was a convergent direction field so the initial conditions need not be known accurately.
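Why guessed starting conditions can be harmless under guidance is easy to see in a toy model. The sketch below is not the seven-equation missile model, which the text does not give; it is a hypothetical one-dimensional vehicle whose feedback pulls it toward a planned path y = t, with gain and damping values invented for illustration.

```python
def simulate(y0, v0, t_end=10.0, h=0.001, gain=4.0, damping=4.0):
    """Toy guided vehicle: feedback steers y toward the planned path y = t."""
    y, v = y0, v0
    for i in range(round(t_end / h)):
        t = i * h
        # Guidance: accelerate against the position and velocity errors.
        accel = -gain * (y - t) - damping * (v - 1.0)
        y, v = y + h * v, v + h * accel
    return y

# Three badly different guesses at the starting position and velocity.
guesses = [(-5.0, 0.0), (0.0, 1.0), (5.0, 3.0)]
finals = [simulate(y0, v0) for y0, v0 in guesses]
spread = max(finals) - min(finals)
print(finals, spread)
```

The starting positions differ by ten units, yet all three runs end essentially on the planned path, because the feedback makes the direction field strongly convergent. Remove the feedback terms and the initial guesses would matter for the whole run.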
My reason for telling you the story is to show you GIGO need not be right. Another example comes from my earliest Los Alamos experience on bomb simulation. I gradually came to realize behind the computation was fairly inaccurate data for computing the equation of state, which relates pressure to density (and temperature which I will ignore for the moment). Data from high pressure labs, from estimates from earthquakes, from estimates from the density of the cores of stars, and finally from the asymptotic theory of
Feedback and Error Compensation
- The author describes using French curves to manually interpolate data points for nuclear simulations at Los Alamos.
- Despite using 'garbage' data with low initial accuracy, the final predictions for the 'gadget' were remarkably precise.
- The accuracy resulted from local errors being averaged out over the history of the simulation's physical shells.
- Feedback loops in both physics and engineering allow for accurate systems to be built from inaccurate components.
- Good design principles should aim to protect systems from the need for high-accuracy components through feedback.
- The author emphasizes that these intuitive design principles are not yet formally taught in standard curricula.
Hence garbage in, but accurate results out never-the-less!
infinite pressures were plotted as a set of points on a very large piece of graph paper, Figure 20.VII. Then large French curves were used to draw a curve connecting the thinly scattered points. We then read this curve to 3 decimal places, meaning we guessed at a 5 or a 0 in the fourth place. We used those numbers to subtabulate a five digit table, and at places in the table to six digit numbers, which were then the official data for the actual computations we ran. I was at that time, as I earlier said, sort of a janitor of computing, and my job was to keep things going to free the physicists to do their job.
Figure 20.VI
Figure 20.VII
At the end of the war I stayed on at Los Alamos an extra six months, and one of the reasons was I wanted
to know how it was such inaccurate data could have led to such accurate predictions for the final design.
With, at last, time to think for long periods, I found the answer. In the middle of the computations we were
using effectively second differences; the first differences gave the forces on each shell on one side, and the
differences from the adjacent shells on the two sides gave the resultant force moving the shell. We had to
take thin shells, hence we were differencing numbers which were very close to each other and hence the
need for many digits in the numbers. But further examination showed as the "gadget" goes off, any one
shell went up the curve and possibly at least partly down again, so any local error in the equation of state
was approximately averaged out over its history. What was important to get from the equation of state was
the curvature, and as already noted even it had only to be on the average correct. Hence garbage in, but
accurate results out never-the-less!
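The need for many digits when differencing nearly equal numbers can be seen in a small experiment. The sketch below is only an illustration under assumptions that are not in the text: the exponential function stands in for the equation of state curve, the spacing 0.01 stands in for a thin shell, and all names are hypothetical. It compares second differences taken from a table rounded to 3 and to 6 decimal places with the unrounded ones.

```python
import math

def second_differences(values):
    # Central second difference: v[i-1] - 2*v[i] + v[i+1]
    return [values[i - 1] - 2.0 * values[i] + values[i + 1]
            for i in range(1, len(values) - 1)]

H = 0.01                                   # assumed "thin shell" spacing
XS = [1.0 + i * H for i in range(11)]
EXACT = [math.exp(x) for x in XS]          # stand-in smooth curve (assumption)
TRUE_D2 = second_differences(EXACT)        # roughly H*H*exp(x), about 3e-4

def worst_error(digits):
    # Largest error in the second differences when the table keeps `digits` decimals
    table = [round(v, digits) for v in EXACT]
    return max(abs(a - b)
               for a, b in zip(second_differences(table), TRUE_D2))

for digits in (3, 6):
    print(digits, "decimals: worst second-difference error", worst_error(digits))
```

Each tabulated value can be off by half a unit in its last place, so a second difference can be off by up to four times that: at most 2e-3 with three decimals, which is larger than the quantity being computed, and at most 2e-6 with six.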
These examples show what was loosely stated before; if there is feedback in the problem for the numbers
used, then they need not necessarily be accurately known. Just as in H.S.Black's great insight of how to
build feedback amplifiers, Figure 20.VIII, so long as the gain is very high only the one resistor in the
feedback loop need be accurately chosen, all the other parts could be of low accuracy. From the
Figure 20.VIII you have the equation
We see almost all the uncertainty is in the one resistor of size 1/10, and the gain of the amplifier, (-10^9),
need not be accurate. Thus the feedback of H.S.Black allows us to accurately build things out of mostly
inaccurate parts.
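The insensitivity to the amplifier gain can be checked numerically. The sketch below assumes the usual textbook feedback model y = A(x + beta*y), where A is the amplifier gain and beta the fraction of the output fed back; the symbols, the model, and the function name are assumptions made for illustration and are not taken from the figure. With A = -10^9 and beta = 1/10 it compares a factor of two error in the gain with a 1% error in the feedback fraction.

```python
def closed_loop_gain(amplifier_gain, feedback_fraction):
    # Assumed model: y = A * (x + beta * y)  =>  y / x = A / (1 - A * beta)
    return amplifier_gain / (1.0 - amplifier_gain * feedback_fraction)

A = -1.0e9      # amplifier gain, taken as very large and negative
BETA = 0.1      # feedback fraction set by the one accurate resistor

nominal = closed_loop_gain(A, BETA)                  # very close to -10
gain_halved = closed_loop_gain(A / 2.0, BETA)        # amplifier off by a factor of two
resistor_off = closed_loop_gain(A, BETA * 1.01)      # feedback off by one percent

def relative_change(value):
    return abs(value - nominal) / abs(nominal)

print("nominal closed-loop gain:", nominal)
print("change from halving the amplifier gain:", relative_change(gain_halved))
print("change from a 1% feedback error:", relative_change(resistor_off))
```

Under this model, halving the amplifier gain moves the overall gain by about one part in 10^8, while a 1% error in the feedback fraction moves it by about 1%.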
You see now why I cannot give you a nice, neat formula for all situations; it must depend on how the
particular quantities go through the whole of the computation; the whole computation must be understood
as a whole. Do the inaccurate numbers go through a feedback situation where their errors will be
compensated for, or are they vitally out in the open with no feedback protection? I say "vitally"
because it is vital to the computation, if they are not in some feedback position, to get them accurate.
Now this fact, once understood, impacts design! Good design protects you from the need for too many
highly accurate components in the system. But such design principles are still, to this date, ill-understood
and need to be researched extensively. Not that good designers do not understand this intuitively, merely it
is not easily incorporated into the design methods you were taught in school. Good minds are still needed in
spite of all the computing tools we have developed. But the best mind will be the one who gets the principle
into the design methods taught so it will be automatically available for lesser minds!
I now look at another example, and the principle which enabled me to get a solution to an important
problem. I was given the differential equation
Walking the Stability Crest
- The author encountered a transistor design problem involving a differential equation that was inherently unstable in both directions.
- Standard numerical integration techniques failed because any small error caused the solution to diverge rapidly toward infinity.
- Despite initial skepticism about the model's validity, the author felt a professional obligation to solve the problem to maintain his reputation.
- The solution involved using the instability itself as a guide, manually correcting the slope whenever the curve began to deviate.
- This 'piece by piece' approach allowed the author to navigate the 'crest of a sand dune' and produce a viable solution for the space charge problem.
- The author reflects on how professional pride and persistence are essential when facing problems that appear insoluble or poorly posed.
In the past I had used the obvious trick when facing a divergent direction field of simply integrating in the opposite direction and you get an accurate solution. But in the above problem you are, as it were, walking the crest of a sand dune, and once both feet are on one side of the crest you are bound to slip down.
You see immediately the condition at infinity is really the right hand side of the differential equation
equated to 0, Figure 20.IX.
But consider the stability. If the y at any fairly far out point x gets a bit too large, then the sinh y is much
too large, the second derivative is then very positive, and the curve shoots off to plus infinity. Similarly, if
the y is too small the curve shoots off to minus infinity. And it does not matter which way you go, left to
right, or right to left. In the past I had used the obvious trick when facing a divergent direction field of
simply integrating in the opposite direction and you get an accurate solution. But in the above problem you
are, as it were, walking the crest of a sand dune, and once both feet are on one side of the crest you are bound
to slip down.
Figure 20.VIII
You can probably believe while I could find a decent power series expansion, and an even better
non-power series approximate expansion around the origin, still I would be in trouble as I got fairly well along
the solution curve, especially for large k. All the analysis I, or my friends, could produce was inadequate. So
I went to the proposers and first objected to the condition at infinity, but it turned out the distance was being
measured in molecular layers, and (in those days) any realistic transistor would have effectively an infinite
number of layers. I objected then to the equation itself; how could it represent reality? They won again, so I
had to retreat to my office and think.
It was an important problem in the design and understanding of the transistors then being developed. I
had always claimed if the problem was important and properly posed then I could get some kind of a
solution. Therefore, I must find the solution; I had no escape if I were to hold on to my pride.
It took some days of mulling it over before I realized the very instability was the clue to the method to
use. I would track a piece of the solution, using the differential analyzer I had at the time, and if the solution
shot up then I was a bit too high in my guess at the corresponding slope, and if it shot down I was a bit too
low. Thus piece by piece I walked the crest of the dune, and each time the solution slipped on one side or
the other I knew what to do to get back on the track. Yes, having some pride in your ability to deliver what
is needed is a great help in getting important results under difficult conditions. It would have been so easy to
dismiss the problem as insoluble, wrongly posed, or any other excuse you wanted to tell yourself, but I still
believe important problems properly posed can be used to extract some useful knowledge which is needed.
A number of space charge problems I have computed showed the same difficult instability in either
direction.
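The piece by piece procedure can be imitated on a digital machine. The sketch below is an illustration, not a reconstruction of the differential analyzer work: it assumes an equation of the form y'' = sinh(y) - k*x with y(0) = 0, a fixed-step Runge-Kutta integrator, and hypothetical names and parameter values throughout. On each piece it bisects on the starting slope, lowering the guess when the trial curve shoots up and raising it when the curve shoots down, then keeps only the early part of the trial curve and restarts from there.

```python
import math

def rhs(x, y, k):
    # Assumed equation (for illustration only): y'' = sinh(y) - k*x
    return math.sinh(y) - k * x

def crest(x, k):
    # Curve on which the right-hand side is zero: sinh(y) = k*x
    return math.asinh(k * x)

def track(x0, y0, slope, k, n_steps, h, band):
    """Integrate one trial curve with classical RK4.

    Returns (verdict, points): verdict is +1 if the curve shot up out of the
    band around the crest, -1 if it shot down, 0 if it stayed inside.
    """
    x, y, v = x0, y0, slope
    points = [(x, y)]
    for i in range(n_steps):
        k1y, k1v = v, rhs(x, y, k)
        k2y, k2v = v + 0.5 * h * k1v, rhs(x + 0.5 * h, y + 0.5 * h * k1y, k)
        k3y, k3v = v + 0.5 * h * k2v, rhs(x + 0.5 * h, y + 0.5 * h * k2y, k)
        k4y, k4v = v + h * k3v, rhs(x + h, y + h * k3y, k)
        y += h * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        x = x0 + (i + 1) * h
        points.append((x, y))
        deviation = y - crest(x, k)
        if deviation > band:
            return 1, points
        if deviation < -band:
            return -1, points
    return 0, points

def walk_crest(k=1.0, n_pieces=10, keep_steps=200, look_steps=600,
               h=0.01, band=1.0):
    """Follow the unstable solution piece by piece, starting from y(0) = 0."""
    x0, y0 = 0.0, 0.0
    path = [(x0, y0)]
    for _ in range(n_pieces):
        low, high = -10.0, 10.0      # slope guesses that shoot down / up
        points = []
        for _ in range(100):         # bisection on the starting slope
            slope = 0.5 * (low + high)
            verdict, points = track(x0, y0, slope, k, look_steps, h, band)
            if verdict > 0:
                high = slope         # guess was a bit too high
            elif verdict < 0:
                low = slope          # guess was a bit too low
            else:
                break                # stayed near the crest for the whole look-ahead
        path.extend(points[1:keep_steps + 1])   # trust only the early part
        x0, y0 = path[-1]
    return path

path = walk_crest()
print("reached x =", path[-1][0], "with y =", path[-1][1])
```

The look-ahead is deliberately longer than the part that is kept: a trial curve that is about to slip off at the far end of the look-ahead is still very close to the crest over the first third, because the same instability that makes errors grow going forward makes them shrink going backward.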
I need to introduce for the next story the idea of a Rorschach test which was popular in my youth. A blob
of ink is put on a piece of paper, it is squeezed on a fold, and when it is opened you have a symmetric blot with
essentially a random shape. A sequence of these blots is shown to the subject and they are asked to report on
what they see. Their answers were used to analyse the "personality" of the person. Obviously what a person
reports is a figment of their imagination since the blot is essentially a random shape. It is like watching the
clouds in the sky and discussing what shapes they resemble; it is your imagination and not reality
you are discussing, and as such it is, to some extent, revealing things about yourself and not about the
clouds. I believe the ink blot method is no longer in use.
Figure 20.IX
The Trap of Randomness
- A psychological experiment at Bell Labs revealed that scientists consistently invent complex theories to explain purely random events.
- Traditional education focuses on replacing old theories with new ones, but fails to teach the validity of 'nothing' or randomness as an explanation.
- Statisticians use confidence limits to distinguish signal from noise, yet even these methods are subject to Type 1 and Type 2 errors.
- Many modern simulations function like Rorschach tests, where researchers adjust assumptions until the results match their preconceived expectations.
- The necessity of double-blind experiments in medicine highlights how both subjects and observers can unconsciously bias data to see patterns that do not exist.
You are lovingly taught how one theory was displaced by another, but you are seldom taught to replace a nice theory with nothing but randomness!
Now to the next story. A psychologist friend at Bell Telephone Laboratories once built a machine with
about 12 switches and a red and a green light. You set the switches, pushed a button, and either you got a
red or a green light. After the first person tried it twenty times they wrote a theory of how to make the green
light come on. The theory was given to the next victim and they had their twenty tries and wrote their
theory, and so on endlessly. The stated purpose of the test was to study how theories evolved.
But my friend, being the kind of person he was, had connected the lights to a random source! One day he
observed to me that no person in all the tests (and they were all high class Bell Telephone Laboratories
scientists) ever said there was no message. I promptly observed to him not one of them was either a
statistician or an information theorist, the two classes of people who are intimately familiar with
randomness. A check revealed I was right!
This is a sad commentary on your education. You are lovingly taught how one theory was displaced by
another, but you are seldom taught to replace a nice theory with nothing but randomness! And this is what
was needed; the ability to say the theory you just read is no good and there was no definite pattern in the
data, only randomness.
I must dwell on this point. Statisticians regularly ask themselves, "Is what I am seeing really there, or is it
merely random noise?" They have tests to try to answer these questions. Their answer is not a yes or no, but
only with some confidence a "yes" or "no". A 90% confidence limit means typically in ten tries you will
make the wrong decision about once, if all the other hypotheses are correct! Either you will choose when
there is nothing there (Type 1 error) or you will reject when there is something there (Type 2 error). Much
more data is needed to get to the 95% confidence limit, and these days data can often be very expensive to
gather. Getting more data is also time consuming so the decision is further delayed, a favorite trick of people
in charge who do not want to bear the responsibility of their position: "Get more data", they say.
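The "about once in ten tries" can be watched happening on pure noise. The sketch below is a minimal illustration with assumptions of its own: a fair coin tossed 100 times stands in for an experiment in which nothing is there, the test is the ordinary two-sided normal approximation at the 10% level, and the function name, seed, and sample sizes are hypothetical. It counts how often such a test nevertheless announces that something is there.

```python
import math
import random

def false_alarm_rate(n_experiments=10000, n_tosses=100, seed=1):
    """Fraction of pure-noise experiments that the test calls significant."""
    rng = random.Random(seed)
    # Two-sided test at the 10% level (normal approximation): announce
    # "something is there" if the head count is more than 1.645 standard
    # deviations away from n_tosses / 2.
    sigma = math.sqrt(n_tosses) / 2.0
    threshold = 1.645 * sigma
    alarms = 0
    for _ in range(n_experiments):
        heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
        if abs(heads - n_tosses / 2.0) > threshold:
            alarms += 1
    return alarms / n_experiments

rate = false_alarm_rate()
print("fraction of random experiments declared significant:", rate)
```

The fraction comes out at roughly one in ten, even though every one of the experiments was nothing but randomness.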
Now I suggest to you quite seriously, many simulations are nothing more than Rorschach tests. I quote a
distinguished practitioner of management decision theory, Jay Forrester, "From the behavior of the system,
doubts will arise that will call for a review of the original assumptions. From the process of working back
and forth between assumptions about the parts and the observed behavior of the whole, we improve our
understanding of the structure and dynamics of the system. This book is the result of several cycles of
re-examination and revision by the author".
How is the outsider to distinguish this from a Rorschach test? Did he merely find what he wanted to find,
or did he get at "reality"? Regrettably, many, many simulations have a large element of this adjusting things
to get what they want to get. It is so easy a path to follow. It is for this reason traditional Science has a large
number of safeguards, which these days are often simply ignored.
Do you think you can do things safely, that you know better? Consider the famous double blind
experiments which are usual in medical practice. The doctors first found if the patients thought they were
getting the new treatment then they responded with better health, and those who thought they were part of
the control group felt they were not getting it and did not improve. The doctors then randomized the
treatment and gave some patients a placebo so the patient could not respond and fool the doctors this way.
But to their horror, the doctors also found that they themselves, knowing who got the treatment and who did
not, found improvement where they expected to and not where they did not. As a last resort, the doctors have
widely accepted the double blind experiment: until all the data are in neither the patients nor the doctors
The Dangers of Simulation
- Human self-delusion makes it nearly impossible for researchers to remain objective without strict, blind protocols.
- Simulation is the only viable tool for answering 'What if...?' questions in a highly technical future.
- The scale of a computer or the quality of its output does not guarantee the validity of a simulation's results.
- Decision-makers must take personal responsibility rather than relying on mediocre committee compromises.
- Mastering simulation concepts is essential for leaders to effectively question results and avoid being misled.
Simulation is essential to answer the 'What if...?', but it is full of dangers, and is not to be trusted just because a large machine and much time has been used to get the nicely printed pages, or colorful pictures on the oscilloscope.
know who gets the treatment and who does not. Then the statistician opens the sealed envelope and the
analysis is carried out. The doctors wanting to be honest found they could not be! Are you so much better in
doing a simulation you can be trusted not to find what you want to find? Self-delusion is a very common
trait of humans.
I started Chapter 19 with the problem of why anyone should believe in a simulation which has been done.
You now see the problem more clearly. It is not easy to answer unless you have taken a lot more
precautions than are usually done. Remember also you are probably going to be on the receiving end of
many simulations to decide many questions which will arise in your highly technical future; there is no
other way than simulations to answer the question "What if...?" In Chapter 18 I observed decisions must be
made and not postponed forever if the organization is not to flounder and drift endlessly, and I am
supposing you are going to be among those who must make the choices. Simulation is essential to answer
the "What if...?", but it is full of dangers, and is not to be trusted just because a large machine and much
time has been used to get the nicely printed pages, or colorful pictures on the oscilloscope. If you are the
one to make the final decision then in a real sense you are responsible. Committee decisions, which tend to
diffuse responsibility, are seldom the best in practice; most of the time they represent a compromise which
has none of the virtues of any path and they tend to end in mediocrity. Experience has taught me generally a
decisive boss is better than a waffling one; you know where you stand and can get on with the work which
needs to be done!
The "What if...?" will arise often in your futures, hence the need for you to master the concepts and
possibilities of simulations, and be ready to question the results and to dig into the details when necessary.
21
Fiber Optics
The Rise of Fiber Optics
- The author reflects on fiber optics as a case study in how to evaluate and engage with emerging technologies of high potential.
- Bandwidth is identified as the primary driver for the telephone company, with optical frequencies offering vastly superior transmission rates over electrical ones.
- Economic factors such as the scarcity of copper versus the abundance of sand (silica) made glass fibers a sustainable long-term alternative.
- Practical urban infrastructure constraints, like overcrowded wire ducts in Manhattan, necessitated the development of smaller-diameter transmission media.
- The author identified early technical hurdles, such as the difficulty of splicing hair-thin glass fibers without losing signal quality.
- Smaller fiber diameters were found to be essential for flexibility, allowing the glass to bend without leaking light.
During the early part of the talk the speaker remarked, "God loved sand, He made so much of it".
One of the reasons for taking up the topic of fiber optics is its significant history occurred within my
scientific lifetime, and I can therefore give you a report of how the topic looked to me at the time it was
occurring. Thus it provides an illustration of the style I adopted when facing a newly developing field of
great potential importance. The field of fiber optics is also, of course, important in its own right. Finally, it
is a topic you will have to deal with as it further evolves during your lifetime.
When I first heard of a seminar on the topic of fiber optics at Bell Telephone Laboratories I considered
whether I should attend or not; after all one must try to do one's own work and not spend all one's time in
lectures. First, I reflected optical frequencies were very much higher than the electrical ones in use at the time,
and hence the fiber optics would have much greater bandwidth, and bandwidth is the effective rate (bits
per second) of transmission, and is the name of the game for the telephone company, my employers at the
time. Second, I recalled Alexander Graham Bell had once sent a telephone conversation over a light beam,
but then he was a bit of a gadgeteer all his life. So it could be done, and had been done long ago. Third, I
also knew about the internal reflections as you go from a higher index medium to a lower index medium;
you see it in still water when viewed from below where there are angles which totally reflect the light back
down into the water, Figure 21.I. Hence I understood, in a fair way, what an optical fiber would be; they
were a novel idea then. I certainly had enough experience in college labs with drawing glass into fibers to
understand how easy it would be due to the effects of surface tension to make round fibers of a fairly
uniform diameter, and to some extent the corresponding role of surface tension for liquid glass. Hence I
took the time to go and learn about this promising new development.
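The angles which totally reflect the light follow from Snell's law: going from a higher index n1 to a lower index n2, everything striking the surface at more than arcsin(n2/n1) from the normal is reflected back. The sketch below is a small illustration; the index values (1.33 for water, 1.5 for glass, 1.0 for air) are typical textbook figures assumed here, and the function name is hypothetical.

```python
import math

def critical_angle_deg(n_inside, n_outside):
    """Angle from the normal beyond which light is totally reflected back."""
    if n_outside >= n_inside:
        raise ValueError("total internal reflection needs n_inside > n_outside")
    return math.degrees(math.asin(n_outside / n_inside))

water_to_air = critical_angle_deg(1.33, 1.00)   # still water seen from below
glass_to_air = critical_angle_deg(1.50, 1.00)   # a bare glass fiber in air

print("water to air: total reflection beyond", round(water_to_air, 1), "degrees")
print("glass to air: total reflection beyond", round(glass_to_air, 1), "degrees")
```

With these assumed indices the critical angle is a little under 49 degrees for water and a little under 42 degrees for glass, so light travelling nearly along a glass fiber is well inside the totally reflecting range.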
During the early part of the talk the speaker remarked, "God loved sand, He made so much of it". I heard,
inside myself, we were already having to exploit lower grade copper mines, and could only expect to have
an increasing cost for good copper as the years went by, but the material for glass is widely available and is
not likely to ever be in short supply.
Either at the lecture, or soon afterwards I heard the observation, "The telephone wire ducts in Manhattan
(NYC) are running out of space and if the city continues to grow, as it has of late, then we will have to lay a
lot more ducts and this means digging up streets and sidewalks, but if we use glass fibers with their smaller
diameters then we can pull out the copper wires and put the glass fibers in their place". This told me for that
reason alone the Labs would have to do everything they could to develop glass fibers rapidly, that it was
going to be an ongoing source of computation problems, and hence I had better keep myself abreast of
developments.
Long before this, once I had decided to stay at the Labs and realized my poverty in the knowledge of
practical electronics, I bought a couple of Heathkits and assembled them just for the experience, though the
resulting objects were also useful. I knew, therefore, the amount of soldering of wires that went on, and
immediately identified a difficult point to watch for: how did they propose to splice these fine, hair sized,
glass fibers and still have good transmission? You could not simply fuse them together and expect to get
decent transmission.
Why such small diameters as they were proposing? It is obvious once you look at a picture of how a glass
fiber works, Figure 21.II. The thinner the diameter, the more the fiber can bend without letting the light get
out. That is one good reason for the smaller and smaller proposed diameters, and it is not the cost of the
material nor the extra weight of larger diameter fibers. Also, for many forms of transmission, a smaller
The Evolution of Fiber Optics
- Fiber optic cables offer superior signal clarity and inherent security because they are difficult to tap without detection.
- The technology is uniquely resistant to electromagnetic disturbances, including lightning and atmospheric atomic explosions, ensuring military support.
- Engineers solved light leakage by developing a graded index of refraction, similar to focusing techniques used in cyclotrons.
- The author advocated for single-mode signaling based on the same logic that favored binary systems in early computing.
- A critical challenge was ensuring that fiber splicing could be performed by field technicians under adverse conditions rather than just by lab experts.
- Advancements in glass purity reached a point where, if ocean water were as clear, one could see to the bottom of the Pacific.
They said if the ocean waters were as clear as were some of the glasses then you could see to the bottom of the Pacific Ocean!
diameter fiber will clearly have less distortion in the signal when going a given distance.
There was another major dividend I soon realized. The fibers are so efficient, meaning they lose so few
photons, "tapping" a line will be a difficult feat. Not that it is impossible, only it will be difficult. About the
same time I came to realize (due to some computations I was doing with a group in chemistry) that fiber
optics were resistant to electromagnetic disturbances, especially atomic bomb explosions in the upper
atmosphere or on a battle field, or even lightning strikes. Yes, fibers were bound to get large amounts of
support for further research from the Military, as well as from the Labs directly.
A trouble which soon arose, and I had anticipated it, was the outer sheathing put on the fine hair-sized
fibers might alter the local index of refraction ratios and let some of the light escape. Of course putting a
mirrored surface on the fiber would solve it. They soon had the idea of putting a lower index glass sleeve
around the higher index core, at human sizes where it is easily done, and then drawing out the resulting
shape into the very thin fibers they needed.
Much later I heard of not one layer, but a smoothly graded change in index of refraction, and recognized
this was the same thing as the strong focusing which had been developed some years before for cyclotrons.
The grading could be done either by chemical or radiation treatments. Rather than have sharp reflections,
you can use the gradual bending of the rays back to the center as they get away from the middle of the fiber,
Figure 21.III.
Figure 21.I
Figure 21.II
I did not try to follow all the arguments for the multi-mode vs. the single-mode methods of signaling,
and while I did a number of simulations via computers for the two sides of the debate, I sort of backed the
single mode on the same grounds that we had backed the binary against any higher base number systems in
computers. It is a technical detail anyway, including the details of detectors and emitters, and not a
fundamental feature of the optical signaling.
Along the way I was constantly watching to see how they were going to splice the fibers. With the
passage of time there were a number of quite clever ways proposed and tested, and the very number of
alternates made me decide probably that feature which first attracted my attention would be handled fairly
easily; at least the problem would not prove to be fatal in the field where it has to be done by technicians
and not in the labs where things can be done by experts under controlled conditions. I well knew the
difference by watching various projects (mostly in other companies) come to grief on the miserable fact
what can be done reliably in the lab by experts is not always the same as what can be done in the field by
technicians who are in a hurry and are often operating under adverse conditions, to say the least.
As I recall they first field tested fiber optics by connecting a pair of central offices in Atlanta, Georgia. It
was a success (the trial required some years to complete). Furthermore, outsiders from the glass business
began to make glasses which were remarkably clear at the frequencies we wanted to use, meaning the
frequencies at which we had reliable lasers. They said if the ocean waters were as clear as were some of the
glasses then you could see to the bottom of the Pacific Ocean!
I soon noticed in the fiber cables we were: (1) detecting the optical signals, (2) converting to electronic form,
The Quest for Optical Amplification
- The author critiques the inefficiency of existing systems that convert signals between optical and electronic forms.
- Bell Labs and other research institutions prioritized the development of pure optical amplification to solve this design flaw.
- Several competing technologies emerged as potential candidates for standardizing optical field equipment.
- Solitons are highlighted as a superior transmission method because they maintain their shape during the amplification process.
- The stability of solitons prevents signal degradation as data travels through the fiber optic network.
One of the virtues of solitons is they can be amplified without changing their shape (which does not degrade as it goes
(3) amplifying it, and (4) converting back to optical form. It is hard to imagine a worse system design. So it
was immediately evident to me the Labs, and many others, would have to work intensively on optical
amplification. Watching things from afar, it soon became evident there were several candidates for optical
amplifiers, and therefore probably one or more would materialize as standard field equipment. One of the
virtues of solitons is they can be amplified without changing their shape (which does not degrade as it goes
Figure 21.III
Fiber Optics and Satellite Competition
- The author describes a strategic approach to emerging technologies: monitoring developments without necessarily becoming an expert in every field.
- Geopolitical tensions arise from equatorial countries claiming airspace rights over the limited orbital slots available for stationary satellites.
- Fiber optics offer superior bandwidth and privacy compared to satellites, which are limited by atmospheric interference and orbital spacing.
- The industry is transitioning from classical pulse signaling to soliton signaling, which may revolutionize signal analysis methods.
- Practical advantages of fiber optics, such as significant weight reduction, are transforming military and commercial aviation infrastructure.
The use of satellites means broadcasting the signal; cables give a degree of privacy and the ability to make the user pay rather than get a free ride.
along the fiber) while pulses are regenerated (which effectively reshapes them and appears to be slightly more
complex an operation than simple amplification).
All the practical parts seemed to be coming together remarkably well, and as you know we now use
fiber optics widely. I have told you as best I can, how I approached a new technology, what I looked for,
what I watched for, what I ignored, what I kept abreast of, and what I pondered. I had no desire to become
an expert in the field; I had my hands full with computers and their rapid development, both hardware and
software, as well as the expanding range of applications. Every new field which arises in your future will
present you with similar questions, and you will effectively answer by your later actions.
The present applications of fiber optics are very wide spread. I had long realized as time went on the
satellite business was in for trouble. Stationary satellites for communication must be parked above the
equator; there is no other place for them. A number of the countries along the equator have, from the
earliest days, claimed we were invading their airspace and should be paying for the use of it. So far they
have not been able to enforce their claims, as the advanced countries have simply continued to use the space
without paying for it. I leave to you the justice of the situation: (1) the blatant ignoring of their claims, (2)
whether or not they have a legitimate point, and (3) if because they are unable to use it now everyone else must
wait until they can, if ever! It is not a trivial question of international relations, and there is some merit on
all sides.
The satellites are now parked at about every 4° or so, and while we could park them closer, say 2°, we
will have to use much more accurate (larger diameter?) dishes on earth to beam signals up to them without
one signal slopping into the adjacent satellites. To a fair extent we can widen the bandwidth of the signaling
and thus for a time extend the amount of traffic they can carry, but there are limits due to the atmosphere the
signals must traverse. On the other hand, fiber optics can be laid down on earth with any density you wish;
cables of fibers can be easily made and the total possible bandwidth boggles the mind. The use of satellites
means broadcasting the signal; cables give a degree of privacy and the ability to make the user pay rather
than get a free ride. Both satellites and cables have their advantages and disadvantages. At present satellites
are frequently being used for what are essentially private communications and not broadcast situations.
Time will probably readjust the matter so each is used in its best way.
Where are we now? We have already seen transoceanic cables with fibers instead of coaxial wave guides
at a great deal less cost and a great deal more bandwidth. We are at the moment (1993) haggling over
whether to use the most recently developed soliton signaling system or the classical pulse system of
communicating across the Pacific ocean to Japan. It is, I think, a matter of engineering development; in the
long run I believe solitons will be the dominant method, and not pulses. I advise you to watch to see if there
is a significant change in the technology; certainly if the transmission of information via solitons wins
out over the current pulse signaling method then this should produce basically new methods of signal
analysis in the future, and you had best keep abreast of it if it happens, or else you, like so many other
people, will be left behind.
I read that in the Navy, as well as in the obvious Air Force and commercial aviation applications, the
decreased weight means great savings which can be used for other things. On a tour of the carrier Enterprise
some 14 years ago, being even then well aware of the trend to optical fibers, I looked especially at the duct
The Future of Fiber Optics
- Fiber optics are poised to replace traditional wiring for information handling in both military ships and civilian skyscrapers.
- The development of armored and lightweight fibers enables new applications, such as missiles that maintain two-way communication during flight.
- Optical technology could eventually move onto the chips themselves, potentially using light beams to replace physical wiring and avoid interference.
- A significant drop in switching costs through optical crossbars would necessitate a complete redesign of computer architecture beyond the von Neumann model.
- The author emphasizes that anticipating technological shifts is essential for researchers to lead rather than passively follow.
- Active anticipation prepares the mind to absorb and utilize new developments more effectively than a reactive approach.
Light beams can pass through one another without interference (provided the intensity is not too high) which is more than you can do with wires.
wiring and decided fibers will replace all those wires in so far as they are information handling wires. For
the distribution of power it is another matter entirely. But then, will centralized power distribution remain
the main method, or will, due to battle conditions, a decentralized power system aboard a ship become the
preferred method? It would better blend in with the obviously redundant fiber optic systems which will
undoubtedly be installed as a matter of safety practice. And battle ships are not very different from World
Trade type skyscraper office buildings!
We now have fiber optic cables sufficiently armored that trucks can run over them safely, and fibers so
light that missiles are fired with an unreeling fiber attached throughout the flight; this means two-way
communication, both to direct the missile to the target and to get back what the missile can see as it flies.
Being in computers, I naturally asked myself how this could and would impact the design of computers.
You probably know we now (1993) often interconnect the larger units of a computer with fiber optics. It seems only a matter of time before major parts of internal wiring will go optical. Cannot one make, in time,
"mother boards" by which the integrated circuit chips are interconnected, using fiber optics? It does not
seem to be unreasonable in this day of the material sciences. How soon will fiber optic techniques get down to the chips? After all, the bandwidth of optics means, inferentially, higher pulse rates! Can we not in time
make optical chips, and have a general light source falling on a photocell on the chip (like some hand held calculators) to power the chip and avoid all the wiring of power distribution to the chips? Figure 21.IV.
Can we replace chip wiring with light beams? Light beams can pass through one another without
interference (provided the intensity is not too high) which is more than you can do with wires, Figure 21.V.
This brings up switching. Can crossbar switches be made to be optical and not electronic? Would not the
Bell Telephone Laboratories and others have to work on it intensively? If they succeed then will it not be
true that switching, which has traditionally been one of the most expensive parts of a computer, will become perhaps one of the cheapest? At first memory was the expensive part of computers, but with magnetic
cores, and now with electronic storage at fantastically cheap prices, the design and use of computers has
significantly changed. If a major drop in switching costs came about, how would you design a computer?
Would the von Neumann basic design survive at all? What would be the appropriate computer designs with this new cost structure? You can try, as I indicated above, to keep reasonably abreast by actively
anticipating the way things and ideas might go, and then seeing what actually happens. Your anticipation means you are far, far better prepared to absorb the new things when they arise than if you sit passively by
and merely follow progress. "Luck favors the prepared mind."
That is the reason for this talk: to show you how someone tried to anticipate and be prepared for rapid
changes in technologies which would impact their research and work. You cannot lead everywhere in this highly technological society, but you need not be left behind by every new development, as many people are in practice.
I have said again and again in this book, my duty as a professor is to increase the probability you will be a
significant contributor to our society, and I can think of no better way than establishing in you the habit of anticipating things and leading rather than passively following. It seems to me I must, to accomplish my
Figure 21.IV
Figure 21.V
The Future of Fiber Optics
- The author emphasizes the importance of moving from a passive role to an active, anticipating role within an institution.
- Fiber optic 'drop lines' are predicted to replace traditional wiring, providing a single conduit for TV, radio, phone, and personalized news.
- Digital filters at the consumer end will allow users to select specific information channels from a unified data stream.
- Technological feasibility and economic efficiency do not guarantee adoption due to legal, social, and political restraints.
- A successful 'seer' or specialist must account for regulatory and societal resistance to avoid making false predictions.
Just because it can be done economically does not mean it should be done.
duty to you and to the institution, move as many of you as I can from a passive to a more active, anticipating
role.
In today's chapter you see I claim to have made no significant contribution, but at least I was prepared to
help others who were more deeply involved by supplying the right kinds of computing rather than slightly misconceived computations which are so often done. I believe I often supplied that kind of service at Bell Telephone Laboratories during the 30 years I spent there before my retirement. In the fiber optics area I have told you some of the details of what I did and how I did them.
Let me now turn to predictions of the immediate future. It is fairly clear in time "drop lines" from the
street to the house (they may actually be buried but are probably still called "drop lines"!) will be fiber
optics. Once a fiber optic wire is installed then potentially you have available almost all the information you
could possibly want, including TV and radio, and possibly newspaper articles selected according to your
interest profile (you pay the printing bill which occurs in your own house). There would be no need for separate information channels most of the time. At your end of the fiber there are one or more digital filters.
Which channel you want, the phone, radio or TV, can be selected by you much as you do now, and the channel is determined by the numbers put into the digital filter; thus the same filter can be multipurpose if
you wish. You will need one filter for each channel you wish to use at the same time (though it is possible a
single time sharing filter would be available) and each filter would be of the same standard design. Alternately, the filters may come with the particular equipment you buy.
But will this happen? It is necessary to examine political, economic, and social conditions before saying
what is technologically possible will in fact happen. Is it likely the government will want to have so much information distribution in the hands of a single company? Would the present cable companies be willing to share with the telephone company and possibly lose some profit thereby, and certainly come under more government regulation? Indeed, do we as a society want it to happen?
One of the recurring themes in this book is frequently what is technologically feasible, and is even
economically better, is restrained by legal, social, and economic conditions. Just because it can be done
economically does not mean it should be done. If you do not get a firm grasp on these aspects then as a
practicing seer of what is going to happen in your area of specialization you will make a lot of false predictions you will have to explain as best you can when they turn out to be wrong.
22
Computer Aided Instruction (CAI)
The Illusion of Easy Learning
- The history of education is filled with failed promises of 'royal roads' to knowledge, ranging from sleep-learning to speed-reading.
- The author argues that intellectual achievement, like running a four-minute mile, requires inherent hard labor and cannot be bypassed by gimmicks.
- The absence of exceptional individuals produced by these alternative methods suggests that most 'shortcuts' to intelligence are ineffective.
- The Hawthorne effect explains why new educational tools often show initial success: students and teachers perform better simply because they feel cared for.
- While modern computers offer new possibilities, one must remain skeptical of claims that technology will finally eliminate the struggle of learning.
- Wishful thinking often leads people to believe in new methods because they desperately want an easier path to success.
There is a story from ancient Greek times of a Mathematician telling a ruler there were royal roads for him to walk on, and royal messengers to carry his mail, but there was no royal road to geometry.
Because computers were early installed in many Universities it was natural the question of Computer Aided
Instruction (CAI) would arise and be explored in some depth. Before we get to the modern claims it is wise
to get some perspective on the matter.
There is a story from ancient Greek times of a Mathematician telling a ruler there were royal roads for
him to walk on, and royal messengers to carry his mail, but there was no royal road to geometry. Similarly,
you will recognize money and coaching will do only a little for you if you want to run a four minute mile. There is no easy way for you to do it. The four minute mile is much the same for everyone.
There is a long history of people wanting an easy path to learning. Aldous Huxley, in his book Brave New
World, discusses the idea of learning while sleeping via a microphone under your pillow telling you things
while you sleep, and he exposes the severe limitations of it. During my years at the Bell Telephone Laboratories the Dianetic movement arose and promised it could "clear" your brain of all its errors and then
you would be able to reason perfectly. There are still Dianetic Institutes, but the consensus is against them,
particularly as the people produced by them seem not to have dominated any sector of human activities, let
alone all sectors. Another organization promises to reveal the secrets of the ancients (who were, somehow,
so much smarter than we are now). We have endless ads for speed reading, speed learning, etc., all of which
promise, in one way or another, to greatly improve your mind without the hard labor most of us have to put
in if we want to succeed. The test of all the previous proposals is not one of them has, as yet, produced a
significant number of exceptional people (that we know of at present). As Fermi said about the Extra Terrestrial Intelligence and UFO people, "Where are they, and why have we not met them?"
Hence all of past history with its many, many claims of easy learning speaks eloquently against the
current rash of promises, but it cannot, of course, prove some new gimmick will not succeed. You need to
take a large grain of salt with every such proposal, but there could be new things the past did not know, and new tools like the cheap computers now available which were not available then, which could make the difference. Regularly I read or hear I am supposed to believe the new gimmick, typically these days the
computer, will make a significant difference in spite of all past promises which have apparently failed miserably. Beware of the power of wishful thinking on your part: you would like it to be true so you assume it is true!
There is another important factor, known as the Hawthorne effect, it is necessary to explain. At the
Hawthorne plant of Western Electric, long, long ago, some psychologists were trying to improve productivity by various changes in the environment. They painted the walls an attractive color, and productivity rose. They made the lighting softer and productivity rose. Each change caused productivity to rise. One of the men got a bit suspicious and sneaked a change back to the original state and productivity rose! Why? It appears when you show you care then the person on the other end responds more favorably
than if you appear not to care. The workers all thought the changes were being made for their benefit and
they responded accordingly.
In the field of education, if you tell the students you are using a new method of teaching then they
respond by better performance, and so, incidentally, does the professor. A new method may, or may not, be
better, indeed it may be worse, but the Hawthorne effect, which is not small in the educational area, is likely
to indicate here is a new, important, improved teaching method. It hardly matters what the new method is,
its trial will produce improvements if the students perceive it as being done for their benefit. Thus the
The Hawthorne Effect Challenge
- The Hawthorne Effect significantly compromises the validity of most educational experimentation.
- Experimental results are often skewed because subjects change their behavior simply because they are being observed.
- The author draws a direct parallel between educational research and the necessity of double-blind studies in medicine.
- Without controlling for the psychological impact of being studied, experimental outcomes may be misleading.
- The text emphasizes that this phenomenon is a pervasive issue across various fields of human research.
Hawthorne effect vitiates most educational experimentation.
Hawthorne effect vitiates most educational experimentation. You will recall my earlier discussion,
Chapter 20, of the necessity of "double blind" experiments in medicine; it is the same in all situations
The Hawthorne Effect in Education
- Experimental results involving humans are often compromised by the Hawthorne effect, where subjects perform better simply because they feel they are receiving special attention.
- To ensure validity, experiments must be double-blind, keeping both the participants and the evaluators ignorant of who received the specific treatment.
- The author suggests that the ideal teaching method involves constant experimental change to maintain the belief in improvement among both professors and students.
- Automated systems like the Stanford 'grader program' often fail not due to lack of utility, but because of minor technical shifts and a lack of institutional persistence.
- Large-scale educational technology projects like PLATO often fail to provide rigorous evidence of improvement that accounts for the Hawthorne effect or long-term compounding benefits.
The Hawthorne effect strongly suggests the proper teaching method will always be in a state of experimental change, and it hardly matters just what is done, all that matters is both the professor and the students believe in the change.
where the respondee senses special treatment and special care are being given. Those who later measure the
effects must also be kept in ignorance of who did or did not get the special treatment! It is a fact of life in all
such experimentation, but it is usually ignored. Hence you should never believe the results of carelessly done experiments when they involve humans. The prestige of the experimenter, the elaborateness of the
equipment, the cleverness of the data reduction, and especially your desire to believe, should not be allowed
to sway you. Again, this does not mean there is nothing there, only you need to be very, very careful before acting on such experiments.
The Hawthorne effect strongly suggests the proper teaching method will always be in a state of
experimental change, and it hardly matters just what is done, all that matters is both the professor and the
students believe in the change.
Let me turn to some of the past history of the use of computers to greatly assist in learning. I recall in
1960 while I was at Stanford on a sabbatical, there was a "grader program". Any problem the professor wanted
to assign to the class in a programming course required the professor to give a correct running program to solve it, the names of the input variables, the ranges in which the input numbers could occur, and also a limit for the roundoff of the output numbers to be acceptable. When the students felt their program was ready for submission, they called the grader, gave their identification, and the machine generated some
random admissible input, ran both their and the professor's program, and compared the results. Each output number was marked "Right" or "Wrong". Such a grader can easily incorporate the time of compiling and the time of running, which are mere numbers, and still be required to make no judgment on style.
The method is flexible, easily adapted to changes in the course and in the specific exercises assigned from
year to year. The program keeps a record in a private data base of the professor, and on demand from him gives the raw facts, leaving any evaluation to the professor. Of course class averages, variances, distribution
of grades, etc., can all be supplied to the professor from his data base, if wanted.
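The grading loop just described can be sketched in a few lines. This is only an illustration under stated assumptions, not the Stanford program: the names `grade`, `professor_solution`, and `student_solution` are hypothetical, each "program" is modeled as a function returning a list of numbers, and the input ranges and roundoff limit are made-up values.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple

# A "program" is modeled here as a function from numeric inputs to a list of outputs.
Program = Callable[..., Sequence[float]]


def grade(student: Program,
          reference: Program,
          input_ranges: Sequence[Tuple[float, float]],
          tolerance: float,
          trials: int = 5,
          seed: Optional[int] = None) -> List[List[bool]]:
    """Run both programs on random admissible inputs and mark each output right or wrong."""
    rng = random.Random(seed)
    report = []
    for _ in range(trials):
        # Draw one admissible value for each input variable.
        inputs = [rng.uniform(lo, hi) for lo, hi in input_ranges]
        expected = list(reference(*inputs))
        actual = list(student(*inputs))
        if len(actual) != len(expected):
            # Wrong number of outputs: every expected output counts as wrong.
            report.append([False] * len(expected))
            continue
        # An output is "right" when it agrees within the roundoff limit.
        report.append([abs(a - e) <= tolerance for a, e in zip(actual, expected)])
    return report


# Hypothetical exercise: return the sum and the product of two numbers.
def professor_solution(x, y):
    return [x + y, x * y]


def student_solution(x, y):
    return [x + y, x + y]  # bug: the second output should be the product


report = grade(student_solution, professor_solution,
               input_ranges=[(2.5, 10.0), (2.5, 10.0)],
               tolerance=1e-9, trials=3, seed=0)
for row in report:
    print(["Right" if ok else "Wrong" for ok in row])
```

The sketch returns only the raw right-or-wrong facts per output, leaving any evaluation to whoever reads the report, which is the division of labor the text describes.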
When I visited Stanford a couple of years later I asked about the grader program. I found it was not in
use. Why? Because, so they said, the first professor who had got it going left and a change had been made in the monitor system which would require a few changes in the program! Diligent watching and asking shows this is very typical on many campuses. The machine is programmed to greatly assist, apparently, the professor, but
the program is soon forgotten.
Let me turn to the project PLATO done by a friend of mine at the University of Illinois. I regularly met
him at various meetings, and once on a long airplane ride, and every time he told me how wonderful PLATO was. For example, once he said at the same time PLATO had a pupil from Scotland, one from
Canada, and one from Kentucky. I said I knew the telephone company could do that, and what he was saying was totally irrelevant to whether or not PLATO was doing a better job than humans did. He never, to
my knowledge, produced any serious evidence PLATO did improve teaching in a significant fashion above what you would expect from the Hawthorne effect.
One claim made was the student was advanced about 10% along the education path over those who did
not use the system. When I inquired as to whether this meant it was the same 10% shift all through the educational system, or whether he meant 10% on each course, compound interest as it were, he did not
know! What had he done about the Hawthorne effect? Nothing! So I do not know what was or was not accomplished after spending the millions and millions of dollars of Federal money.
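The two readings of the 10% claim differ enormously, which is why the question mattered. A minimal sketch of the arithmetic follows; the figure of 40 courses is purely an assumption for illustration, and the function names are hypothetical.

```python
def one_time_gain(rate: float) -> float:
    """Reading 1: a single 10% shift over the whole educational path."""
    return 1.0 + rate


def compounded_gain(rate: float, courses: int) -> float:
    """Reading 2: 10% on each course, each gain building on the last."""
    return (1.0 + rate) ** courses


COURSES = 40  # assumed number of courses, for illustration only

print("one-time shift :", one_time_gain(0.10))
print("compounded     :", round(compounded_gain(0.10, COURSES), 1))
```

Under the first reading the student ends 10% ahead; under the second, with the assumed 40 courses, the factor is roughly 45. A claim that cannot say which of these it means has not really been stated.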
Once when I was the chief editor of the ACM Publications a programmed book on computing was
submitted for publication.
Limits of Programmed Learning
- Programmed books and early computer-aided instruction (CAI) often fail because they prevent browsing and backtracking, making the learning experience rigid.
- Bright students frequently sabotage programmed texts by choosing wrong answers out of boredom to see how the system reacts.
- A lack of empirical evidence supports the claims that programmed texts are superior to traditional methods, relying instead on anecdotal opinions.
- Automated instruction excels at rote learning, such as arithmetic tables, where machines can tirelessly drill students on specific errors.
- Machines are highly effective for training conditioned responses, such as pilot training or fencing, where split-second reactions are more vital than deep reflection.
- The shift toward automation in organizations means future employment will value human judgment over the rote skills machines now handle better.
Another terrible fact is carefully watching the students to see what happens in practice has shown a good student often picks what they know is the wrong answer simply out of either boredom or amusement to see what the book will say.
A programmed book regularly asks questions of the reader, and then, depending
on the response, the reader is sent to one of several branch points (pages). In principle the errors are caught
and explained again, and correct answers send the reader on to new material. Sounds good! Each student goes at their own pace. But consider, there can be no backtracking to find something you read a few pages
ago and are now a bit fuzzy about where you came from or how you got here. There can be no organized
browsing through the text. It really is not a book, though from the outside it looks like one. Another terrible
fact is carefully watching the students to see what happens in practice has shown a good student often picks
what they know is the wrong answer simply out of either boredom or amusement to see what the book will say. Hence it does not always work out as it was thought it would; the better students do not necessarily progress significantly faster than the poorer ones!
I did not want to reject programmed books on my own opinion, so I went to the Bell Telephone
Laboratories' psychology department and found the local expert. Among other things he said was there was
to be a large conference on programmed books the following week, and why did not I go? So I did. On the opening day we sat next to each other. He nudged me and said, "Notice no one will ever produce any
concrete evidence, they will only make claims programmed texts are better". He was exactly right: no
speaker had anything to offer in the form of hard, experimental evidence, only their opinions. I rejected the book, and on hindsight I think I did the right thing. We now have computer discs which claim to do the same thing, but I have little reason to suspect the disc format makes a significant difference, though they
could backtrack through the path you used to get there.
I have just given some of the negative side of CAI. Now to the positive side. I have little doubt in
teaching dull arithmetic, say the addition and multiplication tables, a machine can do a better job than a teacher, once you incorporate the simplest program to note the errors and generate more examples covering
that point, such as multiplying by 7, until the point is mastered. For such rote learning I doubt any of you
would differ from my opinion. Unfortunately, in the future we can expect corporations and other large organizations will have removed much of the need for just such rote learning (computers can often do it better and cheaper) and employment will usually require judgment on your part.
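"The simplest program to note the errors and generate more examples covering that point" might look like the following sketch. Everything here is an assumption made for illustration: the name `drill`, the tables 2 through 9, and the rule that a table counts as mastered after three consecutive correct answers.

```python
import random
from typing import Callable, Dict, Optional, Tuple


def drill(answer: Callable[[int, int], int],
          tables: range = range(2, 10),
          mastery: int = 3,
          max_questions: int = 500,
          seed: Optional[int] = None) -> Tuple[int, Dict[int, int]]:
    """Ask multiplication questions until every table has been answered
    correctly `mastery` times in a row. Returns (questions asked, errors per table)."""
    rng = random.Random(seed)
    streak = {t: 0 for t in tables}   # consecutive correct answers per table
    errors = {t: 0 for t in tables}   # noted errors per table
    asked = 0
    while asked < max_questions:
        # Only tables not yet mastered generate further examples.
        pending = [t for t in tables if streak[t] < mastery]
        if not pending:
            break
        table = rng.choice(pending)
        other = rng.randint(2, 9)
        asked += 1
        if answer(table, other) == table * other:
            streak[table] += 1
        else:
            # Note the error and start that table over, so it gets more examples.
            streak[table] = 0
            errors[table] += 1
    return asked, errors


def shaky_on_sevens() -> Callable[[int, int], int]:
    """A simulated pupil who misses the first two questions on the 7 table."""
    misses = {"left": 2}

    def answer(table: int, other: int) -> int:
        if table == 7 and misses["left"] > 0:
            misses["left"] -= 1
            return table * other + 1
        return table * other

    return answer


asked, errors = drill(shaky_on_sevens(), seed=0)
print(asked, errors[7])
```

The machine needs no judgment to do this: it keeps a count, repeats the weak point, and stops when the count is satisfied, which is why rote drill is the favorable case for CAI.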
We now turn to airplane pilot training in the current trainers. They again do a better job, by far, than can
any real life experience, and generally the pilots have fairly little other human interactive training during the
course. Flying, to a fair extent, I point out, is a conditioned response being trained into the human. It is not
much thinking, though at times thinking is necessary, it is more training to react rapidly and correctly, both
mentally and physically, to unforeseen emergencies.
It seems to me for this sort of training, where there is a conditioned response to be learned, machines can
do a very good job. It happens as a child I learned fencing. In a duel there is no time for local thinking; you must make a rapid conditioned response. There is indeed a large overall planning of a duel, but moment to
moment it must be a response which does not involve the delay of thinking.
When I first came to the Naval Postgraduate School in 1976 there was a nice dean of the extension
division concerned with education. In some hot discussions on education we differed. One day I came into
his office and said I was teaching a weight lifting class (which he knew I was not). I went on to say
graduation was lifting 250 pounds, and I had found many students got discouraged and dropped out, some
repeated the course, and a very few graduated.
The Weight of Learning
- The author uses a weightlifting analogy to question whether making learning easier for students actually diminishes their intellectual development.
- A distinction is made between learning from others to follow and learning for oneself to lead, suggesting that struggle is essential for leadership.
- The author argues that speed in learning is a critical metric for identifying valuable individuals for society and leadership roles.
- Over-reliance on specific visual aids or graphics may inadvertently restrict a student's ability to generalize concepts to new, unfamiliar contexts.
- The 'transfer of training' problem is highlighted by students failing to recognize the same mathematical concepts when the notation or environment changes.
- The fundamental difficulty in assessing educational technology like CAI is the lack of a clear definition of what an educated person should be.
What you learn from others you can use to follow; What you learn for yourself you can use to lead.
I went on to say thinking this over last night I decided the
problem could be easily cured by simply cutting the weights in half: the student in order to graduate would lift 125 pounds, set them down, and then lift the other 125 pounds, thus lifting the 250 pounds.
I waited a moment while he smiled (as you probably have) and I then observed when I found a simpler
proof for a theorem in Mathematics and used it in class, was I or was I not cutting the weights in half? What is
your answer? Is there not an element of truth in the observation the easier we make the learning for the
student the more we are cutting the weights in half? Do not jump to the conclusion I am saying poor chapters should be given because then the students must work harder. But a lot of evidence on what enabled
people to make big contributions points to the conclusion a famous prof was a terrible lecturer and the students had to work hard to learn it for themselves! I again suggest a rule:
What you learn from others you can use to follow;
What you learn for yourself you can use to lead.
To get closer to the problem, to what extent is it proper to compare physical muscles with "mental
muscles"? Probably they are not exactly equivalent, but how far is it a reasonable analogy? I leave it to you to think over.
Another argument I had with this same dean was his belief the students should be allowed to take the
extension courses which were under his wing at their own pace; I argued the speed in learning was a
significant matter to organizations: rapid learners were much more valuable than were slow learners (other
things being the same); it was part of our job to increase the speed of learning and mark for society those who were the better ones. Again, this is opinion, but surely you do not want very slow learners to be in charge of you. Speed in learning new things is not everything, to be sure, but it seems to me it is an
important element.
The fundamental trouble in assessing the value of CAI is we are not prepared to say what the educated
person is, nor how we now accomplish it (if we do!). We can say what we do, but that is not the same as
what we should be doing. Hence I can only give more anecdotes.
Consider the claims graphics well done would be of great assistance to learning basic concepts. Sounds
good, but consider the story I told you about my friend Kaiser, and how having learned filter theory in terms of time and voltage, he could not cope, in spite of directions, with the independent variable being energy. Again, Kaiser is a very smart person, but his education had restricted his view of the use of what he had learned. The better we inculcate the basic idea with the pictures drawn by the professor, the more we prevent
the student from later extending the ideas to completely new areas not thought of by the professor (and put into the graphic display).
Let me tell you another story about the transfer of training, as it is called: the use of ideas from one
place to another. During the very early part of WWII I was teaching a calculus course at an engineering
school in Louisville. The students were having trouble in a course in thermodynamics taught by the dean of engineering, who was an ex-submarine commander and who scared the students. With the dean's
permission I visited a class to see what was happening. He put on the board, at one point,
and asked what it was, and no student knew. The very next hour in my class across the hall I wrote $\int dx/x$
The Transfer of Training
- Students often fail to apply knowledge learned in one classroom to an identical problem in a different setting.
- The 'transfer of training' is the critical ability to utilize existing ideas and skills in entirely new situations.
- The author identifies this lack of transfer as a major hurdle in education and a key area of their professional contribution at Bell Labs.
- Many students rely on rote memorization to pass mathematics courses rather than developing deep conceptual understanding.
- Analytic integration serves as a pedagogical wall where memorization fails and pattern recognition becomes necessary.
The fact is, what they knew in one class at one hour with one professor did not transfer to another hour in a room across the hall with another professor.
and they all knew immediately it was log x plus a constant. When I wrote
they again knew. "Why," said I, "did you not respond with that in the dean's class last hour?" The fact is,
what they knew in one class at one hour with one professor did not transfer to another hour in a room across
the hall with another professor. Sounds strange, but that is what is known as the "transfer of training": the
ability to use the same ideas in a new situation. Transfer of training was a large part of my contribution to
Bell Telephone Laboratories; I did it quite often, though of course I do not know how many chances I
missed!
Let me turn to the calculus course I have often taught at the Naval Postgraduate School, though I had
formed this opinion years before. Students are remarkably able to memorize their way through many math
classes, and many do so. But when I get to analytic integration (I give the students a function and ask for its
indefinite integral) there is no way they can memorize their way through the course the way I teach it. They
must learn to recognize
The Limits of Educational Automation
- Mastering analytic integration is essential for developing abstract pattern recognition and general intelligence.
- Removing difficult foundational tasks from curricula can have long-term negative consequences, similar to students failing to learn the alphabet.
- Computer Aided Instruction (CAI) is effective for low-level conditioned training but remains unproven for high-level education.
- The definition of an 'educated person' is poorly understood, making it difficult to judge the success of new educational proposals.
- Simulations like war games or business management programs risk training individuals for the wrong situations if the underlying models are flawed.
- Historical shifts in education, such as the move away from classical Latin and Greek, demonstrate that the definition of a core curriculum is constantly evolving.
At their age then it was practically impossible to make them so overlearn the alphabet they could use such information sources easily.
in an almost infinite number of disguises. For the first time in their career they are forced to learn to
recognize forms independent of the particular representation, which is a basic feature of Mathematics and
general intelligence. To take analytic integration out of the course, or transfer it to routines in computers, is
to defeat the purpose of a stage of learning something that is essential, in my opinion, unless something of equivalent difficulty is put in. The students must master abstract pattern recognition if they are to progress
and use Mathematics later in their careers.
A very similar error was made years ago when I was a student at the University of Chicago. The
Education Department ran an Elementary School for research purposes. They had found students learn to
read by syllables not by letters, and so they decided to skip teaching the alphabet and get on to the real
reading. Which they did. Things went on quite well until late high school when it was found not knowing the alphabet thoroughly the students could not effectively use dictionaries, phone books, etc. At their age then it was practically impossible to make them so overlearn the alphabet they could use such information
sources easily. Thus I am wary of proposed changes until the consequences have been followed out carefully through long term predictions of all necessary needs for the material they are now going to omit.
In summary, as best I can, clearly in low level conditioned response situations, typically associated with
training, I believe computers can greatly aid in the learning process, but at the other end, high level
thinking, education, I am very skeptical. Skeptical, mainly because we ourselves do not understand either
what we want to do, nor what we are presently doing! We simply do not know what we mean by "the educated person", let alone what it will mean in the year 2020. Without that knowledge, how am I to judge the success of any proposal which is tried? Between low level training and high level education there is a large area to be explored and exploited by organizations outside the universities as well as inside. I will
discuss at great length in Chapter 26 the point rarely do the experts in a field make the significant steps forward;
great progress generally comes from the outside. The role of CAI in organizations with large training
programs will increase in the future as progress constantly obsoletes old tools and introduces new ones into
the organization that are generally more complex technically to use.
Consider the programs on computers which are supposed to teach such things as business management,
or, even more seriously, war games. The machines can take care of the sea of minor details in the
simulation, indeed should buffer the player from them, and expect good, high level decisions. There may be some elements of low level training which must be included, as well as the higher level thinking. We must
ask to what extent it is training and to what extent it is education. Of course, as mentioned in the three chapters on simulation, we also need to ask if the simulation is relevant to the future for which the training is being given. Will the presence of the gaming programs, if at all widespread, perhaps vitiate the training? You can be sure, however, even if the proposers cannot answer these questions, they will still produce and advertise the corresponding programs. You may be a victim of being trained for the wrong situations!
A few hundred years ago the standard higher education was learning to read, write, and speak Latin,
along with a smattering of Greek and a knowledge of the Classics. This was the basic education with which Englishmen, for example, went out and created an empire. Our present education has very, very little in
common with the classical one.
The Future of Mathematics
- Future education will differ as drastically from current systems as modern schooling does from classical education.
- Preparing students for a high-tech future requires a vision of the 'educated person' rather than mere technical tinkering.
- The universal availability of laptop computers and massive data processing power necessitates a total reevaluation of CAI projects.
- Progress often comes from examining 'background' elements like language and mathematics that are usually taken for granted.
- Mathematics is central to science and engineering, yet even professional mathematicians struggle to define it beyond circular logic.
Just because something can be done, especially using computers, does not mean it should be done.
I suggest strongly the future education will have as little to do with the present education as the present education has with the classical education. Tinkering with small changes in
our present educational system will not meet the problem we face in preparing the students for the year
2020 when laptop computers are universally available along with immense storage capacity for information
and ability to process the data. Without a vision of what kind of education will be appropriate at that time
how are we to evaluate proposed CAI projects? Just because something can be done, especially using
computers, does not mean it should be done. We must create a vision of what the educated person will be in
the future society, and only then can we confidently approach the problems which arise in CAI.
23
Mathematics
As you live your life your attention is generally on the foreground things, and the background is usually
taken for granted. We take for granted, most of the time, air, water, and many other things such as language
and Mathematics. When you have worked in an organization for a long time its structure, its methods, its
"ethos" if you wish, are usually taken for granted.
It is worthwhile, now and then, to examine these background things which have never held your close
attention before, since great steps forward often arise from such actions, and seldom otherwise. It is for this
reason we will examine Mathematics, though a similar examination of language would also prove fruitful. We have been using Mathematics without ever discussing what it is; most of you have never really thought
about it, you just did the Mathematics. Yet Mathematics plays a central role in science and engineering.
Perhaps the favorite definition of Mathematics given by Mathematicians is:
"Mathematics is what is done by Mathematicians, and Mathematicians are those who do
Mathematics".
Coming from a Mathematician its circularity is a source of humor, but it is also a clear admission they do
not think Mathematics can be defined adequately. There is a famous book, What is Mathematics?, and in it
the authors exhibit Mathematics but do not attempt to define it.
Once at a cocktail party a Bell Telephone Laboratories Mathematics department head said three times to
a young lady,
Mathematics as Clear Thinking
- Mathematics is defined as the language of clear thinking, offering a precision that natural languages like English, with its inherent contradictions and ambiguities, cannot match.
- While notations vary across cultures (such as Roman numerals versus binary), the underlying mathematical concepts, like the primality of seven, remain universal and independent of their representation.
- The author argues that any advanced extraterrestrial civilization would likely possess essentially the same mathematics as humans, as it is a prerequisite for mastering physical laws like Maxwell's equations.
- Platonism, the oldest school of mathematical thought, posits that numbers and theorems exist in an eternal world of ideas and are discovered rather than created.
- Despite its historical popularity, Platonism struggles to explain the evolution of mathematical definitions and concepts over time, which contradicts the notion of immutable, eternal truths.
You have only to look at the legal system and the income tax people, and their use of the natural language to express what they mean, to see how inadequate the English language is for clear thinking.
Mathematics is nothing but clear thinking.
I doubt she agreed, but she finally changed the subject; it made an impression on me. You might also say
Mathematics is the language of clear thinking.
This is not to say Mathematics is perfect (not at all), but nothing better seems to be available. You have
only to look at the legal system and the income tax people, and their use of the natural language to express
what they mean, to see how inadequate the English language is for clear thinking. This simple statement, "I
am lying." contradicts itself!
There are many natural languages on the face of the earth, but there is essentially only one language of
Mathematics. True, the Romans wrote VII, the Arabic notation is 7 (of course the 7 is in the Latin form and
not the Arabic) and the binary notation is 111, but they are all the same idea behind the surface notation. A 7
is a 7 is a 7, and in every notation it is a prime number. The number 7 is not to be confused with its
representation.
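The distinction between a number and its representation can be illustrated with a minimal Python sketch. It is not part of the original text; the helper names `roman_to_int` and `is_prime` are hypothetical, chosen only for this illustration. The three notations mentioned above are decoded to the same integer, and primality is then tested on the number itself, not on any of its written forms.

```python
def roman_to_int(numeral: str) -> int:
    """Convert a Roman numeral such as 'VII' to an integer (illustrative helper)."""
    values = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}
    total = 0
    for i, ch in enumerate(numeral):
        v = values[ch]
        # Subtractive notation: a smaller symbol before a larger one is subtracted.
        if i + 1 < len(numeral) and v < values[numeral[i + 1]]:
            total -= v
        else:
            total += v
    return total


def is_prime(n: int) -> bool:
    """Trial division; adequate for small numbers."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


# Three surface notations for the same underlying number.
representations = {
    "Roman": roman_to_int("VII"),
    "decimal": int("7", 10),
    "binary": int("111", 2),
}

same_number = len(set(representations.values())) == 1
all_prime = all(is_prime(n) for n in representations.values())
print(representations, same_number, all_prime)
```

Primality here is a property of the decoded integer; changing the notation changes only the parsing step.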
Most people who have given the matter serious thought have agreed if we are ever in communication
with a civilization around some distant sun, then they will have essentially the same Mathematics as we do. Remember the hypothesis is we are in communication with them, which seems to imply they have
developed to the state where they have mastered the equivalent of Maxwell's equations. I should note some philosophers have doubted even their communication system, let alone any details of it, would resemble ours in any way at all. But people who have their heads in the clouds all the time can imagine anything at all and are very seldom close to correct (witness some of the speculation the surface of the moon would
have meters of dust into which the space vehicle would sink and suffocate the people).
The words "essentially equivalent" are necessary because, for example, their Euclidean geometry may
include orientation and thus for the aliens two triangles may be congruent or anticongruent, Figure 23.I.
Similarly, Ptolemy in his Almagest on astronomy used the sin x where we would use 2 sin(x/2), but
essentially the idea is the same.
Over the many years there have developed five main schools of what Mathematics is, and not one has
proved to be satisfactory.
The oldest, and probably the one most Mathematicians adhere to when they do not think carefully about
it, is the Platonic school. Plato (427-347 B.C.) claimed the idea of a chair was more real than any particular
chair. Physical chairs are subject to wear, tear, decay, and being lost; the ideal chair is immutable, eternal,
so he said. Hence, he claimed, the world of ideas is more real than the physical world. The theorems of
Mathematics, and all other such results, belong in this world of ideas (so Plato claimed) along with the numbers such as 7, and they have no existence in the physical world. You never saw, heard, touched, tasted,
or smelled the abstract number 7. Yes, you have seen 7 horses, 7 cows, 7 chairs, but not the number 7 itself,
a pure 7 uncontaminated by any particular realization. In an image Plato used, we see reality only as the
shadows it casts on a wall. The true reality is never visible, only the shadows of truth come to our senses. It
is our minds which transcend this limitation and reach the ideas which are the true reality, according to Plato.
Thus Platonic Mathematicians will say they "discovered" a result, not that they "created" it. I "discovered"
error correcting codes, rather than "created" them, if I am a Platonist. The results were always there waiting
to be discovered, they were always possible.
The trouble with Platonism is it fails to be very believable, and certainly cannot account for how
Mathematics evolves, as distinct from expanding and elaborating; the basic ideas and definitions of
Mathematics have gradually changed over the centuries, and this does not fit well with the idea of the
immutable Platonic ideas.
Mathematical Foundations and Formalism
- Euler's concept of continuity differs significantly from modern interpretations, suggesting that mathematical ideas evolve as we perceive them more clearly.
- The Platonic view posits that all mathematical ideas and their logical consequences have existed eternally since the Big Bang.
- Formalists, led by David Hilbert, view mathematics as a mechanical game of symbol manipulation devoid of human interpretation to avoid error.
- A classic geometric fallacy proving all triangles are isosceles revealed fundamental gaps in Euclidean geometry regarding 'betweenness' and intersections.
- Hilbert's rigorous re-evaluation added numerous postulates to Euclid's work, yet remarkably, none of the original 467 theorems were found to be false.
- The fact that Euclid's theorems remained true despite lacking rigorous proofs suggests a mysterious alignment between intuition and formal logic.
For them all of Mathematics is a mechanical game where no interpretation of the meaning of the symbols is permitted lest you make an all too human error.
Euler's (1707-1783) idea of continuity is quite different from the one you were taught. You can, of course, claim the changes arise from our "seeing the ideas more clearly" with the passage of time. But when one considers non-Euclidean geometry, which arose from tampering with only the parallel postulate, and then thinks of the many other potential geometries which must exist in this Platonic space, one sees that every possible Mathematical idea and all the possible logical consequences from them must
all exist in Plato's realm of ideas for all eternity! They were all there when the Big Bang happened!
A second major school of Mathematicians is the formalists. To them Mathematics is a formal game of
starting with some strings of abstract symbols, and making permitted formal transformations on the strings
much as you do when doing algebra. For them all of Mathematics is a mechanical game where no
interpretation of the meaning of the symbols is permitted lest you make an all too human error. This school
has Hilbert as its main protagonist. This approach to Mathematics is popular with the Artificial Intelligence
people since that is what machines do par excellence!
There was, probably, by the late Middle Ages (though I have never found just when it was first
discovered) a well known proof, using classical Euclidean geometry, that every triangle is isosceles. You start with
a triangle ABC, Figure 23.II. You then bisect the angle at B and also make the perpendicular bisector of the
opposite side at the point D. These two lines meet at the point E. Working around the point E you establish
small triangles whose corresponding sides or angles are equal, and finally prove the two sides of the
bisected angle are the same size! Obviously the proof of the theorem is wrong, but it follows the style used
by classical Euclidean geometers so there is clearly something basically wrong. (Notice only by using
metaMathematical reasoning did we decide Mathematical reasoning this time came to a wrong conclusion!)
To show where the false reasoning of this result arose (and also other possible false results) Hilbert
examined what Euclid had omitted to talk about: both betweenness and intersections. Thus Hilbert could
show that the two bisectors met outside the triangle, not inside as the drawing
indicated. In doing this he added many more postulates than Euclid had originally given!
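The claim that the two bisectors meet outside the triangle can be checked numerically. The following Python sketch is an illustration added here, not Hilbert's argument; the coordinates of A, B, C and the helper names are assumptions chosen for the example. It intersects the bisector of the angle at B with the perpendicular bisector of the opposite side AC and tests whether the meeting point E lies inside the triangle.

```python
import math


def unit(v):
    """Return the vector v scaled to length 1."""
    n = math.hypot(v[0], v[1])
    return (v[0] / n, v[1] / n)


def cross(o, a, b):
    """z-component of (a - o) x (b - o); its sign tells which side of line oa the point b is on."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def inside_triangle(p, a, b, c):
    """True if p lies strictly inside triangle abc (all three side tests agree in sign)."""
    d1, d2, d3 = cross(a, b, p), cross(b, c, p), cross(c, a, p)
    return (d1 > 0 and d2 > 0 and d3 > 0) or (d1 < 0 and d2 < 0 and d3 < 0)


def bisector_intersection(a, b, c):
    """Meet of the angle bisector at b with the perpendicular bisector of side ac.

    Assumes |ba| != |bc|; for an isosceles triangle the two lines coincide.
    """
    ua = unit((a[0] - b[0], a[1] - b[1]))
    uc = unit((c[0] - b[0], c[1] - b[1]))
    u = (ua[0] + uc[0], ua[1] + uc[1])           # direction of the angle bisector
    d = ((a[0] + c[0]) / 2, (a[1] + c[1]) / 2)   # midpoint D of side ac
    ac = (c[0] - a[0], c[1] - a[1])
    # Solve (b + t*u - d) . ac = 0 for t.
    t = ((d[0] - b[0]) * ac[0] + (d[1] - b[1]) * ac[1]) / (u[0] * ac[0] + u[1] * ac[1])
    return (b[0] + t * u[0], b[1] + t * u[1])


# An arbitrary scalene triangle (assumed coordinates).
A, B, C = (0.0, 0.0), (1.0, 3.0), (4.0, 0.0)
E = bisector_intersection(A, B, C)
centroid = ((A[0] + B[0] + C[0]) / 3, (A[1] + B[1] + C[1]) / 3)

print("E =", E)
print("E inside triangle:", inside_triangle(E, A, B, C))
print("centroid inside triangle:", inside_triangle(centroid, A, B, C))
```

For this triangle E falls below the side AC, on the far side from B, so the small triangles drawn around an interior E in the classical figure do not exist as pictured.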
I was a graduate student in Mathematics when this fact came to my attention. I read up on it a bit, and
then thought a great deal. There are, I am told, some 467 theorems in Euclid, but not one of these theorems
turned out to be false after Hilbert added his postulates! Yet, every theorem which needed one of these new postulates could not have been rigorously "proved" by Euclid! Every theorem which followed, and
Figure 23.I
Figure 23.II
The Illusion of Rigor
- Euclid and Hilbert did not discover truths through deduction, but rather worked backward from known results to find supporting postulates.
- The historical development of mathematics suggests that 'truth' often precedes the formal proof used to justify it.
- Formalism's claim that mathematics lacks inherent meaning fails to explain its immense utility in the physical world.
- If mathematics were merely an idle game like chess, there would be no logical reason for society to support its study.
- The logical school, including Whitehead and Russell, failed to successfully reduce all of mathematics to a branch of logic.
- Russell's definition of pure mathematics emphasizes the relationship between propositions over the actual truth of the premises.
Euclid did not lay down postulates and make deductions as it is commonly taught; he felt his way back from "known" results to the postulates he needed!
rested on such a theorem, was also not "proved" by Euclid. Yet the results in the improved system were still
the same as those Euclid regarded as being true. How could this be? How could it be Euclid, though he had not actually proved the bulk of his theorems, never made a mistake? Luck? Hardly!
It soon became evident to me one of the reasons no theorem was false was that Hilbert "knew" the
Euclidean theorems were "correct", and he had picked his added postulates so this would be true. But then I soon realized Euclid had been in the same position; Euclid knew the "truth" of the Pythagorean theorem,
and many other theorems, and had to find a system of postulates which would let him get the results he knew in advance. Euclid did not lay down postulates and make deductions as it is commonly taught; he felt his way back from "known" results to the postulates he needed!
To paraphrase one of Hilbert's claims, "When rigor enters, meaning departs." The formalists claim there
is no "meaning" in Mathematics, but if so why should society support Mathematics and Mathematicians?
Why is it Mathematics has proved to be so useful? If there is no meaning in any place in all of Mathematics
then why is it postulates and definitions are altered in time? The formalists simply cannot explain why Mathematics is in fact more than an idle game with no more meaning than the moves of chess.
Closely related to the formalists is the logical school, who have tried to reduce all of Mathematics to a
branch of logic. They, like every other school, have not been able to carry out their program, and for them it is more painful than for the others since they are supposed to be logicians! The famous Whitehead and
Russell attempt, in three huge volumes, has generally been abandoned though large parts of their work have
been retained. To use a famous quote from Russell:
"Pure Mathematics consists entirely of assertions to the effect that, if such and such a proposition is
true of anything, then such and such another proposition is true of that thing. It is essential not to
discuss whether the first proposition is really true, and not to mention what the thing is, of which it is
supposed to be true."
The Foundations of Mathematics
- The author argues that mathematical foundations are often a 'penthouse' rather than a base, as assumptions are frequently chosen to support theorems we already believe are true.
- Mathematical rigor is a shifting standard, meaning that what is considered a 'proof' today may be seen as incomplete or flawed by future generations.
- The intuitionist school acknowledges that mathematics is a human creation, suggesting we are both the masters and servants of the systems we build.
- Proof should be viewed on a probabilistic scale from 0 to 1 rather than a binary of absolute certainty, as definitions and standards of logic evolve over time.
- Constructivists and computer scientists prioritize explicit methods of creation over the formalist view that consistency alone implies existence.
- Most practitioners treat mathematics as a practical tool, ignoring the philosophical contradictions inherent in its various schools of thought.
I will tell you to go back and get new assumptions; I know Cauchy's theorem is "true".
Here you see a blend of the logical and formalist schools, and the sterility of their views. The logicians
failed to convince people their approach was other than an idle exercise in logic. Indeed, I will strongly suggest what is usually called the foundations of Mathematics is only the penthouse. A simple illustration of this is that for years I have been saying that if you come into my office and show me Cauchy's theorem is false, meaning it cannot be derived from the usual assumptions, then I will certainly be interested, but in the long run I will tell you to go back and get new assumptions; I know Cauchy's theorem is "true". Thus, for me at
least, Mathematics does not exclusively follow from the assumptions, but rather very often the assumptions
follow from the theorems we "believe are true". I tend, as do many others, to group the formalists and
logicians together.
Clearly, Mathematics is not the laying down of postulates and then making rigorous deductions from them
that the formalists pretend it is. Indeed, almost every graduate student in Mathematics has the experience that they have
to "patch up" the proofs of earlier great Mathematicians; and yet somehow the theorems do not change
much, though obviously the great Mathematician had not really "proved" the theorem which was being patched up. It is true (though seldom mentioned) that definitions in Mathematics tend to "slide" and alter a bit
with the passage of time, so previous proofs no longer apply to the same statement of a theorem now that we understand the words slightly differently.
The fourth school is the intuitionists, who boldly face this dilemma and ignore rigor. If you want absolute
rigor, then, since we have had a rising standard of rigor, presumably no presently proved theorem is really "proved"; rather, the future will have to patch up our results, meaning we will not have "proved" anything! I suppose, if you want my position, I am partly an intuitionist. The above example about Cauchy's theorem
illustrates my attitude: Mathematics shall do what I want it to do. Contrary to Hermite (1822-1901) who
said, "We are not the master but the servant of Mathematics", I tend to believe (some of the time) we are the
master. The postulates of Mathematics were not on the stone tablets Moses brought down from Mount Sinai; they are human made and hence subject to human changes as we please. Neither my view given
above nor Hermite's is exactly correct; the truth is a blend of them, we are both the master and the servant of
Mathematics.
The nature of our language tends to force us into "yes-no", something is or is not, you either have a proof
or you do not. But once we admit there is a changing standard of rigor we have to accept some proofs are
more convincing than other proofs. If you view proofs on a scale much like probability, running from 0 to 1, then all proofs lie in the range and very likely never reach the upper limit of 1, certainty.
The last major school is the constructivists. They insist you give explicit methods of constructing
everything you talk about, and not proceed as the formalists do who say if a set of postulates is not proved
to be inconsistent then the objects the postulates define "exist". The constructivist's approach can get you into
a lot of trouble. There is no really rigorous basis for Mathematics for any of the other four schools, but the constructivists are too strict for many of our tastes since they exclude too much that we find valuable in
practice. Computer scientists, excluding the AI people, tend to belong to the constructivist school, if they
think about the matter at all.
Indeed, some numerical analysts tend to believe the "real number system" is the bit patterns in the
computer; they are the true reality, so they say, and the Mathematician's imagined number system is
exactly that, "imagined". Most users of Mathematics simply use it as a tool, and give little or no attention to their basic philosophy.
The Nature of Mathematical Meaning
- The belief that software can be proven correct like mathematical theorems is flawed because theorems themselves are not strictly proven in the way many assume.
- Many programming problems are too ill-defined for formal proofs, as the resulting code often serves to define the problem itself.
- Mathematics is not a universal truth but a collection of different systems where symbols like 1+1 can equal 2 or 0 depending on the context.
- The meaning of mathematical symbols and language arises from how they are used and their relationships to other symbols rather than from inherent definitions.
- Mathematicians often adopt a formalist stance to avoid philosophical scrutiny, treating their work as a game with symbols despite its immense practical utility.
- The choice of which mathematical system to apply must be dictated by the specific field of application rather than a belief in absolute truth.
The professors are too busy doing the details of Mathematics to ever discuss what they are actually doing: a typical technician's behavior!
There is a group of people in software who believe we should "prove programs are correct" much as we
prove theorems in Mathematics are correct. The two fallacies they commit are:
(1) we do not actually "prove" theorems!
(2) many important programming problems cannot be defined sharply enough so a proof can be given,
rather the program which emerges defines the problem!
This does not mean there is nothing of value to their approach of proving programs are correct, only, as so often
happens, their claims are much inflated.
Most Mathematicians belong to the Platonic school when they are doing Mathematics from day to day,
but when pressed for a clear discussion of what they are doing they usually take refuge in the formalist
school and claim Mathematics is an idle game with essentially no meaning to the symbols (not that they
believe this, but it is a nice defensible position to adopt). They pretend they believe in the above quotation from Russell.
As you know from your courses in Mathematics, what you are actually doing, when viewed at the
philosophical level, is almost never mentioned. The professors are too busy doing the details of
Mathematics to ever discuss what they are actually doing: a typical technician's behavior!
However, as you all know, Mathematics is remarkably useful in this world, and we have been using it
without much thought. Hence we need more discussion on this background material you have used without benefit of thought.
The ancient Greeks believed Mathematics was "truth". There was little or no doubt on this matter in their
minds. What is more sure than 1+1=2? But recall when we discussed error correcting codes we said 1+1=0. This multiple use of the same symbols (you can claim the 1's in the two statements are not the same things
if you wish) contradicts logical usage. It was probably when the first non-Euclidean geometries arose that
Mathematicians came face to face with this matter, that there could be different systems of Mathematics.
They use the same words, it is true, such as points, lines, and planes, but apparently the meanings to be
attached to the words differ. This is not new to you; when you came to the topic of forces in mechanics and
to the addition of forces then you had to recognize scalar addition was not appropriate for vector addition. And the word "work" in physics is not the same as we generally mean in real life.
It would appear the Mathematics you choose to use must come from the field of application; Mathematics
is not universal and "true". How, then, are we to pick the right Mathematics for various applications? What
meanings do the symbols of Mathematics have in themselves? Careful analysis suggests the "meaning" of a
symbol only arises from how it is used, and not from the definitions, as Euclid (and you) thought when he defined
points, lines and planes. We now realize his definitions are both circular and do not uniquely define anything; the meaning must come from the relationships between the symbols. It is just as in the interpretive
language I sketched out in Chapter 4: the meaning of the instruction was contained in the subroutine it called
(how the symbols were processed) and not in the name itself! In themselves the marks are just strings of
bits in the machine and can have no meaning except by how they are used.
The Mathematician Dodgson (Lewis Carroll), who wrote Alice in Wonderland and Through the Looking
Glass, specialized in logic, and these two books are extensive displays of how meaning resides in the use.
Thus Humpty Dumpty asserted when he used a word it meant what he wanted it to mean, neither more nor
less; Alice felt words had meanings independent of their use, and should not be used arbitrarily.
By now it should become clear the symbols mean what we choose them to mean. You are all familiar
with different natural languages where different words (labels) are apparently assigned to the same idea.
Coming back to Plato; what is a chair?
The Paradox of Mathematical Meaning
- Language is inherently circular because words can only be defined by other words, making the initial acquisition of language by children a profound mystery.
- Meaning is not an absolute or prescriptive quality of words but arises dynamically from how they are used in specific contexts.
- Mathematics is revealed to be a collection of arbitrary human conventions rather than a repository of absolute, objective truth.
- The 'unreasonable effectiveness' of mathematics stems from our ability to map symbols onto reality through the recognition of analogies.
- Mathematics serves as a universal mental tool where meaning is injected during the translation of a problem into symbols and extracted during interpretation.
- The logical construction of the world remains a fundamental paradox, as it allows abstract symbolic manipulation to predict real-world outcomes.
We have passed from absolute certain truth in Mathematics to the state where we see there is no meaning at all in the symbols, but we still use them!
Is it always the same idea, or does it depend on context? At a picnic
a rock can be a chair, but you do not expect the use of a rock in someone's living room as a chair. You also
realize any dictionary must be circular; the first word you look up must be defined in terms of other words. There can be no first definition which does not use words.
You may, therefore, wonder how a child learns a language. It is one thing to learn a second language
once you know a first language, but to learn the first language is another matter; there is no first place to appeal for meaning. You can do a bit with gestures for nouns and verbs, but apparently many words are not so indicatable. When I point to a horse and say the word "horse", am I indicating the name of the particular
horse, the general name of horses, of quadrupeds, of mammals, of living things, or the color of the horse?
How is the other person to know which meaning is meant in a particular situation? Indeed, how does a child
learn to distinguish between the specific, concrete horse, and the more abstract class of horses?
Apparently, as I said above, meaning arises from the use made of the word, and is not otherwise defined.
Some years back a famous dictionary came out and admitted they could not prescribe usage, they could only
say how words were used; they had to be "descriptive" and not "prescriptive". That there is apparently no
absolute, proper meaning for every word made many people quite angry. For example, both the New
Yorker book reviewer and the fictional detective Nero Wolfe were very irate over the dictionary.
We now see all this "truth" which is supposed to reside in Mathematics is a mirage. It is all arbitrary
human conventions.
But we then face the unreasonable effectiveness of Mathematics. Having claimed there was neither
"truth" nor "meaning" in the Mathematical symbols, I am now stuck with explaining the simple fact Mathematics is used and is an increasingly central part of our society, especially in science and engineering.
We have passed from absolute certain truth in Mathematics to the state where we see there is no meaning at
all in the symbols, but we still use them! We put the meaning into the symbols as we convert the assumptions of the problem into Mathematical symbols, and again when we interpret the results. Hence we
can use the same formula in many different situations; Mathematics is sort of a universal mental tool for
clear thinking.
A fundamental paradox of life, well stated by Einstein, is it appears the world is logically constructed.
This is the most amazing thing there is: the world can be understood logically and Mathematically. I would
warn you, however, recent developments in basic physics cast some doubt on his remark, and this is
discussed in the next chapter.
Supposing for the moment the above remark of Einstein is true, then the problem of applying
Mathematics is simply to recognize an analogy between the formal Mathematical structure and the
corresponding part of "reality". For example, for the error correcting codes I had to see that, for the symbols of the
code, if I were to use 0 and 1 for the basic symbols, and use a 1 for the position of an error (the error was
simply a string of 0's with one 1 where the error occurred), then I could "add" the strings if and only if I chose 1+1=0 as my basic arithmetic. Two successive errors in the same position are the same as no error. I
had to see an analogy between parts of the problem and a Mathematical structure which at the start I barely
understood.
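The arithmetic in this example can be made concrete with a short Python sketch. It is an illustration under stated assumptions, not the construction of the codes themselves: the 7-bit message and the helper names `add_mod2` and `single_error` are invented for the example. Strings of 0's and 1's are added position by position with 1+1=0, an error is a string of 0's with a single 1, and adding the same error twice leaves no error.

```python
def add_mod2(a, b):
    """Add two equal-length bit strings position by position, using 1 + 1 = 0."""
    if len(a) != len(b):
        raise ValueError("bit strings must have the same length")
    return [(x + y) % 2 for x, y in zip(a, b)]


def single_error(length, position):
    """A string of 0's with one 1 at the position where the error occurred."""
    e = [0] * length
    e[position] = 1
    return e


message = [1, 0, 1, 1, 0, 0, 1]          # an arbitrary 7-bit string (assumed)
error = single_error(len(message), 2)    # an error in position 2 (counting from 0)

received = add_mod2(message, error)      # the channel "adds" the error
repaired = add_mod2(received, error)     # adding the same error again undoes it
no_error = add_mod2(error, error)        # two errors in the same position cancel

print(received)   # [1, 0, 0, 1, 0, 0, 1]
print(repaired)   # [1, 0, 1, 1, 0, 0, 1]
print(no_error)   # [0, 0, 0, 0, 0, 0, 0]
```

With ordinary arithmetic, where 1+1=2, the sum of two bit strings would leave the alphabet of 0's and 1's; choosing 1+1=0 is what makes the analogy between errors and addition hold.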
Thus part of the effectiveness of Mathematics arises from the recognition of the analogy, and only in so
far as the analogy is extensive and accurate can we use Mathematics to predict what will happen in the real
world from the manipulation of the symbols at our desks.
You have been taught a large number of these identifications between Mathematical models and pieces of
reality.
The Future of Mathematical Models
- Future mathematical models will likely move away from simple correspondences toward complex systems where the whole is greater than the sum of its parts.
- The author defines mathematics broadly as any form of clear thinking, especially when utilizing symbols to solve complex organizational or technical problems.
- There are fundamental human experiences, such as music, painting, and poetry, that communicate truths which cannot be fully captured by words or discrete symbols.
- Concepts like truth, beauty, and justice remain elusive to formal definition, as evidenced by the gap between legal systems and the actual sense of justice.
- Gödel's theorem suggests inherent limitations in discrete symbol systems, implying that human language may have evolved its ambiguity and nuance specifically to bypass these logical constraints.
Indeed, a tone of voice, a lift of an eyebrow, the wink of an eye, or even a smile, can change the meaning of what is being said.
But I doubt these will cover all future developments. Rather, as we want, more and more, to do new
things which are now possible due to technical advancements of one kind or another, including
understanding ourselves better, we will need many other Mathematical models.
I suggest, with absolutely no proof, in the past we have found the easy applications of Mathematics, the
situations where there is a close correspondence between the Mathematical structure and the part being
modeled, and in the future you will have to be satisfied with poorer analogies between the two parts. We
will, in time, I believe, want Mathematical models in which the whole is not the sum of the parts, but the
whole may be much more due to the “synergism” between the parts. You are all familiar with the fact the
organization you are in is often more than the total of the individuals: there is morale, means of control,
habits, customs, past history, etc. which are indefinably separate from the particular individuals in the
organization. But if Mathematics is clear thinking, as I said at the start of this chapter, then Mathematics
will have to come to the rescue for these kinds of problems in the future. Or to put it differently, whatever
clear thinking you do, especially if you use symbols, then that is Mathematics!
I want to close with even more disturbing thoughts. It is not evident, though many people, from the early
Greeks on, implicitly act as if it were true, that all things, whatsoever they may be, can be put into words;
you could talk about anything, the gods, truth, beauty, and justice. But if you consider what happens in a
music concert, then it is obvious what is transmitted to the audience cannot be put into words; if it could
then the composer and musicians would probably have used words. All the music critics to the contrary,
what music communicates cannot (apparently) be put into words. Similarly, but to a lesser extent, for
painting. Poetry is a curious field where words are used, but the true content of the poem is not in the
words!
Similarly, the three things of Classic Greece, truth, beauty and justice, though you all think you know
what they mean, cannot (apparently) be put into words. From the time of Hammurabi (1955–1913 B.C.) the
attempt to put justice into words has produced the law, and often the law is not your conception of justice.
There is the famous question in the Bible, “What is truth?” And who but a beauty judge would dare to judge
“beauty”?
Thus I have gone beyond the limitations of Gödel’s theorem, which loosely states if you have a
reasonably rich system of discrete symbols (the theorem does not refer to Mathematics in spite of the way it
is usually presented) then there will be statements whose truth or falsity cannot be proved within the
system. It follows if you add new assumptions to settle these theorems, there will be new theorems which
you cannot settle within the new enlarged system. This indicates a clear limitation on what discrete symbol
systems can do.
Language at first glance is just a discrete symbol system. When you look more closely, Gödel’s theorem
supposed a set of definite symbols with unchanging meaning (though some may be context sensitive), but
as you all know words have multiple meanings, and degrees of meaning. For example the word “tall” in a tall
building, a tall person, or a tall tale, has not exactly the same meaning each time it occurs. Indeed, a tone of
voice, a lift of an eyebrow, the wink of an eye, or even a smile, can change the meaning of what is being
said. Thus language as we actually use it does not fit into the hypotheses of Gödel’s theorem, and indeed it
just might be the reason language has such peculiar features is in life it is necessary to escape the limitations
of Gödel’s theorem.
Future Challenges in Computation
- The evolution of language remains a mystery, with current knowledge limited to guesswork regarding the selection forces of linguistic survival.
- Standard computers are currently restricted to handling discrete symbols, potentially limiting their ability to process complex, non-discrete phenomena.
- Neural networks may offer a different paradigm, where finite bandwidth and sampling rates define their operational equivalence.
- Past scientific progress has focused on 'easy problems,' leaving more complex, recalcitrant issues for future generations to solve.
- Addressing these remaining problems will likely require the invention of entirely new mathematical frameworks and novel ways of thinking.
- The potential for future discovery is vast, suggesting that what remains to be found far outweighs all past human knowledge.
The problems will not go away, hence you will be expected to cope with them, and I am suggesting at times you may have to invent new Mathematics to handle them.
We know so little about the evolution of language and the forces which selected one
version over another in the survival of the fittest language, that we simply cannot do more than guess at this
stage of knowledge of languages and the circumstances in which language developed and evolved.
The standard computers can presently handle only discrete symbols (though what some neural networks
handle may be another matter), and hence, apparently, there may be many things they cannot handle. As
noted in Chapter 19, if you assume neural nets have a finite usable bandwidth then the sampling theorem
gives you the equivalence of bandwidth and sampling rate.
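For reference, the usual statement of the sampling theorem can be written as below. The symbols are notation introduced here, not taken from the text: B is the bandwidth (the signal has no frequencies at or above B), T is the spacing of the samples, and f_s is the sampling rate.

```latex
x(t) \;=\; \sum_{n=-\infty}^{\infty} x(nT)\,
\frac{\sin\!\big(\pi (t - nT)/T\big)}{\pi (t - nT)/T},
\qquad T = \frac{1}{2B}, \quad f_s = \frac{1}{T} = 2B .
```

In this sense a finite bandwidth and a finite sampling rate describe the same signal: the discrete samples x(nT) determine the continuous x(t).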
I think in the past we have done the easy problems, and in the future we will more and more face problems
which are left over and require new ways of thinking and new approaches. The problems will not go away,
hence you will be expected to cope with them, and I am suggesting at times you may have to invent new
Mathematics to handle them. Your future should be exciting for you if you will respond to the challenges in
correspondingly new ways. Obviously there is more for the future to discover than we have discovered in
all the past!
24
Quantum Mechanics
The Birth of Quantum Mechanics
- Science provides descriptions of how the universe functions but remains unable to explain the underlying 'why' behind physical laws.
- Classical physics at the turn of the 20th century failed to explain discrete atomic spectra, atomic stability, and black body radiation.
- Max Planck discovered his famous constant by accident when he realized his formula only worked if he refused to take the mathematical limit to zero.
- The author applies this historical lesson by choosing function classes that match a researcher's specific field rather than relying on standard polynomials.
- Quantum mechanics gained momentum only after Einstein used Planck's quanta to explain the photoelectric effect and Bohr modeled discrete electron orbits.
Fortunately for Planck, the formula fitted only so long as he avoided the limit, and no matter how he took the limit the formula disappeared.
Most physicists currently believe they have the basic description of the universe [though they currently
admit 90% to 99% of the universe is in the form of “dark matter” of which they know nothing except it has
gravitational attraction]. You should realize in all of science there are only descriptions of how things
happen and nothing about why they happen. Newton gave us the formula for how gravity worked, and he
made no hypotheses as to what gravity really was, nor through what medium it worked, let alone why it
worked. Indeed, he did not believe in “action at a distance”.
The reasons for discussing quantum mechanics, QM, are: (1) it is basic physics, (2) it has many
intellectual repercussions, and (3) it provides a number of models for how to do things.
At the end of the 1800s and early 1900s physics was faced with a number of troubles. Among them were:
(1) classical physics dealt with continuously varying things, and clearly the spectra of atoms came in
discrete lines, (2) electric charges, when moving, other than in a straight line, should radiate energy, hence
the then current picture of the atom with the electron going around the center should radiate energy rapidly
and collapse into the nucleus, but obviously it was stable, (3) the black body radiation measured in the
laboratories had one shape, but the theories fitted one end or the other and each gave infinite energy for the
opposite end, and (4) many other troubles often centered around the discrete-continuous contradictions.
Max Planck (1858–1947) fitted the black body radiation experimental data with an empirical curve, and
it fitted so well he “knew” it was “the right formula”. He set out to derive it, but had troubles. Finally he
used a standard method of breaking up the energy into finite sizes, and then going to the limit. In the
calculus course we do the same sort of thing; the integral is approximated by a finite number of small
rectangles, these rectangles are summed, and then the limit taken as the largest width approaches zero.
Fortunately for Planck, the formula fitted only so long as he avoided the limit, and no matter how he took
the limit the formula disappeared. He finally, being a very good, honest physicist, decided he had to stop
short of the limit, and that is what defines Planck’s constant!
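A small numerical sketch may make “stopping short of the limit” concrete. It assumes the standard textbook result, not stated in the text, that an oscillator whose energy comes in steps of size eps has mean thermal energy eps/(exp(eps/kT) - 1). The function name and the units (kT = 1) are choices made only for this illustration.

```python
import math

def mean_energy(eps, kT=1.0):
    """Mean thermal energy of an oscillator whose energy comes in steps of size eps.

    Textbook formula, assumed here rather than taken from the text:
    <E> = eps / (exp(eps / kT) - 1).
    """
    return eps / math.expm1(eps / kT)

# As the step shrinks toward zero the mean energy approaches the classical
# value kT for every oscillator; large steps are strongly suppressed instead.
for eps in (4.0, 1.0, 0.1, 0.001):
    print(f"step {eps:>6}: mean energy {mean_energy(eps):.6f}")
```

Taking the limit eps -> 0 gives every oscillator the same energy kT, which is the classical result with its infinite total energy; keeping the step finite is what lets the formula fit the measured curve.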
The result was presented at a meeting (Dec. 1900) and later published, but was fairly well ignored. Even
Planck had little faith in it, until Einstein showed how the finite pieces of energy, called quanta, would also
explain the photoelectric effect. This got quantum mechanics going. But it still drifted, even though Bohr
devised a model of the atom in which the electrons were confined to definite orbits and emitted energy only
when they changed orbits. This model came from the spectral line theory which had been built up based on
arithmetical formulas with no known physical basis.
Before going on, let me discuss how this piece of history has affected my behavior in science. Clearly
Planck was led to create the theory because the approximating curve fitted so well, and had the proper form.
I reasoned, therefore, if I were to help anyone do a similar thing I had better represent things in terms of
functions they believed would be proper for their field rather than in the standard polynomials. I therefore
abandoned the standard polynomial approach to approximation, which numerical analysts and statisticians
among others use most of the time, for the harder approach of finding which class of functions I should use.
I generally find the class of functions to use by asking the person with the problem, and then use the facts they
feel are relevant, all in the hopes I will thereby, someday, produce a significant insight on their part. Well,
I never helped find so large a contribution as QM, but often by fitting the problem to their beliefs I did
produce, on their part, smaller pieces of insight.
The Duality of Quantum Mechanics
- Quantum Mechanics was simultaneously pioneered in 1925 by Werner Heisenberg and Erwin Schrödinger.
- Heisenberg's matrix mechanics focused strictly on measurable quantities like spectral lines.
- Schrödinger developed a wave-based approach inspired by the earlier theories of de Broglie.
- Both mathematical frameworks successfully identified discrete eigenvalues corresponding to energy levels.
- Despite their visual and conceptual differences, the two theories were proven to be mathematically equivalent.
- The historical development suggests that a single body of observations can be explained by multiple theoretical forms.
It was quickly shown by Schrödinger, Eckart, and others the two theories, though looking very much different, were, in many senses, equivalent to each other.
In 1925 the new QM was started by two people, Heisenberg and Schrödinger. Heisenberg adopted the
position he would refer only to measurable quantities, the spectral lines for example, and was led to the
matrix mechanics. Schrödinger adopted a wave type approach based on the earlier work of de Broglie and
found a corresponding theory. Both Mathematical structures, as you know, admit discrete eigenvalues, to be
identified with the discrete energy levels of the spectral lines. It was quickly shown by Schrödinger, Eckart,
and others the two theories, though looking very much different, were, in many senses, equivalent to each
other.
Moral: there need not be a unique form of a theory to account for a body of observations, instead two
Limits of Scientific Theory
- Scientific data cannot produce a unique theory, as multiple internal structures can yield identical input-output results.
- Quantum Mechanics and Relativity replaced Newtonian physics by addressing phenomena at extreme scales of size, speed, and energy.
- The wave-particle duality remains a fundamental paradox that educators admit cannot be explained, only accepted through familiarity.
- The inability to resolve quantum paradoxes suggests there may be inherent biological limits to human thought and cognition.
- Quantum probability is an intrinsic property of individual particles rather than a statistical average of a larger set.
- The development of Quantum Mechanics was a process of 'groping around' and interpreting symbolic effects after the fact.
There are smells you cannot smell, wavelengths of light you cannot see, sounds you cannot hear, all based on the limits of your sense organs, so why do you object to the observation that, given the wiring of the brain you have, there can be thoughts you cannot think?
rather different looking theories can agree on all the predicted details. You cannot go from a body of data to
a unique theory! I noted this in the last chapter.
Another story will illustrate this point clearly. Some years ago when I took over a Ph.D. thesis from
another professor I soon found they were using random input signals and measuring the corresponding
outputs. I also found it was “well known” (meaning it was known, but almost never mentioned) quite
different internal structures of the black boxes they were studying could give exactly the same outputs,
given the same inputs of course. There was no way, using the types of measurements they were using, to
distinguish between the two quite different structures. Again, you cannot get a unique theory from a set of
data.
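A toy version of this situation can be sketched as follows. It is not the system from the thesis, which the text does not describe; the two-stage filters, the coefficients 2 and 3, and the input list are all assumptions chosen so the arithmetic is exact.

```python
def fir_stage(xs, c):
    """One simple linear stage: y[n] = x[n] + c * x[n-1], starting from rest."""
    prev, ys = 0, []
    for x in xs:
        ys.append(x + c * prev)
        prev = x
    return ys

def black_box(xs, c_first, c_second):
    """Two stages in cascade; returns (internal signal, output)."""
    internal = fir_stage(xs, c_first)
    return internal, fir_stage(internal, c_second)

xs = [1, 4, -2, 7, 0, 3]                 # stand-in for a measured input signal
inner_a, out_a = black_box(xs, 2, 3)     # box A: stage "2" then stage "3"
inner_b, out_b = black_box(xs, 3, 2)     # box B: the same stages in the other order

print(out_a == out_b)       # the outputs agree for this input
print(inner_a == inner_b)   # the internal signals do not
```

Because the stages are linear, swapping their order changes what happens inside the box but not the relation between input and output, so no input-output measurement of this kind can tell the two boxes apart.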
The new QM dates from about 1925 and has had great success. It supposes energy, and many other things
in physics, come in discrete chunks, but the chunks are so small we, who are relatively large objects with
respect to the chunks, simply cannot perceive them other than with delicate experiments or in peculiar
situations.
The situation was, therefore, classical Newtonian mechanics, which had been very well verified in so
many ways and had even successfully predicted the positions of unknown planets, was being replaced by two
theories, relativity at high speeds, large masses, and high energies, and QM at small sizes. Both theories
were at first found to be nonintuitive, but as time passed they came to be accepted widely, the special theory
of relativity being the more so. You may recall in Newton’s time gravity (action at a distance) was not felt
to be reasonable.
Newton had inferred light was particulate in nature, though he also had his “fits” of the parts. Initially
light was thought to be made of particles which travelled in straight lines, but Young’s wave picture of
light, which is the one you have probably been taught in optics courses, came to dominate the particle
model. We now have to face the fact light apparently comes in quanta, and the quanta appear to be both
particles and waves. Almost every professor when teaching QM is forced, one way or the other, to say, “I
cannot explain this duality, you will get used to it!”
Again I stop and remark to you the obvious lessons to learn from this wave-particle duality. With almost
70 years, and no decent explanation of the duality, one has to ask, “Is it possible this is one of those things
we cannot think?” Or possibly it is only that it cannot be put into words. There are smells you cannot smell,
wavelengths of light you cannot see, sounds you cannot hear, all based on the limits of your sense organs,
so why do you object to the observation that, given the wiring of the brain you have, there can be thoughts
you cannot think? QM offers a possible example. In almost 70 years and all the clever people who have
taught QM, no one has found a widely accepted explanation of the fundamental fact of QM, the wave-particle
duality. You simply have to get used to it, so they claim.
This in turn shows while they were developing the theory they were groping around not really “knowing”
what they were doing. When they found an effect in the symbols they could interpret in the real world they
would then claim a step forward. Well along in the process of creating QM Born observed from the wave
function, in the Schrödinger theory, the square of the amplitude is to be interpreted as a probability of
observing something. Similarly for the matrix mechanics of Heisenberg. Complex numbers dominated the
whole theory from the beginning, hence the need to take the square of the absolute value to get a real
probability. Dirac observed a photon only interfered with itself, hence the probability was to be assigned to
the individual photons, hence in QM probability is not an average property of the set of all photons (or
electrons from the Davisson-Germer experiment) as many probability books define probability.
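The step from complex amplitude to real probability can be sketched in a few lines. The two “paths” and their phases are hypothetical values picked for illustration; only the rule itself (square of the absolute value) comes from the text.

```python
import cmath
import math

def probability(amplitude):
    """Square of the absolute value of a complex amplitude (always real)."""
    return abs(amplitude) ** 2

# Two hypothetical paths to the same detector: equal size, opposite phase.
a1 = cmath.rect(1 / math.sqrt(2), 0.0)
a2 = cmath.rect(1 / math.sqrt(2), math.pi)

separately = probability(a1) + probability(a2)   # add the probabilities
together = probability(a1 + a2)                  # add the amplitudes first

print(separately, together)
```

Adding probabilities and adding amplitudes give very different answers here, since amplitudes of opposite phase cancel before the square is taken. That cancellation is what an interference pattern is, and it is why the complex numbers cannot be dropped from the theory.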
Heisenberg and Conjugate Variables
- Werner Heisenberg derived the fundamental uncertainty principle governing quantum mechanics.
- The principle identifies a specific relationship between conjugate variables.
- Conjugate variables are mathematically defined as Fourier transforms of one another.
- This relationship implies a physical limit to the precision of simultaneous measurements.
Heisenberg derived the uncertainty principle that conjugate variables, meaning Fourier transforms,
Heisenberg derived the uncertainty principle that conjugate variables, meaning Fourier transforms,
Quantum Uncertainty and Free Will
- The author argues that the uncertainty principle in quantum mechanics is a mathematical property of linear models rather than a physical effect of nature.
- Debates over 'hidden variables' and the probabilistic nature of reality remain unresolved, with mathematical proofs often being found fallacious over time.
- The text posits that humans are 'rationalizing' rather than 'rational' animals, often choosing beliefs based on desire rather than logic.
- The author rejects the idea that quantum mechanics provides a basis for free will, noting that a probabilistic universe does not necessarily grant human agency.
- The 'atoms and void' perspective of modern physics is criticized for ignoring phenomena like self-awareness and consciousness.
- Recent experiments by Alain Aspect regarding particle polarization and entanglement present new, 'bothersome' challenges to classical physical intuition.
Man is not a rational animal, he is a rationalizing animal.
obeyed a condition in which the product of the uncertainties of the two had to exceed a fixed number,
involving Planck’s constant. I earlier commented, Chapter 17, this is a theorem in Fourier transforms; any
linear theory must have a corresponding uncertainty principle, but among physicists it is still widely
regarded as a physical effect from Nature rather than a Mathematical effect of the model.
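For reference, the Fourier statement behind this remark is usually written as below. The notation is introduced here and is not from the text: σ_x is the spread (standard deviation) of |ψ(x)|², σ_k is the spread of the squared magnitude of its Fourier transform, and the step to momentum uses the de Broglie relation p = ħk.

```latex
\sigma_x \, \sigma_k \;\ge\; \tfrac{1}{2}
\qquad\Longrightarrow\qquad
\sigma_x \, \sigma_p \;\ge\; \frac{\hbar}{2},
\qquad p = \hbar k, \quad \hbar = \frac{h}{2\pi} .
```

The left-hand inequality holds for any function and its Fourier transform and contains no physics; Planck’s constant enters only through the identification of k with momentum.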
That the probability of events was all the theory supplied made many people wonder if below this level
of these parts of Nature there might still be a perfectly definite structure, and we were seeing only the
statistical mechanics of it (but see Dirac’s observation above). Von Neumann in his classic work on QM
proved there were no hidden variables, meaning there was no lower structure and Nature was essentially
probabilistic, a point Einstein never would accept. But the proof was found to be fallacious, new proofs
found, and in their turn found to be fallacious; the current situation being a toss up as to what you want to
believe.
Man is not a rational animal, he is a rationalizing animal.
Hence you will find that often what you believe is what you want to believe rather than being the result of
careful thinking.
This probabilistic basis of QM, with nothing definite below it, attracted the attention of many
philosophers, and the general subject of free will was bandied about by them. The classical statement against
free will is the remark, “You being what you are, the situation being what it is, can you do other than as you
do?” There is apparently no way the question can be settled experimentally, so the arguments go on.
Personally, and it is only my belief, I can see no connection between the two; that Nature basically may be
probabilistic does not mean we are able to affect it in any way, hence we cannot “choose”, that is if you
accept the forces of official physics are all there is. Back in ancient Greek days, Democritus (about 460
B.C.) said, “All is atoms and void”. This is still the basic position of most physicists; they believe they know
everything there is (in the sense there are no unknown forces they have not detected).
It is a religious question to a great extent; you can believe as you wish in this matter. If we have no free
will, then the widespread belief in punishment by God (or gods) for our deeds seems a bit unfair; we must
do as we do if you accept the deterministic approach! On the other hand, if it is sensible to believe in justice
from our God (or gods) then some sort of free will ought to be around. (Calvinists to the contrary.) And, of
course, “infinite mercy” implies being forgiven for everything you ever do; see the Amida Buddha sect in
Japan around the year 1000 A.D. for the extreme of such beliefs.
I do not believe it is reasonable to argue such questions based on QM. I doubt, between you and me, the
physicists know everything. In my old age I have come to the belief there are such things as
self-awareness, self-consciousness, which cannot be ignored as they are ignored in the “atoms and void”
theories. But how such things, if they exist (and in what senses they do exist) can interact with the real
world of atoms is not a bit clear to me. The psychophysical parallelism theory (the psychical and physical
worlds go on independent parallel tracks with no interconnections but they always agree perfectly) I was
taught in an early psychology course, seemed to me, even then, to be utterly foolish. So I have nothing to
offer you in these matters, except not to depend on QM for much support for your beliefs.
But worse things were to come in QM. Alain Aspect, in Paris, has done some experiments which are
bothersome to say the least. Two particles with opposite spins are sent in opposite directions. The
polarization of them is not known, but it is believed when one is measured then the other will be found in
exactly the opposite polarization.
Quantum Non-Locality and Understanding
- Quantum mechanics asserts that measurement collapses the wave function, creating immediate effects across remote distances.
- The Aspect experiments demonstrate non-local effects where entangled systems interact instantaneously regardless of separation.
- While these effects seem to contradict relativity, they cannot be used for faster-than-light signaling, preserving a fragile theoretical harmony.
- Einstein and others resisted non-locality, yet Bell's inequalities and subsequent experiments have largely confirmed its reality.
- Human intuition, evolved for the macroscopic scale, fails to 'understand' quantum mechanics in the classical sense.
- Mathematical structures provide a vital tool for coping with and predicting phenomena that our brains are not wired to intuitively grasp.
QM is stranger than we ever believed, and seems to get stranger the longer we study it.
It is also a basic belief of QM it is only the act of measurement which puts
the wave function into some definite state; before measurement you have only the probability distribution.
Thus the orientation of one measuring device at one end of the experiment will immediately (and we
apparently mean immediately) affect what is measured at the other, remote end of the experiment, some
12 meters or so away! And this may at first seem to contradict both the special and general theories of
relativity! I said “seem” because the theories predict you can do no useful signaling at faster than the
velocity of light. One can swing a bright light beam, as from a lighthouse, so rapidly a point far out goes
faster than the velocity of light; but you cannot signal faster according to the two theories. The Aspect
experiments apparently force you to accept non-local effects: what happens at one place is affected by
remote things and the effect which is transmitted does not, in any real sense, pass through the local areas in
between but gets there immediately. But apparently you cannot use the effect for useful signaling.
Others have done similar experiments showing the same kind of effect. There are, apparently, non-local
effects in QM. Two systems which were once “entangled”, as they say, can forever interact; there is no such
thing as an isolated object, much as we talk about using them in classical experiments. Einstein could not
accept non-local effects, nor can many other people. But the experiments have been around for more than a
decade and many hypotheses have been devised to get around the conclusion of non-local effects, but few
of them have gotten much acceptance among physicists.
Einstein did not like the idea of non-local effects, and he produced the famous Einstein-Podolsky-Rosen
paper (EPR), which showed there were restraints on what we could observe if there were non-local effects.
Bell sharpened this up into the famous “Bell inequalities” on the relationships of apparently independent
probability measurements, and this result is now widely accepted. Non-local effects seem to mean
something can happen instantaneously without requiring time to get from cause to effect, similar to the
states of polarization of the two particles of the Aspect experiments.
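One common form of a Bell inequality, the CHSH form, can be checked numerically. This is a sketch under stated assumptions, not a model of the Aspect apparatus: it uses the textbook quantum prediction E(a, b) = -cos(a - b) for spin measurements on a pair in the singlet state, and the four angles are the usual choice that gives the largest violation. The function names are made up for the example.

```python
import math

def correlation(a, b):
    """Assumed textbook prediction for a singlet pair measured along angles
    a and b (radians): E(a, b) = -cos(a - b)."""
    return -math.cos(a - b)

def chsh(a, a2, b, b2):
    """CHSH combination S; any local hidden-variable model keeps |S| <= 2."""
    return (correlation(a, b) - correlation(a, b2)
            + correlation(a2, b) + correlation(a2, b2))

s = chsh(0.0, math.pi / 2, math.pi / 4, 3 * math.pi / 4)
print(abs(s), 2 * math.sqrt(2))
```

The quantum value of |S| exceeds the bound of 2 that any theory with purely local, pre-assigned values must obey, which is the sense in which the measured correlations rule such theories out.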
So once more QM has flatly contradicted our beliefs and instincts which are, of course, based on the
human scale and not on the microscopic scale of atoms. QM is stranger than we ever believed, and seems to
get stranger the longer we study it.
It is important to notice, while I have indicated maybe we can never understand QM in the classical sense
of “understand”, we have nevertheless created a formal Mathematical structure which we can use very
effectively. Thus, as we go into the future and perhaps meet many more things we cannot “understand”, still
we may be able to create formal Mathematical structures which will enable us to cope with the fields.
Unsatisfactory? Yes! But it is amazing how you get used to QM after you work with it long enough. It is
much the same story as your handling complex numbers: all the professor’s words about complex
arithmetic, being equivalent to ordered pairs of real numbers with a peculiar rule for multiplication, meant
little to you; your faith in the “reality” of complex numbers came from using them for a long time and
seeing they often gave reasonable, useful predictions. Faith in Newton’s gravitation (action at a distance)
came the same way.
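The “peculiar rule for multiplication” on ordered pairs is short enough to write out. The function name and the sample values are chosen for illustration; the rule itself is the standard one, (a, b)(c, d) = (ac - bd, ad + bc).

```python
def pair_mul(p, q):
    """Multiply two ordered pairs of reals: (a, b)(c, d) = (ac - bd, ad + bc)."""
    (a, b), (c, d) = p, q
    return (a * c - b * d, a * d + b * c)

i = (0.0, 1.0)
print(pair_mul(i, i))                      # the pair that plays the role of i*i

z = pair_mul((1.0, 2.0), (3.0, -1.0))      # pairs for 1+2i and 3-i
w = complex(1, 2) * complex(3, -1)         # the same product, built-in complex
print(z, (w.real, w.imag))
```

The pair (0, 1) multiplied by itself gives (-1, 0), so nothing more mysterious than two real numbers and a bookkeeping rule is needed to get a square root of -1.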
I do not pretend to know in any detail what the future will reveal, but I believe since at every stage of
advance we tend to attack the easier problems the future will include more and more things our brains,
being wired as they are, cannot “understand” in the classical sense of understand. Still the future is not
hopeless. I suspect we will need many different Mathematical models to help us, and I do not think this is
only a prejudice of a Mathematician.
The Frontier of Understanding
- Future opportunities in science require the intellectual courage to create and apply new forms of mathematics.
- The concept of 'understanding' remains elusive and difficult to define explicitly, much like St. Augustine's struggle to define time.
- Computers are central to human progress by providing tools for simulation and forcing us to confront new philosophical questions.
- Emerging experimental techniques in quantum mechanics are challenging long-held beliefs about particle indistinguishability and the uncertainty principle.
- Modern technology is shifting once-pure theoretical claims into the realm of experimental verification.
- Deeper experimental probing of tiny particles is likely to produce fundamentally new technologies for human use.
We all know what we mean by “understand” until we try to say explicitly just what it means, and then it sort of fades away!
Thus the future should be full of interesting opportunities for those
who have the intellectual courage to think hard and use Mathematical models as a basis for “understanding”
Nature. Creating and using new and different kinds of Mathematics seems to me to be one of the things
you can expect to have to do if you are to get the “understanding” you would like to have. The Mathematics
of the past was designed to fit the obvious situations, and as just mentioned we have tended to examine them
first. As we explore new areas we can expect to need new kinds of Mathematics, and even to merely follow
the frontier you will have to learn them as they arise!
I have put the word “understand” in quotes because I do not even pretend to know what I mean by it. We
all know what we mean by “understand” until we try to say explicitly just what it means, and then it sort of
fades away! St. Augustine (died 430 A.D.) observed he knew what “time” was until you asked him about it,
and then he did not know! I leave it to you in the future to try to explain (better than I can) what you mean
by the word “understand”.
This brings me to another theme of this book; progress is making us face ourselves in many ways, and
computers are very central in this process. Not only do they ask us questions never asked before, but they
also give us new ways of answering them. Not just in giving numerical answers, but in providing a tool to
create models, simulations if you prefer, to help us cope with the future. We are not at the end of the
Computer Revolution, we are at the start, or possibly near the middle, of it.
I must make caveats if I am to be honest in these matters. It is traditional, and almost always assumed in
Quantum Mechanics, the probability distribution belongs to the particle. Long ago Landé suggested in the
two slit experiment the probability distribution belonged to the apparatus, not the photon, or the electron.
This makes much of the mysticism, including Feynman’s assertion the wave particle duality is
fundamentally a paradox, seem to disappear. Landé has been almost uniformly ignored, but experiments
now planned, or already done, may revive his opinion. We are currently successfully confining single atoms
for long periods so we think we know what we have, we are able to “tag” a single atom by putting it in an
excited state and recognize it later, and hence the old statistics which assumed particles were
indistinguishable are coming under scrutiny. Long ago Davisson and Germer showed electrons also reveal an
interference pattern, and there is not a fundamental difference between photons and electrons in this matter.
We are now able to do the two slit experiment with some of the lighter atoms, with, of course, much finer
interference patterns. There is a proposal to “tag” an atom in the two slit experiment, and set things up so in
going through a slit a photon will be emitted, and hence we will know which path the atom took through the
apparatus. Such experiments make the uncertainty principle a subject for experimental verification rather
than just a theoretical claim. Modern technology is making possible many such experimental refinements,
hence, broadly speaking, what was once pure theory becomes subject to experimental verification. It seems
to me as a result we will probably have to revise a lot of our beliefs, though it seems likely much of QM
will remain.
I can only speculate that this deeper experimental probing of our theories will, in the long run, produce fundamentally new things to be adapted for human use, though the experiments themselves involve only the tiniest of particles. Certainly, past history suggests this, so you cannot afford to remain totally ignorant of this exciting frontier of human knowledge.
25
Creativity
Defining Creative Novelty
- The terms creativity, originality, and novelty are often used interchangeably despite having distinct nuances.
- Defining these terms is essential for understanding how we value new ideas versus established traditions.
- Primitive societies and modern large organizations often prioritize ancestral methods over individual innovation.
- Novelty alone does not equate to value, as demonstrated by the triviality of multiplying two random large numbers.
- The author suggests that true creativity requires more than just doing something that has never been done before.
This is also true in many large organizations today; the elders are sure they know how the future should be handled and the younger members of the tribe when they do things differently are not appreciated.
Creativity, originality, novelty, and such words are regarded as “good things”, and we often fail to distinguish between them; indeed we find them hard to define. Surely we do not need three words with exactly the same meaning, hence we should try to differentiate somewhat between them as we try to define them. The importance of definitions has been stressed before, and we will use this occasion to illustrate an approach to defining things, not that we will succeed perfectly or even well.
It should be remarked in primitive societies creativity, originality, and novelty are not appreciated, rather doing as one’s ancestors did is the proper thing to do. This is also true in many large organizations today; the elders are sure they know how the future should be handled and the younger members of the tribe when they do things differently are not appreciated.
Long ago a friend of mine in computing once remarked he would like to do something original with a computer, something no one else had ever done. I promptly replied, “Take a random 10 decimal digit number and multiply it by another random 10 digit number and it will almost certainly be something no one else has ever done”. There are, using back of the envelope computing, about $(81/2)\times 10^{18}$ such products, and
Defining Creativity and Originality
- Originality requires more than just novelty; a random act or a simple calculation lacks the effort or coincidence necessary to be considered significant.
- The art world often confuses shock value and novelty with true creativity, leading to a disconnect between modern artists and the public.
- Creativity implies a sense of value, though this value is subjective and may only be recognized by a small group of experts or future generations.
- The delayed acceptance of theories like continental drift and Mendelian genetics proves that science, like art, often fails to recognize creative breakthroughs in their own time.
- Societal definitions of creativity are often contradictory, such as in fashion where it means being different but not too different.
By a kind of inverted logic it does allow many people to believe because they are unappreciated therefore they must be a great artist!
with only around $3\times 10^{16}$ nanoseconds in a year you can estimate the odds of it being an original product. Naturally he was not pleased with the suggestion, but he would have gladly settled for computing the largest known prime number up to that time! Why the difference? Why would one number go into a record book, at least temporarily, and not the other? For one thing, records require either a great deal of effort to accomplish or else a remarkable coincidence, and the random multiplication had neither so far as the average person can see. Evidently “not done before” is hardly enough to make anything important or original. “Originality” seems to be more than not having been done before.
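The back of the envelope figures in this story can be checked with a few lines of arithmetic. The sketch below is only an illustration and rests on stated assumptions: a "10 decimal digit number" is taken to have no leading zero, the order of the two factors is ignored, a year is taken as 365 days, and the rate of one multiplication per nanosecond is a hypothetical machine speed chosen for scale. It counts pairs of factors, not distinct products, since different pairs can occasionally give the same product.

```python
# Rough check of the two figures quoted in the text.
# Assumption: a 10 decimal digit number has no leading zero.
ten_digit_numbers = 9 * 10**9  # 1,000,000,000 through 9,999,999,999

# Multiplication is commutative, so count unordered pairs (a, b) with a <= b.
unordered_pairs = ten_digit_numbers * (ten_digit_numbers + 1) // 2

# Nanoseconds in a 365-day year.
ns_per_year = 365 * 24 * 60 * 60 * 10**9

# Hypothetical rate: one multiplication every nanosecond, without pause.
years_to_exhaust = unordered_pairs / ns_per_year

print(f"pairs of factors : {unordered_pairs:.3e}")
print(f"ns in a year     : {ns_per_year:.3e}")
print(f"years to try all : {years_to_exhaust:.0f}")
```

Under these assumptions the pair count comes out close to (81/2)×10^18 and the year close to 3×10^16 nanoseconds, so even a machine running at that hypothetical rate would need on the order of a thousand years to try every pair. A randomly chosen product is therefore very unlikely to have been computed before.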
The Art world, especially painting, has had a great deal of trouble with the distinction between creativity and originality for most of this century. Modern artists, and Museum Directors, offer to the public things which are certainly novel and new, but which many of the potential paying public often do not like. For many people the shock value of various forms of art has finally worn off, and the average person no longer responds to the current “modern art”. After all, I could paint a picture and it would be new and novel, but I would hardly consider it as a “creative work of Art”, whatever that means.
Evidently we want the word “creative” to include the concept of value, but value to whom? A new theorem in some branch of mathematics may be a creative act, but the number of people who can appreciate it may be very few indeed, so we must be careful not to insist the created thing be widely appreciated. We also have the fact many of the current highly valued works of Art were not appreciated during the artist’s lifetime; indeed the phenomenon is so common as to be discouraging. By a kind of inverted logic it does allow many people to believe because they are unappreciated therefore they must be a great artist!
I hope the above has disentangled some of the confusion between creativity, novelty, and originality, but I am not able to say just what this word “creativity”, which we value so much in our society, actually means. In women’s fashions it seems to mean “different”, but not “too different”!
I must continue for now using your intuitive feelings as to what the creative act is and how to recognize it. In 1838 Thomas Dick published a book in which what is now called “continental drift” was clearly mentioned, and in the early 1900s Wegener published a book devoted to the topic of continental drift, but it was only after WWII continental drift was accepted in official circles. So Art is not the only field in which creativity is not recognized when it happens; Science has its failings too. One can also cite Mendel (1822–1884) and his experiments with peas, which were ignored until three people in 1900 simultaneously rediscovered genetics, and then still later found Mendel’s paper! In genetics Mendel now generally gets the
Defining Creative Synthesis
- Creativity is often the act of combining elements from established fields rather than inventing entirely new concepts.
- The perceived degree of creativity is not necessarily proportional to the difficulty of the execution.
- Applying standard methodologies from one discipline to another can result in highly influential and frequently cited work.
- A key component of creative success is the 'useful' integration of ideas previously thought to be unrelated.
- The psychological distance between the combined elements may be the primary measure of a creative act's significance.
Creativity seems, among other things, to be “usefully” putting together things which were not perceived to be related before, and it may be the initial psychological distance between the things which counts most.
public credit, but with continental drift it is often credited to the post WW-II creators.
In a discussion about creativity someone observed to me if he took parts of three extensively developed fields and combined them simply then it would be a large creative act; the degree of creativity does not depend on how hard the actual act is to do, so far as it appears to later generations. I once applied the well known method of least squares to a problem in magnetics. The other person wrote it up, with me as joint author, and sent it to me for my signature (for release for publication). I went to a shrewd physicist friend and said I could not publish a paper which merely applied least squares. He observed to me his most requested reprint was for a paper in solid state physics which applied standard circuit analysis to the problem; and since the paper awaiting my signature was new in the area I should sign and let it be published.
Creativity seems, among other things, to be “usefully” putting together things which were not perceived to be related before, and it may be the initial psychological distance between the things which counts most. How difficult was it for me to discard $L_2$ and use $L_1$ when considering the distance between two strings of
The Mechanics of Creativity
- Creativity often arises from a specific set of the mind rather than formal brainstorming sessions, which have generally proven ineffective when strictly scheduled.
- While the author admits creativity cannot be taught through a simple formula, he argues that an individual's creative style can be improved through awareness and experience.
- The creative process typically begins with a deep emotional involvement and a period of problem refinement to avoid falling into conventional solutions.
- A period of temporary abandonment or 'gestation' is often essential, allowing the subconscious to work on the problem away from monomaniacal pursuit.
- The moment of insight is frequently followed by a cycle of failure and revision, where false starts serve to sharpen the eventual successful approach.
I often suspect creativity is like sex; a young lad can read all the books you have on the topic, but without direct experience he will have little chance of understanding what sex is, but even with experience he may still not understand what is going on!
bits? All that can be said was it had apparently not been done before and doing so advanced the field significantly (at the same time maximum likelihood occurred in Shannon’s Information Theory papers, and it is equivalent to $L_1$).
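The difference between the two distances, and the remark about maximum likelihood, can be made concrete with a small sketch. Everything named here is an assumption for illustration only: the function names, the two-word code, the received string, and a binary symmetric channel in which each bit is flipped independently with probability p less than 1/2. On bit strings the L1 distance simply counts the positions that differ, while L2 is the square root of that count.

```python
import math

def l1_distance(a: str, b: str) -> int:
    """Count the positions where two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("bit strings must have the same length")
    return sum(abs(int(x) - int(y)) for x, y in zip(a, b))

def l2_distance(a: str, b: str) -> float:
    """Euclidean distance between the same strings viewed as 0/1 vectors."""
    if len(a) != len(b):
        raise ValueError("bit strings must have the same length")
    return math.sqrt(sum((int(x) - int(y)) ** 2 for x, y in zip(a, b)))

def likelihood(sent: str, received: str, p: float) -> float:
    """Probability of `received` given `sent` when each bit flips with probability p."""
    d = l1_distance(sent, received)
    return p ** d * (1 - p) ** (len(sent) - d)

# Hypothetical example data: a two-word code and one received string.
codewords = ["0000000", "1111111"]
received = "1011001"
p = 0.1  # assumed bit-flip probability, below 1/2

# Decode two ways: smallest L1 distance, and largest likelihood.
nearest = min(codewords, key=lambda c: l1_distance(c, received))
most_likely = max(codewords, key=lambda c: likelihood(c, received, p))

print(l1_distance("1011001", "1001011"), l2_distance("1011001", "1001011"))
print(nearest, most_likely)
```

Because the likelihood under this assumed channel is p^d (1-p)^(n-d), which falls steadily as the L1 distance d grows whenever p is below 1/2, choosing the most likely codeword and choosing the nearest one in L1 give the same answer.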
It appears to be the “set of the mind” at the creative moment that enables creativity to be done. Can we do anything to increase creativity? There are training courses, and books, as well as “brain storming sessions” which are supposed to do this. Taking the “brain storming sessions” first, while they were very fashionable at one time, they have generally been found to be not much good when formally done, when a brain storming session is carefully scheduled. But we all have had the experience of “tossing an idea around” with a friend, or a few friends (but not a large group, generally) from which insight, creativity, or whatever you care to call it, arises and we make progress. As for the many other approaches to creativity, again the record does not show any one approach has been so successful as to produce a great number of dominant figures in Science or any other field.
It should be evident, from the fact I am using a whole chapter on the topic, I think creativity in an individual can probably be improved. Indeed, it has been a topic in much of the course, though I have often called it “style”. I believe the future will have even greater need for new, creative, ideas than had the past, hence I must do what I can to increase the probability you will form your own effective style and have “great ideas”. But except for discussing the topic, making you aware of it, and indicating what we think we know about it, I have no real suggestions (I can put into concrete words) on how to make you, magically, more creative in your careers. The topic is too important to ignore, even if I do not understand the creative act very well. Better I should try to do it, a person you know who has experienced it many times, than you get it from some people who themselves have never done a significant creative act. I often suspect creativity is like sex; a young lad can read all the books you have on the topic, but without direct experience he will have little chance of understanding what sex is, but even with experience he may still not understand what is going on! So we must continue, even if we are not at all sure we know what we are talking about.
Introspection, and an examination of history and of reports of those who have done great work, all seem to show typically the pattern of creativity is as follows. There is first the recognition of the problem in some dim sense. This is followed by a longer or shorter period of refinement of the problem. Do not be too hasty at this stage, as you are likely to put the problem in the conventional form and find only the conventional solution. This stage, moreover, requires your emotional involvement, your commitment to finding a solution, since without a deep emotional involvement you are not likely to find a really fundamental, novel solution.
A long gestation period of intense thinking about the problem may result in a solution, or else the temporary abandonment of the problem. This temporary abandonment is a common feature of many great creative acts. The monomaniacal pursuit often does not work; the temporary dropping of the idea sometimes seems to be essential to let the subconscious find a new approach.
Then comes the moment of “insight”, creativity, or whatever you want to call it: you see the solution. Of course it often happens that you are wrong; a closer examination of the problem shows the solution is faulty, but might be saved by some suitable revision. But maybe the problem needs to be altered to fit the solution! That has happened! More usually it is back to the drawing board, as they say, more mulling things over.
The false starts and false solutions often sharpen the next approach you try. You now know how not to do it!
Managing the Creative Subconscious
- Problem-solving efficiency increases as you eliminate failed approaches and sharpen the definition of what a potential solution must look like.
- The subconscious can be managed by saturating it with a single problem for days or weeks, depriving it of other distractions until a solution emerges.
- Creativity relies heavily on reasoning by analogy, where a current problem is linked to previously stored knowledge from diverse fields.
- Effective knowledge storage requires examining ideas from multiple angles and focusing on fundamentals to create mental 'hooks' for future retrieval.
- Analogies do not need to be perfect to be valuable; even a poor or partial analogy can suggest the necessary next step in a complex process.
- The final stage of creativity involves logical cleaning and reorganization to translate idiosyncratic insights into a form others can understand.
My method, and it is implied above, is to saturate the subconscious with the problem, try to not think seriously about anything else for hours, days, or even weeks.
You have a smaller number of approaches left to explore. You have a better idea of what will not work
and possibly why it will not work.
When stuck I often ask myself, “If I had a solution, what would it look like?” This tends to sharpen up the approach, and may reveal new ways of looking at the problem you had subconsciously ignored but you now see should not be excluded. What must the solution involve? Are there conservation laws which must apply? Is there some symmetry? How does each assumption enter into the solution, and is each one really necessary? Have you recognized all the relevant factors?
Out of it all, sometimes, comes the solution. So far as anyone understands the process it arises from the subconscious; it is suddenly there! There is often a lot of further work to be done on the idea, the logical cleaning up, the organizing so others can see it, the public presentation to others which may require new ways of looking at the problem and your solution, not just your idiosyncratic way which gave you the first solution. This revision of the solution often brings clarity to you in the long run!
If the solution does come from the subconscious, what can we do to manage our subconscious? My method, and it is implied above, is to saturate the subconscious with the problem, try to not think seriously about anything else for hours, days, or even weeks, and thus the subconscious which, so far as we know, depends heavily upon live experiences to form its dreams, etc. is then left with only the problem to mull over. We simply deprive it of all else as best we can! Hence, one day, we have the solution, either as we awake, or it pops into our mind without any preparation on our part, or as we pick up the problem again there the solution is! In a way, I am repeating Pasteur, “Luck favors the prepared mind”. You prepare your mind for success “by thinking on it constantly” (Newton), and occasionally you are lucky.
Probably the most important tool in creativity is the use of an analogy. Something seems like something else which we knew in the past. Wide acquaintance with various fields of knowledge is thus a help, provided you have the knowledge filed away so it is available when needed, rather than to be found only when led directly to it. This flexible access to pieces of knowledge seems to come from looking at knowledge while you are acquiring it from many different angles, turning over any new idea to see its many sides before filing it away. This implies effort on your part not to take the easy, immediately useful “memorizing the material” path, but prepare your mind for the future. It is for this reason I have urged you in many of the chapters to get down to the fundamentals of a field, since it implies you must examine things many ways before you can decide what is fundamental and what is frills. In fact, for one person they may be in one order, and for another in the opposite order. What is fundamental partly depends on the individual and their mental makeup. It is obvious you need many “hooks” on the knowledge if you are to use it in new situations.
We reason mainly by analogy. But it is curious a valuable analogy need not be close; it need only be suggestive of what to do next. A dream by Kekule about snakes biting their own tails suggested to him, when he awoke, the ring structure of carbon compounds! Many a poor analogy has proved useful in the hands of experts. This implies the analogy you use is only partial and you need to be able to abandon it when it is pressed too far; analogies are seldom so perfect that every detail in one situation exactly matches those of the other. We find the analogies when something reminds us of something else; is it only a matter of the “hooks” we have in our minds?
Over the years of watching and working with John Tukey I found many times he recalled the relevant
information and I did not, until he pointed it out to me.
Cultivating the Creative Mind
- Effective information retrieval depends on creating mental 'hooks' by repeatedly mulling over new ideas and imagining future applications.
- Creativity is often spurred by the 'constant impinging of reality' and external pressure rather than ideal, peaceful environments.
- Self-management techniques, such as setting firm deadlines to induce a 'cornered rat' state, can force creative breakthroughs through pride and necessity.
- Creativity is not an innate gift but a skill that can be taught by consciously changing one's habits and self-perception.
- Personal transformation requires starting with small behavioral changes to build the self-confidence necessary for larger reformations.
- In a world of rapidly evolving technology, taking charge of one's own mental organization is essential for leadership rather than mere following.
In the past I have deliberately managed myself in this matter by promising a result by a given date, and then, like a cornered rat, having at the last minute to find something!
Clearly his information retrieval system had many more “hooks” than mine did. At least more useful ones! How could this be? Probably because he was more in the habit than I was of turning over new information again and again so his “hooks” for retrieval were more numerous and significantly better than mine were. Hence wishing I could similarly do what he did, I started to mull over new ideas, trying to make significant “hooks” to relevant information so when later I went fishing for an idea I had a better chance of finding an analogy. I can only advise you to do what I tried to do: when you learn something new think of other applications of it, ones which have not arisen in your past but which might in your future. How easy to say, but how hard to do! Yet, what else can I say about how to organize your mind so useful things will be recalled readily at the right time?
Many books are written these days on the topic of creativity; we often talk about it, and we even have whole conferences devoted to it, yet we can say so little! There is much talk about having the right surrounding atmosphere, as if that mattered much! I have seen the creative act done under the most trying circumstances. Indeed, I often suspect, as I will later discuss more fully, what the individual regards as ideal conditions for creativity is not what is needed, but rather the constant impinging of reality is often a great help.
In the past I have deliberately managed myself in this matter by promising a result by a given date, and then, like a cornered rat, having at the last minute to find something! I have been surprised at how often this simple trick of managing myself has worked for me. Of course it depends on having a great deal of pride and self-confidence. Without self-confidence you are not likely to create great, new things. There is a thin line between having enough self-confidence and being over-confident. I suppose the difference is whether you succeed or fail; when you win you are strong willed, and when you lose you are stubborn!
Back to the topic of whether we can teach creativity or not. From the above you should get the idea I believe it can be taught. It cannot be done with simple tricks and easy methods; what must be done is you must change yourself to be more creative. As I have thought about it in the past I realize how often I have tried to change myself so I was more as I wished I were and less as I had been. (Often I did not succeed!) Changing oneself is not easy, as anyone who has gone on a diet to lose weight can testify; but that you can indeed change yourself is also evident from the few who do succeed in dieting, quitting smoking, and other changes in habits. We are, in a very real sense, the sum total of our habits, and nothing more; hence by changing our habits, once we understand which ones we should change and in what directions and understand our limitations in changing ourselves, we put ourselves on the path along which we want to go.
In planning to change yourself clearly the old Greek saying applies, “Know thyself”, and do not try heroic reformations which are almost certain to fail. Practice on small ones until you gradually build up your ability to change yourself in the larger things. You must learn to walk before you run in this matter of being creative, but I believe it can be done. Furthermore, if you are to succeed (to the extent you secretly wish to) you must become creative in the face of the rapidly changing technology which will dominate your career. Society will not stand still for you; it will evolve more and more rapidly as technology plays an increasing role at all levels of the organization. My job is to make you one of the leaders in this changing world, not a follower, and I am trying my best to alter you, especially in getting you to take charge of yourself and not to depend on others, such as me, to help. The many small stories I have told you about
Managing a Creative Career
- The ability to drop a wrong or unsolvable problem is essential to prevent career-long stagnation.
- Over-confidence from early success can lead highly creative individuals to waste decades on sterile pursuits.
- Different fields value age differently, with raw creativity peaking early in mathematics and physics while experience benefits literature and statesmanship.
- Significant achievements usually occur early in a career, suggesting that aspiring creators must start immediately rather than waiting for a perfect moment.
- Success is a combination of luck and preparation, where 'creativity' is the label given to historical breakthroughs achieved by prepared minds.
If you cannot drop a wrong problem then the first time you meet one you will be stuck with it for the rest of your career.
myself are partly to convince you that you can be creative when your turn comes for guiding our society to its possible future. The stories have also been included to show you some possible models of how to do things.
I have not yet discussed the delicate topic of dropping a problem. If you cannot drop a wrong problem then the first time you meet one you will be stuck with it for the rest of your career. Einstein was tremendously creative in his early years, but once he began, in mid-life, the search for a unified theory then he spent the rest of his life on it and had about nothing to show for all the effort. I have seen this many times while watching how Science is done. It is most likely to happen to the very creative people; their previous successes convince them they can solve any problem; but there are other reasons besides over-confidence why, in many fields, sterility sets in with advancing age. Managing a creative career is not an easy task, or else it would often be done. In mathematics, theoretical physics and astrophysics, age seems to be a handicap (all characterized by high, raw creativity) while in music composition, literature, and statesmanship, age and experience seem to be an asset. As valued by Bell Telephone Laboratories in the late 1970s, the first 15 years of my career included all they listed, and for my second 15 years they listed nothing I was very closely associated with! Yes, in my areas the really great things are generally done while the person is young, much as in athletics, and in old age you can turn to coaching (teaching) as I have done. Of course I do not know your field of expertise to say what effect age will have, but I suspect really great things will be realized fairly young, though it may take years to get them into practice. My advice is if you want to do significant things, now is the time to start thinking (if you have not already done so) and not wait until it is the proper moment, which may never arrive!
In closing I want to remind you yet again of Pasteur’s remark, “Luck favors the prepared mind”. Yes, it is a matter of luck just what you do; it is much less luck you will do something if you prepare yourself to succeed. “Creativity” is just another name for the great successes which make a difference in history.
26
Experts
As remarked in an earlier chapter, as our knowledge grows exponentially we cope with the growth mainly
by specialization. It is increasingly true:
Experts, Generalists, and Paradigms
- Experts often win arguments by using unintelligible jargon and citing irrelevant specialist results to silence generalists.
- Scientific progress typically operates within a 'paradigm,' a set of unexamined assumptions and methods taught to students as absolute truth.
- Significant breakthroughs occur when contradictions in the current paradigm can no longer be ignored, leading to a sudden shift in belief systems.
- Established experts frequently resist new paradigms because they have significant personal and professional investment in the old approach.
- The history of continental drift illustrates how experts can ignore obvious physical evidence for decades until new measurement methods force acceptance.
An expert is one who knows everything about nothing; A generalist knows nothing about everything.
An expert is one who knows everything about nothing; A generalist knows nothing about everything.
In an argument between a specialist and a generalist the expert usually wins by simply: (1) using unintelligible jargon, and (2) citing their specialist results which are often completely irrelevant to the discussion. The expert is, therefore, a potent factor to be reckoned with in our society. Since experts are both necessary, and also at times do great harm in blocking significant progress, they need to be examined closely. All too often the expert misunderstands the problem at hand, but the generalist cannot carry through their side to completion. The person who thinks they understand the problem and does not is usually more of a curse (blockage) than the person who knows they do not understand the problem.
Kuhn, in his book The Structure of Scientific Revolutions, examined the structure of scientific progress and introduced the concept of paradigm (pattern, example) as a description of the normal state of Science. He observed most of the time any particular science has an accepted set of assumptions, often not mentioned or discussed, whose results are taught to the students, and which the students in turn accept without being aware of how extensive these assumptions are. There is also an accepted set of problems and methods of attacking them. The workers in the field proceed in this fashion, extending and elaborating the field endlessly, and simply ignoring any contradictions which may come up.
Occasionally, usually because of the contradictions most of the people in the field choose to ignore or simply forget, there will arise a sudden change in the paradigm, and as a result a new pattern of beliefs comes into dominance, along with the ability to ask new kinds of questions and get new kinds of answers to older problems. These changes in the dominant paradigm of a science usually represent the great steps forward. For example, both special relativity and QM represent such changes in the field of physics.
At first the change is resisted by the establishment, which has so much of its past effort invested in the old approach, but usually, so Kuhn and others like to believe, the new triumphs over the old. I suppose if you allow enough time, then that is right, but the number of years may be more than the initiator’s lifetime! For example, I earlier mentioned that continental drift was discussed by Thomas Dick in 1838, and later in a book by Alfred Wegener written in the early 1900s. As children both my wife and I (independently, as we did not know each other at that time) read Wegener’s book and noted yes, the shapes of Africa and South America fit very well, but we were not convinced until Wegener also observed along certain corresponding parts of the two coasts the sequence of rock formations agreed in detail! Never mind it was obvious to even the untrained eye of a child, the experts would have no part of it, and it was ridiculed regularly by the experts in geology.
There is another source for continental drift, namely the distribution of forms of life over the aeons of history. The mutually common forms of life found in widely separated places necessitated the creation of “land bridges” which were supposed to have risen and sunk again, and the number of these, plus their various placements, seemed unbelievable to me as a child, particularly as there were no observations of their traces in the depths of the oceans to justify them. The biologists studying the past, in trying to account for what they saw, had also postulated both a Pangaea and Gondwanaland as successive arrangements of the continents, not apparently caring for the “land bridges” which seemed necessary otherwise, yet the geologists still resisted. The concept of continental drift was accepted by the oceanographers only after WW-II when by studying the ocean bottom they found, by magnetic methods, the actual cracks and the spreading of the land on the ocean floor.
The Fallibility of Experts
- Scientific paradigms are often resisted by established experts until a mechanism is proven, at which point they claim to have always believed it.
- The history of innovation is filled with experts declaring feats like supersonic flight or heavier-than-air travel impossible just before they were achieved.
- Impossibility proofs are only as valid as their underlying assumptions, which experts frequently fail to re-examine in new contexts.
- A patent applicant successfully defied the 33-foot water-lifting limit by using standing waves, a method not considered by textbook-reliant officials.
- True innovation rarely comes from field experts because they are conditioned to dismiss or force-fit new data into existing frames of reference.
- A reliable rule of thumb is to trust an expert's positive prediction but seek a second opinion when they declare something impossible.
The record of the experts saying something is impossible just before it is done is amazing.
Of course geologists now claim they had always sort of believed in it (the textbooks they used to the
contrary) and it was only necessary to exhibit the actual mechanism in detail before they would accept the
continental drift theory, which is now "the truth". This is the typical pattern of a change in the paradigm of a
field. It is resisted for a shorter or longer time (and I do not know how many theories were permanently lost;
how could I?) before being accepted as being right, and those concerned then saying they had not actively
opposed the change. You have probably heard many past examples such as the aviation expert saying, just
before the Wright brothers flew, heavier than air flying was impossible, the old claim if you went too fast in
an automobile or train that you would lose your breath and die, faster than sound flight (supersonic flight) was
impossible, etc. The record of the experts saying something is impossible just before it is done is amazing.
One of my favorite ones was you cannot lift water more than 33 feet. But when the patent office rejected a
patent which claimed his method could, the man demonstrated it by lifting water to the roof of their
building, which was much more than 33 ft. How? He used, Figure 26.I, a method of standing waves which
they had not thought about. When a low pressure of the standing wave appeared at the bottom then water was
admitted into the column, and when a high pressure appeared at the top water exited due to the valves which
were installed. All the Patent Office experts knew was the text books said it could not be done, and they
never looked to see on what basis this was stated.
All impossibility proofs must rest on a number of assumptions which may or may not apply in the
particular situation.
Experts in looking at something new always bring their expertise with them as well as their particular way of
looking at things. Whatever does not fit into their frame of reference is dismissed, not seen, or forced to fit
into their beliefs. Thus really new ideas seldom arise from the experts in the field. You cannot blame them
too much since it is more economical to try the old, successful ways before trying to find new ways of
looking and thinking.
All things which are proved to be impossible must obviously rest on some assumptions, and when one or
more of these assumptions are not true then the impossibility proof fails, but the expert seldom remembers
to carefully inspect the assumptions before making their "impossible" statements. There is an old statement
which covers this aspect of the expert. It goes as follows:
"If an expert says something can be done he is probably correct, but if he says it is impossible then
consider getting another opinion."
The Limitations of Experts
- The author argues that paradigm shifts occur on both large and small scales, often driven by adopting new perspectives like the frequency approach over traditional polynomial methods.
- Experts frequently resist innovation because they are conditioned to believe the established way of doing things is the only correct way.
- The rate of progress and innovation is accelerating, requiring modern professionals to endure more frequent paradigm changes than previous generations.
- Great innovations often originate from outsiders rather than field insiders, as seen in carbon dating coming from physics rather than archaeology.
- Understanding the rigid characteristics of experts is essential for both dealing with them and avoiding becoming a barrier to progress yourself.
They simply kept the polynomial approach, though under questioning they could give no real reason for doing so, simply that was the way things had been done, hence was the right way to do things.
Kuhn, and the historians of Science, have concentrated on the large changes in the paradigms of Science; it
seems to me much the same applies to smaller changes. For example, working for Bell Telephone
Laboratories it was fairly natural I should meet the frequency approach to numerical analysis, and hence
apply it to the numerical methods I used on the various problems I was asked to solve. Using the kinds of
functions the clients are familiar with means insight can arise from the solution details which suggest other
things to do than what they had originally thought. I found the frequency approach very useful, but some of
my close friends, not at Bell Telephone Laboratories, regularly twitted me about the frequency approach
every time they met me for all the years we have been meeting at various places. They simply kept the
polynomial approach, though under questioning they could give no real reason for doing so, simply that was
the way things had been done, hence was the right way to do things.
It is not just for the pleasure of poking fun at the experts I bring this up. There are at least four other
reasons for doing so.
First, as you go on you will have to deal with experts many times, and you should understand their
characteristics.
Second, in time many of you will be experts, and I am hoping to at least modify the behavior of some of
you so that you will, in your turn, not be such a block on progress as many experts have been in the past.
Third, it appears to me the rate of progress, the rate of innovation and change of the dominant paradigm,
is increasing, and hence you will have to endure more changes than I did.
Fourth, if only I knew the right things to say to you then when a paradigm change occurs fewer of you
would be left behind in your careers than usually happens to the experts.
In discussing the expert let me introduce another aspect which has barely been mentioned so far. It
appears most of the great innovations come from outside the field, and not from the insiders. I cited above
continental drift. Consider archaeology. A central problem is the dating of the remains found. In the past
this was done by elaborate, unreliable stratigraphy, by estimating the time needed to bury the material
where it was found. Now carbon dating is used as the main tool. Where did it come from? Physics! None of
the archaeology experts would have ever thought of it. So far as I can make out, the first automatic
telephone came from an undertaker who thought he was not getting fair treatment from the telephone
company and designed a machine which would be fair. Similar examples occur in most fields of work, but
Figure 26.I
The Expert's Dilemma
- Albert Einstein's early career illustrates how revolutionary ideas often originate from outside the official university circle.
- Experts face a strategic dilemma between ignoring all 'crackpots' or risking their careers pursuing rare, innovative breakthroughs.
- Most professionals choose to ignore outsiders to avoid wasting time, effectively opting out of participating in future paradigm shifts.
- Insiders are often blinded by their own certainty, heavy investment in current methods, and mental laziness.
- Individuals must consciously decide whether they want to be a standard contributor or one of the few who fundamentally change their field.
- The same patterns of resistance to new ideas found in science likely apply to most other fields of human thought and technology.
Outside the field there are a large number of genuine crackpots with their crazy ideas, but among them may also be the crackpot with the new, innovative idea which is going to triumph.
the text books seldom, if ever, discuss this aspect. At the time of Einstein's famous "five papers in one
year" he was working in the Swiss patent office! He had not been able to find an official position within the
circle of University physics. In fairness to the system, in a few years he was recognized and offered various
prestigious positions, ending up in Berlin. The Nazis later drove him out of Berlin to the Institute for
Advanced Study, Princeton.
Thus the expert faces the following dilemma. Outside the field there are a large number of genuine
crackpots with their crazy ideas, but among them may also be the crackpot with the new, innovative idea
which is going to triumph. What is a rational strategy for the expert to adopt? Most decide they will ignore,
as best they can, all crackpots, thus ensuring they will not be part of the new paradigm, if and when it
comes.
Those experts who do look for the possible innovative crackpot are likely to spend their lives in the futile
pursuit of the elusive, rare crackpot with the right idea, the only idea which really matters in the long run.
Obviously the strategy for you to adopt depends on how much you are willing to be merely one of those
who served to advance things, vs. the desire to be one of the few who in the long run really matter. I cannot
tell you which you should choose; that is your choice. But I do say you should be conscious of making the
choice as you pursue your career. Do not just drift along; think of what you want to be and how to get there.
Do not automatically reject every crazy idea the moment you hear of it, especially when it comes from outside
the official circle of the insiders; it may be the great new approach which will change the paradigm of the
field! But also you cannot afford to pursue every "crackpot" idea you hear about. I have been talking about
paradigms of Science, but so far as I know the same applies to most fields of human thought, though I have
not investigated them closely. And it probably happens for about the same reasons; the insiders are too sure
of themselves, have too much invested in the accepted approaches, and are plain mentally lazy. Think of the
history of modern technology you know!
I have covered the two main problems of dealing with the experts. They are: (1) the expert is certain they
The Curse of the Expert
- Experts often fail to re-evaluate their core beliefs when technological shifts invalidate the original reasoning behind those beliefs.
- The 'closed world of theory' inhabited by experts leads to an intolerance of new opinions and a lack of humility regarding potential errors.
- During the early days of digital computing, established mathematicians at Bell Labs viewed machines as inferior tools or direct competition rather than assets.
- Progress is frequently hindered by experts who rely on past successes to justify their opposition to emerging paradigms.
- To avoid becoming a 'drag on progress,' one must regularly ask what evidence would be required to prove their current beliefs wrong.
- The author advocates for a 'man and machine' collaborative approach rather than viewing new technology as a threat to human expertise.
In some respects the expert is the curse of our society with their assurance they know everything, and without the decent humility to consider they might be wrong.
are right, and (2) they do not consider the basis for their beliefs and the extent to which they apply to new
situations. I told you about the FFT and why it is not the Tukey-Hamming algorithm. That was not the only
time I made such a mistake, forgetting there had been a technological change which invalidated my earlier
reasoning, as well as the many other cases where I have observed it happen. To my embarrassment I told the
story in order to get the point vividly across to you. I made the mistake; how are you going to avoid it when
your turn comes? No one ever told me about the problem, while I have told you about it, so maybe you will
not be as foolish as I have been at times.
With the rapid increase in the use of technology this type of error is going to occur more often, so far as I
can see. The experts live in their closed world of theory, certain they are right and are intolerant of other
opinions. In some respects the expert is the curse of our society with their assurance they know everything,
and without the decent humility to consider they might be wrong. Where the question looms so important I
suggested to you long ago to use in an argument, "What would you accept as evidence you are wrong?" Ask
yourself regularly, "Why do I believe whatever I do?" Especially in the areas where you are so sure you
know; the area of the paradigms of your field.
The opposition of the expert is often not as direct as indicated above. Consider my experience at Bell
Telephone Laboratories during the earliest years of the coming of digital computers. My immediate bosses
all had succeeded in the mathematical areas by using analytical methods, and during their heyday
computing had been relegated to some high school graduate girls with desk calculators. The bosses knew the
right way to do mathematics. It was useless to argue their basic assumptions with them (they might even
have denied they held them) since they, based on their own experiences, knew they were right! They saw,
every one of them, the computer as being inferior, beneath the consideration of a real mathematician, and in
the final analysis possibly in direct competition with them, this later giving rise to fear and hatred. It was
not a discussible topic with them. I had to do computing in spite of all their (usually unstated) opposition, in
spite of all the times they said they had done something I could not do with the machines I had available at
the time, and in spite of all my polite replies that I was not concerned with direct competition; rather I was solely
interested in doing what they could not do; I was concerned with what the team of man and machine could
do together. I hesitate to guess the number of times I gave that reply to a not direct but a covert attack on
computers in the early days. And this in a highly enlightened place like Bell Telephone Laboratories.
The second point I want to make is many of you, in your turn, will become experts, and I am hoping to
modify in you the worst aspects of the know-it-all expert. About all I can do is to beg you to watch and see
for yourself how often the above descriptions occur in your career, and hope thereby you will not be the
drag on progress the expert so often is. In my own case, I vowed when I rose to near the top I would be
careful, and as a result I have refused to take part in any decision processes involving current choices of
computers. I will give my opinion when asked, but I do not want to be the kind of drag on the next
generation I had to put up with from the past generation. Modesty? No, pride!
To put the situation in the form of a picture we draw a line in n-dimensional space to represent,
symbolically, the path of progress in time, Figure 26.II, which is drawn, of course, in 2-dimensions. At the start
of the picture, say 1935 and earlier, the direction was as indicated by the tangent arrow, and those who
The Trap of Success
- Historical methods of success are becoming obsolete at an accelerating rate due to technological shifts like computing.
- Established leaders often struggle to admit that the very strategies that brought them to power are no longer appropriate.
- The rate of progress is likely accelerating, meaning future leaders will face even faster cycles of obsolescence.
- Applying past successful behaviors to new contexts is frequently counterproductive and can actively hinder progress.
- Current leaders should strive to step aside and allow the next generation to innovate without being blocked by outdated expertise.
- Even monumental figures like Einstein eventually became obstacles to the fields they helped create, such as Quantum Mechanics.
What you did to become successful is likely to be counterproductive when applied at a later date.
sensed what to do and how to do it (then) were the successful people, and were, therefore, my bosses. Then
computers came in and at the later date the curve is now pointed in another direction, almost perpendicular
to the past one. It is asking a lot of them to admit the very methods they earlier used to succeed are not
appropriate at present! But it is true, if this picture is at all like reality (remember it is in n-dimensional space).
If my claim is right that progress has not stopped miraculously at present, but rather there is probably an
accelerating rate of progress, then it will be even more true when you are in charge that:
What you did to become successful is likely to be counterproductive when applied at a later date.
Please remember this when you have risen to the top and are in charge; do as I have tried to do and let the
next generation have a cleaner chance at success than you were granted by your management while you
were rising to the top. I observed to you some lectures ago, a friend behind my back remarked he doubted
Hamming understood error correcting codes, and I admitted probably he was right! I do believe in what I
am telling you; the old expert is all too often wrong and a block to progress. Consider the case of Einstein,
who gave QM such a start with his photoelectric paper, and was in his turn a plain drag on QM when he so
Figure 26.II
The Perils of Expertise
- Even foundational figures like Einstein can be left behind when they refuse to accept new paradigms in their own fields.
- The history of computing shows that experts often aggressively oppose innovations like symbolic languages and FORTRAN before being marginalized.
- Failing to keep up with technological and theoretical shifts leads to professional obsolescence and the eventual 'squeezing out' of once-prominent individuals.
- Professional stagnation creates a psychological toll, often tainting the memories of an entire career with bitterness and distaste.
- Civilization requires the conscious suppression of immediate, instinctive responses in favor of self-aware, calculated actions.
- The ultimate goal of self-awareness is to avoid the common trap where the expert of today becomes the obstacle of tomorrow.
If you are passed over for an important (to you) promotion in an organization, then it will tend to affect all the relevant memories of a great career and taint them darker.
aggressively opposed the theory of QM as it developed. Physicists are polite about this point as they hate to
admit their tin god Einstein could be so definitely wrong; they excuse him this way and that, but under
pressure they have to admit once again the person who opened up the field did not understand what he had
done, and is best ignored at a later date!
There is the final, and overwhelming, reason for telling you these things. I have observed again and again
most experts are left behind as their field progresses and new paradigms come in. Taking only the history of
computing as I observed it, I have told you in Chapter 4 of the great opposition of the programmers to: (1)
symbolic languages (what you call machine language but is not absolute binary coding), (2) higher level
software, and (3) FORTRAN when it first came in. What happened to many of them? Most of them
gradually dropped out of the field and disappeared! They could not keep up.
A very good friend of mine was a great analog enthusiast and it was from him I learned a lot about analog
computers when I acquired the management of the one at Bell Telephone Laboratories. When digital
methods came in, he constantly emphasized the advantages, at that time, of the analog computers. Well, he
was gradually squeezed out by his own behavior and fell back on other skills he had. But when I retired
early to go to teaching, as I had long planned to do (since I felt old research people mainly get in the way of
the young), he also retired. But I left with pleasant memories of Bell Telephone Laboratories and later, in
talking with him, I found his memories are not so pleasant!
If you do not keep up in your field that is almost certainly what will happen to you. While living in
California I have met and talked with a number of ex-Navy officers of the rank Captain, and the stories they
tell often reveal a degree of distaste in their careers. How could it be otherwise? If you are passed over for
an important (to you) promotion in an organization, then it will tend to affect all the relevant memories of a
great career and taint them darker. It is this social, as well as the economic, consequence I care about and
why I am preaching this lesson: you must keep up or else things will overtake you and may spoil the
memories of your career.
I have used isolated stories many times in these Lectures. They are illustrative of situations, and I know
many other stories which would illustrate the same points. I began to formulate many of these "theories"
long ago, and as time went on experience illustrated their truth many times over, though some turned out to
be false and had to be abandoned. These are not absolute truths, they are summaries of many observations
which tend to "prove" the points made. Of course, you can say I looked for confirmations, but being a
scientist I tried also to look for falsifications and in the face of counter evidence had to abandon some
theories. When you think over many of the stories, they often have an element of "truth" based more on
human traits than anything else. We are all human, but that does not prevent us from trying to modify our
instincts which were evolved over the long span of history. Civilization is merely a thin veneer we have put
on top of our anciently derived instincts, but the veneer is what makes it possible for modern society to
operate. Being civilized means, among other things, stopping your immediate response to a situation, and
thinking whether it is or is not the appropriate thing to do. I am merely trying to make you more self-aware
so you will be more "civilized" in your responses and hence probably, but not certainly, more successful in
attaining the things you want.
In summary, I began by warning you about dealing with experts; but towards the end I am warning you
about yourself when in your turn you are the expert. Please do not make the same foolish mistakes I did!
27
Unreliable Data
The Fallibility of Data
- Data accuracy is frequently lower than advertised, which is a critical issue because it serves as the foundation for both human decisions and computer simulations.
- The author recounts a project for a 20-year submarine cable where the reliability of the test equipment itself was never properly questioned or verified.
- Accelerated life testing relies on shaky foundations like increasing temperature or voltage, yet it remains the industry standard due to time and budget constraints.
- The shrinking gap between scientific invention and engineering deployment often forces the use of new technology before its long-term reliability can be proven.
- Organizations often fail to invest in proactive research for testing methods, leading to a culture where there is never time to do a job right but always time to fix it later.
- A fundamental engineering paradox exists: how to validate highly reliable devices using less reliable equipment within a very limited timeframe.
As the saying goes, "There is never time to do the job right, but there is always time to fix it later."
It has been my experience, as well as that of many others who have looked, that data is generally much less
accurate than it is advertised to be. This is not a trivial point; we depend on initial data for many decisions, as well
as for the input data for simulations which result in decisions. Since the errors are of so many kinds, and I
have no coherent theory to explain them all, I have therefore to resort to isolated examples and generalities
from them.
Let me start with life testing. A good example is my experience with the life testing of the vacuum tubes
which were to go into the first voice carrying submarine cable with the hoped for life time of 20 years.
(After 22 years we simply removed the cable from service since it was then too expensive to operate,
which gives a good measure of technical progress these days.) The tubes for the cable first became available
something like 18 months before the cable was to go down. I had a moderate computer facility, including a
special IBM 101 statistical sorter, and I made it available to the people who were processing the data, as
well as helping them do the more technical aspects of the computing. I was not, however, in any way involved
in the direct work of the project. Nevertheless, one day one of the higher ups in the project showed me the
test equipment in the attic. Being me, after a time I asked, "Why do you believe the test equipment is as
reliable as what is being tested?" The answer I got convinced me he had not really thought about it, but
seeing pursuit of the point was fruitless, I let it drop. But I did not forget the question!
Life testing is increasingly important and increasingly difficult as we want more and more reliable
components for larger and larger entire systems. One basic principle is accelerated life testing, meaning
mainly if I raise the temperature 17° Centigrade then most, but not all, chemical reactions double their rate.
There is also the idea if I increase the working voltage I will find some of the weaknesses sooner. Finally,
for testing some integrated circuits, increasing the frequency of the clock pulses will find some weaknesses
sooner. The truth is, all three combined are hardly a firm foundation to work from, but in reply to this
criticism the experts say, "What else can we do, given the limitations of time and money?" More and more,
the time gap between the scientific creation and the engineering development is so small there is no time to
gain real life testing experience with the new device before it is put into the field for widespread use. If you
want to be certain then you are apt to be obsolete.
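The temperature rule quoted above can be turned into a small calculation. The following is only a sketch under stated assumptions: it takes the 17° doubling figure from the text at face value, assumes the whole failure process follows that one rule (which the text itself doubts), and uses hypothetical function names chosen for illustration.

```python
import math

# Assumption taken from the text: raising the temperature by about 17 degrees
# Centigrade doubles the rate of most (not all) chemical reactions.
DOUBLING_STEP_C = 17.0


def acceleration_factor(temp_rise_c, doubling_step_c=DOUBLING_STEP_C):
    """How many times faster aging proceeds for a given temperature rise."""
    return 2.0 ** (temp_rise_c / doubling_step_c)


def required_temperature_rise(target_factor, doubling_step_c=DOUBLING_STEP_C):
    """Temperature rise needed to reach a target acceleration factor."""
    if target_factor <= 0:
        raise ValueError("target_factor must be positive")
    return doubling_step_c * math.log2(target_factor)


if __name__ == "__main__":
    # Illustrative figures from the cable story: about 18 months of testing
    # available, and a hoped-for field life of 20 years.
    test_months = 18.0
    field_life_months = 20 * 12.0
    needed = field_life_months / test_months
    rise = required_temperature_rise(needed)
    print(f"needed acceleration: {needed:.1f}x")
    print(f"temperature rise under the doubling rule: {rise:.1f} C")
```

Compressing 20 years into 18 months needs roughly a 13-fold speedup, which under this rule means running more than 60° hotter than in service. Whether the device fails in the same way at that temperature is exactly the kind of assumption the doubling rule cannot check.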
Of course there are other tests for other things besides those mentioned above. So far as I have seen the
basis of life testing is shaky; but there is nothing else available. I had long ago argued at Bell Telephone
Laboratories we should form a life testing department whose job is to prepare for the testing of the next
device which is going to be invented, and not just test after the need arises. I got nowhere, though I made a
few, fairly weak, suggestions about how to start. There was not time in the area of life testing to do basic
research; they were under too much pressure to get the needed results tomorrow. As the saying goes,
"There is never time to do the job right, but there is always time to fix it later",
especially in computer software!
The question I leave with you is still, "How do you propose to test a device, or a whole piece of
equipment, which is to be highly reliable, when all you have is less reliable test equipment, and with very
limited time to test, and yet the device is to have a very long lifetime in the field?" That is a problem which
will probably haunt you in your future, so you might as well begin to think about it now and watch for clues
for rational behavior on your part when your time comes and you are on the receiving end of some life tests.
Let me turn now to some simpler aspects of measurements.
The Fallibility of Data
- Data accuracy is often compromised by human ego and institutional trust in equipment labels over empirical verification.
- Even automated systems designed to record their own operations can produce nonsensical data, such as calls to nonexistent central offices.
- Professional scientific and regulatory bodies often find it necessary to recalibrate every new instrument to ensure accuracy regardless of manufacturer claims.
- Data consistency should never be assumed, as even 'cleaned' datasets often contain residual errors that can ruin a project's results.
- Pilot studies can fail to predict large-scale errors because different organizational units may handle the main workload differently than the test case.
- The author advocates for a mandatory pre-testing phase for all data to identify outliers and inconsistencies before any processing begins.
You cannot even trust a machine to gather data about itself correctly!
For example, a friend of mine at Bell
Telephone Laboratories, who was a very good statistician, felt some data he was analyzing was not accurate.
Arguments with the department head that they should be measured again got exactly nowhere since the
department head was sure his people were reliable and furthermore the instruments had brass labels on them
saying they were that accurate. Well, my friend came in one Monday morning and said he had left his brief
case on the railroad train going home the previous Friday and had lost everything. There was nothing else
the department head could do but call for remeasurements, whereupon my friend produced the original
records and showed how far off they were! It did not make him popular, but did expose the inaccuracy of
the measurements which were going to play a vital role at a later stage.
The same statistician friend was once making a study for an outside company on the patterns of phone
calling of their headquarters. The data was being recorded by exactly the same central office equipment
which was placing the calls and writing the bills for making the calls. One day he chanced to notice one call
was to a nonexistent central office! So he looked more closely, and found a very large percentage of the
calls were being connected for some minutes to nonexistent central offices! The data was being recorded by
the same machine which was placing the calls, but there was bad data anyway. You cannot even trust a
machine to gather data about itself correctly!
My brother, who worked for many years at the Los Angeles Air Pollution, once said to me they had found
it necessary to take apart, reassemble, and recalibrate every new instrument they bought! Otherwise they
would have endless trouble with accuracy, and never mind the claims made by the seller!
I once did a large inventory study for Western Electric. The raw data they supplied was for 18 months of
inventory records on something like 100 different items in inventory. I asked the natural question of why I should believe the data was consistent; for example, could not the records show a withdrawal when there was nothing in inventory? They claimed they had thought of that and had in fact gone through the data and added a few pseudotransactions so such things would not occur. Like a fool I believed them, and only late in the project did I realize there were still residual inconsistencies in the data, and hence I had first to find them, then eliminate them, and then run the data all over again. From that experience I learned never to process
any data until I had first examined it carefully for errors. There have been complaints that I would take too
long, but almost always I found errors and when I showed the errors to them they had to admit I was wise in
taking the precautions I did. No matter how sacred the data and urgent the answer, I have learned to pretest it for consistency and outliers at a minimum.
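The kind of consistency pretest described here (no withdrawal when nothing is in inventory) can be sketched in a few lines of Python. This is only an illustration: the record layout, the item names, and the function `find_inventory_errors` are hypothetical and are not taken from the Western Electric study.

```python
from collections import defaultdict

def find_inventory_errors(transactions, opening_stock=None):
    """Return (index, item, balance) for every transaction that drives
    an item's running balance below zero."""
    balance = defaultdict(int)
    if opening_stock:
        balance.update(opening_stock)
    errors = []
    for i, (item, qty) in enumerate(transactions):
        balance[item] += qty  # qty > 0 is a receipt, qty < 0 is a withdrawal
        if balance[item] < 0:
            errors.append((i, item, balance[item]))
    return errors

# Hypothetical records in date order: (item, signed quantity)
records = [
    ("relay", +10),
    ("relay", -4),
    ("fuse", -2),   # withdrawal with nothing in stock
    ("relay", -7),  # more withdrawn than was on hand
    ("fuse", +5),
]

errors = find_inventory_errors(records)
for idx, item, bal in errors:
    print(f"record {idx}: {item} balance went to {bal}")
```

A check of this sort is cheap to run before any analysis starts, and every record it flags is a question to put to whoever supplied the data.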
I once became involved as an instigator and later as an advisor to a large AT&T personnel study using a
UNIVAC in NYC which was rented for the job. The data was to come from many different places, so I
thought it would be wise to have a pilot study run first to make sure the various sources understood what was going to happen and just how to prepare the IBM cards with the relevant data. This we did. But when the main study came in some of the sources did not punch the cards as they had been instructed. It took only a little thought on my part to realize of course the pilot study being small in size went to their local key punch specialty group, but the main study had to be done by the central group. Unfortunately for me they had not understood the purpose of the pilot study! Once more I was not as smart as I thought I was; I did not appreciate the inner workings of a large organization.
But how about basic scientific data? In an NBS publication on the 10 fundamental constants of physics,
The Illusion of Accuracy
- Statistical analysis of historical physical constants reveals that new measurements frequently fall far outside the error margins of previous data.
- The average error in one 24-year period was over five times larger than the claimed accuracy of the original measurements.
- This phenomenon of over-optimistic accuracy is not limited to laboratory physics but extends to fundamental cosmological values like Hubble's constant.
- Experimentalists often inadvertently manipulate data by fine-tuning equipment until they achieve low variance, rather than true accuracy.
- The data provided to statisticians is often pre-selected for consistency, leading to a false sense of reliability in the final results.
Now you are ready to gather data, but first you fine tune the equipment. How? By adjusting it so you get consistent runs!
the velocity of light, Avogadro's number, the charge on the electron, etc., there were two sets of data with
their errors. I promptly noted if the second set of data were taken as being right (and the point of the table
was how much the accuracy had improved in the 24 years between compilations), then the average amount
the new values fell outside the old errors was 5.267 times as far (the last column, which was added by me),
Figure 27.I. Now you would suppose the values of the physical constants had been carefully computed, yet
how wrong they were! The next compilation of physical constants showed an average error almost half as
large, Figure 27.II. One can only wonder what another 20 or so years will reveal about the last
cited accuracy! Care to bet?
This is not unusual. I very recently saw a table of measurements of Hubble's constant (the slope of the
line connecting the red shift with distance) which is fundamental to most of modern cosmology. Most of the values fell outside of the given errors announced for most of the other values.
Figure 27.I
By direct statistical measurement, therefore, the best physical constants in the tables are not anywhere
near as accurate as they claim to be. How can this be? Carelessness and optimism are two major factors.
Long meditation also suggests the present experimental techniques you are taught are also at fault and
contribute to the errors in the claimed accuracies. Consider how you, in fact as opposed to theory, do an
experiment. You assemble the equipment and turn it on, and of course the equipment does not function properly. So you spend some time, often weeks, getting it to run properly. Now you are ready to gather data, but first you fine tune the equipment. How? By adjusting it so you get consistent runs! In simple
words, you adjust for low variance; what else can you do? But it is this low variance data you turn over to
the statistician and which is used to estimate the variability. You do not supply the correct data from the correct
adjustments (you do not know how to do that); you supply the low variance data, and you get from the statistician the high reliability you want to claim! That is common laboratory practice! No wonder the data
is seldom as accurate as claimed.
Figure 27.II
I offer you Hamming's rule:
The Illusion of Measurement Accuracy
- Published measurement accuracies are frequently exaggerated, with subsequent independent measurements often falling far outside previous confidence limits.
- As measurement precision improves, the inherent errors and assumptions within the underlying model become the dominant source of inaccuracy.
- In large organizations, highly precise engineering data is often combined with wild guesses, yet the final decision is treated with the reliability of the engineering part alone.
- Economic data is notoriously unreliable, exemplified by gold flow reports between countries that can differ by a factor of two to one.
- Changes in legal reporting rules, such as inventory tax laws, can create false signals in economic indices that are misinterpreted as shifts in market sentiment.
- The reliability of a sum is limited by its most uncertain component, a fact frequently ignored in both corporate and economic forecasting.
Careful estimates are combined with wild guesses, and the reliability of the whole is taken to be the reliability of the engineering part.
90% of the time the next independent measurement will fall outside the previous 90% confidence
limits!
This rule is in fact a bit of an exaggeration, but stated that way it is a memorable rule to recall: most
published measurement accuracies are not anywhere near as good as claimed. It is based on a lifetime of
experience and represents later disappointments with claimed accuracies. I have never applied for a grant to make a properly massive study, but I have little doubt as to the outcome of such a study.
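To get a feel for how fast a claimed confidence level erodes when the quoted error is too small, here is a minimal sketch. It rests on assumptions not made in the text: the errors are normally distributed, the interval is centered correctly, and the only fault is that the true standard error is some multiple (`error_ratio`, a name introduced here for illustration) of the claimed one.

```python
import math
from statistics import NormalDist

def actual_coverage(claimed_level, error_ratio):
    """Probability that the truth lies inside a claimed two-sided interval
    when the real standard error is error_ratio times the claimed one."""
    # Half-width of the claimed interval, in units of the claimed sigma.
    z = NormalDist().inv_cdf(0.5 + claimed_level / 2)
    # P(|Z| < z / error_ratio) for a standard normal Z.
    return math.erf(z / error_ratio / math.sqrt(2))

for ratio in (1, 2, 5):
    print(ratio, round(actual_coverage(0.90, ratio), 3))
```

Under these assumptions an error underestimated by a factor of 5 turns a claimed 90% interval into one that holds only about a quarter of the time, which is in the spirit of the rule even though the rule itself is stated as an exaggeration.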
Another curious phenomenon you may meet is that in fitting data to a model there are errors in both the data
and the model. For example, a normal distribution may be assumed, but the tails may in fact be larger or
smaller than the model predicts, and possibly no negative values can occur although the normal distribution allows them. Thus there are two sources of error. As your ability to make more accurate measurements
increases the error due to the model becomes an increasing part of the error.
I recall an experience I had while I was on the Board of Directors of a computer company. We were
going to a new family of computers and had prepared very careful estimates of costs of all aspects of the new models. Then a salesman estimated if the selling price were so much then he could get orders for 10, if another price 15, and another 20 sales. His guesses, and I do not say they were wrong, were combined with the careful engineering data to make the decision on what price to charge for the new model! Much of the reliability of the engineering guesses was transferred to the sum, and the uncertainty of the salesman's
guesses was ignored. That is not uncommon in big organizations. Careful estimates are combined with wild
guesses, and the reliability of the whole is taken to be the reliability of the engineering part. You may justly ask why bother with making the accurate engineering estimates when they are to be combined with other
inaccurate guesses; but that is widespread practice in many fields!
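A small worked example shows why the careful part cannot rescue the whole. It assumes the two estimates are independent and simply added, so their variances add; the numbers and the names `engineering_sigma` and `sales_sigma` are hypothetical, chosen only for illustration.

```python
import math

def combined_sigma(*sigmas):
    """Standard deviation of a sum of independent quantities."""
    return math.sqrt(sum(s * s for s in sigmas))

engineering_sigma = 1.0   # hypothetical: the careful cost estimate
sales_sigma = 20.0        # hypothetical: the salesman's guess

total = combined_sigma(engineering_sigma, sales_sigma)
share = sales_sigma ** 2 / total ** 2  # fraction of variance due to the guess

print(f"combined sigma = {total:.2f}")
print(f"share of variance from the guess = {share:.1%}")
```

With these numbers the combined uncertainty is essentially that of the guess alone; making the engineering estimate ten times more precise would change the total by a negligible amount.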
I have talked first about Science and Engineering so when I get to economic data you will not sneer at
them too much. A book I have read several times is Morgenstern's On the Accuracy of Economic
Measurements, Princeton Press, 2nd ed. He was a highly respected Economist.
My favorite example from his book is the official figures on the gold flow from one country to another,
as reported by both sides. The figures can differ at times by more than two to one! If they cannot get the gold flow right what data do you suppose is right? I can see how electrical gear shipped to a third world
country might get labeled as medical gear because of different import duties, but gold is gold, and is not
easily called anything else.
Morgenstern points out at one time DuPont Chemical held about 23% of the General Motors stock. How
do you suppose this appeared when the Gross National Product (GNP) figure was computed? Of course it was counted twice!
As an example I found for myself, there was a time, not too long ago, when the tax rules for reporting
inventory holdings were changed, and as a result many companies changed their methods of inventory
reporting to take advantage of the new reporting rules, meaning they now could show smaller inventory and
hence get less tax. I watched in vain in the Wall Street Journal to see if this point was ever mentioned. No,
it never was that I saw! Yet the inventory holdings are one of the main indices which are used to estimate the expectations of the manufacturers, whether we are headed up or down in the economy. The argument
goes when manufacturers think sales will go down they decrease inventory, and when they expect sales to go up they increase inventory so they will not miss some sales. That the legal rules for reporting inventory had changed, and that this was part of what was behind the measurements, was never mentioned, so far as I
could see.
This is a problem in all time series.
The Fallacy of Economic Data
- Definitions of metrics like poverty and unemployment are constantly shifting, making long-term historical comparisons unreliable.
- Institutions often prefer using irrelevant but consistent indicators over updating definitions to reflect modern shifts from manufacturing to service economies.
- Economic data is frequently gathered for unrelated purposes or intentionally falsified, leading to systemic inaccuracies.
- Hidden practices like secret customer discounts create biased pricing data that government economists cannot accurately track during market fluctuations.
- The author argues that economics lacks the rigor of a true science because its practitioners often refuse to acknowledge the fundamental flaws in their data.
- If engineering and scientific data are prone to significant error, social science data is likely even less reliable due to these compounding factors.
What is now called "poverty" is in many respects better than what the Kings of England had not too long ago!
The definition of what is being measured is constantly changing. For
perhaps the best example, consider poverty. We are constantly upgrading the level of poverty, hence it is a
losing game trying to remove it; they will simply change the definition until there are enough people below the poverty level to continue the projects they manage! What is now called "poverty" is in many
respects better than what the Kings of England had not too long ago!
In a Navy a Yeoman is not the same Yeoman over the years, and a ship is not a ship, etc., hence any time
series you study to find the trends of the Navy will have this extra factor to confound you in your interpretations. Not that you should not try to understand the situation using past data (and while doing it
apply some sophisticated signal processing, Chapters 14–17) but there are still troubles awaiting you due to
changing definitions which may never have been spelled out in any official records! Definitions have a habit of changing over time without any formal statement of this fact.
The forms of the various economic indices you see published regularly, including unemployment (which
does not distinguish between the unemployed and the unemployable but should, in my opinion), were made up, usually, long ago. Our society has in recent years changed rapidly from a manufacturing to a
service society, but neither Washington, D.C. nor the economic indicators have realized this to any
reasonable extent. Their reluctance to change the definitions of the economic indicators is based on the claim a change, as indicated in the above paragraph, makes the past noncomparable to the present; better to
have an irrelevant indicator than an inconsistent one, so they claim. Most of our institutions (and people)
are slow to react to changes such as the shift to service from manufacturing, and even slower to ask
themselves how what they were doing yesterday should be altered to fit tomorrow. Institutions and people would rather go along smoothly, and hence lag far behind, than make the effort to be reasonably abreast of the times. Institutions, like people, tend to move only when forced to.
If you add to the above the simple facts most economic data is gathered for other purposes and is only
incidentally available for the economic study made, and there are often strong reasons for falsifying the initial data which is reported, then you see why economic data is bad.
As another source for inaccuracy mentioned by Morgenstern, consider that discounts to favored customers are
a common practice, and these are jealously guarded secrets. Now it happens in times of depression the
company will grant larger discounts, and decrease them when things are improving, but the Government figures of costs must be based on the listed sales prices since the discounts are unknowable. Thus economic down times and up times are systematically biased in different directions in the data gathered.
What can the Government Economists use for their basic data other than much of this inaccurate,
systematically biased data? Yes, they may to a lesser or greater extent be aware of the biases, but they have
no way of knowing how much the data is in error. So it should not surprise you many economic predictions are seriously wrong. There is little else they can do, hence you should not put too much faith in their predictions.
In my experience most Economists are simply unwilling to discuss the basic inaccuracy in the economic
data they use, and hence I have little faith in them as Scientists. But who said Economic Science is a
Science? Only the Economists!
If Scientific and Engineering data are not at all as accurate as they are said to be, by factors of 5 or more
at times, and economic data can be worse, how do you suppose Social Science data fares? I have no
comparable study of the whole field, but my little, limited experience does suggest it is not very good.
The Fallibility of Data
- Human beings are fundamentally unreliable at repetitive tasks and accurate counting, making manual data collection inherently prone to error.
- Large-scale surveys are often less accurate than small, carefully selected samples due to the difficulty of maintaining quality control over massive datasets.
- The phrasing and sequencing of questionnaire items frequently manipulate respondents into providing the specific answers desired by the surveyors.
- Averages derived from non-homogeneous groups are often meaningless and fail to represent any actual individual within the population.
- Data accuracy is unlikely to improve significantly in the future due to the dismissive attitudes of many experts toward these systemic flaws.
Small samples carefully taken are better than large samples poorly done.
Again, there may be nothing better available, but that does not mean what data is available is safe to use.
It should be clear I have given a good deal of attention to this matter of the accuracy of data during most
of my career. Due to the attitudes of the experts I do not expect anything more than a slow improvement in
the long future.
If the data is usually bad, and you find that you have to gather some data, what can you do to do a better
job? First, recognize what I have repeatedly said to you, the human animal was not designed to be reliable; it cannot count accurately, it can do little or nothing repetitive with great accuracy. As an example, consider the game of bowling. All the bowler needs to do is throw the ball down the lane reliably every time. How seldom does the greatest expert roll a perfect game! Drill teams, precision flying, and such things are
admired as they require the utmost in careful training and execution, and when examined closely leave a lot
to be improved.
Second, you cannot gather a really large amount of data accurately. It is a known fact which is constantly
ignored. It is always a matter of limited resources and limited time. The management will usually want a
100% survey when a small one, consisting of a good deal less, say 1% or even 1/10%, will yield more accurate
results! It is known, I say, but ignored. The telephone companies, in order to distribute the income to the
various companies involved in a single long distance phone call, used to take a very small, carefully
selected sample, and on the basis of this sample they distributed the money among the partners. The same is now done by the airlines. It took them a long while before they listened, but they finally came to realize the truth of: Small samples carefully taken are better than large samples poorly done. Better, both in lower cost
and in greater accuracy.
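A toy simulation can make the claim about sample size concrete. Everything in it is an assumption made for illustration: the population is synthetic, the "poorly done" 100% survey is modeled as miscopying 10% of the records with a systematic error of +20, and the "carefully taken" 1% sample is modeled as a simple random sample with no recording errors.

```python
import random
from statistics import mean

random.seed(0)

# Synthetic population with a known true mean.
population = [random.gauss(100, 15) for _ in range(100_000)]
true_mean = mean(population)

# 100% survey, poorly done: assume 10% of records are miscopied,
# each with a systematic error of +20 (errors that do not average out).
sloppy = [x + 20 if random.random() < 0.10 else x for x in population]

# 1% survey, carefully done: a simple random sample recorded exactly.
careful = random.sample(population, 1000)

sloppy_error = abs(mean(sloppy) - true_mean)
careful_error = abs(mean(careful) - true_mean)

print(f"error of the sloppy 100% survey: {sloppy_error:.2f}")
print(f"error of the careful 1% sample:  {careful_error:.2f}")
```

The sketch only covers systematic recording errors; purely random, unbiased slips would largely average out in a large survey, so the comparison depends on the kind of sloppiness assumed.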
Third, much social data is obtained via questionnaires. But it is a well documented fact the way the
questions are phrased, the way they are ordered in sequence, the people who ask them or come along and
wait for them to be filled out, all have serious effects on the answers. Of course, in a simple black and white
situation this does not apply, but when you make a survey then generally the situation is murky or else you
would not have to make it. I regret I did not keep a survey the American Mathematical Society once made of its members. I was so indignant at the questions, which were framed to get exactly the answers they
wanted, I sent it back with that accusation. How few mathematicians faced with questions, carefully led up
to in each case, such as: is there enough financial support for mathematics, enough for publications, enough
for graduate scholarships, etc., would say there was more than enough money available? The Mathematical
Society of course used the results to claim there was a need for more support for Mathematics in all
directions.
I recently filled out a long, important questionnaire (important in the consequent management actions
which might follow). I filled it out as honestly as I could, but realized I was not a typical respondent. Further thought suggested the class of people being surveyed was not homogeneous at all, but rather was a
collection of quite different subclasses, and hence any computed averages will apply to no group. It is much
like the famous remark, the average American family has 2 and a fraction children, but of course no family
has a fractional child! Averages are meaningful for homogeneous groups (homogeneous with respect to the
actions that may later be taken) but for diverse groups averages are often meaningless. As earlier remarked,
the average adult has one breast and one testicle, but that does not represent the average person in our
society.
If the range of responses is highly skewed we have recently admitted publicly the median is often
preferable to the average (mean) as an indicator.
The Perils of Unreliable Data
- Organizational hierarchies distort data because subordinates often provide information they believe will please their superiors rather than the truth.
- Questionnaires are frequently compromised by selection bias and 'made up' reports submitted by low-level employees to meet deadlines.
- Social data, such as rates of adultery, are notoriously difficult to measure accurately because they rely on self-reporting for sensitive behaviors.
- The 'randomized response' method using coin tosses is a clever but unproven technique designed to protect anonymity and encourage honest reporting of crimes.
- Historical polling failures, like the Literary Digest disaster, demonstrate how sampling bias can lead to catastrophic organizational failure.
- Designing and evaluating surveys is a specialized field that requires expert advice to avoid the inherent traps of data collection.
Those under you will often do what they think you want, and often it is not at all what you want!
Thus they often now publish the median income and
median price of houses, and not the average amounts.
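Both points, the median for skewed data and the emptiness of an average over a mixed group, are easy to check numerically. The figures below are hypothetical and chosen only for illustration.

```python
from statistics import mean, median

# Hypothetical incomes (in thousands) with one very large value.
incomes = [28, 31, 33, 35, 38, 40, 42, 45, 50, 900]
print("mean income:  ", mean(incomes))
print("median income:", median(incomes))

# Hypothetical responses from two quite different subclasses.
group_a = [1.0, 1.1, 0.9, 1.0]
group_b = [9.0, 9.1, 8.9, 9.0]
everyone = group_a + group_b
overall = mean(everyone)

# How far is the overall average from the nearest actual respondent?
nearest_gap = min(abs(x - overall) for x in everyone)
print("overall mean:", overall, "nearest respondent is", nearest_gap, "away")
```

In the first list a single large value pulls the mean far above what nine of the ten people earn, while the median stays with the bulk; in the second, the overall mean lands in the empty gap between the two subclasses and describes no one.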
Fourth, there is another aspect I urge you to pay attention to. I have said repeatedly the presence of a high
ranking officer of an organization will change what is happening in the organization at that place and at that
time, so while you are still low enough to have a chance please observe for yourself how questionnaires are filled in. I had a clear demonstration of this effect when I was on the Board of Directors of a computer company. I saw underlings did what they thought would please me, but in fact angered me a good deal, though I could say nothing to them about it. Those under you will often do what they think you want, and often it is not at all what you want! I suggest, among other things, you will find when headquarters, in your organization, sends out a questionnaire, then those who think they will rate high will more often than not promptly fill them out, and those who do not feel so will tend to delay, until there is a deadline and then
some low level person will fill them out from hunches without making the measurements which were to be
taken; it is too late to do it right, so send in what you can! What these "made up" reports do to the reliability of the whole is anyone's guess. It may make the results too high, too low, or even not change the results much. But it is from such surveys the top management must make their decisions, and if the data is bad it is likely the decisions will be bad.
A favorite pastime of mine, when I read or hear about some data, is to ask myself how people could have
gathered it; how could their conclusions be justified? For example, years ago when I was remarking on this point at a dinner party, a lovely widow said she could not see why data could not be gathered on any topic. After some moments of thought I replied, "How would you measure the amount of adultery per year on the
Monterey Peninsula?" Well, how would you? Would you trust a questionnaire? Would you try to follow people? It seems difficult, and perhaps impossible, to make any reasonably accurate estimate of the amount
of adultery per year. There are many other things like this which seem to be very hard to measure, and this
is especially true in social relationships.
There is a clever proposed method whose effectiveness I do not know in practice. Suppose you want to
measure the amount of murder which escapes detection. You interview people and tell them to toss a coin
without anyone but themselves seeing the outcome, and then if it is heads they should claim they have committed a murder, while if tails they should tell the truth. In the arrangement there is no way anyone
except themselves can know the outcome of the toss, hence no way they can be accused of murder if they
say so. From a large sample the slight excess of murders above one half gives the measure you want. But that supposes the people asked, and given protection, will in fact respond accurately. Variations on this
method have been discussed widely, but a serious study to find the effectiveness is still missing, so far as I
know.
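The arithmetic behind the coin-toss scheme can be sketched as follows. If the true rate is p, a fair coin makes the probability of a "yes" equal to 1/2 + p/2, so p is estimated by twice the observed fraction of "yes" answers minus one. The simulation assumes a fair coin and fully honest answers on tails, which is exactly the supposition the text questions; the rate of 2% and the function names are made up for illustration.

```python
import random

def simulate_answers(true_rate, n, rng):
    """Count 'yes' answers: heads means say yes regardless,
    tails means answer truthfully."""
    yes = 0
    for _ in range(n):
        guilty = rng.random() < true_rate
        heads = rng.random() < 0.5
        if heads or guilty:
            yes += 1
    return yes

def estimate_rate(yes, n):
    """Invert P(yes) = 1/2 + p/2 to get p = 2 * (yes / n) - 1."""
    return 2 * yes / n - 1

rng = random.Random(1)
n = 200_000
yes = simulate_answers(true_rate=0.02, n=n, rng=rng)
estimate = estimate_rate(yes, n)
print(f"fraction answering yes: {yes / n:.4f}")
print(f"estimated true rate:    {estimate:.4f}")
```

Because the forced "yes" answers add noise, the estimate has roughly twice the standard error of the observed fraction, which is why the method needs a large sample to measure a small rate.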
In closing, you may have heard of the famous election where the newspapers announced the victory for
President to one man when in fact the other won by a landslide. There is also the famous Literary Digest poll
which was conducted via the telephone, and was amazingly wrong afterwards; so far wrong the Literary Digest folded soon after, some people say because of this faulty poll. It has been claimed at that time the
ownership of a telephone was correlated with wealth and wealth with a political party; hence the error.
Surveys are not a job for an amateur to design, administer and evaluate. You need expert advice on
questionnaires (not just a run-of-the-mill statistician) when you get involved with questionnaires, but there
seems little hope questionnaires can be avoided.
Systems Engineering and Global Vision
- Modern leadership requires navigating treacherous social data and personal attitudes rather than just hard material facts.
- The parable of the cathedral builder illustrates the difference between focusing on isolated tasks and contributing to a grander purpose.
- Most professionals suffer from a myopic view, focusing on technical details like 'teaching partial fractions' rather than the ultimate goal of education.
- Systems engineering is defined as the constant effort to keep larger goals in mind and translate local actions into global results.
- The scope of one's responsibility is ever-expanding, moving from personal output to departmental, corporate, and eventually global impact.
- There is no sharp boundary where a professional's obligations end, as every system is nested within a larger context.
It is characteristic of most people they keep a myopic view of their work and seldom, if ever, connect it with the larger aims they will admit, when pressed hard, are the true goals of the system.
More and more we want not mere facts about hard material things, but we want social and other attitudes surveyed, and this is indeed very treacherous ground.
In summary, as you rise in your organization you will need more and more of this kind of information
than was needed in the past since we are becoming more socially oriented and subject to lawsuits for trivial
things. You will be forced, again and again, to make surveys of personal attitudes of people, and it is for these reasons I have spent so much time on the topic of unreliable data. You need reliable data to make reliable decisions, but you will seldom have it with any reliability!
28
Systems Engineering
Parables are often more effective than is a straight statement, so let me begin with a parable. A man was
examining the construction of a cathedral. He asked a stone mason what he was doing chipping the stones,
and the mason replied, "I am making stones". He asked a stone carver what he was doing, "I am carving a
gargoyle". And so it went, each person said in detail what they were doing. Finally he came to an old
woman who was sweeping the ground. She said, "I am helping build a cathedral".
If, on the average campus, you asked a sample of professors what they were going to do the next class
hour, you would hear they were going to: "teach partial fractions", "show how to find the moments of a normal distribution", "explain Young's modulus and how to measure it", etc. I doubt you would often hear
a professor say, "I am going to educate the students and prepare them for their future careers".
You may claim in both cases the larger aim was so well understood there was no need to mention it, but I
doubt you really believe it. Most of the time each person is immersed in the details of one special part of the
whole and does not think of how what they are doing relates to the larger picture. It is characteristic of most people they keep a myopic view of their work and seldom, if ever, connect it with the larger aims they will
admit, when pressed hard, are the true goals of the system. This myopic view is the chief characteristic of a
bureaucrat. To rise to the top you should have the larger view, at least when you get there.
Systems engineering is the attempt to keep at all times the larger goals in mind and to translate local
actions into global results. But there is no single larger picture. For example, when I first had a computer
under my complete control I thought the goal was to get the maximum number of arithmetic operations done by the machine each day. It took only a little while before I grasped the idea it was the amount of
important computing, not the raw volume, that mattered. Later I realized it was not the computing for the
Mathematics department, where I was located, but the computing for the research division which was
important. Indeed, I soon realized to get the most value out of the new machines it would be necessary to get
the scientists themselves to use the machine directly so they would come to understand the possibilities computers offered for their work and thus produce less actual number crunching, but presumably more of the computing done would be valuable to Bell Telephone Laboratories. Still later I saw I should pay
attention to all the needs of the Laboratories, and not just the Research Department. Then there was AT&T,
and outside AT&T the Country, the scientific and engineering communities, and indeed the whole world to
be considered. Thus I had obligations to myself, to the department, to the division, to the company, to the parent company, to the country, to the world of scientists and engineers, and to everyone. There was no sharp boundary I could draw and simply ignore everything outside.
The obligations in each case were of: (1) immediate importance, (2) longer range importance, and (3)
The Systems Engineering Paradox
- The author's role shifted from solving immediate problems to developing methodologies and educating others for long-term research sustainability.
- True systems engineering is defined by tangible output rather than the ability to articulate theoretical concepts.
- A fundamental rule of the field is that optimizing individual components often leads to the degradation of the overall system performance.
- The common assumption that improving an isolated part benefits the whole is a logical fallacy in complex engineering.
- A practical example involved a differential analyzer where 'improvements' to a second unit caused the combined system to fail its first test.
If you optimize the components you will probably ruin the system performance.
very long term importance. I also realized under (2) and (3) one of my functions in the research department
was not so much to solve the existing problems as to develop the methods for solving problems, to expand
the range of what could be done, and to educate others in what I had found so they could continue, extend,
and improve my earlier efforts.
In systems engineering it is easy to say the right words, and many people have learned to say them when
asked about systems engineering, but as in many sports such as tennis, golf, and swimming it is hard to do
the necessary things as a whole. Hence systems engineers are to be judged not by what they say but by what
they produce. There are many people who can talk a good game but are not able to play one.
The first rule of systems engineering is:
If you optimize the components you will probably ruin the system performance.
This is a very difficult point to get across. It seems so reasonable if you make an isolated component better
then the whole system will be better, but this is not true; rather the system performance will probably
degrade! As a simple example, I was running a differential analyzer and was so successful in solving
important problems there was need for both a bigger one and a second one. Therefore we ordered a second one which was to be connected with the first so the two could be either operated separately or together.
They built a second model and wanted to make improvements, which I agreed to only if it would not
interfere with the operation of the whole machine. Came the day of acceptance on the shop floor before
dismantling and moving it to our location. I started to test it with the aid of a reluctant friend who claimed I
was wasting time. The first test and it failed miserably! The test was the classic one of solving the differential
equation y'' + y = 0, y(0) = 1, y'(0) = 0,
The System Performance Rule
- Improving a single component in a complex machine can inadvertently ruin the performance of the entire system.
- A technical flaw in an analog computer was traced back to inadequate grounding caused by upgraded amplifiers.
- The 'system approach' suggests that optimizing individual parts often leads to detrimental back-circuit leakage or systemic failure.
- Cramming for individual courses is a form of component optimization that is counter-productive to a student's total education.
- True learning requires focusing on the long-term retention of knowledge rather than short-term grades or pleasing professors.
As I said, the improvement of a component in such a machine, even where each component is apparently self-standing, still ruined the system performance!
whose solution is, of course, y = cos t. You then plot y(t) against y'(t) and you should get a circle. How well it
closes on itself, loop after loop, is a measure of the accuracy.
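A digital analogue of this circle test can be sketched as follows. It is only an illustration under stated assumptions: the differential analyzer was an analog machine, and the Runge-Kutta integrator, the step sizes, and the name circle_test are choices made here, not part of the original test. The sketch integrates the oscillator whose solution is y = cos t and reports how far the radius of the (y, y') curve drifts from 1 over several loops.

```python
import math

def circle_test(step, loops=10):
    """Integrate y'' + y = 0 with y(0) = 1, y'(0) = 0 by classical RK4.

    Returns the worst deviation of the radius sqrt(y^2 + y'^2) from 1;
    a perfect integrator would trace the unit circle and return 0.
    """
    y, v = 1.0, 0.0                      # v stands for y'
    t_end = 2.0 * math.pi * loops        # total time for the requested loops
    n = int(round(t_end / step))         # number of steps
    h = t_end / n                        # actual step so we end exactly at t_end
    worst = 0.0
    for _ in range(n):
        # RK4 stages for the first-order system y' = v, v' = -y
        k1y, k1v = v, -y
        k2y, k2v = v + 0.5 * h * k1v, -(y + 0.5 * h * k1y)
        k3y, k3v = v + 0.5 * h * k2v, -(y + 0.5 * h * k2y)
        k4y, k4v = v + h * k3v, -(y + h * k3y)
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
        worst = max(worst, abs(math.hypot(y, v) - 1.0))
    return worst

fine = circle_test(0.01)    # small step: the circle closes almost exactly
coarse = circle_test(0.5)   # large step: the curve visibly spirals inward
```

The single number returned plays the role of the eye judging how well the plotted circle closes on itself, loop after loop.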
So we tried the test with other components, and got the same result. My friend had to admit there was something
seriously wrong, so we called in the people who constructed it and pointed out the flaw, which was so
simple to exhibit they had to admit there was something wrong. They tinkered and tinkered while we
watched, and finally my friend and I went to lunch together. When we came back they had located the trouble.
They had indeed improved the amplifiers a great deal, but now currents through the inadequate grounding
were causing back circuit leakage! They had merely to put in a much heavier copper grounding and all was
well. As I said, the improvement of a component in such a machine, even where each component is
apparently self-standing, still ruined the system performance! It is a trivial example, but it illustrates the
point of the rule. Usually the effect of the component improvement is less dramatic and clear cut, but
equally detrimental to the performance of the whole system.
You probably still do not believe the statement so let me apply this rule to you. Most of you try to pass
your individual courses by cramming at the end of the term, which is to a great extent counter-productive,
as you well know, to the total education you need. You look at your problem as passing the courses one at a
time, or a term at a time, but you know in your hearts what matters is what you emerge with at the end, and
what happens at each stage is not as important. During my last two undergraduate college years when I was
at the University of Chicago, the rule was at the end you had to pass a single exam based on 9 courses in your
major field, and another exam based on 6 in your minor field, and these were mainly what mattered, not
what grades you got along the way. I, for the first time, came to understand what the system approach to
education means. While taking any one course, it was not a matter of passing it, pleasing the professor, or
anything like that, it was learning it so at a later date, maybe two years later, I would still know the things
which should be in the course.
The Failure of Sub-Optimization
- Cramming for exams is a form of sub-optimization where students prioritize short-term grades over the long-term goal of education.
- Systems engineering is difficult because practitioners often lose sight of the whole while becoming obsessed with individual components.
- Mathematics education has been damaged by optimizing individual courses, leading to significant gaps in student knowledge regarding complex numbers and induction.
- Educational reforms often fail because they focus on adding specific tools, like computers, rather than defining the total mathematical education required.
- The Venetian arsenal (1200-1400) serves as a historical example of successful systems engineering through its 'just in time' production of ships and crews.
Systems engineering is a hard trade to follow; it is so easy to get lost in the details!
Cramming is clearly a waste of time. You really know it is, but the behavior of most of you is a flat
denial of this truth. So, as I said above, words mean little in judging a systems engineering job, it is what is
produced that matters. The professors believe, as do those who are paying the bill for your education, and
probably some of you also, what is being taught will probably be very useful in your later careers, but you
continue to optimize the components of the system to the detriment of the whole! Systems engineering is a
hard trade to follow; it is so easy to get lost in the details! Easy to say; hard to do. This example should show
you the reality of my remark that many people know the words but few can actually put them into practice when
the time comes for action in the real world. Most of you cannot!
As another example of the effects of optimizing the components of a system, consider the teaching of the
lower level Mathematics courses in college. Over the years we have optimized both the calculus course and
linear algebra, and we have stripped out anything not immediately relevant to each course. As a result the
teaching of Mathematics, viewed as a whole, has large gaps. We barely mention: (1) the important method
of Mathematical induction, (2) after a brief mention in algebra in connection with quadratic equations we
ignore, almost in holy dread, any mention of complex numbers until the fatal day, late in the linear algebra
course, when complex eigenvalues and eigenfunctions arise and the poor student is faced with two new,
difficult concepts at once and is naturally baffled, (3) the important, useful method of undetermined
coefficients is briefly mentioned, (4) impossibility proofs are almost totally ignored, (5) discrete
Mathematics is ignored, (6) little or no effort goes into trying to convert what to many of the students are
just "chicken tracks on paper" into meaningful concepts which are applicable to the real world; and so it
goes, large parts of any reasonable Mathematical education are omitted in the urge to optimize the
individual courses. Usually the inner structure of the calculus and the central role of the limit is glossed over
as not essential.
All the proposed reformations of the standard calculus course I have examined, and there are many, never
begin by asking, "What is the total Mathematical education and what therefore should be in the calculus
course?" They merely try to include computers, or some such idea, without examining the system of total
Mathematical education which the course should be a part of. The systems approach to education is not
flourishing, rather the enthusiasts of various aspects try to mold things to fit their local enthusiasms. The
question, as in so many situations, "What is the total problem in which this part is to fit?" is simply regarded
as too big, and hence the sub-optimization of the courses goes on. Few people who set out to reform any
system try first to find out the total system problem, but rather attack the first symptom they see. And, of
course, what emerges is whatever it is, and is not what is needed.
I recently tried to think about the history of systems engineering, and just because a system is built it
does not follow the builder had the system rather than the components in mind. The earliest system I recall
reading about in its details is the Venetian arsenal in its heyday around 1200-1400. They had a production
line and as a new ship came down the line, the ropes, masts, sails, and finally the trained crew, were right
there when needed and the ship sailed away! At regular intervals another ship came out of the arsenal. It
was an early "just in time" production line which included the people properly trained as well as the
equipment built.
The early railroads were surely systems, but it is not clear to me the first builders did not try to get each
The Evolution of Systems Engineering
- Systems engineering emerged from the necessity of intermeshing complex parts into a reliable, functioning whole rather than optimizing components in isolation.
- The telephone company pioneered this field because they provided a service rather than just equipment, requiring every part to interconnect with high reliability.
- Large-scale systems often face a 'diseconomy of scale' where adding new nodes increases complexity and expense exponentially, requiring shrewd design.
- Modern systems must be designed for flexibility to accommodate constant upgrades and field changes that cannot be predicted during the initial design phase.
- Education should focus on enduring fundamentals rather than transient technical details, as half of current engineering specifics become obsolete within 15 years.
I have already observed I did not immediately grasp the systems approach as I was running the computers, but at least I gradually realized the computers were but a part of a research-development organization, vital to be sure, but it was their value to the system which mattered in the long run.
part optimized and really did not think, until after the whole was going, there was a system to consider:
how the parts would intermesh to attain a decent operating system.
I suspect it was the telephone company which first had to really face the problems of systems engineering.
If decent service was to be supplied then all the parts had to interconnect, and work at a very high reliability
per part. From the first the company provided a service, not just equipment. That is a big difference. If you
merely construct something and leave it to others to keep it running it is one thing; if you are also going to
operate it as a service then it is another thing entirely! Others had clearly faced small systems as a whole, but
the telephone system was larger and more complex than anything up to that point. They also found, perhaps
for the first time, in expanding there is not an economy of scale but a diseconomy; each new customer must
be connected with all the previous customers, and each new one is therefore a larger expense, hence the
system must be very shrewdly designed.
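The diseconomy can be made concrete with a little arithmetic. The sketch below takes the statement literally and assumes a naive full mesh in which every customer has a direct link to every other; that model and the function names are illustrative assumptions made here, and real telephone networks use switching precisely to avoid this growth.

```python
def links_added_by_customer(n):
    """Links needed when the n-th customer joins: one to each of the n - 1 already connected."""
    return n - 1

def total_links(n):
    """Total pairwise links among n customers in a full mesh: n(n - 1)/2."""
    return n * (n - 1) // 2

# The marginal cost grows with the size of the system instead of shrinking.
cost_of_10th = links_added_by_customer(10)
cost_of_1000th = links_added_by_customer(1000)
```

Under this assumption each new customer costs more than the last, which is the opposite of an economy of scale and the reason the design must be shrewd.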
I do not pretend to understand how I, with a classical pure Mathematics education, was converted to
being a systems engineer, but I was. I suppose it started quietly with my college education, but it really got
started at Los Alamos where it was obvious to all of us we were constructing a design for which every
component had to be properly coordinated if the whole was to do what it had to do, including fit into the
bomb bay of the current airplane. And to do the job rapidly before the enemy, who was known to be
working on it too, reached success.
The Nike guided missile systems, the computer systems I ran, and many other aspects of the work at Bell
Telephone Laboratories all taught me the facts of systems engineering, not abstractly, but in hard lessons
daily illustrated by idiots who did not understand the whole as a whole, but only the components. I have
already observed I did not immediately grasp the systems approach as I was running the computers, but at
least I gradually realized the computers were but a part of a research-development organization, vital to be
sure, but it was their value to the system which mattered in the long run, how well the computers helped
reach the organization's goals, as well as society's goals, and not how comfortable it was for the staff
operating the computers.
That brings up another point, which is now well recognized in software for computers but it applies to
hardware too. Things change so fast part of the system design problem is the system will be constantly
upgraded in ways you do not now know in any detail! Flexibility must be part of modern design of things
and processes. Flexibility built into the design means not only you will be better able to handle the changes
which will come after installation, but it also helps your own work with the small changes which
inevitably arise both in the later stages of design and in the field installation of the system. I had not realized
how numerous these field changes were until the early Nike field test at Kwajalein Island. We were
installing it and still there was a constant stream of field changes going out to them!
Thus rule 2:
Part of systems engineering design is to prepare for changes so they can be gracefully made and still
not degrade the other parts.
Returning to your education, our real problem is not to prepare you for our past, or even the present, but to
prepare you for your future. It is for this reason I have stressed the importance of what currently is believed
to be the fundamentals of various fields, and have deliberately neglected the current details which will
probably have a short lifetime. I cited earlier the half-life of engineering details as being 15 years:
half of the details you learn now will probably be useless to you in 15 years.
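Read literally as an exponential half-life (an assumption made here; the text gives only the 15-year figure), the fraction f(t) of today's details still useful after t years would be roughly:

```latex
f(t) = \left(\tfrac{1}{2}\right)^{t/15},
\qquad f(15) = \tfrac{1}{2}, \quad f(30) = \tfrac{1}{4}, \quad f(45) = \tfrac{1}{8}
```

On that reading only about an eighth of the details learned now would still serve at the end of a 45-year career, which is the case for stressing fundamentals.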
Rule 3:
Systems Engineering and Graceful Decay
- Strict adherence to design specifications can lead to catastrophic failure when systems are overloaded.
- Bridges and telephone central offices serve as primary examples of systems that degrade rapidly if not designed for excess capacity.
- Effective systems engineering requires planning for 'graceful decay' rather than immediate collapse under stress.
- The author draws heavily from H.R. Westerman's philosophical essays on the nature and methodology of systems engineering.
- Systems engineering is defined not just by technical execution but by the 'what, how, and why' of complex organizational frameworks.
The closer you meet specifications the worse the performance will be when overloaded.
The closer you meet specifications the worse the performance will be when overloaded.
The truth of this is obvious when building a bridge to carry a certain load; the slicker the design to meet the
prescribed load the sooner the collapse of the bridge when the load is exceeded. One sees this also in a
telephone central office; when you design the system to carry the maximum load then with a slight overload
of traffic performance degrades immediately. Hence good design generally includes the graceful decay of
performance when the specifications are exceeded.
In preparation for writing this chapter I reread once more an unpublished set of essays, One Man's
Systems Engineering, by H.R. Westerman (1975), then of Bell Telephone Laboratories. They are the only
deeply philosophical discussion I know of the "what, how, and why" of systems engineering. While I will
differ slightly at various points from what he says, I am in fundamental agreement with him. I can
only summarize, all too briefly, what he says in 10 essays whose titles are:
1. One Man's Systems Engineering.
2. What is Systems Engineering?
3. On the Objective.
4. What Does a Systems Engineer Do?
5. The Framework of Systems Engineering.
6. Organization and Systems Engineering.
7. Objectives and Policy Makers.
8. On the Methodology of Systems Engineering.
9. Evaluation and (Un)Common Sense.
10. Envoy.
The Nature of Systems Engineering
- Systems engineering relies on interdisciplinary teams of specialists who must return to their core fields to maintain expertise.
- The introduction of a solution fundamentally alters the environment, often creating new problems or unintended behaviors.
- A systems engineer must prioritize global optimization over the local optimization strategies of individuals.
- The design process is never truly finished because solutions generate deeper insights and new dissatisfactions.
- Engineers must look beyond a client's reported symptoms to identify and address the underlying causes of a problem.
- Unlike academic problems with fixed answers, systems engineering is an evolutionary process with no final solution.
The optimal strategy for the individual was clearly opposed to the optimal strategy for the whole of the laboratories, and it is one of the functions of the systems engineer to block most of the local optimization of the individuals of the system and reach for the global optimization for the system.
The list shows clearly his breadth of vision, which arose from many years on both military projects and
telephone systems problems.
He believes more in the group which attacks systems engineering problems than in the individual
problems attacked, whereas I, from my limited experience in computing where I had no one near by to talk
to about the proper use of computers, had to do it single handed. Of course his problems were far more
difficult than mine.
He believes specialists brought together to make a team are the basis of systems engineering, and
between jobs they must go back to their specialties to maintain their expertise. Using the group too often to
fight fires is detrimental in the long run since then the individuals do not keep their skills honed up in their
areas.
We both agree a systems engineering job is never done. One reason is the presence of the solution
changes the environment and produces new problems to be met. For example, while running the computing
center in the early days I came to the belief small problems were relatively more important than large ones;
regular, dependable service was a desirable thing. So I instituted a 1 hour period in each morning and each
afternoon during which only 3 minute (or less) problems were to be run (mainly program testing) and if you
ran over 5 minutes you got off the machines no matter how much you had claimed you were practically
finished. Well, people with 10 minute problems broke them up into three small pieces with different people
for each piece and ran them under the rules, thus increasing the load on the input/output facilities. My
solution's very presence alters the system's response. The optimal strategy for the individual was clearly
opposed to the optimal strategy for the whole of the laboratories, and it is one of the functions of the
systems engineer to block most of the local optimization of the individuals of the system and reach for the
global optimization for the system.
A second reason the systems engineer's design is never completed is the solution offered to the original
problem usually produces both deeper insight and dissatisfactions in the engineers themselves. Furthermore,
while the design phase continually goes from proposed solution to evaluation and back again and again,
there comes a time when this process of redefinement must stop and the real problem coped with, thus
giving what they realize is, in the long run, a suboptimal solution.
Westerman believes, as I do, while the client has some knowledge of his symptoms, he may not
understand the real causes of them, and it is foolish to try to cure the symptoms only. Thus while the systems
engineers must listen to the client they should also try to extract from the client a deeper understanding of
the phenomena. Therefore, part of the job of a systems engineer is to define, in a deeper sense, what the
problem is and to pass from the symptoms to the causes.
Just as there is no definite system within which the solution is to be found, and the boundaries of the problem
are elastic and tend to expand with each round of solution, so too there is often no final solution, yet each
cycle of input and solution is worth the effort. A solution which does not prepare for the next round with
some increased insight is hardly a solution at all.
I suppose the heart of systems engineering is the acceptance there is neither a definite fixed problem nor a
final solution, rather evolution is the natural state of affairs. This is, of course, not what you learn in school
where you are given definite problems which have definite solutions.
How, then, can the schools adapt to this situation and teach systems engineering, which because of the
elaboration of our society, becomes ever more important? The idea of a laboratory approach to systems
engineering is attractive until you examine the consequences. The systems engineering described above
The Art of Systems Engineering
- Traditional education focuses on definite techniques for definite problems, whereas systems engineering requires formulating problems from a background of indefiniteness.
- The complexity of real-world systems, including organizational habits and personnel characteristics, is nearly impossible to replicate in a classroom setting.
- Systems engineering is often learned through apprenticeship in teams or through simplified 'toy' stories that capture the essence of complex compromises.
- A common failure in engineering is solving the wrong problem correctly; systems engineering prioritizes solving the right problem, even if the initial solution is imperfect.
- The ultimate goal of a system engineer is to gain deep insight into the problem's nature, while the client typically seeks immediate relief from symptoms.
- The evolution of the Nike missile project illustrates how a system's purpose can shift from shooting down a single target to a broader strategy of economic attrition.
In a sense systems engineering is trying to solve the right problem, perhaps a little wrongly, but with the realization the solution is only temporary.
depends heavily on the standard school teaching of definite techniques for solving definite problems. The
new element is the formulation of a definite problem from the background of indefiniteness which is the
basis of our society. We cannot elide the traditional training, and the schools have not the time, nor the
resources, except in unusual cases, to take on the new topic, systems engineering. I suppose the best that can
be done is regular references to how the classroom solutions we teach differ from the reality of systems
engineering.
Westerman believes, apparently, the art of systems engineering must be learned in a team composed of
some old hands and some new ones. He recognizes the old hands have to be gradually removed and new
people brought into the team. I have no answer for how to teach my "lone wolf" experiences except what I
have done so far, by stories of what happened to me in given situations. Usually the actual circumstances
are so complex it takes a long, long time to get across the outside policies, organization habits,
characteristics of personnel that will run the final system, operating conditions in the field, tradition, etc. which
surround, and to a great extent circumscribe, the solution to be offered to the systems problem. The solution
is usually a great compromise between conflicting goals, and the student seldom appreciates the importance
of the intangible parts of the boundary which shape the form of the answer. Thus real systems engineering
problems are almost impossible to exhibit in proper realistic detail; instead toy situations and stories must
be used which, while eliminating much detail, do not distort things too much.
If you will look back on these chapters you will find a great deal of just this: the stories were often about
systems engineering situations which were greatly simplified. I suppose I am a dedicated systems engineer
and it is inevitable I will always lean in that direction. But let me say again, systems engineering must be built
on a solid ground of classical simplification to definite problems with definite solutions. I doubt it can be
taught ab initio.
Let me close with the observation I have seen many, many solutions offered which solved the wrong
problem correctly. In a sense systems engineering is trying to solve the right problem, perhaps a little wrongly,
but with the realization the solution is only temporary and later on during the next round of design these
accepted faults can be caught provided insight has been obtained. I said it before, but let me say it again, a
solution which does not provide greater insight than you had when you began is a poor solution indeed, but
it may be all that you can do given the time constraints of the situation. The deeper, long term understanding
of the nature of the problem must be the goal of the system engineer, whereas the client always wants
prompt relief from the symptoms of his current problem. Again, a conflict leading to a meta systems
engineering approach!
As an example of the deepening of our understanding of a system and its problems, consider the Nike
guided missile project. At first it was to build a missile which would shoot down a single target. This
accomplished, we began to think of a battery of Nike missiles and how to coordinate the individual missiles
when under attack by a fleet of enemy airplanes. Then came the day when we began to think about what
targets to defend, which cities to defend and which not to. We began to realize the answer is all targets
should be equally expensive to the enemy: there should be no under-defended or over-defended target,
each should be defended in proportion to the damage that could be done by the enemy. Thus we began to
see the Nike missile is merely a device to make the enemy pay a price for the damage he can inflict, with no
"cheap" targets available. How different this view is from the one with which we began! It illustrates the
The Evolution of Systems Engineering
- Effective solutions should lead to a deeper understanding of the underlying problem.
- Initial symptoms of a problem are often temporary and shift as success is achieved.
- Project goals are dynamic and must evolve alongside the customer's deepening insight.
- There is a critical distinction between true practitioners and those who only talk about the field.
- The profession faces a significant need to replace ineffective talkers with skilled engineers.
- Measurement systems fundamentally dictate the outcomes and behaviors within a project.
There is a great need for real systems engineers, as well as perhaps a greater need to get rid of those who merely talk a good story but cannot play the game effectively.
point each solution should bring further understanding of the problem; the first symptoms they tell you will
not last long once you begin to succeed; the goal will be constantly changing as your and the customer's
understanding deepen.
Systems engineering is indeed a fascinating profession, but one which is hard to practice. There is a great
need for real systems engineers, as well as perhaps a greater need to get rid of those who merely talk a good
story but cannot play the game effectively.
29
You Get What You Measure
The Influence of Measurement
- The choice of measurement tools and metrics fundamentally dictates the behavior and outcomes of a system.
- Quarterly profit tracking often incentivizes short-term gains at the expense of long-term corporate health.
- High initial performance ratings discourage risk-taking because employees have more to lose than to gain by deviating from safety.
- Organizations often prioritize 'hard' accurate data over 'soft' relevant data, leading to a focus on training rather than education.
- Statistical distributions, such as the normal distribution of IQ, are often artifacts of how the tests are calibrated rather than inherent natural laws.
Accuracy of measurement tends to get confused with relevance of measurement, much more than most people believe.
You may think the title means if you measure accurately you will get an accurate measurement, and if not
then not; but it refers to a much more subtle thing: the way you choose to measure things controls to a
large extent what happens. I repeat the story Eddington told about the fishermen who went fishing with a
net. They examined the size of the fish they caught and concluded there was a minimum size to the fish in
the sea. The instrument you use clearly affects what you see.
The current popular example of this effect is the use of the bottom line of the profit and loss statement
every quarter to estimate how well a company is doing, which produces a company interested mainly in
short term profits and with little regard for long term profits.
If in a rating system everyone starts out at 95% then there is clearly little a person can do to raise their
rating but much which will lower the rating; hence the obvious strategy of the personnel is to play things
safe, and thus eventually rise to the top. At the higher levels, much as you might want to promote for risk
taking, the class of people from whom you may select them is mainly conservative!
The rating system in its earlier stages may tend to remove exactly those you want at a later stage.
Were you to start with a rating system in which the average person rates around 50% then it would be more
balanced; and if you wanted to emphasize risk taking then you might start at the initial rating of 20% or
less, thus encouraging people to try to increase their ratings by taking chances since there would be so little
to lose if they failed and so much to gain if they succeeded. For risk taking in an organization you must
encourage a reasonable degree of risk taking at the early stages, together with promotion, so finally some
risk takers can emerge at the top.
Of the things you can choose to measure some are hard, firm measurements, such as height and weight,
while some are soft such as social attitudes. There is always a tendency to grab the hard, firm measurement,
though it may be quite irrelevant as compared to the soft one which in the long run may be much more
relevant to your goals. Accuracy of measurement tends to get confused with relevance of measurement,
much more than most people believe. That a measurement is accurate, reproducible, and easy to make does
not mean it should be done, instead a much poorer one which is more closely related to your goals may be
much preferable. For example, in school it is easy to measure training and hard to measure education, and
hence you tend to see on final exams an emphasis on the training part and a great neglect of the education
part.
Let me turn to another effect of a measurement system, and illustrate it by the definition and use of IQs.
What is done is a plausible list of questions, plausible from past experience, is made, and then tried out on a
small sample of people. Those questions which show an internal correlation with others are kept and those
which do not correlate well are dropped. Next, the revised test is calibrated by using it on a much larger sample.
How? Simply by taking the accumulated scores (the number of people's scores which are below the given
amount) and plotting these revised numbers on probability paper, meaning the cumulative probabilities of
a normal distribution are the horizontal lines. Next the points where the cumulative actual scores fall at
given percentage points are related, via a calibration table, to the corresponding points on the cumulative
normal probability curve. As a result it is observed intelligence has a normal distribution in the population!
Of course it has, it was made to be that way! Furthermore, they have defined intelligence to be what is
measured by the calibrated exam, and if that is the definition of intelligence then of course intelligence is
normally distributed. But if you think maybe intelligence is not exactly what the calibrated exam measures,
The Artifact of Measurement
- The normal distribution of intelligence or grades is often an artifact of the measurement method rather than a reflection of reality.
- An instructor can manipulate the distribution of exam grades by adjusting the ratio of easy, moderate, and hard questions.
- Exams are frequently designed to create clarity around the pass-fail threshold rather than to measure absolute knowledge.
- Rating systems are limited by the dynamic range used by the individuals providing the data.
- Discrepancies in how people use a scale can lead to skewed results where one person's extreme rating outweighs another's moderate one.
If I could make up an exam which was uniformly hard, then each student would tend either to get all the answers right or all wrong.
then you are entitled to doubt intelligence is normally distributed in the population. Again, you get what
was measured, and the normal distribution announced is an artifact of the method of measurement and
hardly relates to reality.
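The calibration step described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure of any actual test publisher: the skewed raw scores, the sample size, and the target scale (mean 100, standard deviation 15) are all invented for the example. Each raw score is replaced by the point on the normal curve that has the same cumulative fraction, which plays the role of the calibration table.

```python
import random
from statistics import NormalDist, mean, pstdev

def calibrate_to_normal(raw_scores, mu=100.0, sigma=15.0):
    """Map raw scores onto a normal scale by matching cumulative fractions."""
    target = NormalDist(mu, sigma)          # assumed target scale
    n = len(raw_scores)
    order = sorted(range(n), key=lambda i: raw_scores[i])
    calibrated = [0.0] * n
    for rank, i in enumerate(order):
        p = (rank + 0.5) / n                # cumulative fraction, strictly inside (0, 1)
        calibrated[i] = target.inv_cdf(p)   # the "calibration table" lookup
    return calibrated

def skewness(xs):
    """Third standardized moment: near 0 for a symmetric distribution."""
    m, s = mean(xs), pstdev(xs)
    return mean(((x - m) / s) ** 3 for x in xs)

random.seed(0)
raw = [random.expovariate(1.0) for _ in range(2000)]   # hypothetical, heavily skewed raw scores
cal = calibrate_to_normal(raw)

print(round(skewness(raw), 2))   # clearly positive: the raw scores are lopsided
print(round(skewness(cal), 2))   # essentially zero: the calibrated scores look normal
```

Whatever shape the raw scores have, the calibrated ones come out bell-shaped, because the mapping was built to make them so.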
In giving a final exam in a course, say in the calculus, I can get almost any distribution of grades I want.
If I could make up an exam which was uniformly hard, then each student would tend either to get all the
answers right or all wrong. Hence I will get a distribution of grades which peaks up at both ends
(Figure 29.I). If, on the contrary, I asked a few easy questions, many moderately hard, and a few very hard ones,
I would get the typical normal distribution: a few at each end and most of the grades in the middle
(Figure 29.II). It should be obvious if I know the class then I can get almost any distribution I want. Usually,
at the final exam time I am most worried about the pass-fail dividing point, and design the exam so I will
have little doubt as to how to act, as well as have the hard evidence in case of a complaint.
Still another aspect of a rating system is its dynamic range. Suppose you are given a scale of 1 to 10, with
5 being the average. Most people will give ratings of 4, 5, and 6, and seldom venture, if ever, to the
extremes of 1 and 9. If you give a 6 to what you like, but I use the entire dynamic range and assign a 2 to
what I do not like, then the effect of the two of us is that, while we may differ equally in our opinion, the sum of
Figure 29.I
Figure 29.II
Information Theory and Selection
- Using the full dynamic range of a rating scale maximizes the information communicated, whereas grade inflation reduces entropy.
- Ranking systems prevent grade inflation but risk unfairly penalizing high-performing individuals in exceptional cohorts.
- Academic fields attract individuals whose existing psychological peculiarities align with perceived rather than actual field features.
- The early focus on technical detail in STEM often filters out the creative minds needed for high-level conceptual work later on.
- Standardized recruitment processes often fail to identify top researchers because originality in science frequently correlates with non-conformity.
Hence, as at Bell Telephone Laboratories, usually the research people go out to do the hiring for the research area, and the personnel department shudders!
the ratings will be 6+2=8, and the average will be 4; the effect of my opinion more than wipes out yours!
In using a rating scheme you should try to use the entire dynamic range, and if you do you will have a much
larger effect on the final average, provided it is done, as most such cases are, by blind averaging of the
ratings assigned. Remember Coding Theory says the entropy (the average surprise) is maximum when the
distribution is uniform. You have the most information when all the grades are used equally, as you may
recall from Chapter 13 on Information Theory.
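The arithmetic of blind averaging can be checked directly. The sketch below uses only the numbers from the text (a scale with 5 as the average, your 6 and my 2); the function and variable names are made up for the illustration.

```python
def blind_average(ratings):
    """Plain arithmetic mean, with no adjustment for how each rater uses the scale."""
    return sum(ratings) / len(ratings)

MIDPOINT = 5       # the "average" mark on the 1-to-10 scale in the text
your_rating = 6    # mild approval: one step above the midpoint
my_rating = 2      # disapproval using more of the range: three steps below

combined = blind_average([your_rating, my_rating])
print(combined)              # 4.0
print(combined - MIDPOINT)   # -1.0, so the pooled verdict is negative
```

The rater who strays further from the midpoint dominates the pooled result, even though both opinions were held with equal conviction.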
If you regard giving grades in a course as a communication channel then, as just noted, the equally
frequent use of all the grades will communicate the maximum amount of information, whilst the typical
use in Graduate Schools of mainly the two highest grades, A and B, greatly reduces the amount of
information sent. I understand the Naval Academy uses rank in class, and in some sense this is the only
defense against "grade inflation" and the failure to use the whole dynamic range of the scale uniformly, thus
communicating the maximum amount of information, given a fixed alphabet for grades. The main fault with
using rank as the grade is that by chance there may be all very good people in a particular class, but some one of
them will have to be at the bottom!
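To make the point about information concrete, here is a small sketch of the entropy calculation. The five-letter grade alphabet (A, B, C, D, F) and the even split between A and B are assumptions made for illustration; the text says only that mainly the two highest grades are used.

```python
from math import log2

def entropy_bits(probabilities):
    """Shannon entropy in bits; grades that are never used contribute nothing."""
    return -sum(p * log2(p) for p in probabilities if p > 0)

# Hypothetical five-grade alphabet: A, B, C, D, F.
uniform_use = [0.2, 0.2, 0.2, 0.2, 0.2]    # every grade used equally often
inflated_use = [0.5, 0.5, 0.0, 0.0, 0.0]   # assumed: only A and B are ever given

print(entropy_bits(uniform_use))    # log2(5), about 2.32 bits per grade
print(entropy_bits(inflated_use))   # 1.0 bit per grade
```

Under these assumptions each inflated grade carries less than half the information that a uniformly used grade would.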
There is also the matter of how you initially attract people to the field. It is easy to see in psychology the
people who enter the field are mixed up in their heads more than the average professor and average student
in a college; it is not so much the courses do this, though I suspect they help to mix the student up further,
but the initial selection does it. Similarly, the hard and soft sciences have their attractions and repulsions
based on initially perceived features of the fields, and not necessarily on the actual features of the field. Thus
people tend to go into the fields which will favor their peculiarities, as they sense them, and then once in the
field these features are often further strengthened. The result is poorly balanced, but highly specialized, people,
which may often be necessary to succeed in the present situation.
In Mathematics, and in Computer Science, a similar effect of initial selection happens. In the earlier
stages of Mathematics up through the Calculus, as well as in Computer Science, grades are closely related
to the ability to carry out a lot of details with high reliability. But later, especially in Mathematics, the
qualities needed to succeed change and it becomes more proving theorems, patterns of reasoning, and the
ability to conjecture new results, new theorems, and new definitions which matter. Still later it is the ability
to see the whole of a field as a whole, and not as a lot of fragments. But the grading process has earlier, to a
great extent, removed many of those you might want, and who indeed are needed at the later stage! It is very
similar in Computer Science where the ability to cope with the mass of programming details favors one kind
of mind, one which is often negatively correlated with seeing the bigger picture.
The personnel employment department also has an effect on who is recruited into the system. If there is
recruiting for research then the typical member of the personnel department in a big organization is not
likely to want the right people. Good researchers, because the criterion is that they have originality in Science
and Engineering, are typically also original in other aspects of their behavior and dress,
meaning they do not appeal to the typical recruiter from the personnel department. Hence, as at Bell
Telephone Laboratories, usually the research people go out to do the hiring for the research area, and the
personnel department shudders! This is not a trivial point; the recruiting of one generation determines the
organization's next generation.
There is also the vicious feature of promotion in most systems.
The Hazards of Inbreeding
- Higher-level organizational members tend to select successors who resemble themselves to ensure comfort and harmony.
- This self-selection process creates a distinct organizational personality but often leads to intellectual inbreeding and a lack of innovation.
- Historical safeguards against inbreeding, such as departments refusing to hire their own graduates, are increasingly being ignored.
- The 'you get what you measure' principle applies to promotion criteria, where unconscious biases shape the future of the institution.
- Despite the inherent flaws and complexity of human nature, ranking and measurement are unavoidable necessities in any hierarchical society.
- Effective leadership requires conscious thought about measurement systems rather than passive acceptance of biased selection processes.
Never mind humans are at least as complex as vectors, and probably even more complex than matrices or tensors of numbers; the complex human, plus the effect of the environment they operate in, must somehow be reduced to a simple measure which makes an ordered array of choices.
At the higher levels the current members
choose the next generation, and they tend strongly to select people like themselves, people with whom
they will feel comfortable. The Board of Directors of a company has a strong control of the officers and next
Board members who are put up for election (the result of which is often more or less automatic). You tend
to get inbreeding, but also you tend to get an organization personality. Hence the all too common method
of promotion by self-selection at the higher levels of an organization has both good and bad features. This is
still on the topic you get what you measure, as there is a definite matter of evaluation, and the criteria used,
though unconscious, are still there.
In the distant past to combat this inbreeding most Mathematics Departments (a topic I am more familiar
with than for other Departments) had a general rule they did not employ their own graduates. The rule is not
now widely applied so far as I can see; quite the contrary, there seems to be a tendency to hire their own
graduates over outsiders. There have been several occasions when Economics Departments were so inbred
the top management of the University had to step in and do the hiring over the professors' dead bodies, as it
were, in order to gain a reasonable balance in the University of differing opinions. The same has happened
in Psychology Departments, Law, and no doubt in others.
As just mentioned, a rating system which allows those who are in to select the next generation has both
good and bad features, and needs to be watched closely for too much inbreeding. Some inbreeding means a
common point of view and more harmonious operation from day to day, but also it will probably not have
great innovations in the future. I suspect in the future, where I believe change will be the normal state of
things, this will become a more serious matter than it has been in the past, and it has definitely been a
problem in the past!
I trust you realize I am not trying to be too censorious about things, rather I am trying to illustrate the
topic of this chapter: you get what you measure. This is seldom thought about by people setting up a
rating, measuring, or other scheme of recording things, and yet in the long run it has enormous effects on
the entire system, usually in directions they never thought about at all!
Although measuring is clearly bad when done poorly, there is no escape from making measurements,
rating things, people, etc. Only one person can be the head of an organization at one time, and in the
selection there has to be a reduction to a simple scale of rating so a comparison can be made. Never mind
humans are at least as complex as vectors, and probably even more complex than matrices or tensors of
numbers; the complex human, plus the effect of the environment they operate in, must somehow be reduced
to a simple measure which makes an ordered array of choices. This may be done internally in the mind,
without benefit of conscious thinking, but it must be done whether you believe in rating people or not; there
is no escape in any society in which there are differences in rank, power to manage, or whatever other feature
you wish. Even on a program of entertainment, there has to be a first and a last performer; all cannot be
equally placed. You may hate to rate people, as I do, but it must be done regularly in our society, and in any
society which is not exactly equal at all points this must happen very often. You may as well realize this and
learn to do the job more effectively than most people do; they simply make a choice and go on, rather than
give the whole process a good deal of careful thought, as well as watching others doing it and learning from
them.
By now you see, I hope, how the various scales of measurement affect what happens. They are
fundamental yet they normally receive very little attention. To strengthen what I have been saying, I will
The Impact of Measurement Scales
- Nonlinear scale transformations, such as the Richter scale for earthquakes, fundamentally alter the distribution and perception of data compared to linear scales.
- The choice of measurement scale, whether additive or percentage-based, should depend on the specific context and the type of conclusion one aims to draw.
- Individuals and groups will actively optimize their behavior to exploit any rating system, often at the expense of the system's overall performance.
- Traditional measurement systems in military and business settings often create a facade of readiness that fails to reflect real-world capabilities.
- Training based on idealized reports rather than reality leads to leaders who are prepared for simulated games but not for actual crises.
Any change in the rating system you think will improve the system performance as a whole is apt to not work out well unless you have thought through the response of the individuals to the changeâthey will certainly change their behavior.
simply tell you more examples of how the measurement scale affects the system.
Earthquakes are almost always measured on the Richter scale, which effectively uses the log of the estimated
amount of energy in the earthquake. I am not saying this is the wrong measuring scale, but its effect is you
have few really large earthquakes, 7 and 8, and lots of small ones, 1 and 2. Think about it. I do not know the
distribution on Mother Nature's scale, but I doubt She uses the Richter scale. Linear transformations, as
from feet to meters, are not serious, but nonlinear scale transformations are another matter. Most of the time
we measure stimuli applied to humans on a log scale, but for weight and height we use linear scales. Linear
ones allow additivity easily, but for nonlinear scales you do not have this. For example, in measuring the
size of a herd you are apt to count the number of animals in the herd. Thus you have additivity: adding two
herds together gives the proper amount of the combined herd. If you have a herd of 3 and add 3 that is one
thing, but if you have a herd of 1000 and add 3 it is quite another thing; hence the additivity of the number
in the herd is not always the proper measure to use. In this case the percentage change might be more
informative.
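The herd example, and the loss of additivity on a log scale, can be put in numbers. In the sketch below the herd sizes come from the text; the energies are hypothetical values in arbitrary units, and the plain base-10 logarithm stands in for a log scale in general, not for the actual Richter formula.

```python
from math import log10

def percent_change(before, added):
    """Relative growth of a herd, in percent."""
    return 100.0 * added / before

# Additive view: both herds grow by the same 3 animals.
# Relative view: the same 3 animals mean very different things.
small_herd = percent_change(3, 3)       # the herd doubles
large_herd = percent_change(1000, 3)    # barely noticeable

# On a log scale the readings no longer add the way the raw quantities do.
energy_a = 1_000.0   # hypothetical energy, arbitrary units
energy_b = 1_000.0
log_of_sum = log10(energy_a + energy_b)           # reading for the combined energy
sum_of_logs = log10(energy_a) + log10(energy_b)   # what naive addition of readings gives

print(small_herd, large_herd)    # 100.0 versus about 0.3
print(log_of_sum, sum_of_logs)   # about 3.3 versus 6.0
```

Doubling the energy moves the log reading by only about 0.3, whereas adding the two readings would suggest a thousandfold increase.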
How, then, do you decide which scale to use in measuring things? I have no easy answer. Indeed, I have
the awful observation that while one scale of measurement is suitable for one kind of conclusion in a field,
another scale of measurement may be more appropriate for some other kind of decision in exactly the same
field! But how seldom is this recognized and used! Of course you may observe sometimes we quietly make
a transformation when we apply a given formula, but which scale of measurement to use is a difficult thing
to decide in any particular case. Much depends on the field and the existing theories, as well as the new
theories you hope to find! All of which is not much help to you in any particular situation.
There is another matter I mentioned in an earlier chapter, and must now come back to. It is the rapidity
with which the people respond to changes in a rating system. I told you how there was a constant battle
between me and the users of the computer, me trying to optimize the performance for the system as a whole,
and they trying to optimize their own use. Any change in the rating system you think will improve the
system performance as a whole is apt to not work out well unless you have thought through the response of
the individuals to the change; they will certainly change their behavior. You have only to think of your
own optimization of your careers, of how changes in the rating system in the past have altered some of your
plans and strategies.
Some systems of measurement clearly have bad features, but tradition, and other niceties, keep them
going. Take, for example, the state of readiness of a branch of the military. In the Navy ships are inspected on a
regular routine, one feature after another, and the skipper gets the ship and crew ready for each one, pretty
much neglecting the others until they come up. The skipper scores high, to be sure. But when we face
simulating war games, what is the true readiness of the fleet? Surely not what the reports say, as you can
easily imagine. But what do we have to use? Of course we must use the reported figures; we would not be
believed if we used other data! So we train people in war games to use an idealized fleet and not the real
one! It is the same in business games; we train the executives to win in the simulated game, and not in the
real world. I leave it to you to think about what you will do when you are in charge and want to know the
true readiness of your organization. Will random inspections solve everything? No! But they would improve
things a bit.
All organizations have this problem. You are now at the lower levels in your organization and you can
The Distortion of Organizational Measurement
- Inspections are often compromised by informal communication networks, allowing commanders to prepare for supposedly random evaluations.
- The popularity of a specific metric within an organization rarely correlates with its actual accuracy or relevance to the mission.
- Individual employees at every level tend to bend data to improve their personal appearance, though these distortions may partially cancel each other out.
- Top management is often willfully complicit in data distortion, reinterpreting capability assessments as probabilities to meet desired targets.
- Physical verification, such as 'nosing around' loading docks, often reveals systemic bad habits like scavenging parts to meet quarterly shipping quotas.
If the whole organization is working together to fool the top, there is little the top can do about it.
see for yourself how things are reported and how the reports differ from reality; it will still be the same
unless you, when you are in charge, change things drastically. The Air Force uses what are supposed to be
random inspections, but as a retired Navy Captain friend of mine once observed to me, every base
commander has a radar and knows what is in the air, and if he is surprised by an inspection team then he
must be a fool. But he has less time to prepare than for scheduled inspections, so presumably the inspection
reports are closer to reality than when inspections only occur at times known far in advance. Yes,
inspections are measurements, and you get what you measure. It is often only a little different in other
organizations: the news of a coming measurement (inspection) gets out on the grapevine of gossip, and the
receivee, while pretending to be surprised, has often prepared that very morning for it.
Another thing which is obvious, but seems necessary to mention: the popularity of a form of measurement
has little relationship to its accuracy or relevance to the organization.
Still another thing to mention is all up and down the organization each person is bending things so they
themselves will look good, so they think! About the only thing which saves top management is the various
lower levels can each only bend things a bit, and often the various levels have different goals and hence the
many bendings of the truth tend to partially annul each other due to the weak law of large numbers. If the
whole organization is working together to fool the top, there is little the top can do about it. When I was on
a Board of Directors I was so conscious of this I frequently came either a day early or else stayed a day late,
and simply wandered around asking questions, looking, and asking myself if things were as reported. For
example, once when inventory was very high, due to the change in the line of computers we were producing
which forced us to have parts of both lines on hand at the same time, I walked along, suddenly turned
towards the supply crib, and simply walked in. I then eyed things to decide if, in my own mind, there was
any great discrepancy or were the reported amounts reasonably accurate.
Again, were the computing machines we were supposed to be shipping actually on the loading dock, or
were they mythical, as has happened in many a company? Nosing around I found at the end of each
quarter the machines to be shipped were really shipped, but often by the process of scavenging the later
machines on the production line, and hence the next few weeks were spent in getting the scavenged
machines back to proper state. I never could stop that bad habit of the employees, though I was on the Board
of Directors! If you will but look around in your organization you will find lots of strange things which
really should not happen, but are regarded as customary practice by the personnel.
Another strange thing that happens is what at one level is regarded as one thing, is differently regarded at
a higher level. For example, it often happens the evaluations of capability of the organization at one level
are interpreted as probabilities at a higher level! Why does this happen? Simply because the lower level
cannot deliver what the higher one wants and hence delivers what it can do, and the higher level willfully,
because it wants its numbers, chooses to alter the meaning of the reports.
I have already discussed the matter of life tests; what can be done and what is needed are not the same at
all! At the moment we do not know how to deliver what is needed: reliability for years of operation at a
high level of confidence for parts which were first delivered to us yesterday. That problem will not go
away, but a lot can be done to design into things the needed reliability.
The Consequences of Measurement
- Engineering designs should prioritize built-in reliability to avoid the cycle of constant repairs.
- Removing human judgment from processes can eliminate random variation and reveal underlying patterns previously hidden.
- While human judgment handles infinite complexity, its subjectivity can also obstruct systematic progress.
- Measurement systems often create counter-incentives, such as bloated software caused by measuring productivity through lines of code.
- The design of any metric must account for how it will inevitably influence and potentially distort human behavior.
- The fundamental principle of organizational management is that you get exactly what you measure.
There is never time to do the job right, but there is always time to fix things later.
One of my first problems at Bell
Telephone Laboratories was the design of a series of concentric rings of copper and ceramic such that for
the choice of the radii, as temperatures changed, the ceramic would always be in compression and never in
tension where it has little strength. The design has a degree of reliability built into it! Too little has been
done in this direction in my opinion, but as I remarked before, when they said there was no time to do it,
"There is never time to do the job right, but there is always time to fix things later".
There are rating systems that have built into them a degree of human judgment, and that sounds good.
But let me tell you a story which made a big impression on me. I had produced a computing machine
method of evaluating the phase shifts from the measured gains at various frequencies in a signal which
replaced a human, hand method. I am not claiming it was better, only the hand method could not do the new
job when we passed from voice to TV bandwidths. A smart man said to me one day, "Before, when
humans did things, we could not make further improvements because of the random human variations; now
that you have removed the random element we can hope to learn things which were not apparent before".
Methods of rating that do not have human judgment have some advantages, but do not conclude I
am against putting in an element of human judgment. Most formal methods are necessarily finite, and the
complexity of reality is almost infinite, hence human judgment, wisely applied, is often a good thing,
though, as just noted, in a way it stands in the path of further progress with its subjective aspects.
From all of this please do not conclude measurement cannot be done (it clearly can), but the
question of the relevance and effects of a form of measurement should be thought through as best you can
before you go ahead with some new measurement in your organization. The inevitable changes that will
come in the future, and the increasing power of computers to automatically monitor things, mean many
new measuring systems will come into use, ones you yourself may have to design, organize, and install.
So let me tell you yet another story of the effect of measurement.
In computing, the programming effort is often measured by the number of lines of code; what easier
measure is there? From the coder's point of view there is absolutely no reason to try to clean up a piece of
code; quite the contrary, to get a higher rating on the productivity scale there is every reason to leave the
excess instructions in there, indeed include a few "bells and whistles" if possible. That measure of
software productivity, which is widely used, is one of the reasons why we have such bloated software
systems these days. It is a counterincentive to the production of the clean, compact, reliable coding we all want.
Again, the measure used influences the result in ways which are detrimental to the whole system! It also
establishes habits which at a later time are hard to remove.
When your turn comes to install a measuring system, or even comment on one someone else is using, try
to think your way through to all the hidden consequences which will happen to the organization. Of course,
in principle, measurement is a good thing, but it can often cause more harm than good. I hope the message
came through to you loud and clear:
You get what you measure.
30
You and Your Research
You and Your Research
- The author argues that leading a life of significant accomplishment is superior to merely surviving or seeking amusement.
- Setting high, first-class goals is a personal responsibility, even if society discourages vocalizing such ambitions.
- While luck plays a role in success, it primarily favors the 'prepared mind' that has been cultivated through consistent effort.
- Great works are rarely isolated incidents of chance, as evidenced by the repeated breakthroughs of individuals like Shannon, Einstein, and Newton.
- The difference between those who succeed and those who do not often lies in their internal preparation and willingness to confront difficult questions early on.
- Relying on luck for one's life outcome is a mistake; significant achievement requires years of hard work and 'perspiration'.
Our society frowns on those who say this too loudly, but I only ask you say it to yourself!
I have given a talk with this title many times, and it turns out from discussions after the talk I could just
as well have called it "You and Your Engineering Career", or even "You and Your Career". But I left the
word "Research" in the title because that is what I have most studied.
From the previous chapters you have an adequate background for how I made the study, and I need not
mention again the names of the famous people I have studied closely. The earlier chapters are, in a sense,
just a great expansion, with much more detail, of the original talk. This chapter is, in a sense, a summary of
the previous 29 chapters.
Why do I believe this talk is important? It is important because as far as I know each of you has but one
life to lead, and it seems to me it is better to do significant things than to just get along through life to its
end. Certainly near the end it is nice to look back at a life of accomplishments rather than a life where you
have merely survived and amused yourself. Thus in a real sense I am preaching the message: (1) it is worth
trying to accomplish the goals you set yourself, and (2) it is worth setting yourself high goals.
Again, to be convincing to you I will talk mainly about my own experience, but there are equivalent
stories I could use involving others. I want to get you to the state where you will say to yourself, "Yes, I
would like to do first class work. If Hamming could, then why not me?" Our society frowns on those who
say this too loudly, but I only ask you say it to yourself! What you consider first class work is up to you;
you must pick your goals, but make them high!
I will start psychologically rather than logically. The major objection cited by people against striving to
do great things is the belief it is all a matter of luck. I have repeatedly cited Pasteur's remark, "Luck favors
the prepared mind". It both admits there is an element of luck, and yet claims to a great extent it is up to you.
You prepare yourself to succeed, or not, as you choose, from moment to moment, by the way you live your
life.
As an example related to the "luck" aspect, when I first came to Bell Telephone Laboratories I shared an
office with Claude Shannon. At about the same time he created Information Theory and I created Coding
Theory. They were "in the air" you can say, and you are right. Yet, why did we do it and the others who
were also there not do it? Luck? Some, perhaps, but also because we were what we were and the others
were what they were. The difference was that we were more prepared to find, work on, and create the
corresponding theories.
If it were mainly luck then great things should not tend to be done repeatedly by the same people.
Shannon did a lot of important things besides Information Theory; his Master's Thesis was applying Boolean
Algebra to switching circuits! Einstein did many great things, not just one or two. For example when he was
around 12 to 14 years old he asked himself what light would look like if he went at the velocity of light. He
would, apparently, see a local peak, yet the corresponding mathematical equations would not support a
stationary extreme! An obvious contradiction! Is it surprising he later discovered Special Relativity, which was
in the air and many people were working on it at that time? He had prepared himself long ago, by that early
question, to understand better than the others what was going on and how to approach it.
Newton observed if others would think as hard as he did then they would be able to do the same things.
Edison said genius was 99% perspiration and 1% inspiration. It is hard work, applied for long years, which
leads to the creative act, and it is rarely just handed to you without any serious effort on your part. Yes,
sometimes it just happens, and then it is pure luck. It seems to me to be folly for you to depend solely on
luck for the outcome of this one life you have to lead.
Pursuing Important Problems
- Great achievements are often born from persistence and activity rather than raw IQ or early academic success.
- Intelligence manifests in diverse forms, and unconventional thinkers are frequently undervalued by their immediate peers.
- The story of Bill Pfann and zone melting illustrates how a 'mediocre' individual with a great idea can revolutionize a field like transistor production.
- A fundamental requirement for greatness is the conscious decision to work on problems that actually matter.
- Directly questioning the importance of one's daily work can lead to significant career shifts and professional recognition.
- Self-confidence and courage are essential psychological traits for those who wish to tackle difficult, high-impact challenges.
"If what you are working on is not important and not likely to lead to important things, then why are you working on it?"
One of the characteristics you see is great people when young were generally active, though Newton did
not seem exceptional until well into his undergraduate days at Cambridge. Einstein was not a great
student, and many other great people were not at the top of their class.
Brains are nice to have, but many people who seem not to have great IQs have done great things. At Bell
Telephone Laboratories Bill Pfann walked into my office one day with a problem in zone melting. He did
not seem to me, then, to know much mathematics, to be articulate, or to have a lot of clever brains, but I had
already learned brains come in many forms and flavors, and to beware of ignoring any chance I got to work
with a good man. I first did a little analytical work on his equations, and soon realized what he needed was
computing. I checked up on him by asking around in his department, and I found they had a low opinion of
him and his idea for zone melting. But that is not the first time a person has not been appreciated locally,
and I was not about to lose my chance of working with a great idea, which is what zone melting seemed to
me, though not to his own department! There is an old saying: "A prophet is without honor in his own
country". Mohammed fled from his own city to a nearby one and there got his first real recognition!
So I helped Bill Pfann, taught him how to use the computer, how to get numerical solutions to his problems, and let him have all the machine time he needed. It turned out zone melting was just what we needed to purify materials for transistors, for example, and has proved to be essential in many areas of work. He ended up with all the prizes in the field, much more articulate as his confidence grew, and the other day I found his old lab is now a part of a National Monument! Ability comes in many forms, and on the surface the variety is great; below the surface there are many common elements.
Having disposed of the psychological objections of luck and the lack of high IQ type brains, let us go on to how to do great things. Among the important properties to have is the belief you can do important things. If you do not work on important problems how can you expect to do important work? Yet, direct observation, and direct questioning of people, shows most scientists spend most of their time working on things they believe are not important nor are they likely to lead to important things.
As an example, after I had been eating for some years with the Physics table at the Bell Telephone Laboratories restaurant, fame, promotion, and hiring by other companies ruined the average quality of the people, so I shifted to the Chemistry table in another corner of the restaurant. I began by asking what the important problems were in chemistry, then later what important problems they were working on, and finally one day said, "If what you are working on is not important and not likely to lead to important things, then why are you working on it?" After that I was not welcome and had to shift to eating with the Engineers! That was in the spring, and in the fall one of the chemists stopped me in the hall and said, "What you said caused me to think for the whole summer about what the important problems are in my field, and while I have not changed my research it was well worth the effort". I thanked him and went on, and noticed in a few months he was made head of the group. About 10 years ago I saw he became a member of the National Academy of Engineering. No other person at the table did I ever hear of, and no other person was capable of responding to the question I had asked, "Why are you not working on and thinking about the important problems in your area?" If you do not work on important problems then it is obvious you have little chance of doing important things.
Confidence in yourself, then, is an essential property. Or if you want to you can call it "courage".
Shannon had courage.
The Psychology of Great Work
- Courage and confidence are essential traits for researchers to endure long periods of failure and discouragement.
- A clear vision of excellence prevents the 'drunken sailor' effect, where random efforts cancel each other out over time.
- While mathematicians and physicists often peak early, fame can become a curse that prevents scientists from planting 'small acorns' of new ideas.
- The Institute for Advanced Study is cited as a cautionary example where prestige and comfort can lead to intellectual stagnation.
- Optimal working conditions are counterintuitive; for instance, an open-door policy may reduce immediate productivity but ensures one works on the right problems.
While playing chess Shannon would often advance his queen boldly into the fray and say, "I ain't scaird of nothing".
Who else but a man with almost infinite courage would ever think of averaging over all random codes and expect the average code would be good? He knew what he was doing was important and pursued it intensely. Courage, or confidence, is a property to develop in yourself. Look at your successes, and pay less attention to failures than you are usually advised to do in the expression, "Learn from your mistakes". While playing chess Shannon would often advance his queen boldly into the fray and say, "I ain't scaird of nothing". I learned to repeat it to myself when stuck, and at times it has enabled me to go on to a success. I deliberately copied a part of the style of a great scientist. The courage to continue is essential since great research often has long periods with no success and many discouragements.
The desire for excellence is an essential feature for doing great work. Without such a goal you will tend to wander like a drunken sailor. The sailor takes one step in one direction and the next in some independent direction. As a result the steps tend to cancel each other, and the expected distance from the starting point is proportional to the square root of the number of steps taken. With a vision of excellence, and with the goal of doing significant work, there is a tendency for the steps to go in the same direction and thus go a distance proportional to the number of steps taken, which in a lifetime is a large number indeed. As noted before, in Chapter 1, the difference between having a vision and not having a vision is almost everything, and doing excellent work provides a goal which is steady in this world of constant change.
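The square-root claim for the drunken sailor can be checked numerically. What follows is a minimal sketch and not part of the original text: it assumes unit-length steps in a plane, each in an independent, uniformly random direction, and it measures the root-mean-square distance as one reasonable reading of "expected distance". The function names, the trial count, and the seed are illustrative assumptions.

```python
import math
import random


def rms_distance_random_walk(steps, trials, seed=0):
    """Root-mean-square distance from the start after `steps` unit steps,
    each taken in an independent, uniformly random direction in the plane.
    (Assumed model of the drunken sailor; not taken from the source.)"""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total_squared = 0.0
    for _ in range(trials):
        x = y = 0.0
        for _ in range(steps):
            angle = rng.uniform(0.0, 2.0 * math.pi)
            x += math.cos(angle)
            y += math.sin(angle)
        total_squared += x * x + y * y
    return math.sqrt(total_squared / trials)


def distance_directed_walk(steps):
    """Distance covered when every unit step goes in the same direction."""
    return float(steps)


if __name__ == "__main__":
    for n in (100, 400, 1600):
        wandering = rms_distance_random_walk(n, trials=200)
        directed = distance_directed_walk(n)
        print(n, round(wandering, 1), round(math.sqrt(n), 1), directed)
```

Under these assumptions the simulated distance should track the square root of the step count, while the directed walk grows with the step count itself, so quadrupling the steps roughly doubles the sailor's distance but quadruples the distance of the person with a goal.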
Age is a factor physicists and mathematicians worry about. It is easily observed that the greatest work of a theoretical physicist, mathematician, or astrophysicist is generally done very early. They may continue to do good work all their lives, but what society ends up valuing most is almost always their earliest great work. The exceptions are very, very few indeed. But in literature, music composition, and politics, age seems to be an asset. The best compositions of a composer are usually the late ones, as judged by popular opinion.
One reason for this is fame in Science is a curse to quality productivity, though it tends to supply all the tools and freedom you want to do great things. Another reason is most famous people, sooner or later, tend to think they can only work on important problems; hence they fail to plant the little acorns which grow into the mighty oak trees. I have seen it many times, from Brattain of transistor fame and a Nobel Prize to Shannon and his Information Theory. Not that you should merely work on random things, but on small things which seem to you to have the possibility of future growth. In my opinion the Institute for Advanced Study at Princeton, N.J., has ruined more great scientists than any other place has created, considering what they did before and what they did after going there. A few, like von Neumann, escaped the closed atmosphere of the place with all its physical comforts and prestige, and continued to contribute to the advancement of Science, but most remained there and continued to work on the same problems which got them there but which were generally no longer of great importance to society.
Thus what you consider to be good working conditions may not be good for you! There are many illustrations of this point. For example, working with one's door closed lets you get more work done per year than if you had an open door, but I have observed repeatedly that later those with the closed doors, while working just as hard as others, seem to work on slightly the wrong problems, while those who have let their door stay open get less work done but tend to work on the right problems! I cannot prove the cause and effect relationship; I only observed the correlation.
Inverting Problems and Personal Drive
- The author argues that an open mind and an open door are mutually reinforcing qualities that lead to greater opportunities.
- Resource constraints, such as lacking an 'acre of programmers,' can be transformed into assets by inverting the problem and seeking automated solutions.
- Significant scientific progress often occurs when a researcher shifts focus from merely finding an answer to demonstrating a broader principle or methodology.
- Harsh reality and practical limitations are often superior to 'pure research in a vacuum' because they force researchers into significant discoveries.
- Great achievement is frequently the result of immense personal drive and hard work rather than innate genius alone.
What had seemed to be a defect now became an asset and pushed me in the right direction!
I suspect the open mind leads to the open door, and the open door tends to lead to the open mind; they reinforce each other.
A similar story from my own experience. In the early days of programming computers in absolute binary the usual approach was through an "acre of programmers". It was soon evident to me Bell Telephone Laboratories would never give me an acre of programmers. What to do? I could go to a West Coast airframe manufacturer and get a job and have the proverbial acre, but Bell Telephone Laboratories had a fascinating collection of great people from whom I could learn a lot, and the airframe manufacturers had relatively fewer such people. After quite a few weeks of wondering what to do I finally said to myself, "Hamming, you believe machines can do symbol manipulation, why not get them to do the details of the programming?" Thus I was led directly to a frontier of Computer Science by simply inverting the problem. What had seemed to be a defect now became an asset and pushed me in the right direction! Grace Hopper had a number of similar stories from Computer Science, and there are many other stories with the same moral: when stuck, inverting the problem, and realizing the new formulation is better, often represents a significant step forward. I am not asserting all blockages can be so rearranged, but I am asserting many more than you might at first suspect can be so changed from a more or less routine response to a great one.
This is related to another aspect of changing the problem. I was once solving on a digital computer the first really large simulation of a system of simultaneous differential equations, which at that time were the natural problem for an analog computer, but they had not been able to do it and I was doing it on an IBM 701. The method of integration was an adaptation of the classical Milne's method, and was ugly to say the least. I suddenly realized that, of course, being a military problem, I would have to file a report on how it was done, and every analog installation would go over it trying to object to what was actually being proved as against just getting the answers: I was showing convincingly on some large problems the digital computer could beat the analog computer on its own home ground. Realizing this, I realized the method of solution should be cleaned up, so I developed a new method of integration which had a nice theory, changed the method on the machine with a change of comparatively few instructions, and then computed the rest of the trajectories using the new formula. I published the new method and for some years it was in wide use and known as "Hamming's method". I do not recommend the method now that further progress has been made and the computers are different. To repeat the point I am making, I changed the problem from just getting answers to the realization I was demonstrating clearly for the first time the superiority of digital computers over the current analog computers, thus making a significant contribution to the science behind the activity of computing answers.
All these stories show the conditions you tend to want are seldom the best ones for you; the interaction with harsh reality tends to push you into significant discoveries which otherwise you would never have thought about while doing pure research in a vacuum of your private interests.
Now to the matter of drive. Looking around you can easily observe great people have a great deal of drive to do things. I had worked with John Tukey for some years before I found he was essentially my age, so I went to our mutual boss and asked him, "How can anyone my age know as much as John Tukey does?" He leaned back, grinned, and said, "You would be surprised how much you would know if you had worked as hard as he has for as many years".
The Strategy of Greatness
- Intellectual investment functions like compound interest, where a small amount of extra daily effort leads to a massive accumulation of output over a lifetime.
- Hard work alone is insufficient; success requires 'style,' which involves working on the right problems at the right time and in the right way.
- Dedicate a fixed portion of time, such as 'Great Thoughts' Friday afternoons, to step back from details and examine the long-term direction of your field.
- Great achievers possess a high tolerance for ambiguity, maintaining a paradoxical state of both believing in their work and doubting it enough to seek improvements.
- Maintain a mental list of ten to twenty important, unsolved problems and be prepared to drop everything the moment a clue for a solution appears.
- A problem's importance is defined not just by its inherent value, but by the existence of a viable path or 'attack' to solve it.
Great people can tolerate ambiguity, they can both believe and disbelieve at the same time.
There was nothing for me to do but slink out of his office, which I did. I thought about the remark for some weeks and decided, while I could never work as hard as John did, I could do a lot better than I had been doing.
In a sense my boss was saying intellectual investment is like compound interest: the more you do the more you learn how to do, so the more you can do, etc. I do not know what compound interest rate to assign, but it must be well over 6%; one extra hour per day over a lifetime will much more than double the total output. The steady application of a bit more effort has a great total accumulation.
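The compound-interest picture can be explored with a toy calculation. The sketch below is only one possible model and is not the author's: it assumes that skill compounds once a year at a rate proportional to the hours worked per day, that each year's output is hours times current skill, and that the baseline is an 8-hour day over a 40-year career. All of these numbers and names are illustrative assumptions.

```python
def lifetime_output(hours_per_day, base_rate, years, base_hours=8.0):
    """Total output over a career under an assumed toy model:
    skill grows yearly at base_rate scaled by hours worked, and
    each year's output is hours_per_day times current skill."""
    growth = base_rate * hours_per_day / base_hours
    skill = 1.0
    total = 0.0
    for _ in range(years):
        total += hours_per_day * skill
        skill *= 1.0 + growth
    return total


def extra_hour_ratio(base_rate, years=40, base_hours=8.0):
    """Ratio of lifetime output with one extra hour per day to the baseline."""
    more = lifetime_output(base_hours + 1.0, base_rate, years, base_hours)
    base = lifetime_output(base_hours, base_rate, years, base_hours)
    return more / base


if __name__ == "__main__":
    # Print the ratio for a few assumed yearly rates.
    for rate in (0.0, 0.06, 0.10, 0.20):
        print(rate, round(extra_hour_ratio(rate), 3))
```

With no compounding the extra hour gives only the proportional gain of 9/8. In this particular model the ratio rises as the assumed rate rises, which is one way to read the remark that the rate "must be well over 6%" for the extra hour to more than double the total.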
But be careful: the race is not to the one who works hardest! You need to work on the right problem at the right time and in the right way, what I have been calling "style". At the urging of others, for some years I set aside Friday afternoons for "great thoughts". Of course I would answer the telephone, sign a letter, and such trivia, but essentially, once lunch started, I would only think great thoughts: what was the nature of computing, how would it affect the development of science, what was the natural role of computers in Bell Telephone Laboratories, what effect will computers have on AT&T, on Science generally? I found it was well worth the 10% of my time to do this careful examination of where computing was heading so I would know where we were going and hence could go in the right direction. I was not the drunken sailor staggering around and canceling many of my steps by random other steps, but could progress in a more or less straight line. I could also keep a sharp eye on the important problems and see that my major effort went to them.
I strongly recommend taking the time, on a regular basis, to ask the larger questions and not stay immersed in the sea of detail where almost everyone stays almost all of the time. These chapters have regularly stressed the bigger picture, and if you are to be a leader into the future, rather than a follower of others, I am now saying it seems to me to be necessary for you to look at the bigger picture on a regular, frequent basis for many years.
There is another trait of great people I must talk about, and it took me a long time to realize it. Great people can tolerate ambiguity; they can both believe and disbelieve at the same time. You must be able to believe your organization and field of research is the best there is, but also that there is much room for improvement! You can sort of see why this is a necessary trait. If you believe too much you will not likely see the chances for significant improvements; if you do not believe enough you will be filled with doubts and get very little done, with chances for only the 2%, 5%, and 10% improvements. I have not the faintest idea of how to teach the tolerance of ambiguity, both belief and disbelief at the same time, but great people do it all the time.
Most great people also have 10 to 20 problems they regard as basic and of great importance, and which they currently do not know how to solve. They keep them in their mind, hoping to get a clue as to how to solve them. When a clue does appear they generally drop other things and get to work immediately on the important problem. Therefore they tend to come in first, and the others who come in later are soon forgotten. I must warn you however, the importance of the result is not the measure of the importance of the problem. The three problems in Physics, antigravity, teleportation, and time travel, are seldom worked on because we have so few clues as to how to start; a problem is important partly because there is a possible attack on it, and not because of its inherent importance.
There have been a number of times in the book when I came close to the point of saying it is not so much
what you do as how you do it.
The Art of Professional Style
- Doing a job with style means recasting work in its most fundamental form to maximize its range of application.
- Work should be performed in a way that allows others to build upon it rather than making the creator indispensable.
- Sharing ideas freely often leads to greater recognition and collaboration rather than theft of intellectual property.
- A professional should focus on the main task rather than wasting energy on reforming trivial organizational blemishes.
- Success requires mastering the ability to sell ideas through formal presentations, written reports, and informal interactions.
As the old song says, "It ain't what you do, it's the way that you do it".
I just told you about the changing of the problem of solving a given set of differential equations on an analog machine to doing it on a digital computer, changing programming from an acre of programmers to letting the machine do much of the mechanical part, and there are many similar stories. Doing the job with "style" is important. As the old song says, "It ain't what you do, it's the way that you do it". Look over what you have done, and recast it in a proper form. I do not mean give it false importance, nor propagandize for it, nor pretend it is not what it is, but I do say by presenting it in its basic, fundamental form, it may have a larger range of application than was first thought possible.
Again, you should do your job in such a fashion others can build on top of it. Do not in the process try to make yourself indispensable; if you do then you cannot be promoted because you will be the only one who can do what you are now doing! I have seen a number of times where this clinging to the exclusive rights to the idea has in the long run done much harm to the individual and to the organization. If you are to get recognition then others must use your results, adopt, adapt, extend, and elaborate them, and in the process give you credit for it. I have long held the attitude of telling everyone freely of my ideas, and in my long career I have had only one important idea "stolen" by another person. I have found people are remarkably honest if you are in your turn.
It is a poor workman who blames his tools. I have always tried to adopt the philosophy I will do the best I can in the given circumstances, and after it is all over maybe I will try to see things are better next time. This school is not perfect, but for each class I try to do as well as I can and not spend my effort trying to reform every small blemish in the system. I did change Bell Telephone Laboratories significantly, but did not spend much effort on trivial details (I let others do that if they wanted to), but I got on with the main task as I saw it. Do you want to be a reformer of the trivia of your old organization or a creator of the new organization? Pick your choice, but be clear which path you are going down.
I must come to the topic of "selling" new ideas. You must master three things to do this (Chapter 5):
1. giving formal presentations,
2. producing written reports,
3. mastering the art of informal presentations as they happen to occur.
The Art of Excellence
- Effective presentation is essential because good ideas do not automatically win out and are often resisted by the establishment.
- Mastering the delivery of ideas requires a habit of privately critiquing others and adapting successful techniques to your own style.
- Professional freedom is earned by establishing a reputation for expertise on your own time before being granted autonomy by an organization.
- Navigating superiors who are less capable than yourself is a necessary part of the journey for those destined for the top.
- The true value of striving for excellence lies in the personal transformation and struggle rather than the final achievement.
Yes, it is nice to end up where you wanted to be, but the person you are when you get there is far more important.
All three are essential: you must learn to sell your ideas, not by propaganda, but by force of clear presentation. I am sorry to have to point this out; many scientists and others think good ideas will win out automatically and need not be carefully presented. They are wrong; many a good idea has had to be rediscovered because it was not well presented the first time, years before! New ideas are automatically resisted by the establishment, and to some extent justly. The organization cannot be in a continual state of ferment and change; but it should respond to significant changes.
Change does not mean progress, but progress requires change.
To master the presentation of ideas, while books on the topic may be partly useful, I strongly suggest you adopt the habit of privately critiquing all presentations you attend and also asking the opinions of others. Try to find those parts which you think are effective and which also can be adapted to your style. And this includes the gentle art of telling jokes at times. Certainly a good after dinner speech requires three well told jokes, one at the beginning, one in the middle to wake them up again, and the best one at the end so they will remember at least one thing you said!
You are likely to be saying to yourself you have not the freedom to work on what you believe you should when you want to. I did not either for many years; I had to establish the reputation on my own time that I could do important work, and only then was I given the time to do it. You do not hire a plumber to learn plumbing while trying to fix your trouble, you expect he is already an expert. Similarly, only when you have developed your abilities will you generally get the freedom to practice your expertise, whatever you choose to make it, including the expertise of "universality" as I did. I have already discussed the gentle art of educating your bosses, so will not go into it again. It is part of the job of those who are going to rise to the top. Along the way you will generally have superiors who are less able than you are, so do not complain since how else could it be if you are going to end up at the top and they are not?
Finally, I must address the topic of: is the effort required for excellence worth it? I believe it is; the chief gain is in the effort to change yourself, in the struggle with yourself, and it is less in the winning than you might expect. Yes, it is nice to end up where you wanted to be, but the person you are when you get there is far more important. I believe a life in which you do not try to extend yourself regularly is not worth living, but it is up to you to pick the goals you believe are worth striving for. As Socrates (470?-399 B.C.) said,
"The unexamined life is not worth living."
The Style of Thinking
- The core value of the text is not the technical content of coding or filter theory, but the 'style' of thinking applied to those problems.
- A deliberate plan for the future is essential to avoid drifting aimlessly and to maximize one's potential accomplishments.
- Opportunities are constantly present for everyone, and success is often more attainable than it initially appears.
- The author views the instruction as a form of 'revivalist preaching' intended to inspire a commitment to greatness.
- Personal discovery and self-reliance are emphasized, as the author had to find these truths independently.
- The reader is challenged to exceed the author's own achievements now that the roadmap to success has been shared.
A plan for the future, I believe, is essential for success, otherwise you will drift like the drunken sailor through life and accomplish much less than you could otherwise have done.
In summary, as I claimed at the start, the essence of the book is "style", and there is no real content in the form of the topics like coding theory, filter theory, or simulation that were used for examples. I repeat, the content of these chapters is "style" of thinking, which I have tried to exhibit in many forms. It is your problem to pick out those parts you can adapt to your life as you plan it to be. A plan for the future, I believe, is essential for success, otherwise you will drift like the drunken sailor through life and accomplish much less than you could otherwise have done.
In a sense, this has been a course a revivalist preacher might have given: repent your idle ways and in the future strive for greatness as you see it. I claim it is generally easier to succeed than it at first seems! It seems to me at almost all times there is a halo of opportunities about everyone from which to select. It is your life you have to live and I am only one of many possible guides you have for selecting and creating the style of the one life you have to live. Most of the things I have been saying were not said to me; I had to discover them for myself. I have now told you in some detail how to succeed, hence you have no excuse for not doing better than I did. Good Luck!
Index
ADA language, 49
a different product, 16
aggregation of data, 223
Aitken, H., 3
alphabet training, 266
anticongruent triangles, 271
APL language, 48
Aristotle, 2, 17
ASCII code, 116
Aspect, A., 288
atomic bomb, 214
Babbage, 29, 41, 67
back of the envelope calculations, 4, 20
Backus, John, 42
Baker, W.O., 163
Bell, A.G., 249
Bell Telephone Laboratories, ix
block codes, 115
brain storming, 295
BTL analog computer, 28, 137
BTL Model V computer, 138
Buddha, 288
can machines think?, 69, 90, 93
channel encoding, 114
checkers, 75, 81
chess, 73, 88
classical education, 267
Clippinger, Dick, 40
Club of Rome, 222
computer advantages, 11, 59
constructivists, 275
continental drift, 294, 304
data bases, 62
decoding tree, 111
Democritus, 36, 72, 287
Dick, Thomas, 294, 304
Dirac, P.A.M., 287
direction field, 234
distance function, 108, 143
Dodgson (Lewis Carroll), 277
drunken sailor, 10
Eckert, 40
Eddington, 161
EDSAC, 41
education vs. training, 3
Einstein, A., 45, 276, 307, 350
eigenvalue, 174
ENIAC, 31, 40
entropy, 151
errors in codes, 132
expert systems, 68
feedback, 204
Fermi, E., 260
fifth generation computers, 50
Ford, Henry Sr., 9
formalists, 272
FORTRAN, 42
four circle paradox, 107
Fourier series, 175
frequency vs. polynomials, 238
fundamentals, 7
Galileo, G., 17
gamma function, 101
garbage in garbage out, 233
Gibbs' inequality, 152
Gibbs' phenomena, 183
Gilbert, E.N., 133
GPS language, 68
Gödel's theorem, 280
growth of knowledge, 21
Gulliver's travels, 57
Hamming code, 143
Hamming window, 189
Hawthorne effect, 260
Hermite, H., 275
Hilbert, 272, 273
history, 9
Hollerith, H., 30
how a filter works, 181
Hopper, Grace, 354
Huffman codes, 125
Huskey, H., 40
Huxley, A., 259
IBM 650, 41, 45, 46, 59
information, 149
information system, 114
IQ's, 338
interpreter, 45-6
interconnection costs, 15
ISBN, 134
isosceles triangle proof, 272
jargon, 219
Kaiser, J.P., 164-5, 206
Kaiser filter design, 195
Kane, Jack, 60
Kraft inequality, 119
Kuhn, T., 303
Lady Lovelace (Ada), 67
Landé, 290
language, 48
learn from experience, 75
learn to learn, viii
life testing, 313
limiting the solution, 221
LISP, 44
logical school of mathematics, 274
Los Alamos, 30, 34, 50, 58, 214, 240
Lull, Raymond, 57
Mathews, Max, 82
mathematical programs, 80, 84, 87
Mauchly, 40
McMillan's theorem, 117, 120
median filters, 209
medicine, 85
Mendel, G., 295
Metropolis, N.C., 31, 40
micromanagement, 18
Morgenstern, 319
Morse code, 115
music, 82
NBS publication, 316
neural nets, 52
Newton, I., 4
NIKE missile, 212, 239, 335
Nyquist frequency, 166
Originality, 293
parable of the old lady and the Cathedral, 325
Index of Intellectual History
- A comprehensive index listing influential figures in science and philosophy, ranging from Plato and Socrates to Alan Turing and John Tukey.
- The text highlights key computational milestones including the SDS 910 computer, UNIVAC, and the SOAP programming language.
- It references diverse scientific concepts such as the uncertainty principle, solitons, and the stability of solutions.
- The index suggests a focus on the intersection of human psychology and technology through entries like the Turing test and psychological novelty.
- It includes anecdotal 'stories' and specific case studies, such as the shower story and space shot reliability, to illustrate technical points.
Pasteur, vii, 138, 297, 301, 350
Pfann, Bill, 351
Pierce, J.R., 82
Planck, Max, 284
Plato, 2, 262
Platonic mathematics, 271
proper teaching, 261
psychological novelty, 89
public speaking, 55, 358
RAND, 67
RDA, #2 MIT, 28, 212, 231
redundancy, 139
relevance of a simulation, 223
robots, 15
Rorschach test, 244
Russell, 274
St. Augustine, 290
sampling rate stories, 168
Samuel, Art, 75, 81
Schickert, 28
Schroedinger, 285
SDS 910 computer, 61
self consciousness, 72
Shannon, C.E., 114, 149, 350
shower story, 205
Slagle, 87
SOAP language, 40
Socrates, 2, 12, 359
solitons, 254
source encoding, 114
space shot reliability, 224
special purpose chips, 24
stability of solution, 234
Stibitz, G., 30
Stirling's formula, 100
stock market, 228
Stonehenge, 27
student's future, viii
strong focusing, 252
style, 1, 3
tennis simulation, 225
three dimensional tic-tac-toe, 73
top down programming, 46
total reflection, 250
transfer function, 174
transfer of training
traveling wave tube, 219
Tukey, John, 164, 190, 298, 355
Tukey-Cooley algorithm, 198
Turing, Alan, 45-7
Turing test, 71
UFO, 260
uncertainty principle, 10, 210
uniquely decodable, 6
UNIVAC, 32, 59, 316
vision and future, 10
variable length codes, 115
vitalism, 71
volume of a sphere, 104
von Hann window, 189
von Neumann, 31, 45, 287
weather, 216
Wegener, A., 304
Westerman, H.R., 331
weight lifting story
weighted sum codes, 134
Wilkes, M., 31, 41
Zuse, C., 31