November 12, 2010

I'm not a real scientist, and that's okay

This article was published in Scientific American’s former blog network and reflects the views of the author, not necessarily those of Scientific American

What kind of discipline is computer science? I thought it was a science when I received my BS. I believed its subdiscipline software engineering was engineering when I received my PhD. I’d heard, and would continue to hear, “This isn’t any kind of science/engineering I know!” from physicists and electrical engineers. I tried for years to prove them wrong. But now I think they’re right.

I’ve seen computer science described as many things, usually as a blend of disciplines: mathematics and electrical engineering, with psychology thrown in, and occasionally more exotic areas like physics (quantum computing) and molecular biology (biological computers). Certainly CS research and practice draw from these areas, but drawing from is different from being, or even being derived from. And none of these descriptions quite hits the mark. In my opinion, we would be best served by viewing CS as a branch of philosophy.

I don’t refer to ethics but to logic, and moreover, logic at the expense of observation. In science, a theory fails if it doesn’t predict observation. In CS theory, there is no observation. The things we use in computing (a programming language, an operating system, a computer’s instruction set) are models of a virtual reality. This reality derives from someone’s belief about the easiest way to solve a class of problems. So far, that’s not so different from science. Copernican theory succeeds because it’s easier to apply than Ptolemaic.

But Copernicus explained the sky he observed. Microsoft Windows provides a model of interaction with your computer, but not a model that explains phenomena, nor one that’s comparable to any other model. At least, I have yet to encounter a Macintosh user who cites repeatable experiments while denigrating Windows. I have read usability studies, but many are controversial and their conclusions are limited.

The cost of performing experiments certainly explains in part why they’re not common. But absence of experimentation also stems from an entrenched mindset in CS researchers.

It wasn’t always this way. Back in the late 1950s, the most important criteria for judging software were: how fast does it run, and how much memory does it use? Researchers, many with electrical engineering backgrounds, answered these questions empirically. Measure how long a program takes to execute on multiple data sets, fit the points to a curve, and voilà! an equation for calculating a program’s expected running time. The problem was that the equation only applied to that program run on that computer. The IBM 709 was not only faster than the earlier 704, its new instruction set dramatically increased programmers’ power. A 709-based program to sort data didn’t much resemble the 704-based version, so a 704-derived equation wasn’t particularly accurate. And if your company switched from IBM to Honeywell, you might as well throw away your data and start again.
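For readers who want to see what that measure-and-fit approach looks like, here is a minimal modern sketch in Python. It is an illustration, not a reconstruction of what 1950s researchers actually ran: the helper names `time_sort` and `fit_line`, the choice of Python's built-in sort as the program under test, and the straight-line model are all assumptions made for the example.

```python
import random
import time


def time_sort(n, repeats=3):
    """Return the best-of-`repeats` wall-clock time (seconds) to sort n random floats."""
    best = float("inf")
    for _ in range(repeats):
        data = [random.random() for _ in range(n)]
        start = time.perf_counter()
        sorted(data)
        best = min(best, time.perf_counter() - start)
    return best


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


# Measure on several data-set sizes, then fit the points to a curve
# (here, assumed to be a straight line for simplicity).
sizes = [10_000, 20_000, 40_000, 80_000]
times = [time_sort(n) for n in sizes]
a, b = fit_line(sizes, times)
print(f"expected seconds ~= {a:.3e} * n + {b:.3e}  (valid for this machine only)")
```

Run it on two different computers and the fitted coefficients will differ, which is exactly the limitation described above.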

Enter the mathematicians. John Hopcroft and Robert Tarjan devised an abstraction for describing an algorithm’s performance. The abstraction is based purely on the size of the input data set and expresses time and memory performance as a function thereof. It’s not tied to any particular computer. It’s not tied to any physical entity. It’s a simple, straightforward way to compare performance. I won’t describe it here; read Gödel, Escher, Bach for details. Its importance lay in providing a universal tool for predicting performance.

Except it wasn’t, really. Hopcroft and Tarjan’s abstraction predicts performance as a concept, not as a physically measurable quantity. It doesn’t tell you how many seconds your program takes to execute—and can’t, because for that you still need to know the specifics of the computer you’re going to use. And indeed, there was resistance to Hopcroft and Tarjan’s work. But they had created something so powerful that its disadvantages were easily overlooked.
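A small sketch may make the trade-off concrete. The Python below is only an illustration of the general idea of expressing cost as a function of input size; it is not the formalism itself, and the function names `linear_search_steps` and `binary_search_steps` are invented for the example. Instead of timing anything, it counts how many elements each search examines when looking for a value that isn't there.

```python
def linear_search_steps(items, target):
    """Return the number of elements a linear scan examines."""
    steps = 0
    for item in items:
        steps += 1
        if item == target:
            break
    return steps


def binary_search_steps(items, target):
    """Return the number of elements binary search examines in a sorted list."""
    lo, hi = 0, len(items) - 1
    steps = 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if items[mid] == target:
            break
        if items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps


for n in (1_000, 1_000_000):
    data = list(range(n))
    missing = n  # larger than every element, so neither search finds it
    print(n, linear_search_steps(data, missing), binary_search_steps(data, missing))
```

The counts depend only on the size of the input, so they come out the same on any computer. They also say nothing about how many seconds either search takes on yours.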

And so CS shifted away from empiricism. Researchers stopped making measurements and explored mathematical formalisms. I’m generalizing here. In some areas of CS, like networks, measurements are still fundamental. But I think I’m safe in saying that most CS research isn’t concerned with empirical results.

Perhaps this history helps explain why CS never developed a research method including steps fundamental in science: observe, and formulate hypotheses. Mathematicians don’t need hypotheses. They only need to state their assumptions and use logic to derive results. It’s Aristotle’s philosophy. It’s a valid approach for deriving conclusions, but unless there’s a rigorous way to test an assumption’s validity, it’s a dangerous approach to apply to the real world. Aristotle’s assumptions led him to conclude the heart was the center of consciousness.

Let me give a present-day example to illustrate the danger. A year or two ago I saw a paper explaining why open-source software is inherently less secure than proprietary software. It had flawlessly logical arguments—yet in the real world, more security flaws have been discovered in Microsoft’s software than in open-source software. This wasn’t data manipulation; the paper had no data. The authors made their point by starting from assumptions about how flaws are discovered. I doubt anyone can contradict their assumptions: Virus writers don’t offer themselves and their techniques for study. But clearly, something in the paper is amiss. And there’s great risk in accepting its conclusions.

Mathematicians, then, have influenced CS as a discipline, but there’s too much of CS that can’t be formalized to call it a branch of mathematics. This is why I find it more of a philosophy. The ancient Greeks, to whom mathematics and philosophy were more or less indistinguishable, would be inclined to agree, I think. The different camps I’ve encountered over the years (best programming language, best operating system, best software development methodology) seldom back up their beliefs with data; when they do, their experiments are met with the kind of skepticism the medical community reserves for cancer studies funded by tobacco companies. Each camp has a “philosophy” about the best way to develop software and doesn’t question its righteousness.

I don’t think this is bad, as long as it’s acknowledged. Though I know CS professors who wouldn’t be caught dead in a philosophy department, I think both education and practice could benefit from a better understanding of philosophical thought. My own work involves thinking about how to structure information to make it understandable. In my undergraduate days we used to debate whether a car without wheels is still a car, and then what if you remove the engine, and just when does it stop being a car? This argument, which is quite relevant to my work, I now know I can frame as a debate between the philosophies of Aristotle and Wittgenstein. There isn’t a right or wrong answer, just different solutions depending on which answer one chooses. Wouldn’t it be great if students learned there was nothing new under the sun?

From time to time I’ve seen calls for a “new kind of science” that encompasses disciplines like computer science. I dislike the ideas; I find them misbegotten attempts to confer the prestige of science. What’s wrong with being a philosopher, anyway? Let’s use tools appropriate to our trade, and recognize the strengths and limitations of every discipline—scientific and nonscientific alike.

About the Author: Steve Wartik works at the Institute for Defense Analyses as an Information Analyst, studying how to structure and represent data to maximize its usefulness. He believes his career, which has included stints in academia, private industry, and nonprofit organizations, has given him perspective on the many different paths of human progress. He is grateful to his wife, science writer and former biomedical researcher Elia Ben-Ari, for forcing him to confront the nature of his profession.
