The Great Secret of Computer Science

There are two, I suppose. The first, in the words of Professor Brian Harvey:

The biggest secret in Computer Science is that it’s not a science and it’s not about computers.

And the second, in the words of me:

Computer science isn’t hard because we made it up.

We Made It Up

I said that while I was a TA for the “intro to computer science” course at Berkeley,1 and you might think that’s a horrible thing to say when students are struggling in the class. But I’m not trying to say computer science can’t be difficult. Quite often it is.

In the theory of computation, a hard problem is one that simply, provably cannot be solved in an easy way.2 The same is true of a number of real-world fields: we’re nowhere near a tractable, predictive model of economics, and as for something like quantum physics…

I think I can safely say that nobody understands quantum mechanics. –Richard Feynman

So why is computer science different? Because all of it—well, all of it that isn’t really in some related field—was made up. It was created by people like you and me. Some of them were smarter than you and me, or knew more about this field or related fields, or just had really good insights, but the crucial thing is that everything has a reason. Somebody, some person decided that this was the way this should be done, and that spread, and that’s what we’re teaching you now.

(There are plenty of times where the reason for something is no longer relevant, or even where it was a bad reason to begin with. But there at least was a reason; somebody thought it was an idea that made sense.)

If your brain works a certain way, this is a huge help in learning computer science. There’s a reason things are the way they are, and that can help you remember them, and even possibly inform your own work.

Computer science may be difficult, but it is not (intrinsically) hard.

Not a Science

This is also a way in which computer science is not a science. In biology, in chemistry, in physics, there are reasons for things, but they didn’t come out of someone’s head. Instead, we try to come up with ideas that provide reasons, hoping they’re right, and using them until we figure out a better model. There is a true answer out there—the universe—and we’re trying to understand it. That’s not what we do in CS.

At its heart, science is observation and experimentation. There is a real system out there, and you perform experiments and make observations to understand how that works. Or try to understand, anyway.

"Evolution isn't a 'theory' in the common sense. Evolution is as well understood as the 'theory' of gravity." / "What the...are you saying we don't understand evolution?!"

(credit to Saturday Morning Breakfast Cereal)

That’s not what we do in CS. You shouldn’t need to observe a system to figure out how it works (well, ideally), and you shouldn’t be experimenting with your programs because you will get it wrong.3 We already know how everything works, because we made it ourselves.

Computer science is not a science.

Not About Computers

Let me illustrate this section with a diagram:

[Diagram: “Computer Science” in the center, bounded below by Electrical Engineering, above by the Theory of Computation, and on the remaining sides by related sciences (cognitive science, optics) and by the real-world applications of computers.]

Below our field is Electrical Engineering. This is a form of applied Physics; electrical engineers use what we know about the world to build more capable electronics. The limits of what we can do with computers are at least partly based on what the actual, physical machines can do.

On the opposite side, we have the Theory of Computation, a field that could be classified either as “theoretical Computer Science” or as “Mathematics”. While EE limits what computers can do now, the theory of computation is concerned with what computers will ever be able to do. As I discussed above, some problems are intrinsically hard, such as finding the factors of a very large number, and faster computers will only help a little.4
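
To make “intrinsically hard” a little more concrete, here’s a minimal sketch of my own (in Python, not something from the original post): the most naive way to find factors, trial division. Smarter algorithms exist, but even the best known ones still leave cryptographically large numbers out of reach.

```python
# Naive factoring by trial division: try every candidate divisor up to sqrt(n).
# The work grows with the size of n, which is why very large numbers stay out
# of reach no matter how fast the hardware gets.
def trial_division(n):
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

print(trial_division(91))         # [7, 13] -- instant
print(trial_division(2**31 - 1))  # prime, so ~46,000 candidates get checked first
# A 2048-bit number would mean on the order of 2**1024 candidates for this
# approach -- no amount of faster hardware closes that gap.
```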

We now have our upper and lower bounds, but there are other things that define computer systems. Both artificial intelligence and human-computer interaction take cues from cognitive science. Real-world optics provides the desired behavior of computer graphics. There are probably more examples, but these two are pretty major.

Finally, we have the actual applications of computers, where people are developing software to solve problems in the real world. Obviously this is going to affect the development of the field itself: the tools we use were often invented to solve specific problems and then grew to become more general.

So there you have it: a field defined by its four bounds. Close to any of these boundaries, work in “computer science” looks a lot like an extension of work in the related field. In the center of the diagram is…well, it’s not quite clear. “Software”, perhaps, but that isn’t itself a field of study.5 Nor are “computers” the things being studied.

Up until now, that center label has been “Computer Science”, but I hope that by now you’re at least open to the idea that that’s inappropriate.

The Subject of Study

This post was preceded by another called “Interpreting Information”, which described how computer programs encode their information in “ones and zeroes”. A key point is that the encoding is actually fairly arbitrary, at least from the computer’s point of view. It’s only because one or more programs use the same encoding consistently that it has any meaning, and—somewhat crucially in understanding software—higher levels of a program don’t need to know all the details of how the lower level works.
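
As a tiny illustration of that arbitrariness (my own example, not one from “Interpreting Information”), here are the same four bytes read under a few different conventions. Nothing about the bytes picks an interpretation; only the programs that agree on an encoding give them meaning.

```python
import struct

data = bytes([0x48, 0x69, 0x21, 0x00])   # four bytes, no meaning on their own

print(data.hex())                    # '48692100'
print(struct.unpack('<I', data)[0])  # 2189640 -- read as a little-endian 32-bit integer
print(data[:3].decode('ascii'))      # 'Hi!'   -- read as NUL-terminated ASCII text
print(tuple(data))                   # (72, 105, 33, 0) -- or four RGBA channel values
```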

At a fundamental level, all that software does is collect, process, store, transfer, and display information.

Here’s an example: You’re using your web browser to buy your mom a birthday present from Amazon. At a low level (though not the lowest), the browser is receiving information from Amazon (in the form of HTML) about how to display a web page and which images to load. This information is then processed by an algorithm that draws the web page into an image buffer (probably in the form of a bitmap), which is then sent to your display. The display uses the information in the bitmap to decide which lights to turn on.6
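
Here’s a deliberately oversimplified sketch of that low-level flow, with made-up stand-ins for the layout and display stages (a real browser’s rendering pipeline is enormously more complicated):

```python
from urllib.request import urlopen

def layout_and_paint(html, width=80):
    """Stand-in for layout and rasterization: 'draw' the page as rows of characters."""
    return [html[i:i + width] for i in range(0, min(len(html), 400), width)]

def send_to_display(framebuffer):
    """Stand-in for handing the finished buffer to the display."""
    for row in framebuffer:
        print(row)

def render_page(url):
    html = urlopen(url).read().decode('utf-8', errors='replace')  # information in (HTML)
    framebuffer = layout_and_paint(html)                          # process it into an "image"
    send_to_display(framebuffer)                                  # information out, to the screen

render_page('https://example.com')
```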

Stepping up to a higher level, Amazon is sending you information about the products they have available, when they can ship, and so on. You send them information about what you want to buy, along with your credit card number (or your username and password, to access a previously stored credit card number). They pass information about the total cost of your purchases along to your bank; your bank updates the information that is your credit card balance and sends back a confirmation (or a denial!). Amazon then updates its list of pending orders, possibly sending more information to, say, UPS. At some point, this information will then be presented to a human, who will bring you your package.

At both the high level and the low level, the purpose of software is to handle information. As with any generalization, it fits some examples better than others, but even in the case of something like a game (whose purpose is not to present something to the user), there is still plenty of important information: the position of the player and enemies on the playing field, the items the player has acquired, the remaining time, and so on. It’s a game to us because we derive enjoyment from it, but to the computer it’s just rearranging and displaying information based on user input.
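
Here’s a sketch of what that information might look like, with hypothetical fields of my own choosing; any real game tracks far more state than this:

```python
from dataclasses import dataclass, field

@dataclass
class GameState:
    player_pos: tuple = (0.0, 0.0)
    enemies: list = field(default_factory=list)    # positions of enemies on the field
    inventory: list = field(default_factory=list)  # items the player has acquired
    time_remaining: float = 120.0                  # seconds left on the clock

def update(state, dt, move):
    """One frame of 'rearranging information based on user input'."""
    x, y = state.player_pos
    dx, dy = move
    state.player_pos = (x + dx * dt, y + dy * dt)
    state.time_remaining -= dt
    return state

state = update(GameState(), dt=0.016, move=(1.0, 0.0))
print(state.player_pos, state.time_remaining)  # the player moved; the clock ticked down
```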

The conclusion I’ve come to is that the best name for our field—the entire field—is Information Technology.

…Or would have been, if that hadn’t been co-opted by the general public to mean “the computer support department” at a non-software company.

“Programming”?

You’ll notice that “programming” isn’t really a part of the description of “information technology” above. In fact, there is plenty of work in the field that can be done without “programming”: human-computer interaction work, designing a system or an interface between two systems, algorithmic analysis, etc. And conversely, being a programmer doesn’t actually require any real work in the academic field of “information technology”, though studying the field is usually a useful background for designing and implementing programs well. It certainly doesn’t guarantee that, though, which is why you get people who graduate with computer science degrees and are still terrible programmers.

But, there you have it. Computer science isn’t a science and it isn’t about computers: it’s the study of existing work and new frontiers in algorithmically manipulating information, particularly using electronics. And while it may be difficult, it isn’t hard in the same way that many other fields are, because everything has a reason and was designed by people.

And there’s so much more still to do.

  1. CS61A. It’s an intro to computer science assuming you’ve already done a bit of programming. If you’re completely new, there’s an “intro to programming”-type class, CS10. ↩︎

  2. Okay, I didn’t actually take any dedicated complexity theory classes, and even I know I’m misusing the term; usually you say “X-hard” to say “this is at least as hard as X”. Poetic license. ↩︎

  3. In science, an experiment coming out negative is positive information; an experiment is “wrong” if it doesn’t actually give you the information you expect. In software, a program being “wrong” might just be wrong information, but it might also mean exposure to dangerous levels of radiation. ↩︎

  4. If this sounds negative, keep in mind that I’m leaving out plenty of other work in the field. In fact, our whole computation model was developed as part of the description of the famous Halting Problem, which shows that there is no general way to determine whether an arbitrary program will eventually stop running—or keep running forever. Theoretical limits can provide practical benefits. ↩︎

  5. Though there are definite practices of “software engineering” and “software development”. Neither of these terms has a strict definition, and I’ll talk about how I personally distinguish the two in a future blog post. ↩︎

  6. This is not actually how computer displays work, but you get the idea. ↩︎
