THE INTERNET STRIKES BACK

When one computer in cyberspace sneezes,
software round the world can catch a cold.
So is it time to provide computer networks
with immune systems that kill viruses?
Kurt Kleiner reports


TO LOOK AT, IT'S NOTHING SPECIAL: a small, cluttered, windowless room lined by PCs. But appearances can be deceptive. This is the computer equivalent of a high-security microbiology laboratory equipped to deal with killers such as Ebola virus and the hantavirus. Hundreds of highly infectious floppy discs lie piled on counters and stuffed into half-open drawers. The virus isolation laboratory, deep inside IBM's Watson Research Center in Hawthorne, New York, contains the best part of the 12 000 or so known computer viruses, as well as the tools needed to take them apart safely, identify and kill them.

Jeffrey Kephart sits in front of an oversize monitor that displays the company's AntiVirus logo. For now, IBM AntiVirus is just one of many virus-fighting tools. But Kephart has big plans. He is trying to create nothing less than a global immune system to protect us all from the emerging diseases of cyberspace. "It's only the existence of an immune system that allows the human race to exist," says Steve White. "Only an immune system in cyberspace will allow it to exist."

Electronic organism

To some competitors this sounds like so much hyperbole. But in research papers Kephart, White and their colleagues have focused on the striking similarities between a network of computers and a living organism. Insights from epidemiology and immunology, they argue, can help to protect the electronic organism from dangerous new infections. They are already testing the idea on an internal IBM network, and will begin to release parts of their immune system to the public later this year.

Computer viruses are sneaky programs designed to insinuate their way into your computer, copy themselves, spread to other computers and, usually, cause a few symptoms. Those symptoms might be the equivalent of a cold -- like slowing down your system or the word "Wazza" appearing mysteriously in documents. Or they might be the digital equivalent of fatal pneumonia: some viruses head straight for your hard disc and trash it.

In the past, viruses have passed between computers on floppy discs. If you boot up your computer with an infected disc in place, or run an infected program from one, the virus can take over part of your computer's memory and copy itself into other files. It can also copy itself onto any uninfected floppy discs that you may later use in your PC.

A huge industry has grown up to fight the virus threat, selling software that promises to protect your computer from infection. At the heart of most of these systems is a virus "scanner". This is a diagnostic program that searches every piece of code in your computer's memory for the "signature" of known viruses. The signature is a small piece of code, chosen carefully by the creator of the antivirus system, that always appears in that virus but never in legitimate programs. If the scanner detects the signature, it warns you that you are infected.
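
In outline, a scanner of this kind is a search for known byte patterns. The Python sketch below is a minimal illustration, not any vendor's implementation: the `SIGNATURES` table, the virus names and the byte sequences in it are all invented for the example, and a real scanner would need far faster matching over far more signatures.

```python
from typing import Dict, List

# Hypothetical signature table: virus name -> byte sequence.
# These bytes are invented for illustration and match no real virus.
SIGNATURES: Dict[str, bytes] = {
    "ExampleVirusA": bytes.fromhex("deadbeef0102"),
    "ExampleVirusB": b"\x90\x90\xeb\xfe\x13\x37",
}


def scan(code: bytes, signatures: Dict[str, bytes] = SIGNATURES) -> List[str]:
    """Return the sorted names of all signatures found anywhere in `code`."""
    return sorted(name for name, sig in signatures.items() if sig in code)
```

Calling `scan` on a program's bytes returns an empty list when no known signature is present, which is exactly why a brand-new virus slips through.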

The trouble with these systems is that new viruses are being created all the time -- at an estimated rate of eight a day and rising -- so the companies that make antivirus software are forced to play a constant game of catch-up. When they detect a new virus, they extract a signature, add it to their software, and every few months send out an updated version. Inevitably, this means that by the time the updated software is dispatched, it is already out of date.

Yet this system has worked pretty well up till now. Most new viruses never spread beyond a few machines. Either they are detected in time, or they are not very good, says White. Some are so destructive that they crash their host computer before they have a chance to spread. Even some successful viruses took years to spread round the world.

But computer networks along with a new family of viruses called macro viruses are changing all that. The viruses hide in macros -- small programs that can be attached to documents written with Microsoft's word-processing package Word. As soon as you open one of these documents, the virus infects your computer. According to a survey by the National Computer Security Association (NCSA) in Carlisle, Pennsylvania, the biggest problem today is caused by a macro virus called Concept. It can spread on floppy discs, but travels mostly on e-mails and files downloaded from the World Wide Web and bulletin boards. By exploiting networks in this way, macro viruses have spread round the world in weeks.

"Just as the invention of airplanes caused disease to spread more quickly than they could by ox carts, the Internet is causing computer viruses to spread more quickly than they did through diskettes," says White. As more and more people use the Internet and other global networks, existing antivirus methods won' t be able to keep up, he says. New viruses will spread so quickly that by the time researchers extract a signature and send out their updated software, the virus will have had time to do serious damage. Clearly, we need a more rapid way to protect networks.

For inspiration about the form such protection could take, the IBM team turned to the natural world. After all, computer viruses are forms of artificial life. Calling them viruses is wholly appropriate, says White. They commandeer their host's resources to replicate just as biological viruses do. "The analogy is breathtakingly deep and important," he says.

Next they studied how the immune system protects humans. On one level, the immune system simply destroys anything it recognises as nonhuman. Unfortunately this strategy is a nonstarter for fighting computer viruses. From time to time almost all users will install new software on their machines, or update old programs. A computer immune system that automatically attacks new programs clearly won't do the job. But biological immune systems also have more specific responses. When the body encounters a foreign organism, it begins to develop antibodies tailored to detect and disable the invader. The immune system does not need to analyse the entire pathogen. It simply has to "remember" enough of the virus's structure to recognise it in future. Once the initial infection has subsided, the body keeps some of these antibodies around, so that it can respond to a future infection much more quickly.

Deliberate infection

This is all similar to what existing antivirus packages do, albeit slowly. The software systems contain "antibodies" that recognise not whole viruses, but just their signatures. What White, Kephart and their colleagues have done is push this analogy further, to make identification quicker and extend it to include not just individual computers but whole networks of computers.

In the virus isolation lab, Kephart explains their strategy. He sits down in front of a computer and runs AntiVirus to confirm that it is free of known viruses. Then he takes out a floppy disc that he knows is infected with a "file infecting" virus -- one that infects your computer from a program file that you run. He sticks the floppy into the drive and infects the computer.

Then he runs the antivirus software again. He's chosen a virus that the system has not seen before, so the signature scanner doesn't detect it. But another part of the software -- an integrity checker -- notices that something is wrong. The checker scans the computer's programs and compares them with the way they looked the last time it ran. If it finds a mismatch, it sounds the alarm.
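
One simple way to build such a checker is to store a fingerprint of every program and compare fingerprints on the next run. The sketch below assumes SHA-256 hashes as the fingerprint and holds programs in memory as a name-to-bytes mapping. Both choices, and the function names, are assumptions made for illustration rather than details of IBM AntiVirus.

```python
import hashlib
from typing import Dict, List


def snapshot(programs: Dict[str, bytes]) -> Dict[str, str]:
    """Record a SHA-256 fingerprint for each program (name -> hex digest)."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in programs.items()}


def changed_programs(baseline: Dict[str, str], programs: Dict[str, bytes]) -> List[str]:
    """Return names of programs whose contents differ from the stored baseline.

    Programs absent from the baseline are not flagged, since newly
    installed software is not in itself a sign of infection.
    """
    current = snapshot(programs)
    return sorted(
        name
        for name, digest in current.items()
        if name in baseline and baseline[name] != digest
    )
```

A mismatch only says that a file has changed, not why, so a checker like this raises the alarm without identifying the culprit.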

If the mismatch is caused by a virus, the antivirus software may be able to patch up the damage. It does this by making educated guesses about the way a virus normally inserts itself into a program and shifts data around. If the software manages to reverse these changes, the damaged file will be restored to health. But success is not guaranteed.

All this is standard stuff: antivirus systems that do this have been around for years. But it leaves much to be desired. Though the altered file may have been repaired, the virus that did the damage remains at large, ready to wreak more havoc. To seek out and destroy a new virus, the protection software needs a signature. Worse still, nothing can be done until somebody, somewhere captures a sample of the virus. Only then will a specialist programmer be able to study the virus's anatomy and find a signature.

White and Kephart's experimental immune system has a way round these problems. After the infected computer finds out which file the new virus is hiding in, it encrypts the file and sends it on to IBM's central virus analyser. This could be done automatically, or under the control of the user -- or, in a company with a network, the system manager -- who would send the encrypted program to IBM over the Net.

In the lab demonstration, the virus analyser sits next to the infected computer, so Kephart merely rolls his chair a couple of feet to the left. The analyser decrypts the program and scans it again for known viruses, on the off chance that someone has already reported the virus that is causing the trouble. But Kephart's analyser doesn't recognise it -- the infection really is a brand-new virus.

Now the analyser coaxes the virus into revealing itself. The machine is set up to contain a "virtual computer" within which the virus can be safely contained and studied. The virtual computer runs the infected program so that it too becomes infected. It then deploys "decoy" programs designed to attract the virus. It executes the decoys, reads them, writes to them, copies them: it manipulates them in as many ways as possible in an effort to draw out the virus and make it do its worst.

Attach and disguise

After each manipulation, the analyser compares the decoy programs with their original states to see how the virus has changed them. In this way it builds up a picture of how the virus attaches itself to the host's software, and how it disguises itself by rearranging that software. This information can later be sent to the remote, infected machine and used to cut out the virus and repair any damage it has done.
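
The before-and-after comparison can be pictured as an ordinary diff. The sketch below uses Python's standard `difflib` module to list the byte ranges that differ between a clean decoy and the same decoy after the virus has had its way. The function name, the decoy contents and the "virus" bytes are all made up, and the article does not describe the real analyser's method in this detail.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def decoy_changes(original: bytes, infected: bytes) -> List[Tuple[str, int, bytes]]:
    """List how `infected` differs from `original`.

    Each entry is (kind, offset_in_original, new_bytes), where kind is
    "insert", "replace" or "delete" as reported by difflib.
    """
    matcher = SequenceMatcher(None, original, infected)
    return [
        (tag, i1, infected[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
```

Run over many decoys, the offsets and inserted bytes would show where a virus attaches itself, which is the information needed to cut it out again.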

The analyser's next job is to select a signature to enable the remote machine's antivirus software to detect the new virus and avoid future infection. Extracting a signature from a virus is a tricky task that until now has always been left to a human expert. "It's a very arcane skill," says White. "It's useful for almost nothing else." Computer viruses often mutate, as some nonvital sections of their code change slightly. So the signature must come from a section of code that is essential to the virus. And, of course, the signature must not appear in legitimate programs, otherwise people may find their expensive programs being attacked and destroyed by the antivirus software. So the human virus hunters learn to look for pieces of code that are unique to the functioning of a virus -- telltale commands that manipulate the computer's operations in unusual ways, such as copying themselves into the middle of existing programs.

But now the IBM researchers want computers to take over the task of teasing out the signatures of new viruses. And since they cannot, yet, create a computer program that has the insight of a human, they are concentrating on what computers do best: number crunching. Kephart and White's automatic signature extractor compares the virus code to tens of thousands of legitimate programs, containing gigabytes of code, looking for short command sequences that do not appear in any of the legitimate programs. The extractor turns out to be good at its job. In tests carried out at IBM the computerised version proved to be better at extracting signatures than humans. Finally, the virus analyser places a call to the remote, infected computer to deliver the new signature along with instructions on how to repair any damage the virus might have done.
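
The brute-force idea can be sketched in a few lines of Python. This is only an illustration under simple assumptions: it slides a fixed-length window over the virus and returns the first byte sequence found in none of the legitimate programs. The function name, the default `length` and the in-memory corpus are hypothetical, and IBM's extractor, working over gigabytes of code, is certainly more sophisticated.

```python
from typing import Iterable, Optional


def extract_signature(
    virus: bytes, legitimate: Iterable[bytes], length: int = 4
) -> Optional[bytes]:
    """Return the first `length`-byte slice of `virus` absent from every
    legitimate program, or None if every slice also occurs in clean code."""
    corpus = list(legitimate)
    for start in range(len(virus) - length + 1):
        candidate = virus[start:start + length]
        if not any(candidate in program for program in corpus):
            return candidate
    return None
```

A sketch like this ignores the other requirement described above, that the signature should come from code essential to the virus, so a mutated copy might still evade it.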

Kephart's demonstration takes about a minute, with everything up and running on an experimental network. But he and White want to go further. The first stage of their plan is for anyone logging on to IBM's Web site (http://av.ibm.com) to be able to pick up the signature for the new virus, along with the necessary fix. "Customers could download new signatures every day, every hour, whatever," says Kephart.

To provide even faster protection, the two researchers want to make the whole process automatic. Infected computers will send samples of new viruses directly to the IBM virus analysis centre, where new methods for detecting and destroying the virus, and repairing the damage, will be devised automatically. These will then be sent back to the original infected machine. But they will also be sent all over the world to other participating companies and computer users. By distributing the cure faster than the virus can spread, Kephart and White hope that their system will prevent new viruses from reaching epidemic proportions. "It will work on a very rapid timescale, and the world will be protected," says Kephart. IBM will begin running a pilot program of its immune system in a group of selected companies later this year. The complete scheme is expected to go online shortly after that.

But if the whole of cyberspace is soon to be protected by IBM's immune system, where will that leave competing antivirus companies? They are not shutting up shop just yet. Jimmy Kuo is senior virus researcher at California-based McAfee Associates, which has its own line of antivirus software. He admits that with the Net and macro viruses, infections will spread more quickly, and researchers will have a tough time keeping up. But McAfee and its rivals are already using software tools similar to those in IBM's immune system.

Kuo does not accept that an automated system will ever be sophisticated enough to run on its own. "As a goal it's nice," he says. "But ultimately it will be much better to have a person with virus knowledge review the results."

Jonathan Wheat of the NCSA sees more potential in the IBM system -- particularly its ability "to head off damage before it happens", and then to send out the fixes quickly. Macro viruses have ushered in a new era, says Wheat. They are easy to write, take no programming skill and their numbers are increasing rapidly. Without automated systems to head off macro viruses, the consequences could be dire. "Eventually," says Wheat, "it could get to the point where they will be out of control."

From New Scientist, 24 May 1997