A computer virus is a computer program that can copy itself and infect a computer without permission or knowledge of the user. The original may modify the copies or the copies may modify themselves, as occurs in a metamorphic virus. A virus can only spread from one computer to another when its host is taken to the uninfected computer, for instance by a user sending it over a network or carrying it on a removable medium such as a floppy disk, CD, or USB drive. Additionally, viruses can spread to other computers by infecting files on a network file system or a file system that is accessed by another computer. Viruses are sometimes confused with computer worms and Trojan horses. A worm, however, can spread itself to other computers without needing to be transferred as part of a host. A Trojan horse is a file that appears harmless until executed. In contrast to viruses, Trojan horses do not insert their code into other computer files. Many personal computers are now connected to the Internet and to local-area networks, facilitating their spread. Today's viruses may also take advantage of network services such as the World Wide Web, e-mail, and file sharing systems to spread, blurring the line between viruses and worms. Furthermore, some sources use an alternative terminology in which a virus is any form of self-replicating malware.
The term comes from the term virus in biology. A computer virus reproduces by making (possibly modified) copies of itself in the computer's memory, storage, or over a network. This is similar to the way a biological virus works.
Some viruses are programmed to damage the computer by damaging programs, deleting files, or reformatting the hard disk. Others are not designed to do any damage, but simply replicate themselves and perhaps make their presence known by presenting text, video, or audio messages. Even these benign viruses can create problems for the computer user. They typically take up computer memory used by legitimate programs. As a result, they often cause erratic behavior and can result in system crashes. In addition, many viruses are bug-ridden, and these bugs may lead to system crashes and data loss.
There are many viruses operating in the general Internet today, and new ones are discovered every day.
History
A program called "Elk Cloner" is credited with being the first computer virus to appear "in the wild" — that is, outside the single computer or lab where it was created. Written in 1982 by Richard Skrenta, it attached itself to the Apple DOS 3.3 operating system and spread by floppy disk. This virus was originally a joke, created by the high school student and put onto a game. The disk could only be used 49 times.The game was set to play, but release the virus on the 50th time of starting the game. Only this time, instead of playing the game, it would change to a blank screen that read a poem about the virus named Elk Cloner. The poem that showed up on the screen is as follows: It will get on all your disks. It will infiltrate your chips. Yes it's Cloner! It will stick to you like glue. It will modify RAM too. Send in the Cloner! The computer would then be infected.
The first PC virus was a boot sector virus called (c)Brain, created in 1986 by two brothers, Basit and Amjad Farooq Alvi, operating out of Lahore, Pakistan. The brothers reportedly created the virus to deter pirated copies of software they had written. However, analysts have claimed that the Ashar virus, a variant of Brain, possibly predated it based on code within the virus.
Before computer networks became widespread, most viruses spread on removable media, particularly floppy disks. In the early days of the personal computer, many users regularly exchanged information and programs on floppies. Some viruses spread by infecting programs stored on these disks, while others installed themselves into the disk boot sector, ensuring that they would be run when the user booted the computer from the disk.
Traditional computer viruses emerged in the 1980s, driven by the spread of personal computers and the resultant increase in BBS and modem use, and software sharing. Bulletin board driven software sharing contributed directly to the spread of Trojan horse programs, and viruses were written to infect popularly traded software. Shareware and bootleg software were equally common vectors for viruses on BBS's. Within the "pirate scene" of hobbyists trading illicit copies of retail software, traders in a hurry to obtain the latest applications and games were easy targets for viruses.
Since the mid-1990s, macro viruses have become common. Most of these viruses are written in the scripting languages for Microsoft programs such as Word and Excel. These viruses spread in Microsoft Office by infecting documents and spreadsheets. Since Word and Excel were also available for Mac OS, most of these viruses were able to spread on Macintosh computers as well. Most of these viruses did not have the ability to send infected e-mail. Those viruses which did spread through e-mail took advantage of the Microsoft Outlook COM interface.
Macro viruses pose unique problems for detection software. For example, some versions of Microsoft Word allowed macros to replicate themselves with additional blank lines. The virus behaved identically but would be misidentified as a new virus. In another example, if two macro viruses simultaneously infect a document, the combination of the two, if also self-replicating, can appear as a "mating" of the two and would likely be detected as a virus unique from the "parents".[1]
A virus may also send a web address link as an instant message to all the contacts on an infected machine. If the recipient, thinking the link is from a friend (a trusted source) follows the link to the website, the virus hosted at the site may be able to infect this new computer and continue propagating.
The newest species of the virus family is the cross-site scripting virus. The virus emerged from research and was academically demonstrated in 2005[citation needed]. This virus utilizes cross-site scripting vulnerabilities to propagate. Since 2005 there have been multiple instances of the cross-site scripting viruses in the wild, most notable sites affected have been MySpace and Yahoo.
Etymology
The word virus is derived from and used in the same sense as the biological equivalent. The term "virus" is often used in common parlance to describe all kinds of malware (malicious software), including those that are more properly classified as worms or Trojans. Most popular anti-virus software packages defend against all of these types of attack. In some technical communities, the term "virus" is also extended to include the authors of malware, in an insulting sense. The English plural of "virus" is "viruses". Some people use "virii" or "viri" as a plural, but this is rare. For a discussion about whether "viri" and "virii" are correct alternatives of "viruses", see plural of virus.
The term "virus" was first used in an academic publication by Fred Cohen in his 1984 paper Experiments with Computer Viruses, where he credits Len Adleman with coining it. However, a 1972 science fiction novel by David Gerrold, When H.A.R.L.I.E. Was One, includes a description of a fictional computer program called "VIRUS" that worked just like a virus (and was countered by a program called "VACCINE"). The term "computer virus" with current usage also appears in the comic book Uncanny X-Men #158, written by Chris Claremont and published in 1982. Therefore, although Cohen's use of "virus" may, perhaps, have been the first "academic" use, the term had been used earlier.
Why people create computer viruses
Unlike biological viruses (see Virus), computer viruses do not simply evolve by themselves. Computer viruses do not come into existence spontaneously, nor are they likely to be created by bugs in regular programs. They are deliberately created by programmers, or by people who use virus creation software. Computer viruses can only do what the programmers have programmed them to do.
Virus writers can have various reasons for creating and spreading malware. Viruses have been written as research projects, pranks, vandalism, to attack the products of specific companies, to distribute political messages, and financial gain from identity theft, spyware, and cryptoviral extortion. Some virus writers consider their creations to be works of art, and see virus writing as a creative hobby. Additionally, many virus writers oppose deliberately destructive payload routines. Many writers consider the systems they attack an intellectual challenge or a logical problem to be solved; this multiplies when a cat-and-mouse game is anticipated against anti-virus software. Some viruses were intended as "good viruses". They spread improvements to the programs they infect, or delete other viruses. These viruses are, however, quite rare, and they still consume system resources, may accidentally damage systems they infect, and, on occasion, have become infected and acted as vectors for malicious viruses. A poorly written "good virus" can also inadvertently become a harmful virus in and of itself (for example, such a 'good virus' may misidentify its target file and delete an innocent system file by mistake). Moreover, they normally operate without asking for the permission of the computer owner. Since self-replicating code causes many complications, it is questionable if a well-intentioned virus can ever solve a problem in a way that is superior to a regular program that does not replicate itself. In short, no single answer is likely to cover the broad demographic of virus writers.
Releasing computer viruses (as well as worms) is a crime in most jurisdictions.
See also the BBC News article.[2]
Replication strategies
In order to replicate itself, a virus must be permitted to execute code and write to memory. For this reason, many viruses attach themselves to executable files that may be part of legitimate programs. If a user tries to start an infected program, the virus' code may be executed first. Viruses can be divided into two types, on the basis of their behavior when they are executed. Nonresident viruses immediately search for other hosts that can be infected, infect these targets, and finally transfer control to the application program they infected. Resident viruses do not search for hosts when they are started. Instead, a resident virus loads itself into memory on execution and transfers control to the host program. The virus stays active in the background and infects new hosts when those files are accessed by other programs or the operating system itself.
Nonresident viruses
Nonresident viruses can be thought of as consisting of a finder module and a replication module. The finder module is responsible for finding new files to infect. For each new executable file the finder module encounters, it calls the replication module to infect that file.
For simple viruses the replicator's tasks are to:
Open the new file
Check if the executable file has already been infected (if it is, return to the finder module)
Append the virus code to the executable file
Save the executable's starting point
Change the executable's starting point so that it points to the start location of the newly copied virus code
Save the old start location to the virus in a way so that the virus branches to that location right after its execution
Save the changes to the executable file
Close the infected file
Return to the finder so that it can find new files for the replicator to infect
Resident viruses
Resident viruses contain a replication module that is similar to the one that is employed by nonresident viruses. However, this module is not called by a finder module. Instead, the virus loads the replication module into memory when it is executed and ensures that this module is executed each time the operating system is called to perform a certain operation. For example, the replication module can be called each time the operating system executes a file. In this case, the virus infects every suitable program that is executed on the computer.
Resident viruses are sometimes subdivided into a category of fast infectors and a category of slow infectors. Fast infectors are designed to infect as many files as possible. For instance, a fast infector can infect every potential host file that is accessed. This poses a special problem to anti-virus software, since a virus scanner will access every potential host file on a computer when it performs a system-wide scan. If the virus scanner fails to notice that such a virus is present in memory, the virus can "piggy-back" on the virus scanner and in this way infect all files that are scanned. Fast infectors rely on their fast infection rate to spread. The disadvantage of this method is that infecting many files may make detection more likely, because the virus may slow down a computer or perform many suspicious actions that can be noticed by anti-virus software. Slow infectors, on the other hand, are designed to infect hosts infrequently. For instance, some slow infectors only infect files when they are copied. Slow infectors are designed to avoid detection by limiting their actions: they are less likely to slow down a computer noticeably, and will at most infrequently trigger anti-virus software that detects suspicious behavior by programs. The slow infector approach does not seem very successful, however.
Vectors and hosts
Viruses have targeted various types of transmission media or hosts. This list is not exhaustive:
Binary executable files (such as COM files and EXE files in MS-DOS, Portable Executable files in Microsoft Windows, and ELF files in Linux)
Volume Boot Records of floppy disks and hard disk partitions
The master boot record (MBR) of a hard disk
General-purpose script files (such as batch files in MS-DOS and Microsoft Windows, VBScript files, and shell script files on Unix-like platforms).
Application-specific script files (such as Telix-scripts)
Documents that can contain macros (such as Microsoft Word documents, Microsoft Excel spreadsheets, AmiPro documents, and Microsoft Access database files)
Cross-site scripting vulnerabilities in web applications
Inhospitable vectors
It is difficult, but not impossible, for viruses to tag along in source files, seeing that computer languages are built also for human eyes and experienced operators. It is very probably impossible for viruses to tag along in data files like MP3s, MPGs, OGGs, JPGs, GIFs, PNGs, MNGs, PDFs, and DVI files (this is not an exhaustive list of generally trusted file types). Even if a virus were to 'infect' such a file, it would be inoperative, since there would be no way for the viral code to be executed. A caveat must be mentioned from PDFs, that like HTML, may link to malicious code. Further, an exploitable buffer overflow in a program which reads the data files could be used to trigger the execution of code hidden within the data file, but this attack is substantially mitigated in computer architectures with an execute disable bit.
It is worth noting that some virus authors have written an .EXE extension on the end of .PNG (for example), hoping that users would stop at the trusted file type without noticing that the computer would start with the final type of file. See Trojan horse (computing).
Methods to avoid detection
in order to avoid detection by users, some viruses employ different kinds of deception. Some old viruses, especially on the MS-DOS platform, make sure that the "last modified" date of a host file stays the same when the file is infected by the virus. This approach does not fool anti-virus software, however, especially that which maintains and dates Cyclic Redundancy Codes on file changes.
Some viruses can infect files without increasing their sizes or damaging the files. They accomplish this by overwriting unused areas of executable files. These are called cavity viruses. For example the CIH virus, or Chernobyl Virus, infects Portable Executable files. Because those files had many empty gaps, the virus, which was 1 KB in length, did not add to the size of the file.
Some viruses try to avoid detection by killing the tasks associated with antivirus software before it can detect them.
As computers and operating systems grow larger and more complex, old hiding techniques need to be updated or replaced. Defending your computer against viruses may demand that your file system migrate towards detailed and explicit permission for every kind of file access.
Avoiding bait files and other undesirable hosts
A virus needs to infect hosts in order to spread further. In some cases, it might be a bad idea to infect a host program. For example, many anti-virus programs perform an integrity check of their own code. Infecting such programs will therefore increase the likelihood that the virus is detected. For this reason, some viruses are programmed not to infect programs that are known to be part of anti-virus software. Another type of host that viruses sometimes avoid is bait files. Bait files (or goat files) are files that are specially created by anti-virus software, or by anti-virus professionals themselves, to be infected by a virus. These files can be created for various reasons, all of which are related to the detection of the virus:
Anti-virus professionals can use bait files to take a sample of a virus (i.e. a copy of a program file that is infected by the virus). It is more practical to store and exchange a small, infected bait file, than to exchange a large application program that has been infected by the virus.
Anti-virus professionals can use bait files to study the behavior of a virus and evaluate detection methods. This is especially useful when the virus is polymorphic. In this case, the virus can be made to infect a large number of bait files. The infected files can be used to test whether a virus scanner detects all versions of the virus.
Some anti-virus software employs bait files that are accessed regularly. When these files are modified, the anti-virus software warns the user that a virus is probably active on the system.
Since bait files are used to detect the virus, or to make detection possible, a virus can benefit from not infecting them. Viruses typically do this by avoiding suspicious programs, such as small program files or programs that contain certain patterns of 'garbage instructions'.
A related strategy to make baiting difficult is sparse infection. Sometimes, sparse infectors do not infect a host file that would be a suitable candidate for infection in other circumstances. For example, a virus can decide on a random basis whether to infect a file or not, or a virus can only infect host files on particular days of the week!
Stealth
Some viruses try to trick anti-virus software by intercepting its requests to the operating system. A virus can hide itself by intercepting the anti-virus software’s request to read the file and passing the request to the virus, instead of the OS. The virus can then return an uninfected version of the file to the anti-virus software, so that it seems that the file is "clean". Modern anti-virus software employs various techniques to counter stealth mechanisms of viruses. The only completely reliable method to avoid stealth is to boot from a medium that is known to be clean.
Self-modification
Most modern antivirus programs try to find virus-patterns inside ordinary programs by scanning them for so-called virus signatures. A signature is a characteristic byte-pattern that is part of a certain virus or family of viruses. If a virus scanner finds such a pattern in a file, it notifies the user that the file is infected. The user can then delete, or (in some cases) "clean" or "heal" the infected file. Some viruses employ techniques that make detection by means of signatures difficult but probably not impossible. These viruses modify their code on each infection. That is, each infected file contains a different variant of the virus.
Encryption with a variable key
A more advanced method is the use of simple encryption to encipher the virus. In this case, the virus consists of a small decrypting module and an encrypted copy of the virus code. If the virus is encrypted with a different key for each infected file, the only part of the virus that remains constant is the decrypting module, which would (for example) be appended to the end. In this case, a virus scanner cannot directly detect the virus using signatures, but it can still detect the decrypting module, which still makes indirect detection of the virus possible. Since these would be symmetric keys, stored on the infected host, it is in fact entirely possible to decrypt the final virus, but that probably isn't required, since self-modifying code is such a rarity that it may be reason for virus scanners to at least flag the file as suspicious.
An old, but compact, encryption involved XORing each byte in a virus with a constant, such that a XOR b = c, and c XOR b = a, so that the exclusive or operation had only to be repeated for decryption. It is suspicious code that modifies itself, so the code to do this may be part of the signature in many virus definitions.
Polymorphic code
Polymorphic code was the first technique that posed a serious threat to virus scanners. Just like regular encrypted viruses, a polymorphic virus infects files with an encrypted copy of itself, which is decoded by a decryption module. In the case of polymorphic viruses however, this decryption module is also modified on each infection. A well-written polymorphic virus therefore has no parts that stay the same on each infection, making it very difficult to detect directly using signatures. Anti-virus software can detect it by decrypting the viruses using an emulator, or by statistical pattern analysis of the encrypted virus body. To enable polymorphic code, the virus has to have a polymorphic engine (also called mutating engine or mutation engine) somewhere in its encrypted body. See Polymorphic code for technical detail on how such engines operate.
Some viruses employ polymorphic code in a way that constrains the mutation rate of the virus significantly. For example, a virus can be programmed to mutate only slightly over time, or it can be programmed to refrain from mutating when it infects a file on a computer that already contains copies of the virus. The advantage of using such slow polymorphic code is that it makes it more difficult for anti-virus professionals to obtain representative samples of the virus, because bait files that are infected in one run will typically contain identical or similar samples of the virus. This will make it more likely that the detection by the virus scanner will be unreliable, and that some instances of the virus may be able to avoid detection.
Metamorphic code
To avoid being detected by emulation, some viruses rewrite themselves completely each time they are to infect new executables. Viruses that use this technique are said to be metamorphic. To enable metamorphism, a metamorphic engine is needed. A metamorphic virus is usually very large and complex. For example, W32/Simile consisted of over 14000 lines of Assembly language code, 90% of it part of the metamorphic engine.
The vulnerability of operating systems to viruses
Another analogy to biological viruses: just as genetic diversity in a population decreases the chance of a single disease wiping out a population, the diversity of software systems on a network similarly limits the destructive potential of viruses.
This became a particular concern in the 1990s, when Microsoft gained market dominance in desktop operating systems and office suites. The users of Microsoft software (especially networking software such as Microsoft Outlook and Internet Explorer) are especially vulnerable to the spread of viruses. Microsoft software is targeted by virus writers due to their desktop dominance, and is often criticized for including many errors and holes for virus writers to exploit. Integrated applications, applications with scripting languages with access to the file system (for example Visual Basic Script (VBS), and applications with networking features) are also particularly vulnerable.
Although Windows is by far the most popular operating system for virus writers, some viruses also exist on other platforms. Any operating system that allows third-party programs to run can theoretically run viruses. Some operating systems are less secure than others. Unix-based OS's (and NTFS-aware applications on Windows NT based platforms) only allow their users to run executables within their protected space in their own directories.
As of 2006, there are relatively few security exploits[3] targeting Mac OS X (with a Unix-based file system); the known vulnerabilities fall under the classifications of worms and Trojans. The number of viruses for the older Apple operating systems, known as Mac OS Classic, varies greatly from source to source, with Apple stating that there are only four known viruses, and independent sources stating there are as many as 63 viruses. It is safe to say that Macs are less likely to be exploited due to their secure Unix base, and because a Mac-specific virus could only infect a small proportion of computers (making the effort less desirable). Virus vulnerability between Macs and Windows is a chief selling point Apple Computers use to get users to switch away from Microsoft (Get a Mac). Ironically if a change in the user base away from PCs and towards Macs was to occur then the Mac OS X platform would become a much more desirable target to virus writers. As there are currently few anti virus solutions available on the OS X platform, there is the possibility that this would become a considerable problem for Mac users very quickly, Apple literally becoming a victim of their own success.[4]
Windows and Unix have similar scripting abilities, but while Unix natively blocks normal users from having access to make changes to the operating system environment, Windows does not. In 1997, when a virus for Linux was released – known as "Bliss" – leading antivirus vendors issued warnings that Unix-like systems could fall prey to viruses just like Windows.[5] The Bliss virus may be considered characteristic of viruses – as opposed to worms – on Unix systems. Bliss requires that the user run it explicitly (making it a trojan), and it can only infect programs that the user has the access to modify. Unlike Windows users, most Unix users do not log in as an administrator user except to install or configure software; as a result, even if a user ran the virus, it could not harm their operating system. The Bliss virus never became widespread, and remains chiefly a research curiosity. Its creator later posted the source code to Usenet, allowing researchers to see how it worked.[6]
The role of software development
Because software is often designed with security features to prevent unauthorized use of system resources, many viruses must exploit software bugs in a system or application to spread. Software development strategies that produce large numbers of bugs will generally also produce potential exploits.
Anti-virus software and other preventive countermeasures
There are two common methods that an anti-virus software application uses to detect viruses. The first, and by far the most common method of virus detection is using a list of virus signature definitions. The disadvantage of this detection method is that users are only protected from viruses that pre-date their last virus definition update. The second method is to use a heuristic algorithm to find viruses based on common behaviors. This method has the ability to detect viruses that anti-virus security firms’ have yet to create a signature for.
Many users install anti-virus software that can detect and eliminate known viruses after the computer downloads or runs the executable. They work by examining the content heuristics of the computer's memory (its RAM, and boot sectors) and the files stored on fixed or removable drives (hard drives, floppy drives), and comparing those files against a database of known virus "signatures". Some anti-virus programs are able to scan opened files in addition to sent and received emails 'on the fly' in a similar manner. This practice is known as "on-access scanning." Anti-virus software does not change the underlying capability of host software to transmit viruses. Users must update their software regularly to patch security holes. Anti-virus software also needs to be regularly updated in order to gain knowledge about the latest threats.
One may also prevent the damage done by viruses by making regular backups of data (and the Operating Systems) on different media, that are either kept unconnected to the system (most of the time), read-only or not accessible for other reasons, such as using different file systems. This way, if data is lost through a virus, one can start again using the backup (which should preferably be recent). If a backup session on optical media like CD and DVD is closed, it becomes read-only and can no longer be affected by a virus. Likewise, an Operating System on a bootable can be used to start the computer if the installed Operating Systems become unusable. Another method is to use different Operating Systems on different file systems. A virus is not likely to affect both. Data backups can also be put on different file systems. For example, Linux requires specific software to write to NTFS partitions, so if one does not install such software and uses a separate installation of MS Windows to make the backups on an NTFS partition (and preferably only for that reason), the backup should remain safe from any Linux viruses. Likewise, MS Windows can not read file systems like ext3, so if one normally uses MS Windows, the backups can be made on an ext3 partition using a Linux installation.
Recovery methods
Once a computer has been compromised by a virus, it is usually unsafe to continue using the same computer without completely reinstalling the operating system. However, there are a number of recovery options that exist after a computer has a virus. These actions depend on severity of the type of virus.
Virus removal
One possibility on Windows XP is a tool known as System Restore, which restores the registry and critical system files to a previous checkpoint. Often a virus will cause a system to hang, and a subsequent hard reboot will render a system restore point from the same day corrupt. Restore points from previous days should work provided the virus is not designed to corrupt the restore files. Some viruses, however, disable system restore and other important tools such as Task Manager and Command Prompt. Examples of viruses that do this would be CiaDoor.
Administrators have the option to disable such tools from limited users for various reasons. The virus modifies the registry to do the same, except, when the Administrator is controlling the computer, it blocks all users from accessing the tools. When an infected tool activates it gives the message "Task Manager has been disabled by your administrator.", even if the user trying to open the program is the administrator.
Operating system reinstallation
As a last ditch effort, if a virus is on your system and anti-viral software can't clean it, then reinstalling the operating system may be required. To do this properly, the hard drive is completely erased (partition deleted and formatted) and the operating system is installed from media known not to be infected. Important files should first be backed up, if possible, and separately scanned for infection before erasing the original hard drive and reinstalling the operating system.