What is it?
A buffer overrun happens when some code writes data outside the boundaries of an array. Take, for example, this snippet of code:
char HelloWorld[5];
strcpy(HelloWorld, "Hello World!");
This causes a buffer overrun because the string "Hello World!" is 12 characters plus a terminating null character (13 characters in total), while the array holds only 5 characters.
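The arithmetic above can be checked in code before copying anything. The following is only a minimal sketch; the helper names bytes_needed and fits are made up for this illustration and are not part of any library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: bytes needed to store a string,
   counting the terminating null character. */
size_t bytes_needed(const char *s)
{
    return strlen(s) + 1;
}

/* Hypothetical helper: nonzero if the string, terminator included,
   fits into a buffer of the given capacity. */
int fits(const char *s, size_t capacity)
{
    return bytes_needed(s) <= capacity;
}
```

With these helpers, bytes_needed("Hello World!") is 13, so fits("Hello World!", 5) is false and the strcpy call should never be made with a 5-character array.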
Why is it bad?
The biggest problem with buffer overruns is that they are a security issue.
By exploiting a buffer overrun, it's possible to run custom code in a process. This vulnerability has been exploited extensively in attacks against Windows for a long time (just take a look at Windows Update).
Another security hole that has recently been exploited is the buffer overrun in the famous game Twilight Princess. By exploiting the bug, hackers have been able to run custom code, such as Tetris, on the console. For more information, see here. The scary thing is what could have happened if that harmless code to run Tetris had been replaced by malicious code. A seemingly harmless application could then become a security issue or even harm the computer on which it is installed.
Another problem with buffer overruns is that they can corrupt other data in the program. Let's take a look at this simple, yet dangerous code:
Note that the standard says that buffer overruns cause undefined behavior. Therefore, this code may not give the same results on all compilers and platforms. This example was run with Microsoft Visual Studio on the Windows platform, which gives the result that this example assumes. It does not show what will happen, but rather what might happen.
const size_t SIZEOF_HELLO_WORLD = sizeof("Hello World!");
int MyInt = 0;
char MyArray[SIZEOF_HELLO_WORLD - 1];
strcpy(MyArray, "Hello World!!");
printf("%s", MyArray);
printf("%i", MyInt);
Notes: this snippet of code assumes that the computer is a 32-bit computer. It may not behave the same way on all compilers and platforms. Some compilers (such as Visual Studio) can place extra bytes before and after a variable, buffer or array to detect buffer overruns. The code is only here to give a possible scenario as to how data corruption can occur.
The size of "Hello World!" is actually 13 bytes, but since the stack is aligned on DWORD boundaries (4 bytes), the array is declared one byte smaller (12 bytes) so that it ends exactly on a boundary. Otherwise, padding would have been added after the array, the extra bytes would have landed in the padding, and the code would have appeared to run fine.
The buffer overrun occurs on the strcpy line which actually copies a total of 14 bytes into a 12-byte array.
To demonstrate how corruption can occur, an int, MyInt, is defined just before the array and initialized to 0. However, when printing out the contents of both variables, we get "Hello World!!" and 33.
But we just initialized MyInt to 0. So how come it prints 33? The answer is simple: when copying the data into the MyArray array, strcpy actually wrote data beyond the end of the array, into the region of memory occupied by the MyInt variable. The array holds the first 12 characters, so the 13th byte (the second '!') and the 14th byte (the terminating null character) end up in MyInt. The number 33 is the ASCII code of the character '!'. If we assigned '!' to MyInt, we'd get the same result: 33.
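To see where the 33 comes from, it helps to work out the bytes by hand. Assuming MyInt sits directly after MyArray in memory (the layout this example relies on, which no compiler guarantees) and the machine is little-endian, as x86 Windows is, the two stray bytes '!' and '\0' overwrite the two lowest bytes of MyInt, while its two highest bytes are still 0. The sketch below rebuilds an integer from four such bytes; the function name value_from_little_endian is made up for this illustration.

```c
#include <stdint.h>

/* Hypothetical helper: interprets four bytes as a little-endian
   32-bit integer, i.e. the first byte is the least significant one. */
uint32_t value_from_little_endian(const unsigned char bytes[4])
{
    return (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}
```

Feeding it the bytes { '!', '\0', 0, 0 } gives 33 on an ASCII system, which matches the value printed for MyInt.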
A trusted member on CBoard, Prelude, has also provided valuable experience that further demonstrates the danger of corruption: corrupting financial data!
Writing beyond the end of a buffer can also cause the program to crash if the program tries to write to memory it does not own. Typically this can happen if an array's end is situated near the end of a page and the next page is unallocated. For more information, see Virtual Memory.
Avoiding buffer overruns
Avoid dangerous functions
To avoid buffer overruns, it is a good idea to use safe functions and avoid ones prone to buffer overruns. Dangerous functions that should be avoided include the commonly used standard C function gets, which has no way of knowing how large the destination buffer is. Reading strings with scanf can also be dangerous, unless scanf is told the size of the buffer you're trying to read into. For more information about avoiding this, see this. For more information on the dangers of certain functions and how to avoid them, see gets, fgets and Buffer overrun.
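As a rough illustration of the advice above, the sketch below replaces gets with fgets, which takes the buffer size, and gives the %s conversion a field width so that it cannot write past the buffer. The helper names read_line and read_word are made up for this example, and sscanf is used instead of scanf only so that the input can be passed in as a string.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from `in` into `buf`, which has
   room for `size` bytes. fgets stores at most size - 1 characters and
   always adds the terminating null character. Assumes size fits in int. */
int read_line(char *buf, size_t size, FILE *in)
{
    if (size == 0 || fgets(buf, (int)size, in) == NULL)
        return 0;                    /* no room, end of file, or read error */
    buf[strcspn(buf, "\n")] = '\0';  /* drop the trailing newline, if any */
    return 1;
}

/* Hypothetical helper: copies the first word of `text` into a 16-byte
   buffer. The width 15 leaves one byte for the terminator; a plain %s
   would keep writing for as long as the word continues. */
int read_word(const char *text, char word[16])
{
    return sscanf(text, "%15s", word) == 1;
}
```

A call such as read_line(name, sizeof name, stdin) then truncates overly long input instead of overrunning name. Whatever did not fit stays in the input stream, so real code may also want to discard the rest of the line.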
Use safe libraries
Another good suggestion is to use a safe library if available. Safe libraries contain functions such as strcpy_s, which takes the size of the buffer that the string is copied into. If the buffer is too small, the function reports an error instead of overrunning the buffer (Visual Studio's debug builds, for instance, raise an assertion). Visual Studio is one IDE and compiler that ships with such a safe library. Visual Studio Express is free, which makes it a viable option.
Otherwise, programmers can write safe functions themselves. If a safe library can't be used, this is recommended. However, writing such functions is left as an exercise to the reader.
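For readers who want a starting point, here is one possible shape such a function could take. It is only a sketch under assumptions: the name checked_copy and its return convention (0 on success, -1 on failure) are invented for this example and do not mirror strcpy_s or any other library function.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies `src` into `dst` only if the whole string,
   terminator included, fits into `dst_size` bytes.
   Returns 0 on success and -1 on failure. */
int checked_copy(char *dst, size_t dst_size, const char *src)
{
    size_t needed;

    if (dst == NULL || src == NULL || dst_size == 0)
        return -1;

    needed = strlen(src) + 1;        /* characters plus the terminator */
    if (needed > dst_size) {
        dst[0] = '\0';               /* leave an empty, valid string behind */
        return -1;
    }

    memcpy(dst, src, needed);
    return 0;
}
```

Called as checked_copy(HelloWorld, sizeof HelloWorld, "Hello World!") with a 5-character array, it returns -1 and writes nothing beyond the first byte. Refusing the copy rather than truncating is a design choice; silently truncated strings can cause bugs of their own.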
Safe use of arrays
A common cause of buffer overruns is writing or reading beyond the end of an array. To safeguard against this, the best method available for C++ is to use std::vector for arrays and to access elements through the .at() member function, which throws std::out_of_range if an out-of-bounds access is attempted. Some compilers and IDEs, such as Visual Studio, can also raise an assertion, or even throw an exception if configured to do so, when the index operator is used out of bounds (the standard does not guarantee any check for the index operator). This lessens the chances of buffer overruns and is therefore recommended.
Unfortunately, C offers no such easy way. C developers are encouraged to use detection tools instead.
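Nothing in the C language or standard library does this checking for you, but the idea behind .at() can be approximated by hand at the cost of some boilerplate. The sketch below is one assumed way of doing it; the IntArray type and the int_array_get and int_array_set functions are invented for this illustration.

```c
#include <stddef.h>

/* Hypothetical type: an int array that carries its own length. */
typedef struct {
    int   *data;
    size_t length;
} IntArray;

/* Stores the element at `index` in *out and returns 1,
   or returns 0 without touching memory if the index is out of range. */
int int_array_get(const IntArray *a, size_t index, int *out)
{
    if (index >= a->length)
        return 0;
    *out = a->data[index];
    return 1;
}

/* Writes `value` at `index` and returns 1,
   or returns 0 without writing if the index is out of range. */
int int_array_set(IntArray *a, size_t index, int value)
{
    if (index >= a->length)
        return 0;
    a->data[index] = value;
    return 1;
}
```

This only helps if every access goes through the two functions and the stored length is correct, which is why it is a convention rather than a guarantee.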
Use detection tools
Some tools are designed to detect buffer overruns, and they can be quite handy when writing and debugging code. Two examples of such software are Electric Fence and Valgrind. It might be a good idea to run such software even on presumably "safe" code, just to make sure that no overruns are detected.