SSJX.CO.UK

Memory Safety - How D Prevents Out Of Bounds Errors

Introduction

An out of bounds error / exception occurs if you try to read or write to an area of memory that is outside the amount allocated. Simply put, going out of bounds is like marking 10 boxes out of 20 for your use and then putting something in box 11.

We don't know what is stored beyond what we have allocated or what it is being used for, so putting our data there causes unexpected behaviour. Continuing our example, was box 11 storing a person's name? If so, we have just corrupted it. Maybe it was a password, which we have now changed? Reading memory that is outside our allocated amount is also a problem as it may reveal private information.

By exploiting memory issues, attackers can install and launch rogue software. The situation regarding memory safety has prompted the National Security Agency to publish documentation advising the use of memory safe languages and other techniques to combat the problem.

The ideal outcome for going out of bounds is for the program to crash and give an error message, but this does not always happen. As with most things, prevention is better than cure.

Examples Of Going Out Of Bounds

The example below (in C) shows many ways to go out of bounds:

#include <stdio.h>

int main(void){
	// This means that our array has boxes 0,1,2
	int array[3]={1,2,3};
		
	// 1. No box 3, Should be easy to catch this one during compilation...
	array[3]=97;
	
	// 2. Given that 'pa' is fixed, should be caught during compilation too...
	const int pa=3;
	array[pa]=98;

	// 3. Our array position 'pb' is a variable this time
	int pb=4;
	array[pb]=99;

	// 4. This loop counts up to index 3, one past the last valid index (2).
	for(int i=0;i<4;i++){
		printf("%d\n",array[i]);
	}
}

As the comments in the example mention, some of these errors really should be caught by compilers, but they are not always. The example above is in C, but even the Java version does not detect any issues during compilation. The Rust compiler only picked up the first error.

Running the compiled C version of this program did not trigger a crash, which can result in the problems mentioned in the introduction. On the plus side, going out of bounds in Java or Rust does cause a program crash, which is a better outcome than just continuing to run.

The next section details how the D Programming Language handles these errors.

Using The D Programming Language

From the D website:

D is a general-purpose programming language with static typing, systems-level access, and C-like syntax. With the D Programming Language, write fast, read fast, and run fast.

For further reading about D's Memory Safety, click here.

Here is the same program written in the D Programming Language:

import std.stdio;

void main(){
	// This means that our array has boxes 0,1,2
	int[3] array=[1,2,3];

	// 1. No box 3, Should be easy to catch this one during compilation...
	array[3]=97;
	
	// 2. Given that 'pa' is fixed, should be caught during compilation too...
	const int pa=3;
	array[pa]=98;

	// 3. Our array position 'pb' is a variable this time
	int pb=4;
	array[pb]=99;

	// 4. This loop counts up to index 3, one past the last valid index (2).
	foreach(i;0..4){
		writeln(array[i]);
	}
}

The program looks very similar to the C version. When it is compiled, I get these two error messages:

oob_d.d(14): Error: array index 3 is out of bounds `array[0 .. 3]`
oob_d.d(18): Error: array index 3 is out of bounds `array[0 .. 3]`

The line numbers refer to the first two errors in the example. This is a great start as it points me in the direction of the problems! When those errors are commented out, the other two errors get picked up at runtime and the program exits.

This is what we get when the loop goes outside our allocated range:

core.exception.ArrayIndexError@oob_d.d(26): index [3] is out of bounds for array of length 3
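
The runtime check stops the program, which is the safe outcome, but for an index like 'pb' that is only known at runtime we may prefer to test it ourselves first. The sketch below is one possible way to do that; the helper name 'trySet' is made up for this illustration and is not part of D or its standard library:

```d
import std.stdio;

// Hypothetical helper (not from the article): writes 'value' into 'arr'
// only when 'index' is inside the array, and reports whether it did.
bool trySet(int[] arr, size_t index, int value){
	if(index>=arr.length){
		return false;
	}
	arr[index]=value;
	return true;
}

void main(){
	int[3] array=[1,2,3];

	// 'pb' is only known at runtime, so we check it before using it
	size_t pb=4;
	if(!trySet(array[],pb,99)){
		writeln("Index ",pb," is outside the array of length ",array.length);
	}

	// A valid index is written as normal
	trySet(array[],2,42);
	writeln(array[]);
}
```

D's own bounds check is still in place here. The extra test just lets the program carry on with a message of our choosing instead of exiting.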

Extra Points

To fix error #4, we could use a range-based loop:

	// 4. Looping through array - Better version
	foreach(i;array){
		writeln(i);
	}

Range-based loops are available in other languages and should be used where possible. They guarantee we will not go outside our allocated memory.
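
A common reason for falling back to a counting loop is needing the position as well as the value, or wanting to change the elements. As a rough sketch, D's foreach can cover both cases without us writing the loop limits by hand. The function 'total' below is a made-up helper for this illustration:

```d
import std.stdio;

// Hypothetical helper (not from the article): adds up every element
// without using an index at all.
int total(const int[] values){
	int sum=0;
	foreach(v;values){
		sum+=v;
	}
	return sum;
}

void main(){
	int[3] array=[1,2,3];

	// The index 'i' is supplied by the loop, so it stays inside the array
	foreach(i,v;array){
		writeln(i,": ",v);
	}

	// 'ref' lets us change each element in place
	foreach(ref v;array){
		v*=2;
	}

	writeln(total(array[]));
}
```

In both loops the compiler works out the start and end from the array itself, so there is no hard-coded limit like the 4 in the original loop to get wrong.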

Conclusion

I like the fact that these sensible checks are enabled by default and do not require some arcane knowledge to enable. Given the number of severe memory-based flaws that keep happening, changing to a safer language such as D may mean more work in the short term but may be worth it in the long term. The familiar syntax and helpful compiler can help make the change easier.

Updated 20/01/2024
Created 18/02/23