In which we use System.Runtime.CompilerServices.Unsafe, a generic API (“type-safe” but still “unsafe”), and mess with the C# Type System!
The post covers the following topics:
- What it is and why it’s useful
- How it works
- Code samples
- Tricks you can do with it
- Using it safely
What It Is and Why It’s Useful
The XML documentation comments for System.Runtime.CompilerServices.Unsafe state that it:
Contains generic, low-level functionality for manipulating pointers.
But we can get a better understanding of what it is by looking at the actual API definition from the current NuGet package (4.0.0):
public static class Unsafe
{
    public static T As<T>(object o) where T : class;
    public static void* AsPointer<T>(ref T value);
    public static void Copy<T>(void* destination, ref T source);
    public static void Copy<T>(ref T destination, void* source);
    public static void CopyBlock(void* destination, void* source, uint byteCount);
    public static void InitBlock(void* startAddress, byte value, uint byteCount);
    public static T Read<T>(void* source);
    public static int SizeOf<T>();
    public static void Write<T>(void* destination, T value);
}
Note: I edited the XML doc-comments for brevity; the full versions are available in the source. Some additional methods have since been added to the API, but to make use of them you need a version of the C# compiler that supports ref returns and locals.
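To make that listing a bit more concrete, here's a minimal sketch that exercises InitBlock, CopyBlock, Read and SizeOf from the API above. The BlockDemo class and its method names are made up for illustration, and the code assumes the project is compiled with unsafe code enabled.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical demo class, not part of the library
public static class BlockDemo
{
    // Fills one stack buffer with 0xFF, copies it to a second buffer,
    // then reads the first 4 bytes of the copy back as an int
    public static unsafe int FillAndCopy()
    {
        byte* source = stackalloc byte[16];
        byte* destination = stackalloc byte[16];

        Unsafe.InitBlock(source, 0xFF, 16);         // set all 16 bytes to 0xFF
        Unsafe.CopyBlock(destination, source, 16);  // copy 16 bytes across

        return Unsafe.Read<int>(destination);       // 0xFFFFFFFF reinterpreted as an int
    }

    // SizeOf<T>() works for any T, unlike the C# sizeof operator
    public static int SizeOfDecimal() => Unsafe.SizeOf<decimal>();
}
```

FillAndCopy() returns -1 (all bits set) and SizeOfDecimal() returns 16.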
However, this doesn’t really tell us why it’s useful. To get some background on that, we can look at the GitHub issue “Provide a generic API to read from and write to a pointer”.
So at a high level, the goals of the System.Runtime.CompilerServices.Unsafe library are to:
- Provide a safer way of writing low-level unsafe code
  - Without this library, you have to resort to fixed and pointer manipulation, which can be error prone
- Allow access to functionality that can’t be expressed in C#, but is possible in IL
- Save developers from having to repeatedly write the same unsafe code
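As a rough illustration of the “can’t be expressed in C#” point: the compiler rejects a pointer to an unconstrained generic type, so you can't write a generic “read a T from this address” helper with plain pointers (later C# versions relax this with the unmanaged constraint). The sketch below shows how Unsafe.Read<T> fills that gap. BufferReader and ReadAt are hypothetical names, the code assumes unsafe code is enabled, and it assumes T contains no object references and that the offset is suitably aligned.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical helper, for illustration only
public static class BufferReader
{
    // Reads a T out of 'buffer', starting at byte 'offset'
    public static unsafe T ReadAt<T>(byte[] buffer, int offset) where T : struct
    {
        if (offset < 0 || offset + Unsafe.SizeOf<T>() > buffer.Length)
            throw new ArgumentOutOfRangeException(nameof(offset));

        fixed (byte* start = buffer)
        {
            // '*(T*)(start + offset)' does not compile for a generic T,
            // but the generic Read<T> method does the equivalent
            return Unsafe.Read<T>(start + offset);
        }
    }
}
```

For example, after copying BitConverter.GetBytes(42) into a buffer at offset 4, BufferReader.ReadAt<int>(buffer, 4) gives back 42.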
How It Works
Because the library allows access to functionality that can’t be expressed in C#, it has to be written in raw IL, which is then compiled by a custom build step. As an example, we will look at the AsPointer method, which has the following signature:
public static void* AsPointer<T>(ref T value)
The IL for this is shown below. Note how the ref keyword becomes & in IL and <T> is expressed as !!T:
.method public hidebysig static void* AsPointer<T>(!!T& 'value') cil managed aggressiveinlining
{
    .custom instance void System.Runtime.Versioning.NonVersionableAttribute::.ctor() = ( 01 00 00 00 )
    .maxstack 1
    ldarg.0
    conv.u
    ret
}
Here, we can see that it’s making use of the conv.u IL instruction. For reference, the explanations of this and some of the other op codes used by the library are shown below:
- Conv_U - Converts the value on top of the evaluation stack to unsigned native int, and extends it to native int.
- Ldobj - Copies the value type object pointed to by an address to the top of the evaluation stack.
- Stobj - Copies a value of a specified type from the evaluation stack into a supplied memory address.
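If you want to experiment with these op codes without setting up an IL build step, System.Reflection.Emit can stitch the same kind of instructions together at runtime. This is a different technique from the one the library uses, and the sketch below is only meant to show ldobj in action. IlSketch and BuildReadInt32 are hypothetical names.

```csharp
using System;
using System.Reflection.Emit;

// Hypothetical demo class: emits IL at runtime instead of compiling a .il file
public static class IlSketch
{
    // Builds the equivalent of 'int ReadInt32(IntPtr source)' from three IL instructions
    public static Func<IntPtr, int> BuildReadInt32()
    {
        var method = new DynamicMethod(
            "ReadInt32", typeof(int), new[] { typeof(IntPtr) }, typeof(IlSketch).Module);

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);             // push the address onto the evaluation stack
        il.Emit(OpCodes.Ldobj, typeof(int));  // copy the Int32 at that address onto the stack
        il.Emit(OpCodes.Ret);                 // return it

        return (Func<IntPtr, int>)method.CreateDelegate(typeof(Func<IntPtr, int>));
    }
}
```

Calling the resulting delegate with the address of some unmanaged memory that holds an Int32 (for example, memory from Marshal.AllocHGlobal) returns that value, which is roughly the job Unsafe.Read<int> does.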
After searching around, I found several other places in the .NET Runtime that make use of raw IL in this way.
Code Samples
There’s a nice set of unit tests that show the main use-cases for the library. For instance, here is how to use Unsafe.Write(..) to directly change the value of an int via a pointer:
[Fact]
public static unsafe void WriteInt32()
{
    int value = 10;
    int* address = (int*)Unsafe.AsPointer(ref value);
    int expected = 20;
    Unsafe.Write(address, expected);
    Assert.Equal(expected, value);
    Assert.Equal(expected, *address);
    Assert.Equal(expected, Unsafe.Read<int>(address));
}
You can write something similar by manipulating pointers directly, but it’s not as straightforward (unless you are familiar with C or C++).
int value = 10;
int* ptr = &value;
*ptr = 30;
Console.WriteLine(value);
For a more real-world use case, the code below shows how you can access a KeyValuePair<DateTime, decimal> directly as a byte[] (taken from a GitHub discussion):
var dt = new KeyValuePair<DateTime, decimal>[2];
ref byte asRefByte = ref Unsafe.As<KeyValuePair<DateTime, decimal>, byte>(ref dt[0]);
fixed (byte* ptr = &asRefByte)
{
    ...
}
(This example is based on the StackOverflow question: “Get unsafe pointer to array of KeyValuePair<DateTime,decimal> in C#”)
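The same reinterpreting idea can be sketched without any pointers at all. The helper below copies the raw bytes of a struct into a new array, using the ref-returning Unsafe.As<TFrom, TTo> together with Unsafe.Add. It assumes a version of the package newer than 4.0.0 (one that includes those ref-based methods), and ReinterpretDemo and ToBytes are hypothetical names.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical helper, for illustration only
public static class ReinterpretDemo
{
    // Returns a copy of the bytes that make up 'value'
    public static byte[] ToBytes<T>(ref T value) where T : struct
    {
        var bytes = new byte[Unsafe.SizeOf<T>()];

        // View the start of the struct as a byte, without copying anything
        ref byte first = ref Unsafe.As<T, byte>(ref value);

        for (int i = 0; i < bytes.Length; i++)
        {
            // Unsafe.Add moves the reference forward by 'i' bytes
            bytes[i] = Unsafe.Add(ref first, i);
        }
        return bytes;
    }
}
```

For an int, the result matches what BitConverter.GetBytes returns. Note that nothing here needs the unsafe keyword, but it is still only sensible for structs that don't contain object references.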
Tricks You Can Do With It
Despite providing you with a nice strongly-typed API, you still have to mark your code as unsafe, which is a bit of a give-away that you can use it to do things that normal C# can’t!
Breaking Immutability
Strings in C# are immutable and the runtime goes to great lengths to ensure you can’t bypass this behaviour. However, under the hood, the String data is just bytes which can be manipulated. Indeed, the runtime does this manipulation itself inside the StringBuilder class.
So using Unsafe.Write(..), we can modify the contents of a String - yay!!
var text = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
Console.WriteLine("String Length {0}", text.Length);
Console.WriteLine("Text: \"{0}\"", text);
// Pin the string so the GC can't move it; the address returned is that of the first char
var pinnedText = GCHandle.Alloc(text, GCHandleType.Pinned);
char* textAddress = (char*)pinnedText.AddrOfPinnedObject().ToPointer();
// The 4-byte length field sits just before the first char (2 chars back = 4 bytes)
Unsafe.Write(textAddress - 2, 5);
Console.WriteLine("String Length {0}", text.Length);
Console.WriteLine("Text: \"{0}\"", text);
// Overwrite the second character
Unsafe.Write(textAddress + 1, '@');
Console.WriteLine("Text: \"{0}\"", text);
pinnedText.Free();
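One thing to watch out for if you try this yourself: string literals are interned, so scribbling over a literal changes every other use of that literal in the program. A slightly more contained sketch is shown below. It builds a fresh string at runtime and uses fixed rather than a GCHandle to get at the characters. StringMutation and ReplaceSecondChar are hypothetical names, and the code assumes unsafe code is enabled.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical demo class, for illustration only
public static class StringMutation
{
    public static unsafe string ReplaceSecondChar()
    {
        // Created at runtime, so it is not shared with any interned literal
        string text = new string('A', 5);

        fixed (char* chars = text)
        {
            // Overwrite the second character in place
            Unsafe.Write(chars + 1, '@');
        }
        return text;
    }
}
```

ReplaceSecondChar() returns "A@AAA", even though no new string was allocated after the original one was created.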
Messing With the CLR Type System
But we can go even further than that and do a really nasty trick to completely defeat the CLR type-system. This code is horrible and could potentially break the CLR in several ways, so don’t ever use it in a real application!!
int intValue = 5;
float floatValue = 5.0f;
object boxedInt = (object)intValue, boxedFloat = (object)floatValue;
var pinnedFloat = GCHandle.Alloc(boxedFloat, GCHandleType.Pinned);
var pinnedInt = GCHandle.Alloc(boxedInt, GCHandleType.Pinned);
// AddrOfPinnedObject() points at the boxed data; the Method Table pointer sits just before it
int* floatAddress = (int*)pinnedFloat.AddrOfPinnedObject().ToPointer();
int* intAddress = (int*)pinnedInt.AddrOfPinnedObject().ToPointer();
Console.WriteLine("Type: {0}, Value: {1}", boxedInt.GetType().FullName, boxedInt);
// Note: an int-sized read/write only covers the whole pointer in a 32-bit process
int floatType = Unsafe.Read<int>(floatAddress - 1);
Unsafe.Write(intAddress - 1, floatType);
Console.WriteLine("Type: {0}, Value: {1}", boxedInt.GetType().FullName, boxedInt);
pinnedFloat.Free();
pinnedInt.Free();
Which prints out:
Type: System.Int32, Value: 5
Type: System.Single, Value: 7.006492E-45
Yep, we’ve managed to convince an int (Int32) type that it’s actually a float (Single) and behave like one instead!!
This works by overwriting the Method Table pointer for the int with the same value as the float one. So when it looks up its type or prints out its value, it uses the float methods instead! Thanks to @Porges for the example that motivated this; his code does the same thing using fixed instead.
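If you'd like to look at that pointer without overwriting anything, the read-only sketch below returns the pointer-sized word stored just before a boxed value's data. It uses IntPtr rather than int so that it covers the whole pointer in both 32-bit and 64-bit processes. It relies on the object layout used by the CLR and CoreCLR, which is an implementation detail and an assumption on my part for other runtimes. MethodTablePeek and ReadTypeWord are hypothetical names, and unsafe code must be enabled.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Hypothetical demo class, for illustration only
public static class MethodTablePeek
{
    // Returns the pointer-sized word that precedes the data of a boxed blittable value.
    // Assumption: on the CLR/CoreCLR this word is the Method Table pointer.
    public static unsafe IntPtr ReadTypeWord(object boxed)
    {
        GCHandle handle = GCHandle.Alloc(boxed, GCHandleType.Pinned);
        try
        {
            IntPtr* data = (IntPtr*)handle.AddrOfPinnedObject().ToPointer();
            return Unsafe.Read<IntPtr>(data - 1); // one pointer-width before the data
        }
        finally
        {
            handle.Free();
        }
    }
}
```

Two boxed ints should give the same value, while a boxed float gives a different one, which is exactly the difference the trick above exploits.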
Using It Safely
Despite the library requiring you to annotate your code with unsafe, there are still some safe, or maybe more accurately safer, ways to use it!
Fortunately, one of the main .NET runtime developers provided a nice list of what you can and can’t do.
But as with all unsafe code, you’re asking the runtime to let you do things that you are normally prevented from doing, things that it normally saves you from, so you have to be careful!
The post Subverting .NET Type Safety with 'System.Runtime.CompilerServices.Unsafe' first appeared on my blog Performance is a Feature!