Creating a pointer type in C#

30 Jul 2018 · CPOL · 5 min read
How to take the address of a managed object and create a pointer

Can you really do that?

Creating a pointer type in a managed language like C# may sound like heresy, but C# actually has most of the features necessary to create a pointer type.

  1. Index operator
  2. Postfix operator
  3. struct value type
  4. void*

You may be thinking, "how are you going to create a pointer if you can't get the address of a managed object?" But that's totally possible! There is an undocumented keyword, __makeref, that can help us do this.

__makeref returns a TypedReference structure, which acts very much like a pointer itself. It's used in Reflection, notably when retrieving the value of a field directly. A TypedReference contains two IntPtr fields, "Value" and "Type". We can make an educated guess that "Value" is the actual address of the object. Therefore, with the power of casting, we can use a TypedReference to get the address of a managed object:

C#
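public static unsafe IntPtr AddressOf<T>(ref T t)
{
   // __makeref is undocumented; this relies on "Value" being the
   // first field of TypedReference and holding the address of t
   TypedReference tr = __makeref(t);
   return *(IntPtr*) (&tr);
}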

Note: You'll want to use the ref qualifier here so we get the correct address of reference types.
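If relying on an undocumented keyword is a concern, the same address can be obtained with Unsafe.AsPointer from the System.Runtime.CompilerServices.Unsafe package. This is an alternative technique, not the one this article uses, and the names AddressOfViaUnsafe and RoundTrip are made up for the sketch. The same caveat applies: the address is only meaningful while the variable stays where it is.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class AddressOfSketch
{
    // Hypothetical helper: takes the address of a variable through the
    // documented Unsafe.AsPointer instead of __makeref.
    public static unsafe IntPtr AddressOfViaUnsafe<T>(ref T t)
    {
        return (IntPtr) Unsafe.AsPointer(ref t);
    }

    // Writes through the raw address and returns what the variable holds afterwards.
    public static unsafe int RoundTrip()
    {
        int x = 42;
        IntPtr p = AddressOfViaUnsafe(ref x);
        *(int*) p.ToPointer() = 43;   // write through the address of the local
        return x;
    }

    public static void Main()
    {
        Console.WriteLine(RoundTrip());   // prints 43
    }
}
```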

Creating a pointer type

Requirements: System.Runtime.CompilerServices.Unsafe for reading/writing. It can be found on NuGet.

Now that we can get the address of an object, we can create a pointer type. First, we'll add one field containing the address we're pointing to:

C#
public unsafe struct Pointer<T>
{
   private void* m_value;

To make our pointer as authentic™ as possible, we'll make it a struct, so the pointer itself is a plain value that is copied like a C/C++ pointer, while the data it points to is shared the way it is with a C# reference type.

We'll make m_value our only field so it is represented just like a pointer in memory. It'll also keep the size of the type the same size as a native pointer, 4 or 8 bytes.

Note: You can use either void* or IntPtr; it doesn't make a difference. For this tutorial we'll use void* because it's what CompilerServices.Unsafe uses.

We'll also create a constructor while we're at it:

C#
public Pointer(void* v)
{
   m_value = v;
}

Creating the indexer

If you're at all familiar with C/C++ pointers, you know you can index them just like arrays. All that's up to us is doing the arithmetic behind the scenes. So we'll make a method to calculate the correct address to read from. The math is very simple, just multiply the index by the size of the type:

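C#
private static void* Offset(void* p, int elemCnt)
{
   // Byte offset = element size * element count (elemCnt may be negative).
   // The multiplication is done in 64 bits so it cannot overflow an int.
   long byteOffset = (long) Unsafe.SizeOf<T>() * elemCnt;
   return (void*) ((long) p + byteOffset);
}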

But you may be thinking, "sizeof doesn't work with managed types!" Thankfully, CompilerServices.Unsafe provides us with a SizeOf<T> method. For a reference type, the result is just the size of a pointer, 4 (32-bit) or 8 (64-bit) bytes, so you could use IntPtr.Size in that case. Value types such as int or long have their own sizes, though, so it's safest to always go through SizeOf<T>.
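To make the arithmetic concrete, here is a small sketch of the sizes SizeOf<T> reports and the byte offsets that result from multiplying by an index. The ByteOffset helper is a made-up name for illustration; it mirrors the multiplication in Offset but returns only the offset.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class OffsetSketch
{
    // Hypothetical helper: byte offset of element `index` for element type T.
    public static long ByteOffset<T>(int index)
    {
        return (long) Unsafe.SizeOf<T>() * index;
    }

    public static void Main()
    {
        Console.WriteLine(Unsafe.SizeOf<int>());                     // 4
        Console.WriteLine(Unsafe.SizeOf<long>());                    // 8
        // A reference type is stored as a pointer-sized reference.
        Console.WriteLine(Unsafe.SizeOf<string>() == IntPtr.Size);   // True
        Console.WriteLine(ByteOffset<int>(3));                       // 12
        Console.WriteLine(ByteOffset<long>(-1));                     // -8
    }
}
```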

CompilerServices.Unsafe also supplies us with a method for reading/writing to addresses, so we will use that to implement the getter/setter for our indexer:

C#
public T this[int index] {
   get => Unsafe.Read<T>(Offset(m_value, index));
   set => Unsafe.Write(Offset(m_value, index), value);
}
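As a rough illustration of what the indexer does, the sketch below reads and writes elements of a stack-allocated buffer by address. ReadAt and WriteAt are hypothetical stand-ins for the getter and setter, written as static methods so the example is self-contained.

```csharp
using System;
using System.Runtime.CompilerServices;

public static unsafe class IndexerSketch
{
    // Hypothetical stand-in for the indexer's getter.
    public static T ReadAt<T>(void* p, int index)
    {
        return Unsafe.Read<T>((byte*) p + (long) Unsafe.SizeOf<T>() * index);
    }

    // Hypothetical stand-in for the indexer's setter.
    public static void WriteAt<T>(void* p, int index, T value)
    {
        Unsafe.Write((byte*) p + (long) Unsafe.SizeOf<T>() * index, value);
    }

    public static int Demo()
    {
        // Stack memory is never moved by the GC, so raw addresses are safe here.
        int* buffer = stackalloc int[3];
        buffer[0] = 10; buffer[1] = 20; buffer[2] = 30;

        WriteAt(buffer, 1, 25);                                   // buffer[1] = 25
        return ReadAt<int>(buffer, 1) + ReadAt<int>(buffer, 2);   // 25 + 30
    }

    public static void Main() => Console.WriteLine(Demo());   // prints 55
}
```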

Overloading the postfix operators

Now we have an indexer, but we're not done yet. In C/C++ you can also increment and decrement pointers with the postfix operators, just like integers. To do this, we'll use the Offset method we wrote earlier to create Increment and Decrement methods, and overload the ++ and -- operators. (In C#, a single overload of each operator covers both the prefix and postfix forms.)

C#
private void Increment(int cnt = 1)
{
   m_value = Offset(m_value, cnt);
}
C#
private void Decrement(int cnt = 1)
{
   m_value = Offset(m_value, -cnt);
}
C#
public static Pointer<T> operator ++(Pointer<T> p)
{
   p.Increment();
   return p;
}
C#
public static Pointer<T> operator --(Pointer<T> p)
{
   p.Decrement();
   return p;
}
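Because the operator receives a copy of the struct and returns the advanced copy, the compiler can derive both the prefix and postfix behavior from it. The sketch below shows the difference using a cut-down pointer type; MiniPointer<T> is a made-up name that keeps only what the demonstration needs.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical, cut-down pointer type for this demonstration only.
public unsafe struct MiniPointer<T>
{
    private void* m_value;

    public MiniPointer(void* v) { m_value = v; }

    public T Value => Unsafe.Read<T>(m_value);

    public static MiniPointer<T> operator ++(MiniPointer<T> p)
    {
        // p is a copy of the operand; advance it by one element and return it.
        p.m_value = (byte*) p.m_value + Unsafe.SizeOf<T>();
        return p;
    }
}

public static class IncrementSketch
{
    public static unsafe string Demo()
    {
        int* buffer = stackalloc int[3];
        buffer[0] = 10; buffer[1] = 20; buffer[2] = 30;

        var p = new MiniPointer<int>(buffer);
        var before = p++;   // postfix: `before` keeps the old address, p moves to element 1
        var after  = ++p;   // prefix: p moves to element 2, `after` holds the same address
        return before.Value + " " + after.Value + " " + p.Value;
    }

    public static void Main() => Console.WriteLine(Demo());   // prints "10 30 30"
}
```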

Finishing touches

Now you can index, increment, and decrement the pointer. But how do you dereference it? Easy, we'll just create a Value property:

C#
public T Value {
   get => Unsafe.Read<T>(m_value);
   set => Unsafe.Write(m_value, value);
}

Note: this is essentially the same operation as this[0], but it's just for convenience.

At this point we have most of the traits that make up a pointer. All that's left is adding a few things to polish things up. We'll create an implicit operator for synergy with our AddressOf method:

C#
public static implicit operator Pointer<T>(void* v)
{
   return new Pointer<T>(v);
}
C#
public static implicit operator Pointer<T>(IntPtr p)
{
   return new Pointer<T>(p.ToPointer());
}

We'll also override the ToString method for easily seeing what we're pointing to:

C#
public override string ToString()
{
   return Value.ToString();
}

Now we've written pretty much everything that defines a pointer. You can add a few more features yourself, such as an Address property or overloading the add and subtract operators.
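One possible shape for those extras is sketched below: an Address property and element-wise + and - operators. It is a self-contained, cut-down type (PointerSketch<T> is a made-up name), not the full source linked at the end of the article, and the choice of IntPtr for Address is an assumption.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical, cut-down pointer type showing only the suggested extras.
public unsafe struct PointerSketch<T>
{
    private void* m_value;

    public PointerSketch(void* v) { m_value = v; }

    // The raw address being pointed to.
    public IntPtr Address => (IntPtr) m_value;

    public T Value
    {
        get => Unsafe.Read<T>(m_value);
        set => Unsafe.Write(m_value, value);
    }

    // Pointer arithmetic in units of elements, as in C/C++.
    public static PointerSketch<T> operator +(PointerSketch<T> p, int elemCnt)
    {
        long byteOffset = (long) Unsafe.SizeOf<T>() * elemCnt;
        return new PointerSketch<T>((void*) ((long) p.m_value + byteOffset));
    }

    public static PointerSketch<T> operator -(PointerSketch<T> p, int elemCnt)
    {
        return p + (-elemCnt);
    }
}

public static class PointerArithmeticSketch
{
    public static unsafe int Demo()
    {
        int* buffer = stackalloc int[3];
        buffer[0] = 10; buffer[1] = 20; buffer[2] = 30;

        var first  = new PointerSketch<int>(buffer);
        var last   = first + 2;   // two elements forward
        var middle = last - 1;    // one element back

        // Two ints apart means 2 * 4 = 8 bytes apart.
        long bytesApart = last.Address.ToInt64() - first.Address.ToInt64();
        return last.Value + middle.Value + (int) bytesApart;   // 30 + 20 + 8
    }

    public static void Main() => Console.WriteLine(Demo());   // prints 58
}
```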

Of course, pointers aren't just used for pointing to other objects, but for dynamic memory allocation, handles, and arrays. (I'll write an article on all those some other time.) But for now we'll stick to pointing to other objects.

Testing and explanation

We'll make a string and tell our pointer to point to str's address. We can read and write to both objects no problem.

var             str     = "foo";
Pointer<string> pString = AddressOf(ref str);
Debug.Assert(pString.Value == "foo");
pString.Value = "bar";
Debug.Assert(str == "bar");
str = "deadbeef";
Debug.Assert(pString.Value == "deadbeef");
Debug.Assert(pString.Value == str);

There's an important thing to note here if you aren't super familiar with reference types and pointers. What we're really doing here is pointing to a pointer. We aren't pointing to the string's data in heap memory. Because string is a reference type, the variable str is actually a pointer internally, and our pointer holds the address of that variable. The runtime does the dereferencing and pointer mechanics behind the scenes for you.

Of course, our pointer will also work on unmanaged types just as well:

C#
long l = 0xDEADBEEF;
Pointer<long> pLong = AddressOf(ref l);
Debug.Assert(pLong.Value == 0xDEADBEEF);
pLong.Value = 0xDEAD;
Debug.Assert(l == 0xDEAD);

What about the GC?

If you know a thing or two about the garbage collector, you know that it compacts the heap, which changes the heap address a reference points to. However, the address of the reference itself (here, the local variable str) doesn't change, only the address of what it points to.

To demonstrate:

C#
var             str     = "foo";
Pointer<string> pString = AddressOf(ref str);
Debug.Assert(pString.Value == "foo");
pString.Value = "bar";
Debug.Assert(str == "bar");
// Address is the property suggested under "Finishing touches"
Console.WriteLine("pString points to: {0:X}", pString.Address.ToInt64());
Console.WriteLine("address of str: {0:X}", AddressOf(ref str).ToInt64());
Console.WriteLine("str points to: {0:X}", (*(IntPtr*) AddressOf(ref str)).ToInt64());

const int maxPasses  = 1000;
const int maxObjects = 9000;

// Allocate lots of short-lived objects so the GC runs and compacts the heap
int passes = 0;
while (passes++ < maxPasses) {
   object[] oArr = new object[maxObjects];
   for (int i = 0; i < oArr.Length; i++) {
      oArr[i] = new object();
   }

   Debug.Assert(pString.Value == str);
   Debug.Assert(pString.Address == AddressOf(ref str));
   Debug.Assert(str == "bar");
   Debug.Assert(pString.Value == "bar");
}

Debug.Assert(pString.Value == str);
Debug.Assert(pString.Address == AddressOf(ref str));
Debug.Assert(str == "bar");
Debug.Assert(pString.Value == "bar");
Console.WriteLine("pString points to: {0:X}", pString.Address.ToInt64());
Console.WriteLine("address of str: {0:X}", AddressOf(ref str).ToInt64());
Console.WriteLine("str points to: {0:X}", (*(IntPtr*) AddressOf(ref str)).ToInt64());

Here, we create a ton of objects to make the GC run and compact the heap. We make sure our pointer points to the correct object and has the correct value.

The output before the GC compacted the heap:

C#
pString points to: 6BD399EA98
address of str: 6BD399EA98
str points to: 20559E392B8

After the GC compacted the heap:

pString points to: 6BD399EA98
address of str: 6BD399EA98
str points to: 20559E2A1F0

As you can see, str pointed to a different address in the heap after the GC ran. This test would fail if our pointer pointed to the heap memory of str, but it didn't. For an explanation, continue reading.

What can't it do?

As mentioned earlier, you'll run into problems if you point to the heap memory of reference types, because the GC compacts the heap. This can actually be avoided if you pin the object, but we won't go into detail about that.

To demonstrate, we'll try pointing to an int inside of an int[] array:

int[] arr = {1, 2, 3, 4, 5};
IntPtr arrInHeap = *(IntPtr*) AddressOf(ref arr);
int offsetToArrData = IntPtr.Size * 2;
Pointer<int> pArr = arrInHeap + offsetToArrData;
Debug.Assert(pArr[0] == arr[0]);

To get to the heap memory of arr, we need to read arr's heap pointer. Then we need to offset that pointer by IntPtr.Size * 2 to skip over the MethodTable* and the Int32 length (plus an Int32 of padding on 64-bit). (I explained this in my first article.)

Now we're pointing to arr's actual data. We can read and write to it all we want, but we'll run into a problem:

C#
int[]          arr             = {1, 2, 3, 4, 5};
IntPtr         arrInHeap       = *(IntPtr*) AddressOf(ref arr);
int            offsetToArrData = IntPtr.Size * 2;
Pointer<int> pArr            = arrInHeap + offsetToArrData;
Debug.Assert(pArr[0] == arr[0]);
Console.WriteLine("pArr points to: {0:X}", pArr.Address.ToInt64() - IntPtr.Size * 2);
Console.WriteLine("arr's heap address: {0:X}", (*(IntPtr*) AddressOf(ref arr)).ToInt64());

const int maxPasses  = 1000;
const int maxObjects = 9000;

int passes = 0;
while (passes++ < maxPasses) {
   object[] oArr = new object[maxObjects];
   for (int i = 0; i < oArr.Length; i++) {
      oArr[i] = new object();
   }
}

Console.WriteLine("pArr points to: {0:X}", pArr.Address.ToInt64() - IntPtr.Size * 2);
Console.WriteLine("arr's heap address: {0:X}", (*(IntPtr*) AddressOf(ref arr)).ToInt64());

Before GC:

C#
pArr points to: 2D09BC69330
arr's heap address: 2D09BC69330

After GC:

C#
pArr points to: 2D09BC69330
arr's heap address: 2D09BC5A268

Basically, don't point to raw heap memory with this unless you pin the object.
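For reference, a minimal sketch of pinning with the fixed statement is shown below. It uses a plain int* rather than the Pointer<T> type from this article so that it stays self-contained; the class and method names are made up for the example.

```csharp
using System;

public static class PinningSketch
{
    public static unsafe int Demo()
    {
        int[] arr = {1, 2, 3, 4, 5};

        // While execution is inside the fixed block, the GC will not move arr,
        // so p keeps pointing at the array's first element.
        fixed (int* p = arr)
        {
            p[0] = 100;
            GC.Collect();   // a collection here cannot relocate the pinned array

            // The write through p is visible through the array, and vice versa.
            return arr[0] + p[4];   // 100 + 5
        }
    }

    public static void Main() => Console.WriteLine(Demo());   // prints 105
}
```

The pin only lasts for the duration of the fixed block. For a pin that has to outlive a single block, GCHandle.Alloc with GCHandleType.Pinned is the usual tool, with the handle freed explicitly when it is no longer needed.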

Conclusion

And that is how you create a pointer in C#! I hope you enjoyed reading.

Here is the full source

My GitHub

This article was originally posted at https://github.com/Decimation/RazorSharp

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



Comments and Discussions

 
Question: Safety and use cases?
Paulo Zemek, 31-Jul-18 10:45
First, this seems to be extremely unsafe.
You are using an undocumented keyword, which also means it might change in behavior in the future (and I am not sure if it is even implemented in all .NET implementations).

In any case, it looks extremely unsafe. When you talk about pinning or not the objects, you said that the address of the reference doesn't change, only the address of the content it references changes.
Well... that probably happens because the reference itself was a local variable, and so it was in the stack. Yet, if you were getting the address of any reference that is already in the heap, like part of an array or another object, bad things could happen. What's worse, there are a lot of "corner cases" to this, because many times local variables are not stored in the stack (in particular, when we use yield return or unnamed delegates, local variables are actually taken out of the stack and put into an invisible "state" object).

So, the main questions are: When would you really want to use this? What would be your suggestions about safety?
Can we "pin" an array of references? Does this give real performance advantages, or is it simply allowing you to have a pointer to anything, with no practical gain?
