Posted 7 Jul 2017

Passing Strings Between Managed and Unmanaged Code

This article explains how to pass strings between a C# assembly and an unmanaged C++ DLL.

Background

The reader should have a basic knowledge of C# and unmanaged C++.

Returning a BSTR

An easy way to return a string from an unmanaged C++ DLL is to use the BSTR type.

The following C++ export returns a BSTR containing a version string:

C++
extern BSTR __stdcall GetVersionBSTR()
{
    return SysAllocString(L"Version 3.1.2");
}

The .DEF file is as follows:

LIBRARY

EXPORTS
    GetVersionBSTR

The export is imported into the .NET application as follows:

C#
namespace DemoApp.Model
{
    static class ImportLibrary
    {
        const String DLL_LOCATION = "DemoLibrary.dll";

        [DllImport(DLL_LOCATION, CharSet = CharSet.Ansi, 
         CallingConvention = CallingConvention.StdCall)]
        [return: MarshalAs(UnmanagedType.BSTR)]
        public static extern string GetVersionBSTR();
    }
}

The managed code invokes the imported function as follows:

C#
string version = Model.ImportLibrary.GetVersionBSTR();

The managed code marshals the string as a BSTR and frees the memory when it is no longer required.

When the export is called from unmanaged code, the caller must free the BSTR with SysFreeString. Failing to do so creates a memory leak.

Returning a char *

Marshalling a char * return value is more difficult for the simple reason that the .NET application has no idea how the memory was allocated, and hence it does not know how to free it. The safe approach is to treat the returned char * as a pointer to a memory location. The .NET application will not try to free the memory. This of course has the potential for memory leaks if the unmanaged code allocated the string on the heap.

The following C++ export returns a version string defined as a string literal:

C++
extern const char * __stdcall GetVersionCharPtr()
{
    return "Version 3.1.2";
}

The corresponding .DEF file is as follows:

LIBRARY

EXPORTS
    GetVersionCharPtr

The export is imported into the .NET application as follows:

C#
namespace DemoApp.Model
{
    static class ImportLibrary
    {
        const String DLL_LOCATION = "DemoLibrary.dll";

        [DllImport(DLL_LOCATION, CharSet = CharSet.Ansi, 
         CallingConvention = CallingConvention.StdCall)]
        public static extern IntPtr GetVersionCharPtr();
    }
}

The managed code invokes the imported function as follows:

C#
IntPtr intPtr = Model.ImportLibrary.GetVersionCharPtr();
string version = System.Runtime.InteropServices.Marshal.PtrToStringAnsi(intPtr);

Passing a String as a BSTR Parameter

It is very easy to pass a string as a parameter using the BSTR type.

The following C++ export takes a BSTR parameter:

C++
extern void __stdcall SetVersionBSTR(BSTR version)
{
    // Do something here .. 
}

The unmanaged code should not free the BSTR.

The .DEF file is as follows:

LIBRARY

EXPORTS
    SetVersionBSTR

This function is imported into a C# application as follows:

C#
namespace DemoApp.Model
{
    static class ImportLibrary 
    { 
        const String DLL_LOCATION = "DemoLibrary.dll"; 
        [DllImport(DLL_LOCATION, CharSet = CharSet.Ansi, 
                   CallingConvention = CallingConvention.StdCall)] 
        public static extern void SetVersionBSTR
               ([MarshalAs(UnmanagedType.BSTR)] string version); 
    } 
}

The managed code invokes the imported function as follows:

C#
Model.ImportLibrary.SetVersionBSTR("Version 1.0.0");

Passing a String as a char * Parameter

The following C++ export takes a char * parameter:

C++
extern void __stdcall SetVersionCharPtr(char *version)
{
    // Do something here .. 
}

The .DEF file is as follows:

LIBRARY

EXPORTS
    SetVersionCharPtr

This function is imported into a C# application as follows:

C#
namespace DemoApp.Model
{
    static class ImportLibrary 
    { 
        const String DLL_LOCATION = "DemoLibrary.dll"; 
        [DllImport(DLL_LOCATION, CharSet = CharSet.Ansi, 
                   CallingConvention = CallingConvention.StdCall)] 
        public static extern void SetVersionCharPtr
               ([MarshalAs(UnmanagedType.LPStr)] string version); 
    } 
}

The managed code invokes the imported function as follows:

C#
Model.ImportLibrary.SetVersionCharPtr("Version 1.0.0");

Returning a String with a BSTR * Parameter

An unmanaged C++ DLL can return a string to the caller using a BSTR * parameter. The DLL allocates the BSTR, and the caller frees it.

The following C++ export returns a string using a parameter of BSTR * type:

C++
extern HRESULT __stdcall GetVersionBSTRPtr(BSTR *version)
{
    *version = SysAllocString(L"Version 1.0.0"); 
    return S_OK;
}

The .DEF file is as follows:

LIBRARY

EXPORTS
    GetVersionBSTRPtr

This function is imported into a C# application as follows:

C#
namespace DemoApp.Model
{
    static class ImportLibrary 
    { 
        const String DLL_LOCATION = "DemoLibrary.dll"; 
        [DllImport(DLL_LOCATION, CharSet = CharSet.Ansi, 
                   CallingConvention = CallingConvention.StdCall)] 
        public static extern int GetVersionBSTRPtr
               ([MarshalAs(UnmanagedType.BSTR)] out string version); 
    } 
}

Using this function in the C# code is straightforward:

C#
string version;
Model.ImportLibrary.GetVersionBSTRPtr(out version);

The managed code will automatically take care of the memory management.

Returning a String with a char ** Parameter

The following C++ export returns a version string using a char ** parameter:

C++
extern HRESULT __stdcall GetVersionCharPtrPtr(const char **version)
{
    *version = "Version 1.0.0"; 
    return S_OK;
}

The .DEF file is as follows:

LIBRARY

EXPORTS
    GetVersionCharPtrPtr

The function is imported into the managed code as follows:

C#
namespace DemoApp.Model
{
    static class ImportLibrary
    {
        const String DLL_LOCATION = "DemoLibrary.dll"; 
        [DllImport(DLL_LOCATION, CharSet = CharSet.Ansi, 
                   CallingConvention = CallingConvention.StdCall)]
        public static extern int GetVersionCharPtrPtr(out IntPtr version);
    }
}

Using this function in the C# code is straightforward:

C#
IntPtr intPtr;
Model.ImportLibrary.GetVersionCharPtrPtr(out intPtr);
string version = System.Runtime.InteropServices.Marshal.PtrToStringAnsi(intPtr);

Clearly, there is a danger of memory leaks if the unmanaged DLL allocates the memory for the string on the heap.
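One way to reduce that danger is for the DLL to export a matching free function, so that the memory is always released by the same allocator that created it. The following is a minimal sketch of that idea, not code from the demo project: the names GetVersionHeapCopy and FreeVersionString are hypothetical, and the calling convention and .DEF entries are omitted to keep the fragment portable.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical export: returns a heap-allocated copy of the version string.
// The caller owns the result and must release it with FreeVersionString.
extern "C" char *GetVersionHeapCopy()
{
    static const char version[] = "Version 1.0.0";
    char *copy = static_cast<char *>(std::malloc(sizeof(version)));
    if (copy != nullptr)
    {
        // sizeof(version) includes the zero terminator
        std::memcpy(copy, version, sizeof(version));
    }
    return copy;
}

// Hypothetical export: frees a string using the allocator that created it.
extern "C" void FreeVersionString(char *text)
{
    std::free(text); // free(nullptr) is a harmless no-op
}
```

A managed caller would import both functions, read the string with Marshal.PtrToStringAnsi, and then pass the same IntPtr to the free function.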

Passing a String with a Buffer

A safe way to return a string from an unmanaged C++ DLL is to use a buffer allocated by the caller. For example:

C++
extern void __stdcall GetVersionBuffer(char *buffer, unsigned long *pSize)
{
    if (pSize == nullptr)
    {
        return;
    }

    static const char *version = "Version 5.1.1";
    // The required size, including the zero terminator
    unsigned long size = static_cast<unsigned long>(strlen(version)) + 1;
    if ((buffer != nullptr) && (*pSize >= size))
    {
        // *pSize is the capacity of the caller's buffer
        strcpy_s(buffer, *pSize, version);
    }
    // Report the required size back to the caller
    *pSize = size;
}

The caller should call the function twice, once with a null buffer address to determine the required buffer size, and then with an appropriately sized buffer.

The .DEF file is as follows:

LIBRARY

EXPORTS
    GetVersionBuffer

The function is imported into the managed code as follows:

C#
[DllImport(DLL_LOCATION, CharSet = CharSet.Ansi)]
public static extern void GetVersionBuffer
([MarshalAs(UnmanagedType.LPStr)] StringBuilder version, ref UInt32 size);

Using this function in the C# code is straightforward:

C#
UInt32 size = 0;
Model.ImportLibrary.GetVersionBuffer(null, ref size);

var sb = new StringBuilder((int)size);
Model.ImportLibrary.GetVersionBuffer(sb, ref size);
string version = sb.ToString();

The above code determines the required buffer size, and then retrieves the version string.
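An unmanaged client can follow the same two-call pattern. The sketch below is illustrative only: it uses a portable stand-in for GetVersionBuffer (no __stdcall, and memcpy instead of strcpy_s) so that the fragment is self-contained, and ReadVersion is a hypothetical helper name.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Portable stand-in for the DLL export described above.
void GetVersionBuffer(char *buffer, unsigned long *pSize)
{
    if (pSize == nullptr)
    {
        return;
    }
    static const char version[] = "Version 5.1.1";
    const unsigned long size = sizeof(version); // includes the terminator
    if ((buffer != nullptr) && (*pSize >= size))
    {
        std::memcpy(buffer, version, size);
    }
    *pSize = size;
}

// Hypothetical helper: queries the size, then fetches the string.
std::string ReadVersion()
{
    unsigned long size = 0;
    GetVersionBuffer(nullptr, &size);        // first call: required size
    if (size == 0)
    {
        return std::string();
    }
    std::vector<char> buffer(size);          // zero-initialised storage
    GetVersionBuffer(buffer.data(), &size);  // second call: fill the buffer
    return std::string(buffer.data());
}
```

Because the caller owns the buffer, there is nothing for either side to free across the DLL boundary.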

Passing an Array of Strings

It is surprisingly easy to pass an array of strings using the .NET array and C++ SAFEARRAY types.

The following C++ export takes a SAFEARRAY parameter containing an array of BSTR values:

C++
extern void __stdcall SetStringArray(SAFEARRAY& safeArray)
{
    if (safeArray.cDims == 1)
    {
        if ((safeArray.fFeatures & FADF_BSTR) == FADF_BSTR)
        {
            BSTR* bstrArray = nullptr;
            HRESULT hr = SafeArrayAccessData(&safeArray, (void**)&bstrArray);
            if (SUCCEEDED(hr))
            {
                long iMin = 0;
                SafeArrayGetLBound(&safeArray, 1, &iMin);
                long iMax = 0;
                SafeArrayGetUBound(&safeArray, 1, &iMax);

                for (long i = iMin; i <= iMax; ++i)
                {
                    // Element i is stored at bstrArray[i - iMin].
                    // Do something here with the data!
                }

                // Release the lock taken by SafeArrayAccessData
                SafeArrayUnaccessData(&safeArray);
            }
        }
    }
}

The .DEF file is as follows:

LIBRARY

EXPORTS
    SetStringArray

The function is imported into the managed code as follows:

C#
[DllImport(DLL_LOCATION, CharSet = CharSet.Ansi,
           CallingConvention = CallingConvention.StdCall)]
public static extern void SetStringArray
       ([MarshalAs(UnmanagedType.SafeArray)] string[] array);

Using this function in the C# code is simple:

C#
string[] array = new string[4] {"one", "two", "three", "four"};
Model.ImportLibrary.SetStringArray(array);

Although the C++ code is a little bit messy, the managed code could not be simpler.

Returning an Array of Strings

The following C++ export fills a SAFEARRAY parameter with an array of BSTR values:

C++
extern void __stdcall GetStringArray(SAFEARRAY *&pSafeArray)
{
    if (s_strings.size() > 0)
    {
        SAFEARRAYBOUND  Bound;
        Bound.lLbound = 0;
        Bound.cElements = s_strings.size();

        pSafeArray = SafeArrayCreate(VT_BSTR, 1, &Bound);

        BSTR *pData;
        HRESULT hr = SafeArrayAccessData(pSafeArray, (void **)&pData);
        if (SUCCEEDED(hr))
        {
            for (DWORD i = 0; i < s_strings.size(); i++)
            {
                *pData++ = SysAllocString(s_strings[i].c_str());
            }
            SafeArrayUnaccessData(pSafeArray);
        }
    }
    else
    {
        pSafeArray = nullptr;
    }
}

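The s_strings variable is assumed to be a std::vector<std::wstring> instance containing multiple entries. A vector is needed because the code indexes it with operator[], and the elements must be wide strings because SysAllocString takes a wide-character (OLECHAR) string.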
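SysAllocString takes a wide-character string, so if the source data is held as narrow std::string values, it has to be widened before a BSTR can be created from it. The helper below is a minimal sketch under a strong assumption: the input is 7-bit ASCII only. WidenAscii is a hypothetical name, and text in any other encoding would need a proper conversion instead.

```cpp
#include <string>

// Hypothetical helper: widens a 7-bit ASCII string one character at a time.
// Not suitable for UTF-8 or code-page text containing non-ASCII characters.
std::wstring WidenAscii(const std::string &narrow)
{
    std::wstring wide;
    wide.reserve(narrow.size());
    for (unsigned char ch : narrow)
    {
        wide.push_back(static_cast<wchar_t>(ch));
    }
    return wide;
}
```

The c_str() of the resulting std::wstring could then be passed to SysAllocString.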

The .DEF file is as follows:

LIBRARY

EXPORTS
    GetStringArray

The function is imported into the managed code as follows:

C#
[DllImport(DLL_LOCATION, CharSet = CharSet.Ansi,
           CallingConvention = CallingConvention.StdCall)]
public static extern void GetStringArray
       ([MarshalAs(UnmanagedType.SafeArray)] out string[] array);

This is almost the same as for the SetStringArray method, except that the argument is declared as an 'out' parameter.

The function may be called from the C# code as follows:

C#
string[] array;
Model.ImportLibrary.GetStringArray(out array);

As before, the C++ code is a bit messy, but the managed code could not be simpler.

Dealing with ASCII and Unicode Strings

Quite often, a DLL will define ASCII and Unicode versions of a function. Indeed Microsoft often do this. The unmanaged MessageBox function is actually defined in a Windows header file as follows:

C++
#ifdef UNICODE
#define MessageBox  MessageBoxW
#else
#define MessageBox  MessageBoxA
#endif // !UNICODE

Fortunately, the .NET Framework has built in support for this.

We could have defined our first export as follows:

C++
extern void __stdcall SetVersionA(char *version)
{
    // Store the version
}

extern void __stdcall SetVersionW(wchar_t *version)
{
 // Store the version
}

The .DEF is defined as follows:

LIBRARY

EXPORTS
    SetVersionA
    SetVersionW

The function is imported into C# managed code as follows:

C#
namespace DemoApp.Model
{
    static class ImportLibrary
    {
        const String DLL_LOCATION = "DemoLibrary.dll";

        [DllImport(DLL_LOCATION, CharSet = CharSet.Unicode, 
                   CallingConvention = CallingConvention.StdCall)]
        public static extern void SetVersion(string version);
    }
}

Note that the function name is declared as SetVersion, rather than SetVersionA or SetVersionW, and the CharSet field is set to Unicode.

Using this function in the C# code is straightforward:

C#
string version = "Version 3.4.5";
Model.ImportLibrary.SetVersion(version);

If you debug through the above code, you will see that the SetVersionW export is invoked. This is because the CharSet was set to Unicode. If you change the CharSet to Ansi, and debug through, lo and behold, the SetVersionA export is invoked!

We can easily disable this feature using the ExactSpelling field as follows:

C#
namespace DemoApp.Model
{
    static class ImportLibrary
    {
        const String DLL_LOCATION = "DemoLibrary.dll";

        [DllImport(DLL_LOCATION, ExactSpelling = true, 
         CharSet = CharSet.Ansi, CallingConvention = CallingConvention.StdCall)]
        public static extern void SetVersion(string version);
    }
}

Now the .NET application will try to invoke a function called exactly SetVersion. Since no such export exists, the call fails with an EntryPointNotFoundException.
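The name lookup described in this section can be modelled in a few lines. This is a simplified sketch of the probing order as I understand it from the ExactSpelling documentation, not the runtime's actual code: with CharSet.Unicode the W-suffixed name is tried first and then the plain name, and with CharSet.Ansi the plain name is tried first and then the A-suffixed name. ResolveEntryPoint and CharSetModel are hypothetical names.

```cpp
#include <set>
#include <string>

// Simplified model of the two character sets relevant to name probing.
enum class CharSetModel { Ansi, Unicode };

// Returns the export name that would be chosen, or an empty string if
// no candidate exists in the DLL's export table.
std::string ResolveEntryPoint(const std::set<std::string> &exports,
                              const std::string &name,
                              CharSetModel charSet,
                              bool exactSpelling)
{
    if (exactSpelling)
    {
        // Only the name exactly as declared is considered.
        return exports.count(name) != 0 ? name : std::string();
    }

    const bool unicode = (charSet == CharSetModel::Unicode);
    const std::string suffixed = name + (unicode ? "W" : "A");

    // Unicode probes the suffixed name first; Ansi probes the plain name first.
    const std::string &first = unicode ? suffixed : name;
    const std::string &second = unicode ? name : suffixed;

    if (exports.count(first) != 0)
    {
        return first;
    }
    if (exports.count(second) != 0)
    {
        return second;
    }
    return std::string();
}
```

For a DLL that exports only SetVersionA and SetVersionW, this model picks SetVersionW for Unicode, SetVersionA for Ansi, and nothing at all when exact spelling is requested.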

Conclusions

Passing a string into an unmanaged C++ DLL is very easy. Returning a string is not so easy, and pitfalls include memory leaks and heap corruption. A simple way is for the caller to allocate a buffer of the required size. This method is suitable for both managed and unmanaged clients. A slightly easier alternative is to use the BSTR * type, with the risk that an unmanaged client could introduce a memory leak by not freeing the BSTR.

Passing an array of strings between a managed application and an unmanaged DLL is also fairly easy, although the code in the unmanaged DLL is a little messy.

I have by no means exhausted the ways to exchange strings between managed and unmanaged code. Other methods are left as an exercise for the reader.

Example Code

I have created a simple WPF and C++ DLL application which demonstrates the ideas discussed in this article. Don't worry if you do not understand WPF. You shouldn't have any trouble understanding the relevant code fragments, and with a bit of luck, you will be encouraged to go on and learn WPF, which I highly recommend.

History

  • 7th July, 2017: First version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Leif Simon Goodwin
United Kingdom
C#/WPF/C++ Windows developer

Comments and Discussions

My vote of 5
Alexey KK, 10-Jul-17 11:02
Great work.
Thank you very much for sharing.
May I ask you one question: how would you pass unsigned long long to C#? There seems to be no direct equivalent.

Don't you think the approaches could be simplified? I mean, there are too many difficult types to understand.
From types equivalency table we have:
MS C++ signed char == .Net System.SByte
MS C++ unsigned char == .Net System.Byte
MS C++ wchar_t == .Net System.Char

So for simple string we have to resolve only one problem: allocate memory and prevent it freeing on native method exit if we want to get C# string from native method. And nothing special if native method is getting C# string as parameter. So what is left is what it will be one byte per character or not.
Certainly if we choose bytes array it is not string anymore but we keep logic - we are coming from initial c# string or we are heading there.
It is really easy to use Encoding class methods to get bytes or string depending on what you need:
byte[] bytesWord = Encoding.UTF8.GetBytes(string word);
or string = Encoding.UTF8.GetString(byte[] bytes);
My choice is mostly one byte string operations but 2 bytes string may be needed too.

So version one: Get Native hash code by passing string:
extern "C"
{
	__declspec (dllexport)int GetWordHash_Native_Dll(const char* word, const int size, const int vocab_hash_size)
	{
		unsigned long long a, hash = 0;
		for ( a = 0; a < size; a++)
		{
			auto ch = word[a];
			hash = hash * 257 + word[a];
		}
		hash = hash % (unsigned long long)vocab_hash_size;
		return hash;
	}
}


Consuming in C#:
[SuppressUnmanagedCodeSecurity]
        [DllImport("CompareNativeModel.CPP.dll", CallingConvention = CallingConvention.Cdecl)]
        unsafe extern static Int32 GetWordHash_Native_Dll(byte* word, Int32 size, Int32 vocab_hash_size);

        private unsafe static int GetWordHashNative(string word, int vocabHashSize)
        {
            byte[] bytesWord = Encoding.UTF8.GetBytes(word);
            fixed (byte* pWord = bytesWord)
            {
                return GetWordHash_Native_Dll(pWord, bytesWord.Length, vocabHashSize);
            }
        }



Version 2:
Sending a two-byte (wide-character) string. I used this when I needed to port some native algorithms to a managed version and was sending messages from the DLL to an Action<string> C# delegate. Some additional tricks are needed to keep the pointer to this delegate fixed, but that is another question.
In the C++ code we have to use the CoTaskMemAlloc method and that's all: the memory is allocated, and the CLR garbage collector will do its standard job when the managed method finishes, or earlier if the local object can be disposed.
In the example below the message is sent using a callback function, but it could equally be returned by reference as a method parameter or as a function result.

#include <ole2.h> 
// in C++ AMP CoTaskMemAlloc is already in amp.h
static wchar_t* GetAllocatedMessage(string message)
{
	const size_t _size = message.length() + 1;
	//
	wchar_t* returnedString = (wchar_t*)CoTaskMemAlloc(2 * (_size + 1));
	if (returnedString == NULL) return NULL;
	//
	std::swprintf(returnedString, _size, L"%Ts", message.c_str());		
	return returnedString;
}

// in h.file
typedef void(/*__stdcall*/__cdecl * WriteMessageCallback)(wchar_t*);

// in class
WriteMessageCallback m_Messagecallback;
void SetWriteMessageCallback(WriteMessageCallback messagecallback)
	{
		if (messagecallback)
			m_Messagecallback = messagecallback;
	}

	void PushMessage(wchar_t* message)
	{
		try
		{
			if (m_Messagecallback && message)
				m_Messagecallback(message);
		}
		catch (exception ex)
		{

		}
	}

	void WriteMessage(std::string message)
	{
		PushMessage(GetAllocatedMessage(message));
	}

}

Alexey


modified 10-Jul-17 18:58pm.
