Hi,

I did a Google search and couldn't find anything quite like this, so I thought I'd post here.

I've got a mature, high-performance native library with lots of optimizations, including zone allocators. It's used by a variety of legacy software, and I'm now building a managed wrapper to expose that functionality to (e.g.) C#. Moving all the code to managed is not an option for a variety of reasons, including performance. I've had to build a circular reference between the native and managed objects:
* the managed class needs a pointer to the native class for business logic; the wrapper does nothing other than call to the native code
* the native code stores native objects in ordered sets, so I retain a reference from the native class back to the managed class for iterators, etc. - I don't want to duplicate list and set management

Example illustrative code:

#include <msclr/auto_gcroot.h>

class Native_eh;                                // forward declaration

ref class Managed_eh {
    Native_eh *n_this;                          // this is always valid
};

class Native_eh {                               // handled by a zone allocator
    Native_eh *prev, *next;                     // doubly-linked list
    msclr::auto_gcroot<Managed_eh ^> gc_this;   // lazy assignment
};


* If client code creates the managed object, then the native object is necessarily created.
* If the native object is created due to internal logic, then we don't create the managed wrapper until we need to - to avoid unnecessary allocations, etc., otherwise the benefits of the zone allocators, etc. are lost. Also, there are many times when the managed wrapper is never actually needed.
* gc_this must not be cleared (to maintain consistency) so long as there's client managed code with a handle to the Managed_eh object.

Clean-up is the concern. Neither the wrapper class nor the native class knows when it is no longer of interest to client code and can be cleaned up. Sometimes I can solve this problem by cleaning up the container objects managing the sets of Native_eh, but that is not a generic solution.
* In the strictly native world, this would be a trivial fix by changing the hierarchy.
* In the COM reference-counter world, this is equally easy with a different solution:
— if gc_this is not set, then clean up when the reference counter to Managed_eh reaches 0
— if gc_this is set, then when the reference counter on the managed pointer drops to 1, we know we can clean up because we know exactly what that '1' is (gc_this).
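
For readers unfamiliar with the COM-style rule just described, here is a minimal sketch in plain native C++ (no CLR involved). All names (`Wrapper`, `NativeNode`, `back_ref`, `add_ref`, `release`, `g_live_wrappers`) are hypothetical and only illustrate the rule; `back_ref` stands in for gc_this. The point is that the owner of the last remaining count is known, so "count dropped to 1" can trigger clean-up.

```cpp
#include <cassert>

struct Wrapper;

// Native object; back_ref plays the role of gc_this (set lazily).
struct NativeNode {
    Wrapper* back_ref = nullptr;
};

int g_live_wrappers = 0;  // lets the example observe clean-up

// Wrapper with an intrusive reference count, as in COM.
struct Wrapper {
    NativeNode* native;
    int refs = 0;
    explicit Wrapper(NativeNode* n) : native(n) { ++g_live_wrappers; }
    ~Wrapper() { --g_live_wrappers; }
};

void add_ref(Wrapper* w) { ++w->refs; }

// Publish the wrapper to the native side; this takes one counted reference.
void set_back_ref(Wrapper* w) {
    if (w->native->back_ref == nullptr) {
        w->native->back_ref = w;
        add_ref(w);
    }
}

void release(Wrapper* w) {
    --w->refs;
    // If exactly one reference remains and the back-reference is set,
    // that one reference must be the back-reference itself.
    const bool only_back_ref_left =
        (w->refs == 1 && w->native->back_ref == w);
    if (w->refs == 0 || only_back_ref_left) {
        w->native->back_ref = nullptr;
        delete w;
    }
}

// Usage: two client references plus the back-reference.
int demo_refcount() {
    NativeNode node;
    Wrapper* w = new Wrapper(&node);
    add_ref(w);        // client reference #1
    set_back_ref(w);   // native back-reference
    add_ref(w);        // client reference #2
    release(w);        // one client left: wrapper stays alive
    int alive_mid = g_live_wrappers;
    release(w);        // only the back-reference left: clean up
    return alive_mid * 10 + g_live_wrappers;
}
```

The CLR exposes no such per-object count, which is exactly the gap the question is about.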

What is the appropriate ".Net" approach to solving this problem?

Thanks,

—Rob

[EDIT: in response to Answers]

Interesting discussion, thanks, though I don't think it helps my particular problem.

I'm not particularly concerned about precisely when the clean-up of the garbage-collected objects occurs; I can easily work around that. I'm more concerned about whether it can occur at all.
* So long as the native object retains the handle to the managed wrapper, then the garbage collector will not perform clean-up. Tests show this, and it makes sense because the garbage collector cannot track references in the native side.
* But...the native object cannot release its handle until it knows that nobody else on the managed side has a handle - or risk invalidating consistency.

Said another way, if the native object has the only reference to the managed wrapper, then we (it) can perform clean-up. But how can the native object determine that nobody else is interested in its wrapper?

In the COM world, we could examine reference counters to conclude "I'm the only one left with a handle to this object so I can release it and we can clean up". But I don't see that available in the .Net world.

Is there a way to (efficiently) query the dependency graph used by the garbage collector?

Is there an event that tells you when the state of the dependency graph for a given object (the managed wrapper) changes?

—Rob
Updated 8-Apr-11 5:13am
Comments
Sergey Alexandrovich Kryukov 8-Apr-11 11:20am    
OP commented after receiving Answers from SA, Olivier and Espen:
Hi,

This thread provided me with a great resource to finding what I think is an appropriate solution. I'm going to play with the code this weekend, but...it's something like this:

The MSDN link defines a separation between the destructor and the finalizer - which was only found during discussion on Dispose(). This is a key to breaking the chicken/egg problem here, but not the complete solution.

The 2nd part that is necessary is improving the logic on clean-up on the native side. The generic solution is to maintain a counter in the native object for conditional releases. But for my problem, I can effectively reduce the counter to a boolean. When the native object is on the internal set, then both the native and managed code may be interested in it. During clean-up of the native object set, if gc_this is valid, then release it and don't delete the native object (the managed object still needs it). This breaks the chicken/egg issue because it can start the appropriate chain of events. Clean-up of the managed wrapper will then only occur when the native object becomes detached from its list AND no client code still maintains a handle to it. Things should work cleanly, deterministically. The managed wrapper will be forced to exist so long as someone on either the native, or the managed side is interested in it.
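
The scheme described in this comment can be sketched in plain C++, under loud assumptions: `std::shared_ptr` stands in for managed handles (it is reference counting, not garbage collection), `back_ref` stands in for gc_this, and `Native`, `Wrapper`, `wrapper_for`, `detach_from_list` and `g_live_natives` are hypothetical names. In real C++/CLI the wrapper's clean-up would presumably live in its finalizer rather than in a destructor run by a reference count.

```cpp
#include <cassert>
#include <memory>
#include <utility>

int g_live_natives = 0;  // lets the example observe native clean-up

struct Native;

// Stand-in for the managed wrapper.
struct Wrapper {
    Native* native;
    explicit Wrapper(Native* n) : native(n) {}
    ~Wrapper();  // defined below, once Native is complete
};

struct Native {
    bool on_list = true;                // the "boolean" from the comment
    std::shared_ptr<Wrapper> back_ref;  // stand-in for gc_this
    Native() { ++g_live_natives; }
    ~Native() { --g_live_natives; }
};

// Once the native object has left its list, the wrapper is the last
// interested party and takes the native object down with it.
Wrapper::~Wrapper() {
    if (!native->on_list) delete native;
}

// Lazily create the wrapper the first time managed code asks for it.
std::shared_ptr<Wrapper> wrapper_for(Native* n) {
    if (!n->back_ref) n->back_ref = std::make_shared<Wrapper>(n);
    return n->back_ref;
}

// Clean-up of the native set: release the back-reference if there is
// one and leave deletion to the wrapper; otherwise delete right away.
void detach_from_list(Native* n) {
    n->on_list = false;
    if (n->back_ref) {
        // Move the handle out first so n->back_ref is already empty if
        // dropping it destroys the wrapper (and therefore n).
        std::shared_ptr<Wrapper> last = std::move(n->back_ref);
    } else {
        delete n;
    }
}

int demo_detach() {
    Native* n = new Native();  // created by internal logic, on the list
    std::shared_ptr<Wrapper> client = wrapper_for(n);
    detach_from_list(n);       // native side is done with it
    int after_detach = g_live_natives;  // client handle keeps it alive
    client.reset();            // last client handle released
    return after_detach * 10 + g_live_natives;
}
```

Releasing the back-reference on detach is what breaks the cycle: after that, only client handles keep the wrapper (and through it the native object) alive.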

Thanks!

Let's start with the pure managed code.

This is simple: when you "lose" the last reference to some object, it becomes eligible for destruction by the Garbage Collector (GC). The actual destruction happens some time later; it depends on both the GC design and your code. It is not recommended to assume any particular moment in time when it happens, and it is not recommended to attempt to change GC behavior (which is possible, so this is only a rule of thumb). For this reason, it is also not recommended to write destructors. However, I think that in mixed mode the managed-code destructors may be very useful; please see below.

Now, this mechanism always works correctly even if two references are mutual or several references are cyclic. You can easily check this.

This does not mean a memory leak is not possible. It is quite possible, but not due to some random bug as in C++. Rather, it can be due to design. Also, the definition of a leak is not as simple as one might think. Basically, you have a memory leak when your Finite State Machine comes back to some state which is supposed to be the same as before, but the number of accessible objects is not the same (you cannot speak of the amount of memory, strictly speaking, because the GC is an independent actor, see above). Imagine you have something like a multi-buffer editor. If you load N documents, change their data and eventually close them all, you're supposed to get rid of all the objects you had in the middle. Imagine, however, that you also collected references to some objects in some Dictionary used, say, for quick search of something. Now imagine this: when you close the Tab object with the document, every instance of every object created inside gets hierarchically lost, but… you forget to find and remove some of the objects from the Dictionary. In this way, you get a memory leak. This is trivial enough.
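
The editor scenario above can be imitated in plain C++ as a rough analogy, assuming `std::shared_ptr` as a stand-in for managed references and two `std::unordered_map` containers as stand-ins for the tab collection and the Dictionary. `Document`, `Editor`, `close_leaky` and `g_live_documents` are hypothetical names made up for this sketch.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

int g_live_documents = 0;  // number of Document objects currently alive

struct Document {
    std::string name;
    explicit Document(std::string n) : name(std::move(n)) {
        ++g_live_documents;
    }
    ~Document() { --g_live_documents; }
};

struct Editor {
    std::unordered_map<std::string, std::shared_ptr<Document>> tabs;
    std::unordered_map<std::string, std::shared_ptr<Document>> search_index;

    void open(const std::string& name) {
        auto doc = std::make_shared<Document>(name);
        tabs[name] = doc;
        search_index[name] = doc;  // second strong reference
    }

    // Buggy close: the tab goes away, the index entry is forgotten.
    void close_leaky(const std::string& name) { tabs.erase(name); }

    // Correct close: every container that refers to the document is updated.
    void close(const std::string& name) {
        tabs.erase(name);
        search_index.erase(name);
    }
};

// All tabs are closed, yet one document is still reachable.
int demo_leak() {
    Editor editor;
    editor.open("a.txt");
    editor.open("b.txt");
    editor.close_leaky("a.txt");  // still referenced by search_index
    editor.close("b.txt");
    return g_live_documents;      // documents alive with zero open tabs
}
```

The editor is back in its "no documents open" state, but one object is still reachable through the index, which is the design-level leak the paragraph describes.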

To finish with managed-only C++/CLI code, let's note: this language is unique in that it allows value semantics even for reference types. This is a great feature. Again, there is nothing special to do about it: stack objects are removed as the code leaves the stack frame, as in C++. If such an object holds references, the GC will take care of them later.

As you perfectly understand, it's not so easy with mixed mode. Different approaches can be applied. I would classify them based on tight or loose coupling between managed and unmanaged types. Although loose coupling is in general a very good thing, it is not always so in the sense in which I'm using the term here.

Let's consider very loose coupling first: you have separate C++ (unmanaged) code and C++/CLI managed code. Collaboration between the two parts is very rare, so you can isolate all cases of collaboration and pay special attention to memory issues. In this case, you use your C++ techniques for the unmanaged part and C++/CLI for the .NET part. This is really easy but not very realistic for many applications. There are many cases when you need tight collaboration.

In the case of tight collaboration, I would suggest taking the collaboration to the extreme: wrap every semantic unit of C++ code (presumably, any relatively independent C++ class) in a C++/CLI wrapper. The deletion of C++ heap objects (in the unmanaged sense) should be well manageable in each such unit. Now, you should make sure that such unmanaged deletion is completed in the destructor of the wrapper class. You should also take care of exception handling, which is also not very difficult. This technique has one important limitation: as the moment when the managed memory is reclaimed is unknown (due to the nature of the GC), the unmanaged memory clean-up should have no side effects, just deletion. This would be a reasonable design principle for a purely unmanaged C++ project as well.

Also, there are techniques based on IDisposable. There is a lot of controversy and even holy wars about the different approaches. I think it depends. The first thing to understand is that IDisposable is nothing more than a standard way to arrange for a call to the method IDisposable::Dispose. Even though this method is mostly used to reclaim unmanaged resources, it can do anything, anything at all.

I would combine the two approaches without big problems. I would prefer using IDisposable objects for short-lived objects (typically within a single stack frame). In this way, such managed objects would be similar (in terms of lifetime) to instances of C++ classes used on the stack and destroyed at the moment of leaving the current stack frame. Such managed objects are well suited to controlling the lifetime of unmanaged C++ heap objects. In this case, wrapping the unmanaged destruction in IDisposable::Dispose should be preferred.

Another special case for wrapping unmanaged heap objects in IDisposable, with destruction in IDisposable::Dispose rather than in destructors, is this: unmanaged objects claiming a lot of heap, especially with the possibility of fragmentation. Leaving such an object to a destructor-based managed wrapper can be an uncomfortable thing: who knows when the GC "decides" to reclaim this wrapper?

So, such programming needs good strategic planning and following the strategy very accurately. At the same time, I think the .NET part of the mixed-mode process can only improve the robustness of the unmanaged code in terms of memory management.

[EDIT]
In response to the follow-up (I moved it to the text of original Question after [EDIT]).
There is no such event as the one you are looking for, and there never was. For things like that, people use the well-known technique of reference counting, possibly with the "smart pointers" typical of C++. Did you use this approach in your pure native code? One option is to keep using it, following the loosely coupled model.

Your further steps depend on your general design (Object Model or something) and on your decisions: how you want to design or redesign it, how much you're willing to redesign, and the trade-offs. (Please also see my comment to the answer by Olivier.) You should be careful not to mix the approaches, so that they do not struggle against each other :-).

I actually advised you to use a managed equivalent of smart pointers: wrapping unmanaged objects in managed ones. This is not so simple, because the relationships between your referenced and referencing objects lie inside the unmanaged world. You need to change this so that these relationships are driven by the managed references.

One more idea is to remove all unmanaged code except some performance-critical calculations and the interfacing with unmanaged libraries. Your whole Object Model can become purely managed. That could nearly eliminate your native memory allocation problem.

At this point, I don't think an abstract discussion (not concerning your concrete architecture) could bring much more. You can either explain your project and try to get advice applicable to your concrete situation, or design the new architecture all by yourself.

Good luck,
—SA
 
Comments
Olivier Levrey 8-Apr-11 4:07am    
OP added a solution instead of a comment. Please have a look...
Olivier Levrey 8-Apr-11 8:20am    
Although you could have made this answer shorter (it is discouraging to read all), you explained a lot of things. Have a 5.
Sergey Alexandrovich Kryukov 8-Apr-11 11:04am    
Thank you, Olivier. Yes, long answer can be discouraging, but the topic is pretty big.
--SA
Espen Harlinn 8-Apr-11 8:34am    
Nice effort, my 5
Sergey Alexandrovich Kryukov 8-Apr-11 11:04am    
Thank you, Espen.
--SA
If you are willing to allow the managed clients to dictate when they no longer need the wrapper, you can implement the Dispose-pattern. In C++/CLI Disposing is the same as calling delete on the reference handle, and at this point the wrapper can call a method in Native_eh (something like CleanUpRequested), and Native_eh::CleanUpRequested can reset the member handle to nullptr. This frees up the managed wrapper for collection during the next GC cycle (and at this point the native object does not care about it anymore, it's up to the CLR/GC).

[Response 1]

The Dispose pattern is for .NET, it's not language specific. C++/CLI internally uses the Dispose pattern (IL is nearly identical to what the C# compiler generates).

I am curious why you don't want the caller to decide when to dispose of the object, because that's what is always done in the native world. Your managed code merely wraps native code and should really be following the native world's deterministic disposal semantics.
 
Comments
Olivier Levrey 8-Apr-11 4:07am    
OP added a solution instead of a comment. Please have a look...
Nish Nishant 8-Apr-11 7:33am    
Thanks, updated my response.
Olivier Levrey 8-Apr-11 8:00am    
I completely agree on that. Have a 5.
Nish Nishant 8-Apr-11 8:00am    
Thank you, Olivier!
Espen Harlinn 8-Apr-11 8:43am    
Good advice - my 5.
I think there is a problem in your design. I don't understand the need to hold references to managed objects inside your native code: the so-called gc_this.

It is not a wrapper anymore. A managed wrapper is just some code you put "around" your native code without changing anything to that native code.

You should rather add native callbacks or something to give some feedback to your wrapper, and let the wrapper do the cleaning with what Nishant and SA explained: Dispose, deterministic cleaning, ...
 
Comments
Rob Bryce 8-Apr-11 10:22am    
The native code is, in fact, simply a wrapper. To walk along the ordered set of these objects, the client managed code would make a call to Managed_eh->Next (not shown in the original code block).

To implement this call would be Managed_eh->n_this->Next->gc_this. If the managed wrapper itself maintained next & prev pointers, then it would then no longer be a wrapper...
Sergey Alexandrovich Kryukov 8-Apr-11 11:18am    
Olivier, most probably you're right. My 5.
It's probably a sign that Rob tries to follow the unmanaged approach to his new managed design. He probably realized how the managed is managing memory but still his unmanaged approach is leading him.
--SA
Olivier Levrey 8-Apr-11 11:25am    
Probably. Thank you for the vote.
This answer assumes that the .Net layer does not define the lifetime of objects in the application data model.

When it doesn't, and you need to implement some sort of notification from native to CLR, you could consider a "handle"-based approach, allowing CLR objects to potentially have a different lifetime than the objects in the "real" application model. You will obviously need to validate the handles when crossing the CLR/native border. Notifications can then be sent from native to CLR using a common application object that lasts for the lifetime of the application, identifying the objects by "handle".
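
One possible reading of this handle-based approach, sketched in plain C++ on the native side only. `HandleTable`, `NativeItem`, `resolve`, `retire` and `on_retired` are hypothetical names, and `on_retired` merely stands in for whatever notification path into the CLR the application would really use. Handles are never reused, so a stale handle fails validation instead of silently pointing at a different object.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <unordered_map>

struct NativeItem {
    int value;
};

using Handle = std::uint64_t;

// Long-lived application object that owns the handle-to-object mapping.
class HandleTable {
public:
    // Callback invoked when an object is retired (notification hook).
    std::function<void(Handle)> on_retired;

    // Register a native object and hand out an opaque handle for it.
    Handle add(NativeItem* item) {
        Handle h = next_++;
        items_[h] = item;
        return h;
    }

    // Validate a handle at the border; nullptr means unknown or retired.
    NativeItem* resolve(Handle h) const {
        auto it = items_.find(h);
        return it == items_.end() ? nullptr : it->second;
    }

    // Called when the native object goes away; notifies listeners once.
    void retire(Handle h) {
        if (items_.erase(h) != 0 && on_retired) on_retired(h);
    }

private:
    Handle next_ = 1;  // 0 is reserved for "no handle"
    std::unordered_map<Handle, NativeItem*> items_;
};

int demo_handles() {
    HandleTable table;
    int notified = 0;
    table.on_retired = [&notified](Handle) { ++notified; };

    NativeItem item{42};
    Handle h = table.add(&item);
    int seen = table.resolve(h)->value;  // valid handle resolves
    table.retire(h);                     // native side drops the object
    bool stale = (table.resolve(h) == nullptr);
    return (seen == 42 && stale && notified == 1) ? 1 : 0;
}
```

With this shape the managed wrapper would store only the integer handle, so neither side holds a direct reference to the other.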

Regards
Espen Harlinn
 
Comments
Sergey Alexandrovich Kryukov 8-Apr-11 11:20am    
It looks like a very good idea, but I feel it needs some clarification (or just working out in detail). My 5.
--SA
Thanks...but...

Isn't Dispose for C#, etc.? The wrapper is in managed C++, so shouldn't I be using the info at http://msdn.microsoft.com/en-us/library/ms177197.aspx?

More importantly, there still seems to be a chicken-and-egg problem here: I don't want to require the client code to call Dispose(). Nishant, I understand your logic here, but if Native_eh->gc_this is valid, then the garbage collector cannot initiate your sequence of events, because the garbage collector will always (deterministically) determine that there's at least one reference out there to it (because there is: gc_this), even if managed clients release all of their handles. But Native_eh can't release its gc_this until all other managed clients have released their handles.

--Rob
 
Comments
Olivier Levrey 8-Apr-11 4:16am    
You are right, the link you found is the correct way to implement proper and deterministic destruction under C++/CLI.
Nish Nishant 8-Apr-11 7:32am    
The Dispose pattern is for .NET, it's not language specific. C++/CLI internally uses the Dispose pattern (IL is nearly identical to what the C# compiler generates).

I am curious why you don't want the caller to decide when to dispose of the object, because that's what is always done in the native world. Your managed code merely wraps native code and should really be following the native world's deterministic disposal semantics.
Olivier Levrey 8-Apr-11 7:59am    
I completely agree on that.
Rob Bryce 8-Apr-11 10:20am    
The answer is simple: to follow the object-oriented methodology, you should be hiding all implementation details from the user. The fact that it contains/wraps a native object is an implementation detail that should be hidden from the user when it can be (i.e., when it doesn't encapsulate a system resource, for example.)
Olivier Levrey 8-Apr-11 10:38am    
Hiding implementation details doesn't mean the user shouldn't release resources when needed. This is why Dispose method exists.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


