|
Honestly I still don't understand what you are trying to say.
Note, I use casts for two reasons:
1) I use the placement new. So, I allocate raw memory. Then I do the "math" using char. It could appear that I could use the target type directly, but I multiply a value that's the sizeof() the target type aligned as a pointer. So, I can't simply use a cast to T and add the item number. I must use char and multiply the itemSize by the item number. Then I do the final cast;
2) For deleted items, I use the memory as a pointer to the next deleted item pointer (so, a pointer to a pointer). I could use an union between the real type and a pointer, but I decided to use a cast.
So, what I don't understand is: How do you want me to use template parameters differently?
In fact, I want to avoid more template parameters. I don't want users to have to pass other parameters. They only need to pass the minimum (the destination type they want) and the class will do all the job. If they really think that using the default allocator is bad, they could provide their own allocator. And that's all.
The only thing that I would change is the use of variadic template arguments to avoid the need of the GetNextWithoutInitializing, but I am not using a compiler that supports variadic templates and I still think that such method is useful.
It is important to note that if you use the GetNextWithoutInitializing the default constructor of the type T will never be invoked for that item. If I used an ordinary new[], all items will be constructed using the default constructor (and so I should not use the placement new anymore).
This is not only a performance penalty as it is a problem if a type doesn't provide a default constructor. This pool is capable of working with types that don't have a default constructor thanks to its capacity to allocate raw memory and use the placement new.
|
|
|
|
|
What I was trying to state before is related to how to declare template methods so that they are not inline, which would make your ObjectPool class better. Unlike C#, most C++ methods that are more than a few lines are not usually declared inline methods in the class definition. When they are, the compiler can decide to make the function inline in the code. This can cause the code to grow large and also hurt performance, because inline code might not be cached. So, if instantiating the same class type multiple places in the code using the same ObjectPool instance, you probably don't want the method put inline in every place it is used.
My last post shows how to hint to the compiler that the code should not be inlined. By creating a method body that is unique, similar to putting a C++ method in the implementation file, the compiler will (typically, not always) create a single method in memory for all instantiations of the template for a specific type T. (In some cases that you will probably realize, there will only be a single method for two instantiations of a template class for different types as long as the two types are the same size and the method only uses the size of the object). Modern compilers are pretty good about things like that. By the way, there are linkage issues with templates and how many times the compiler creates a method too, but modern compilers do a fair job of minimizing duplication of code.
I realized afterward, the code can be modified to have one cast, not zero casts. But only one is needed. The code can also be simplified with regard to the size of the template type. These changes make the code simpler and have no performance penalty at all.
First, change class DefaultMemoryAllocator to be:
template<typename T>
class DefaultMemoryAllocator
{
public:
static inline T * Allocate(size_t capacity)
{
return (T *)::operator new(capacity * sizeof(T), ::std::nothrow);
}
static inline void Deallocate(T *pointer)
{
::operator delete(pointer);
}
};
That needs only one cast.
Then change "_itemSize * _capacity" in the call to the allocator to just "_capacity".
I took the "size" parameter out of the Deallocate method. If you want that to extend capabilities in the future, you can add it back, but it is not used now - and I don't know how it could be used in the current code. Of course, if you remove that, then you have to remove the size at the call site.
To use that class above, declare the allocator in the ObjectPool class as shown below:
template <typename T, class TMemoryAllocator=DefaultMemoryAllocator<T> >
class ObjectPool
struct _Node
{
T *_memory;
size_t _capacity;
_Node *_nextNode;
<rest of code ommitted>
Note the "T *" declaration for the data member named "_memory".
Also note the declaration ends with TMemoryAllocator=DefaultMemoryAllocator<t> >. There has to be a space after the first ">" and the second ">" so the compiler won't confuse that with operator ">>".
Then change all other "char *" and all "void *" declarations in the code to "T *", and remove all casts except the one cast in the DefaultMemoryAllocator class.
Leave the rest of your code the same.
Also, see http://www.parashift.com/c++-faq/memory-pools.html[^] That does not deal with the O(1) performance, but ideas there are still relevant. You're code uses some ideas from that site.
modified 15-Apr-14 22:59pm.
|
|
|
|
|
The sizeof(T) in the DefaultMemoryAllocator will not work with char, which is used in the example.
I don't want to put the minimum size requirement in the allocator, as it should be independent of the ObjectPool, yet the pool really requires any type to be aligned as a pointer.
So, in a 64-bit system, a char, a short, a long, or any struct that has less than 8 bytes will actually multiply the value by 8. That's why I use that _itemSize. If the sizeof(T) was enough, I would have used sizeof(T) * something.
I do not do casts to (void *). I use void * to calculate the size of a pointer. And the char * is needed because I want to sum a value that may not the the size of T. Also note that in some situations I do a cast to T **. This is because a deleted item becomes "linked list" of deleted items (so, it is a pointer to the next free item).
Also, remember that how many casts this code does is not important. Changing the cast from one place to another but adding another template class is not really making things better. The important is if users will need to cast the results of the pool or not. And they don't. As I don't use variadic templates, they may require to use the emplacement new, but that's all.
About the size in the delete, I did it to support possible memory managers that don't track the size of the allocated block (amiga OS did had this case... I know there are other cases out there). Yet, in a typed memory allocator, that's not an issue.
About my code using ideas from that site... maybe we have compatible ideas, because I didn't use that site to write mine.
modified 15-Apr-14 22:58pm.
|
|
|
|
|
Paulo Zemek wrote: The sizeof(T) in the DefaultMemoryAllocator will not work with char, which is used in the example.
I don't want to put the minimum size requirement in the allocator, as it should be independent of the ObjectPool, yet the pool really requires any type to be aligned as a pointer.
The alignment always belongs in the allocator. Alignment is a property of the platform memory system, not of the type T. In any event, if you want it in the ObjectPool, it can be passed into the Allocate function, or another template parameter can be added to the DefaultMemoryAllocator for the alignment size.
For example:
template<typename T>
class DefaultMemoryAllocator
{
public:
static inline T * Allocate(size_t capacity, size_t alignment_in_bytes)
{
size_t size_aligned_type = sizeof(T);
size_t mult = size_aligned_type / alignment_in_bytes;
diff = size_aligned_type - (mult * alignment_in_bytes);
size_aligned_type += diff;
return (T *)::operator new(capacity * size_aligned_type, ::std::nothrow);
}
static inline void Deallocate(T *pointer)
{
::operator delete(pointer);
}
};
Of course, placement new, which calls malloc, will return an aligned pointer, so there is no need to worry about the first item.
But, sizeof(char) should be fine without doing any alignment. An array of char is fine, you can increment addresses by one and point to the next character. So now I do not understand what issue you are referring to. In any event, handling the size of the object in the ObjectPool class does not make sense to me - that is the exclusive domain of the allocator. Everything else in the C++ memory system is designed that way.
Perhaps there is something that I don't understand. If so, the DefaultMemoryAllocator I posted could be changed slightly to accomodate those changes without any limitation to the DefaultMemoryAllocator or the ObjectPool. Avoiding the casting has benefits I mention below, particularly avoiding the use of "void *" in the code.
Paulo Zemek wrote: I do not do casts to (void *). I use void * to calculate the size of a pointer. And the char * is needed because I want to sum a value that may not the the size of T. Also note that in some situations I do a cast to T **. This is because a deleted item becomes "linked list" of deleted items (so, it is a pointer to the next free item).
I did not write you "cast to void *". I wrote change the "declaration" of "void *" to "T *".
Using void * has issues when storing class instances in C++. If your type uses multiple inheritance, and you can't always control that if you use third-party code, then casting to "T *" from "void *" and then trying to cast again to one of the inherited types might not do what you expect. void * should be avoided in C++ whenever possible. (There are articles about this, I'll try to find one). In this case, there is no benefit to using "void *", so why use it?
Paulo Zemek wrote: Also, remember that how many casts this code does is not important. Changing the cast from one place to another but adding another template class is not really making things better. The important is if users will need to cast the results of the pool or not. And they don't. As I don't use variadic templates, they may require to use the emplacement new, but that's all.
As I mentioned above, C++, casting can result in bugs if casting from "void *" to a type that uses multiple inheritance.
Also, casts make code harder to read because the actual type stored is not clear throughout the code. Casts are sometimes necessary, but should be avoided.
modified 15-Apr-14 23:57pm.
|
|
|
|
|
Bill, I am not talking about real alignment. I am talking about the fact that if you use a pool of chars (like I did), when you "delete" an item from the pool, I will treat it as a node in a linked list (a pointer to the next deleted item).
So, independently on the size of the data and independently on its real alignment (chars can be 1-byte aligned), I need it to have at least a pointer size, which is a particularity of the Pool, because I do such unsafe casts.
"Of course, placement new, which calls malloc, will return an aligned pointer, so there is no need to worry about that."
No. The placement new doesn't call malloc. In fact, it doesn't call anything. The object is simply constructed on the pointer I am giving.
"I wrote change the "declaration" of "void *" to "T *". There is no reason to not do this, you always will only store a pointer to a potential T * instance in _memory."
Wrong. On the deleted items, instead of containing T the address contains a T*.
So, as I am already using a pointer to get there, it is either T* or T**. It is a needed extra level of indirection. And, as the size needed may not be the size of T, I calculate the value over char *, to correctly sum the pointer.
To explain with an example:
Imagine that I allocate memory for 4 chars from the pool. I will fill the word Test in each allocated char:
T___e___s___t___ (first deleted is NULL, or 0000)
They will not be 1 byte aligned (as char allows). In this case, they are 4 chars aligned (it is not 32-bit because it is text, but it could be).
When I delete an item (the s, for example), I will put the address of s in the first deleted member, but the s will point to the "next" deleted. As it was null (because s is the first deleted) the memory will be like this:
T___e___0000t___ (first deleted is 1008)
Now I delete the e. The address of e will be put in the first deleted, but the contents of e will point to the previous value that was there (the address of the deleted s... it becomes the "next" deleted in the sense that it is the next to be used when we do reallocations).
T___10080000t___ (first deleted is 1004)
* I am considering the buffer to be in address 1000, so the address of S is 1008.
Now I add a new value. It will be put in the place of the deleted e, then the value 1008, which is the address of deleted s, will become the first deleted again:
T___b___0000t___ (first deleted is 1008)
Finally, I add C:
T___b___c___t___ (first deleted is NULL)
So, the allocator of char can allocate one byte at a time. Yet I need a minimum of 4 bytes (in this demonstration) to fill the addresses of deleted items.
About the casts to void: I am dealing with raw memory. I return the value cast as the type the pool creates. There's no risk of seeing the class as the wrong sub-type, as the pool can't create sub-types.
Your last paragraph restates the problem of casts with multiple inheritance, but again, it doesn't exist.
This is a low level component. It allocates raw memory (something memory pools and components like vector are expected to do) then uses the placement new on that raw memory. After the object is returned, you can cast it from one type to another without ever using a void *, but the only thing the pool cares is about memory, placement new and "placement delete".
|
|
|
|
|
"Of course, placement new, which calls malloc, will return an aligned pointer, so there is no need to worry about that."
No. The placement new doesn't call malloc. In fact, it doesn't call anything. The object is simply constructed on the pointer I am giving.
I agree, I made a mistake here. I was referring to the operator new in the DefaultMemoryAllocator class, NOT to placement new. I have used placement new before, I just used the word "placement" because it was mentioned earlier and I am tired.
Okay, I get that you are storing pointers to deleted items in T instances. I will take a look at that again.
My point is, the alignment and the casts can be greatly reduced by using the template code differently. And, the template methods should definitely not be all inlined for the reasons stated in my earlier post.
It is possible to write a template the does the alignment for you, and store T instances in that template. I started to work out the details but it is late. Suffice to say, it would be something like.
pragma pack(sizeof(void *)) template <typename TS>
struct AlignedStruct
{
TS instance;
operator TS *()
{
return &instance;
}
};
pragma pack()
That class is a simplification to save space, and will make the size a bit larger than necessary - Using a technique in "Modern C++ Design" with the int2type template class, the sizeof(AlignedStruct<t>) can be made the same as the aligned value for class T that is calculated in your code now. The more complex solution requires making a few templates for the size ranges below sizeof(void *) and using template specialization and the int2type template class to automatically select the right template. Search for int2type - if nothing else, you will find the concept interesting.
The object pool would be declared as:
template<typename t,="" class="" tmemoryallocator="DefaultMemoryAllocator<AlignedStruct<T"> > >
Now, DefaultMemoryAllocator would allocate a size equal to.
sizeof(AlignedStruct<t> * capacity)
If the memory is declared as
AlignedStruct<t> * _memory;
Then
T * var = _memory;
will do the right thing because of the cast operator declared in AlignedStruct.
The same is true of any method that returns type T *.
Now I understand you will still have to cast for the deleted items, but most of the other code is automatic. By using AlignedStruct<t> internally, you can copy or return T * anytime, and then cast to whatever.
Also, using AlignedStruct<t> internally, you can now do:
address += count;
wheres before it was necessary to do:
address += count * _itemSize;
You can get rid of _itemSize;
Sure, I might be missing something, and there are other details, but the code can be simplified. There are only rare cases in C++ where the compiler cannot handle the size of objects automatically, and, while I was wrong before, casting cannot be totally eliminated because of your deleted items, it can be greatly reduced by making it automatic, along with making the alignment to a pointer size be automatic.
Oh, and I gave you a 5. I am nitpicking. I didn't test the code, but I presume it works and the design is clever. I would be interested in seeing timing data between your class and the corresponding boost class for a moderate size pool used over time.
modified 16-Apr-14 1:59am.
|
|
|
|
|
Okay about the placement new error.
But about the template... even if I can make the allocator deal with the alignment, I don't want to. It means that the allocator will need to know the alignment needed by the dictionary.
I can, of course, make it work as a template. But in that case I will give both the number of items requested and the minimum alignment. Even if today only the pool uses it, I don't want the allocator to be dependent on the pool. So, if it is a template it will receive the right type it needs to deal with and will be able to use the right alignment for it. Yet, I will give an extra requirement that's the alignment I want. But if I do that, I will need to receive a result of the final size used for the "object allocations". I really don't think this will make things easier.
So, this will actually simplify the places that use the cast, but it will complicate other places (the template will be harder, there will be more parameters, when receiving the result I will need to store the size used by the allocator)... so, I will solve a small cast and I will add other complexities.
About the boost comparison, I didn't made one, but I don't know if boost actually has an O(1) pool for individual deletes. As far as I know, boost has the advantage that it deletes all inner items if the pool itself is deleted, but individual deletes are pretty slow. So, they solve different problems (I actually know very rare situations where allocating many objects without deleting them will be acceptable... so in the end I can delete them all... but allocating and deallocating many small objects in a loop, for example, is something common).
About the 5... thank you
|
|
|
|
|
Paul, here's a better solution. Also, if I understand you, when you said "alignment" wasn't the right word, I think what you really meant is that you need the size to be either sizeof(T) or sizeof(void *), whichever is larger. That way, it can hold a T instance and a void * pointer. Of course, sizeof(T) will automatically be an aligned size unless compiler settings or pragmas are used.
Paulo Zemek wrote: But in that case I will give both the number of items requested and the minimum alignment. Even if today only the pool uses it, I don't want the allocator to be dependent on the pool.
The allocator I show is not dependent on the pool. The pool can pass anything it wants to the allocator, including size/alignment/or other information. Memory-related information can be held in the pool, and then passed to the Allocate function. That is the way to do it. I cannot see any downside to that.
By the way, last night I realized a better solution. I'm not sure I have the syntax for a union correct, I haven't used a union in years, but a union solves the issue of choosing the larger of the pointer size or the sizeof(T) size by overlaying them in the same memory.
You can also incorporate size information in the class I declare below by adding calculations that modify the size of the array, but that should not be necessary. In any event, the template class below is independent of the issue of allocation.
template <typename T>
struct TProxy
{
union
{
unsigned char * instance[sizeof(T)];
T * vdeleted;
} u;
TProxy()
{
}
TProxy(T * vdeleted)
{
u.vdeleted = vdeleted;
}
operator=(T * vdeleted)
{
u.vdeleted = = vdeleted;
}
T * GetDeleted() const
{
return u.vdeleted;
}
operator T *()
{
return (T *)(&u.instance[0]);
}
};
If an array of type TProxy<t> is allocated, then no T constructors will run, it will be allocated in O(1) time, and the code can either assign a deleted pointer to the instance or return an instance as T *!
Yes, your code does alignment of sizeof(T). That actually can be added to the class above by adding an 'int' template parameter, but that is more complicated. It requires template meta-programming to dynamically declare an enum value that is used for the size of a padding array of 'unsigned char' that is added after the union. The template meta-programming would also select a different proxy class if no padding was needed. (That's needed because an array of length 0 cannot be declared).
modified 16-Apr-14 21:30pm.
|
|
|
|
|
Bill_Hallahan wrote: Finally, it's pretty standard in C++ to name class data members with the prefix "m_". It's not wrong to use just "_", but it is unusual in C++ code.
Actually, I believe it was Herb Sutter who suggested many years ago to use a trailing '_' rather than a leading 'm_'. Maybe some other C++ Guru.
Personally I prefer a trailing 'anything' (well, trailing '_' - but it really doesn't matter thanks to autocomplete), because it's easier to distinguish variables and take advantage of autocompletion that way!
|
|
|
|
|
I don't know if you are interested in seeing the new version. I didn't change the naming style or the use of NULL, yet I did the following changes:
* In the delete of a block I don't delete the next block anymore. This was making the deletes recursive. So, I use a loop to delete all blocks in the pool itself;
* I declared the copy constructor and equals operator private, as the pool is not meant to be copied;
* I declared the constructor explicit. I pool should not be given where an int is expected;
* I added a template parameter to allow users to use a different memory allocator. The default one uses new/delete (instead of malloc/free), yet malloc/free is still possible if the user wants... as well as using HeapAlloc/HeapFree directly in windows.
|
|
|
|
|
I see - yeah those are fine changes. With equals operator you mean the assignment operator I guess? Because equals == is different from assigning = .
Just one thing that comes out of my mind: I had a similar code that used a linked list (but of course my goal has been different, i.e. another purpose, another program flow). I thought I've been smart. However, managing the code to have no memory leaks has been hard. Nevertheless, I achieved my goal - e.g. valgrind told me that no memory leaks happened in any scenario. Nevertheless I really had a poor performance (some seconds) when adding and traversing thousands (10^5) of items. Now I simply changed everything to a std::vector and I got a few milliseconds.
Also when I changed the whole code to use shared_ptr I ended up with simplifying the whole algorithms. The code is much easier to read, more straight forward and does not perform worse. Instead it is now also possible to make the API better, as other parts of the program can still grab sub-items without me worrying that I am deleting objects that are still in use.
I am just writing this, because I actually believe that any programmer using C++(11) should actually never (there are exclusions of course, but without looking at those special cases) use raw pointers any where in the code. In my opinion using new and delete represents most of the time a code smell, same as malloc and free . Modern C++ style should use make_unique or make_shared . I know your reasons here (for instance you are not using C++11), but this is a general remark I cannot express often enough and independent of this article. With C++14 coming in shortly and C++17 being on the way for specification, I think people need to adopt to the new style. It does not only feel like a new language, in a way it really is.
|
|
|
|
|
About the operator, I have some trouble translating them sometimes, I really mean the assignment one.
About your other concerns, you can actually use a shared_ptr with custom deleters. So, it is possible to keep using the pool for performance while getting the safety from the smart pointers.
Only to comment that I agree with you, I already solved many "long living bugs" (caused by objects deleted too soon, by double deletes, by memory leaks etc) in different companies by using smart pointers (the first time I created my own types, now I use the STL ones when available).
In fact, I never discovered where were the leaks, I simply solved them by using smart pointers.
|
|
|
|
|
Leaks can be really nasty. I sometimes had the case such leaks, where it was nearly impossible to find the real source. I use valgrind a lot for detecting and finding them. But getting to know the real source of leak is still a pain (even though valgrind helps a lot there). Unfortunately valgrind does not work on Windows.
|
|
|
|
|
... here: https://drive.google.com/?pli=1&authuser=0#folders/0B_ZrZ_DR_ygdNDAzOTU2ZDItMmRhMi00ODMyLTg3ZWYtOWFjMmU0YWRlYmUx[^]
( I hope the link works )
I did that implementation some 3-4 years ago, starting from the same realization that boost::pool does not have O(1) complexity on deallocations - in fact deallocations weren't even implemented (the relevant code was documented out, even though the functions existed). Seems I didn't upload the text documentation, but I did add some class diagrams that might give you an idea of how I did it.
I also did a comparison on my computer comparing the performance of repeated allocations (up to 10 million) and deallocations, or , more realistically, randomly alternating between allocations and deallocations. I've found my implementation to be consistently faster, and standard new/delete to be at least 5 times slower in the random allo-/deallocation scenario (there's a spread sheet showing my results in the document folder linked above)
Like your implementation, mine is not thread-safe. Also, it always returns a specifically defined smartpointer to the object rather than a raw pointer, and that smartpointer is not interchangable with common smartpointers or raw pointers, as it holds the reference to the actual allocation object. I did need an indirect pointer to be able to manage both the allocation and deallocation in O(1), so I made the best of it and added reference counting as well
I'll have to take a closer look on your implementation to see if you found a better way to solve the O(1) problem, but on first glance it seems you've done sth similar.
|
|
|
|
|
|
|
|
In the constructor of ObjectPool you assign to _nodeMemory a newly allocated block, by using malloc . This is wrong, because you are leaking that memory; it must be:
_nodeMemory = _firstNode->memory;
Also, think about adding a variadic template version of New to allow users to directly call other constructors instead of being forced to get raw memory and then call placement constructor.
Ale
|
|
|
|
|
Good spot. Actually I don't know how I did publish this version. I had already corrected such a bug.
In any case, thanks for pointing it out and I will try to publish a corrected version today.
About the variadic template, I don't know if it is worth. After all the users can always wrap this class with another one. I will consider using variadic templates.
But I am actually thinking about using a template parameter to do the memory allocation and deallocation, so if users don't want the malloc(), they will be able to use an alternative.
modified 21-Mar-14 7:30am.
|
|
|
|
|
Actually, I can't use variadic templates. I am still using Visual Studio 2012. In 2013 it seems that variadic templates are available, but I want to make this code useable by those who can't still use the newest version.
Ah... and I updated the article and the download with the bugfix.
Thanks!
|
|
|
|
|
Using old software is bad but you can still use #ifdef s to exclude code your version doesn't support.
Ale
|
|
|
|
|
I know that I can use #ifdef s, but then the code that uses the library would probably need #ifdef s too.
So, if I want to level things considering less resources, it is better to keep things like they are. It is always possible to create another class that uses this class and adds the new method (VariadicObjectPool maybe?)
In any case, I will consider this when I stop using the old version of Visual Studio (I actually use both 2012 and 2013... but in some places I need to use 2010 too).
And I thank you for spotting the bug.
|
|
|
|
|
Actually I was planning to wrote mine 2 years ago but gave up at the design stage because without language, compiler and runtime support, I cannot keep track of the roots and it is difficult to write non-intrusive GC library. My initial design uses smart pointer/reference or inherits from a common base class. Curious on how you approach yours.
|
|
|
|
|
In fact the garbage collector needs some "help" from the users. Instead of simply using SomeClass * you must use Root<SomeClass> or Reference<SomeClass>.
But I will let you know when I publish the article.
|
|
|
|
|
Saw your dilemma to decide between 'ObjectPool' and 'MemoryPool'. How about 'ObjectMemoryPool'? Or 'ObjectAllocator'?
However, nice solution!
|
|
|
|
|