Click here to Skip to main content
15,885,278 members
Articles / Programming Languages / C++

Simple Variable-Size Memory Block Allocator

Rate me:
Please Sign up or sign in to vote.
4.82/5 (10 votes)
13 Apr 2017CPOL5 min read 20.3K   21   13   8
This article presents an implementation of a lightweight variable-size memory block allocator that is used in Diligent Engine 2.0

Disclaimer: This article is a repost of the original article from Diligent Engine web site.

Introduction

This article describes a lightweight class that handles free memory block management to accommodate variable-size allocation requests. This manager has a very specific goal, which is not to replace or beat standard new/delete. Rather, this class is used by the descriptor heap management system Diligent Engine to handle descriptor heap allocations (the system is thoroughly discussed in this post). Descriptor heap is a region of memory that is allocated by Direct3D12 system. Direct3D12 only provides virtual address of raw space, and it is the application's responsibility to manage this space. The class can be useful in similar scenarios where managing unstructured space is necessary. For instance, it can be used as base allocator for fixed-size block allocators.

The class keeps track of free blocks of memory, and facilitates allocation by finding the smallest block that provides enough space. Deallocated chunks of memory are merged with the existing free blocks. Runtime complexity is logarithmic with respect to the number of blocks for both allocation and deallocation.

Allocator Description

The allocation manager keeps track of free blocks only and does not record allocation sizes. The class uses two ordered maps to facilitate operations. The first map keeps blocks sorted by their offsets. The second multimap keeps blocks sorted by their sizes. The elements of the two maps reference each other, which enables efficient block insertion, removal and merging. Note that since we need ordered access, all operations with the maps (search/insert/delete) take logarithmic time on the number of blocks. The two maps are defined as shown below:

C++
typedef size_t OffsetType;
struct FreeBlockInfo;
// Type of the map that keeps memory blocks sorted by their offsets
using TFreeBlocksByOffsetMap = std::map<OffsetType, FreeBlockInfo>;

// Type of the map that keeps memory blocks sorted by their sizes
using TFreeBlocksBySizeMap = std::multimap<OffsetType, TFreeBlocksByOffsetMap::iterator>;

struct FreeBlockInfo
{
    // Block size (no reserved space for the size of allocation)
    OffsetType Size;

    // Iterator referencing this block in the multimap sorted by the block size
    TFreeBlocksBySizeMap::iterator OrderBySizeIt;

    FreeBlockInfo(OffsetType _Size) : Size(_Size){}
};

TFreeBlocksByOffsetMap m_FreeBlocksByOffset;
TFreeBlocksBySizeMap m_FreeBlocksBySize;

Note that the second map is a multimap because there may be several blocks of the same size. To the contrary, there could be only single block at the specific offset. An example for the case of four free blocks is given in the figure below:

Image 1

Note that the Size member of the FreeBlockInfo structure is redundant as it is equal to OrderBySizeIt->first. It however makes implementation more readable and clear.

Allocation

To allocate a new block, we use the second map to find the smallest block that is large enough to encompass the requested size. multimap::lower_bound standard function does exactly what we need:

C++
auto SmallestBlockItIt = m_FreeBlocksBySize.lower_bound(Size);
if(SmallestBlockItIt == m_FreeBlocksBySize.end())
    return InvalidOffset;
auto SmallestBlockIt = SmallestBlockItIt->second;

This operation takes logarithmic time on the size of the container. Offset to the beginning of the requested allocation will then be given by the beginning of this block:

C++
auto Offset = SmallestBlockIt->first;

The offset of the new free block will be given by adding requested size to the original offset, and the new size will be given by subtracting the requested memory size from the original block size:

C++
auto NewOffset = Offset + Size;
auto NewSize = SmallestBlockIt->second.Size - Size;

The final step is to update the maps. Since both block size and offset have been changed, we have to first remove original elements from the containers, and if the new block is not empty, insert new elements into the maps:

C++
m_FreeBlocksBySize.erase(SmallestBlockItIt);
m_FreeBlocksByOffset.erase(SmallestBlockIt);
if (NewSize > 0)
     AddNewBlock(NewOffset, NewSize);

AddNewBlock() is a helper function that inserts a new block of given size at the specified offset:

C++
void AddNewBlock(OffsetType Offset, OffsetType Size)
{
    auto NewBlockIt = m_FreeBlocksByOffset.emplace(Offset, Size);
    auto OrderIt = m_FreeBlocksBySize.emplace(Size, NewBlockIt.first);
    NewBlockIt.first->second.OrderBySizeIt = OrderIt;
}

Both insertion and deletion of an element into an ordered map take logarithmic time, so the total method complexity is O(log n).

As an example, if we request the block of size 20 bytes, the system will return offset 32 and the structure of the blocks after the allocation will look like shown in the figure below:

Image 2

The full function source code is given in the following listing:

C++
OffsetType Allocate(OffsetType Size)
{
    if(m_FreeSize < Size)
        return InvalidOffset;

    // Get the first block that is large enough to encompass Size bytes
    auto SmallestBlockItIt = m_FreeBlocksBySize.lower_bound(Size);
    if(SmallestBlockItIt == m_FreeBlocksBySize.end())
        return InvalidOffset;

    auto SmallestBlockIt = SmallestBlockItIt->second;
    auto Offset = SmallestBlockIt->first;
    auto NewOffset = Offset + Size;
    auto NewSize = SmallestBlockIt->second.Size - Size;
    m_FreeBlocksBySize.erase(SmallestBlockItIt);
    m_FreeBlocksByOffset.erase(SmallestBlockIt);
    if (NewSize > 0)
        AddNewBlock(NewOffset, NewSize);

    m_FreeSize -= Size;
    return Offset;
}

Deallocation

Deallocation is little more involved as there are more cases we need to handle. The first step is to understand where the new block should be inserted. For that purpose, we can use the first map that keeps memory blocks sorted by the offsets. Since we know the offset of the block to be deallocated, we can use map::upper_bound to find a block that goes right after it. Using the obtained interator, we can then find preceding free block:

C++
// Find the first element whose offset is greater than the specified offset
auto NextBlockIt = m_FreeBlocksByOffset.upper_bound(Offset);
auto PrevBlockIt = NextBlockIt;
if(PrevBlockIt != m_FreeBlocksByOffset.begin())
    --PrevBlockIt;
else
    PrevBlockIt = m_FreeBlocksByOffset.end();

Next, we need to handle four possible cases:

  1. The block being released is not adjacent to any other free block. In this case, we simply add the block as a new free block.
  2. The block is adjacent to the previous block. The two blocks need to be merged.
  3. The block is adjacent to the next block. The two blocks need to be merged.
  4. The block is adjacent to both previous and next blocks. All three blocks need to be merged.

The full function listing is given below:

C++
void Free(OffsetType Offset, OffsetType Size)
{
    // Find the first element whose offset is greater than the specified offset
    auto NextBlockIt = m_FreeBlocksByOffset.upper_bound(Offset);
    auto PrevBlockIt = NextBlockIt;
    if(PrevBlockIt != m_FreeBlocksByOffset.begin())
        --PrevBlockIt;
    else
        PrevBlockIt = m_FreeBlocksByOffset.end();

    OffsetType NewSize, NewOffset;
    if(PrevBlockIt != m_FreeBlocksByOffset.end() && 
          Offset == PrevBlockIt->first + PrevBlockIt->second.Size)
    {
        // PrevBlock.Offset           Offset
        // |                          |
        // |<-----PrevBlock.Size----->|<------Size-------->|
        //
        NewSize = PrevBlockIt->second.Size + Size;
        NewOffset = PrevBlockIt->first;

        if (NextBlockIt != m_FreeBlocksByOffset.end() && Offset + Size == NextBlockIt->first)
        {
            // PrevBlock.Offset           Offset               NextBlock.Offset
            // |                          |                    |
            // |<-----PrevBlock.Size----->|<------Size-------->|<-----NextBlock.Size----->|
            //
            NewSize += NextBlockIt->second.Size;
            m_FreeBlocksBySize.erase(PrevBlockIt->second.OrderBySizeIt);
            m_FreeBlocksBySize.erase(NextBlockIt->second.OrderBySizeIt);
            // Delete the range of two blocks
            ++NextBlockIt;
            m_FreeBlocks.erase(PrevBlockIt, NextBlockIt);
        }
        else
        {
            // PrevBlock.Offset           Offset                       NextBlock.Offset
            // |                          |                            |
            // |<-----PrevBlock.Size----->|<------Size-------->| ~ ~ ~ |<-----NextBlock.Size----->|
            //
            m_FreeBlocksBySize.erase(PrevBlockIt->second.OrderBySizeIt);
            m_FreeBlocksByOffset.erase(PrevBlockIt);
        }
    }
    else if (NextBlockIt != m_FreeBlocksByOffset.end() && Offset + Size == NextBlockIt->first)
    {
        // PrevBlock.Offset                   Offset               NextBlock.Offset
        // |                                  |                    |
        // |<-----PrevBlock.Size----->| ~ ~ ~ |<------Size-------->|<-----NextBlock.Size----->|
        //
        NewSize = Size + NextBlockIt->second.Size;
        NewOffset = Offset;
        m_FreeBlocksBySize.erase(NextBlockIt->second.OrderBySizeIt);
        m_FreeBlocksByOffset.erase(NextBlockIt);
    }
    else
    {
        // PrevBlock.Offset                   Offset                       NextBlock.Offset
        // |                                  |                            |
        // |<-----PrevBlock.Size----->| ~ ~ ~ |<------Size-------->| ~ ~ ~ |<-----NextBlock.Size----->|
        //
        NewSize = Size;
        NewOffset = Offset;
    }

    AddNewBlock(NewOffset, NewSize);

    m_FreeSize += Size;
}

The run-time complexity of deallocation routine is logarithmic with respect to the number of free blocks. The figure below shows an example of deallocating a block of size 8 at offset 56, that results in a merge of the block with the previous and next free blocks:

Image 3

Variable-Size GPU Allocations Manager

When managing GPU memory, deallocations can only be performed when the memory is no longer accessed by the GPU. A simple extension of the basic allocator allows deferring block deallocation until the time when it is safe to do so. The class provides new Free() method that takes frame number along with the block offset and size. The method adds the block attributes in the deletion queue, and releases all the stale allocations at once at the end of each frame:

C++
class VariableSizeGPUAllocationsManager : public VariableSizeAllocationsManager
{
private:
     struct FreedAllocationInfo
     {
        OffsetType Offset;
        OffsetType Size;
        Uint64 FrameNumber;
    };

public:
    VariableSizeGPUAllocationsManager(OffsetType MaxSize, IMemoryAllocator &Allocator);
    ~VariableSizeGPUAllocationsManager();
    VariableSizeGPUAllocationsManager(VariableSizeGPUAllocationsManager&& rhs);

    VariableSizeGPUAllocationsManager& operator = (VariableSizeGPUAllocationsManager&& rhs) = default;
    VariableSizeGPUAllocationsManager(const VariableSizeGPUAllocationsManager&) = delete;
    VariableSizeGPUAllocationsManager& operator = (const VariableSizeGPUAllocationsManager&) = delete;

    void Free(OffsetType Offset, OffsetType Size, Uint64 FrameNumber)
    {
        // Do not release the block immediately, but add
        // it to the queue instead
        m_StaleAllocations.emplace_back(Offset, Size, FrameNumber);
    }

    void ReleaseCompletedFrames(Uint64 NumCompletedFrames)
    {
        // Free all allocations from the beginning of the queue that belong to completed frames
        while(!m_StaleAllocations.empty() && m_StaleAllocations.front().FrameNumber < NumCompletedFrames)
        {
            auto &OldestAllocation = m_StaleAllocations.front();
            VariableSizeAllocationsManager::Free(OldestAllocation.Offset, OldestAllocation.Size);
            m_StaleAllocations.pop_front();
        }
    }

 private:
    std::deque< FreedAllocationInfo > m_StaleAllocations;
 };

Discussion

The implementation can be easily extended so that the allocation size is not required for the deallocation routine. For that, the allocated blocks need to be extended to hold the size. The allocator provides a lot of flexibility as it can handle allocations of any (potentially very large) size. Its run-time complexity is logarithmic with respect to the number of free blocks as opposed to constant-time complexity for fixed-size block allocators. Most of the allocations through the class happen at the initialization time, so it is not used in hot paths of the engine, and the performance is acceptable.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
United States United States
Being a 3D graphics enthusiast for many years, I have worked on various rendering technologies including deformable terrain, physically-based water, shadows, volumetric and post-processing effects and other. I run Diligent Graphics as a place where I can experiment, learn new technologies, try new algorithms and share my ideas.

Comments and Discussions

 
QuestionI made a C++ wrapper class that works on iOS/OSX/and more.. Pin
mewza6-Jul-23 22:03
mewza6-Jul-23 22:03 
QuestionWhat is your motivation/goal? Pin
Stefan_Lang11-Apr-17 21:28
Stefan_Lang11-Apr-17 21:28 
AnswerRe: What is your motivation/goal? Pin
EgorYusov13-Apr-17 5:50
EgorYusov13-Apr-17 5:50 
GeneralRe: What is your motivation/goal? Pin
Stefan_Lang17-Apr-17 19:49
Stefan_Lang17-Apr-17 19:49 
GeneralRe: What is your motivation/goal? Pin
EgorYusov17-Apr-17 20:48
EgorYusov17-Apr-17 20:48 
QuestionAny advantages and good usage scenarios? Pin
ehaerim11-Apr-17 20:56
ehaerim11-Apr-17 20:56 
Good work!

But

An Efficient C++ Fixed Block Memory Allocator[^]

Replace malloc/free with a Fast Fixed Block Memory Allocator[^]

the above two articles present a great memory allocators for both fixed and variable size allocation.
As far as I remember they are constant time compared to the logarithmic time scale.

Q1) What advantages over the above two allocators does your allocator have?

Q2) Present a few good usage samples if you have.

Regards
HaeRim
PraiseRe: Any advantages and good usage scenarios? Pin
Stefan_Lang12-Apr-17 21:02
Stefan_Lang12-Apr-17 21:02 
AnswerRe: Any advantages and good usage scenarios? Pin
EgorYusov13-Apr-17 16:03
EgorYusov13-Apr-17 16:03 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.