Click here to Skip to main content
15,569,768 members
Articles / Programming Languages / C++
Tip/Trick
Posted 15 Feb 2022

Tagged as

Stats

5.7K views
82 downloads
2 bookmarked

vectormap: A C++ Map Class with Order-added Iteration

Rate me:
Please Sign up or sign in to vote.
4.11/5 (2 votes)
16 Feb 2022CPOL2 min read
Use vectormap when you want the fast lookup of unordered_map, and the order-added iteration of vector
unordered_map provides great speed, but what if you want to maintain the order items were added to the map? vectormap combines a vector of pairs and an unordered_map to save the day.

Introduction

New programmers might assume that when they add multiple key-value pairs to a map data structure that the order items were added would be the order you get iterating over the data structure.

Not so!

The C++ Standard Library provides two map data structures, one named, get this, map, the other, unordered_map.

The map class is implemented using a red-black tree, a binary tree that balances itself when items are added to or removed. When you iterate over a map, you get the sort order of the keys. So if you add C, A, B, or any other order, when you iterate, you will always get A, B, C. If that ordering is what you want, you've got it for free...almost. Searching for an item takes O(log(n)) time, so you pay a price for that ordering by keys.

The unordered_map class is implemented as a hash table. As the name implies, there is no ordering for iteration provided at all. Instead, you get sort of a random order, as the items added scatter across the buckets of the hash table. The benefit of unordered_map over map is that searching for items takes O(1).

So what's to be done if you want fast searches and order-added iteration? Read on!

Enter vectormap

C++
#pragma once

#include <stdexcept>
#include <unordered_map>
#include <vector>

namespace mscript
{
    /// <summary>
    /// A vectormap combines a map with its fast key->value lookups
    /// with a vector to preserve the order that key-value pairs were added
    /// </summary>
    /// <typeparam name="K">Key type of the map</typeparam>
    /// <typeparam name="V">Value type of the map</typeparam>
    template <typename K, typename V>
    class vectormap
    {
    public:
        /// <summary>
        /// Access the vector of pairs directly for index access or iteration
        /// </summary>
        const std::vector<std::pair<K, V>>& vec() const
        {
            return m_vec;
        }

        /// <summary>
        /// What is the value for a given key?
        /// NOTE: By returning the value by, well, value
        ///       this function can stay const
        ///       and users of the class can use pointers
        ///       if they want side effects down the line.
        /// </summary>
        V get(const K& key) const
        {
            const auto& it = m_map.find(key);
            if (it == m_map.end())
                throw std::runtime_error("key not found");
            return m_vec[it->second].second;
        }

        /// <summary>
        /// What is the value at a given index?
        /// NOTE: By returning the value by, well, value
        ///       this function can stay const
        ///       and users of the class can use pointers
        ///       if they want side effects down the line.
        /// </summary>
        V getAt(size_t index) const
        {
            if (index >= m_vec.size())
                throw std::runtime_error("index out of range");
            return m_vec[index].second;
        }

        /// <summary>
        /// How many key-value pairs are in this?
        /// </summary>
        size_t size() const
        {
            return m_vec.size();
        }

        /// <summary>
        /// Does this map contain this key?
        /// </summary>
        bool contains(const K& key) const
        {
            return m_map.find(key) != m_map.end();
        }

        /// <summary>
        /// Associate a value with a key
        /// </summary>
        void set(const K& key, const V& val)
        {
            auto it = m_map.find(key);
            if (it == m_map.end())
            {
                m_map.insert({ key, m_vec.size() });
                m_vec.push_back({ key, val });
            }
            else
                m_vec[it->second].second = val;
        }

        /// <summary>
        /// Get a value, checking if it exists first
        /// </summary>
        /// <param name="key">Key value to look up</param>
        /// <param name="val">Value to populate</param>
        /// <returns>true if a value exists for the key, false otherwise</returns>
        bool tryGet(const K& key, V& val) const
        {
            const auto& it = m_map.find(key);
            if (it == m_map.end())
                return false;

            val = m_vec[it->second].second;
            return true;
        }

        /// <summary>
        /// Get a list of the keys of this map
        /// </summary>
        std::vector<K> keys() const
        {
            std::vector<K> retVal;
            retVal.reserve(m_vec.size());
            for (size_t k = 0; k < m_vec.size(); ++k)
                retVal.push_back(m_vec[k].first);
            return retVal;
        }

        /// <summary>
        /// Get a list of the values of this map
        /// </summary>
        std::vector<V> values() const
        {
            std::vector<V> retVal;
            retVal.reserve(m_vec.size());
            for (size_t v = 0; v < m_vec.size(); ++v)
                retVal.push_back(m_vec[v].second);
            return retVal;
        }

    private:
        std::unordered_map<K, size_t> m_map;
        std::vector<std::pair<K, V>> m_vec;
    };
}

A pretty straightforward unordered_map + vector wrapper class. Just use the vec() function to walk the items in the order in which they were added, and use contains() and get/tryGet() for fast item lookups.

Update February 15, 2022: Per Paulo Zemek's feedback about set() making a poor bargain with the programmer, I took his advice and stashed the vector's indexes into the map's value entries.  Now when you update the map, you know which vector element to update.  The updated class passed good unit tests and the calling application's unit and smoke tests all pass, too, with no modificiation to application or tests. Good times.

 

Update February 18, 2020: Upon further reflection, having the map values in both the unordered_map and the vector was duplicating the memory usage.  Now the map values are just indexes into the vector which has the values.

Conclusion

If you ever need a data structure with fast searching and order-added iteration, look no further than vectormap. Enjoy!

History

  • 15th February, 2022: Initial version and first revision
  • 18th February, 2022: Second revision

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer
United States United States
Michael Balloni is a manager of software development at a cybersecurity software and services provider.

Check out https://www.michaelballoni.com for all the programming fun he's done over the years.

He has been developing software since 1994, back when Mosaic was the web browser of choice. IE 4.0 changed the world, and Michael rode that wave for five years at a .com that was a cloud storage system before the term "cloud" meant anything. He moved on to a medical imaging gig for seven years, working up and down the architecture of a million-lines-code C++ system.

Michael has been at his current cybersecurity gig since then, making his way into management. He still loves to code, so he sneaks in as much as he can at work and at home.

Comments and Discussions

 
QuestionThe set shows the problem with this approach. Pin
Paulo Zemek15-Feb-22 12:39
Paulo Zemek15-Feb-22 12:39 
AnswerRe: The set shows the problem with this approach. Pin
Michael Sydney Balloni15-Feb-22 15:16
mvaMichael Sydney Balloni15-Feb-22 15:16 
GeneralRe: The set shows the problem with this approach. Pin
Paulo Zemek15-Feb-22 15:23
Paulo Zemek15-Feb-22 15:23 
I agree that adding only is more common, so the set situation on itself could be avoided.
Anyways, I am really glad you liked my suggestion to get a better set performance.

This is one of those cases where I don't know what's the best solution, as having a set means it is better to also store the indexes (and so, complicates the class for other uses), where having only adds or a slow performance set is just that, a lack of or a bad set method...
I would love to have a perfect solution, but it seems we need to choose one, the other, or have both available, but then I also worry how much this can become a problem if we make this kind of double class for every problem.

GeneralRe: The set shows the problem with this approach. Pin
Michael Sydney Balloni15-Feb-22 17:50
mvaMichael Sydney Balloni15-Feb-22 17:50 
GeneralRe: The set shows the problem with this approach. Pin
Paulo Zemek16-Feb-22 9:31
Paulo Zemek16-Feb-22 9:31 
GeneralRe: The set shows the problem with this approach. Pin
Michael Sydney Balloni16-Feb-22 16:27
mvaMichael Sydney Balloni16-Feb-22 16:27 
GeneralRe: The set shows the problem with this approach. Pin
Paulo Zemek16-Feb-22 21:59
Paulo Zemek16-Feb-22 21:59 
AnswerRe: The set shows the problem with this approach. Pin
ElectronProgrammer17-Feb-22 6:02
ElectronProgrammer17-Feb-22 6:02 
GeneralRe: The set shows the problem with this approach. Pin
Michael Sydney Balloni17-Feb-22 8:26
mvaMichael Sydney Balloni17-Feb-22 8:26 
GeneralRe: The set shows the problem with this approach. Pin
Michael Sydney Balloni18-Feb-22 7:16
mvaMichael Sydney Balloni18-Feb-22 7:16 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.