Containers IE Architect Advanced Beginner Intermediate Dev C++

Inherited Lists

Yves Florido-Monnier

4.85/5 (4 votes)

Jan 22, 2020

CPOL

5 min read

9313

114

An efficient and simple implementation of the one-to-many relationship in C++

Download source code - 1.6 KB

Introduction

This article deals with the algorithmic complexity induced by the common implementation of the one-to-many relationship. It proposes a simple and efficient alternative in C++.

Background

The relationship between objects is one of the main topics a developer has to deal with in object oriented programming. These relationships are classified according their cardinality: one-to-one, one-to-many, many-to-many,

For the one-to-many, the various resources (books, tutorial, forum) available to developers offers, in my experience, a terribly inefficient implementation from the point of view of algorithmic complexity.

These implementations mainly use aggregated arrays or lists, some using sets. This example code, applied to the subject-observer pattern, illustrates my point.

class Observer
{
   Subject* _subject;

public:
   Observer() : _subject(nullptr) { }
   ~Observer() { detach(); }

   void attach(Subject* s)
   {
      detach();
      if(_subject = s)
         _subjet->attach(this);
   }
   void detach()
   {
      if(_subject)
         _subject->detach(this);
      _subject = nullptr;
   }
};

class Subject
{
   friend class Observer;

  // Obviously, only one of these containers should be used at a time
  array<Observer*> _observers;
  list<Observer*> _observers;
  set<Observer*> _observers;

  void _notify() { ... }
  void _attach(Observer* o)
  {
      // this is pseudo-code
      // the proper method should be called according to the container specification
     _observers.insert(o);
  }
  void _detach(Observer* o)
  {
     // this is pseudo-code
     // the proper method should be called according to the container specification
     _observers.remove(o);
  }
  ~Subject()
  {
     // this is pseudo-code
     while(_observers.size())
        _observers[0]->detach();
  }
};

On an array container, the complexity of the remove is in O (n). First, the position of the observer must be sought, then the following elements of the table must be moved to fill the vacancy left by removing the observer.

On a list container, the complexity is identical, the problem being similar. The position of the observer must first be sought before its removing. The search for the element leads to complexity in O(n), not the remove itself.

On a set container, the complexity depends on the implementation of the set. On a binary tree (such as the std::set), the insertion as the remove will be done in O (log2 (n)).

To sum up, all the complexity comes from the need to find the position of the observer in the container that points to it.

Simple but Efficient Alternative

The find complexity can be resolved in a very simple way, by using a double linked list implementation directly inside the Observer class.

class Observer
{
   friend class Subject;

   Subject*  _subject;
   Observer* _prev;
   Observer* _next;

public:
   Observer() : _subject(nullptr) , _prev(nullptr) , _next(nullptr) { }
   ~Observer();           // same implementation
   void attach(Subject*); // same implementation
   void detach();         // same implementation
};

class Subject
{
   friend class Observer;

   Observer* _first;

   void attach(Observer* o)
   {
      o->_next = _first;
      if(_first)
         _first->_prev = o;
      _first = o;
   }
   void detach(Observer* o)
   {
      if(_first == o)
         _first = o->_next;
      if(o->_prev)
         o->_prev->_next = o->_next;
      if(o->_next)
         o->_next->_prev = o->_prev;
      o->_prev = o->_next = nullptr;
   }
};

Because the observer knows its position in the list (_prev and _next members), there is no need to search for it. Complexity is O(1).

Implementation of attach or detach is straightforward,

It occupies a memory space similar to a list or a set, and two times that of an array.

Its only fault is the presence of naked pointers (not fashionable today), and it requires rewriting the same few lines of code for each new relationship.

Using Inheritance to Resolve the Last Faults

The members and attach/detach methods of the related classes can be moved to a pair of template classes: inList and inCell. The "in" prefix stands for inheritance.

template <typename LIST, typename CELL>
class inCell
{
 public: 
   inList<LIST,CELL>* _list;
   inCell<LIST,CELL>* _prev;
   inCell<LIST,CELL>* _next;
 
   inCell() : _list(nullptr) , _prev(nullptr) , _next(nullptr) { }
   ~inCell();                       // same implementation
   void attach(inList<LIST,CELL>*); // same implementation
   void detach();                   // same implementation
};

template <typename LIST, typename CELL>
class inList
{
public:
   inCell<LIST,CELL>* _first;

   void attach(inCell<LIST,CELL>*); // same implementation
   void detach(inCell<LIST,CELL>*); // same implementation
};

Then, each implementation of the one-to-many relationship is ensured by an inheritance. For design questions, private inheritance will be privileged with the writing of facade methods.

class Observer : private inCell<Subject,Observer>
{
public:
   void attach(Subject* s) { InCell<Subject,Observer>::attach(s); }
   void detach()           { InCell<Subject,Observer>::detach(); } 
};

class Subject : private inList<Subject,Observer>
{
   friend class Observer;
};

For the subject-observer, the one-to-many relationship is guaranteed, "sealed" forever in the inList-inCell implementation. This is not the case for the "common" implementation, where few lines have to be written for each new relation.

Because C++ allows multiple inheritance, all needed one-to-many relationships between classes can be achieved easily this way. Here is an example with 3 related classes.

class Record : private inCell<Artist,Record>
{
};

class Label : private inList<Label,Artist>
{
   friend class Artist;
};

class Artist : private inList<Artist,Record>,
               private inCell<Label,Artist>
{
   friend class Record;

   void sign(Label* l) { inCell<Label,Artist>::attach(l); }
   ...
};

In addition, this way of doing things is very expressive. The one-to-many relationships are clearly visible, grouped in one place.

Benchmarks

For the test, vector, list and set come from the standard library.

The test consists in attaching N observers to the subject, then in detaching them. The order of detachment is important for measuring algorithmic complexity. To simulate real use cases, the observers located in the middle of the container will be detached, to avoid favoring the first or last positions. In the particular case of a list container, if the removed observers are always at the first position, the execution time will be in O (1) (not visible in this test).

Inherited lists (inList) are linear in execution. The execution time is normalized to inList execution time.

Attach

All container types are slower in execution than inList.

All are linear in execution time, except std::set in O(log2(n)).

vector becomes relatively faster as the container grows in size. This is because of the reallocation strategy of the vector.

list needs to allocate/free each cell contained in it, which is time consuming. This is not the case for inCell, because the observer is the cell itself.

	attach
	vector	list	set	inList
10	20	16	23	1
100	11	36	61	1
1000	7	40	86	1
10000	7	42	101*	1

(*) the attach of the 10k elements in the set are achieved in 0.0015 seconds on a common laptop today.

Detach

vector and list are in O(n). If the set grows by 10, the execution time, relatively to inList becomes 10 times longer.

The set is in O(log2(n)). Each time the container doubles in size, the execution time becomes one step slower.

	detach
	vector	list	set	inList
10	5	15	29	1
100	42	71	80	1
1000	575	703	187	1
10000	5135	8681*	233	1

(*) the detach of the 10k elements in the list are achieved in 0.05 seconds on a common laptop today.

Discussions

A common implementation (i.e., with aggregated vector/list/set) may suffice for some needs; containers may be small in size, attach/detach happens rarely, computers are fast enough, etc.

However, the inherited list is always the best, far ahead, in runtime, regardless of the size of the set. It is easy to implement and understand, robust in its design, clear in its presentation, and in O(1). In short, it deserves to be known to all.

I would like to pay tribute to my mentor, Michel Ziackovic ("Rendre à César ce qui est à César" in French). He taught me this way of doing things 25 years ago. He was the architect of sophisticated CAD software where thousands of one-to-many relationships, for hundred of classes, are still implemented this way.

History

23^rd January, 2020: Initial version