Click here to Skip to main content
15,881,709 members
Articles / Programming Languages / C#

A Generic, Fast implementation of a 2dimensional dense array in C#

Rate me:
Please Sign up or sign in to vote.
4.68/5 (9 votes)
17 Dec 2015CPOL4 min read 26.2K   191   14   10
A brief explanation of why 2d array is slow, and why not to use jagged array and howto overcome the problems using the dense array storage

Introduction

Multi dimensional arrays exist in almost every programming language, but generally they are not the best solution to any problem because of various reasons. In this article I will show the different approaches offered by C# to create a 2D array. The article focuses primarialy on these, because it's the most common use of multidimensional arrays.

The "standard" way

C# rovides two types of arrays. The "standard" multidimensional array which has the following syntax:

C#
var myarray = new int[2, 3];

The First number specifies the number of rows, while the second one specifies the number of columns. It has a neat, and understandable syntax, nothing fancy or complicated. But the biggest catch is the performance. If you have to pass this array to a function, then it will be painfully slow.

The slowness comes from the internals of the CLR. A multidimensional array doesn't have a Length property. Instead it has a GetLength(int dimension) method that returns the specified dimensions length. This requires more arithmetic than a standard Length property call, because length returns a memory segment, that can be compared, and worked with.

Jagged array

Another, better way is to use a Jagged array. A Jagged array is a one dimensional array of one dimensional arrays. Because of this layout the rows can differ in length. An extra overhead of memory consumption is added this way. The jagged array has the following syntax:

C#
var jaggedArray = new int[2][];
for (int i=0; i<2; i++) jaggedArray[i] = new int[3];

The extra memory overhead comes from that we are storing an array of arrays. Each array has some internal properties that needed to be stored. This can be a big overhead, if we are dealing with a large array. Also notice the extra initialization cycle that is needed for the initialization of the sub arrays.

Dense array

Beyond these two approaches exists another implementation, which is called dense array. The dense term comes from matrices. A dense array is actually a one dimensional array indexed in a special way, so you can fit a 2 dimensional array into a one dimensial one.

This has some very good benefits, for example you could easily copy the elements, with an Array.Copy call. The downside of this is that you have to use some fancy indexing, which can lead to a serious performance issue. Typically you would calculate the element index for a given row, and column with the following method:

C#
var index = column * rows + row;
array[index] = 33;

The biggest catch in this formula is the multiplication. Modern compilers and CPUs handle multiplication very well, but at large numbers this takes some time. If you have a 10000x10000 array, then you have to perform 100000000 multiplications on every indexing. This problem can be solved by storing the column and row multiplication value in another array, and looking up the value at indexing, sure it costs some extra memory, but the performance won't be as bad as multiplying every time.

The Implementation

Implementing the dense array method every time is bit of an overkill. Luckily we have Generics in c# since .net 2.0, so I came up with a generic Dense array class. The implementation is currently in a proof of concept stage, but it can easily extended to a full featured implementation.

namespace DenseArray
{
    public class DenseArray<T> : IEnumerable<T>
    {
        private T[] _dataarray;
        private int[] _columnindexes;

        /// <summary>
        /// Calculates the column indexes
        /// </summary>
        private void CalculateColumnIndexes()
        {
            _columnindexes = new int[Columns];
            for (int i=0; i<Columns; i++)
            {
                _columnindexes[i] = i * Rows;
            }
        }

        /// <summary>
        /// Creates a new instance of DenseArray
        /// </summary>
        /// <param name="rows">Number of rows</param>
        /// <param name="columns">Number of columns</param>
        public DenseArray(int rows, int columns)
        {
            _dataarray = new T[rows * columns];
            Rows = rows;
            Columns = columns;
            CalculateColumnIndexes();
        }

        /// <summary>
        /// Creates a new instance of DenseArray
        /// </summary>
        /// <param name="source">Source DenseArray</param>
        public DenseArray(DenseArray<T> source)
        {
            _dataarray = new T[source.Rows * source.Columns];
            Rows = source.Rows;
            Columns = source.Columns;
            Array.Copy(source._dataarray, this._dataarray, source._dataarray.LongLength);
            _columnindexes = new int[Columns];
            Array.Copy(source._columnindexes,
                       this._columnindexes,
                       source._columnindexes.LongLength);
        }

        /// <summary>
        /// Creates a new instance of DenseArray
        /// </summary>
        /// <param name="array">source 2d array</param>
        public DenseArray(T[,] array)
        {
            Rows = array.GetLength(0);
            Columns = array.GetLength(1);
            _dataarray = new T[Rows * Columns];
            CalculateColumnIndexes();
            for (int i = 0; i < Rows; i++)
            {
                for (int j = 0; j < Columns; j++)
                {
                    this[i, j] = array[i, j];
                }
            }
        }

        /// <summary>
        /// Gets the number of columns
        /// </summary>
        public int Columns { get; private set; }

        /// <summary>
        /// Gets the number of rows
        /// </summary>
        public int Rows { get; private set; }

        /// <summary>
        /// Gets or sets an element of the array
        /// </summary>
        /// <param name="row">Row index</param>
        /// <param name="column">Column index</param>
        /// <returns>Value at row and column index</returns>
        public T this[int row, int column]
        {
            get { return _dataarray[_columnindexes[column] + row]; }
            set { _dataarray[_columnindexes[column] + row] = value; }
        }

        /// <summary>
        /// Gets or sets an element of the array
        /// </summary>
        /// <param name="i">Index</param>
        /// <returns>Value at index</returns>
        public T this[int i]
        {
            get { return _dataarray[i]; }
            set { _dataarray[i] = value; }
        }

        /// <summary>
        /// IEnumerable implementation.
        /// </summary>
        /// <returns>internal array enumerator</returns>
        public IEnumerator<T> GetEnumerator()
        {
            return (IEnumerator<T>)_dataarray.GetEnumerator();
        }

        /// <summary>
        /// IEnumerable Implementation
        /// </summary>
        /// <returns>internal array enumerator</returns>
        System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
        {
            return _dataarray.GetEnumerator();
        }

        /// <summary>
        /// Gets a row as an Array
        /// </summary>
        /// <param name="rowindex">Row index</param>
        /// <returns>Row values in an array</returns>
        public T[] GetRow(int rowindex)
        {
            T[] ret = new T[this.Columns];
            for (int i = 0; i < this.Columns; i++)
            {
                ret[i] = this[rowindex, i];
            }
            return ret;
        }

        /// <summary>
        /// Gets a column as an Array
        /// </summary>
        /// <param name="columnindex">Column index</param>
        /// <returns>Column values in an array</returns>
        public T[] GetColumn(int columnindex)
        {
            T[] ret = new T[this.Rows];
            for (int i = 0; i < this.Rows; i++)
            {
                ret[i] = this[i, columnindex];
            }
            return ret;
        }

        /// <summary>
        /// Sets a row's values from an array
        /// </summary>
        /// <param name="row">the row index</param>
        /// <param name="items">items in an array</param>
        public void SetRow(int row, T[] items)
        {
            if (row < 0 || row > this.Rows)
               throw new ArgumentOutOfRangeException("row");
            if (items == null)
               throw new ArgumentNullException("items");

            int limit = Math.Min(items.Length, this.Columns);

            for (int i = 0; i < limit; i++)
            {
                this[row, i] = items[i];
            }
        }

        /// <summary>
        /// Sets a column's values from an array
        /// </summary>
        /// <param name="column">the column index</param>
        /// <param name="items">items in an array</param>
        public void SetColumn(int column, T[] items)
        {
            if (column < 0 || column > this.Columns)
               throw new ArgumentOutOfRangeException("column");
            if (items == null)
               throw new ArgumentNullException("items");

            int limit = Math.Min(items.Length, this.Rows);

            for (int i = 0; i < limit; i++)
            {
                this[i, column] = items[i];
            }
        }
    }
}

Using the code

Using the implementation is quite easy, it acts like a 2D array, because the class provides object indexers:

var dense = new DenseArray<int>(3, 3);
dense[0, 2] = 42;

The implementation provides methods for getting and setting whole column data in an easy way. Also it implements the IEnumerable interface, which is at the moment

Performance

I measured the performance of the methods shown in this article with a simple program (source code is in the downloadable zip), that allocates a 10000x10000 element array using the standard 2D array method, the jagged method and my implementation. The results was measured on a machine with an Intel i3 530 CPU running a 64 bit version Windows 10 with .NET framework 4.6 and 4GiB RAM.

    2D array Jagged array DenseArray
1st Run Memory Allocation in bytes: 400527360 403034112 400023552
Random fill time in ms: 2292 1599 1962
Memory overhead in % 0,6259 -0,1258
Speed Gain in % 43,3396 16,8196
2nd Run Memory Allocation in bytes: 400510976 403079168 400023552
Random fill time in ms: 2285 1556 1990
Memory overhead in % 0,6412 -0,1217
Speed Gain in % 46,8509 14,8241
3rd Run Memory Allocation in bytes: 400502784 403087360 400039936
Random fill time in ms: 2271 1373 1958
Memory overhead in % 0,6453 -0,1156
Speed Gain in % 65,4042 15,9857
  Average Memory in bytes: 400513706,7 403066880 400029013,3
  Average random fill time in ms: 2282,666667 1509,333333 1970
  Average memory overhead compared to 2d array in %: 0,6375 -0,1210
  Average speed gain compared to 2d array in %: 51,2367 15,8714

Further improvements

The implementation in the future might be extended with proper GetHashCode() and ToString() ovverides, and a decent implementation of an Enumeratior coud be added, because the current enumerator enumerates items as they are in the underlaying array.

History

  • 2015-12-16 Initial release

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Architect
Hungary Hungary
Been interested in computers & electronics since I got my NES, eventually became a Computer engineer, now a Software Architect

Comments and Discussions

 
SuggestionYour Perf evaluation is not correct Pin
Summer23 HV7-Jun-18 5:57
Summer23 HV7-Jun-18 5:57 
QuestionAren't you deceiving yourself a bit? Pin
Rolf Borchmann18-Dec-15 20:40
Rolf Borchmann18-Dec-15 20:40 
AnswerRe: Aren't you deceiving yourself a bit? Pin
John Torjo21-Dec-15 4:05
professionalJohn Torjo21-Dec-15 4:05 
QuestionUnsafe mode, pointer will be 100% faster Pin
Alexey KK18-Dec-15 16:19
professionalAlexey KK18-Dec-15 16:19 
Suggestionno need for _columnIndexes Pin
John Torjo17-Dec-15 8:20
professionalJohn Torjo17-Dec-15 8:20 
GeneralRe: no need for _columnIndexes Pin
webmaster44217-Dec-15 8:31
webmaster44217-Dec-15 8:31 
You are right, but that way the code runtime in the test was over 16000ms, so it's a huge performance improvement at large scale Smile | :) At lower scale it helps also, because accessing a variable & adding a value to it is generally faster than doing multiplications.

In a further implementation I might ad a flag to a ctor that disables the column index caching.
GeneralRe: no need for _columnIndexes Pin
John Torjo17-Dec-15 8:36
professionalJohn Torjo17-Dec-15 8:36 
GeneralRe: no need for _columnIndexes Pin
Siderite Zackwehdex18-Dec-15 0:07
Siderite Zackwehdex18-Dec-15 0:07 
GeneralRe: no need for _columnIndexes Pin
webmaster44218-Dec-15 6:29
webmaster44218-Dec-15 6:29 
GeneralRe: no need for _columnIndexes Pin
John Torjo20-Dec-15 19:24
professionalJohn Torjo20-Dec-15 19:24 
GeneralMessage Closed Pin
16-Jan-16 1:54
Member 1089184816-Jan-16 1:54 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.