Click here to Skip to main content
15,885,985 members
Articles / Entity Framework
Tip/Trick

Entity Framework 5.0 - Performance Tuning Using SqlBulkCopy

Rate me:
Please Sign up or sign in to vote.
5.00/5 (4 votes)
22 Jul 2016CPOL3 min read 14.7K   9   1
Performance tuning using SqlBulkCopy

Introduction

There are many ways of inserting huge amount of data (a million INSERTs) into the database using Entity Framework. You can also buy Entity Framework extension available in the market. However, the fastest way that I have used in one of my recent projects is the SqlBulkCopy. This will dramatically improve the Entity Framework Performance. All you need to have is the List<Entities> that you want to insert.

Background

Entity Framework uses SaveChanges() method to write the data back to the backend. There are many factors that you should consider before calling this method, like you want autoupdate ON/OFF, etc.

Calling SaveChanges() for each record slows bulk inserts extremely down. You can try few simple tests which will very likely improve the performance:

  • Call SaveChanges() once after ALL records.
  • Call SaveChanges() after for example 100 records.
  • Call SaveChanges() after for example 100 records and dispose the context and create a new one.
  • Disable change detection

For the performance, it is important to call SaveChanges() after "many" records ("many" around 100 or 1000). It also improves the performance to dispose the context after SaveChanges and create a new one. This clears the context from all entities, SaveChanges doesn't do that, the entities are still attached to the context in state Unchanged. It is the growing size of attached entities in the context what slows down the insertion step by step. So, it is helpful to clear it after some time.

To avoid the batch operation, you can use SqlBulkCopy on your Object Context. In my example, I am using this method on List<Entities> generated using EDMX feature. I have used Entity Framework 5.0 Database-First approach. For Entity Framework 6.0 or later, you may have to tweak this code a bit to avoid mappings.

Using the Code

In this section, we will provide you all the functions needed to bulk insert any entities (model objects). Root function, the only one called from the outside, is BulkInsertAll and it receives a List of your entity class. Here is the source code.

C#
using System;
using System.Collections;
using System.Collections.Generic;
using System.Data;
using System.Data.EntityClient;
using System.Data.Metadata.Edm;
using System.Data.Objects;
using System.Data.SqlClient;
using System.Linq;
using System.Reflection;

namespace Abhay
{
    partial class MyEntityContext
    {
        public void BulkInsertAll<T>(IEnumerable<T> entities)
        {
            try
            {
                entities = entities.ToArray();
                string cs = Connection.ConnectionString;
                var conn = new SqlConnection(cs);
                conn.Open();

                Type t = typeof(T);

                var workspace = this.MetadataWorkspace;
                
                Dictionary<string, string> mappings = null;

                try
                {
                    mappings = GetMappings(workspace, this.DefaultContainerName, typeof(T).Name);
                }
                catch (Exception)
                {
                    // there may be an exception depend case to case. 
                    // In my case I found some issue initially when my 
                    // EDMX was corrupted but now this is working fine. 
                    // This is just to see the exception
                }

                // var tableAttribute = (TableAttribute)t.GetCustomAttributes(
                // typeof(TableAttribute), false).Single();
                var bulkCopy = new SqlBulkCopy(conn)
                {
                    DestinationTableName = t.Name
                };

                var properties = t.GetProperties().Where(EventTypeFilter).ToArray();

                var table = new DataTable();

                foreach (var property in properties)
                {
                    try
                    {
                        Type propertyType = property.PropertyType;
                        if (propertyType.IsGenericType &&
                            propertyType.GetGenericTypeDefinition() == typeof(Nullable<>))
                        {
                            propertyType = Nullable.GetUnderlyingType(propertyType);
                        }

                        table.Columns.Add(new DataColumn(property.Name, propertyType));

                        var clrPropertyName = property.Name;

                        // In case mapping failed to compute, assuming the mapping between Entity and 
                        // Database Table is same as Generated Entity
                        var tableColumnName = mappings == null ? property.Name : mappings[property.Name];

                        bulkCopy.ColumnMappings.Add
                            (new SqlBulkCopyColumnMapping(clrPropertyName, tableColumnName));
                    }
                    catch (Exception)
                    {
                        //This is just to see the exception
                    }
                }

                foreach (var entity in entities)
                {
                    table.Rows.Add(properties.Select(
                      property => GetPropertyValue(
                      property.GetValue(entity, null))).ToArray());
                }

                bulkCopy.WriteToServer(table);
                conn.Close();
            }
            catch (Exception)
            {
                throw;
            }
        }

        private bool EventTypeFilter(System.Reflection.PropertyInfo p)
        {
            //var attribute = Attribute.GetCustomAttribute(p,
            //    typeof(AssociationAttribute)) as AssociationAttribute;

            //if (attribute == null) return true;
            //if (attribute.IsForeignKey == false) return true;

            //In this case I want to avoid the Reference Type (Custom Class) other than System.String
            //You can comment the blow case and use the above code

            if (p.PropertyType.IsClass && !p.PropertyType.Namespace.ToString().Equals("System"))
                return false;
            return true;
        }

        private object GetPropertyValue(object o)
        {
            if (o == null)
                return DBNull.Value;
            return o;
        }

        private Dictionary<string, string> GetMappings
             (MetadataWorkspace workspace, string containerName, string entityName)
        {
            var mappings = new Dictionary<string, string>();
            var storageMapping = workspace.GetItem<GlobalItem>(containerName, DataSpace.CSSpace);
            dynamic entitySetMaps = storageMapping.GetType().InvokeMember(
                "EntitySetMaps",
                BindingFlags.GetProperty | BindingFlags.NonPublic | BindingFlags.Instance,
                null, storageMapping, null);

            foreach (var entitySetMap in entitySetMaps)
            {
                var typeMappings = GetArrayList("TypeMappings", entitySetMap);
                dynamic typeMapping = typeMappings[0];
                dynamic types = GetArrayList("Types", typeMapping);

                if (types[0].Name == entityName)
                {
                    var fragments = GetArrayList("MappingFragments", typeMapping);
                    var fragment = fragments[0];
                    var properties = GetArrayList("AllProperties", fragment);
                    foreach (var property in properties)
                    {
                        var edmProperty = GetProperty("EdmProperty", property);
                        var columnProperty = GetProperty("ColumnProperty", property);
                        mappings.Add(edmProperty.Name, columnProperty.Name);
                    }
                }
            }

            return mappings;
        }

        private dynamic GetProperty(string property, object instance)
        {
            var type = instance.GetType();
            return type.InvokeMember(property, BindingFlags.GetProperty | 
            BindingFlags.NonPublic | BindingFlags.Instance, null, instance, null);
        }

        private ArrayList GetArrayList(string property, object instance)
        {
            var type = instance.GetType();
            var objects = (IEnumerable)type.InvokeMember(property, 
            BindingFlags.GetProperty | BindingFlags.NonPublic | 
            BindingFlags.Instance, null, instance, null);
            var list = new ArrayList();
            foreach (var o in objects)
            {
                list.Add(o);
            }
            return list;
        }
    }
}

You can simply have the same name as your Object Context, replace it with MyEntityContext and place this partial class on the same root folder as your Object Context class.

Call BulkInsertAll from your Data Services, or Business Services as below.

C#
MyEntityContext.BulkInsertAll(entities); // where entities is the List<Entity>

Points of Interest

Most of the performance issues that Entity Framework suffers comes from too frequent communication with the database server. Each record inserted requires two statements to be issued to the server, with two roundtrips to the server to fully execute. All the measurements presented in this section are based on SQL Server installed on local computer, with nearly zero roundtrip time. Consider database server installed on LAN or Shared with many other application databases, which is typical situation, with only one millisecond roundtrip time. Inserting one million objects from our experiment means three million records, with six million roundtrips to the database. With one millisecond irrevocably lost with each database record, we are already 100 minutes behind the schedule. At the same time, bulk implementation won't spend more than ten milliseconds for the same thing, because it sends less than a dozen of statements to the server. All the data actually transferred by the bulk implementation are actual data that will be saved into database records - communication overhead nears zero on such a large amount of useful data.

History

  • 23rd July, 2016: Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Technical Lead Accenture
Canada Canada
This member has not yet provided a Biography. Assume it's interesting and varied, and probably something to do with programming.

Comments and Discussions

 
QuestionEF performance Pin
Tiger_Steve25-Jul-16 23:14
Tiger_Steve25-Jul-16 23:14 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.