Insert millions of records into a SQL Server table at once

In this post we will see how to insert a really large number of records into a SQL Server table at once. I decided to write about this because I recently found myself having difficulties while trying to deploy a specific table to an Azure SQL Database. There are several tools and options for SQL schema or SQL data comparison, with Red Gate’s SQL Compare and Visual Studio’s SQL Server Data Tools being the most significant in my opinion. Sometimes though, even those tools can disappoint you for unknown reasons, right when you urgently need to deploy your new data. In my case, I had a table with 2 million records in my local SQL Server and I wanted to deploy all of them to the respective Azure SQL Database table. I’d never had problems deploying data to the cloud, and even when I did due to certain circumstances (no comparison keys between tables, clustered indexes, etc.), there was always a workaround to fix the problem. In this case though, nothing seemed to work, so I decided to write some simple code in a console application to deploy the 2 million records.
In the following code I read all the records from the local SQL Server and, in a foreach loop, insert each record into the cloud table. Let’s see it and discuss it.

using System;
using System.Data;
using System.Data.SqlClient;

class Program
    {
        private static string devConnString = @"Data Source=localhost;Persist Security Info=False;Initial Catalog=devDatabase;Integrated Security=SSPI;";
        private static string cloudConnString = @"Server=[cloud-server];Database=[cloudDatabase];User ID=[username];Password=[password];Trusted_Connection=False;Encrypt=True;Connection Timeout=30;";

        static void Main(string[] args)
        {
            DataTable millionRecordsDataTable = new DataTable();
            // read the whole local table into memory
            using (SqlConnection devConn = new SqlConnection(devConnString))
            {
                using (SqlCommand devCmd = new SqlCommand("SELECT * FROM dbo.millionRecordsDataTable", devConn))
                {
                    using (SqlDataAdapter devAd = new SqlDataAdapter(devCmd))
                    {
                        devAd.Fill(millionRecordsDataTable); // 2 million records
                    }
                }
            }
            // try to insert each record one by one to the cloud (bad practice for this situation)
            try
            {
                int counter = 0;

                foreach (DataRow row in millionRecordsDataTable.Rows)
                {
                    string sqlToRun = @"INSERT INTO [dbo].[millionRecordsDataTable] ([column1], [column2], [column3]) VALUES (@column1, @column2, @column3)";
                    // a brand new connection is opened and closed for every single record
                    using (SqlConnection cloudConn = new SqlConnection(cloudConnString))
                    {
                        using (SqlCommand cloudCmd = new SqlCommand(sqlToRun, cloudConn))
                        {
                            cloudCmd.CommandType = CommandType.Text;

                            cloudCmd.Parameters.Add("@column1", SqlDbType.Float).Value = float.Parse(row[0].ToString());
                            cloudCmd.Parameters.Add("@column2", SqlDbType.Float).Value = float.Parse(row[1].ToString());
                            cloudCmd.Parameters.Add("@column3", SqlDbType.NVarChar, 2).Value = row[2].ToString();
                            cloudConn.Open();
                            cloudCmd.ExecuteNonQuery();
                        }
                    }
                    Console.WriteLine("Added row " + counter++);
                }
            }
            catch (Exception exc)
            {
                Console.WriteLine(exc.Message);
            }

            Console.ReadKey();
        }
    }

While the preceding code is self-explanatory, easy to understand and does the job, it has the huge drawback of slow execution time. Think for a little about what is actually happening: for each record we read, we open a connection to the cloud server, copy the record to the table, close the connection, and so on until all 2,000,000 records are successfully copied. Taking into consideration that, following this pattern, you need approximately half a second to copy a single record to the cloud, the entire table needs 2,000,000 × 0.5 seconds ≈ 1,000,000 seconds, which is… 277 hours! Yeah, I don’t have that much time, so I simply changed the code to use SqlBulkCopy instead of the foreach loop. Take a look at the change in the following code.

static void Main(string[] args)
        {
            DataTable millionRecordsDataTable = new DataTable();
            using (SqlConnection devConn = new SqlConnection(devConnString))
            {
                using (SqlCommand devCmd = new SqlCommand("SELECT * FROM dbo.millionRecordsDataTable", devConn))
                {
                    using (SqlDataAdapter devAd = new SqlDataAdapter(devCmd))
                    {
                        devAd.Fill(millionRecordsDataTable); // 2 million records
                    }
                }
            }

            try
            {
                using (SqlConnection connection = new SqlConnection(cloudConnString))
                {
                    connection.Open();

                    using (SqlBulkCopy bulkcopy = new SqlBulkCopy(connection))
                    {
                        bulkcopy.DestinationTableName = "dbo.millionRecordsDataTable";
                        bulkcopy.BulkCopyTimeout = 50000; // seconds; the default of 30 is far too low for 2 million rows

                        // all rows are sent in a single bulk copy operation
                        bulkcopy.WriteToServer(millionRecordsDataTable);
                    } // the using blocks dispose the bulk copy and close the connection
                }
            }
            catch (Exception exc)
            {
                Console.WriteLine(exc.Message);
            }

            Console.ReadKey();
        }

SqlBulkCopy is the ultimate solution when you need to copy a large amount of data into another table. You can find its documentation here, along with its respective options. The main difference between our first attempt and the latter is that SqlBulkCopy can send a batch of records at once instead of copying them one by one. With this change it took about half an hour to deploy all the records. The most significant properties are the following (taken from MSDN); a short configuration sketch follows the list.

  1. BulkCopyTimeout – the number of seconds for the operation to complete before it times out
  2. DestinationTableName – the name of the destination table on the server
  3. BatchSize – the number of rows in each batch (at the end of each batch, the rows in the batch are sent to the server)
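
To illustrate how these properties fit together, here is a minimal sketch based on the bulk copy code above. The batch size, the progress interval and the column mappings are hypothetical values chosen for the example, so adjust them to your own table.

using (SqlConnection connection = new SqlConnection(cloudConnString))
{
    connection.Open();

    using (SqlBulkCopy bulkcopy = new SqlBulkCopy(connection))
    {
        bulkcopy.DestinationTableName = "dbo.millionRecordsDataTable";
        bulkcopy.BulkCopyTimeout = 0;   // 0 means no timeout; use a finite value if you prefer
        bulkcopy.BatchSize = 10000;     // hypothetical: send 10,000 rows per batch
        bulkcopy.NotifyAfter = 50000;   // report progress every 50,000 copied rows
        bulkcopy.SqlRowsCopied += (sender, e) => Console.WriteLine("Copied " + e.RowsCopied + " rows so far...");

        // explicit mappings are only required when source and destination column names differ
        bulkcopy.ColumnMappings.Add("column1", "column1");
        bulkcopy.ColumnMappings.Add("column2", "column2");
        bulkcopy.ColumnMappings.Add("column3", "column3");

        bulkcopy.WriteToServer(millionRecordsDataTable);
    }
}

Setting BatchSize keeps each round trip to the server smaller, while NotifyAfter and the SqlRowsCopied event give you progress feedback on long-running loads.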

The previous bulk copy code will encapsulate the whole process under a single cloud transaction (as long as no BatchSize is set, all rows go out in one batch), which means that you won’t see any data in the cloud table until the transaction is completed – committed. In my case, I was running the following commands against the Azure SQL database to verify that the data were actually being deployed:

-- CHECK ALL uncommitted records
SELECT * FROM dbo.millionRecordsDataTable WITH (NOLOCK)

-- CHECK the number of uncommitted records
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED 
SELECT COUNT(*) FROM dbo.millionRecordsDataTable
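
If you prefer to control the transactional behaviour explicitly, SqlBulkCopy can also participate in an external transaction. The following is a minimal sketch, not part of my original deployment, that wraps the bulk copy in a SqlTransaction so the whole load is committed or rolled back as a unit.

using (SqlConnection connection = new SqlConnection(cloudConnString))
{
    connection.Open();

    using (SqlTransaction transaction = connection.BeginTransaction())
    using (SqlBulkCopy bulkcopy = new SqlBulkCopy(connection, SqlBulkCopyOptions.Default, transaction))
    {
        bulkcopy.DestinationTableName = "dbo.millionRecordsDataTable";
        bulkcopy.BulkCopyTimeout = 50000;

        try
        {
            bulkcopy.WriteToServer(millionRecordsDataTable);
            transaction.Commit();   // nothing is visible to committed readers before this point
        }
        catch (Exception ex)
        {
            transaction.Rollback(); // undo the partially copied rows
            Console.WriteLine(ex.Message);
        }
    }
}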

That’s it! I hope you found this post helpful; we saw how to copy a large amount of data into a SQL Server table at once.

P.S: Having reached 100,000 views in less than a year, I would like to thank you all for visiting this blog; it really means a lot to me. I wish I had more time to write even more. Stay tuned till next time!


Categories: ADO.NET, Best practices


4 replies

  1. If the application server and the database server are the same machine, then we can also use BULK INSERT (a short sketch follows these replies).

  2. That’s true Partha, thanks for your comment.

  3. I had the opposite need in a project once: to get data from a different database and pass it to another, and because of connectivity issues I could not get the entire thing at once. My solution was to get a bulk result set and pass it chunk by chunk. This is a small part of the code I used for getting the data; you might find it handy in similar cases:

                        #region Run Synch (By MaxFetchRows)

                        #region Count All Table Rows From Current Table
                        string CountSQL = "Select Count(*) as RowCount From " + InternalTableName + " Where " + ActionColumnName + " = 'I' ";
                        int TotalRows = int.Parse(oDB2.ExecuteScalar(true, CountSQL));
                        #endregion

                        #region For Each Batch Of Rows (MaxFetchRows)
                        for (int z = MaxFetchRows + CurrentTableStartFromRow; z < TotalRows + MaxFetchRows; z = z + MaxFetchRows)
                        {
                            string Infimum = (z - MaxFetchRows).ToString();
                            string Supremum = (int.Parse(Infimum) + MaxFetchRows).ToString();

                            // fetch only the rows whose relative record number (RRN) falls inside the current chunk
                            string FetchSQL =
                                              "Select " + strInternalFieldNames + ", " + ActionColumnName + " " +
                                              "From " + InternalTableName +
                                              " Where RRN(" + InternalTableName + ") Between " + Infimum + " And " + Supremum;

                            DataSet DSFetch = oDB2.GetDataset(true, FetchSQL, InternalTableName);

                            #region If DSFetch == Null or 0 rows Move To Next Table
                            if ((DSFetch == null) || (DSFetch.Tables[0].Rows.Count == 0))
                            {
                                string currval = "'No Dataset Fetched From Table [" + TableName + "]'";
                                oSQL.ExecuteNonQuery(OpenCloseSQLConn, "exec Synch_LogInsert " + currval + ", 'I', 'Dataset returned NULL or NoRows, Action I.', '" + FetchSQL + "'");

                                if (DSFetch != null)
                                    DSFetch.Dispose();
                            }
                            #endregion
                            else
                            {
                                retval = SynchInner(DSFetch, IsChild, isParams, OpenCloseSQLConn, ActionColumnName, TableId, TableName, InternalPrimaryKeys, IsComplex, InsertNames, MasterTableIds, UpdateNames, false, false);

                                DSFetch.Dispose();
                            }
                        }
                        #endregion

                        #endregion
    
  4. Can the above code work for copying from SQL Server to MySQL? If yes, how can we implement it?
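
Regarding the BULK INSERT suggestion in the first reply, here is a minimal, assumption-based sketch of how it could be issued from the console application when the data file is accessible to the SQL Server itself; the file path and CSV format options are hypothetical.

using (SqlConnection conn = new SqlConnection(devConnString))
{
    // BULK INSERT executes on the SQL Server, so the file path must be
    // visible to the database server, not to the application machine.
    string bulkInsertSql = @"
        BULK INSERT dbo.millionRecordsDataTable
        FROM 'C:\data\millionRecords.csv'
        WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2, TABLOCK);";

    using (SqlCommand cmd = new SqlCommand(bulkInsertSql, conn))
    {
        cmd.CommandTimeout = 0; // no timeout for a long-running load
        conn.Open();
        cmd.ExecuteNonQuery();
    }
}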
