Insert millions of records into a SQL Server table at once

In this post we will see how to insert a really large number of records into a SQL Server table at once. I decided to write about this because I recently found myself having difficulties while trying to deploy a specific table to an Azure SQL Database. There are several tools and options for SQL schema or data comparison, with Red Gate's SQL Compare and Visual Studio's SQL Server Data Tools being the most significant in my opinion. Sometimes, though, even those tools can disappoint you for unknown reasons right when you urgently need to deploy your new data. In my case, I had a table with 2 million records in my local SQL Server and I wanted to deploy all of them to the respective Azure SQL Database table. I'd never had problems deploying data to the cloud, and even when I had, due to certain circumstances (no comparison keys between tables, clustered indexes, etc.), there was always a workaround to fix the problem. In this case though, nothing seemed to work, so I decided to write some simple code in a console application to deploy the 2 million records.
In the following code I read all the records from the local SQL Server and, in a foreach loop, insert each record into the cloud table. Let's see it and discuss it.

using System;
using System.Data;
using System.Data.SqlClient;

class Program
    {
        private static string devConnString = @"Data Source=localhost;Persist Security Info=False;Initial Catalog=devDatabase;Integrated Security=SSPI;";
        private static string cloudConnString = @"Server=[cloud-server];Database=[cloudDatabase];User ID=[username];Password=[password];Trusted_Connection=False;Encrypt=True;Connection Timeout=30;";


        static void Main(string[] args)
        {
            DataTable millionRecordsDataTable = new DataTable();
            using (SqlConnection devConn = new SqlConnection(devConnString))
            {
                using (SqlCommand devCmd = new SqlCommand("SELECT * FROM dbo.millionRecordsDataTable", devConn))
                {
                    using (SqlDataAdapter devAd = new SqlDataAdapter(devCmd))
                    {
                        devAd.Fill(millionRecordsDataTable); // 2 million records
                    }
                }
            }
            // try to insert each record one by one into the cloud (bad practice for this situation)
            try
            {
                int counter = 0;

                foreach (DataRow row in millionRecordsDataTable.Rows)
                {
                    string sqlToRun = @"INSERT INTO [dbo].[millionRecordsDataTable] ([collumn1], [collumn2], [collumn3]) VALUES (@collumn1, @collumn2, @collumn3)";
                    using (SqlConnection cloudConn = new SqlConnection(cloudConnString))
                    {
                        using (SqlCommand cloudCmd = new SqlCommand(sqlToRun, cloudConn))
                        {
                            cloudCmd.CommandType = CommandType.Text;

                            cloudCmd.Parameters.Add("@collumn1", SqlDbType.Float).Value = float.Parse(row[0].ToString());
                            cloudCmd.Parameters.Add("@collumn2", SqlDbType.Float).Value = float.Parse(row[1].ToString());
                            cloudCmd.Parameters.Add("@collumn3", SqlDbType.NVarChar, 2).Value = row[2].ToString();
                            cloudConn.Open();
                            cloudCmd.ExecuteNonQuery();
                        }
                    }
                    Console.WriteLine("Added row " + counter++);
                }
            }
            catch (Exception exc)
            {
                Console.WriteLine(exc.Message);
            }

            Console.ReadKey();
        }
    }

While the preceding code is self-explanatory, easy to understand, and does the job, it has the huge drawback of its slow execution time. Think for a little about what is actually happening: for each record we read, we open a connection to the cloud server, copy the record into the table, close the connection, and so on until all 2,000,000 records are successfully copied. Following this pattern you need approximately half a second to copy each record to the cloud, so in order to copy the entire table you need 2,000,000 × 0.5 s = 1,000,000 seconds, or roughly 277 hours! I don't have that much time, so I simply changed the code to use SqlBulkCopy instead of the foreach loop. Take a look at the change in the following code.

static void Main(string[] args)
        {
            DataTable millionRecordsDataTable = new DataTable();
            using (SqlConnection devConn = new SqlConnection(devConnString))
            {
                using (SqlCommand devCmd = new SqlCommand("SELECT * FROM dbo.millionRecordsDataTable", devConn))
                {
                    using (SqlDataAdapter devAd = new SqlDataAdapter(devCmd))
                    {
                        devAd.Fill(millionRecordsDataTable); // 2 million records
                    }
                }
            }

            try
            {
                using (SqlConnection connection = new SqlConnection(cloudConnString))
                {
                    connection.Open();

                    using (SqlBulkCopy bulkcopy = new SqlBulkCopy(connection))
                    {
                        bulkcopy.DestinationTableName = "dbo.millionRecordsDataTable";
                        bulkcopy.BulkCopyTimeout = 50000;
                        try
                        {
                            bulkcopy.WriteToServer(millionRecordsDataTable);
                        }
                        catch (Exception ex)
                        {
                            Console.WriteLine(ex.Message);
                        }

                        connection.Close();
                    }
                }
            }
            catch (Exception exc)
            {
                Console.WriteLine(exc.Message);
            }

            Console.ReadKey();
        }

SqlBulkCopy is the ultimate solution when you need to copy a large amount of data into another table. You can find its documentation here, along with its respective options. The main difference between our first attempt and the latter is that SqlBulkCopy can send a batch of records at once instead of inserting them one by one. It took about half an hour to deploy all the records. The most significant properties are the following (taken from MSDN); a short sketch that uses them comes right after the list:

  1. BulkCopyTimeout: the number of seconds for the operation to complete before it times out
  2. DestinationTableName: the name of the destination table on the server
  3. BatchSize: the number of rows in each batch (at the end of each batch, the rows in the batch are sent to the server)
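
To see how these properties fit together, here is a minimal sketch (reusing the millionRecordsDataTable and cloudConnString from the code above, with illustrative values for BatchSize and NotifyAfter) that copies the data in batches of 10,000 rows and prints progress through the SqlRowsCopied notification:

using (SqlConnection connection = new SqlConnection(cloudConnString))
{
    connection.Open();

    using (SqlBulkCopy bulkcopy = new SqlBulkCopy(connection))
    {
        bulkcopy.DestinationTableName = "dbo.millionRecordsDataTable";
        bulkcopy.BulkCopyTimeout = 0;   // 0 means no timeout; otherwise the number of seconds to wait
        bulkcopy.BatchSize = 10000;     // rows sent to the server per batch (default 0 = everything in one batch)
        bulkcopy.NotifyAfter = 50000;   // raise SqlRowsCopied every 50,000 rows

        bulkcopy.SqlRowsCopied += (sender, e) =>
            Console.WriteLine("Copied {0} rows so far...", e.RowsCopied);

        bulkcopy.WriteToServer(millionRecordsDataTable);
    }
}

Picking a BatchSize is a trade-off: smaller batches let the server process the load in smaller pieces, while the default of 0 sends all rows as a single batch.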

The previous code encapsulates the whole process in a single cloud transaction, which means that you won't see any data in the cloud until the transaction is complete and committed. In my case, I was running the following commands against the Azure SQL server to make sure that the data was actually being deployed:

-- CHECK ALL uncommitted records
SELECT * FROM dbo.millionRecordsDataTable WITH (NOLOCK)

-- CHECK the number of uncommitted records
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED 
SELECT COUNT(*) FROM dbo.millionRecordsDataTable
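
If you want to control that commit yourself, SqlBulkCopy can also take part in an explicit transaction. Below is a minimal sketch (again reusing the table and connection string from above) that wraps the copy in a SqlTransaction, so the rows are committed only when Commit() is called and everything is rolled back if the copy fails halfway:

using (SqlConnection connection = new SqlConnection(cloudConnString))
{
    connection.Open();

    // The bulk copy enlists in this transaction; rows are only committed when Commit() is called
    using (SqlTransaction transaction = connection.BeginTransaction())
    {
        using (SqlBulkCopy bulkcopy = new SqlBulkCopy(connection, SqlBulkCopyOptions.Default, transaction))
        {
            bulkcopy.DestinationTableName = "dbo.millionRecordsDataTable";
            bulkcopy.BulkCopyTimeout = 0;

            try
            {
                bulkcopy.WriteToServer(millionRecordsDataTable);
                transaction.Commit();    // make all the rows permanent at once
            }
            catch (Exception ex)
            {
                transaction.Rollback();  // no partial data is left behind
                Console.WriteLine(ex.Message);
            }
        }
    }
}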

That's it. I hope you found this post helpful; in it we saw how to copy a large amount of data into a SQL Server table.

P.S.: Having reached 100,000 views in less than a year, I would like to thank you all for visiting this blog; it really means a lot to me. I wish I had more time to write even more. Stay tuned till next time!


Categories: ADO.NET, Best practices

4 replies

  1. If the application server and database server are the same, then we can also use BULK INSERT.

  2. That's true Partha, thanks for your comment. (A rough sketch of that approach follows the replies below.)

  3. I had the opposite need in a project once: to get data from a different database and pass it to another, and because of connectivity issues I could not get the entire thing at once. My solution was to get a bulk result set and pass it chunk by chunk. This is a small part of the code I used for getting the data; you might find it handy in similar cases:

                        #region Run Synch (By MaxFetchRows) 
    
                        #region Count All Table Rows From Current Table
                        string CountSQL= "Select Count(*) as RowCount From " + InternalTableName + " Where " + ActionColumnName + " = 'I' ";
                        int TotalRows = int.Parse(oDB2.ExecuteScalar(true, CountSQL));
                        #endregion
    
                        #region For Each Batch Of Rows (MaxFetchRows)
                        for (int z = MaxFetchRows + CurrentTableStartFromRow; z < TotalRows + MaxFetchRows; z = z + MaxFetchRows)
                        {
                            string Infimum = ((z - MaxFetchRows)).ToString();
                            string Supremum = (int.Parse(Infimum) + MaxFetchRows).ToString();
    
                            string FetchSQL =
                                              "Select " + strInternalFieldNames + ", " + ActionColumnName + " " +
                                              "From " + InternalTableName +
                                              " Where RRN(" + InternalTableName + ") > " + Infimum +
                                              " And RRN(" + InternalTableName + ") <= " + Supremum;
    
                            DataSet DSFetch = new DataSet();
                            DSFetch = oDB2.GetDataset(true, FetchSQL, InternalTableName);
    
                            #region If DsFetch == Null or 0 rows Move To Next Table 
                            if ((DSFetch == null) || (DSFetch.Tables[0].Rows.Count == 0))
                            {
                                string currval = "'No Dataset Fetched From Table [" + TableName + "]'";
                                oSQL.ExecuteNonQuery(OpenCloseSQLConn, "exec Synch_LogInsert " + currval + ", 'I', 'Dataset returned NULL or NoRows, Action I.', '" + FetchSQL + "'");
    
                                if (DSFetch != null)
                                    DSFetch.Dispose();
                            }
                            #endregion
                            else
                            {
                                retval = SynchInner(DSFetch, IsChild, isParams, OpenCloseSQLConn, ActionColumnName, TableId, TableName, InternalPrimaryKeys, IsComplex, InsertNames, MasterTableIds, UpdateNames, false, false);
    
    
                                DSFetch.Dispose();
                            }
                        }
                        #endregion
    
  4. Can the above code work for copying from SQL Server to MySQL? If yes, how can we implement it?
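
Following up on the first reply above: when the data already sits in a file that the database server itself can read, a T-SQL BULK INSERT issued from the application is another option. The sketch below is only an illustration; the CSV path and file layout are hypothetical, and this variant targets an on-premises SQL Server (Azure SQL Database can only bulk load from Azure Blob Storage):

// Hypothetical example: the CSV must be readable by the SQL Server instance itself,
// because BULK INSERT opens the file server-side.
string bulkInsertSql = @"BULK INSERT dbo.millionRecordsDataTable
                         FROM 'C:\data\millionRecords.csv'
                         WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2);";

using (SqlConnection conn = new SqlConnection(devConnString))
using (SqlCommand cmd = new SqlCommand(bulkInsertSql, conn))
{
    cmd.CommandTimeout = 0;   // no timeout for the long-running load
    conn.Open();
    cmd.ExecuteNonQuery();
}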
