Wednesday, 30 March 2016

Simple Data Import Test

Hi All,

Some time ago, one client asked me: "How fast can you upload data into the database". Unfortunately I had no answer for this question and actually no one has answer for this question. The problem here is just in the question itself - it has not enough information:
  • What data (documents) do you want to upload?
  • How many documents do you have?
  • How many lines per document do you have?
  • How many field per single line do you have?
  • How many Business Accounts do you have in the system?
  • How many Documents per Business Account?
  • Do you have Field-Level Audit?
  • Do you need to release documents right after import?
  • What do you want to do with this data after import?
  • Will some one else use system during import?
  • How do you plan to import data? 
    • Integration Services?
    • Web Service API?
    • Customization with direct import?
  • Do you plan to use multiple thread?

So there are a lot of questions. And especially it is very important to understand what type of data do you want to import. Sales orders will be definitely much slower that GL Transactions.

Here I want describe an optimistic scenario of importing simple tables into the database direct through Acumatica Framework. In this case we reading file through the code and uploading it directly to business logic container. We are personally responsible for correct flow of running data validation.

For testing I have prepared 2 different scenarios:
  • Uploading of data from Excel file directly to one table with some limited validation
  • Uploading of data into Journal Transactions as a separate batches with 500 or 1000 details lines.
IF you are interesting in result, welcome in this article details.

Disclaimer: This is my private investigation in the limited environment, without any optimization and on very simple data structure. Real live situation can be completely different.

Testing Environment is my personal laptop:
  • Processor - Intel(R) Core (TM) i7-3610QM CPU @ 2.30GHz
    • 4 CPU  Cores
  • Memory - 8.00 GB (7.70 GB is usable)
  • Operation System - Windows 7 x64
  • Sql Server 2014 Developer Edition
  • Acumatica 5.30.1367 in the trial mode
Testing Data
  • Excel file with 50 000 lines of exported GL transactions.

Testing Scenario 1
Uploading data to single table into the database. All  I have created a separate table for this task. I save records to the database every 1000 lines.
  • 50 000 lines + One thread - 105.99 seconds
  • 100 000 lines + Two threads - 107.16 seconds
  • 200 000 lines + Two threads - 223.18 seconds
As the result we can see that 1 core can process about 933 lines per second.
Also we can see that this process can be easily scaled with several cores.

Testing Scenario 2
Uploading data into the Journal Transactions graph. I create a separate batch document for each 1000 transactions lines. I do not want to care about balancing batches, so i save it on hold.
  • 50 000 lines + One thread - 415.05 seconds
  • 100 000 lines + Two threads - 429.68 seconds
As the result we can see that 1 core can process about 232 lines in a second.
Also we can see that this process can be easily scaled with several cores.

Bonus Testing Scenario
As a bonus, want to provide you a simple testing scenario of uploading Sales Orders into Acumatica with integration services. This is single thread.

Total Time of  Import 1024 Sales orders:
Timer Lines per Order
00:13:42 2 Lines
00:25:08 4 Lines
00:50:34 8 Lines

Dependency from lines of sales order is liner, so we can forecast it as 289,69 seconds for one order with 1024 lines.

Graph code snippet for my testing scenario:

Have a fast importing!

No comments: