One of the most common questions I get about uploading data to Anaplan is what the optimal chunk size is. This post reviews my results for file uploads using cURL, Informatica, and Anaplan Connect v1.4.
First, a bit of background on chunking. Files in Anaplan are stored in chunks, unless the file is under 1 MB, which is the minimum chunk size. This is primarily to optimize upload speed, as well as to minimize the load on the API server. With version 1.4 of Anaplan Connect we gave users the ability to define the chunk size, an option that is also available in Mulesoft, Informatica, and Boomi. Our API accepts chunked uploads, but it is the client software that splits the file into chunks. If you want to use the API with another tool, you must implement chunking yourself. We accept any chunk size from 1 MB to 50 MB, and we recommend setting the value to 1% of the file size or 50 MB, whichever is smaller.
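To make the recommendation concrete, here is a minimal Python sketch of how a custom client might pick a chunk size and split a file before uploading. It is only an illustration, not how Anaplan Connect or any ETL connector does it. The names `recommended_chunk_size` and `iter_chunks` are hypothetical, and I am assuming 1 MB means 1,048,576 bytes and that the 1 MB minimum takes precedence when 1% of the file is smaller than that. The upload call itself is left out on purpose.

```python
import io
from typing import BinaryIO, Iterator

MB = 1024 * 1024       # assumption: 1 MB is treated as 1,048,576 bytes
MIN_CHUNK = 1 * MB     # smallest accepted chunk size
MAX_CHUNK = 50 * MB    # largest accepted chunk size


def recommended_chunk_size(file_size: int) -> int:
    """Return 1% of the file size, clamped to the 1 MB to 50 MB range."""
    one_percent = file_size // 100
    return max(MIN_CHUNK, min(one_percent, MAX_CHUNK))


def iter_chunks(stream: BinaryIO, chunk_size: int) -> Iterator[bytes]:
    """Yield successive chunks of at most chunk_size bytes from a binary stream."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


if __name__ == "__main__":
    # Demo with an in-memory "file" of 2 MB plus 5 bytes.
    data = io.BytesIO(b"x" * (2 * MB + 5))
    size = recommended_chunk_size(2 * MB + 5)
    for number, chunk in enumerate(iter_chunks(data, size), start=1):
        # A real client would send each chunk to the API here.
        print(f"chunk {number}: {len(chunk)} bytes")
```

With these assumptions, a 500 MB file gets 5 MB chunks, anything up to 100 MB falls back to the 1 MB minimum, and files of 5,000 MB or more are capped at 50 MB.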
Below you can see the analysis of my results for Anaplan Connect, Informatica, and cURL. All my results so far show logarithmic growth, with diminishing returns beyond a chunk size of about 10 MB. The results for Anaplan Connect are also a bit slower than for cURL, but I suspect this is due to where I conducted the tests. Both the cURL and Informatica uploads were run from the US, while my Anaplan Connect testing was conducted in Singapore. Our API server is located in the US, so this could account for the longer upload times for Anaplan Connect.
We're working on performing similar tests for all ETL offerings in each region, to provide more accurate details. Once they are complete, we will update this post with the new data.
Attached are the spreadsheets where you can review all the test results and full details.