cURL fetching an export from API intermittently failing with 500 Internal Server Error
Hi All,
We are trying to GET an export from the API, but it fails intermittently. When it succeeds, the response header contains "Content-Type: application/octet-stream"; when it fails, the response header contains "Content-Type: application/json;charset=UTF-8".
For reference, here is the verbose output when it fails with 500:
> GET /2/0/workspaces/{workspaceID}/models/{modelID}/files/{fileID}/ HTTP/1.1
> Host: api.anaplan.com
> User-Agent: curl/7.79.1
> Accept: */*
> authorization: AnaplanAuthToken ***masked***
> Content-Type: application/octet-stream
* Mark bundle as not supporting multiuse
< HTTP/1.1 500 Internal Server Error
< Date: Thu, 11 Aug 2022 09:17:47 GMT
< Content-Type: application/json;charset=UTF-8
< Transfer-Encoding: chunked
< Connection: keep-alive
< Cache-Control: no-cache
< Pragma: no-cache
< Expires: 0
< Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
< CF-Cache-Status: DYNAMIC
< Expect-CT: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
< X-Content-Type-Options: nosniff
< Server: cloudflare
< CF-RAY: 738fe151dd65756a-LHR
And here is the verbose output when it succeeds:
> GET /2/0/workspaces/{workspaceID}/models/{modelID}/files/{fileID}/ HTTP/1.1
> Host: api.anaplan.com
> User-Agent: curl/7.79.1
> Accept: */*
> authorization: AnaplanAuthToken ****masked***
> Content-Type: application/octet-stream
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Date: Thu, 11 Aug 2022 09:17:52 GMT
< Content-Type: application/octet-stream
< Content-Length: 203
< Connection: keep-alive
< Cache-Control: no-cache
< Pragma: no-cache
< Expires: 0
< Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
< CF-Cache-Status: DYNAMIC
< Expect-CT: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
< X-Content-Type-Options: nosniff
< Server: cloudflare
< CF-RAY: 738fe16e0bd8779d-LHR
This started recently, with no changes to the script; the model ID, workspace ID, and file/export numbers remain the same. Any help or pointers would be greatly appreciated.
Best Answer
-
Hi @Sridattam,
Based on the facts that:
- the two attempts are only 5 seconds apart
- the first fails, but the second succeeds
- It only started to happen recently
I'd ask: once you trigger the export, do you check whether the file is ready? I suspect that once you start the export, you immediately try to download it. That worked at the beginning (smaller file, different latency, dev environment), but now generating the file takes a few seconds. If you GET it immediately, it fails; 5 seconds later the file is ready, so the retry succeeds. So can you recheck that you are monitoring the task and only downloading the file once it has finished? As specified here: https://anaplanbulkapi20.docs.apiary.io/#MonitorExportTasks
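In other words, the fix is a polling loop before the download. A minimal sketch (the `fetch_state` callable is a hypothetical stand-in for a GET on the export task endpoint; interval and attempt counts are illustrative, not from the docs):

```python
import time

def wait_until_complete(fetch_state, interval=2.0, max_attempts=60):
    """Poll an export task until it leaves IN_PROGRESS, then return the final state.

    fetch_state: a callable wrapping GET .../exports/{exportId}/tasks/{taskId}
    that returns the task's "taskState" string.
    """
    for _ in range(max_attempts):
        state = fetch_state()
        if state != "IN_PROGRESS":
            return state  # e.g. "COMPLETE"
        time.sleep(interval)  # give the server time to generate the file
    raise TimeoutError("export task still in progress after polling")
```

Only once this returns a finished state should the script issue the GET for the file itself.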
Answers
-
Hi @M.Kierepka
First of all, thank you so much for the response.
Yes, we do submit an export action (POST) in our script before requesting the files.
I just ran the script and I see the output below, where taskState is "IN_PROGRESS". Do you mean we should wait for this taskState to be COMPLETE before we request (GET) the files?
{
  "meta" : {
    "schema" : "https://api.anaplan.com/2/0/models/{modelID}/objects/task"
  },
  "status" : {
    "code" : 200,
    "message" : "Success"
  },
  "task" : {
    "taskId" : "297A9BB774FD4C4787E55EF672B6CA2B",
    "taskState" : "IN_PROGRESS",
    "creationTime" : 1660235336530
  }
}
-
It's not about starting the task - I am sure you are POSTing an export, since in the end you do get the file 😉
What I am referring to (and linked to in my previous comment) is a GET request to the task ID that you get in the response to the POST. You need to make it (possibly several times) to know whether the task is completed. As you observed in your original POST response, it was still in progress.
You should basically follow this flow (like here: https://www.postman.com/apiplan/workspace/official-anaplan-collection/folder/18986564-71a14979-3d71-4913-af41-4aa95a8c1ad0?ctx=documentation):
- Send a POST to start the action (here an export, but it can be a process/import as well). In the server's response, you get the task ID of this run.
- Using the task ID you just obtained, monitor the progress of the task with GETs. Waiting can take several seconds, depending on the size of the file. Each response contains the task state, and you need to keep polling as long as it's in progress.
- Once you receive a response that the task has finished (which can be a failure for a process/import), only then download the data (the same applies to the failure dump of a process/import).
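The three steps above can be sketched roughly as follows. This is a hedged sketch, not the official client: the URL shapes are guessed from the question and the bulk API docs, authentication is omitted, and `session` stands for any `requests`-style object with `post`/`get` methods:

```python
import time

BASE = "https://api.anaplan.com/2/0"  # base URL as seen in the question

def run_export_and_download(session, workspace_id, model_id, export_id,
                            poll_interval=2.0, max_polls=150):
    # Step 1: POST to start the export action; the response carries a task ID.
    url = f"{BASE}/workspaces/{workspace_id}/models/{model_id}/exports/{export_id}/tasks"
    task_id = session.post(url, json={"localeName": "en_US"}).json()["task"]["taskId"]

    # Step 2: GET the task repeatedly until it leaves IN_PROGRESS.
    task = None
    for _ in range(max_polls):
        task = session.get(f"{url}/{task_id}").json()["task"]
        if task["taskState"] != "IN_PROGRESS":
            break
        time.sleep(poll_interval)
    if task is None or task["taskState"] == "IN_PROGRESS":
        raise TimeoutError("export task did not finish in time")
    if not task.get("result", {}).get("successful", False):
        raise RuntimeError("export task failed")

    # Step 3: only now GET the file (for exports, the file ID equals the export ID).
    file_url = f"{BASE}/workspaces/{workspace_id}/models/{model_id}/files/{export_id}"
    return session.get(file_url).content
```

The key point is simply that step 3 never runs before step 2 reports a finished task, which is exactly the race the original 500s point to.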
PS: I am the author of a still-in-progress API implementation in Python; it might help you understand some of these processes. Here is an example workflow of an export and an import (it uses naive busy-waiting - you might want to put some breaks between attempts to get the task status): https://github.com/DLZaan/apapi/blob/examples/examples/working_with_files.py
Here is the exact snippet that you might want to inspect:
def doing(response) -> bool:
    return response.json()["task"]["taskState"] != "COMPLETE"

# run export - you should get task ID, which you can use to monitor the job
e_task = conn.run_export(t["model_id"], t["export_id"]).json()["task"]["taskId"]
while doing(tsk := conn.get_export_task(t["model_id"], t["export_id"], e_task)):
    pass
# let's check if it was successful
print(tsk.text)
if not tsk.json()["task"]["result"]["successful"]:
    print("Export failed!")
    return
print("Export OK - downloading file")
# we use the fact (undocumented!) that for exports action_id=file_id
# for bigger files: data_out = conn.download_file(t["model_id"], t["export_id"])
data_out = conn.get_file(t["model_id"], t["export_id"]).content