Exporting and Downloading Data Sets

The DiVA API enables you to define a Marqeta platform data set and export it as a compressed CSV or XLSX file. You can choose between Zip and Gzip compression. After the export completes, you use the API to download the compressed file.

Exporting a data set as a file

You can export any data set as either a CSV or XLSX file by sending a GET request to the appropriate endpoint. To construct your endpoint URL, start with the URL you would use to retrieve that same data set in JSON format, for example:

     /views/authorizations/month?program=my_program

Then insert the export_type path parameter (either /csv or /xlsx) before the query string, for example:

     /views/authorizations/month/csv?program=my_program

By default, the resulting data set is compressed as a gz file. You can compress it as a zip file by including the compress query parameter, for example:

     /views/authorizations/month/csv?compress=zip&program=my_program
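
The URL construction described above can be sketched in Python as follows. This is a minimal illustration rather than part of the DiVA API: the helper name build_export_url and its parameters are hypothetical, and the sketch assumes that zip is the only value you would pass for the compress query parameter.

```python
from urllib.parse import urlencode

def build_export_url(view_path, export_type, zip_compress=False, **query):
    """Return the export URL path for a data set (hypothetical helper)."""
    if export_type not in ('csv', 'xlsx'):
        raise ValueError('export_type must be "csv" or "xlsx"')

    params = {}
    if zip_compress:
        # Omitting compress leaves the default gz compression in effect
        params['compress'] = 'zip'
    params.update(query)

    # Insert the export type after the view path and before the query string
    url = view_path.rstrip('/') + '/' + export_type
    if params:
        url += '?' + urlencode(params)
    return url

print(build_export_url('/views/authorizations/month', 'csv',
                       program='my_program'))
# prints /views/authorizations/month/csv?program=my_program
print(build_export_url('/views/authorizations/month', 'csv',
                       zip_compress=True, program='my_program'))
# prints /views/authorizations/month/csv?compress=zip&program=my_program
```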

Because the export operation is processed asynchronously, you should receive an immediate 202 Accepted response. The JSON-formatted response body contains a token that you use to download your data-set file, for example:

{
  "token": "db63c24d8307c24b7e17d33735114dc8f807838a.csv.gz"
}


Downloading the exported file

A file export contains at most 1,048,576 rows, and the file can take several minutes to generate.

To retrieve your file, send a GET request to the /download?token={my_download_token} endpoint, where {my_download_token} is the value of the token field that was returned in response to your export request, for example:

     /download?token=db63c24d8307c24b7e17d33735114dc8f807838a.csv.gz

Note: The token value includes two filename extensions (for example, .csv.gz). You must include these extensions in your request URL.

The API returns one of these responses:

  • If the job is not finished – the 202 "Accepted" HTTP response code and a plain-text body containing the word Pending.
  • If the job is finished – the 200 "OK" HTTP response code and the file as an application/octet-stream.
  • If the job has expired – the 410 "Gone" HTTP response code. Completed jobs expire after 60 minutes.
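
The three responses listed above suggest a simple polling loop, sketched here in Python. This is an illustration under stated assumptions, not a prescribed client: poll_download, ExportExpired, and the fetch callable are hypothetical names, and the default interval and maximum wait are arbitrary choices. The fetch callable is assumed to send one GET request to the /download URL and return the status code together with the response body.

```python
import time

class ExportExpired(Exception):
    """Raised when the download endpoint answers 410 Gone (hypothetical)."""

def poll_download(fetch, interval=5.0, max_wait=600.0):
    """Call fetch() until the file is ready, then return its bytes.

    fetch is a caller-supplied function returning (status_code, body).
    """
    deadline = time.monotonic() + max_wait
    while True:
        status, body = fetch()
        if status == 200:        # job finished: body is the file content
            return body
        if status == 410:        # job expired: a new export is needed
            raise ExportExpired('export expired; request a new export')
        if status != 202:        # anything else is outside the documented set
            raise RuntimeError('unexpected status code: %d' % status)
        if time.monotonic() >= deadline:
            raise TimeoutError('export still pending after max_wait seconds')
        time.sleep(interval)     # 202 Pending: wait, then ask again

# Simulated server that reports Pending twice before returning the file
responses = iter([(202, b'Pending'), (202, b'Pending'), (200, b'file-bytes')])
print(poll_download(lambda: next(responses), interval=0))
# prints b'file-bytes'
```

In real use, fetch would wrap a requests.get call against the /download URL and return the response's status_code and content.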

When saving your file, use the same filename extensions that appear in the token you sent in your request URL, for example: my_downloaded_file.csv.gz
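
The naming rule above can be sketched as a small Python helper. The name local_filename is hypothetical, and the sketch assumes that the part of the token before the first period contains no periods of its own, as in the sample token.

```python
def local_filename(token, stem):
    """Return stem plus the extensions carried by the token (hypothetical helper)."""
    # Everything after the first period is treated as the extensions
    _, dot, extensions = token.partition('.')
    if not dot or not extensions:
        raise ValueError('token has no filename extensions')
    return stem + '.' + extensions

print(local_filename('db63c24d8307c24b7e17d33735114dc8f807838a.csv.gz',
                     'my_downloaded_file'))
# prints my_downloaded_file.csv.gz
```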

The following Python example shows how to export a data set in CSV format and download the resulting file:

import requests
from requests.auth import HTTPBasicAuth
import time
import pandas as pd

# Constants for HTTP response codes
RC_SUCCESS = 200
RC_ACCEPTED = 202
RC_UNAUTHORIZED = 401

# Build the HTTP Basic authentication object
username = 'APPLICATION_TOKEN' # replace APPLICATION_TOKEN with your application token
password = 'ACCESS_TOKEN' # replace ACCESS_TOKEN with your access token
basic_auth = HTTPBasicAuth(username, password)

# Download an exported file with the specified token
# Parameters:
# file_token - token of the file to download
# auth - HTTP Basic authentication object
# base_url - base API path for the download URL
# retry_seconds - maximum time to keep retrying, in seconds
# Returns a pandas DataFrame on success, or the string 'no timely response'
def getCSV(file_token, auth, base_url, retry_seconds = 300):

    # Set timeout to current time plus maximum time to retry
    timeout = time.time() + retry_seconds

    # Build URL to download exported file
    download_file_url = base_url + '/download?token=' + file_token

    # Check whether the file is ready for download
    code = requests.head(download_file_url, auth = auth).status_code
    while (code != RC_SUCCESS) and time.time() < timeout:
        time.sleep(1)
        # Retry the status check
        code = requests.head(download_file_url, auth = auth).status_code

    if code == RC_SUCCESS:  # the file is ready to download
        download_response = requests.get(download_file_url, auth = auth)

        # Save the response content into a temporary file
        with open('temp.csv.gz', 'wb') as file:
            file.write(download_response.content)

        # Read the CSV content from the gzipped file, skipping malformed rows
        # (on_bad_lines replaces error_bad_lines, which pandas 2.0 removed)
        data_out = pd.read_csv('temp.csv.gz', compression = 'gzip',
                               on_bad_lines = 'skip')

    else:
        data_out = 'no timely response' # status check timed out

    return data_out

# Build URL to export dataset for resource of interest (e.g. cards) in desired file format (e.g. CSV)
api_base_path = 'https://diva-api.marqeta.com/data/v2'
resource_format_path = '/views/cards/detail/csv'
program_selector = '?program=MY_PROGRAM' # replace MY_PROGRAM with the name of your program
export_dataset_url = api_base_path + resource_format_path + program_selector

# Invoke request to export the dataset
export_response = requests.get(export_dataset_url, auth = basic_auth)

if export_response.status_code == RC_ACCEPTED: # export request succeeded
   
    # Obtain the CSV file token from the response
    export_file_token = export_response.json().get('token')
   
    # Call the getCSV function to download the CSV file
    data = getCSV(file_token = export_file_token, auth = basic_auth, base_url = api_base_path)
   
    if isinstance(data, str): # getCSV returns a string only when it times out
        print('Failure: No timely response')
    else:
        print('Success: Dataset length = ' + str(len(data)))

elif export_response.status_code == RC_UNAUTHORIZED:
    print('Failure: Unauthorized access') # authentication failed

else:
    print('Failure: Unknown error') # export request failed