Python integration
The timeseries_client.py file is an API wrapper for AQTS systems for easy consumption in Python projects.
The AQUARIUS Samples API is also supported.
Requirements: Python 3.7-or-greater
Are you stuck in the Python 2.7 past? This older wrapper version should still work, but seriously, Python 2.x is dead. Join the 21st century. (2to3 is your friend for quickly bringing your old code into the new world.)
The timeseries_client.py wrapper class uses the awesome Requests for Humans package, plus some timezone parsing packages. Install the packages via pip.
$ pip install requests pytz pyrfc3339
The timeseries_client.py wrapper exposes a few helper methods to make your python integrations simpler.
| Method | Description |
|---|---|
| `iso8601(datetime)` | Converts a Python datetime into an ISO 8601 text string. |
| `datetime(text)` | Converts an ISO 8601 text string into a python datetime. |
| `getTimeSeriesUniqueId(timeSeriesIdentifier)` | Gets the unique ID from a text identifier. Will raise a `ModelNotFoundException` if the location or time-series does not exist. |
| `getLocationData(locationIdentifier)` | Gets the attributes of a location. |
| `getReportList()` | Gets the list of generated reports on the system. |
| `deleteReport(reportUniqueId)` | Deletes a specific generated report. |
| `uploadExternalReport(locationUniqueId, pathToFile, title)` | Uploads a file as an externally-generated report. |
| `getTimeSeriesCorrectedData(identifier, queryFrom, queryTo)` | Gets the corrected signal for one series. |
| `flattenResponse(timeSeriesDataResponse)` | Projects points/grades/qualifiers/methods/notes to each point. |
| `getTimeSeriesData(seriesIds, queryFrom, queryTo)` | Gets data for up to 10 series, using interpolation rules to time-align the data from series 2 through 10 to the first series. |
| `appendPoints(series_identifier_or_unique_id, points, start, end)` | Appends points to a basic series. If start and end are specified, an overwrite append will be performed. Returns an identifier which can be used in subsequent `waitForCompletedAppendRequest()` or `getAppendStatus()` calls. |
| `appendReflectedPoints(series_identifier_or_unique_id, points, start, end)` | Appends points to a reflected series. Returns an identifier which can be used in subsequent `waitForCompletedAppendRequest()` or `getAppendStatus()` calls. |
| `getAppendStatus(append_request_identifier)` | Returns the status of an append request. |
| `waitForCompletedAppendRequest(append_request_identifier, timeout)` | Waits for an append request to be completed. The timeout defaults to 5 minutes. |
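For instance, the append helpers can queue points on a series and then block until AQTS finishes processing them. Here is a minimal sketch, assuming the optional start/end arguments can be omitted for a normal append and that points use the same `{'Time', 'Value'}` dictionaries shown in the Acquisition examples below (the server name, credentials, series identifier, and values are placeholders):

```python
from datetime import datetime

import pytz

from timeseries_client import timeseries_client

with timeseries_client('myserver', 'myusername', 'mypassword') as timeseries:
    # Build timezone-aware points (iso8601() rejects naive datetimes)
    utc = pytz.UTC
    points = [
        {'Time': timeseries.iso8601(datetime(2017, 3, 22, 12, 0, 0, 0, utc)), 'Value': 24},
        {'Time': timeseries.iso8601(datetime(2017, 7, 1, 4, 0, 0, 0, utc)), 'Value': 150},
    ]

    # Queue the append; either a text identifier or a unique ID is accepted
    append_id = timeseries.appendPoints('Stage.Working@MyLocation', points)

    # Block until the append job completes (default timeout is 5 minutes)
    status = timeseries.waitForCompletedAppendRequest(append_id)
```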
- 2022-Mar-06 - Fixed an IndexOutOfRangeError in the `flattenResponse()` method
- 2022-Feb-28 - Added the `_create_authenticated_endpoint()` method to access extended API endpoints
- 2021-Dec-14 - Improved parsing of location identifiers, and automatically re-authenticate if a session times out.
- 2021-Sep-04 - Improved format of web service error messages for AQTS and AQSamples
- 2021-Sep-03 - `timeseries_client.datetime()` now handles "24:00" timestamps correctly
- 2021-Sep-01 - Fairly big internal refactoring, with minimal breaking external changes:
  - Dropped Python 2.x support
  - Added an improved User Agent header to all requests
  - Fixed `send_batch_requests()` for AQTS 2021.1+ while still working with AQTS 2020.4-and-older
  - Added AQUARIUS Samples API support
- 2020-Dec-07 - Added the `flattenResponse()` method to project grades, approvals, qualifiers, and notes to points
- 2020-Sep-03 - Eliminated the `.json()` ceremony around each API operation
- 2019-Dec-13 - Added field visit upload helper method
- 2018-Dec-10 - Added some helper methods
- 2017-Jun-08 - First release
Only major changes are listed above. See the change log for the detailed history of the Python API wrapper.
Step 1 - Import the API wrapper
# Import the class into your environment
>>> from timeseries_client import timeseries_client
Step 2 - Connect to the AQTS server using your credentials.
The hostname parameter of the timeseries_client constructor supports a number of formats:
- `'myserver'` - Simple DNS name
- `'123.231.132.213'` - IP address (IPv4 or IPv6)
- `'http://myserver'` - HTTP URI
- `'https://myserver'` - HTTPS URI (if you have enabled HTTPS on your AQTS server)
- `'https://myinstance.aquaticinformatics.net'` - HTTPS URI for an AQUARIUS Cloud instance
# Connect to the server
>>> timeseries = timeseries_client('myserver', 'myusername', 'mypassword')
Now the timeseries object represents an authenticated AQTS session.
Note: AQTS API access from python requires a credentialed account. You cannot use an ActiveDirectory or OpenIDConnect account for API access.
Step 3 - Make requests from the public API endpoints
The timeseries object has publish, acquisition, and provisioning properties that are Session objects, enabling fluent API requests from those public API endpoints, using the get(), post(), put(), and delete() methods.
# Grab the list of parameters from the server
>>> parameters = timeseries.publish.get('/GetParameterList')["Parameters"]
Your authenticated AQTS session will expire one hour after the last authenticated request is made.
It is a recommended (but not required) practice to disconnect from the server when your code is finished making API requests.
Your code can choose to immediately disconnect from AQTS by manually calling the disconnect method of the timeseries object.
>>> timeseries.disconnect()
# Any further requests made will fail with 401: Unauthorized
You can also wrap your code in a python with statement, to automatically disconnect when the code block exits (even when an error is raised).
>>> with timeseries_client('localhost', 'admin', 'admin') as timeseries:
... parameters = timeseries.publish.get('/GetParameterList')["Parameters"]
...
>>> # We're logged out now.
Any errors contained in the HTTP response will be raised through the raise_for_status() method from the Requests package.
# Issue a request to an unknown route
>>> r = timeseries.publish.get('/someinvalidroute')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "timeseries_client.py", line 39, in get
return response_or_raise(r)
File "timeseries_client.py", line 22, in response_or_raise
response.raise_for_status()
File "C:\Python27\lib\site-packages\requests\models.py", line 862, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http://myserver/AQUARIUS/Publish/v2/someinvalidroute
Your code can use standard python error handling techniques (like try ... except ... blocks) to handle these exceptions as you'd like.
from requests.exceptions import HTTPError

try:
    data = timeseries.publish.get('/GetSomeInvalidOperation')
    print('Yay it worked!')
except HTTPError as error:
    print(f'Something went wrong: {error}')
The publish, acquisition, and provisioning Session objects expose get(), post(), put(), and delete() methods for making authenticated HTTP requests and returning response objects.
The standard API request/response pattern is essentially response = timeseries.{endpoint}.{verb}('/route', request params...)
- Where `{endpoint}` is one of: `publish`, `acquisition`, or `provisioning`
- Where `{verb}` is one of: `get`, `post`, `put`, or `delete`
- Where `/route` is the route required by the REST operation
- Where `request params...` are any extra request parameters, passed in the URL or in the body. See below for examples.
- The returned JSON response is automatically converted into a python dictionary, or `None` if a `204 (No Content)` response is received.
Remember that any HTTP errors will automatically be raised by the wrapper class.
See the Requests quickstart guide for full details.
The HTTP spec does not permit GET requests to include a JSON payload.
Any GET request parameters not contained within the route should be specified as a Python dictionary in the params argument of the get() method. The Requests library will automatically perform URL encoding of these query parameters, to ensure that the HTTP request is well-formed.
# Get a list of time-series at a location
>>> payload = {'LocationIdentifier': 'MyLocation'}
>>> list = timeseries.publish.get('/GetTimeSeriesDescriptionList', params=payload)["TimeSeriesDescriptions"]
Non-GET requests should specify any request parameters as a Python dictionary in the json argument. The Requests library will convert the dictionary to a JSON stream and send the request.
# Create a new location
>>> payload = {'LocationIdentifier': 'Loc2', 'LocationName': 'My second location', 'LocationPath': 'All Locations', 'LocationType': 'Hydrology Station'}
>>> location = timeseries.provisioning.post('/locations', json=payload)
# Change the display name of an existing parameter
# First fetch all the parameters in the system
>>> parameters = timeseries.provisioning.get('/parameters')['Results']
# Find the Stage parameter by parameter ID
>>> stage = next(p for p in parameters if p['ParameterId'] == 'HG')
# Change the identifier
>>> stage['Identifier'] = 'Stagey Thing'
# Issue the PUT request with the modified object
>>> timeseries.provisioning.put('/parameters/'+stage['UniqueId'], json=stage)
"The API didn't work" - Usually that isn't true.
TL;DR - Run a web proxy like Fiddler in the background, while your Python script runs. If something fails unexpectedly, look at the traffic for a big red line indicating a failed API operation.
REST APIs rarely actually fail. If your HTTP request receives an HTTP response, then the API operation has technically "worked as designed".
Most of the time, an unexpected server response raises an HTTP error in your Python script because:
- the request might be incorrect, causing the operation to respond with a 4xx response status.
- a server-side error might have been encountered, causing the operation to respond with a 5xx response status.
But receiving a 4xx or 5xx status code (instead of the expected 2xx status code for successful operations) is still a working API.
The more common problem is that your script isn't handling errors in a meaningful way, but it should. See the Error Handling section for details.
Usually the true source of the error can be quickly understood by examining the HTTP request and its HTTP response together. There is usually a clue somewhere in those two items which explains why things are going sideways.
The API wrapper includes automatic support for the excellent-and-free Fiddler web debugging proxy for Windows.
If your Python script is run on a Windows system while Fiddler is running in the background, all the API requests from your script will be routed through Fiddler.
When an API request fails, the server will respond with a 4xx or 5xx status code, which shows as a red session in the Fiddler capture window.
Fiddler does have support for HTTPS traffic capture, but it needs to be explicitly enabled on your development system, so that it can install a self-signed certificate to intercept and re-encrypt the traffic.
If your target AQUARIUS server has HTTPS enabled (it should, and all our AQUARIUS Cloud instances only accept HTTPS requests), then you will need to do two things to allow Fiddler to capture your script's API traffic:
- Configure Fiddler to capture HTTPS traffic and trust the Fiddler Root certificate.
- Specify the `verify=False` parameter when connecting, to tell Python to ignore certificate validation errors
# Connect to the HTTPS server, allowing for Fiddler traffic interception
>>> timeseries = timeseries_client('https://myserver.aquaticinformatics.net', 'myusername', 'mypassword', verify=False)
There is a small one-time price paid when your script is run, since the API wrapper attempts to detect if a Fiddler.exe process is running on your system.
If you don't want your script's traffic to be routed through Fiddler (or to not even try to detect if Fiddler is running), you can do any of the following before connecting to the server:
- Set the `PYTHON_DISABLE_FIDDLER` environment variable to any value
- Set the `http_proxy` or `https_proxy` environment variables to any string value, including an empty string `""`
Since python inherits its environment variables from the operating system, you can set PYTHON_DISABLE_FIDDLER on your system, and all Python scripts run on your system will avoid the Fiddler-detection logic.
import os
# Disable any automatic Fiddler capture
os.environ['PYTHON_DISABLE_FIDDLER'] = '1'  # environment values must be strings
# Now connect to your server
timeseries = timeseries_client('https://myserver.aquaticinformatics.net', 'myusername', 'mypassword')
# ... make API requests ...
You will need to refer to the appropriate AQUARIUS Time-Series API reference guide for any request-specific details. Simply browse to the API endpoint to view the API reference guide.
http://myserver/AQUARIUS/Publish/v2 will show the Publish API Reference Guide.
Also be sure to read the Common API Reference Guide section on JSON serialization, which describes the expected JSON formats for various data types.
AQTS APIs use ISO 8601 timestamps to represent times unambiguously, either as UTC times or with an explicit offset from UTC.
`yyyy-MM-ddTHH:mm:ss.fffffffZ`
or: `yyyy-MM-ddTHH:mm:ss.fffffff+HH:mm`
or: `yyyy-MM-ddTHH:mm:ss.fffffff-HH:mm`
Up to 7 digits can be specified to represent fractional seconds, yielding a maximum resolution of 100 nanoseconds. Fractional seconds are completely optional. All other fields (including the T separating the date and time components) are required.
The wrapper class exposes two helper methods for converting between ISO 8601 timestamp strings and python datetime objects.
The datetime(isoText) method converts an ISO 8601 string into a python datetime object.
The iso8601(dt) method does the reverse, converting a python datetime into an ISO 8601 timestamp string that AQTS can understand.
>>> ts['LastModified']
'2016-09-12T23:12:37.9704111+00:00'
>>> timeseries.datetime(ts['LastModified'])
datetime.datetime(2016, 9, 12, 23, 12, 37, 970411, tzinfo=<UTC>)
>>> timeseries.iso8601(timeseries.datetime(ts['LastModified']))
'2016-09-12T23:12:37.970411Z'
Dealing with timestamps unambiguously and correctly can be incredibly difficult. (Trust us! It's our job!) Also, see this if you are curious.
Every major programming language we've dealt with over the years has gotten datetime handling wrong on its first try. This is true for C, C++, Java, and .NET. It should come as no surprise that Python messed it up as well.
Using python datetime objects without locking them down to an unambiguous offset from UTC is surprisingly common and is fraught with error.
For instance, did you know that datetime.utcnow() does not create a UTC timestamp, as its name implies? Instead it just grabs the current UTC time and throws away the fact that it came from the UTC timezone! (sigh)
Python refers to a datetime object without a known timezone as a naive datetime.
The iso8601(dt) helper method will raise an error if it is given a naive datetime.
>>> from datetime import datetime
>>> timeseries.iso8601(datetime.now())
... Boom!
ValueError: naive datetime and accept_naive is False
>>> timeseries.iso8601(datetime.utcnow())
... Boom!
ValueError: naive datetime and accept_naive is False
The decision to reject naive datetimes will force your python code to be explicit about the timestamps it provides to AQTS.
To construct correct datetime objects, you will need to use the pytz library and associate every datetime with a timezone (the simplest being UTC).
>>> from datetime import datetime
>>> import pytz
>>> utc = pytz.UTC
>>> world_water_day = datetime(2017, 3, 22, 12, 0, 0, 0, utc) # Noon UTC
>>> canada_day = datetime(2017, 7, 1, 4, 0, 0, 0, utc) # Midnight, Ottawa
>>> timeseries.iso8601(world_water_day)
'2017-03-22T12:00:00.000000Z'
>>> timeseries.iso8601(canada_day)
'2017-07-01T04:00:00.000000Z'
>>> timeseries.iso8601(timeseries.datetime('2017-07-01T00:00:00-04:00'))
'2017-07-01T04:00:00.000000Z'
This example will use the /GetTimeSeriesDescriptionList operation from the Publish API to find the unique ID of a time-series.
Most of the time, we refer to an AQTS time-series by its identifier string, in <Parameter>.<Label>@<Location> format.
But parameter names, labels, and location identifiers can change over time.
Each time-series has a UniqueId string property which remains unchanged for the lifetime of the time-series.
Many AQTS APIs which operate on a time-series require this UniqueId value as an input.
>>> identifier = "Stage.Working@MyLocation"
# Parse out the location from the time-series identifier string
>>> location = identifier.split('@')[1]
>>> location
'MyLocation'
# Grab all the time-series at the location
>>> descriptions = timeseries.publish.get('/GetTimeSeriesDescriptionList', params={'LocationIdentifier':location})["TimeSeriesDescriptions"]
# Use a list comprehension to find the exact match
>>> ts = [d for d in descriptions if d['Identifier'] == identifier][0]
# Now grab the UniqueId property
>>> tsUniqueId = ts['UniqueId']
>>> tsUniqueId
'4d5acfc21eb44ab6902dc6547ab82935'
Since this operation is common enough, the wrapper includes a getTimeSeriesUniqueId() method which does this work for you.
# Use the helper method to do all the work
>>> tsUniqueId = timeseries.getTimeSeriesUniqueId("Stage.Working@MyLocation")
>>> tsUniqueId
'4d5acfc21eb44ab6902dc6547ab82935'
The getTimeSeriesUniqueId() method is also smart enough to recognize unique IDs as input, so if your code passes in a unique ID, the method just returns it as-is. This gives your code a bit more flexibility in the types of arguments it can accept for your scripting tasks.
>>> tsUniqueId = timeseries.getTimeSeriesUniqueId("4d5acfc21eb44ab6902dc6547ab82935")
>>> tsUniqueId
'4d5acfc21eb44ab6902dc6547ab82935'
This example will build on Example 1 and append 2 points to the time-series using the Acquisition API.
The 2 points to append will be:
| Time | Value | Description |
|---|---|---|
| `2017-03-22T12:00:00.000000Z` | 24 | 24th anniversary of World Water Day 2017, Noon UTC |
| `2017-07-01T04:00:00.000000Z` | 150 | 150th anniversary of Canada Day 2017, Midnight Ottawa |
>>> from datetime import datetime
>>> import pytz
>>> utc = pytz.UTC
>>> world_water_day = datetime(2017, 3, 22, 12, 0, 0, 0, utc) # Noon UTC
>>> canada_day = datetime(2017, 7, 1, 4, 0, 0, 0, utc) # Midnight, Ottawa
# Create the points array
>>> points = [{'Time': timeseries.iso8601(world_water_day), 'Value': 24},
...           {'Time': timeseries.iso8601(canada_day), 'Value': 150}]
# Append these points to the time-series from Example 1
>>> response = timeseries.acquisition.post('/timeseries/'+ts['UniqueId']+'/append', json={'Points': points})
>>> job = response['AppendRequestIdentifier']
>>> job
'775775'
# The points were queued up for processing by AQTS as append request #775775
# Poll the server for the status of that append job
>>> response = timeseries.acquisition.get('/timeseries/appendstatus/'+job)
>>> response
{'NumberOfPointsAppended': 2, 'NumberOfPointsDeleted': 0, 'AppendStatus': 'Completed'}
# The job status is no longer 'Pending' so we are done.
This example demonstrates how to fetch data about all the locations in your system.
Some data is not fully available in a single API call. One common use case is to find details of all the locations in your system.
Your code will need to make 1 + NumberOfLocations requests:
- An initial `GET /AQUARIUS/Publish/v2/GetLocationDescriptionList` request, to fetch the known location identifiers, unique IDs, names, and folder properties.
- Multiple `GET /AQUARIUS/Provisioning/v1/locations/{uniqueId}` requests
- Or multiple `GET /AQUARIUS/Publish/v2/GetLocationData?LocationIdentifier={locationIdentifier}` requests.
That NumberOfLocations might be very large for your system. Maybe thousands, or tens of thousands of locations.
(Using the Publish API is a bit slower to retrieve location details than Provisioning API, since the Publish API response also includes location names, datums, and reference point information, which take a bit more time to fetch from the database.)
You could make those multiple requests in a loop, one at a time:
# Fetch the initial location description list
locationDescriptions = timeseries.publish.get('/GetLocationDescriptionList')['LocationDescriptions']
# Fetch each location separately
locations = [timeseries.provisioning.get('/locations/'+loc['UniqueId']) for loc in locationDescriptions]
That loop may take 5-6 minutes to fetch 20K locations, depending mainly on:
- the speed of the network between your python code and your AQUARIUS application server
- the speed of the network between your app server and your database server
- the speed of your database server
This wrapper also includes a send_batch_requests() helper method, to make repeated API requests in small batches, which can often double your perceived throughput.
The send_batch_requests(url, requests) method takes a URL pattern and a collection of request objects to fetch.
- `url` is the URL you would normally use in the `get()` method. If the route contains parameters, enclose them in `{curlyBraces}`.
- `requests` is a collection of request objects
- `batch_size` is an optional parameter, which defaults to 100 requests per batch.
- `verb` is an optional parameter, which defaults to `GET`.
The same batch-fetch of Provisioning information may take only 2-or-3 minutes:
# Fetch the initial location description list
locationDescriptions = timeseries.publish.get('/GetLocationDescriptionList')['LocationDescriptions']
# Fetch the full location information from the Provisioning API (faster)
locations = timeseries.provisioning.send_batch_requests('/locations/{Id}', [{'LocationUniqueId': loc['UniqueId']} for loc in locationDescriptions])
# Or, alternatively fetch the full location information from the Publish API, but this is slower
locationData = timeseries.publish.send_batch_requests('/GetLocationData', [{'LocationIdentifier': loc['Identifier']} for loc in locationDescriptions])
Either approach is fine, but sometimes the batch-fetch might save a few minutes in a long-running script.
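If the default batch size of 100 is too aggressive for your server, the optional batch_size parameter described above can shrink each batch. A quick sketch, assuming batch_size is accepted as a keyword argument:

```python
# Fetch the locations in smaller batches of 50 requests each
locations = timeseries.provisioning.send_batch_requests(
    '/locations/{Id}',
    [{'LocationUniqueId': loc['UniqueId']} for loc in locationDescriptions],
    batch_size=50)
```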
The GET /AQUARIUS/Publish/v2/GetTimeSeriesCorrectedData operation (exposed by the getTimeSeriesCorrectedData() wrapper method) will return a list of points, plus separate lists of time-ranged metadata, where each item has a StartTime and EndTime property:
- Grades
- Approval Levels
- Method Codes
- Qualifiers
- Gap Tolerances
- Notes
Integrations will often want to know which metadata applies to which points. The flattenResponse() helper method will apply the appropriate metadata to each received point, so your integration can more easily put it in a table-shape for further analysis.
# Fetch a series from a specific start time
series = client.getTimeSeriesCorrectedData('Stage.Telemetry@5007', queryFrom='2022-02-01T00:00:00-05:00')
# each series['Points'][i] will have two properties: 'Timestamp' and 'Value'
# Now flatten the metadata, projecting each metadata onto each point
client.flattenResponse(series)
# each series['Points'][i] will have six more properties added: 'GradeCode', 'Approval', 'Method', 'GapTolerance', 'Qualifiers', and 'Notes'
The AQUARIUS Samples API can also be consumed from your Python scripts using the API wrapper classes.
The API wrapper usage for Samples is similar to the Time-Series API, with some minor differences:
- You authenticate using an API token, rather than an AQUARIUS Time-Series username and password.
- Browse to https://yourinstance.aqsamples.com/api and follow the instructions to obtain an API token.
- URLs for Samples API operations begin with a "/v1" or "/v2"
- Some `get()` requests can respond with many pages of data. Use the `paginated_get()` method instead
>>> from timeseries_client import SamplesSession
>>> samples = SamplesSession("https://myorg.aqsamples.com", "01234567890123456789012345678901")
>>> projects = samples.get("/v1/projects")
Some of the Samples API responses can return many results, split over multiple pages of data.
Dealing with paginated results is important, especially when the result count can be millions or tens-of-millions of items.
Some API operations, like get('/v1/projects') to fetch all the configured projects, return everything at once, since the list will never get too big. Other API operations, like get('/v2/observations'), may return tens of millions of records.
It is sometimes difficult to know which operations are going to be paginated and may require multiple GET requests to fetch all the data.
A paginated operation is one which meets all of these criteria:
- The request supports an optional `limit` parameter to control the size of each page of data.
- The request supports an optional `cursor` parameter to provide a continuation context for fetching the next page of data.
- The response includes a `totalCount` integer property, indicating how many items exist in the entire result set.
- The response includes a `cursor` string property, to be provided on the next GET request to fetch the next page of data.
- The response includes a `domainObjects` collection property, containing one page of items.
If any of the above 5 criteria are not met, then using the get() method to issue a single request will be sufficient.
When all 5 criteria are met, then you should use the paginated_get() method to fetch all the pages of matching data.
>>> from timeseries_client import SamplesSession
>>> samples = SamplesSession("https://myorg.aqsamples.com", "01234567890123456789012345678901")
>>> projects = samples.get("/v1/projects")
# Oops! This API is actually paginated
>>> locations = samples.get("/v1/samplinglocations")
WARNING: Only 100 of 4049 items received. Try using the paginated_get() method instead.
# Call the paginated version instead and be willing to wait for all pages to be fetched.
locations = samples.paginated_get("/v1/samplinglocations", params={'limit':1000})
Fetching next page of 1000 items ... 25% complete: 1000 of 4049 items received.
Fetching next page of 1000 items ... 49% complete: 2000 of 4049 items received.
Fetching next page of 1000 items ... 74% complete: 3000 of 4049 items received.
Fetching next page of 1000 items ... 99% complete: 4000 of 4049 items received.
The API wrapper will print a warning if your script makes a get() request that only returns a subset of the data, and will print a progress message as the paginated_get() method fetches each page of results.
These messages are enabled by default, to provide a hint that your script might need some rework, or might take a long time to complete. If your organization has 15 million Samples observations, then what looks like a simple call to observations = samples.paginated_get('/v2/observations') may actually take 3 weeks to complete!
Without these warnings and progress messages enabled, it might appear that your small script is hung, when in fact it is just doing a LOT of data retrieval.
The SamplesSession class has a callbacks dictionary, which allows you to override the default behavior of
these events:
- `samples.callbacks['pagination_warning']` - Uses `default_pagination_warning()` as the pagination warning event
- `samples.callbacks['pagination_progress']` - Uses `default_pagination_progress()` as the pagination progress event
- `samples.callbacks['on_connected']` - Uses `default_on_connected()` as the connection message
You can set any of these callbacks to None to disable any messages from being logged.
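For example, a script that runs unattended might want to silence the chattier messages while keeping the pagination warning as a safety net. A minimal sketch (the instance URL and token are placeholders):

```python
from timeseries_client import SamplesSession

samples = SamplesSession("https://myorg.aqsamples.com", "01234567890123456789012345678901")

# Silence the connection banner and the per-page progress messages
samples.callbacks['on_connected'] = None
samples.callbacks['pagination_progress'] = None

# The pagination warning stays enabled, so a plain get() on a paginated
# route will still hint that paginated_get() should be used instead
locations = samples.paginated_get("/v1/samplinglocations", params={'limit': 1000})
```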
You might not actually want to fetch all 15 million observations from your system.
If you do need all 15 million records, there isn't much you can do other than wait 3 calendar weeks for
paginated_get() to finish fetching the hundreds of thousands of pages of data, and hope that your Python interpreter has enough free memory (many gigabytes!) to store the entire collection of received records.
But most operations which support paginated results also support optional request parameters, which can be used to filter the large result space into smaller pieces:
- Filtering by time is quite common
- Filtering by sampling location or project is also common
- Filters are additive, so the more filters you provide, the smaller your result set becomes.
Please refer to the Samples API Swagger page to see the list of filter parameters which can be applied to each paginated API operation.
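As a sketch of the idea (the filter parameter names below are hypothetical placeholders; take the real names from the Swagger page for your instance):

```python
from timeseries_client import SamplesSession

samples = SamplesSession("https://myorg.aqsamples.com", "01234567890123456789012345678901")

# Combine additive filters to shrink a huge observation query.
# 'startObservedTime', 'endObservedTime', and 'projectIds' are illustrative
# placeholders, not confirmed parameter names.
observations = samples.paginated_get("/v2/observations", params={
    'limit': 1000,
    'startObservedTime': '2022-01-01T00:00:00Z',
    'endObservedTime': '2022-02-01T00:00:00Z',
    'projectIds': 'MY-PROJECT',
})
```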