[NDH-640][SPIKE] Replacing a base table with a materialized view #283
+208
−75
Warning
This PR is a work in progress and intended for demonstration purposes. Some functionality is broken by this change.
Jira Ticket NDH-640
Problem
The /fhir/Organization/ API demonstrates a SQL query pattern with extremely high operational costs in terms of database latency.
In our testing environment with a closer-to-full-size dataset, requesting the 1001st page with page_size of 100 takes almost 20 seconds.
In simple load testing (10 URLs requested three at a time) I was able to crash the nonproduction service ~60% of the time.
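To make the cost of that request concrete, here is a minimal sketch of the arithmetic behind it. It assumes the endpoint uses 1-indexed page numbers translated into SQL LIMIT/OFFSET, which this description does not state explicitly; limit_offset is a hypothetical helper, not code from this PR.

```python
def limit_offset(page: int, page_size: int) -> tuple[int, int]:
    """Translate a 1-indexed page number into SQL LIMIT and OFFSET values."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    # Every row before the requested page has to be skipped by the database.
    return page_size, (page - 1) * page_size


limit, offset = limit_offset(page=1001, page_size=100)
print(f"LIMIT {limit} OFFSET {offset}")  # LIMIT 100 OFFSET 100000
```

Under those assumptions the database has to produce 100,000 rows in sorted order and discard them before returning the 100 requested, so without an index on the sort field each deep page pays for a full sort.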
Solution
Introduce a Postgres materialized view over the appropriate sorting field, attach a Django model to it, and use that model to replace the model used by the FHIROrganizationViewSet view's list method to select organizations ordered by name.
Result
Original base query to support the list view in backend/npdfhir/views.py:363-388:
generates the raw SQL:
whose explanation shows no indexes and two full table scans:
The updated materialized view query in this PR, backend/npdfhir/views.py:360-386:
shows a much cleaner underlying query:
which exclusively relies on an index scan:
Risks
This approach will require maintaining the materialized view, for which there is a Postgres command:
This command will have to be run after every update to organization or organization_to_name, after every migration from ETL to NPD, or on a schedule within NPD itself.
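As a rough sketch of how that refresh could be triggered from application code: Postgres refreshes a materialized view with REFRESH MATERIALIZED VIEW, and the helper below wraps that statement for any DB-API cursor, such as the one returned by Django's connection.cursor(). The view name organization_by_name and both function names are hypothetical; the actual view name in this PR is not shown in this description.

```python
import re

# Hypothetical view name, used only for illustration.
ORGANIZATION_VIEW = "organization_by_name"

_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def refresh_statement(view_name: str, concurrently: bool = False) -> str:
    """Build a REFRESH MATERIALIZED VIEW statement for a plain identifier."""
    if not _IDENTIFIER.fullmatch(view_name):
        raise ValueError(f"unsafe view name: {view_name!r}")
    # CONCURRENTLY lets reads continue during the refresh, but Postgres
    # only permits it when the view has a unique index.
    option = "CONCURRENTLY " if concurrently else ""
    return f'REFRESH MATERIALIZED VIEW {option}"{view_name}"'


def refresh_view(cursor, view_name: str = ORGANIZATION_VIEW,
                 concurrently: bool = False) -> None:
    """Execute the refresh through a DB-API cursor."""
    cursor.execute(refresh_statement(view_name, concurrently))
```

A function like refresh_view could be called at the end of the ETL migration or from a scheduled job. Whether to use CONCURRENTLY depends on whether the view has a unique index and whether blocking reads during a refresh is acceptable.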