
Commit b11ca47

Jlinoff 20210122 demo02 (#25)
Updated demo02 to improve the analysis, the SQL, the dashboard layout, and the run flow using the new upload-json-dashboard.sh and csv2sql.py scripts. Cleaned up hygiene issues in various tools, including the scripts, which led to adding shellcheck targets to the Makefile. Added tests to verify all of the tools, including the scripts.
1 parent 1543c99 commit b11ca47

16 files changed: 16,994 additions & 4,934 deletions

Makefile

Lines changed: 8 additions & 1 deletion
````diff
@@ -3,7 +3,7 @@ SHELL = bash
 PKG ?= grape
 WHEEL_DEPS := README.md setup.py $(shell find grape -type f | fgrep -v cache)
 .DEFAULT_GOAL := default
-SRCS = $(PKG) test
+SRCS = $(PKG) test tools
 
 # Store the virtual environment in the project space.
 export PIPENV_VENV_IN_PROJECT=1
@@ -54,6 +54,13 @@ pylint: init ## Lint the source code.
 	$(call hdr,"$@")
 	pipenv run pylint --disable=duplicate-code $(SRCS)
 
+# shellcheck
+.PHONY: shellcheck
+shellcheck: ## Lint the bash scripts. Only works if shellcheck is installed.
+	$(call hdr,"$@")
+	shellcheck tools/runpga.sh
+	shellcheck tools/upload-json-dashboard.sh
+
 # mypy
 .PHONY: mypy
 mypy: init ## Type check the source code.
````
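The new `shellcheck` target assumes the `shellcheck` binary is on the PATH, as its comment notes. A minimal sketch of how the same check could degrade gracefully when the linter is missing (an illustration, not part of this commit):

```bash
# Hypothetical guard: only lint the bash scripts when shellcheck exists.
if command -v shellcheck >/dev/null 2>&1; then
    shellcheck tools/runpga.sh tools/upload-json-dashboard.sh
else
    echo "shellcheck not installed -- skipping bash lint"
fi
```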

README.md

Lines changed: 66 additions & 39 deletions
````diff
@@ -19,6 +19,10 @@ Grafana Prototyping Environment
 1. [Export](#export)
 1. [Status](#status)
 1. [Tree](#tree)
+1. [Tools](#tools)
+   1. [csv2sql.py](#csv2sqlpy)
+   1. [runpga.sh](#runpgash)
+   1. [upload-json-dashboard.sh](#upload-json-dashboardsh)
 1. [Samples](#samples)
    1. [demo01](#demo01)
    1. [demo02](#demo02)
@@ -28,8 +32,6 @@ Grafana Prototyping Environment
 1. [Grafana](#grafana)
 1. [Postgres](#postgres)
 1. [pgAdmin](#pgadmin)
-1. [runpga.sh](#runpgash)
-1. [upload-json-dashboard.sh](#upload-json-dashboardsh)
 1. [Acknowledgments](#acknowledgments)
 
 </details>
@@ -321,7 +323,68 @@ jbhgr:4640
    └─ Northstar:2
       └─ dashboards
          ├─ Jenkins Build Health Details:id=4:uid=ir0QjX-Mz:panels=9
-         └─ Jenkins Build Health:id=3:uid=6Q0QCuaGk:panels=70```
+         └─ Jenkins Build Health:id=3:uid=6Q0QCuaGk:panels=70
+```
+
+
+### Tools
+This section describes the tools in the local `tools` directory. They
+are not integrated into `grape` at this time because they don't
+fit the grape idiom, but that is a completely subjective decision
+which can be revisited at any time.
+
+
+#### csv2sql.py
+This is a standalone tool that reads a CSV data file with a header
+row and converts it to SQL instructions to create and populate a
+table.
+
+It is generic because it figures out the field types by analyzing the
+data.
+
+There are a number of options for specifying the output, how to
+convert certain values, and which SQL types to use for integers,
+floats, dates, and strings.
+
+It is useful for adding CSV data to your dashboards.
+
+See the help (`-h`) for more detailed information.
+
+
+#### runpga.sh
+There is a script called `tools/runpga.sh` that will create a pgAdmin
+container for you.
+
+For `demo01` you would run it like this:
+
+```bash
+$ tools/runpga.sh demo01pg
+```
+
+When it completes it prints out the information necessary to
+log in to pgAdmin and connect to the database.
+
+
+#### upload-json-dashboard.sh
+There is a script called `tools/upload-json-dashboard.sh` that will upload
+a JSON dashboard to a Grafana server from the command line.
+
+The upload is limited to servers with simple authentication based on a
+username and password unless you override it using `-x` and `-n`.
+
+The local dashboard JSON file is created by exporting the dashboard
+from the Grafana UI with the "Export for sharing externally" checkbox
+checked.
+
+This script is useful for transferring a single dashboard from one
+server to another.
+
+Although the same function can be accomplished in the UI, this script
+allows updates to be automated from the command line.
+
+This script requires that `curl` is installed.
+
+See the script help (`-h`) for more information and examples.
 
 
 ### Samples
@@ -441,42 +504,6 @@ or demo01 or `172.17.0.1:4411` for demo02 when referenced from the
 `pgAdmin` docker container created above: `http://localhost:4450`.
 
 
-#### runpga.sh
-There is a script called `tools/runpga.sh` that will create a pgAdmin
-container for you.
-
-For `demo01` you would run it like this:
-
-```bash
-$ tools/runpga.sh demo01pg
-```
-
-When it completes it prints out the information necessary to
-log in to pgAdmin and connect to the database.
-
-
-#### upload-json-dashboard.sh
-There is a script called `tools/upload-json-dashboard.sh` that will upload
-a JSON dashboard to a Grafana server from the command line.
-
-The upload is limited to servers with simple authentication based on a
-username and password unless you override it using `-x` and `-n`.
-
-The local dashboard JSON file is created by exporting the dashboard
-from the Grafana UI with the "Export for sharing externally" checkbox
-checked.
-
-This script is useful for transferring a single dashboard from one
-server to another.
-
-Although the same function can be accomplished in the UI, this script
-allows updates to be automated from the command line.
-
-This script requires that `curl` is installed.
-
-See the script help (`-h`) for more information and examples.
-
-
 ### Acknowledgments
 
 * Many thanks to Deron Ferguson for helping me track down and debug problems on Windows 10.
````
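The `upload-json-dashboard.sh` description above notes that the script drives Grafana over HTTP with `curl`. For orientation, a rough hand-rolled equivalent of such an upload is sketched below; it assumes Grafana's standard `/api/dashboards/db` endpoint and default `admin:admin` credentials, and it skips the datasource-variable rewriting the script performs, so treat it as illustrative only:

```bash
# Wrap an exported dashboard JSON in the payload Grafana expects and
# POST it with basic auth. Illustrative sketch, not the script itself.
jq '{dashboard: ., overwrite: true}' dash.json |
  curl -s -u admin:admin \
       -H 'Content-Type: application/json' \
       -d @- http://localhost:4410/api/dashboards/db
```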

grape/__version__

Lines changed: 1 addition & 1 deletion
````diff
@@ -1 +1 @@
-0.4.5
+0.5.0
````

img/demo02.png

124 KB

samples/demo02/README.md

Lines changed: 165 additions & 11 deletions
````diff
@@ -1,16 +1,27 @@
 # demo02
-This directory contains a sample that is fully contained that is a bit
-more realistic than demo01 because it shows graph data from a public
-datasource.
+This demo is a bit more realistic than `demo01` because it shows how
+to populate the database and construct a dashboard from CSV data.
 
-It will create a demo02 grafana system on host port 4410 that reads
-data from the associated database managed by the `demo02pg` container.
+It does this by creating a grape system on host port 4410 that reads
+data from the associated database managed by the `demo02pg` container
+and displays it in the grafana server container: `demo02gr`.
 
-The purpose is to show how to create an environment from scratch and then
-modify it. In the sample the modification comes from the `grafana.json`
-file and the `etl.py` tool that converts the downloaded data to database
-tables and views.
+The demo is run by executing the `run.sh` script. That script first
+creates the grafana and postgres servers, `demo02gr` and `demo02pg`,
+using grape.
 
+After that, the table creation SQL is generated from
+`all_weekly_excess_deaths.csv` by converting it with the `csv2sql.py`
+tool from the `tools` directory. The generated SQL is stored in a
+place that the database server can see
+(`demo02pg/mnt/all_weekly_excess_deaths.sql`), after which `psql` is
+run to create the data table.
+
+Finally, the dashboard is created by running
+`upload-json-dashboard.sh` using `dash.json`.
+
+
+### How to run it
 In an interactive environment you would most likely make all of the
 changes in grafana and the database directly.
 
````
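The three steps described above map directly onto commands shown later in this README. A condensed sketch of the flow (the grape server-creation step is omitted, and the actual `run.sh` may differ in detail):

```bash
# 1. Convert the CSV into table-creation SQL in a directory the
#    database container can see (-c NA=0 maps the literal "NA" to 0).
pipenv run python ../../tools/csv2sql.py -c NA=0 -v all_weekly_excess_deaths.csv \
    -o demo02pg/mnt/all_weekly_excess_deaths.sql

# 2. Create and populate the table inside the postgres container.
docker exec -it demo02pg psql -U postgres -d postgres \
    -f /mnt/all_weekly_excess_deaths.sql

# 3. Upload the exported dashboard JSON to the grafana server.
../../tools/upload-json-dashboard.sh -f 0 -j dash.json -d "demo02pg" \
    -g "http://localhost:4410"
```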
````diff
@@ -19,12 +30,20 @@ To run the demo:
 $ ./run.sh
 ```
 
+> Note that this script uses `grape/tools/upload-json-dashboard.sh` to
+> load the dashboard into the server and it uses
+> `grape/tools/csv2sql.py` to process the raw CSV and convert it to
+> SQL table creation commands.
+
+
+### Result
 When the run has completed, you will be able to navigate to
 http://localhost:4410 to see the newly created dashboard.
 
 It looks like this:
 !['demo02'](/img/demo02.png)
 
+### Discussion
 This demo analyzes an open-source dataset from the Economist that
 allows it to grab information about COVID-19 mortality along with
 information about excess deaths. It then uses that information to
````
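Grafana exposes a health endpoint, so a quick way to confirm the server is answering before opening the browser (a convenience sketch assuming the standard `/api/health` route; not part of the demo):

```bash
# Poll grafana until it responds, then print the dashboard URL.
until curl -fsS http://localhost:4410/api/health >/dev/null; do
    sleep 1
done
echo "grafana is up: http://localhost:4410"
```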
````diff
@@ -33,8 +52,9 @@ characterized as non-covid when they were actually Covid (using the
 excess deaths as a baseline).
 
 Please note that this demo is _not_ about the methodology or the
-results, which may well be flawed. Instead it is meant as a to help
-you understand how to do visualizations.
+results, which is very likely flawed. Instead it is meant to help
+you understand how to do visualizations from third-party sources
+that provide CSV data.
 
 Also note that this is not meant to be a guide to using postgres or
 grafana in any detail. It will merely help you set it up so that you
````
````diff
@@ -44,7 +64,141 @@ This document only shows simple time series data in the graph but be
 aware that you can include moving averages and other trend analysis by
 creating appropriate SQL queries.
 
+
+### Raw Data
 The raw data in `all_weekly_excess_deaths.csv` was manually
 downloaded from
 [this](https://raw.githubusercontent.com/TheEconomist/covid-19-excess-deaths-tracker/master/output-data/excess-deaths/all_weekly_excess_deaths.csv)
 site.
+
+
+### csv2sql.py
+The `csv2sql.py` tool reads `all_weekly_excess_deaths.csv` and
+processes the CSV to automatically figure out the column types before
+writing out the SQL, which makes it very useful for arbitrary
+datasets.
+
+For more information about this tool specify the help (`-h`) option.
+
+It is important to note that this tool and the subsequent database
+load must be run _before_ the JSON dashboard is uploaded.
+
+This is the sequence of commands used to populate the database.
+
+```bash
+$ pipenv run python ../../tools/csv2sql.py -c NA=0 -v all_weekly_excess_deaths.csv -o demo02pg/mnt/all_weekly_excess_deaths.sql
+.
+.
+$ docker exec -it demo02pg psql -U postgres -d postgres -f /mnt/all_weekly_excess_deaths.sql
+.
+.
+```
+
+The first command creates the SQL file and the second one updates the database.
+
+Once the update is complete, the newly created table can be viewed like this.
+
+```bash
+$ docker exec -it demo02pg psql -U postgres -d postgres -c '\dS+ all_weekly_excess_deaths'
+.
+.
+```
+
+
+### upload-json-dashboard.sh
+The `upload-json-dashboard.sh` tool reads a JSON file that is created
+from a single dashboard in the grafana UI that has the `Export for
+sharing externally` checkbox checked. Setting that flag causes the
+datasources used by the dashboard to be defined as variables that can
+be overwritten. In this example there is a single datasource variable
+named `DS_DEMO02PG`.
+
+For more information about this tool specify the help (`-h`) option.
+
+It is important to note that this tool must be run _after_ the
+database is populated to avoid issues.
+
+The command to upload the dashboard looks like this.
+
+```bash
+$ ../../tools/upload-json-dashboard.sh -f 0 -j dash.json -d "demo02pg" -g "http://localhost:4410"
+.
+.
+```
+
+
+### SQL Table Schema
+This is the SQL table schema that was generated by `csv2sql.py`.
+
+```
+$ docker exec -it demo02pg psql -U postgres -d postgres -c '\dS+ all_weekly_excess_deaths'
+                                          Table "public.all_weekly_excess_deaths"
+          Column           |            Type             | Collation | Nullable |                       Default                        | Storage  | Stats target | Description
+---------------------------+-----------------------------+-----------+----------+------------------------------------------------------+----------+--------------+-------------
+ id                        | integer                     |           | not null | nextval('all_weekly_excess_deaths_id_seq'::regclass) | plain    |              |
+ country                   | text                        |           |          |                                                      | extended |              |
+ region                    | text                        |           |          |                                                      | extended |              |
+ region_code               | text                        |           |          |                                                      | extended |              |
+ start_date                | timestamp without time zone |           |          |                                                      | plain    |              |
+ end_date                  | timestamp without time zone |           |          |                                                      | plain    |              |
+ year                      | integer                     |           |          |                                                      | plain    |              |
+ week                      | integer                     |           |          |                                                      | plain    |              |
+ population                | integer                     |           |          |                                                      | plain    |              |
+ total_deaths              | integer                     |           |          |                                                      | plain    |              |
+ covid_deaths              | integer                     |           |          |                                                      | plain    |              |
+ expected_deaths           | numeric                     |           |          |                                                      | main     |              |
+ excess_deaths             | numeric                     |           |          |                                                      | main     |              |
+ non_covid_deaths          | integer                     |           |          |                                                      | plain    |              |
+ covid_deaths_per_100k     | numeric                     |           |          |                                                      | main     |              |
+ excess_deaths_per_100k    | numeric                     |           |          |                                                      | main     |              |
+ excess_deaths_pct_change  | numeric                     |           |          |                                                      | main     |              |
+Indexes:
+    "all_weekly_excess_deaths_pkey" PRIMARY KEY, btree (id)
+Access method: heap
+```
+
+
+### SQL Query
+The basic SQL query used for the time series graph looks like this:
+
+```sql
+WITH bigtime AS
+  (SELECT
+     *,
+     to_date(year::text || ' ' || week::text, 'IYYYIW') AS time,
+     -- uncounted_deaths is the number of unexpected deaths minus the covid_deaths,
+     -- which assumes that all covid deaths are unexpected;
+     -- if total_deaths < expected_deaths then this is not accurate
+     total_deaths - expected_deaths AS unexpected_deaths,
+     greatest(total_deaths - expected_deaths - covid_deaths, 0) AS uncounted_deaths
+   FROM all_weekly_excess_deaths)
+SELECT
+  $__timeGroup(time, '1w'),
+  -- total_deaths as "Total Deaths",
+  -- expected_deaths as "Expected Deaths",
+  covid_deaths AS "COVID Deaths Reported",
+  uncounted_deaths AS "COVID Deaths Not Reported",
+  'weekly:' AS metric
+FROM
+  bigtime
+WHERE
+  $__timeFilter(time)
+  AND country IN ($country)
+  AND region IN ($region)
+GROUP BY
+  time,
+  total_deaths,
+  expected_deaths,
+  unexpected_deaths,
+  uncounted_deaths,
+  non_covid_deaths,
+  covid_deaths
+ORDER BY
+  time,
+  total_deaths,
+  expected_deaths,
+  unexpected_deaths,
+  uncounted_deaths,
+  non_covid_deaths,
+  covid_deaths ASC
+```
````
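The subtlest part of the query is `to_date(year::text || ' ' || week::text, 'IYYYIW')`, which builds a calendar date from the ISO year and week columns. A quick spot check of that conversion and the derived `uncounted_deaths` value against the loaded table (a diagnostic sketch, not part of the demo):

```bash
# Show a few rows of the ISO year/week -> date conversion and the
# uncounted_deaths expression used by the dashboard query.
docker exec -it demo02pg psql -U postgres -d postgres -c "
  SELECT year, week,
         to_date(year::text || ' ' || week::text, 'IYYYIW') AS time,
         greatest(total_deaths - expected_deaths - covid_deaths, 0) AS uncounted
    FROM all_weekly_excess_deaths
   LIMIT 5;"
```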
