# demo02
This demo is a bit more realistic than `demo01` because it shows how
to populate the database and construct a dashboard from CSV data.

It does this by creating a grape system on host port 4410 that reads
data from the associated database managed by the `demo02pg` container
and displays it in the grafana server container: `demo02gr`.

The demo is run by executing the `run.sh` script. That script first
creates the grafana and postgres servers, `demo02gr` and `demo02pg`,
using grape.

After that, the table creation SQL is generated from
`all_weekly_excess_deaths.csv` by converting it with the `csv2sql.py`
tool from the `tools` directory. The generated SQL is stored in a
place that the database server can see
(`demo02pg/mnt/all_weekly_excess_deaths.sql`), after which `psql` is
run to create the data table.

Finally, the dashboard is created by running
`upload-json-dashboard.sh` with `dash.json`.
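
Condensed, the sequence looks roughly like this (a sketch only; the
exact commands, including the grape invocation that creates the
containers, are in `run.sh`):

```bash
# Sketch of the run.sh flow; see run.sh for the authoritative version.

# 1. Create the demo02gr (grafana) and demo02pg (postgres) containers
#    using grape (invocation elided here).

# 2. Convert the CSV to SQL table creation commands.
pipenv run python ../../tools/csv2sql.py -c NA=0 -v \
    all_weekly_excess_deaths.csv \
    -o demo02pg/mnt/all_weekly_excess_deaths.sql

# 3. Load the generated SQL into the postgres container.
docker exec -it demo02pg psql -U postgres -d postgres \
    -f /mnt/all_weekly_excess_deaths.sql

# 4. Upload the dashboard to the grafana server.
../../tools/upload-json-dashboard.sh -f 0 -j dash.json \
    -d "demo02pg" -g "http://localhost:4410"
```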


### How to run it
In an interactive environment you would most likely make all of the
changes in grafana and the database directly.

To run the demo:
```bash
$ ./run.sh
```

> Note that this script uses `grape/tools/upload-json-dashboard.sh` to
> load the dashboard into the server and it uses
> `grape/tools/csv2sql.py` to process the raw CSV and convert it to
> SQL table creation commands.


### Result
When the run has completed, you will be able to navigate to
http://localhost:4410 to see the newly created dashboard.

It looks like this:
!['demo02'](/img/demo02.png)

### Discussion
This demo analyzes an open-source dataset from the Economist that
provides information about COVID-19 mortality along with information
about excess deaths. It then uses that information to estimate the
number of deaths that might have been characterized as non-covid when
they were actually Covid (using the excess deaths as a baseline). For
example, if a week recorded 1000 total deaths against 900 expected
deaths and only 60 reported COVID deaths, the remaining 40 unexpected
deaths are treated as unreported COVID deaths.

Please note that this demo is _not_ about the methodology or the
results, which are very likely flawed. Instead it is meant to help
you understand how to do visualizations from third-party sources
that provide CSV data.

Also note that this is not meant to be a guide to using postgres or
grafana in any detail. It will merely help you set it up so that you
can explore them on your own.

This document only shows simple time series data in the graph, but be
aware that you can include moving averages and other trend analysis
by creating appropriate SQL queries, as sketched below.
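
For example, a 4-week moving average can be computed with a window
function. This sketch is not part of the demo's `dash.json`, and the
country filter value is only illustrative:

```sql
-- Sketch: 4-week moving average of reported COVID deaths using a
-- window function over the demo table (illustrative, not from dash.json).
SELECT
    start_date,
    covid_deaths,
    avg(covid_deaths) OVER (
        ORDER BY start_date
        ROWS BETWEEN 3 PRECEDING AND CURRENT ROW
    ) AS covid_deaths_ma4
FROM all_weekly_excess_deaths
WHERE country = 'Britain'
ORDER BY start_date;
```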

### Raw Data
The raw data in `all_weekly_excess_deaths.csv` was manually
downloaded from
[this](https://raw.githubusercontent.com/TheEconomist/covid-19-excess-deaths-tracker/master/output-data/excess-deaths/all_weekly_excess_deaths.csv)
site.


### csv2sql.py
The `csv2sql.py` tool reads `all_weekly_excess_deaths.csv` and
analyzes the CSV to automatically infer the column types before
writing out the SQL, which makes it useful for arbitrary datasets.
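
As a rough illustration of the inference (hypothetical input, not
part of this demo), a CSV with a text, an integer, and a decimal
column would produce a table definition along these lines:

```sql
-- Hypothetical example; the exact SQL emitted by csv2sql.py may
-- differ (compare the generated schema shown below).
-- Given CSV rows like:
--   country,week,covid_deaths_per_100k
--   Denmark,3,0.52
CREATE TABLE example (
    id SERIAL PRIMARY KEY,
    country TEXT,
    week INTEGER,
    covid_deaths_per_100k NUMERIC
);
```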

For more information about this tool, specify the help (`-h`) option.

It is important to note that this tool and the subsequent database
load must be run _before_ the JSON dashboard is uploaded.

This is the sequence of commands used to populate the database:

```bash
$ pipenv run python ../../tools/csv2sql.py -c NA=0 -v all_weekly_excess_deaths.csv -o demo02pg/mnt/all_weekly_excess_deaths.sql
.
.
$ docker exec -it demo02pg psql -U postgres -d postgres -f /mnt/all_weekly_excess_deaths.sql
.
.
```

The first command creates the SQL file and the second one updates the
database.

Once the update is complete, the newly created table can be viewed
like this:

```bash
$ docker exec -it demo02pg psql -U postgres -d postgres -c '\dS+ all_weekly_excess_deaths'
.
.
```


### upload-json-dashboard.sh
The `upload-json-dashboard.sh` tool reads a JSON file that was
exported from a single dashboard in the grafana UI with the `Export
for sharing externally` checkbox checked. Setting that flag causes
the datasources used by the dashboard to be defined as variables that
can be overwritten. In this example there is a single datasource
variable named `DS_DEMO02PG`.
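
In the exported JSON this shows up as an `__inputs` entry. The
fragment below is abridged and illustrative; the exact contents of
`dash.json` may differ:

```json
{
  "__inputs": [
    {
      "name": "DS_DEMO02PG",
      "type": "datasource",
      "pluginId": "postgres"
    }
  ]
}
```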

For more information about this tool, specify the help (`-h`) option.

It is important to note that this tool must be run _after_ the
database is populated; otherwise the dashboard queries will have no
table to read from.

The command to upload the dashboard looks like this:

```bash
$ ../../tools/upload-json-dashboard.sh -f 0 -j dash.json -d "demo02pg" -g "http://localhost:4410"
.
.
```


### SQL Table Schema
This is the SQL table schema that was generated by `csv2sql.py`:

```
$ docker exec -it demo02pg psql -U postgres -d postgres -c '\dS+ all_weekly_excess_deaths'
                                                  Table "public.all_weekly_excess_deaths"
          Column          |            Type             | Collation | Nullable |                       Default                         | Storage  | Stats target | Description
--------------------------+-----------------------------+-----------+----------+-------------------------------------------------------+----------+--------------+-------------
 id                       | integer                     |           | not null | nextval('all_weekly_excess_deaths_id_seq'::regclass)  | plain    |              |
 country                  | text                        |           |          |                                                       | extended |              |
 region                   | text                        |           |          |                                                       | extended |              |
 region_code              | text                        |           |          |                                                       | extended |              |
 start_date               | timestamp without time zone |           |          |                                                       | plain    |              |
 end_date                 | timestamp without time zone |           |          |                                                       | plain    |              |
 year                     | integer                     |           |          |                                                       | plain    |              |
 week                     | integer                     |           |          |                                                       | plain    |              |
 population               | integer                     |           |          |                                                       | plain    |              |
 total_deaths             | integer                     |           |          |                                                       | plain    |              |
 covid_deaths             | integer                     |           |          |                                                       | plain    |              |
 expected_deaths          | numeric                     |           |          |                                                       | main     |              |
 excess_deaths            | numeric                     |           |          |                                                       | main     |              |
 non_covid_deaths         | integer                     |           |          |                                                       | plain    |              |
 covid_deaths_per_100k    | numeric                     |           |          |                                                       | main     |              |
 excess_deaths_per_100k   | numeric                     |           |          |                                                       | main     |              |
 excess_deaths_pct_change | numeric                     |           |          |                                                       | main     |              |
Indexes:
    "all_weekly_excess_deaths_pkey" PRIMARY KEY, btree (id)
Access method: heap
```


### SQL Query
The basic SQL query used for the time series graph looks like this:

```sql
WITH bigtime AS
    (SELECT
        *,
        to_date(year::text || ' ' || week::text, 'IYYYIW') AS time,
        -- uncounted_deaths is the number of unexpected deaths minus the
        -- covid_deaths, which assumes that all covid deaths are unexpected.
        -- If total_deaths < expected_deaths then this is not accurate.
        total_deaths - expected_deaths AS unexpected_deaths,
        greatest(total_deaths - expected_deaths - covid_deaths, 0) AS uncounted_deaths
    FROM all_weekly_excess_deaths)
SELECT
    $__timeGroup(time, '1w'),
    -- total_deaths as "Total Deaths",
    -- expected_deaths as "Expected Deaths",
    covid_deaths AS "COVID Deaths Reported",
    uncounted_deaths AS "COVID Deaths Not Reported",
    'weekly:' AS metric
FROM
    bigtime
WHERE
    $__timeFilter(time)
    AND country in ($country)
    AND region in ($region)
GROUP BY
    time,
    total_deaths,
    expected_deaths,
    unexpected_deaths,
    uncounted_deaths,
    non_covid_deaths,
    covid_deaths
ORDER BY
    time,
    total_deaths,
    expected_deaths,
    unexpected_deaths,
    uncounted_deaths,
    non_covid_deaths,
    covid_deaths ASC
```
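
The `$country` and `$region` references are Grafana dashboard
template variables. A typical query for populating such a variable,
shown here as an assumption rather than an excerpt from `dash.json`,
is a simple `SELECT DISTINCT`:

```sql
-- Possible variable query for $country; $region would be analogous.
SELECT DISTINCT country
FROM all_weekly_excess_deaths
ORDER BY 1;
```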