15 changes: 6 additions & 9 deletions README-PYPI.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,12 +8,10 @@ DP Wizard demonstrates how to calculate DP statistics or create a synthetic data

(If differential privacy is new to you, [these slides](https://opendp.github.io/dp-wizard/) provide some background, and explain how DP Wizard works.)

You can run DP Wizard locally and upload your own CSV,
or use the [cloud deployment](https://mccalluc-dp-wizard.share.connect.posit.cloud/) and only provide column names to protect your private data.
In either case, you'll be prompted to describe your privacy budget and the analysis you need.
With that information, DP Wizard provides:

- A Jupyter notebook which demonstrates how to use the [OpenDP Library](https://docs.opendp.org/).
After selecting a local CSV, you'll be prompted to describe the analysis you need.
Output options include:
- A Jupyter notebook which demonstrates how to use
the [OpenDP Library](https://docs.opendp.org/).
- A plain Python script.
- Text and CSV reports.

Expand All @@ -39,17 +37,16 @@ The exact upgrade process will depend on your environment and operating system.
Install with `pip install 'dp_wizard[app]'` and you can start DP Wizard from the command line.

```
usage: dp-wizard [-h] [--sample | --cloud]
usage: dp-wizard [-h] [--sample]

DP Wizard makes it easier to get started with Differential Privacy.

options:
-h, --help show this help message and exit
--sample Generate a sample CSV: See how DP Wizard works without providing
your own data
--cloud Prompt for column names instead of CSV upload

Unless you have set "--sample" or "--cloud", you will specify a CSV
Unless you have set "--sample", you will specify a CSV
inside the application.

Provide a "Private CSV" if you only have a private data set, and want to
Expand Down
15 changes: 6 additions & 9 deletions README.md
Expand Up @@ -8,12 +8,10 @@ DP Wizard demonstrates how to calculate DP statistics or create a synthetic data

(If differential privacy is new to you, [these slides](https://opendp.github.io/dp-wizard/) provide some background, and explain how DP Wizard works.)

You can run DP Wizard locally and upload your own CSV,
or use the [cloud deployment](https://mccalluc-dp-wizard.share.connect.posit.cloud/) and only provide column names to protect your private data.
In either case, you'll be prompted to describe your privacy budget and the analysis you need.
With that information, DP Wizard provides:

- A Jupyter notebook which demonstrates how to use the [OpenDP Library](https://docs.opendp.org/).
After selecting a local CSV, you'll be prompted to describe the analysis you need.
Output options include:
- A Jupyter notebook which demonstrates how to use
the [OpenDP Library](https://docs.opendp.org/).
- A plain Python script.
- Text and CSV reports.

Expand All @@ -39,17 +37,16 @@ The exact upgrade process will depend on your environment and operating system.
Install with `pip install 'dp_wizard[app]'` and you can start DP Wizard from the command line.

```
usage: dp-wizard [-h] [--sample | --cloud]
usage: dp-wizard [-h] [--sample]

DP Wizard makes it easier to get started with Differential Privacy.

options:
-h, --help show this help message and exit
--sample Generate a sample CSV: See how DP Wizard works without providing
your own data
--cloud Prompt for column names instead of CSV upload

Unless you have set "--sample" or "--cloud", you will specify a CSV
Unless you have set "--sample", you will specify a CSV
inside the application.

Provide a "Private CSV" if you only have a private data set, and want to
Expand Down
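With `--cloud` removed, the command line reduces to a single optional flag. The following is a minimal sketch of an `argparse` parser that would produce a usage line like the one in the README. `build_parser` is a hypothetical name, and DP Wizard's actual parser is not part of this diff, so treat this as an illustration rather than the project's implementation.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical helper: not taken from the dp_wizard source.
    parser = argparse.ArgumentParser(
        prog="dp-wizard",
        description=(
            "DP Wizard makes it easier to get started "
            "with Differential Privacy."
        ),
    )
    # The only remaining mode flag; "--cloud" is no longer defined.
    parser.add_argument(
        "--sample",
        action="store_true",
        help=(
            "Generate a sample CSV: See how DP Wizard works "
            "without providing your own data"
        ),
    )
    return parser


# Parse an explicit argument list so the sketch runs anywhere.
args = build_parser().parse_args(["--sample"])
print(args.sample)
```

Because only one flag remains, no mutually exclusive group is needed, which matches the change from `[--sample | --cloud]` to `[--sample]` in the usage text.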
2 changes: 1 addition & 1 deletion docs/index.html
Expand Up @@ -645,7 +645,7 @@ <h1>Return to the class grades example ✋</h1>
<tr>
<td>
<p><strong><a
href="https://pypi.org/project/dp_wizard/"><code>pip install 'dp_wizard[app]'</code></a><br><code>dp_wizard --cloud</code><br><small>(requires
href="https://pypi.org/project/dp_wizard/"><code>pip install 'dp_wizard[app]'</code></a><br><code>dp_wizard --sample</code><br><small>(requires
Python&gt;=3.10)</small></strong></p>
</td>
<td>
Expand Down
2 changes: 1 addition & 1 deletion docs/index.md
Expand Up @@ -388,7 +388,7 @@ Divide into four teams, and on one computer either:
<tr>
<td>

**[`pip install 'dp_wizard[app]'`](https://pypi.org/project/dp_wizard/)<br>`dp_wizard --cloud`<br><small>(requires Python>=3.10)</small>**
**[`pip install 'dp_wizard[app]'`](https://pypi.org/project/dp_wizard/)<br>`dp_wizard --sample`<br><small>(requires Python>=3.10)</small>**

</td>
<td>
Expand Down
1 change: 0 additions & 1 deletion dp_wizard/shiny/__init__.py
Expand Up @@ -111,7 +111,6 @@ def server(input: Inputs, output: Outputs, session: Session): # pragma: no cove
state = AppState(
# CLI options:
is_sample_csv=cli_info.is_sample_csv,
in_cloud=cli_info.is_cloud_mode,
qa_mode=cli_info.is_qa_mode,
# Reactive bools:
is_tutorial_mode=reactive.value(cli_info.get_is_tutorial_mode()),
Expand Down
2 changes: 0 additions & 2 deletions dp_wizard/shiny/components/summaries.py
Expand Up @@ -19,8 +19,6 @@ def dataset_summary(state: AppState): # pragma: no cover
sources.append("Private CSV")
if state.public_csv_path():
sources.append("Public CSV")
if state.in_cloud:
sources.append("Field List")

contributions = state.contributions()
entity = state.contributions_entity()
Expand Down
1 change: 0 additions & 1 deletion dp_wizard/shiny/panels/analysis_panel/__init__.py
Expand Up @@ -147,7 +147,6 @@ def analysis_server(
): # pragma: no cover
# CLI options:
is_sample_csv = state.is_sample_csv
# in_cloud = state.in_cloud

# Reactive bools:
is_tutorial_mode = state.is_tutorial_mode
Expand Down
27 changes: 3 additions & 24 deletions dp_wizard/shiny/panels/dataset_panel/__init__.py
Expand Up @@ -118,7 +118,6 @@ def dataset_server(
): # pragma: no cover
# CLI options:
is_sample_csv = state.is_sample_csv
in_cloud = state.in_cloud

# Reactive bools:
is_tutorial_mode = state.is_tutorial_mode
Expand Down Expand Up @@ -169,12 +168,6 @@ def _on_private_csv_path_change():
private_csv_path.set(path)
csv_info.set(CsvInfo(Path(path)))

@reactive.effect
@reactive.event(input.all_column_names)
def _on_column_names_change():
# Only used when the user is supplying column names in cloud mode.
csv_info.set(infer_csv_info(input.all_column_names()))

@reactive.calc
def csv_column_mismatch_calc() -> Optional[tuple[set, set]]:
public = public_csv_path()
Expand Down Expand Up @@ -223,7 +216,6 @@ def welcome_ui():
@render.ui
def csv_or_columns_ui():
return data_source.csv_or_columns_ui(
in_cloud=in_cloud,
is_tutorial_mode=is_tutorial_mode,
csv_info=csv_info,
)
Expand Down Expand Up @@ -359,7 +351,7 @@ def set_is_dataset_selected():
and not info.get_is_error()
and len(info.get_all_column_names()) > 0
and not get_row_count_errors(max_rows())
and (in_cloud or not csv_column_mismatch_calc())
and not csv_column_mismatch_calc()
)

@reactive.calc
Expand Down Expand Up @@ -388,26 +380,13 @@ def contributions_validation_ui():

@render.ui
def python_tutorial_ui():
cloud_extra_markdown = (
"""
Because this instance of DP Wizard is running in the cloud,
we don't allow private data to be uploaded.
When run locally, DP Wizard can also run an analysis
on your data and return results,
and not just an unexecuted notebook.
"""
if in_cloud
else ""
)
return tutorial_box(
is_tutorial_mode(),
f"""
"""
Along the way, code samples demonstrate
how the information you provide is used in the
OpenDP Library, and at the end you can download
a notebook for the entire calculation.

{cloud_extra_markdown}
""",
responsive=False,
)
Expand Down Expand Up @@ -465,7 +444,7 @@ def define_analysis_button_ui():
return [
button,
f"""
Specify {'columns' if in_cloud else 'CSV'}, unit of privacy,
Specify CSV, unit of privacy,
and maximum row count before proceeding.
""",
]
Expand Down
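The readiness check in `set_is_dataset_selected` no longer has a cloud-mode exception: a mismatch between public and private CSV columns now always blocks progress. Below is a sketch of the visible clauses as a pure function, with plain arguments standing in for the reactive values and helper calls. `is_dataset_ready` and its parameter names are hypothetical, the list type for row-count errors is an assumption, and the start of the original expression lies outside the visible hunk, so only the clauses shown in the diff are modeled.

```python
from typing import Optional


def is_dataset_ready(
    is_error: bool,
    column_names: list[str],
    row_count_errors: list[str],
    column_mismatch: Optional[tuple[set, set]],
) -> bool:
    # Hypothetical stand-in for the reactive condition in the diff.
    # Every clause must hold; there is no longer an "in_cloud" bypass
    # for the column-mismatch check.
    return (
        not is_error
        and len(column_names) > 0
        and not row_count_errors
        and not column_mismatch
    )


# A clean private CSV with no mismatch is ready.
print(is_dataset_ready(False, ["grade"], [], None))

# Mismatched public and private columns now always block progress.
print(is_dataset_ready(False, ["grade"], [], ({"grade"}, {"score"})))
```

Pulling the condition into a plain function like this is one way to unit-test it without a running Shiny session. The server functions in the diff are marked `# pragma: no cover`.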
51 changes: 7 additions & 44 deletions dp_wizard/shiny/panels/dataset_panel/data_source.py
Expand Up @@ -19,57 +19,21 @@


def csv_or_columns_ui(
in_cloud: bool,
is_tutorial_mode: reactive.Value[bool],
csv_info: reactive.Value[CsvInfo],
): # pragma: no cover
if in_cloud:
content = [
ui.markdown(
"""
Provide the names of columns you'll use in your analysis,
one per line, with a sample value for each. For example:

```
name: Chuck
age: 48
```
"""
),
tutorial_box(
is_tutorial_mode(),
"""
When [installed and run
locally](https://pypi.org/project/dp_wizard/),
DP Wizard allows you to specify a private and public CSV,
but for the safety of your data, in the cloud
DP Wizard only accepts column names.

If you don't have other ideas, we can imagine
a CSV of student quiz grades: Enter `student_id`,
`quiz_id`, `grade`, and `class_year_str` below,
each on a separate line.
""",
responsive=False,
),
ui.input_text_area("all_column_names", "CSV Column Names", rows=5),
]
else:
content = [
ui.markdown(
f"""
return [
ui.markdown(
f"""
Choose **Private CSV** {PRIVATE_TEXT}

Choose **Public CSV** {PUBLIC_TEXT}

Choose both **Private CSV** and **Public CSV** {PUBLIC_PRIVATE_TEXT}
"""
),
ui.output_ui("input_files_ui"),
ui.output_ui("csv_message_ui"),
]

content += [
"""
),
ui.output_ui("input_files_ui"),
ui.output_ui("csv_message_ui"),
code_sample(
"Context",
Template(
Expand Down Expand Up @@ -98,7 +62,6 @@ def csv_or_columns_ui(
),
ui.output_ui("python_tutorial_ui"),
]
return content


def input_files_ui(
Expand Down
76 changes: 28 additions & 48 deletions dp_wizard/shiny/panels/results_panel/__init__.py
Expand Up @@ -99,7 +99,6 @@ def results_server(
): # pragma: no cover
# CLI options:
# is_sample_csv = state.is_sample_csv
in_cloud = state.in_cloud
qa_mode = state.qa_mode

# Reactive bools:
Expand Down Expand Up @@ -209,66 +208,47 @@ def download_results_ui():
]
if product() == Product.SYNTHETIC_DATA:
downloads.append("Contingency Table")
return (
ui.markdown(
"""
When [installed and run
locally](https://pypi.org/project/dp_wizard/),
there are more download options because DP Wizard
can read your private CSV and release differentially
private statistics.
return [
tutorial_box(
is_tutorial_mode(),
"""
)
if in_cloud
else [
tutorial_box(
is_tutorial_mode(),
"""
Now you can download a notebook for your analysis.
The Jupyter notebook could be used locally or on Colab,
but the HTML version can be viewed in the browser.
""",
responsive=False,
),
download_button(
"Package",
primary=True,
disabled=disabled,
),
ui.br(),
"Contains:",
ui.tags.ul(
*[
ui.tags.li(
download_link(
download,
disabled=disabled,
)
responsive=False,
),
download_button(
"Package",
primary=True,
disabled=disabled,
),
ui.br(),
"Contains:",
ui.tags.ul(
*[
ui.tags.li(
download_link(
download,
disabled=disabled,
)
for download in downloads
]
),
]
)
)
for download in downloads
]
),
]

@render.ui
def download_code_ui():
disabled = not weights()
return [
tutorial_box(
is_tutorial_mode(),
(
"""
In the cloud, DP Wizard only provides unexecuted
notebooks and scripts.
"""
if in_cloud
else """
Alternatively, you can download a script or unexecuted
notebook that demonstrates the steps of your analysis,
but does not contain any data or analysis results.
"""
),
"""
Alternatively, you can download a script or unexecuted
notebook that demonstrates the steps of your analysis,
but does not contain any data or analysis results.
""",
responsive=False,
),
download_button("Notebook (unexecuted)", disabled=disabled),
Expand Down
1 change: 0 additions & 1 deletion dp_wizard/types.py
Expand Up @@ -212,7 +212,6 @@ def get_is_error(self) -> bool:
class AppState:
# CLI options:
is_sample_csv: bool
in_cloud: bool
qa_mode: bool

# Reactive bools:
Expand Down
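Dropping the `in_cloud` field from `AppState` means every constructor call that still passes it has to change in the same commit, which is why `dp_wizard/shiny/__init__.py` is touched as well. The sketch below assumes `AppState` is a dataclass (the decorator is outside the visible hunk) and uses a reduced, hypothetical `AppStateSketch` with only the two remaining CLI options.

```python
from dataclasses import dataclass


@dataclass
class AppStateSketch:
    # Hypothetical reduction of AppState: only the CLI options
    # that remain after this change are modeled here.
    is_sample_csv: bool
    qa_mode: bool


state = AppStateSketch(is_sample_csv=True, qa_mode=False)
print(state)

# A call site that still passes the removed field fails immediately.
try:
    AppStateSketch(is_sample_csv=True, in_cloud=False, qa_mode=False)
except TypeError as error:
    print(f"Stale keyword rejected: {error}")
```

A stale keyword fails at construction time rather than being silently ignored, so any `in_cloud` argument left behind would surface as soon as the app state is built.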