Use of multithreading to improve areas where parallel processing could have major benefits to response times

We would like to explore multithreading in parts of the API where parallel processing could substantially improve response times. Threading adds complexity, so we should introduce it carefully and only where the benefit is clear. The zonal statistics code gives one example of how it could be used:
https://github.com/ua-snap/data-api/blob/9d6cd622a058f34796c1f71530f8758242f74eb0/zonal_stats.py#L224-L264

	# Use threading with a small number of workers for parallel
	# processing of the zonal statistics
	workers = min(4, max(1, cpu_count() // 2))

	# Only use parallelization if we have enough combinations
	# to justify overhead of starting them up
	use_parallel = MULTIPROCESSING and workers > 1 and len(dimension_combinations) > 20

	if use_parallel:
	    with ThreadPoolExecutor(max_workers=workers) as threads:
	        results = list(
	            threads.map(
	                lambda combo: (
	                    combo,
	                    calculate_zonal_stats(
	                        da_i.sel(combo),
	                        rasterized_polygon_array,
	                        x_dim,
	                        y_dim,
	                        compute_full_stats,
	                    ),
	                ),
	                dimension_combinations,
	            )
	        )
	else:
	    results = [
	        (
	            combo,
	            calculate_zonal_stats(
	                da_i.sel(combo),
	                rasterized_polygon_array,
	                x_dim,
	                y_dim,
	                compute_full_stats,
	            ),
	        )
	        for combo in dimension_combinations
	    ]

	return results
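
If we want to reuse this pattern in other endpoints, one option is to pull the "thread only when it is worth it" decision into a small helper. The following is a minimal sketch, not code from the repository: the name `map_with_optional_threads` and the constants `MAX_WORKERS` and `MIN_ITEMS_FOR_THREADS` are hypothetical, and the defaults simply copy the values used in the snippet above.

```python
import os
from concurrent.futures import ThreadPoolExecutor

# Hypothetical defaults copied from the zonal statistics snippet;
# they would need tuning per endpoint.
MAX_WORKERS = 4
MIN_ITEMS_FOR_THREADS = 20


def map_with_optional_threads(
    func,
    items,
    enabled=True,
    max_workers=MAX_WORKERS,
    min_items=MIN_ITEMS_FOR_THREADS,
):
    """Apply func to each item and return a list of (item, result) pairs.

    A thread pool is used only when threading is enabled, more than one
    worker is available, and there are enough items to justify the
    startup overhead. Otherwise the items are processed serially.
    """
    items = list(items)

    # Use at most half of the available cores, capped at max_workers.
    cpu = os.cpu_count() or 1
    workers = min(max_workers, max(1, cpu // 2))

    if enabled and workers > 1 and len(items) > min_items:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # Executor.map yields results in input order.
            results = list(pool.map(func, items))
    else:
        results = [func(item) for item in items]

    return list(zip(items, results))


if __name__ == "__main__":
    pairs = map_with_optional_threads(lambda x: x * x, range(25))
    print(pairs[:3])
```

Both branches return the same ordered list of pairs, so callers do not need to know which path ran. Note that in standard CPython, threads only speed up work that releases the GIL (I/O, or NumPy-style array operations), so each candidate endpoint should be timed with and without threading before enabling it.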

Use of multithreading to improve areas where parallel processing could have major benefits to response times #699

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Use of multithreading to improve areas where parallel processing could have major benefits to response times #699

Description

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions