-
Notifications
You must be signed in to change notification settings - Fork 119
Description
Hi,
I'm having an issue where after the last training epoch ends, when the validation set should be evaluated, a threading error caused by batchgenerators occurs.
I'm using the provided Docker image, with the predefined env_det_num_threads=6 and OMP_NUM_THREADS=1. ( I have additionally tried making a container with env_det_num_threads=1 to see if the problem was related to this flag, but the problem persisted).
To check if the problem was from our dataset I also tried this on the example dataset 000, and exact same problem happened.
Following is a set of traces of the problems that arise. It's always a thread error/exception. Looking at it, it feels like the batchgenerators class is having problems closing some threads?
Any help would be greatly appreciated :)
Best regards,
Nuno
Exception in thread Thread-5:
Traceback (most recent call last):
File "/opt/conda/lib/python3.8/threading.py", line 932, in _bootstrap_inner
self.run()
File "/opt/conda/lib/python3.8/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/opt/conda/lib/python3.8/site-packages/batchgenerators/dataloading/multi_threaded_augmenter.py", line 92, in results_loop
raise RuntimeError("One or more background workers are no longer alive. Exiting. Please check the print"
RuntimeError: One or more background workers are no longer alive. Exiting. Please check the print statements above for the actual error message
Exception ignored in: <function MultiThreadedAugmenter.__del__ at 0x7f3efd1a5ca0>
Traceback (most recent call last):
File "/opt/conda/lib/python3.8/site-packages/batchgenerators/dataloading/multi_threaded_augmenter.py", line 294, in __del__
self._finish()
File "/opt/conda/lib/python3.8/site-packages/batchgenerators/dataloading/multi_threaded_augmenter.py", line 276, in _finish
self._queues[i].close()
File "/opt/conda/lib/python3.8/multiprocessing/queues.py", line 137, in close
self._reader.close()
File "/opt/conda/lib/python3.8/multiprocessing/connection.py", line 177, in close
self._close()
File "/opt/conda/lib/python3.8/multiprocessing/connection.py", line 361, in _close
_close(self._handle)
OSError: [Errno 9] Bad file descriptor
Exception in thread Thread-3:
Traceback (most recent call last):
File "/opt/conda/lib/python3.8/threading.py", line 932, in _bootstrap_inner
self.run()
File "/opt/conda/lib/python3.8/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/opt/conda/lib/python3.8/site-packages/batchgenerators/dataloading/multi_threaded_augmenter.py", line 92, in results_loop
raise RuntimeError("One or more background workers are no longer alive. Exiting. Please check the print"
RuntimeError: One or more background workers are no longer alive. Exiting. Please check the print statements above for the actual error message