Hey!
I was running your program with some tests on my local cluster and noticed the following:
- The first stage, where the script decides on frames and resolutions, tends to fail on videos longer than ~20 minutes. My setup:
  - Resources: 64 GB RAM, CPU: 16
  - Input: an ideal podcast video with the 3 usual camera perspectives
- Framing on the clips is static: the frame region is chosen only once, so if the speaker moves a little, it does not readjust. Sometimes, even on static recordings, it tends to frame the speaker incorrectly.
Are there any solutions for these issues? I understand that dynamic frame capture is not possible with the current script, but maybe you know what could be changed or added to make it work. The same question applies to the video length.
Attached is the typical log output at the point where the script stops working:
2025-05-10 19:35:46,935 - DEBUG - Using 1 batches to extract and detect frames. Need 3.834 GiB of CPU memory per batch and 0.000 GiB of GPU memory per batch
2025-05-10 19:35:50,061 - ERROR - mmco: unref short failure
2025-05-10 19:36:27,752 - DEBUG - Detecting faces in 953 frames.
2025-05-10 19:37:34,757 - DEBUG - Detected faces in 953 frames.
2025-05-10 19:37:34,759 - DEBUG - Need 0.180 GiB to extract (at most) 70 frames
2025-05-10 19:37:34,759 - DEBUG - Face detection dimensions: 540x960
2025-05-10 19:37:34,760 - DEBUG - Need 0.101 GiB to detect faces from (at most) 70 frames
Thanks for the amazing program!