This is the directories the application will use. Should be configure in the docker-compose.yml or passed as arguments in the docker run command.
The input directory is where the application will search for the input movie file. The application will search for the file with the name configured in the IN_MOVIE environment variable.
- Directory:
/data/in-data - Access Mode:
roRead Only - Mandatory: Yes
The output directory is where the application will save the output transcript file. The application will save the file with the name configured in the OUT_TRANSCRIPT environment variable.
- Directory:
/data/out-data - Access Mode:
rwRead/Write - Mandatory: No
The temporary directory is where the application will save temporary files like audio extracts.
- Directory:
/data/tmp-data - Access Mode:
rwRead/Write - Mandatory: No
The application uses environment variables to configure the behavior of the application. The following variables are available:
The IN_MOVIE environment variable is used to configure the input movie file. The application will search for the file with the name configured in this variable in the /data/in-data directory.
- Variable:
IN_MOVIE - Default:
movie.mp4 - Mandatory: No
The OUT_TRANSCRIPT environment variable is used to configure the output transcript file. The application will save the file with the name configured in this variable in the /data/out-data directory.
- Variable:
OUT_TRANSCRIPT - Default:
transcript-{yyyyMMddHHmmss}.txt - Mandatory: No
The IN_MOVIE_LANG environment variable is used to configure the language of the audio in the input movie file. The application will use this language to extract the audio from the movie file.
- Variable:
IN_MOVIE_LANG - Default:
en-US - Mandatory: No
When the CHUNK_SIZE environment variable is set, the application will split the audio in chunks of this duration.
- Variable:
CHUNK_SIZE - Default: None
- Unit: Seconds
- Mandatory: No
When the OVERLAP_DURATION environment variable is set, the application will overlap the start and end of the chunks into the next and previous chunks.
It is useful to avoid cutting words in the middle of the chunks.
This configuration is only used when the CHUNK_SIZE is set.
- Variable:
OVERLAP_DURATION - Default: 0
- Unit: Seconds
- Mandatory: No