This README documentation is no longer maintained. Please refer to the official documentation page for the most up-to-date documentation version.
A simple tool for extracting frames from stereo videos and saving them as images, as described in the documentation and demonstrated in frameExtractionFromVideo.ipynb.
Version History
- Version 1.0 (Sep 2023): Initial version
- Version 1.1 (Oct 2023): Revised to include:
  - Appending fish length to image file name
  - Ability to control where images are read from and saved
  - Storing images in species-specific subdirectories
Feature Requests
- None
This script reads a data spreadsheet exported from fish measurement software and extracts the frame number corresponding to each annotation in each stereo video. The database is expected to contain the following columns of information:
- FilenameLeft - name of video files corresponding to left stereo channel
- FilenameRight - name of video files corresponding to right stereo channel
- FrameLeft - frame numbers to be extracted from FilenameLeft
- FrameRight - frame numbers to be extracted from FilenameRight
- Length - length of fish appearing in (FrameLeft, FrameRight) frame pair
- Family - taxonomic family classification of fish appearing in (FrameLeft, FrameRight) frame pair. If missing, will be filled with "fam".
- Genus - taxonomic genus classification of fish appearing in (FrameLeft, FrameRight) frame pair. If missing, will be filled with "gen".
- Species - taxonomic species classification of fish appearing in (FrameLeft, FrameRight) frame pair. If missing, will be filled with "sp".
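The column requirements above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the code in frameXtract.py. The function name `load_annotations`, the tab separator default, and the choice to treat the first five columns as mandatory are assumptions made for the example. It also assumes that "missing" covers both an absent column and an empty cell.

```python
import csv
import io

# Column names taken from the list above; treating these five as mandatory
# is an assumption of this sketch.
REQUIRED = ["FilenameLeft", "FilenameRight", "FrameLeft", "FrameRight", "Length"]
# Placeholder values the README says are used when taxonomy is missing.
TAXON_DEFAULTS = {"Family": "fam", "Genus": "gen", "Species": "sp"}


def load_annotations(handle, sep="\t"):
    """Read annotation rows and fill missing taxonomy with placeholders.

    Hypothetical helper, not part of frameXtract.py.
    """
    reader = csv.DictReader(handle, delimiter=sep)
    header = reader.fieldnames or []
    missing = [name for name in REQUIRED if name not in header]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    rows = []
    for row in reader:
        for column, placeholder in TAXON_DEFAULTS.items():
            # Covers both an absent column and an empty cell.
            value = (row.get(column) or "").strip()
            row[column] = value or placeholder
        rows.append(row)
    return rows


# Illustrative data only: one annotation with Genus and Species left blank.
sample = io.StringIO(
    "FilenameLeft\tFilenameRight\tFrameLeft\tFrameRight\tLength\tFamily\tGenus\tSpecies\n"
    "left.mp4\tright.mp4\t45\t47\t312.5\tLutjanidae\t\t\n"
)
rows = load_annotations(sample)
print(rows[0]["Family"], rows[0]["Genus"], rows[0]["Species"])
```

Checking the header before reading any rows means a malformed export fails early, before any video is opened.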
Optionally, a window can be passed to extract a specified number of frames on each side of the specified frame number -- see below for details. Images are written as jpg files following the naming convention:
videoFileName_frame-N_length-L.jpg
where N is the zero-padded frame number extracted from video file videoFileName and L is the length of the fish in the image. New image files are saved in species-specific subdirectories ./family/genus/species/, which are automatically created if they do not already exist. See below for instructions on how to control where these subdirectories are created.
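As a rough sketch of the naming convention, the helper below assembles the output path for one frame. The function names, the padding width of 6 digits, and the decision to drop the video file extension from videoFileName are all assumptions, since the README does not state them. The species used in the example is purely illustrative.

```python
from pathlib import Path


def image_path(root, family, genus, species, video_file, frame, length, pad=6):
    """Build root/family/genus/species/videoFileName_frame-N_length-L.jpg.

    Hypothetical helper; `pad` (zero-padding width) is an assumed value.
    """
    stem = Path(video_file).stem  # assumption: the video extension is dropped
    name = f"{stem}_frame-{frame:0{pad}d}_length-{length}.jpg"
    return Path(root) / family / genus / species / name


def ensure_parent(path):
    """Create the species subdirectory if it does not already exist."""
    path.parent.mkdir(parents=True, exist_ok=True)
    return path


# Illustrative values only.
target = image_path(".", "Lutjanidae", "Lutjanus", "griseus", "left.mp4", 45, 312.5)
print(target.as_posix())
```

Calling `ensure_parent(target)` before writing the image mirrors the behavior described above, where missing subdirectories are created automatically.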
The preferred method of usage is to run the script within its Docker container. You will need to install Docker or Docker Desktop (recommended) locally for this to work. The container includes the following components:
- frameXtract.py: Python script
- requirements.txt: Python library dependencies
- Dockerfile and docker-compose.yml: Docker container configuration files
- Install GitHub Desktop or GitHub Command Line Interface (CLI).
- Clone this GitHub repository. Alternatively, download the files listed above from the repository, either separately or as a .zip package, ensuring that they all end up co-located in the same local directory.
a. By default, the Python script, the data spreadsheet, and video files are all expected to be in the same directory, and the images will also be written to this same directory. In this case, either download the container files listed above into the same directory as the data and video files, or download the container files into a new directory and then copy or move the data and video files into that directory.
b. If the data spreadsheet file and video files are in separate directories, and/or if the images are to be written elsewhere, the docker-compose.yml file needs to be modified as follows:
- Data file: Under the 'volumes' heading, replace the "." before the colon (:) for the data directory with the full local directory of the data file. For example, if the data file is on a local Windows desktop, the new entry should read:
    volumes:
      # Application directory
      - .:/home
      # Image directory
      - .:/images
      # Video directory
      - .:/videos
      # Data directory
      - C:\Users\user.name\Desktop:/data
- Video files: Replace the "." before the colon (:) for the video directory with the full local directory containing the video files. For example, if the video files are on another drive, the new entry might read:
    volumes:
      # Application directory
      - .:/home
      # Image directory
      - .:/images
      # Video directory
      - D:\myVideos:/videos
      # Data directory
      - C:\Users\user.name\Desktop:/data
- Image files: To change where the images are written locally, replace the "." before the colon (:) for the image directory with the full local directory where the new image files, sorted by fish species, should be written. For example, if the image files are to be written in the local Documents folder, the new entry should read:
    volumes:
      # Application directory
      - .:/home
      # Image directory
      - C:\Users\user.name\Documents\images:/images
      # Video directory
      - D:\myVideos:/videos
      # Data directory
      - C:\Users\user.name\Desktop:/data
- Application file (rare): In the unlikely event that the Python script is not co-located with the container files, replace the "." before the colon (:) for the application directory with the full local directory containing the script. This should rarely, if ever, be necessary.
Some things to note:
- Docker is platform-agnostic. Mac OS or Linux directory chains can be used here as well.
- One can set any or all of the images, videos, or data directories. Any combination will work. The leading period (.) means "here" and is used to specify the current directory. Thus, if either the videos or the database annotations file are located in the same directory as the program, the volume mapping should retain the default ".".
- DO NOT change the directory chains after the colons. Doing so will break the script.
- This section of docker-compose.yml maps local directories to independent directories inside the container. The container will only be able to see the contents of local directories mounted here. If you run into "file not found" errors, look here first.
- Open a Command Prompt (Windows) or Terminal (Mac) and navigate to the directory containing the container files.
- Launch Docker or Docker Desktop, which must be running in order for the next commands to work.
- Build the Docker container:
docker-compose build
- To execute, run
docker-compose run framextract -f dataFilename.ext
where dataFilename.ext is the name of the annotations database file. (Do not pass the full directory chain; just the file name.) Additional options are described below.
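When troubleshooting "file not found" errors, it can help to check the volume entries in docker-compose.yml mechanically. The sketch below is a hypothetical helper, not part of this repository. It splits each entry at the last colon, so Windows drive letters such as `C:` stay in the local part. It then confirms that the four container-side directories named in the volume entries (/home, /images, /videos, /data) are all still present. It assumes the simple `local:container` form used in this README, with no trailing access mode.

```python
# Container-side directories used in the volume entries of docker-compose.yml.
EXPECTED = {"/home", "/images", "/videos", "/data"}


def split_volume(entry):
    """Split 'local:container' at the LAST colon so 'C:\\...' paths survive."""
    host, sep, container = entry.rpartition(":")
    if not sep or not host or not container.startswith("/"):
        raise ValueError(f"not a local:container mapping: {entry!r}")
    return host, container


def check_volumes(entries):
    """Return {container_dir: local_dir}; fail if a container dir was changed."""
    mapping = {}
    for entry in entries:
        host, container = split_volume(entry)
        mapping[container] = host
    missing = EXPECTED - set(mapping)
    if missing:
        raise ValueError(f"container directories missing or renamed: {sorted(missing)}")
    return mapping


volumes = [
    ".:/home",
    r"C:\Users\user.name\Documents\images:/images",
    r"D:\myVideos:/videos",
    r"C:\Users\user.name\Desktop:/data",
]
for container, host in check_volumes(volumes).items():
    print(f"{container} <- {host}")
```

A renamed container directory (the part after the colon) raises an error here, which matches the warning above not to change that side of the mapping.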
Alternatively, the script can be run stand-alone, outside of its Docker container (not recommended):
- Clone this GitHub repository. Alternatively, download the script file frameXtract.py and the package dependency file requirements.txt from the repository.
- Download and install Python if needed. This program was written in Python 3.11.
- Highly recommended: Create a virtual environment and install the package dependencies in requirements.txt.
- Execute by passing a full directory path for the data spreadsheet file using -f or --file, a full directory path for the videos using -v or --video, and (optionally) a full directory path for new images using -i or --image. This will tell the script that it is being run stand-alone instead of within a container:
python framextract.py --file full/path/to/dataFilename.ext --video full/path/to/videoFiles --image full/path/for/imageFiles
Note that this method may be finicky due to potential version conflicts if the virtual environment does not get set up properly.
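To illustrate the difference between the two modes, the sketch below resolves the data, video, and image locations either from the container-side directories (/data, /videos, /images) or from the paths passed on the command line. How frameXtract.py actually detects stand-alone use is not documented here. Keying on the presence of a video path, and falling back to the current directory when no image path is given, are both assumptions of this example, as is the name `resolve_dirs`.

```python
from pathlib import Path

# Container-side mount points named in docker-compose.yml.
CONTAINER_DIRS = {
    "data": Path("/data"),
    "videos": Path("/videos"),
    "images": Path("/images"),
}


def resolve_dirs(file_arg, video_arg=None, image_arg=None):
    """Return (data_file, video_dir, image_dir) for either mode.

    Hypothetical helper: a missing video_arg is ASSUMED to mean container use.
    """
    if video_arg is None:
        # Container mode: only the bare file name is expected.
        data_file = CONTAINER_DIRS["data"] / Path(file_arg).name
        return data_file, CONTAINER_DIRS["videos"], CONTAINER_DIRS["images"]
    # Stand-alone mode: full paths come from the command line.
    data_file = Path(file_arg)
    video_dir = Path(video_arg)
    # Assumption: images default to the current directory if --image is absent.
    image_dir = Path(image_arg) if image_arg else Path.cwd()
    return data_file, video_dir, image_dir


print(resolve_dirs("dataFilename.ext"))
print(resolve_dirs("full/path/to/dataFilename.ext",
                   "full/path/to/videoFiles",
                   "full/path/for/imageFiles"))
```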
This program contains a number of options to customize usage, a summary of which can be accessed by using the -h or --help flag, for example:
docker-compose run framextract -h
The -f (--file) option is the only required argument when running within the Docker container (the preferred method). It specifies the data spreadsheet file to consult for extracting frames. It must contain the following columns of information:
- FilenameLeft - name of video files corresponding to left stereo channel
- FilenameRight - name of video files corresponding to right stereo channel
- FrameLeft - frame numbers to be extracted from FilenameLeft
- FrameRight - frame numbers to be extracted from FilenameRight
- Length - length of fish appearing in (FrameLeft, FrameRight) frame pair
- Family - taxonomic family classification of fish appearing in (FrameLeft, FrameRight) frame pair. If missing, will be filled with "fam".
- Genus - taxonomic genus classification of fish appearing in (FrameLeft, FrameRight) frame pair. If missing, will be filled with "gen".
- Species - taxonomic species classification of fish appearing in (FrameLeft, FrameRight) frame pair. If missing, will be filled with "sp".
Example:
docker-compose run framextract -f dataFilename.ext
The --sep option specifies the separator (delimiter) for the data spreadsheet file. If no separator is provided at execution, the script tries to determine the separator from the file extension (e.g., "tsv" for tab-delimited, "csv" for comma-separated) or lets the Python parsing engine try to determine it automatically. If these attempts fail, the program will exit and the user will be prompted to re-launch using this flag. This flag is rarely needed if the file is tab- or comma-separated, as the automatic determination should rarely fail in either case.
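The separator fallback just described can be approximated with the standard library. This is a sketch under assumptions, not the script's own logic. It uses `csv.Sniffer` as a stand-in for "the Python parsing engine", and both the function name `infer_separator` and the candidate delimiter set are illustrative.

```python
import csv
from pathlib import Path

# Extension-based guesses mentioned above.
EXTENSION_SEPARATORS = {".tsv": "\t", ".csv": ","}


def infer_separator(filename, sample_text=""):
    """Guess the delimiter from the extension, then from a text sample.

    Hypothetical helper; csv.Sniffer stands in for the parser's auto-detection.
    """
    extension = Path(filename).suffix.lower()
    if extension in EXTENSION_SEPARATORS:
        return EXTENSION_SEPARATORS[extension]
    try:
        # Candidate delimiters are an assumption of this sketch.
        return csv.Sniffer().sniff(sample_text, delimiters=",\t;|").delimiter
    except csv.Error:
        # Mirrors the documented behavior: exit and ask for the --sep flag.
        raise SystemExit("Could not determine the separator; re-run with --sep.")


print(repr(infer_separator("annotations.tsv")))
print(repr(infer_separator("annotations.txt", "a;b;c\n1;2;3\n4;5;6\n")))
```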
Example:
docker-compose run framextract --file dataFilename.ext --sep \t
For some applications, it may be appropriate to extract a range of frames rather than a single frame. The --window option allows the user to specify a number of frames to extract before and after the frame number reported in the spreadsheet. For example, if this is set to 5 and the data contains an annotation at frame 45, then frames 40-50 will be extracted and saved separately (frame 45 plus 5 frames on each side).
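The window arithmetic is easy to express directly. The sketch below lists the frame numbers covered by a window around an annotated frame. The function name is hypothetical, and clamping at frame 0 and at an optional last frame is an assumption, because the README does not say how the script handles windows that run past either end of a video.

```python
def frames_in_window(frame, window=0, last_frame=None):
    """Return the frame numbers from frame - window to frame + window, inclusive.

    Hypothetical helper; clamping to [0, last_frame] is an assumption.
    """
    if window < 0:
        raise ValueError("window must be non-negative")
    first = max(frame - window, 0)
    last = frame + window
    if last_frame is not None:
        last = min(last, last_frame)
    return list(range(first, last + 1))


# The example from the text: window 5 around frame 45 gives frames 40-50.
print(frames_in_window(45, 5))
```

With the default window of 0, the function returns only the annotated frame, which corresponds to running the script without the flag.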
Example:
docker-compose run framextract --file dataFilename.ext --window 5
The -v (--video) option lets the user specify where the videos are located. It is used only, and is required, when the program is run stand-alone outside of its Docker container, which is not recommended. It is never needed if the program is run from its Docker container -- in fact, it will be ignored in this case, since the video location is provided in the docker-compose.yml file, as described above.
The -i (--image) option lets the user specify where the images should be saved. It is used only when the program is run stand-alone outside of its Docker container, which is not recommended. It is never needed if the program is run from its Docker container -- in fact, it will be ignored in this case, since the image location is provided in the docker-compose.yml file, as described above.
The -p flag takes no arguments; passing it will print updates of successful file reading and writing to the screen.
Example:
docker-compose run framextract --file dataFilename.ext --window 5 -p
The -h (--help) flag displays the help documentation.
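For orientation, the options described above could be declared with `argparse` roughly as follows. This is an illustrative sketch rather than the parser in frameXtract.py. Only the flag spellings that appear in this README are used, while the `dest` name `print_updates`, the defaults, and the help strings are assumptions. `argparse` adds -h/--help automatically.

```python
import argparse


def build_parser():
    """Hypothetical parser covering the flags named in this README."""
    parser = argparse.ArgumentParser(
        prog="framextract",
        description="Extract annotated frames from stereo videos.",
    )
    parser.add_argument("-f", "--file", required=True,
                        help="data spreadsheet file")
    parser.add_argument("--sep", default=None,
                        help="separator for the data file (auto-detected if omitted)")
    parser.add_argument("--window", type=int, default=0,
                        help="frames to extract on each side of the annotation")
    parser.add_argument("-v", "--video", default=None,
                        help="video directory (stand-alone use only)")
    parser.add_argument("-i", "--image", default=None,
                        help="image directory (stand-alone use only)")
    # The README gives only the short form; the dest name is an assumption.
    parser.add_argument("-p", action="store_true", dest="print_updates",
                        help="print file reading and writing updates")
    return parser


args = build_parser().parse_args(["--file", "dataFilename.ext", "--window", "5", "-p"])
print(args.file, args.window, args.print_updates)
```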
This program remains under active development. This page will be updated as the script evolves.
A demonstration and explanation of the script workflow is provided in an accompanying Jupyter notebook.
This repository is a scientific product and is not official communication of the National Oceanic and Atmospheric Administration, or the United States Department of Commerce. All NOAA GitHub project code is provided on an "as is" basis and the user assumes responsibility for its use. Any claims against the Department of Commerce or Department of Commerce bureaus stemming from the use of this GitHub project will be governed by all applicable Federal law. Any reference to specific commercial products, processes, or services by service mark, trademark, manufacturer, or otherwise, does not constitute or imply their endorsement, recommendation or favoring by the Department of Commerce. The Department of Commerce seal and logo, or the seal and logo of a DOC bureau, shall not be used in any manner to imply endorsement of any commercial product or activity by DOC or the United States Government.