Questions about data preprocessing #52

@asm3242

Looking at run_detect_segment, it appears to require an annotation file for each video, with each entry consisting of a start time, an end time, and a text prompt. (The text prompt is not used in the code.)
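To make the question concrete, here is a minimal sketch of how I currently read that format. Everything in it is my assumption rather than something taken from the repository: the JSON layout, the field names `start`, `end`, and `text`, the use of seconds as the time unit, and the helper `parse_segments` are all hypothetical.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    start: float  # segment start time (assumed to be in seconds)
    end: float    # segment end time (assumed to be in seconds)
    text: str     # text prompt; as far as I can tell, unused by run_detect_segment


def parse_segments(raw: str) -> List[Segment]:
    """Parse a JSON list of {"start", "end", "text"} records (hypothetical layout)."""
    segments = []
    for item in json.loads(raw):
        seg = Segment(float(item["start"]), float(item["end"]), str(item.get("text", "")))
        # Reject entries whose end time does not come after the start time.
        if seg.end <= seg.start:
            raise ValueError(f"segment does not end after it starts: {seg}")
        segments.append(seg)
    return segments


# Made-up example data, only to illustrate the assumed structure.
example = (
    '[{"start": 0.0, "end": 4.5, "text": "pick up the cup"},'
    ' {"start": 4.5, "end": 9.0, "text": "pour water"}]'
)
segments = parse_segments(example)
```

If the real file uses a different layout or different field names, a correction would be very welcome.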

Are these annotations created manually, or can they be generated automatically?

Also, when extracting features through the CLIP encoder in the run_clip_filtering file, what text input is required?

Finally, when will the pre-training dataset be released?

Thank you
