This is the official repository of *CascadedViT: Cascaded Chunk-FeedForward and Cascaded Group Attention Vision Transformer* by Srivathsan Sivakumar and Faisal Z. Qureshi.
CascadedViT uses cascaded Chunk-FFNs to produce a family of lightweight, compute-efficient, high-speed vision transformers.
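The exact Chunk-FFN design is specified in the paper; the sketch below is only one plausible reading of "chunked feedforward with a cascading nature", assuming the feature dimension is split into chunks and each chunk's output is added into the next chunk's input (mirroring the cascading pattern of cascaded group attention). The function name, chunk count, and weight layout are illustrative, not the repository's API.

```python
import numpy as np

def cascaded_chunk_ffn(x, weights, chunks=4):
    """Hypothetical cascaded chunk-FFN sketch (not the official implementation).

    x       : (batch, dim) input; dim must be divisible by `chunks`.
    weights : list of `chunks` pairs (w1, w2), one small FFN per chunk.
    """
    parts = np.split(x, chunks, axis=-1)     # split features into chunks
    carry = np.zeros_like(parts[0])          # cascading signal between chunks
    outs = []
    for p, (w1, w2) in zip(parts, weights):
        h = np.maximum(0.0, (p + carry) @ w1)  # chunk FFN: Linear -> ReLU
        o = h @ w2                             # -> Linear back to chunk width
        outs.append(o)
        carry = o                              # feed this chunk's output forward
    return np.concatenate(outs, axis=-1)
```

Because each per-chunk FFN operates on dim/chunks features, its weight matrices are a fraction of the size of a monolithic FFN's, which is one way such a design can reduce parameters and FLOPs.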
CascadedViT models consistently achieve top-ranking efficiency on a new metric called Accuracy-Per-FLOP (APF), which quantifies accuracy relative to compute cost.
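The paper defines APF precisely; the snippet below only illustrates the simplest reading of the name, accuracy divided by compute, and uses made-up numbers (not reported results) to show how models would be ranked by it.

```python
def accuracy_per_flop(top1_acc: float, gflops: float) -> float:
    # Simplest reading of "Accuracy-Per-FLOP"; the paper's exact
    # normalization may differ.
    return top1_acc / gflops

# Hypothetical models: (top-1 accuracy %, GFLOPs). Illustration only.
models = {"model_a": (78.0, 1.5), "model_b": (80.0, 4.0)}
ranked = sorted(models, key=lambda m: accuracy_per_flop(*models[m]), reverse=True)
```

Under this reading, a slightly less accurate but much cheaper model can rank higher on APF, which is the point of the metric.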
CascadedViT models were trained on ImageNet-1K for classification. For further details please refer to classification.
The ImageNet-1K-pretrained CascadedViT-L was used for transfer-learning experiments on the MS-COCO dataset for object detection and instance segmentation. For further details please refer to downstream.
Please find our license here.

