PolyU-VCLab

Visual Computing Lab, The Hong Kong Polytechnic University

Welcome to the Visual Computing Lab at HK PolyU 👋

The Visual Computing Lab at The Hong Kong Polytechnic University, led by Prof. Lei Zhang, works on image/video restoration and quality assessment, multimodal perception and reasoning, image/video synthesis, 3D perception and generation, efficient architectures and training, as well as benchmarks and datasets.

Here we share our projects, code, models, benchmarks, datasets, and demos.

𝕏 X | 🤗 Hugging Face | 📕 Xiaohongshu


🔥 Research Areas and Representative Works

🖼️ Image / Video Restoration, Enhancement and Quality Assessment

Real-world image and video restoration, enhancement, super-resolution, and perceptual quality assessment.

2026

2025

2024

2023

2022

2021

2020

2019

2018

2017

🤖 Multimodal Perception, Understanding and Reasoning

MLLM-based visual perception, grounding, OOD detection, dense understanding, and multimodal reasoning.

2026

2025

2024

🎨 Image and Video Synthesis and Generation

Efficient, controllable, and high-quality generative models for image synthesis, editing, and video generation.

2026

2025

2024

2023

2022

🌍 3D Perception, Reconstruction and Generation

3D reconstruction, scene generation, and versatile 3D editing from images, videos, and language prompts.

2026

2025

2024

2023

⚡ Architecture and Training Paradigms

New model architectures and efficient training paradigms for vision models, diffusion transformers, LLMs, and VLMs.

2026

2025

2024

2023

2022

2020

  • Gradient Centralization (ECCV 2020) Paper | Code
  • SA-SSD (CVPR 2020) Paper | Code

2018

📊 Benchmarks and Datasets

Benchmarks and datasets for rigorous evaluation and reproducible progress in visual computing.

2026

2023

2021

2019


📚 More

For a broader list of our papers, models, and datasets, please visit our Hugging Face Collections.

If you are interested in our work, you are welcome to follow the organization and star our repositories ⭐

Popular repositories

  1. GGT-100K (Public)

    GGT-100K: Generative Ground Truth for Generalizable Real-World Image Restoration

    Python · 65 stars · 4 forks

  2. OpenOOD-VLM (Public)

    ECCV24, NeurIPS24, CVPR26*2, ECCV26, Benchmarking Generalized Out-of-Distribution Detection with Vision-Language Models

    Python · 38 stars · 7 forks

  3. DepthMaster (Public)

    DepthMaster: Unified Monocular Depth Estimation for Perspective and Panoramic Images

    Python · 24 stars

  4. WRC (Public)

    Weighted Reverse Convolution for Feature Upsampling

    Python · 23 stars

  5. TVEdit (Public)

    22 stars

  6. DEL (Public)

    Digit Entropy Loss for Numerical Learning of LLMs

    Python · 1 star
