Building interactive AI vision tools for the browser

In this talk we show how we developed AI vision tools that can be used to draw bounding boxes or segmentation masks on users' vision datasets. These annotation tools run locally in the browser without any server communication.

We first review some of the classical annotation tools that we implemented for Intel Geti using traditional computer vision with OpenCV.

Then we dive into integrating ONNXRuntime, which we used to implement interactive automatic segmentation tools built on models such as Meta's Segment Anything Model (SAM). We cover implementation details such as the encoder/decoder strategies, preprocessing (image resizing, color normalization), and postprocessing steps.
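As a rough illustration of the preprocessing steps mentioned above, the sketch below computes a target size that keeps the aspect ratio and converts RGBA pixels into a normalized, planar float tensor. The 1024 pixel target and the mean/std values follow the defaults published with the original SAM code; whether the slides or Geti use the same values is an assumption, and the function names `resizeLongestSide` and `rgbaToNormalizedCHW` are hypothetical.

```typescript
// Per-channel statistics on a 0-255 scale (assumed: the original SAM defaults).
const MEAN = [123.675, 116.28, 103.53];
const STD = [58.395, 57.12, 57.375];

/** Scale (width, height) so the longest side equals `target`, keeping the aspect ratio. */
function resizeLongestSide(
    width: number,
    height: number,
    target = 1024
): { width: number; height: number } {
    const scale = target / Math.max(width, height);
    return { width: Math.round(width * scale), height: Math.round(height * scale) };
}

/**
 * Convert interleaved RGBA bytes (the layout of ImageData.data) into a
 * normalized planar CHW Float32Array, dropping the alpha channel.
 */
function rgbaToNormalizedCHW(
    rgba: Uint8ClampedArray,
    width: number,
    height: number
): Float32Array {
    const pixels = width * height;
    const out = new Float32Array(3 * pixels);
    for (let i = 0; i < pixels; i++) {
        for (let c = 0; c < 3; c++) {
            out[c * pixels + i] = (rgba[i * 4 + c] - MEAN[c]) / STD[c];
        }
    }
    return out;
}
```

In a browser, the RGBA bytes would typically come from drawing the resized image onto a canvas and reading it back with `getImageData`; the resulting array can then be wrapped in an input tensor for the encoder.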

We share practical optimization tips, such as using Web Workers, choosing the right encoder model, and efficiently computing the postprocessing steps for high-resolution images. Looking ahead, we discuss emerging technologies like WebGPU and WebNN that allow us to perform inference outside of the CPU.
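To give a feel for what an efficient postprocessing step can look like, here is a minimal sketch that turns a decoder's raw mask logits into a binary mask and a bounding box in a single pass over the pixels. It is not the code from the slides or from Geti: the function name `thresholdMask`, the `BoundingBox` shape, and the default threshold of 0 (the value used by the original SAM code) are assumptions.

```typescript
interface BoundingBox {
    x: number;
    y: number;
    width: number;
    height: number;
}

/**
 * Threshold row-major mask logits and compute the tight bounding box of the
 * foreground pixels in one pass. Returns `box: null` if no pixel is foreground.
 */
function thresholdMask(
    logits: Float32Array,
    width: number,
    height: number,
    threshold = 0
): { mask: Uint8Array; box: BoundingBox | null } {
    const mask = new Uint8Array(width * height);
    let minX = width;
    let minY = height;
    let maxX = -1;
    let maxY = -1;

    for (let y = 0; y < height; y++) {
        const row = y * width;
        for (let x = 0; x < width; x++) {
            if (logits[row + x] > threshold) {
                mask[row + x] = 1;
                if (x < minX) minX = x;
                if (x > maxX) maxX = x;
                if (y < minY) minY = y;
                if (y > maxY) maxY = y;
            }
        }
    }

    const box =
        maxX < 0
            ? null
            : { x: minX, y: minY, width: maxX - minX + 1, height: maxY - minY + 1 };
    return { mask, box };
}
```

Because the function only touches typed arrays, it could also run inside a Web Worker so that large masks do not block the main thread.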

About the slides

Important

While I've tried to clean up the code for the slides, it is not production ready. It generally lacks error handling, as its primary purpose is to support presenting this talk. Please check out Geti's codebase if you're interested in seeing the code we use in production.

These slides were made in React, using Spectacle. My goal was for them to be interactive and less boring than a plain PowerPoint or Google Slides deck.

The code for these slides borrows concepts from Geti's Web UI code.

The "Using OpenCV to implement Grabcut" and "SAM demo" slides use our custom-compiled OpenCV.js code as well as ONNXRuntime to run AI models in the browser.

Train your own AI vision models with Geti

The tools shown in this presentation were originally developed for Geti. If your team is looking into training custom AI vision models, have a look at our GitHub or our docs and deploy Geti for free on your own hardware.

