Two-handed gesture mouse and system controller for Linux. Use your webcam to move the cursor, click, drag, scroll, adjust volume, switch workspaces, and launch apps, all without touching anything.
Right hand controls the mouse. Left hand controls system functions. Built on MediaPipe hand-tracking, with engineering tricks to make gestures stable, low-latency, and immune to camera distance.
Tested on Pop!_OS 22.04 with Wayland. Should work on any Linux distro with X11 or Wayland.
Right hand: mouse control
- Move the cursor with an open hand
- Pinch thumb + index → click & drag
- Pinch thumb + middle → right click
- Pinch thumb + ring → middle click (scroll button)
- Bring index + middle close → scroll up/down
- Closed fist → pause everything
Left hand: system control
- 👌 OK sign → toggle app launcher
- ✌️ Peace sign → unlock/lock volume slider
- 🤙 Shaka sign → lock all hand controls (security)
- Thumb + middle pinch (with index up) → Super+Tab (workspace switch)
- Open hand (after unlock) → adjust volume by thumb-index distance
Engineering smarts that make it actually usable
- 1-Euro filter: no jitter at rest, no lag during fast moves
- Hysteresis on every gesture: separate engage/release thresholds prevent flicker
- Scale-invariant detection: works the same at any camera distance
- Cursor freeze on pinch: prevents pull-off when clicking small UI targets
- Edge-reachability scaling: camera area 15-85% maps to full screen, so corners are reachable
- Anti-background-hand filter: ignores tiny/distant hands so passersby can't take over your mouse
- Volume rollback safety: cancel a volume gesture and it reverts to where it was 0.5s before you started
- Baton transfer: left hand becomes the mouse during launcher mode, then hands control back
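To make the edge-reachability idea concrete, here is a minimal sketch of how the 15-85% camera band could be stretched onto the full screen. The bounds mirror the `CAM_*` values in the configuration; the function name `camera_to_screen` and the clamping behavior are assumptions for illustration, not the code in mouse_control.py.

```python
# Camera band that maps to the full screen (same values as the CAM_* config).
CAM_X_MIN, CAM_X_MAX = 0.15, 0.85
CAM_Y_MIN, CAM_Y_MAX = 0.15, 0.85


def camera_to_screen(x_norm, y_norm, screen_w, screen_h):
    """Map a normalized camera coordinate (0..1) to a screen pixel.

    Hypothetical helper: positions outside the band are clamped to the
    nearest screen edge, so the hand never has to reach the frame border.
    """
    # Rescale the cropped band to 0..1.
    u = (x_norm - CAM_X_MIN) / (CAM_X_MAX - CAM_X_MIN)
    v = (y_norm - CAM_Y_MIN) / (CAM_Y_MAX - CAM_Y_MIN)
    # Clamp so coordinates outside the band stick to the screen edge.
    u = min(max(u, 0.0), 1.0)
    v = min(max(v, 0.0), 1.0)
    return round(u * (screen_w - 1)), round(v * (screen_h - 1))
```

With this mapping, a hand at 15% of the frame width already puts the cursor on the left edge of the screen, and anything further out stays pinned there.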
- OS: Linux (Wayland or X11)
- Python: 3.10+
- Hardware: any webcam, built-in or USB
- System packages: ydotool, xdotool, wireplumber (wpctl), libnotify (notify-send), wl-clipboard
# Ubuntu / Debian / Pop!_OS
sudo apt install ydotool xdotool wireplumber libnotify-bin wl-clipboard
# Fedora
sudo dnf install ydotool xdotool wireplumber libnotify wl-clipboard
# Arch
sudo pacman -S ydotool xdotool wireplumber libnotify wl-clipboard

ydotool needs a background daemon (ydotoold) to send input events:

systemctl --user enable --now ydotoold.service
systemctl --user status ydotoold

If you get permission errors, add your user to the input group, then log out and back in:

sudo usermod -aG input $USER

Install the Python dependencies:

pip install opencv-python mediapipe

Clone the repository and run the program:

git clone https://github.com/YOUR_USERNAME/gesture-mouse-control.git
cd gesture-mouse-control
python3 mouse_control.py

Press q in the camera preview window to quit.
Right hand (mouse control):

| Gesture | Action | Notes |
|---|---|---|
| Open hand 🖐 | Move cursor | Point with whole hand |
| Pinch thumb + index 🤏 | Click / Drag | Hold to drag, release to click |
| Pinch thumb + middle | Right click | Fires once per gesture |
| Pinch thumb + ring | Middle click | The scroll-wheel button |
| Index + middle close together | Scroll | Move hand up/down to scroll |
| Closed fist ✊ | Pause | Cursor freezes, no clicks fire |
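The pinch gestures rely on two of the ideas listed earlier: hysteresis and scale-invariant detection. The sketch below shows one way to combine them, assuming landmarks arrive as a list of 21 `(x, y)` tuples in MediaPipe Hands order. Dividing by the palm length is an assumed normalization, and `PINCH_RELEASE` is a hypothetical value; only the 0.4 engage threshold comes from the configuration.

```python
import math

# MediaPipe Hands landmark indices.
WRIST, THUMB_TIP, INDEX_TIP, MIDDLE_MCP = 0, 4, 8, 9

PINCH_ENGAGE = 0.4   # same value as PINCH_THRESHOLD in the config
PINCH_RELEASE = 0.5  # hypothetical: must open wider than this to release


def pinch_ratio(landmarks):
    """Thumb-index distance divided by palm length (wrist to middle knuckle).

    The ratio stays the same whether the hand is near or far from the camera.
    """
    palm = math.dist(landmarks[WRIST], landmarks[MIDDLE_MCP])
    if palm == 0:
        return math.inf  # degenerate hand: never counts as a pinch
    return math.dist(landmarks[THUMB_TIP], landmarks[INDEX_TIP]) / palm


class PinchDetector:
    """Pinch state with separate engage and release thresholds."""

    def __init__(self):
        self.active = False

    def update(self, landmarks):
        ratio = pinch_ratio(landmarks)
        if self.active:
            # Already pinching: only release once clearly open again.
            if ratio > PINCH_RELEASE:
                self.active = False
        elif ratio < PINCH_ENGAGE:
            self.active = True
        return self.active
```

The gap between the two thresholds is what stops a drag from dropping when the measured distance wobbles around a single cutoff.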
Left hand (system control):

| Gesture | Action | Notes |
|---|---|---|
| OK sign 👌 | Open/close app launcher | Switches "mouse" to left hand |
| Peace sign ✌️ | Unlock/lock volume | Hold ring + pinky down with thumb |
| Shaka 🤙 | Toggle hand-control lock | Pinky up only; disables all controls |
| Pinch thumb + middle (index up) | Super+Tab | Workspace switcher |
| Open hand (after Peace unlock) | Adjust volume | Thumb-index distance maps to 0-100% |
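For the volume gesture, the mapping from thumb-index distance to a volume level could look roughly like this. The `lo` and `hi` bounds and both function names are hypothetical; the only external piece is the `wpctl set-volume` command from WirePlumber, which takes a sink ID (here the `@DEFAULT_AUDIO_SINK@` alias) and a level where 1.0 means 100%.

```python
def distance_to_volume(ratio, lo=0.2, hi=1.2):
    """Map a normalized thumb-index distance to a volume in 0.0..1.0.

    lo and hi are hypothetical bounds: at or below lo the volume is 0%,
    at or above hi it is 100%, with a linear ramp in between.
    """
    t = (ratio - lo) / (hi - lo)
    return min(max(t, 0.0), 1.0)


def wpctl_command(volume):
    """Build the wpctl argument list for a volume level in 0.0..1.0."""
    return ["wpctl", "set-volume", "@DEFAULT_AUDIO_SINK@", f"{volume:.2f}"]
```

The resulting list can be passed to `subprocess.run`. Building it separately keeps the mapping easy to test without touching the audio system.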
All tunables are at the top of mouse_control.py:
FRAME_WIDTH = 480 # webcam capture width
FRAME_HEIGHT = 360 # webcam capture height
PINCH_THRESHOLD = 0.4 # how close fingers must be to register a pinch
SCROLL_THRESHOLD = 0.3 # closeness for scroll gesture
PAUSE_THRESHOLD = 1.0 # closeness for pause gesture
# Camera mapping bounds: crops the camera area so screen edges are reachable
CAM_X_MIN = 0.15
CAM_X_MAX = 0.85
CAM_Y_MIN = 0.15
CAM_Y_MAX = 0.85

By default, mouse_control.py looks for app_launcher_gui.py in the same folder as itself. If you keep them apart, edit this near the top of mouse_control.py:

APP_LAUNCHER_PATH = "/full/path/to/app_launcher_gui.py"

The default launcher has three buttons: Chrome, Spotify, and AWS Kiro CLI. To change them, edit app_launcher_gui.py:
btn1 = tk.Button(f, text="🌐 Chrome",
                 command=lambda: launch_app("google-chrome"),
                 **btn_style)
btn2 = tk.Button(f, text="🎵 Spotify",
                 command=lambda: launch_app("spotify"),
                 **btn_style)
btn3 = tk.Button(f, text="🤖 AI Terminal",
                 command=lambda: launch_app("gnome-terminal -- kiro-cli"),
                 **btn_style)

Replace the shell command inside each launch_app("...") with whatever app you want to open. You can also add or remove buttons.
The third button (🤖 AI Terminal) opens AWS Kiro CLI in a new terminal. If you don't have Kiro installed, the button does nothing. You have two options:
- Install Kiro CLI: see the AWS Kiro documentation for instructions
- Replace the button: edit app_launcher_gui.py and change kiro-cli to whatever AI tool or terminal command you prefer
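If you would rather have a button fail visibly than do nothing, one option is to check that the programs exist before launching. The helpers below are a hypothetical sketch and not the `launch_app` shipped in app_launcher_gui.py. They assume the command is a plain argument string and that anything after `--` is the program the terminal should run, as in `gnome-terminal -- kiro-cli`.

```python
import shlex
import shutil
import subprocess


def missing_executables(command):
    """Return the programs named in `command` that are not on PATH.

    Checks the first word and, if present, the word right after "--".
    """
    argv = shlex.split(command)
    programs = argv[:1]
    if "--" in argv[:-1]:
        programs.append(argv[argv.index("--") + 1])
    return [p for p in programs if shutil.which(p) is None]


def launch_if_available(command):
    """Start `command` detached; return False if something is missing."""
    argv = shlex.split(command)
    if not argv or missing_executables(command):
        return False
    subprocess.Popen(
        argv,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # keep the app alive if the launcher exits
    )
    return True
```

A button callback could call `launch_if_available(...)` and show a notification when it returns `False`.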
- Auto-recenter is unreliable. When the right hand re-enters the frame after being absent for 0.5s or more, the program tries to snap the cursor to screen center. On some Wayland/ydotool combinations the cursor lands at the bottom-right corner instead. The rest of the gesture system is unaffected. If this annoys you, comment out the snap_to_center(sw, sh) call inside the main loop.
- Linux only. Depends on ydotool, wpctl, and wl-paste. macOS and Windows are not supported.
- First-frame jitter. MediaPipe needs ~0.4s to stabilize when a hand enters the frame. The program already debounces the left hand for this reason.
Make sure the daemon is running and your user is in the input group:

systemctl --user status ydotoold
groups | grep input

The program tries camera index 2 (external webcam) first, then falls back to 0 (built-in). If you have a different setup, edit lines 326-328 in mouse_control.py.
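The fallback order described above can be sketched like this, in case you want to adapt it to other camera indices. The function name `open_camera` and the injectable `opener` argument are assumptions made so the logic can be tried without a camera; the actual lines in mouse_control.py may differ. `cv2.VideoCapture`, `isOpened()` and `release()` are the standard OpenCV calls.

```python
def open_camera(indices=(2, 0), opener=None):
    """Return (index, capture) for the first camera that opens.

    Returns (None, None) if no index works. `opener` defaults to
    cv2.VideoCapture and exists only to make the helper testable.
    """
    if opener is None:
        import cv2  # imported lazily so the helper loads without OpenCV
        opener = cv2.VideoCapture
    for index in indices:
        cap = opener(index)
        if cap.isOpened():
            return index, cap
        cap.release()  # free the failed handle before trying the next one
    return None, None
```

For example, `open_camera(indices=(1, 0))` would prefer a camera at index 1 and fall back to the built-in one.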
You may see these warnings; they are all harmless:
- AttributeError: 'MessageFactory' object has no attribute 'GetPrototype' (MediaPipe / protobuf version mismatch; doesn't affect functionality)
- qt.qpa.plugin: Could not find the Qt platform plugin "wayland" (OpenCV's Qt rendering on Wayland; camera preview still works)
- QFontDatabase: Cannot find font directory (cosmetic; silence with sudo apt install fonts-dejavu)
- Lighting matters: MediaPipe needs to see your hand clearly
- Try adjusting PINCH_THRESHOLD and other CONFIG values
- Lower webcam framerate makes the 1-Euro filter underperform; try a different camera if available
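To see why frame rate matters for smoothing, here is a compact sketch of a 1-Euro filter for a single coordinate, following the commonly published form of the algorithm. The parameter defaults are illustrative and not the values used in mouse_control.py. The smoothing factor is computed from the time step `dt`, so fewer frames per second means larger steps and coarser speed estimates.

```python
import math


class OneEuroFilter:
    """Adaptive low-pass filter: heavy smoothing when slow, light when fast."""

    def __init__(self, min_cutoff=1.0, beta=0.0, d_cutoff=1.0):
        self.min_cutoff = min_cutoff  # Hz; lower means less jitter at rest
        self.beta = beta              # higher means less lag on fast moves
        self.d_cutoff = d_cutoff      # Hz; cutoff for the speed estimate
        self.x_prev = None
        self.dx_prev = 0.0
        self.t_prev = None

    @staticmethod
    def _alpha(cutoff, dt):
        # Smoothing factor of a first-order low-pass filter.
        tau = 1.0 / (2.0 * math.pi * cutoff)
        return 1.0 / (1.0 + tau / dt)

    def __call__(self, x, t):
        """Filter sample `x` taken at time `t` (seconds)."""
        if self.t_prev is None:
            self.x_prev, self.t_prev = x, t
            return x
        dt = t - self.t_prev
        if dt <= 0:
            return self.x_prev  # ignore out-of-order or duplicate timestamps
        # Smoothed speed estimate.
        dx = (x - self.x_prev) / dt
        a_d = self._alpha(self.d_cutoff, dt)
        dx_hat = a_d * dx + (1.0 - a_d) * self.dx_prev
        # Faster movement raises the cutoff, which reduces lag.
        cutoff = self.min_cutoff + self.beta * abs(dx_hat)
        a = self._alpha(cutoff, dt)
        x_hat = a * x + (1.0 - a) * self.x_prev
        self.x_prev, self.dx_prev, self.t_prev = x_hat, dx_hat, t
        return x_hat
```

In practice you would keep one filter per axis and feed it the landmark coordinate together with the frame timestamp.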
Use the q key in the camera preview window instead of Ctrl+C. The 'q' path runs cleanup; Ctrl+C bypasses it.
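If you want Ctrl+C to clean up the same way q does, one possible approach is to wrap the main loop in `try`/`finally`. This is a generic sketch, not the structure of mouse_control.py: `run`, `loop_step` and `cleanup` are hypothetical names standing in for the frame loop and whatever teardown the q path performs.

```python
def run(loop_step, cleanup):
    """Call loop_step() until it returns False, then always run cleanup().

    loop_step is a hypothetical per-frame function that returns False
    when q is pressed. Ctrl+C raises KeyboardInterrupt, which is caught
    so both exit paths reach the same cleanup.
    """
    try:
        while loop_step():
            pass
    except KeyboardInterrupt:
        pass  # fall through to cleanup instead of printing a traceback
    finally:
        cleanup()
```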
- MediaPipe Hands: 21-landmark hand tracking
- OpenCV: webcam capture and preview rendering
- ydotool: uinput-based input control (Wayland-compatible)
- xdotool: X11 display geometry
- WirePlumber: PipeWire volume control
- tkinter: app launcher GUI
gesture-mouse-control/
├── mouse_control.py       # Main program: gesture detection + cursor control
├── app_launcher_gui.py    # Tkinter launcher (opened by 👌 OK sign)
├── README.md              # This file
├── .gitignore
└── LICENSE
MIT. See LICENSE for details.
Built by Sheikh Saud, B.Tech Data Science student, embedded systems & maker hobbyist.