sssssaud/gesture-mouse-control

Gesture OS Control 🤚🖱️

Two-handed gesture mouse and system controller for Linux. Use your webcam to move the cursor, click, drag, scroll, adjust volume, switch workspaces, and launch apps, all without touching anything.

Right hand controls the mouse. Left hand controls system functions. Built on MediaPipe hand-tracking, with engineering tricks to make gestures stable, low-latency, and immune to camera distance.

Tested on Pop!_OS 22.04 with Wayland. Should work on any Linux distro with X11 or Wayland.


✨ Features

Right hand: mouse control

  • Move the cursor with an open hand
  • Pinch thumb + index → click & drag
  • Pinch thumb + middle → right click
  • Pinch thumb + ring → middle click (scroll button)
  • Bring index + middle close → scroll up/down
  • Closed fist → pause everything

Left hand: system control

  • 👌 OK sign → toggle app launcher
  • ✌️ Peace sign → unlock/lock volume slider
  • 🤙 Shaka sign → lock all hand controls (security)
  • Thumb + middle pinch (with index up) → Super+Tab (workspace switch)
  • Open hand (after unlock) → adjust volume by thumb-index distance

Engineering smarts that make it actually usable

  • 1-Euro filter: suppresses jitter at rest while adding little lag during fast moves
  • Hysteresis on every gesture: separate engage/release thresholds prevent flicker
  • Scale-invariant detection: works the same at any camera distance
  • Cursor freeze on pinch: prevents pull-off when clicking small UI targets
  • Edge-reachability scaling: camera area 15–85% maps to full screen, so corners are reachable
  • Anti-background-hand filter: ignores tiny/distant hands so passersby can't take over your mouse
  • Volume rollback safety: cancel a volume gesture and it reverts to where it was 0.5s before you started
  • Baton transfer: left hand becomes the mouse during launcher mode, then hands control back
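
The 1-Euro filter mentioned above can be sketched as follows. This is a minimal version of the published algorithm, not the code from mouse_control.py; the class name and the default values for min_cutoff, beta, and d_cutoff are assumptions chosen for illustration.

```python
import math


class OneEuroFilter:
    """Adaptive low-pass filter: heavy smoothing when slow, light when fast."""

    def __init__(self, min_cutoff=1.0, beta=0.01, d_cutoff=1.0):
        # All three defaults are illustrative assumptions, not project values.
        self.min_cutoff = min_cutoff  # Hz; lower means less jitter at rest
        self.beta = beta              # higher means less lag during fast moves
        self.d_cutoff = d_cutoff      # cutoff used for the speed estimate
        self.x_prev = None
        self.dx_prev = 0.0
        self.t_prev = None

    @staticmethod
    def _alpha(cutoff, dt):
        # Smoothing factor of a first-order low-pass filter.
        tau = 1.0 / (2.0 * math.pi * cutoff)
        return 1.0 / (1.0 + tau / dt)

    def __call__(self, x, t):
        """Filter sample x taken at time t (seconds)."""
        if self.x_prev is None:
            self.x_prev, self.t_prev = x, t
            return x
        dt = t - self.t_prev
        if dt <= 0:
            return self.x_prev
        # Estimate the signal speed and smooth it.
        dx = (x - self.x_prev) / dt
        a_d = self._alpha(self.d_cutoff, dt)
        dx_hat = a_d * dx + (1.0 - a_d) * self.dx_prev
        # Faster motion raises the cutoff, which reduces lag.
        cutoff = self.min_cutoff + self.beta * abs(dx_hat)
        a = self._alpha(cutoff, dt)
        x_hat = a * x + (1.0 - a) * self.x_prev
        self.x_prev, self.dx_prev, self.t_prev = x_hat, dx_hat, t
        return x_hat
```

One filter instance per axis (x and y) is the usual arrangement, fed with the landmark coordinate and the frame timestamp.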

📋 Requirements

  • OS: Linux (Wayland or X11)
  • Python: 3.10+
  • Hardware: any webcam, built-in or USB
  • System packages: ydotool, xdotool, wireplumber (wpctl), libnotify (notify-send), wl-clipboard

🛠️ Installation

1. Install system tools

# Ubuntu / Debian / Pop!_OS
sudo apt install ydotool xdotool wireplumber libnotify-bin wl-clipboard

# Fedora
sudo dnf install ydotool xdotool wireplumber libnotify wl-clipboard

# Arch
sudo pacman -S ydotool xdotool wireplumber libnotify wl-clipboard

2. Set up the ydotool daemon

ydotool needs a background daemon (ydotoold) to send input events:

systemctl --user enable --now ydotoold.service
systemctl --user status ydotoold

If you get permission errors, add your user to the input group, then log out and back in:

sudo usermod -aG input $USER

3. Install Python dependencies

pip install opencv-python mediapipe

4. Clone and run

git clone https://github.com/sssssaud/gesture-mouse-control.git
cd gesture-mouse-control
python3 mouse_control.py

Press q in the camera preview window to quit.


🤲 Gesture Reference

Right hand → mouse

| Gesture | Action | Notes |
|---|---|---|
| Open hand 👋 | Move cursor | Point with whole hand |
| Pinch thumb + index 🤏 | Click / Drag | Hold to drag, release to click |
| Pinch thumb + middle | Right click | Fires once per gesture |
| Pinch thumb + ring | Middle click | The scroll-wheel button |
| Index + middle close together | Scroll | Move hand up/down to scroll |
| Closed fist ✊ | Pause | Cursor freezes, no clicks fire |
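
The pinch gestures above depend on two ideas from the feature list: measuring fingertip distance relative to hand size, and using separate engage and release thresholds. The sketch below shows one way to combine them. It is an assumption about the approach, not the project's actual code: PINCH_RELEASE, MIN_HAND_SIZE, and all function and class names are hypothetical, and only the 0.4 engage value mirrors PINCH_THRESHOLD from the configuration. Points are (x, y) tuples in MediaPipe's normalized image coordinates.

```python
import math

PINCH_ENGAGE = 0.4    # mirrors PINCH_THRESHOLD; read here as a ratio of hand size
PINCH_RELEASE = 0.55  # hypothetical looser threshold for releasing the pinch
MIN_HAND_SIZE = 0.08  # hypothetical; smaller hands are treated as background


def pinch_ratio(thumb_tip, finger_tip, wrist, middle_base):
    """Fingertip distance divided by hand size, or None for a too-small hand.

    Hand size is taken as wrist to middle-finger base (MediaPipe landmarks
    0 and 9), so the ratio stays the same as the hand moves toward or away
    from the camera.
    """
    size = math.dist(wrist, middle_base)
    if size < MIN_HAND_SIZE:
        return None  # likely a distant hand in the background
    return math.dist(thumb_tip, finger_tip) / size


class PinchState:
    """Engage below one threshold, release only above a higher one."""

    def __init__(self):
        self.active = False

    def update(self, ratio):
        if ratio is None:
            self.active = False
        elif self.active and ratio > PINCH_RELEASE:
            self.active = False
        elif not self.active and ratio < PINCH_ENGAGE:
            self.active = True
        return self.active
```

Ratios that fall between the two thresholds leave the state unchanged, which is what stops a pinch from flickering on and off near the boundary.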

Left hand → system

| Gesture | Action | Notes |
|---|---|---|
| OK sign 👌 | Open/close app launcher | Switches "mouse" to left hand |
| Peace sign ✌️ | Unlock/lock volume | Hold ring + pinky down with thumb |
| Shaka 🤙 | Toggle hand-control lock | Pinky up only; disables all controls |
| Pinch thumb + middle (index up) | Super+Tab | Workspace switcher |
| Open hand (after Peace unlock) | Adjust volume | Thumb-index distance maps to 0–100% |
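
As an illustration of the volume gesture, the sketch below maps a thumb-index distance onto 0–100% and builds the matching wpctl command. The calibration constants DIST_MIN and DIST_MAX and the function names are hypothetical; the project may use different values and a different mapping.

```python
import subprocess

# Hypothetical calibration: thumb-index distance (as a fraction of hand size)
# that corresponds to 0% and 100% volume.
DIST_MIN = 0.2
DIST_MAX = 1.5


def distance_to_volume(ratio):
    """Map a thumb-index distance ratio linearly onto 0.0-1.0, clamped."""
    t = (ratio - DIST_MIN) / (DIST_MAX - DIST_MIN)
    return max(0.0, min(1.0, t))


def volume_command(level):
    """Build the wpctl command that sets the default sink to level (0.0-1.0)."""
    return ["wpctl", "set-volume", "@DEFAULT_AUDIO_SINK@", f"{level:.2f}"]


def set_volume(ratio):
    """Apply the mapped volume; needs WirePlumber's wpctl on PATH."""
    subprocess.run(volume_command(distance_to_volume(ratio)), check=False)
```

Keeping the command builder separate from the subprocess call makes the mapping easy to check without changing the system volume.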

⚙️ Configuration

All tunables are at the top of mouse_control.py:

FRAME_WIDTH = 480              # webcam capture width
FRAME_HEIGHT = 360             # webcam capture height
PINCH_THRESHOLD = 0.4          # how close fingers must be to register a pinch
SCROLL_THRESHOLD = 0.3         # closeness for scroll gesture
PAUSE_THRESHOLD = 1.0          # closeness for pause gesture

# Camera mapping bounds: crops the camera area so screen edges are reachable
CAM_X_MIN = 0.15
CAM_X_MAX = 0.85
CAM_Y_MIN = 0.15
CAM_Y_MAX = 0.85
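
To show what the camera mapping bounds do, here is a minimal sketch of the rescaling, assuming normalized camera coordinates in the range 0–1. The helper names are hypothetical and the real mapping in mouse_control.py may differ (for example in mirroring or rounding).

```python
CAM_X_MIN, CAM_X_MAX = 0.15, 0.85
CAM_Y_MIN, CAM_Y_MAX = 0.15, 0.85


def _rescale(value, lo, hi):
    """Stretch [lo, hi] onto [0, 1] and clamp anything outside."""
    return max(0.0, min(1.0, (value - lo) / (hi - lo)))


def camera_to_screen(nx, ny, screen_w, screen_h):
    """Convert normalized camera coordinates (0-1) to screen pixels."""
    sx = _rescale(nx, CAM_X_MIN, CAM_X_MAX) * (screen_w - 1)
    sy = _rescale(ny, CAM_Y_MIN, CAM_Y_MAX) * (screen_h - 1)
    return round(sx), round(sy)
```

With these bounds, a hand at 15% of the frame width already reaches the left screen edge, so the tracker never has to follow the hand to the very border of the image.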

App launcher path

By default, mouse_control.py looks for app_launcher_gui.py in the same folder as itself. If you keep them apart, edit this near the top of mouse_control.py:

APP_LAUNCHER_PATH = "/full/path/to/app_launcher_gui.py"

Customizing launcher buttons

The default launcher has three buttons: Chrome, Spotify, and AWS Kiro CLI. To change them, edit app_launcher_gui.py:

btn1 = tk.Button(f, text="🌐 Chrome",
                 command=lambda: launch_app("google-chrome"),
                 **btn_style)

btn2 = tk.Button(f, text="🎵 Spotify",
                 command=lambda: launch_app("spotify"),
                 **btn_style)

btn3 = tk.Button(f, text="🤖 AI Terminal",
                 command=lambda: launch_app("gnome-terminal -- kiro-cli"),
                 **btn_style)

Replace the shell command inside each launch_app("...") with whatever app you want to open. You can also add or remove buttons.

About the AWS Kiro CLI button

The third button (🤖 AI Terminal) opens AWS Kiro CLI in a new terminal. If you don't have Kiro installed, the button does nothing. You have two options:

  1. Install Kiro CLI: see the AWS Kiro documentation for instructions
  2. Replace the button: edit app_launcher_gui.py and change kiro-cli to whatever AI tool or terminal command you prefer
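
If you want to know in advance whether a launcher button can work, a small PATH check helps. The sketch below is a standalone helper, not part of app_launcher_gui.py; both function names are hypothetical.

```python
import shutil


def missing_programs(*names):
    """Return the programs from names that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


def describe_launcher(label, *programs):
    """Summarize whether every program a button needs is installed."""
    missing = missing_programs(*programs)
    if missing:
        return f"{label}: unavailable, missing {', '.join(missing)}"
    return f"{label}: ready"
```

For the third button you could call describe_launcher("AI Terminal", "gnome-terminal", "kiro-cli") and print the result before starting the launcher.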

⚠️ Known Issues

  • Auto-recenter is unreliable. When the right hand re-enters the frame after being absent for 0.5s+, the program tries to snap the cursor to screen center. On some Wayland/ydotool combinations the cursor lands at the bottom-right corner instead. The rest of the gesture system is unaffected. If this annoys you, comment out the snap_to_center(sw, sh) call inside the main loop.
  • Linux only. Depends on ydotool, wpctl, and wl-paste. macOS and Windows are not supported.
  • First-frame jitter. MediaPipe needs ~0.4s to stabilize when a hand enters the frame. The program already debounces the left hand for this reason.
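
The debounce mentioned in the last item can be pictured as a short settling timer. The sketch below is an assumption about how such a debounce might look, with a hypothetical class name and the 0.4s figure taken from the text above; it is not copied from mouse_control.py.

```python
class PresenceDebounce:
    """Treat a hand as usable only after it has been visible for settle seconds."""

    def __init__(self, settle=0.4):
        self.settle = settle
        self.first_seen = None  # time the hand most recently entered the frame

    def update(self, visible, now):
        """Feed one frame; returns True once tracking has had time to stabilize."""
        if not visible:
            self.first_seen = None
            return False
        if self.first_seen is None:
            self.first_seen = now
        return (now - self.first_seen) >= self.settle
```

Gestures are ignored while update() returns False, so the unstable first frames after a hand appears cannot trigger an action.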

🩹 Troubleshooting

ydotool fails with permission denied

Make sure the daemon is running and your user is in the input group:

systemctl --user status ydotoold
groups | grep input

Camera not opening

The program tries camera index 2 (external webcam) first, then falls back to 0 (built-in). If you have a different setup, edit lines 326–328 in mouse_control.py.
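
The fallback order described above could be written as a small helper like the one below. This is a sketch under assumptions, not the code at the referenced lines: open_first_camera and its opener parameter are hypothetical, while cv2.VideoCapture, isOpened(), and release() are the standard OpenCV calls.

```python
def open_first_camera(indices=(2, 0), opener=None):
    """Return (index, capture) for the first camera that opens, else (None, None)."""
    if opener is None:
        import cv2  # imported lazily so the helper can be tested without OpenCV
        opener = cv2.VideoCapture
    for index in indices:
        cap = opener(index)
        if cap.isOpened():
            return index, cap
        cap.release()  # free the failed handle before trying the next index
    return None, None
```

To try a different order, pass your own tuple, for example open_first_camera((1, 0)).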

Cosmetic warnings on startup

You may see these warnings; they are all harmless:

  • AttributeError: 'MessageFactory' object has no attribute 'GetPrototype' → MediaPipe / protobuf version mismatch. Doesn't affect functionality.
  • qt.qpa.plugin: Could not find the Qt platform plugin "wayland" → OpenCV's Qt rendering on Wayland. Camera preview still works.
  • QFontDatabase: Cannot find font directory → Cosmetic. Silence with sudo apt install fonts-dejavu.

Cursor moves erratically

  • Lighting matters: MediaPipe needs to see your hand clearly
  • Try adjusting PINCH_THRESHOLD and the other values in the configuration block at the top of mouse_control.py
  • A low webcam frame rate makes the 1-Euro filter less effective; try a different camera if available

Quitting with Ctrl+C leaves modifier keys held

Use the q key in the camera preview window instead of Ctrl+C. The 'q' path runs cleanup; Ctrl+C bypasses it.
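
If you would rather have Ctrl+C run the same cleanup, one common pattern is to wrap the main loop in try/finally. The sketch below shows the pattern only; run_with_cleanup, main_loop, and cleanup are hypothetical names, and what the cleanup does (such as releasing held modifier keys) depends on the program.

```python
def run_with_cleanup(main_loop, cleanup):
    """Run main_loop and always call cleanup, even when Ctrl+C interrupts it."""
    try:
        main_loop()
    except KeyboardInterrupt:
        # Ctrl+C lands here; execution then continues into the finally block.
        pass
    finally:
        cleanup()
```

The finally block also runs when main_loop returns normally, so the q path and the Ctrl+C path end up sharing one cleanup routine.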


🧰 Tech Stack

  • MediaPipe Hands: 21-landmark hand tracking
  • OpenCV: webcam capture and preview rendering
  • ydotool: uinput-based input control (Wayland-compatible)
  • xdotool: X11 display geometry
  • WirePlumber: PipeWire volume control
  • tkinter: app launcher GUI

📁 Project Structure

gesture-mouse-control/
├── mouse_control.py       # Main program: gesture detection + cursor control
├── app_launcher_gui.py    # Tkinter launcher (opened by 👌 OK sign)
├── README.md              # This file
├── .gitignore
└── LICENSE

📜 License

MIT. See LICENSE for details.


Built by Sheikh Saud (B.Tech Data Science student, embedded systems & maker hobbyist).
