quanru (Collaborator) commented Jan 7, 2026

Summary

  • Add new @midscene/computer package for AI-driven desktop automation on Windows/macOS/Linux
  • Implement mouse operations (tap, double-click, right-click, hover, drag-and-drop)
  • Implement keyboard operations (key press, text input, keyboard shortcuts)
  • Add multi-display support with display selection capability
  • Include MCP server for AI assistant integration

Implementation Details

  • Uses @computer-use/libnut for native mouse/keyboard control with lazy loading
  • Uses screenshot-desktop for cross-platform screen capture
  • Supports multi-display setups with displayId configuration
  • Platform-specific keyboard shortcuts handling (macOS vs Windows/Linux)
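
The platform-specific shortcut handling could be sketched roughly as below. This is a minimal illustration and not the code in this PR: `resolveShortcut`, `SHORTCUTS`, and the shortcut names are hypothetical, and the key combinations are the ones listed in the Test Plan (Cmd+Space / Windows key, Cmd+Tab / Alt+Tab).

```typescript
// Hypothetical helper for illustration; none of these names come from the PR.
type Platform = "darwin" | "win32" | "linux";
type ShortcutName = "openLauncher" | "switchApp";

// Key combinations taken from the Test Plan:
// Cmd+Space vs. Windows key, Cmd+Tab vs. Alt+Tab.
const SHORTCUTS: Record<ShortcutName, { mac: string[]; other: string[] }> = {
  openLauncher: { mac: ["command", "space"], other: ["win"] },
  switchApp: { mac: ["command", "tab"], other: ["alt", "tab"] },
};

// Pick the macOS variant on darwin and the Windows/Linux variant elsewhere.
function resolveShortcut(name: ShortcutName, platform: Platform): string[] {
  const entry = SHORTCUTS[name];
  return platform === "darwin" ? entry.mac : entry.other;
}
```

In a Node.js process the platform argument would typically come from `process.platform`. Keeping the table separate from the lookup makes it easy to add further shortcuts without touching the branching logic.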

Test Plan

  • Unit tests for device and agent creation
  • AI tests for basic desktop interactions (mouse movement)
  • AI tests for keyboard shortcuts (Cmd+Space/Windows key, Cmd+Tab/Alt+Tab)
  • AI tests for multi-display operations
  • AI tests for web browser automation

netlify bot commented Jan 7, 2026

Deploy Preview for midscene ready!

Latest commit: 203cd6c
Latest deploy log: https://app.netlify.com/projects/midscene/deploys/6969f5941735df00086ffe5c
Deploy Preview: https://deploy-preview-1734--midscene.netlify.app

yuyutaotao (Collaborator) commented

  1. The action "hover" is incorrectly named on the PC side; it should be mouseMove.
  2. Is @computer-use/libnut open source? If not, we have to use the community version.

quanru marked this pull request as draft on January 8, 2026 03:25
quanru force-pushed the feat/computer-package branch from 4dcece2 to 7ed1fed on January 9, 2026 07:31
quanru requested a review from Copilot on January 12, 2026 03:05
Copilot AI (Contributor) left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 31 out of 33 changed files in this pull request and generated 2 comments.

Files not reviewed (1)
  • pnpm-lock.yaml: Language not supported

quanru force-pushed the feat/computer-package branch 2 times, most recently from d1275f1 to 1540b3b, on January 12, 2026 11:20
quanru marked this pull request as ready for review on January 13, 2026 03:09
quanru force-pushed the feat/computer-package branch 4 times, most recently from 9c690bd to 26750f3, on January 15, 2026 12:38
quanru added 12 commits January 16, 2026 16:23
This package provides AI-driven automation for desktop computers (Windows/macOS/Linux) using:
- @computer-use/libnut for mouse/keyboard control
- screenshot-desktop for screen capture
- Multi-display support with displayId selection

Features:
- Mouse operations: tap, double-click, right-click, hover, drag-and-drop
- Keyboard operations: key press, text input, shortcuts
- Screen operations: screenshot, scroll, multi-display listing
- MCP server for AI assistant integration

Add normalizeKeyName function to map common key names to libnut-compatible names:
- Windows/win -> win
- Cmd -> command
- Ctrl -> control
- Esc -> escape
- Arrow keys -> up/down/left/right

This fixes the "Invalid key code specified" error when pressing the Windows key on Windows.
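
The mapping described in this commit message could look something like the sketch below. It is an illustration only and the PR's actual implementation may differ: the case-insensitive matching, the `ArrowUp`-style spellings for arrow keys, and the pass-through of unknown keys in lowercase are all assumptions.

```typescript
// Illustrative sketch; KEY_ALIASES and its exact entries are assumptions
// based on the mapping listed in the commit message.
const KEY_ALIASES = new Map<string, string>([
  ["windows", "win"],
  ["win", "win"],
  ["cmd", "command"],
  ["ctrl", "control"],
  ["esc", "escape"],
  // Assumed spellings for the "Arrow keys" entry.
  ["arrowup", "up"],
  ["arrowdown", "down"],
  ["arrowleft", "left"],
  ["arrowright", "right"],
]);

// Map a common key name to its libnut-style name.
// Unknown names are returned trimmed and lowercased (an assumption).
function normalizeKeyName(key: string): string {
  const lower = key.trim().toLowerCase();
  return KEY_ALIASES.get(lower) ?? lower;
}
```

A `Map` is used rather than a plain object so that names such as `constructor` cannot accidentally resolve to inherited properties.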
…consistency across documentation and codebase
- Rename scrollToEventName to scrollType for consistency with parameter name
- Add delay after backspace in ClearInput action to ensure operation completes
quanru force-pushed the feat/computer-package branch from 26750f3 to 203cd6c on January 16, 2026 08:23
3 participants