Stagehand - Telegram Image Queue Bot#
A Telegram bot that takes links from supported websites, extracts images and videos, and queues them for posting to a Telegram channel at scheduled intervals.
Features#
- Extract images and videos from various supported websites
- Process forwarded messages containing links (including button links)
- Queue media for scheduled posting
- Customizable posting schedule using cron syntax
- Access control to limit who can use the bot
- Post media with source attribution and link back to original
- Modular design for easy addition of new website scrapers
- Interactive visual queue management with inline buttons
- Intelligent queue monitoring with automatic alerts
- Perceptual image hashing for duplicate detection
- Automatic media cache management with recaching support
- Scheduled announcements with individual cron schedules
- Auto-updater for seamless updates from Git repository
- Discord webhook integration (optional)
Supported Websites#
- e621 - Uses OpenGraph scraping and Cheerio DOM parsing to extract images and videos
- FurAffinity - Leverages the FA Export API to fetch submission data and direct download links
- Bluesky - Uses ATProto API Library with BskyAgent for full support of posts, images, and videos
- SoFurry - Utilizes SoFurry's own APIs to fetch and download submissions
- Weasyl - Implements Weasyl's API; you'll need to supply your own API key from Weasyl
Methodology for Supported Sites#
Bluesky#
- Uses the official @atproto/api library with BskyAgent
- Parses URLs in the format bsky.app/profile/{handle}/post/{id}
- Extracts user DIDs and content identifiers
- Supports both image and video content extraction
- Handles thumbnails for video posts
- Works with public posts without requiring authentication
- Processes quoted content and multiple images in a single post
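The URL parsing step can be sketched as follows. This is a minimal illustration, not the bot's actual code: the function names parseBlueskyPostUrl and toAtUri are hypothetical, and resolving the handle to a DID (done through the ATProto library in the bot) is left out.

```javascript
// Hypothetical helper: split a bsky.app post URL into its handle and post ID.
function parseBlueskyPostUrl(input) {
  let url;
  try {
    url = new URL(input);
  } catch {
    return null; // not a valid absolute URL
  }
  if (url.hostname !== "bsky.app") return null;

  // Expected path: /profile/{handle}/post/{id}
  const match = url.pathname.match(/^\/profile\/([^/]+)\/post\/([^/]+)\/?$/);
  if (!match) return null;

  return { handle: match[1], postId: match[2] };
}

// Build the AT URI for a post once the handle has been resolved to a DID.
function toAtUri(did, postId) {
  return `at://${did}/app.bsky.feed.post/${postId}`;
}
```

A URL that does not match the expected shape yields null, so the caller can treat it as an unsupported link.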
e621#
- Uses Cheerio to parse the HTML DOM of e621 pages
- Extracts media URLs from OpenGraph tags and direct DOM elements
- Handles both image and video content
- Processes and caches media files locally
- Supports fallback methods if primary extraction fails
- Ensures proper URL resolution for relative paths
FurAffinity#
- Extracts submission IDs from URLs in the format furaffinity.net/view/{id}
- Uses the FAExport API to fetch submission data
- Extracts direct download URLs, titles, and artist information
- Handles both image and video content
- Preserves proper attribution and metadata
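Extracting the submission ID from a URL might look like the sketch below. The helper name extractFaSubmissionId is hypothetical, and the sketch assumes IDs are purely numeric; the FAExport request itself is not shown.

```javascript
// Hypothetical helper: pull the numeric submission ID out of a FurAffinity URL.
function extractFaSubmissionId(input) {
  const match = String(input).match(/furaffinity\.net\/view\/(\d+)/i);
  return match ? match[1] : null; // null when the URL is not a submission link
}
```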
SoFurry#
- Uses direct access to the SoFurry API with OAuth authentication
- Extracts submission IDs from URLs
- Requires manual setup:
  - Create an application on the SoFurry Developer Portal
  - Generate an OAuth access token manually
  - Add the access token to your .env file as SOFURRY_ACCESS_TOKEN
- Fetches submission data, including display URLs and metadata
- Supports both image and video content
- Handles proper attribution with author and title information
Weasyl (Currently Broken)#
- Implements Weasyl's API for submission data
- Requires an API key from Weasyl
- Add your API key to .env as WEASYL_API_KEY
- Extracts submission information and media URLs
Media Caching and Transcoding#
Stagehand uses a sophisticated media caching and transcoding system to efficiently handle images and videos from various sources:
Media Caching#
- All downloaded media is cached locally to reduce bandwidth usage and improve performance
- Files are stored in organized directories:
  - cache/images/ - For static images
  - cache/videos/ - For original video files
  - cache/transcoded/ - For processed/transcoded videos
- Filenames are generated using MD5 hashes of source URLs to ensure uniqueness
- Cache is automatically cleaned up, with files older than 15 days (configurable) being removed
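As a rough sketch of the naming and expiry rules above: the function names cachePathFor and isExpired are assumptions made for illustration, and the real bot may organize this differently.

```javascript
const crypto = require("crypto");
const path = require("path");

// Directory layout taken from the list above.
const CACHE_DIRS = { image: "cache/images", video: "cache/videos" };

// Hypothetical helper: derive a cache path from the MD5 hash of the source URL.
function cachePathFor(sourceUrl, kind, extension) {
  const hash = crypto.createHash("md5").update(sourceUrl).digest("hex");
  return path.join(CACHE_DIRS[kind], `${hash}${extension}`);
}

// A file counts as expired once it is older than maxAgeDays (default 15).
function isExpired(mtimeMs, nowMs, maxAgeDays = 15) {
  return nowMs - mtimeMs > maxAgeDays * 24 * 60 * 60 * 1000;
}
```

Because the filename depends only on the URL, requesting the same source twice maps to the same cache entry.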
Media Processing#
- Content type detection based on HTTP headers and URL patterns
- Intelligent fallback mechanisms if metadata is unavailable
- Special handling for different media sources (e.g., Bluesky API)
- File extension determination from both URL and content type
- Maximum download size limit of 50MB to prevent abuse
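The extension and size checks could be sketched like this. The MIME table, the preference for the Content-Type header over the URL, and the reading of 50MB as 50 * 1024 * 1024 bytes are all assumptions made for the example.

```javascript
const path = require("path");

// Assumption: "50MB" is read as 50 * 1024 * 1024 bytes.
const MAX_DOWNLOAD_BYTES = 50 * 1024 * 1024;

// Illustrative mapping only; the bot's real table may differ.
const EXTENSION_BY_TYPE = new Map([
  ["image/jpeg", ".jpg"],
  ["image/png", ".png"],
  ["image/gif", ".gif"],
  ["image/webp", ".webp"],
  ["video/mp4", ".mp4"],
  ["video/webm", ".webm"],
]);
const KNOWN_EXTENSIONS = new Set([...EXTENSION_BY_TYPE.values(), ".jpeg"]);

// Hypothetical helper: pick a file extension from the header, then from the URL.
function resolveExtension(sourceUrl, contentType) {
  // Drop parameters such as "; charset=utf-8" before the lookup.
  const type = String(contentType || "").split(";")[0].trim().toLowerCase();
  if (EXTENSION_BY_TYPE.has(type)) return EXTENSION_BY_TYPE.get(type);

  try {
    const ext = path.extname(new URL(sourceUrl).pathname).toLowerCase();
    if (KNOWN_EXTENSIONS.has(ext)) return ext === ".jpeg" ? ".jpg" : ext;
  } catch {
    // Unparseable URL: fall through and report the type as unknown.
  }
  return null;
}

// Reject anything whose size is unknown, negative, or above the cap.
function isWithinSizeLimit(byteLength) {
  return Number.isFinite(byteLength) && byteLength >= 0 && byteLength <= MAX_DOWNLOAD_BYTES;
}
```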
Video Transcoding#
- Videos are automatically transcoded to H.264 MP4 format for maximum compatibility with Telegram
- Uses FFmpeg (via fluent-ffmpeg) with optimized settings:
  - H.264 video codec for wide compatibility
  - AAC audio codec at 128kbps
  - Medium preset balancing quality and processing speed
  - CRF 23 for good quality-to-size ratio
  - MP4 container with faststart flag for immediate playback
  - YUV420p pixel format for maximum device compatibility
- Animated GIFs are handled appropriately based on content type
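For reference, the settings listed above correspond to the following FFmpeg command-line arguments. The bot itself goes through fluent-ffmpeg; this sketch builds the plain argument list instead, and buildTranscodeArgs is a hypothetical name.

```javascript
// Hypothetical helper: FFmpeg CLI arguments matching the settings listed above.
function buildTranscodeArgs(inputPath, outputPath) {
  return [
    "-y",                      // overwrite the output file if it already exists
    "-i", inputPath,
    "-c:v", "libx264",         // H.264 video codec
    "-preset", "medium",       // balance between speed and compression
    "-crf", "23",              // constant rate factor
    "-pix_fmt", "yuv420p",     // widely supported pixel format
    "-c:a", "aac",             // AAC audio codec
    "-b:a", "128k",            // 128kbps audio bitrate
    "-movflags", "+faststart", // move the index to the front for immediate playback
    outputPath,
  ];
}
```

The resulting array can be passed to child_process.spawn("ffmpeg", args) on a system where FFmpeg is installed.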
This system ensures that all media is properly optimized before being sent to Telegram, providing reliable playback across all devices while managing bandwidth and storage efficiently.
Bot Architecture#
Stagehand uses a modular bot architecture for better maintainability and extensibility:
Modular Design#
- Commands: Each bot command (/start, /help, /queue, etc.) is implemented as a separate module
- Helpers: Shared functionality (auth, media posting, queue management) is organized into helper classes
- Registry System: A central command registry coordinates all modules and their dependencies
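To make the registry idea concrete, here is a minimal sketch of how commands and helpers could be wired together. The class and method names are invented for illustration; see docs/bot-architecture.md for the actual design.

```javascript
// Hypothetical registry: maps command names to modules and hands them shared helpers.
class CommandRegistry {
  constructor(helpers) {
    this.helpers = helpers;    // e.g. auth, media posting, queue management
    this.commands = new Map();
  }

  register(command) {
    if (this.commands.has(command.name)) {
      throw new Error(`Duplicate command: ${command.name}`);
    }
    this.commands.set(command.name, command);
  }

  dispatch(name, ctx) {
    const command = this.commands.get(name);
    if (!command) throw new Error(`Unknown command: ${name}`);
    // The handler may be async, in which case a promise is returned.
    return command.execute(ctx, this.helpers);
  }
}

// Each command lives in its own module and only sees the helpers it is given.
const queueCommand = {
  name: "queue",
  execute: (ctx, helpers) => `Queue has ${helpers.queue.size()} items`,
};
```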
Migration Status#
- ✅ Current Version: Modular architecture (active)
- 📁 Legacy Backup: Original monolithic version saved as telegramBot.js.backup
- 🔄 Migration Script: Use ./migrate-bot.sh to switch between versions if needed
Benefits#
- Easier Maintenance: Changes to one command don't affect others
- Better Testing: Individual modules can be tested in isolation
- Extensibility: New commands and features can be added easily
- Code Organization: Clear separation of concerns
For detailed architecture documentation, see docs/bot-architecture.md.
Installation#
Automated Installation (Recommended)#
Linux (Ubuntu, Fedora, Arch)#
Run the automated installation script:
./install.sh
This script will:
- Detect your Linux distribution and install required dependencies
- Set up your .env configuration interactively
- Install Node.js dependencies
- Create necessary cache directories
- Optionally set up a systemd service for automatic startup
Windows#
Run the automated installation batch file:
install.bat
This script will:
- Check for required dependencies (Node.js, Git, FFmpeg)
- Install missing dependencies using Chocolatey or winget if available
- Guide you through an interactive configuration process
- Install Node.js dependencies
- Create necessary cache directories
- Optionally set up a Windows service using PM2
Manual Installation#
If you prefer to install manually:
- Clone this repository
- Install dependencies: npm install
- Copy .env.example to .env and fill in your credentials: cp .env.example .env
- Edit the .env file with your Telegram bot token and channel ID
- Ensure PM2 is installed globally: npm install -g pm2
Installation Script Details#
The automated installation scripts provide several advantages:
- Dependency Management
  - Automatically installs Node.js, Git, FFmpeg, and PM2 if missing
  - Uses native package managers (apt, dnf, pacman) on Linux
  - Utilizes Chocolatey or winget on Windows if available
- Interactive Configuration
  - Guides you through setting up all required environment variables
  - Provides sensible defaults for optional settings
  - Validates that required fields like bot token are entered
- System Integration
  - Configures systemd service on Linux
  - Sets up Windows service with PM2
  - Creates proper directory structure for media caching
Environment Variables#
The bot is configured via environment variables in a .env file. The installation scripts will help you create this file interactively, but here's a reference of available settings:
Required Settings#
- BOT_TOKEN: Your Telegram bot token from BotFather
- CHANNEL_ID: Your Telegram channel ID or username (e.g., @mychannel)
User Access Control#
- AUTHORIZED_USERS: Comma-separated list of Telegram user IDs that are allowed to use the bot
- OWNER_ID: (Optional) User ID of the bot owner for admin-level commands
Integration Options#
- WEASYL_API_KEY: (Optional) API key for Weasyl integration - Get yours from Weasyl
- SOFURRY_ACCESS_TOKEN: (Optional) OAuth access token for SoFurry API - Create an app and generate a token at the SoFurry Developer Portal
- DISCORD_WEBHOOK_URL: (Optional) For Discord integration
- DISCORD_ENABLED: Set to 'true' to enable Discord posting
Queue Configuration#
- DEFAULT_CRON_SCHEDULE: When to post images (cron format, default: '0 */1 * * *' - every hour)
- IMAGES_PER_INTERVAL: Number of images to post each time (default: 1)
Queue Monitoring#
- QUEUE_LOW_THRESHOLD: Alert when queue has this many items or fewer (default: 10)
- QUEUE_EMPTY_THRESHOLD: Alert when queue hits this level (default: 0)
- QUEUE_ALERTS_ENABLED: Enable/disable queue monitoring (default: true)
- QUEUE_ALERT_COOLDOWN_HOURS: Hours between repeated alerts (default: 24)
Media Cache#
- MAX_CACHE_AGE_DAYS: Days to keep cached media files (default: 15)
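The settings and defaults above could be read into a config object along these lines. This is a sketch only: loadConfig and the property names on the returned object are hypothetical, and the defaults are copied from the reference above.

```javascript
// Hypothetical loader: validate required settings and apply the documented defaults.
function loadConfig(env) {
  const missing = ["BOT_TOKEN", "CHANNEL_ID"].filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing required settings: ${missing.join(", ")}`);
  }

  // Parse an integer setting, falling back when it is unset or not a number.
  const int = (key, fallback) => {
    const value = Number.parseInt(env[key] ?? "", 10);
    return Number.isNaN(value) ? fallback : value;
  };

  return {
    botToken: env.BOT_TOKEN,
    channelId: env.CHANNEL_ID,
    authorizedUsers: (env.AUTHORIZED_USERS ?? "")
      .split(",")
      .map((id) => id.trim())
      .filter(Boolean),
    cronSchedule: env.DEFAULT_CRON_SCHEDULE ?? "0 */1 * * *",
    imagesPerInterval: int("IMAGES_PER_INTERVAL", 1),
    queueLowThreshold: int("QUEUE_LOW_THRESHOLD", 10),
    queueEmptyThreshold: int("QUEUE_EMPTY_THRESHOLD", 0),
    queueAlertsEnabled: (env.QUEUE_ALERTS_ENABLED ?? "true") === "true",
    queueAlertCooldownHours: int("QUEUE_ALERT_COOLDOWN_HOURS", 24),
    maxCacheAgeDays: int("MAX_CACHE_AGE_DAYS", 15),
    discordEnabled: env.DISCORD_ENABLED === "true",
  };
}
```

In a running process this would typically be called as loadConfig(process.env) after the .env file has been loaded.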
Running the Bot#
This bot uses PM2 by default to ensure it runs persistently and automatically restarts after crashes or system reboots.
Starting the bot#
npm start
Stopping the bot#
npm run stop
Restarting the bot#
npm run restart
Viewing logs#
npm run logs
Checking status#
npm run status
Setting up automatic startup on system boot#
pm2 startup
Then follow the instructions provided by the command.
Saving the current PM2 process list#
After starting your bot, run:
pm2 save
This ensures your bot restarts automatically if the system reboots.
Development mode#
For development with auto-reload on file changes:
npm run dev
Commands#
- /start - Start the bot
- /help - Show help information
- /queue - Show current queue status with interactive management
- /status - Show detailed queue status and alert information
- /send - Post the next image in the queue immediately
- /schedule [cron] - Set posting schedule using cron syntax
- /setcount [number] - Set number of images per post interval
- /shuffle - Toggle shuffle mode (randomizes queue after each post, persists between restarts)
- /clear - Clear the entire queue
- /cleancache - Clean expired items from media cache
- /recache - Recache missing files and remove items that fail after 3 attempts
- /announce - Create a new announcement (with custom text and schedule)
- /announcements - Manage existing announcements (view, edit, delete)
- /update - Check for updates, stash changes, pull latest code, and restart bot (owner only)
Adding Images to Queue#
Direct Links#
Send a link from any supported website to the bot in a direct message. The bot will extract the image and add it to the queue.
Multiple Links in Messages#
The bot can process any message containing multiple links to supported websites. This includes:
- Multiple URLs in Text: Messages containing several URLs will have all valid links processed
- Mixed Content: Messages with both text and inline keyboard buttons containing URLs
- Embedded URLs: Links that appear anywhere within message text (not just at the beginning)
- Button Links: URLs embedded in inline keyboard buttons are automatically detected and processed
- Error Handling: Invalid links are silently discarded - the bot will only show an error if zero valid links are found
Forwarded Messages#
The bot can also process forwarded messages that contain links to supported websites with the same capabilities as regular messages.
Usage: Send any message containing supported website links to the bot. The bot will automatically detect and process all valid URLs while silently discarding invalid ones.
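The link collection described above can be sketched as follows. The message shape follows the Telegram Bot API (text, caption, reply_markup.inline_keyboard), while the function names and the SUPPORTED_HOSTS list are assumptions made for the example.

```javascript
// Assumed host list for illustration; the bot's scrapers define the real one.
const SUPPORTED_HOSTS = ["e621.net", "furaffinity.net", "bsky.app", "sofurry.com", "weasyl.com"];

// Gather every URL from the message text and from inline keyboard buttons.
function collectCandidateUrls(message) {
  const urls = [];
  const text = message.text || message.caption || "";
  for (const match of text.matchAll(/https?:\/\/[^\s<>"']+/g)) {
    urls.push(match[0]);
  }
  // Inline keyboard buttons carry their link in the `url` field.
  const rows = message.reply_markup?.inline_keyboard ?? [];
  for (const row of rows) {
    for (const button of row) {
      if (button.url) urls.push(button.url);
    }
  }
  return [...new Set(urls)]; // drop exact duplicates, keep first-seen order
}

function isSupported(candidate) {
  try {
    const host = new URL(candidate).hostname.toLowerCase();
    return SUPPORTED_HOSTS.some((domain) => host === domain || host.endsWith(`.${domain}`));
  } catch {
    return false; // invalid links are silently discarded
  }
}

// An empty result is the "zero valid links" case in which the bot reports an error.
function extractSupportedLinks(message) {
  return collectCandidateUrls(message).filter(isSupported);
}
```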
Interactive Queue Management#
The /queue command displays an interactive visual interface for managing queued items:
- Page Navigation: Browse through the queue using Previous/Next buttons
- Item Preview: View a preview of any queued item before it's posted
- Item Removal: Remove specific items from the queue with a single click
- Reordering: Move any item to the top of the queue to be posted next
- Pagination: Easily navigate through pages of queued items
- Shuffle Mode: Automatically randomize the queue after each post with the /shuffle command (setting persists between restarts)
The interface shows important information about each queued item including:
- Item position in queue
- Content type (image or video)
- Title and source website
- Controls for managing each item
Queue Monitoring and Alerts#
Stagehand includes an intelligent queue monitoring system that automatically alerts authorized users when the queue needs attention:
Features#
- Real-time monitoring: Checks queue levels every 30 seconds
- Smart notifications: Configurable thresholds for low and empty queue alerts
- 24-hour cooldown: Prevents alert spam with time-based rate limiting
- Multi-user support: All authorized users receive alerts simultaneously
- Admin controls: Test and manage alerts via the /status command
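The threshold and cooldown behavior can be illustrated with a small decision function. This is a sketch under assumptions: the function name, the config property names, and the use of one shared cooldown for both alert levels are not taken from the bot's code.

```javascript
// Hypothetical check, intended to run on each monitoring tick.
// Returns "empty", "low", or null when no alert should be sent.
function evaluateQueueAlert(queueLength, config, lastAlertAtMs, nowMs) {
  if (!config.enabled) return null;

  let level = null;
  if (queueLength <= config.emptyThreshold) level = "empty";
  else if (queueLength <= config.lowThreshold) level = "low";
  if (level === null) return null;

  // Suppress repeats until the cooldown window has passed.
  const cooldownMs = config.cooldownHours * 60 * 60 * 1000;
  if (lastAlertAtMs !== null && nowMs - lastAlertAtMs < cooldownMs) return null;

  return level;
}
```

With the default settings, a queue of 5 items triggers a "low" alert once, and further checks stay silent for the next 24 hours.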
Configuration#
Add these variables to your .env file to customize alert behavior:
# Queue Alert Configuration
QUEUE_LOW_THRESHOLD=10 # Alert when queue ≤ this number
QUEUE_EMPTY_THRESHOLD=0 # Alert when queue is critically low/empty
QUEUE_ALERTS_ENABLED=true # Enable/disable monitoring
QUEUE_ALERT_COOLDOWN_HOURS=24 # Hours between repeated alerts
Commands#
- /status - View detailed queue status and alert configuration
- Use admin controls in /status to test alerts and reset cooldowns
For detailed configuration options, see Queue Monitoring Configuration.
Documentation#
This project includes comprehensive documentation in the docs/ directory:
- Bot Architecture - Detailed architecture documentation for the modular bot system
- Bot Architecture Migration - Migration guide from monolithic to modular architecture
- Queue Monitoring Implementation - Complete implementation details for the queue monitoring and alert system
- Queue Monitoring Configuration - Configuration guide for queue alerts and thresholds
- Image Hashing - Documentation on perceptual image hashing for duplicate detection
Advanced Features#
Perceptual Image Hashing#
Stagehand includes a sophisticated image hashing system to detect duplicate and similar images:
- Automatic Processing: Images are automatically hashed when cached
- SQLite Database: Stores perceptual hashes with URLs and metadata
- Similarity Detection: Find visually similar images using Hamming distance
- Duplicate Prevention: Helps identify duplicate content from different sources
- Cleanup Management: Automatically removes database entries for deleted files
The image hashing system uses the imghash library to generate perceptual hashes (pHash) that can detect similar images even after resizing, compression, or minor modifications.
For detailed information, see Image Hashing Documentation.
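Similarity detection by Hamming distance boils down to counting the bits that differ between two hashes. The sketch below assumes hashes are hex strings of equal length; the threshold of 5 is an arbitrary example value, not the bot's setting.

```javascript
// Count differing bits between two equal-length hex-encoded hashes.
function hammingDistance(hexA, hexB) {
  if (hexA.length !== hexB.length) {
    throw new Error("Hashes must have the same length");
  }
  let distance = 0;
  for (let i = 0; i < hexA.length; i++) {
    // XOR leaves a 1 bit wherever the two hex digits disagree.
    let diff = parseInt(hexA[i], 16) ^ parseInt(hexB[i], 16);
    while (diff > 0) {
      distance += diff & 1;
      diff >>= 1;
    }
  }
  return distance;
}

// Hypothetical threshold: treat hashes within maxDistance bits as similar.
function isSimilar(hexA, hexB, maxDistance = 5) {
  return hammingDistance(hexA, hexB) <= maxDistance;
}
```

A distance of 0 means the hashes are identical; small distances usually indicate the same image after resizing or recompression.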
Announcement System#
Create and manage scheduled text announcements that are posted to your channel:
- Multiple Announcements: Support for multiple independent announcements
- Individual Schedules: Each announcement can have its own cron schedule
- Interactive Management: Create, edit, and delete announcements through the bot
- Persistent Storage: Announcements are saved and survive bot restarts
- Test Functionality: Test announcements before scheduling them
Use /announce to create new announcements and /announcements to manage existing ones.
Auto-Updater#
Stagehand includes an automatic update system that keeps your bot up to date:
- Periodic Checks: Automatically checks for updates every 12 hours
- Git Integration: Pulls updates from your Git repository
- PM2 Restart: Automatically restarts the bot using PM2 after updating
- Manual Updates: Use the /update command to check and apply updates immediately
- Owner-Only: Update command is restricted to the bot owner
- Dev Mode Exclusion: Auto-updater is disabled in development mode
Update Flow#
When an update is triggered (either automatically or via the /update command), the bot follows this process:
- Stash Local Changes: Any uncommitted local modifications are automatically stashed to prevent conflicts
- Fetch Remote Changes: Retrieves the latest commits from the remote repository
- Check for Updates: Verifies if there are new commits to pull
- Pull Changes: Applies the updates using git pull
- Save Queue: Forces a save of the current queue to disk to prevent data loss
- Display Commit Info: Shows the latest commit message with author and timestamp
- Restart Bot: Gracefully restarts the bot via PM2 with updated environment variables
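The sequence above might be sketched as follows. The command runner is injected so the flow can be exercised without a real repository; the exact Git commands, the PM2 process name "stagehand", and the function names are assumptions, not the bot's actual implementation.

```javascript
// Wrap a step so that a failure names the step that went wrong.
function step(label, fn) {
  try {
    return fn();
  } catch (err) {
    throw new Error(`${label} failed: ${err.message}`);
  }
}

// Hypothetical update flow. `run(command)` executes a shell command and returns its stdout.
function runUpdate(run, saveQueue) {
  step("Stash", () => run("git stash"));
  step("Fetch", () => run("git fetch"));

  // Number of commits on the upstream branch that are not in HEAD.
  const behind = Number(step("Check", () => run("git rev-list --count HEAD..@{u}")).trim());
  if (behind === 0) return { updated: false };

  step("Pull", () => run("git pull"));
  step("Save queue", () => saveQueue());
  const commit = step("Commit info", () => run('git log -1 --format="%s (%an, %ad)"')).trim();

  // "stagehand" is a placeholder for the PM2 process name.
  step("Restart", () => run("pm2 restart stagehand --update-env"));
  return { updated: true, commit };
}
```

In a real process, run could be (cmd) => require("child_process").execSync(cmd, { encoding: "utf8" }).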
Error Handling#
Each step in the update process includes comprehensive error handling:
- Stash Failures: Reports if local changes cannot be stashed
- Fetch Failures: Reports connection or repository access issues
- Pull Failures: Reports merge conflicts or pull errors
- Restart Failures: Provides instructions to manually restart if PM2 fails
- Descriptive Messages: Each error includes specific information about what went wrong
The updater handles the normal PM2 SIGINT signal correctly and won't report it as an error during successful restarts.
Media Recaching#
Automatically handle missing or corrupted cache files:
- Automatic Detection: Identifies missing cache files in the queue
- Redownload Support: Automatically redownloads missing media files
- Failure Tracking: Tracks failed attempts and removes items after 3 failures
- Manual Trigger: Use the /recache command to manually trigger recaching
- Scheduled Execution: Can be scheduled to run automatically via cron
- Progress Reporting: Reports how many items were processed and removed
This ensures your queue remains healthy and all media files are available when needed.
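The failure tracking can be pictured with the sketch below. It is synchronous for brevity, whereas real downloads would be asynchronous; the item fields cachePath and failedAttempts and the injected fileExists and redownload callbacks are hypothetical.

```javascript
const MAX_RECACHE_ATTEMPTS = 3; // items are dropped after this many failures

// Hypothetical recache pass over the queue.
// fileExists(path) and redownload(item) are injected and return booleans.
function recacheQueue(queue, fileExists, redownload) {
  const kept = [];
  let processed = 0;
  let removed = 0;

  for (const item of queue) {
    if (fileExists(item.cachePath)) {
      kept.push(item); // nothing to do, the cached file is present
      continue;
    }
    processed++;
    if (redownload(item)) {
      kept.push({ ...item, failedAttempts: 0 });
      continue;
    }
    const failedAttempts = (item.failedAttempts || 0) + 1;
    if (failedAttempts >= MAX_RECACHE_ATTEMPTS) {
      removed++; // give up on this item
    } else {
      kept.push({ ...item, failedAttempts });
    }
  }
  return { queue: kept, processed, removed };
}
```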
Discord Integration#
Optional Discord webhook support for cross-platform posting:
- Webhook Support: Post media to Discord channels via webhooks
- Parallel Posting: Post to both Telegram and Discord simultaneously
- Independent Configuration: Enable/disable Discord without affecting Telegram
- Media Compatibility: Handles both images and videos
Configure using the DISCORD_WEBHOOK_URL and DISCORD_ENABLED environment variables.
License#
GPL V3
ToDo List#
- ATProto Implementation
- Basic e621 Scraper
- FurAffinity Scraper
- SoFurry Scraper
- Weasyl Scraper
- Interactive Graphical Queue Manager
- Add shuffle mode for queue
- Add perceptual hashing
- Redo Bluesky Module
- Redo Telegram Module
- Queue monitoring and alerts
- Announcement system
- Auto-updater
- Media recaching system
- Redo Queue Manager
- Redo Discord Module