Import your Last.fm and Spotify listening history to the AT Protocol network using the fm.teal.alpha.feed.play lexicon.
REWRITE

Ewan 68589d22 590df1f1

+1289 -712
+263 -46
README.md
··· 4 4 5 5 (Also [on Tangled!](https://tangled.org/@did:plc:ofrbh253gwicbkc5nktqepol/atproto-lastfm-importer)) 6 6 7 + ## Features 8 + 9 + - ✅ **Rate Limiting**: Automatically limits imports to 1K records per day to prevent rate limiting your entire PDS 10 + - ✅ **Multi-Day Imports**: Large imports (>1K records) automatically span multiple days with 24-hour pauses 11 + - ✅ **Resume Support**: Safe to stop (Ctrl+C) and restart - continues from where it left off 12 + - ✅ **Graceful Cancellation**: Press Ctrl+C to stop after the current batch completes 13 + - ✅ **Identity Resolution**: Resolves ATProto handles/DIDs using Slingshot 14 + - ✅ **PDS Auto-Discovery**: Automatically connects to your personal PDS 15 + - ✅ **Dry Run Mode**: Preview records without publishing 16 + - ✅ **Batch Processing**: Configurable batching with rate limit safety 17 + - ✅ **Progress Tracking**: Real-time progress with time estimates 18 + - ✅ **Error Handling**: Continues on errors with detailed reporting 19 + - ✅ **MusicBrainz Support**: Preserves MusicBrainz IDs when available 20 + - ✅ **Chronological Ordering**: Processes oldest first (or newest with `-r` flag) 21 + 22 + ## Important: Rate Limits 23 + 24 + ⚠️ **CRITICAL**: Bluesky's AppView has rate limits on PDS instances. Exceeding 10K records per day can rate limit your **ENTIRE PDS**, affecting all users on your instance! 
25 + 26 + This importer automatically: 27 + - Limits imports to **1,000 records per day** (90% of safe limit) 28 + - Calculates optimal batch sizes and delays 29 + - Pauses 24 hours between days for large imports 30 + - Shows clear progress and time estimates 31 + 32 + See: [Bluesky Rate Limits Documentation](https://docs.bsky.app/blog/rate-limits-pds-v3) 33 + 7 34 ## Setup 8 35 9 36 ```bash 10 37 npm install 38 + npm run build 11 39 ``` 12 40 13 41 ## Usage 14 42 15 43 ### Interactive Mode 44 + 45 + The simplest way to use the importer - just run it and follow the prompts: 16 46 17 47 ```bash 18 - node importer.js 48 + npm start 19 49 ``` 20 50 21 - ### With Command Line Arguments 51 + ### Command Line Mode 22 52 23 - **Full automation:** 53 + For automation or scripting, provide all parameters via flags: 24 54 25 55 ```bash 26 - node importer.js -f lastfm.csv -i alice.bsky.social -p xxxx-xxxx-xxxx-xxxx -y 27 - ``` 56 + # Full automation 57 + npm start -- -f lastfm.csv -i alice.bsky.social -p xxxx-xxxx-xxxx-xxxx -y 28 58 29 - **Dry run (preview without publishing):** 59 + # Preview without publishing 60 + npm start -- -f lastfm.csv --dry-run 30 61 31 - ```bash 32 - node importer.js -f lastfm.csv --dry-run 62 + # Custom batch settings (advanced users) 63 + npm start -- -f lastfm.csv -i alice.bsky.social -b 20 -d 3000 64 + 65 + # Process newest tracks first 66 + npm start -- -f lastfm.csv -i alice.bsky.social -r -y 33 67 ``` 34 68 35 - **Custom batch settings:** 69 + ## Command Line Options 70 + 71 + | Option | Short | Description | Default | 72 + |--------|-------|-------------|---------| 73 + | `--help` | `-h` | Show help message | - | 74 + | `--file <path>` | `-f` | Path to Last.fm CSV export file | (prompted) | 75 + | `--identifier <id>` | `-i` | ATProto handle or DID | (prompted) | 76 + | `--password <pass>` | `-p` | ATProto app password | (prompted) | 77 + | `--batch-size <num>` | `-b` | Records per batch | Auto-calculated | 78 + | `--batch-delay <ms>` | `-d` 
| Delay between batches in ms | 2000 (min: 1000) | 79 + | `--yes` | `-y` | Skip confirmation prompt | false | 80 + | `--dry-run` | `-n` | Preview without publishing | false | 81 + | `--reverse-chronological` | `-r` | Process newest first | false (oldest first) | 36 82 37 - ```bash 38 - node importer.js -f lastfm.csv -i alice.bsky.social -b 20 -d 3000 39 - ``` 83 + ### Batch Settings 84 + 85 + The importer automatically calculates optimal batch settings based on your total record count and rate limits. You generally **don't need** to specify batch settings unless you have specific requirements. 86 + 87 + **Automatic behavior:** 88 + - For imports < 1K records: Uses default settings (10 records/batch, 2s delay) 89 + - For imports > 1K records: Automatically calculates settings to spread across multiple days 40 90 41 - ## Options 91 + **Manual override** (advanced): 92 + - `--batch-size`: Number of records processed per batch (1-50) 93 + - `--batch-delay`: Milliseconds to wait between batches (min: 1000) 42 94 43 - - `-h, --help` - Show help message 44 - - `-f, --file <path>` - Path to Last.fm CSV export file 45 - - `-i, --identifier <id>` - ATProto handle or DID 46 - - `-p, --password <pass>` - ATProto app password 47 - - `-b, --batch-size <num>` - Records per batch (default: 10) 48 - - `-d, --batch-delay <ms>` - Delay between batches in ms (default: 2000) 49 - - `-y, --yes` - Skip confirmation prompt 50 - - `-n, --dry-run` - Preview records without publishing 95 + ⚠️ Lower delays increase speed but risk hitting rate limits. The automatic calculation is recommended. 51 96 52 97 ## Getting Your Last.fm Data 53 98 54 99 1. Go to <https://lastfm.ghan.nl/export/> 55 - 2. Request your data export in CSV 100 + 2. Request your data export in CSV format 56 101 3. Download the CSV file when ready 57 102 4. 
Use the CSV file path with this script 58 103 59 - ## Features 104 + ## What Gets Imported 105 + 106 + Each Last.fm scrobble becomes an `fm.teal.alpha.feed.play` record with: 60 107 61 - - ✅ Resolves ATProto handles/DIDs using Slingshot 62 - - ✅ Connects to your personal PDS 63 - - ✅ Converts Last.fm scrobbles to `fm.teal.alpha.feed.play` records 64 - - ✅ Follows the official lexicon schema 65 - - ✅ Batch publishing with configurable rate limiting 66 - - ✅ Dry run mode for previewing 67 - - ✅ Progress tracking and error reporting 68 - - ✅ Preserves MusicBrainz IDs when available 108 + ### Required Fields 109 + - **trackName**: The name of the track 110 + - **artists**: Array of artist objects (requires `artistName`, optional `artistMbId`) 111 + - **playedTime**: ISO 8601 timestamp of when you listened 112 + - **submissionClientAgent**: Identifies this importer (`lastfm-importer/v0.0.2`) 113 + - **musicServiceBaseDomain**: Always set to `last.fm` 69 114 70 - ## Record Format 115 + ### Optional Fields (when available) 116 + - **releaseName**: Album/release name 117 + - **releaseMbId**: MusicBrainz release ID 118 + - **recordingMbId**: MusicBrainz recording/track ID 119 + - **originUrl**: Link to the track on Last.fm 71 120 72 - Each scrobble is converted according to the `fm.teal.alpha.feed.play` lexicon: 121 + ### Example Record 73 122 74 123 ```json 75 124 { ··· 86 135 "recordingMbId": "3a390ad3-fe56-45f2-a073-bebc45d6bde1", 87 136 "playedTime": "2025-11-13T23:49:36Z", 88 137 "originUrl": "https://www.last.fm/music/Cjbeards/_/Paint+My+Masterpiece", 89 - "submissionClientAgent": "lastfm-importer/v1.0.0", 138 + "submissionClientAgent": "lastfm-importer/v0.0.2", 90 139 "musicServiceBaseDomain": "last.fm" 91 140 } 92 141 ``` 93 142 94 - ### Required Fields 143 + ## Processing Order 144 + 145 + By default, records are processed **oldest first** (chronological order). This means your earliest scrobbles will appear first in your ATProto feed. 
146 + 147 + Use the `--reverse-chronological` or `-r` flag to process **newest first** instead. 148 + 149 + ## Multi-Day Imports 150 + 151 + For imports exceeding 1,000 records (after applying the 90% safety margin), the importer automatically: 152 + 153 + 1. **Calculates a schedule**: Splits your import across multiple days 154 + 2. **Shows the plan**: Displays which records will be imported each day 155 + 3. **Processes Day 1**: Imports the first batch of records 156 + 4. **Pauses 24 hours**: Waits a full day before continuing 157 + 5. **Repeats**: Continues until all records are imported 158 + 159 + **Important notes:** 160 + - You can safely stop (Ctrl+C) and restart the importer 161 + - Progress is preserved - it continues where it left off 162 + - Each day's progress is clearly displayed 163 + - Time estimates account for multi-day duration 164 + 165 + Example output for a 5,000 record import: 166 + ``` 167 + 📊 Rate Limiting Information: 168 + Total records: 5,000 169 + Daily limit: 900 records/day 170 + Estimated duration: 6 days 171 + Batch size: 10 records 172 + Batch delay: 9600.0s 173 + ``` 174 + 175 + ## Dry Run Mode 176 + 177 + Preview what will be imported without actually publishing: 178 + 179 + ```bash 180 + npm start -- -f lastfm.csv --dry-run 181 + ``` 182 + 183 + Dry run shows: 184 + - Total record count 185 + - Rate limiting schedule (if applicable) 186 + - Multi-day import plan (if needed) 187 + - Preview of first 5 records with full details 188 + - MusicBrainz IDs when available 189 + 190 + ## Error Handling 191 + 192 + The importer is designed to be resilient: 193 + 194 + - **Network errors**: Records that fail are logged but don't stop the import 195 + - **Invalid data**: Skipped with error messages 196 + - **Authentication issues**: Clear error messages with suggested fixes 197 + - **Rate limit hits**: Automatic adjustment and retry logic 198 + - **Ctrl+C handling**: Gracefully stops after current batch 199 + 200 + Failed records are logged 
but don't prevent the rest of your import from completing. 201 + 202 + ## Project Structure 203 + 204 + ``` 205 + atproto-lastfm-importer/ 206 + ├── src/ 207 + │ ├── lib/ 208 + │ │ ├── auth.ts # Authentication & identity resolution 209 + │ │ ├── cli.ts # Command line argument parsing 210 + │ │ ├── csv.ts # CSV parsing & record conversion 211 + │ │ └── publisher.ts # Batch publishing with rate limiting 212 + │ ├── utils/ 213 + │ │ ├── helpers.ts # Utility functions (timing, formatting) 214 + │ │ ├── input.ts # User input handling (prompts, passwords) 215 + │ │ └── rate-limiter.ts # Rate limiting calculations 216 + │ ├── config.ts # Configuration constants 217 + │ └── types.ts # TypeScript type definitions 218 + ├── lexicons/ # fm.teal.alpha lexicon definitions 219 + │ └── fm.teal.alpha/ 220 + │ └── feed/ 221 + │ └── play.json # Play record schema 222 + ├── package.json 223 + ├── tsconfig.json 224 + └── README.md 225 + ``` 95 226 96 - - `trackName` - The name of the track 97 - - `artists` - Array of artist objects with `artistName` (required) and optional `artistMbId` 227 + ## Development 98 228 99 - ### Optional Fields 229 + ```bash 230 + # Type checking 231 + npm run type-check 100 232 101 - - `releaseName` - Album name 102 - - `releaseMbId` - MusicBrainz release ID 103 - - `recordingMbId` - MusicBrainz recording ID 104 - - `playedTime` - ISO 8601 datetime 105 - - `originUrl` - Link to the track 106 - - `submissionClientAgent` - Client identifier 107 - - `musicServiceBaseDomain` - Service domain (e.g., "last.fm") 233 + # Build 234 + npm run build 235 + 236 + # Development mode (rebuild + run) 237 + npm run dev 238 + 239 + # Clean build artifacts 240 + npm run clean 241 + ``` 242 + 243 + ## Technical Details 244 + 245 + ### Authentication 246 + - Uses Slingshot resolver to discover your PDS from your handle/DID 247 + - Requires an ATProto app password (not your main password) 248 + - Automatically configures the agent for your personal PDS 249 + 250 + ### Rate 
Limiting Algorithm 251 + 1. Calculates safe daily limit (90% of 1K = 900 records/day) 252 + 2. Determines how many days needed for your import 253 + 3. Calculates optimal batch size and delay to spread records evenly 254 + 4. Enforces minimum 1 second delay between batches 255 + 5. Shows clear schedule before starting 256 + 257 + ### Record Processing 258 + 1. Parses CSV using `csv-parse` library 259 + 2. Sorts records chronologically (or reverse if `-r` flag) 260 + 3. Converts Last.fm format to `fm.teal.alpha.feed.play` schema 261 + 4. Validates required fields 262 + 5. Publishes in batches with configurable delays 263 + 264 + ### Data Mapping 265 + - **Track info**: Direct mapping from CSV columns 266 + - **Timestamps**: Converts Unix timestamps to ISO 8601 267 + - **MusicBrainz IDs**: Preserved when present in CSV 268 + - **URLs**: Generated from artist/track names 269 + - **Artists**: Wrapped in array format with optional MBID 108 270 109 271 ## Lexicon Reference 110 272 111 - This importer follows the lexicon defined in `/lexicons/fm.teal.alpha/feed/play.json`. 273 + This importer follows the official `fm.teal.alpha` lexicon defined in `/lexicons/fm.teal.alpha/feed/play.json`. 
274 + 275 + The lexicon defines: 276 + - Required and optional field types 277 + - String length constraints 278 + - Array formats 279 + - Timestamp formatting 280 + - URL validation 281 + 282 + ## Troubleshooting 283 + 284 + ### "Handle not found" 285 + - Verify your ATProto handle is correct (e.g., `alice.bsky.social`) 286 + - Make sure you're using a valid DID or handle 287 + 288 + ### "Invalid credentials" 289 + - Use an **app password**, not your main account password 290 + - Generate app passwords in your account settings 291 + 292 + ### "Rate limit exceeded" 293 + - The importer should prevent this automatically 294 + - If you see this, wait 24 hours before retrying 295 + - Consider reducing batch size or increasing delay 296 + 297 + ### "Connection refused" 298 + - Check your internet connection 299 + - Verify your PDS is accessible 300 + - Some PDSs may have firewall rules 301 + 302 + ### Import seems stuck 303 + - Check progress messages - large imports take time 304 + - Multi-day imports pause for 24 hours between days 305 + - You can safely stop (Ctrl+C) and resume later 306 + 307 + ## Contributing 308 + 309 + Contributions welcome! Please: 310 + 1. Fork the repository 311 + 2. Create a feature branch 312 + 3. Make your changes with tests 313 + 4. Submit a pull request 314 + 315 + ## License 316 + 317 + MIT License - See LICENSE file for details 318 + 319 + ## Credits 320 + 321 + - Uses [@atproto/api](https://www.npmjs.com/package/@atproto/api) for ATProto interactions 322 + - CSV parsing via [csv-parse](https://www.npmjs.com/package/csv-parse) 323 + - Identity resolution via [Slingshot](https://slingshot.danner.cloud) 324 + - Follows the `fm.teal.alpha` lexicon standard 325 + 326 + --- 327 + 328 + **Note**: This tool is for personal use. Respect Last.fm's terms of service and rate limits when exporting your data.
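The "Data Mapping" notes in the README diff above can be illustrated with a small sketch. The CSV column names (`uts`, `artist`, `artist_mbid`, `album`, `album_mbid`, `track`, `track_mbid`), the helper names, and the URL-encoding rule are assumptions made for illustration; the actual `src/lib/csv.ts` is not shown in this diff and may differ.

```typescript
// Hypothetical shape of one parsed CSV row; column names are an assumption.
interface LastFmRow {
  uts: string; // Unix timestamp in seconds
  artist: string;
  artist_mbid?: string;
  album?: string;
  album_mbid?: string;
  track: string;
  track_mbid?: string;
}

// Field names follow the README's "What Gets Imported" section.
interface PlayRecord {
  trackName: string;
  artists: { artistName: string; artistMbId?: string }[];
  playedTime: string;
  submissionClientAgent: string;
  musicServiceBaseDomain: string;
  releaseName?: string;
  releaseMbId?: string;
  recordingMbId?: string;
  originUrl?: string;
}

const CLIENT_AGENT = 'lastfm-importer/v0.0.2';

// Convert Unix seconds to ISO 8601 without milliseconds, e.g. 1970-01-01T00:00:00Z.
function toIsoSeconds(unixSeconds: number): string {
  if (!Number.isFinite(unixSeconds)) {
    throw new Error('Invalid Unix timestamp');
  }
  return new Date(unixSeconds * 1000).toISOString().replace(/\.\d{3}Z$/, 'Z');
}

// Assumed encoding for Last.fm URL segments: percent-encode, spaces become "+".
function lastFmSegment(name: string): string {
  return encodeURIComponent(name).replace(/%20/g, '+');
}

function toPlayRecord(row: LastFmRow): PlayRecord {
  const artist: { artistName: string; artistMbId?: string } = { artistName: row.artist };
  if (row.artist_mbid) artist.artistMbId = row.artist_mbid;

  const record: PlayRecord = {
    trackName: row.track,
    artists: [artist],
    playedTime: toIsoSeconds(Number(row.uts)),
    submissionClientAgent: CLIENT_AGENT,
    musicServiceBaseDomain: 'last.fm',
    originUrl: `https://www.last.fm/music/${lastFmSegment(row.artist)}/_/${lastFmSegment(row.track)}`,
  };

  // Optional fields are only set when the CSV provides a non-empty value.
  if (row.album) record.releaseName = row.album;
  if (row.album_mbid) record.releaseMbId = row.album_mbid;
  if (row.track_mbid) record.recordingMbId = row.track_mbid;
  return record;
}
```

Leaving empty optional fields out entirely, rather than writing empty strings, keeps the record close to the example shown in the README.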
-126
STRUCTURE.md
··· 1 - # Last.fm to ATProto Importer - Modular Structure 2 - 3 - ## Project Structure 4 - 5 - ```plaintext 6 - lastfm-importer/ 7 - ├── src/ 8 - │ ├── index.js # Main entry point 9 - │ ├── config.js # Configuration constants 10 - │ ├── lib/ # Core library modules 11 - │ │ ├── auth.js # Authentication & login 12 - │ │ ├── cli.js # CLI argument parsing & help 13 - │ │ ├── csv.js # CSV parsing & conversion 14 - │ │ └── publisher.js # Record publishing logic 15 - │ └── utils/ # Utility functions 16 - │ ├── helpers.js # Helper functions (formatting, batch calculation) 17 - │ ├── input.js # User input & password masking 18 - │ └── killswitch.js # Graceful shutdown handling 19 - ├── importer.js # Wrapper for backwards compatibility 20 - └── importer.old.js # Original monolithic version (backup) 21 - ``` 22 - 23 - ## Module Responsibilities 24 - 25 - ### `/src/config.js` 26 - 27 - - Configuration constants 28 - - Batch size calculation parameters 29 - - API endpoints and client information 30 - 31 - ### `/src/lib/auth.js` 32 - 33 - - ATProto authentication 34 - - Identity resolution via Slingshot 35 - - Login error handling 36 - 37 - ### `/src/lib/cli.js` 38 - 39 - - Command-line argument parsing 40 - - Help text display 41 - - Input validation 42 - 43 - ### `/src/lib/csv.js` 44 - 45 - - CSV file parsing 46 - - Record conversion to ATProto format 47 - - Chronological sorting 48 - 49 - ### `/src/lib/publisher.js` 50 - 51 - - Batch publishing with rate limiting 52 - - Dry-run preview mode 53 - - Progress tracking and reporting 54 - - Killswitch integration 55 - 56 - ### `/src/utils/helpers.js` 57 - 58 - - Duration formatting 59 - - Optimal batch size calculation (logarithmic algorithm) 60 - - Generic utility functions 61 - 62 - ### `/src/utils/input.js` 63 - 64 - - Interactive prompts 65 - - Password masking with asterisks 66 - - Backspace support 67 - 68 - ### `/src/utils/killswitch.js` 69 - 70 - - SIGINT handler 71 - - Graceful shutdown state management 72 - - Force-quit 
on second Ctrl+C 73 - 74 - ## Benefits of Modular Structure 75 - 76 - 1. **Maintainability**: Each module has a single responsibility 77 - 2. **Testability**: Individual modules can be tested in isolation 78 - 3. **Reusability**: Modules can be imported and reused 79 - 4. **Readability**: Smaller files are easier to understand 80 - 5. **Collaboration**: Multiple developers can work on different modules 81 - 6. **Debugging**: Easier to locate and fix issues 82 - 83 - ## Usage 84 - 85 - The wrapper file (`importer.js`) maintains backwards compatibility: 86 - 87 - ```bash 88 - # Still works exactly as before 89 - node importer.js -f lastfm.csv -i handle.bsky.social 90 - 91 - # Or use the modular version directly 92 - node src/index.js -f lastfm.csv -i handle.bsky.social 93 - ``` 94 - 95 - ## Algorithm Details 96 - 97 - ### Batch Size Calculation 98 - 99 - Located in `/src/utils/helpers.js`: 100 - 101 - ```javascript 102 - batchSize = BASE + (log2(records/MIN) * SCALING_FACTOR) 103 - ``` 104 - 105 - - **Time Complexity**: O(n) - each record processed once 106 - - **Space Complexity**: O(b) where b is batch size 107 - - **Rate Limit Strategy**: Token bucket approach 108 - - **Adaptive**: Adjusts based on total records and delay settings 109 - 110 - ### Processing Order 111 - 112 - - Default: Chronological (oldest first) 113 - - Option: `--reverse-chronological` for newest first 114 - - Sorted by `playedTime` field 115 - 116 - ## Future Improvements 117 - 118 - With the modular structure, it's now easier to: 119 - 120 - - Add unit tests for each module 121 - - Implement different authentication methods 122 - - Support multiple export formats (JSON, XML) 123 - - Add progress persistence (resume interrupted imports) 124 - - Implement retry logic with exponential backoff 125 - - Add statistics and analytics 126 - - Create a web UI that imports these modules
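The batch size formula quoted in the removed STRUCTURE.md can be sketched as follows. The constants come from the removed `src/config.js`; the rounding, the handling of small imports, and the cap at `MAX_BATCH_SIZE` are assumptions, since the original `helpers.js` is not part of this diff. The original function also took the batch delay as an argument, which this sketch ignores.

```typescript
// Constants as listed in the removed src/config.js.
const BASE_BATCH_SIZE = 5;
const MIN_RECORDS_FOR_SCALING = 100;
const SCALING_FACTOR = 1.5;
const MAX_BATCH_SIZE = 50;

// batchSize = BASE + log2(records / MIN) * SCALING_FACTOR
// Assumed details: imports at or below MIN use the base size,
// the result is rounded down, and it never exceeds MAX_BATCH_SIZE.
function logBatchSize(totalRecords: number): number {
  if (totalRecords <= MIN_RECORDS_FOR_SCALING) {
    return BASE_BATCH_SIZE;
  }
  const raw =
    BASE_BATCH_SIZE +
    Math.log2(totalRecords / MIN_RECORDS_FOR_SCALING) * SCALING_FACTOR;
  return Math.min(MAX_BATCH_SIZE, Math.floor(raw));
}
```

Under these assumptions the batch size grows slowly: each doubling of the record count adds 1.5 records per batch before rounding.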
-5
importer.js
··· 1 - #!/usr/bin/env node 2 - 3 - // Wrapper file for backwards compatibility 4 - // This imports and runs the modular version 5 - import './src/index.js';
+40 -2
package-lock.json
··· 1 1 { 2 2 "name": "lastfm-importer", 3 - "version": "1.0.0", 3 + "version": "0.0.2", 4 4 "lockfileVersion": 3, 5 5 "requires": true, 6 6 "packages": { 7 7 "": { 8 8 "name": "lastfm-importer", 9 - "version": "1.0.0", 9 + "version": "0.0.2", 10 10 "license": "MIT", 11 11 "dependencies": { 12 12 "@atproto/api": "^0.13.0", 13 13 "csv-parse": "^5.5.0" 14 + }, 15 + "bin": { 16 + "lastfm-import": "dist/index.js" 17 + }, 18 + "devDependencies": { 19 + "@types/node": "^20.0.0", 20 + "typescript": "^5.3.0" 14 21 } 15 22 }, 16 23 "node_modules/@atproto/api": { ··· 74 81 "dependencies": { 75 82 "@atproto/lexicon": "^0.4.10", 76 83 "zod": "^3.23.8" 84 + } 85 + }, 86 + "node_modules/@types/node": { 87 + "version": "20.19.25", 88 + "resolved": "https://registry.npmjs.org/@types/node/-/node-20.19.25.tgz", 89 + "integrity": "sha512-ZsJzA5thDQMSQO788d7IocwwQbI8B5OPzmqNvpf3NY/+MHDAS759Wo0gd2WQeXYt5AAAQjzcrTVC6SKCuYgoCQ==", 90 + "dev": true, 91 + "license": "MIT", 92 + "dependencies": { 93 + "undici-types": "~6.21.0" 77 94 } 78 95 }, 79 96 "node_modules/await-lock": { ··· 115 132 "tlds": "bin.js" 116 133 } 117 134 }, 135 + "node_modules/typescript": { 136 + "version": "5.9.3", 137 + "resolved": "https://registry.npmjs.org/typescript/-/typescript-5.9.3.tgz", 138 + "integrity": "sha512-jl1vZzPDinLr9eUt3J/t7V6FgNEw9QjvBPdysz9KfQDD41fQrC2Y4vKQdiaUpFT4bXlb1RHhLpp8wtm6M5TgSw==", 139 + "dev": true, 140 + "license": "Apache-2.0", 141 + "bin": { 142 + "tsc": "bin/tsc", 143 + "tsserver": "bin/tsserver" 144 + }, 145 + "engines": { 146 + "node": ">=14.17" 147 + } 148 + }, 118 149 "node_modules/uint8arrays": { 119 150 "version": "3.0.0", 120 151 "resolved": "https://registry.npmjs.org/uint8arrays/-/uint8arrays-3.0.0.tgz", ··· 123 154 "dependencies": { 124 155 "multiformats": "^9.4.2" 125 156 } 157 + }, 158 + "node_modules/undici-types": { 159 + "version": "6.21.0", 160 + "resolved": "https://registry.npmjs.org/undici-types/-/undici-types-6.21.0.tgz", 161 + "integrity": 
"sha512-iwDZqg0QAGrg9Rav5H4n0M64c3mkR59cJ6wQp+7C4nI0gsmExaedaYLNO44eT4AtBBwjbTiGPMlt2Md0T9H9JQ==", 162 + "dev": true, 163 + "license": "MIT" 126 164 }, 127 165 "node_modules/zod": { 128 166 "version": "3.25.76",
+19 -6
package.json
··· 1 1 { 2 2 "name": "lastfm-importer", 3 - "version": "1.0.0", 4 - "description": "Import Last.fm scrobbles to ATProto", 3 + "version": "0.0.2", 4 + "description": "Import Last.fm scrobbles to ATProto with rate limiting", 5 5 "type": "module", 6 - "main": "importer.js", 6 + "main": "./dist/index.js", 7 + "types": "./dist/index.d.ts", 8 + "bin": { 9 + "lastfm-import": "./dist/index.js" 10 + }, 7 11 "scripts": { 8 - "start": "node importer.js", 9 - "dry-run": "node importer.js --dry-run" 12 + "build": "tsc", 13 + "start": "npm run build && node dist/index.js", 14 + "dev": "tsc && node dist/index.js", 15 + "dry-run": "npm run build && node dist/index.js --dry-run", 16 + "clean": "rm -rf dist", 17 + "type-check": "tsc --noEmit" 10 18 }, 11 19 "keywords": [ 12 20 "lastfm", 13 21 "atproto", 14 22 "bluesky", 15 - "import" 23 + "import", 24 + "typescript" 16 25 ], 17 26 "author": "", 18 27 "license": "MIT", 19 28 "dependencies": { 20 29 "@atproto/api": "^0.13.0", 21 30 "csv-parse": "^5.5.0" 31 + }, 32 + "devDependencies": { 33 + "@types/node": "^20.0.0", 34 + "typescript": "^5.3.0" 22 35 } 23 36 }
-16
src/config.js
··· 1 - /** 2 - * Configuration constants for the Last.fm importer 3 - */ 4 - 5 - export const DEFAULT_BATCH_SIZE = 10; 6 - export const DEFAULT_BATCH_DELAY = 1500; 7 - export const MIN_BATCH_DELAY = 100; 8 - export const RECORD_TYPE = 'fm.teal.alpha.feed.play'; 9 - export const SLINGSHOT_RESOLVER = 'https://slingshot.microcosm.blue/xrpc/com.bad-example.identity.resolveMiniDoc'; 10 - export const CLIENT_AGENT = 'lastfm-importer/v0.0.1'; 11 - 12 - // Batch size calculation constants 13 - export const MIN_RECORDS_FOR_SCALING = 100; 14 - export const BASE_BATCH_SIZE = 5; 15 - export const MAX_BATCH_SIZE = 50; 16 - export const SCALING_FACTOR = 1.5;
+49
src/config.ts
··· 1 + import type { Config } from './types.js'; 2 + 3 + // ⚠️ IMPORTANT: Rate Limit Warning 4 + // Bluesky's AppView has rate limits on PDS instances: 5 + // - Exceeding 10K records per day can rate limit your ENTIRE PDS 6 + // - This affects all users on your PDS, not just your account 7 + // - See: https://docs.bsky.app/blog/rate-limits-pds-v3 8 + // 9 + // Default limit: 1K records per day (automatically batched with pauses) 10 + export const RECORDS_PER_DAY_LIMIT = 1000; 11 + 12 + // Safety margin factor (0.9 = use 90% of limit to be safe) 13 + export const SAFETY_MARGIN = 0.9; 14 + 15 + // Record type 16 + export const RECORD_TYPE = 'fm.teal.alpha.feed.play'; 17 + 18 + // Client agent 19 + export const CLIENT_AGENT = 'lastfm-importer/v0.0.2'; 20 + 21 + // Default batch configuration (will be adjusted for rate limiting) 22 + export const DEFAULT_BATCH_SIZE = 10; 23 + export const DEFAULT_BATCH_DELAY = 2000; // 2 seconds 24 + 25 + // Minimum safe delay between batches (1 second) 26 + export const MIN_BATCH_DELAY = 1000; 27 + 28 + // Maximum batch size 29 + export const MAX_BATCH_SIZE = 50; 30 + 31 + // Slingshot resolver URL 32 + export const SLINGSHOT_RESOLVER = 'https://slingshot.danner.cloud'; 33 + 34 + const config: Config = { 35 + RECORD_TYPE, 36 + MIN_RECORDS_FOR_SCALING: 20, 37 + BASE_BATCH_SIZE: 10, 38 + SCALING_FACTOR: 1.5, 39 + CLIENT_AGENT, 40 + DEFAULT_BATCH_SIZE, 41 + DEFAULT_BATCH_DELAY, 42 + MIN_BATCH_DELAY, 43 + MAX_BATCH_SIZE, 44 + SLINGSHOT_RESOLVER, 45 + RECORDS_PER_DAY_LIMIT, 46 + SAFETY_MARGIN, 47 + }; 48 + 49 + export default config;
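To show how `RECORDS_PER_DAY_LIMIT` and `SAFETY_MARGIN` combine into the multi-day schedule described in the README, here is a minimal sketch. The `planImport` function and the `ImportSchedule` type are hypothetical names; the real logic lives in `src/utils/rate-limiter.ts`, which this diff does not show.

```typescript
// Values copied from src/config.ts above.
const RECORDS_PER_DAY_LIMIT = 1000;
const SAFETY_MARGIN = 0.9;

// Hypothetical result type for illustration.
interface ImportSchedule {
  dailyLimit: number;      // records allowed per day after the safety margin
  days: number;            // number of days the import spans
  recordsPerDay: number[]; // how many records are published on each day
}

function planImport(totalRecords: number): ImportSchedule {
  // 90% of 1,000 gives an effective limit of 900 records per day.
  const dailyLimit = Math.floor(RECORDS_PER_DAY_LIMIT * SAFETY_MARGIN);
  const days = Math.ceil(totalRecords / dailyLimit);

  const recordsPerDay: number[] = [];
  let remaining = totalRecords;
  for (let day = 0; day < days; day++) {
    const today = Math.min(dailyLimit, remaining);
    recordsPerDay.push(today);
    remaining -= today;
  }
  return { dailyLimit, days, recordsPerDay };
}
```

For a 5,000 record import this yields a 900 records/day limit over 6 days, which agrees with the "Daily limit" and "Estimated duration" figures in the README's example output.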
-155
src/index.js
··· 1 - #!/usr/bin/env node 2 - 3 - import * as fs from 'fs'; 4 - import * as config from './config.js'; 5 - import { parseCommandLineArgs, showHelp } from './lib/cli.js'; 6 - import { login } from './lib/auth.js'; 7 - import { parseLastFmCsv, convertToPlayRecord, sortRecords } from './lib/csv.js'; 8 - import { publishRecords } from './lib/publisher.js'; 9 - import { prompt } from './utils/input.js'; 10 - import { formatDuration, calculateOptimalBatchSize } from './utils/helpers.js'; 11 - import { setupKillswitch } from './utils/killswitch.js'; 12 - 13 - /** 14 - * Main execution 15 - */ 16 - async function main() { 17 - const args = parseCommandLineArgs(); 18 - 19 - // Show help if requested 20 - if (args.help) { 21 - showHelp(); 22 - process.exit(0); 23 - } 24 - 25 - // Setup killswitch (unless in dry-run mode) 26 - if (!args['dry-run']) { 27 - setupKillswitch(); 28 - } 29 - 30 - try { 31 - console.log('=== Last.fm to ATProto Importer ===\n'); 32 - 33 - // Get CSV file path 34 - let csvPath = args.file; 35 - if (!csvPath) { 36 - csvPath = await prompt('Enter path to Last.fm CSV export: '); 37 - } else { 38 - console.log(`CSV file: ${csvPath}`); 39 - } 40 - 41 - if (!fs.existsSync(csvPath)) { 42 - console.error('✗ File not found!'); 43 - process.exit(1); 44 - } 45 - 46 - // Parse CSV 47 - const csvRecords = parseLastFmCsv(csvPath); 48 - 49 - if (csvRecords.length === 0) { 50 - console.error('✗ No records found in CSV file!'); 51 - process.exit(1); 52 - } 53 - 54 - // Convert records 55 - console.log('Converting records to ATProto format...'); 56 - const playRecords = csvRecords.map(record => convertToPlayRecord(record, config)); 57 - console.log('✓ Conversion complete\n'); 58 - 59 - // Sort records chronologically 60 - const reverseChronological = args['reverse-chronological']; 61 - sortRecords(playRecords, reverseChronological); 62 - 63 - // Validate and set batch delay 64 - let batchDelay = args['batch-delay'] ? 
parseInt(args['batch-delay']) : config.DEFAULT_BATCH_DELAY; 65 - if (batchDelay < config.MIN_BATCH_DELAY) { 66 - console.log(`⚠️ Batch delay ${batchDelay}ms is below minimum safe limit.`); 67 - console.log(` Enforcing minimum delay of ${config.MIN_BATCH_DELAY}ms to respect rate limits.\n`); 68 - batchDelay = config.MIN_BATCH_DELAY; 69 - } 70 - 71 - // Calculate optimal batch size 72 - let batchSize = args['batch-size'] ? parseInt(args['batch-size']) : null; 73 - if (!batchSize) { 74 - batchSize = calculateOptimalBatchSize(playRecords.length, batchDelay, config); 75 - console.log(`Auto-calculated batch size: ${batchSize}`); 76 - console.log(` Algorithm: Logarithmic scaling with O(n) time complexity`); 77 - console.log(` Optimized for: ${playRecords.length} records at ${batchDelay}ms delay`); 78 - console.log(` Rate limit strategy: Token bucket with conservative limits\n`); 79 - } else { 80 - console.log(`Using specified batch size: ${batchSize}\n`); 81 - } 82 - 83 - // Check if dry run mode 84 - const isDryRun = args['dry-run']; 85 - 86 - if (isDryRun) { 87 - console.log('🔍 Running in DRY RUN mode - no authentication required\n'); 88 - 89 - // Show preview without publishing 90 - await publishRecords(null, playRecords, batchSize, batchDelay, config, true); 91 - process.exit(0); 92 - } 93 - 94 - // Login to ATProto (only if not dry run) 95 - const agent = await login(args.identifier, args.password, config.SLINGSHOT_RESOLVER); 96 - 97 - // Confirm before publishing (unless --yes flag is set) 98 - if (!args.yes) { 99 - const confirm = await prompt(`\nReady to publish ${playRecords.length} records. Continue? 
(yes/no): `); 100 - if (confirm.toLowerCase() !== 'yes' && confirm.toLowerCase() !== 'y') { 101 - console.log('Aborted.'); 102 - process.exit(0); 103 - } 104 - console.log(''); 105 - } else { 106 - console.log(`Auto-confirmed: Publishing ${playRecords.length} records...\n`); 107 - } 108 - 109 - // Publish records 110 - const startTime = Date.now(); 111 - const { successCount, errorCount, cancelled } = await publishRecords( 112 - agent, 113 - playRecords, 114 - batchSize, 115 - batchDelay, 116 - config, 117 - false 118 - ); 119 - const totalTime = formatDuration(Date.now() - startTime); 120 - 121 - // Summary 122 - console.log('=== Import Complete ==='); 123 - if (cancelled) { 124 - console.log('Status: CANCELLED BY USER'); 125 - } else { 126 - console.log('Status: COMPLETED'); 127 - } 128 - console.log(`Total records: ${playRecords.length}`); 129 - console.log(`Successfully published: ${successCount}`); 130 - console.log(`Failed: ${errorCount}`); 131 - if (cancelled) { 132 - console.log(`Not processed: ${playRecords.length - successCount - errorCount}`); 133 - } 134 - console.log(`Total time: ${totalTime}`); 135 - 136 - if (successCount > 0) { 137 - const avgTime = (Date.now() - startTime) / successCount; 138 - console.log(`Average time per record: ${avgTime.toFixed(0)}ms`); 139 - } 140 - 141 - console.log('\n✓ Logged out'); 142 - 143 - // Exit with appropriate code 144 - process.exit(cancelled ? 130 : 0); 145 - 146 - } catch (error) { 147 - console.error('\n✗ Fatal error:', error.message); 148 - if (error.stack && process.env.DEBUG) { 149 - console.error('\nStack trace:', error.stack); 150 - } 151 - process.exit(1); 152 - } 153 - } 154 - 155 - main();
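The batch delay validation in the removed `index.js` above can be restated in TypeScript against the new limits from `src/config.ts` (minimum delay 1000 ms, default 2000 ms, maximum batch size 50). This is a sketch only: the function names are hypothetical, and falling back to the default for non-numeric input is an assumption that the original code did not make.

```typescript
// Values copied from src/config.ts.
const MIN_BATCH_DELAY = 1000;
const DEFAULT_BATCH_DELAY = 2000;
const MAX_BATCH_SIZE = 50;

// Resolve the --batch-delay flag: default when absent or unparsable,
// otherwise never lower than the minimum safe delay.
function resolveBatchDelay(raw: string | undefined): number {
  if (raw === undefined) return DEFAULT_BATCH_DELAY;
  const parsed = Number.parseInt(raw, 10);
  if (Number.isNaN(parsed)) return DEFAULT_BATCH_DELAY;
  return Math.max(MIN_BATCH_DELAY, parsed);
}

// Resolve the --batch-size flag: null means "auto-calculate",
// otherwise clamp to the documented 1-50 range.
function resolveBatchSize(raw: string | undefined): number | null {
  if (raw === undefined) return null;
  const parsed = Number.parseInt(raw, 10);
  if (Number.isNaN(parsed)) return null;
  return Math.min(MAX_BATCH_SIZE, Math.max(1, parsed));
}
```

Passing an explicit radix to `parseInt` and checking for `NaN` avoids silently using an invalid delay when the flag value is not a number.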
+5
src/index.ts
··· 1 + #!/usr/bin/env node 2 + 3 + import { runCLI } from './lib/cli.js'; 4 + 5 + runCLI();
+19 -9
src/lib/auth.js src/lib/auth.ts
··· 1 1 import { AtpAgent } from '@atproto/api'; 2 2 import { prompt } from '../utils/input.js'; 3 3 4 + interface ResolverResponse { 5 + did: string; 6 + pds: string; 7 + } 8 + 4 9 /** 5 10 * Resolves an AT Protocol identifier (handle or DID) to get PDS information 6 11 */ 7 - async function resolveIdentifier(identifier, resolverUrl) { 12 + async function resolveIdentifier(identifier: string, resolverUrl: string): Promise<ResolverResponse> { 8 13 console.log(`Resolving identifier: ${identifier}`); 9 14 10 15 const response = await fetch( ··· 15 20 throw new Error(`Failed to resolve identifier: ${response.status} ${response.statusText}`); 16 21 } 17 22 18 - const data = await response.json(); 23 + const data = await response.json() as ResolverResponse; 19 24 20 25 if (!data.did || !data.pds) { 21 26 throw new Error('Invalid response from identity resolver'); ··· 28 33 /** 29 34 * Login to ATProto using Slingshot resolver 30 35 */ 31 - export async function login(identifier, password, resolverUrl) { 36 + export async function login( 37 + identifier: string | undefined, 38 + password: string | undefined, 39 + resolverUrl: string 40 + ): Promise<AtpAgent> { 32 41 console.log('\n=== ATProto Login ==='); 33 42 34 43 // Prompt for missing credentials ··· 58 67 }); 59 68 60 69 console.log('✓ Logged in successfully!'); 61 - console.log(` DID: ${pdsAgent.session.did}`); 62 - console.log(` Handle: ${pdsAgent.session.handle}\n`); 70 + console.log(` DID: ${pdsAgent.session?.did}`); 71 + console.log(` Handle: ${pdsAgent.session?.handle}\n`); 63 72 64 73 return pdsAgent; 65 74 } catch (error) { 66 - console.error('✗ Login failed:', error.message); 75 + const err = error as Error; 76 + console.error('✗ Login failed:', err.message); 67 77 68 78 // Provide more specific error messages 69 - if (error.message.includes('Failed to resolve identifier')) { 79 + if (err.message.includes('Failed to resolve identifier')) { 70 80 throw new Error('Handle not found. 
Please check your AT Protocol handle.'); 71 - } else if (error.message.includes('AuthFactorTokenRequired')) { 81 + } else if (err.message.includes('AuthFactorTokenRequired')) { 72 82 throw new Error('Two-factor authentication required. Please use your app password.'); 73 - } else if (error.message.includes('InvalidCredentials')) { 83 + } else if (err.message.includes('InvalidCredentials')) { 74 84 throw new Error('Invalid credentials. Please check your handle and app password.'); 75 85 } 76 86
-93
src/lib/cli.js
··· 1 - import { parseArgs } from 'node:util'; 2 - 3 - /** 4 - * Parse command line arguments 5 - */ 6 - export function parseCommandLineArgs() { 7 - const options = { 8 - help: { 9 - type: 'boolean', 10 - short: 'h', 11 - default: false, 12 - }, 13 - file: { 14 - type: 'string', 15 - short: 'f', 16 - }, 17 - identifier: { 18 - type: 'string', 19 - short: 'i', 20 - }, 21 - password: { 22 - type: 'string', 23 - short: 'p', 24 - }, 25 - 'batch-size': { 26 - type: 'string', 27 - short: 'b', 28 - }, 29 - 'batch-delay': { 30 - type: 'string', 31 - short: 'd', 32 - }, 33 - yes: { 34 - type: 'boolean', 35 - short: 'y', 36 - default: false, 37 - }, 38 - 'dry-run': { 39 - type: 'boolean', 40 - short: 'n', 41 - default: false, 42 - }, 43 - 'reverse-chronological': { 44 - type: 'boolean', 45 - short: 'r', 46 - default: false, 47 - }, 48 - }; 49 - 50 - try { 51 - const { values } = parseArgs({ options, allowPositionals: false }); 52 - return values; 53 - } catch (error) { 54 - console.error('Error parsing arguments:', error.message); 55 - showHelp(); 56 - process.exit(1); 57 - } 58 - } 59 - 60 - /** 61 - * Show help message 62 - */ 63 - export function showHelp() { 64 - console.log(` 65 - Last.fm to ATProto Importer 66 - 67 - Usage: node importer.js [options] 68 - 69 - Options: 70 - -h, --help Show this help message 71 - -f, --file <path> Path to Last.fm CSV export file 72 - -i, --identifier <id> ATProto handle or DID 73 - -p, --password <pass> ATProto app password 74 - -b, --batch-size <num> Number of records per batch (auto-calculated if not set) 75 - -d, --batch-delay <ms> Delay between batches in ms (default: 2000, min: 1000) 76 - -y, --yes Skip confirmation prompt 77 - -n, --dry-run Preview records without publishing 78 - -r, --reverse-chronological Process newest first (default: oldest first) 79 - 80 - Examples: 81 - node importer.js -f lastfm.csv -i alice.bsky.social -p xxxx-xxxx-xxxx-xxxx 82 - node importer.js --file export.csv --identifier alice.bsky.social --yes 83 - 
node importer.js -f lastfm.csv --dry-run 84 - node importer.js (interactive mode - prompts for all values) 85 - 86 - Notes: 87 - - Batch size uses logarithmic scaling algorithm (O(n) complexity) for optimal throughput 88 - - Auto-calculated batch size considers both record count and delay settings 89 - - Records are processed in chronological order (oldest first) by default 90 - - Minimum batch delay of 1000ms enforced to respect rate limits 91 - - Rate limiting follows token bucket strategy for safe API usage 92 - `); 93 - }
+173
src/lib/cli.ts
··· 1 + import { parseArgs } from 'node:util'; 2 + import type { AtpAgent } from '@atproto/api'; 3 + import type { PlayRecord, Config, CommandLineArgs, PublishResult } from '../types.js'; 4 + import { login } from './auth.js'; 5 + import { parseLastFmCsv, convertToPlayRecord, sortRecords } from './csv.js'; 6 + import { publishRecords } from './publisher.js'; 7 + import { prompt } from '../utils/input.js'; 8 + import config from '../config.js'; 9 + import { calculateOptimalBatchSize, showRateLimitInfo } from '../utils/helpers.js'; 10 + 11 + /** 12 + * Show help message 13 + */ 14 + export function showHelp(): void { 15 + console.log(` 16 + Last.fm to ATProto Importer v0.0.2 17 + 18 + Usage: npm start -- [options] 19 + 20 + Options: 21 + -h, --help Show this help message 22 + -f, --file <path> Path to Last.fm CSV export file 23 + -i, --identifier <id> ATProto handle or DID 24 + -p, --password <pass> ATProto app password 25 + -b, --batch-size <num> Number of records per batch (auto-calculated if not set) 26 + -d, --batch-delay <ms> Delay between batches in ms (default: 2000, min: 1000) 27 + -y, --yes Skip confirmation prompt 28 + -n, --dry-run Preview records without publishing 29 + -r, --reverse-chronological Process newest first (default: oldest first) 30 + `); 31 + } 32 + 33 + /** 34 + * Parse command line arguments 35 + */ 36 + export function parseCommandLineArgs(): CommandLineArgs { 37 + // Option names mirror the keys of CommandLineArgs 38 + const options = { 39 + help: { type: 'boolean', short: 'h', default: false }, 40 + file: { type: 'string', short: 'f' }, 41 + identifier: { type: 'string', short: 'i' }, 42 + password: { type: 'string', short: 'p' }, 43 + 'batch-size': { type: 'string', short: 'b' }, 44 + 'batch-delay': { type: 'string', short: 'd' }, 45 + yes: { type: 'boolean', short: 'y', default: false }, 46 + 'dry-run': { type: 'boolean', short: 'n', default: false }, 47 + 'reverse-chronological': { type:
'boolean', short: 'r', default: false }, 48 + } as const; 49 + 50 + try { 51 + const { values } = parseArgs({ options, allowPositionals: false }); 52 + return values as CommandLineArgs; 53 + } catch (error) { 54 + const err = error as Error; 55 + console.error('Error parsing arguments:', err.message); 56 + showHelp(); 57 + process.exit(1); 58 + } 59 + } 60 + 61 + /** 62 + * Run the importer CLI end to end: parse arguments, log in, convert, and publish 63 + */ 64 + export async function runCLI(): Promise<void> { 65 + try { 66 + const args = parseCommandLineArgs(); 67 + const cfg = config as Config; // Typed view of the imported config 68 + 69 + if (args.help) { 70 + showHelp(); 71 + return; 72 + } 73 + 74 + if (!args.file) { 75 + throw new Error('Missing required argument: -f, --file <path>'); 76 + } 77 + 78 + const dryRun = args['dry-run'] ?? false; 79 + let agent: AtpAgent | null = null; 80 + 81 + // 1. Get Authentication (skips login if dry-run) 82 + if (!dryRun) { 83 + if (!args.identifier || !args.password) { 84 + throw new Error('Missing required arguments for login: -i (identifier) and -p (password)'); 85 + } 86 + // login() is declared to return Promise<AtpAgent>, so no cast is needed 87 + agent = await login(args.identifier, args.password, cfg.SLINGSHOT_RESOLVER); 88 + } 89 + 90 + // 2. Parse and Prepare Records 91 + // parseLastFmCsv reads and parses the CSV file at args.file 92 + const csvRecords = parseLastFmCsv(args.file); 93 + 94 + // Map each raw CSV row to the PlayRecord structure 95 + const records: PlayRecord[] = csvRecords.map(record => convertToPlayRecord(record, cfg)); 96 + const totalRecords = records.length; 97 + 98 + const reverseChronological = args['reverse-chronological'] ?? false; 99 + const sortedRecords = sortRecords(records, reverseChronological); 100 + 101 + // 3.
Determine Batching parameters 102 + let batchDelay = cfg.DEFAULT_BATCH_DELAY; 103 + if (args['batch-delay']) { 104 + const delay = parseInt(args['batch-delay'], 10); 105 + if (isNaN(delay)) { 106 + throw new Error(`Invalid batch delay value: ${args['batch-delay']}`); 107 + } 108 + // Enforce minimum delay 109 + batchDelay = Math.max(delay, cfg.MIN_BATCH_DELAY); 110 + } 111 + 112 + let batchSize: number; 113 + if (args['batch-size']) { 114 + batchSize = parseInt(args['batch-size'], 10); 115 + if (isNaN(batchSize) || batchSize <= 0) { 116 + throw new Error(`Invalid batch size value: ${args['batch-size']}`); 117 + } 118 + } else { 119 + // Calculate optimal batch size if not provided 120 + batchSize = calculateOptimalBatchSize(totalRecords, batchDelay, cfg); 121 + } 122 + 123 + // 4. Show Rate Limiting Information 124 + const recordsPerDay = cfg.RECORDS_PER_DAY_LIMIT * cfg.SAFETY_MARGIN; 125 + const estimatedDays = Math.ceil(totalRecords / recordsPerDay); 126 + 127 + // Print the batching plan: totals, batch size, delay, estimated days, and the daily limit 128 + showRateLimitInfo( 129 + totalRecords, 130 + batchSize, 131 + batchDelay, 132 + estimatedDays, 133 + cfg.RECORDS_PER_DAY_LIMIT, 134 + ); 135 + 136 + // 5. Confirmation Prompt 137 + if (!dryRun && !(args.yes ?? false)) { 138 + console.log(`\nReady to publish ${totalRecords.toLocaleString()} records.`); 139 + const answer = await prompt('Do you want to continue? (y/N) '); 140 + if (answer.toLowerCase() !== 'y') { 141 + console.log('Import cancelled by user.'); 142 + process.exit(0); 143 + } 144 + } 145 + 146 + // 6. Publish Records 147 + const result: PublishResult = await publishRecords( 148 + agent, 149 + sortedRecords, 150 + batchSize, 151 + batchDelay, 152 + cfg, 153 + dryRun 154 + ); 155 + 156 + // 7. Final Output 157 + if (result.cancelled) { 158 + console.log(`\nImport stopped gracefully. ${result.successCount} records processed.`); 159 + } else if (dryRun) { 160 + console.log('\nDRY RUN COMPLETE.
No records were published.'); 161 + } else { 162 + console.log(`\n🎉 Import Complete!`); 163 + console.log(`Total records processed: ${result.successCount.toLocaleString()} (${result.errorCount.toLocaleString()} failed)`); 164 + } 165 + 166 + } catch (error) { 167 + // Handle fatal errors 168 + const err = error as Error; 169 + console.error('\n🛑 A fatal error occurred:'); 170 + console.error(err.message); 171 + process.exit(1); 172 + } 173 + }
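Review note: steps 3 and 4 of `runCLI` combine a clamped delay with a ceiling division to estimate the import length. The sketch below isolates that arithmetic so it can be checked by hand. The constants follow the help text (default 2000 ms, minimum 1000 ms), while the daily limit and safety margin used in the usage line are assumed example values, since `src/config.ts` is not shown in this diff. The function names are hypothetical.

```typescript
// Values taken from the help text above; the real ones live in the config module.
const DEFAULT_BATCH_DELAY = 2000;
const MIN_BATCH_DELAY = 1000;

// Hypothetical helper mirroring step 3: parse the flag, reject non-numbers,
// and never go below the minimum delay.
function resolveBatchDelay(raw: string | undefined): number {
  if (raw === undefined || raw === '') return DEFAULT_BATCH_DELAY;
  const delay = Number.parseInt(raw, 10);
  if (Number.isNaN(delay)) {
    throw new Error(`Invalid batch delay value: ${raw}`);
  }
  return Math.max(delay, MIN_BATCH_DELAY);
}

// Hypothetical helper mirroring step 4: the effective daily budget is the
// limit scaled by the safety margin, and any partial day counts as a full day.
function estimateDays(totalRecords: number, dailyLimit: number, safetyMargin: number): number {
  const recordsPerDay = dailyLimit * safetyMargin;
  return Math.ceil(totalRecords / recordsPerDay);
}
```

For example, with an assumed limit of 1,000 records per day and a 0.9 margin, `estimateDays(2500, 1000, 0.9)` divides 2,500 by 900 and rounds up to 3 days, and `resolveBatchDelay('500')` is raised to the 1000 ms minimum.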
+9 -7
src/lib/csv.js → src/lib/csv.ts
··· 1 1 import * as fs from 'fs'; 2 2 import { parse } from 'csv-parse/sync'; 3 + import type { LastFmCsvRecord, PlayRecord, Config } from '../types.js'; 3 4 4 5 /** 5 6 * Parse Last.fm CSV export 6 7 */ 7 - export function parseLastFmCsv(filePath) { 8 + export function parseLastFmCsv(filePath: string): LastFmCsvRecord[] { 8 9 console.log(`Reading CSV file: ${filePath}`); 9 10 const fileContent = fs.readFileSync(filePath, 'utf-8'); 10 11 ··· 12 13 columns: true, 13 14 skip_empty_lines: true, 14 15 trim: true, 15 - }); 16 + }) as LastFmCsvRecord[]; 16 17 17 18 console.log(`✓ Parsed ${records.length} scrobbles\n`); 18 19 return records; ··· 21 22 /** 22 23 * Convert Last.fm CSV record to ATProto play record 23 24 */ 24 - export function convertToPlayRecord(csvRecord, config) { 25 + export function convertToPlayRecord(csvRecord: LastFmCsvRecord, config: Config): PlayRecord { 25 26 const { RECORD_TYPE, CLIENT_AGENT } = config; 26 27 27 28 // Parse the timestamp ··· 29 30 const playedTime = new Date(timestamp * 1000).toISOString(); 30 31 31 32 // Build artists array 32 - const artists = []; 33 + const artists: PlayRecord['artists'] = []; 33 34 if (csvRecord.artist) { 34 - const artistData = { 35 + const artistData: PlayRecord['artists'][0] = { 35 36 artistName: csvRecord.artist, 36 37 }; 37 38 if (csvRecord.artist_mbid && csvRecord.artist_mbid.trim()) { ··· 41 42 } 42 43 43 44 // Build the play record 44 - const playRecord = { 45 + const playRecord: PlayRecord = { 45 46 $type: RECORD_TYPE, 46 47 trackName: csvRecord.track, 47 48 artists, 48 49 playedTime, 49 50 submissionClientAgent: CLIENT_AGENT, 50 51 musicServiceBaseDomain: 'last.fm', 52 + originUrl: '', 51 53 }; 52 54 53 55 // Add optional fields ··· 74 76 /** 75 77 * Sort records chronologically 76 78 */ 77 - export function sortRecords(records, reverseChronological = false) { 79 + export function sortRecords(records: PlayRecord[], reverseChronological = false): PlayRecord[] { 78 80 console.log(`Sorting records 
${reverseChronological ? 'newest' : 'oldest'} first...`); 79 81 80 82 records.sort((a, b) => {
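Review note: `convertToPlayRecord` turns the Last.fm `uts` column (Unix seconds, stored as a string) into an ISO 8601 `playedTime`, and `sortRecords` orders by that field. The hunk elides the line that parses `uts`, so the following is only a sketch of the conversion with an explicit validity check, plus a non-mutating sort. Both function names are hypothetical.

```typescript
// Hypothetical helper: converts a Last.fm `uts` string (Unix seconds) to ISO 8601.
function utsToIso(uts: string): string {
  const seconds = Number.parseInt(uts, 10);
  if (Number.isNaN(seconds)) {
    throw new Error(`Invalid uts value: ${uts}`);
  }
  // Date expects milliseconds, so scale seconds by 1000.
  return new Date(seconds * 1000).toISOString();
}

// Hypothetical helper: returns a sorted copy instead of sorting in place,
// so the caller's array keeps its original order.
function sortByPlayedTime<T extends { playedTime: string }>(
  records: T[],
  newestFirst = false
): T[] {
  const sorted = [...records].sort(
    (a, b) => Date.parse(a.playedTime) - Date.parse(b.playedTime)
  );
  return newestFirst ? sorted.reverse() : sorted;
}

console.log(utsToIso('1700000000')); // prints 2023-11-14T22:13:20.000Z
```

Returning a copy is a design choice for the sketch only. The committed `sortRecords` calls `records.sort(...)`, which reorders the array it is given.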
-137
src/lib/publisher.js
··· 1 - import { formatDuration } from '../utils/helpers.js'; 2 - import { isImportCancelled } from '../utils/killswitch.js'; 3 - 4 - /** 5 - * Publish records in batches with rate limiting and killswitch support 6 - */ 7 - export async function publishRecords(agent, records, batchSize, batchDelay, config, dryRun = false) { 8 - const { RECORD_TYPE } = config; 9 - const totalRecords = records.length; 10 - let successCount = 0; 11 - let errorCount = 0; 12 - const startTime = Date.now(); 13 - 14 - if (dryRun) { 15 - return handleDryRun(records, batchSize, batchDelay); 16 - } 17 - 18 - const totalBatches = Math.ceil(totalRecords / batchSize); 19 - const estimatedTime = formatDuration(totalBatches * batchDelay); 20 - 21 - console.log(`Publishing ${totalRecords} records in batches of ${batchSize}...`); 22 - console.log(`Total batches: ${totalBatches}`); 23 - console.log(`Estimated time: ${estimatedTime}`); 24 - console.log(`\n🚨 Press Ctrl+C to stop gracefully after current batch\n`); 25 - 26 - for (let i = 0; i < totalRecords; i += batchSize) { 27 - // Check killswitch before processing batch 28 - if (isImportCancelled()) { 29 - return handleCancellation(successCount, errorCount, totalRecords); 30 - } 31 - 32 - const batch = records.slice(i, i + batchSize); 33 - const batchNum = Math.floor(i / batchSize) + 1; 34 - const progress = ((i / totalRecords) * 100).toFixed(1); 35 - 36 - console.log(`[${progress}%] Batch ${batchNum}/${totalBatches} (records ${i + 1}-${Math.min(i + batchSize, totalRecords)})`); 37 - 38 - // Process batch records 39 - const batchStartTime = Date.now(); 40 - for (const record of batch) { 41 - // Check killswitch during batch processing 42 - if (isImportCancelled()) { 43 - console.log(` ⚠️ Stopping mid-batch...`); 44 - break; 45 - } 46 - 47 - try { 48 - await agent.com.atproto.repo.createRecord({ 49 - repo: agent.session.did, 50 - collection: RECORD_TYPE, 51 - record, 52 - }); 53 - successCount++; 54 - } catch (error) { 55 - errorCount++; 56 - 
console.error(` ✗ Failed: ${record.trackName} - ${error.message}`); 57 - } 58 - } 59 - 60 - const batchDuration = Date.now() - batchStartTime; 61 - const elapsed = formatDuration(Date.now() - startTime); 62 - const remaining = formatDuration(((totalRecords - i - batchSize) / batchSize) * batchDelay); 63 - 64 - console.log(` ✓ Complete in ${batchDuration}ms (${successCount} successful, ${errorCount} failed)`); 65 - 66 - // Only show time estimates if not cancelled 67 - if (!isImportCancelled()) { 68 - console.log(` ⏱ Elapsed: ${elapsed} | Remaining: ~${remaining}\n`); 69 - } 70 - 71 - // Check again before waiting 72 - if (isImportCancelled()) { 73 - return handleCancellation(successCount, errorCount, totalRecords); 74 - } 75 - 76 - // Wait before next batch (except for last batch) 77 - if (i + batchSize < totalRecords) { 78 - await new Promise(resolve => setTimeout(resolve, batchDelay)); 79 - } 80 - } 81 - 82 - return { successCount, errorCount, cancelled: false }; 83 - } 84 - 85 - /** 86 - * Handle dry run mode 87 - */ 88 - function handleDryRun(records, batchSize, batchDelay) { 89 - const totalRecords = records.length; 90 - 91 - console.log(`\n=== DRY RUN MODE ===`); 92 - console.log(`Would publish ${totalRecords} records in batches of ${batchSize}`); 93 - console.log(`Estimated time: ${formatDuration(Math.ceil(totalRecords / batchSize) * batchDelay)}\n`); 94 - 95 - // Show first 5 records as preview 96 - const previewCount = Math.min(5, totalRecords); 97 - console.log(`Preview of first ${previewCount} records (in processing order):\n`); 98 - 99 - for (let i = 0; i < previewCount; i++) { 100 - const record = records[i]; 101 - console.log(`${i + 1}. 
${record.artists[0]?.artistName} - ${record.trackName}`); 102 - console.log(` Album: ${record.releaseName || 'N/A'}`); 103 - console.log(` Played: ${record.playedTime}`); 104 - console.log(` URL: ${record.originUrl}`); 105 - 106 - // Show MusicBrainz IDs if available 107 - const mbids = []; 108 - if (record.artists[0]?.artistMbId) mbids.push(`Artist: ${record.artists[0].artistMbId}`); 109 - if (record.recordingMbId) mbids.push(`Recording: ${record.recordingMbId}`); 110 - if (record.releaseMbId) mbids.push(`Release: ${record.releaseMbId}`); 111 - 112 - if (mbids.length > 0) { 113 - console.log(` MBIDs: ${mbids.join(', ')}`); 114 - } 115 - console.log(''); 116 - } 117 - 118 - if (totalRecords > previewCount) { 119 - console.log(`... and ${totalRecords - previewCount} more records\n`); 120 - } 121 - 122 - console.log('=== DRY RUN COMPLETE ==='); 123 - console.log('No records were actually published.'); 124 - console.log('Remove --dry-run flag to publish for real.\n'); 125 - 126 - return { successCount: totalRecords, errorCount: 0, cancelled: false }; 127 - } 128 - 129 - /** 130 - * Handle cancellation 131 - */ 132 - function handleCancellation(successCount, errorCount, totalRecords) { 133 - console.log(`\n🛑 Import cancelled by user`); 134 - console.log(` Processed: ${successCount}/${totalRecords} records`); 135 - console.log(` Remaining: ${totalRecords - successCount} records\n`); 136 - return { successCount, errorCount, cancelled: true }; 137 - }
+326
src/lib/publisher.ts
··· 1 + import type { AtpAgent } from '@atproto/api'; 2 + import { formatDuration } from '../utils/helpers.js'; 3 + import { isImportCancelled } from '../utils/killswitch.js'; 4 + import { 5 + calculateDailySchedule, 6 + displayRateLimitWarning, 7 + displayRateLimitInfo, 8 + calculateRateLimitedBatches, 9 + } from '../utils/rate-limiter.js'; 10 + import type { PlayRecord, Config, PublishResult } from '../types.js'; 11 + 12 + /** 13 + * Publish records in batches with rate limiting and multi-day support 14 + */ 15 + export async function publishRecords( 16 + agent: AtpAgent | null, 17 + records: PlayRecord[], 18 + batchSize: number, 19 + batchDelay: number, 20 + config: Config, 21 + dryRun = false 22 + ): Promise<PublishResult> { 23 + const { RECORD_TYPE } = config; 24 + const totalRecords = records.length; 25 + 26 + if (dryRun) { 27 + return handleDryRun(records, batchSize, batchDelay, config); 28 + } 29 + 30 + if (!agent) { 31 + throw new Error('Agent is required for publishing'); 32 + } 33 + 34 + // Calculate rate-limited batch parameters 35 + const rateLimitParams = calculateRateLimitedBatches(totalRecords, config); 36 + 37 + // Override with calculated parameters if rate limiting is needed 38 + if (rateLimitParams.needsRateLimiting) { 39 + displayRateLimitWarning(); 40 + batchSize = rateLimitParams.batchSize; 41 + batchDelay = rateLimitParams.batchDelay; 42 + } 43 + 44 + displayRateLimitInfo( 45 + totalRecords, 46 + batchSize, 47 + batchDelay, 48 + rateLimitParams.estimatedDays, 49 + rateLimitParams.recordsPerDay 50 + ); 51 + 52 + // Calculate daily schedule if multi-day import 53 + const dailySchedule = 54 + rateLimitParams.estimatedDays > 1 55 + ? 
calculateDailySchedule( 56 + totalRecords, 57 + batchSize, 58 + batchDelay, 59 + rateLimitParams.recordsPerDay 60 + ) 61 + : null; 62 + 63 + let successCount = 0; 64 + let errorCount = 0; 65 + const startTime = Date.now(); 66 + 67 + const totalBatches = Math.ceil(totalRecords / batchSize); 68 + const estimatedTime = formatDuration(totalBatches * batchDelay); 69 + 70 + console.log(`Publishing ${totalRecords} records in batches of ${batchSize}...`); 71 + console.log(`Total batches: ${totalBatches}`); 72 + if (!dailySchedule) { 73 + console.log(`Estimated time: ${estimatedTime}`); 74 + } 75 + console.log(`\n🚨 Press Ctrl+C to stop gracefully after current batch\n`); 76 + 77 + // If multi-day, process day by day 78 + if (dailySchedule) { 79 + for (const day of dailySchedule) { 80 + console.log(`\n╔═══════════════════════════════════════════════════════════════╗`); 81 + console.log(`║ DAY ${day.day} of ${rateLimitParams.estimatedDays}`); 82 + console.log(`║ Records: ${day.recordsStart + 1}-${day.recordsEnd} (${day.recordsCount} total)`); 83 + console.log(`╚═══════════════════════════════════════════════════════════════╝\n`); 84 + 85 + const dayRecords = records.slice(day.recordsStart, day.recordsEnd); 86 + const result = await processDayBatch( 87 + agent, 88 + dayRecords, 89 + batchSize, 90 + batchDelay, 91 + RECORD_TYPE, 92 + day.recordsStart, 93 + totalRecords, 94 + startTime 95 + ); 96 + 97 + successCount += result.successCount; 98 + errorCount += result.errorCount; 99 + 100 + if (result.cancelled) { 101 + return { successCount, errorCount, cancelled: true }; 102 + } 103 + 104 + // Pause between days 105 + if (day.pauseAfter) { 106 + console.log(`\n⏸️ Pausing for 24 hours before continuing...`); 107 + console.log(` Next batch will start at: ${new Date(Date.now() + day.pauseDuration).toLocaleString()}`); 108 + console.log(` Progress: ${successCount}/${totalRecords} records completed\n`); 109 + console.log(` 💡 You can safely stop (Ctrl+C) and restart later.\n`); 110 + 
111 + await new Promise((resolve) => setTimeout(resolve, day.pauseDuration)); 112 + } 113 + } 114 + } else { 115 + // Single day import - process normally 116 + const result = await processDayBatch( 117 + agent, 118 + records, 119 + batchSize, 120 + batchDelay, 121 + RECORD_TYPE, 122 + 0, 123 + totalRecords, 124 + startTime 125 + ); 126 + 127 + successCount = result.successCount; 128 + errorCount = result.errorCount; 129 + 130 + if (result.cancelled) { 131 + return { successCount, errorCount, cancelled: true }; 132 + } 133 + } 134 + 135 + return { successCount, errorCount, cancelled: false }; 136 + } 137 + 138 + /** 139 + * Process a batch of records (for a single day or entire import) 140 + */ 141 + async function processDayBatch( 142 + agent: AtpAgent, 143 + records: PlayRecord[], 144 + batchSize: number, 145 + batchDelay: number, 146 + recordType: string, 147 + globalOffset: number, 148 + totalRecords: number, 149 + startTime: number 150 + ): Promise<PublishResult> { 151 + let successCount = 0; 152 + let errorCount = 0; 153 + 154 + for (let i = 0; i < records.length; i += batchSize) { 155 + // Check killswitch before processing batch 156 + if (isImportCancelled()) { 157 + return handleCancellation(successCount, errorCount, totalRecords); 158 + } 159 + 160 + const batch = records.slice(i, i + batchSize); 161 + const globalIndex = globalOffset + i; 162 + const batchNum = Math.floor(globalIndex / batchSize) + 1; 163 + const progress = (((globalOffset + i) / totalRecords) * 100).toFixed(1); 164 + 165 + console.log( 166 + `[${progress}%] Batch ${batchNum} (records ${globalOffset + i + 1}-${Math.min(globalOffset + i + batchSize, globalOffset + records.length)})` 167 + ); 168 + 169 + // Process batch records 170 + const batchStartTime = Date.now(); 171 + for (const record of batch) { 172 + // Check killswitch during batch processing 173 + if (isImportCancelled()) { 174 + console.log(` ⚠️ Stopping mid-batch...`); 175 + break; 176 + } 177 + 178 + try { 179 + await 
agent.com.atproto.repo.createRecord({ 180 + repo: agent.session?.did || '', 181 + collection: recordType, 182 + record, 183 + }); 184 + successCount++; 185 + } catch (error) { 186 + errorCount++; 187 + const err = error as Error; 188 + console.error(` ✗ Failed: ${record.trackName} - ${err.message}`); 189 + } 190 + } 191 + 192 + const batchDuration = Date.now() - batchStartTime; 193 + const elapsed = formatDuration(Date.now() - startTime); 194 + const remaining = formatDuration( 195 + (Math.max(0, totalRecords - (globalOffset + i + batchSize)) / batchSize) * batchDelay 196 + ); 197 + 198 + console.log( 199 + ` ✓ Complete in ${batchDuration}ms (${successCount} successful, ${errorCount} failed)` 200 + ); 201 + 202 + // Only show time estimates if not cancelled 203 + if (!isImportCancelled()) { 204 + console.log(` ⏱ Elapsed: ${elapsed} | Remaining: ~${remaining}\n`); 205 + } 206 + 207 + // Check again before waiting 208 + if (isImportCancelled()) { 209 + return handleCancellation(successCount, errorCount, totalRecords); 210 + } 211 + 212 + // Wait before next batch (except for last batch) 213 + if (i + batchSize < records.length) { 214 + await new Promise((resolve) => setTimeout(resolve, batchDelay)); 215 + } 216 + } 217 + 218 + return { successCount, errorCount, cancelled: false }; 219 + } 220 + 221 + /** 222 + * Handle dry run mode 223 + */ 224 + function handleDryRun( 225 + records: PlayRecord[], 226 + batchSize: number, 227 + batchDelay: number, 228 + config: Config 229 + ): PublishResult { 230 + const totalRecords = records.length; 231 + 232 + // Calculate rate limiting info 233 + const rateLimitParams = calculateRateLimitedBatches(totalRecords, config); 234 + 235 + if (rateLimitParams.needsRateLimiting) { 236 + displayRateLimitWarning(); 237 + batchSize = rateLimitParams.batchSize; 238 + batchDelay = rateLimitParams.batchDelay; 239 + 240 + displayRateLimitInfo( 241 + totalRecords, 242 + batchSize, 243 + batchDelay, 244 + rateLimitParams.estimatedDays, 245 +
rateLimitParams.recordsPerDay 246 + ); 247 + 248 + if (rateLimitParams.estimatedDays > 1) { 249 + const dailySchedule = calculateDailySchedule( 250 + totalRecords, 251 + batchSize, 252 + batchDelay, 253 + rateLimitParams.recordsPerDay 254 + ); 255 + 256 + console.log('📅 Multi-Day Import Schedule:\n'); 257 + dailySchedule.forEach((day) => { 258 + console.log(` Day ${day.day}:`); 259 + console.log(` Records ${day.recordsStart + 1}-${day.recordsEnd} (${day.recordsCount} total)`); 260 + if (day.pauseAfter) { 261 + console.log(` → Pause 24h after completion`); 262 + } 263 + }); 264 + console.log(''); 265 + } 266 + } 267 + 268 + console.log(`\n=== DRY RUN MODE ===`); 269 + console.log(`Would publish ${totalRecords} records in batches of ${batchSize}`); 270 + 271 + if (rateLimitParams.estimatedDays > 1) { 272 + console.log( 273 + `Import would span ${rateLimitParams.estimatedDays} days with automatic pauses\n` 274 + ); 275 + } else { 276 + console.log(`Estimated time: ${formatDuration(Math.ceil(totalRecords / batchSize) * batchDelay)}\n`); 277 + } 278 + 279 + // Show first 5 records as preview 280 + const previewCount = Math.min(5, totalRecords); 281 + console.log(`Preview of first ${previewCount} records (in processing order):\n`); 282 + 283 + for (let i = 0; i < previewCount; i++) { 284 + const record = records[i]; 285 + console.log(`${i + 1}. 
${record.artists[0]?.artistName} - ${record.trackName}`); 286 + console.log(` Album: ${record.releaseName || 'N/A'}`); 287 + console.log(` Played: ${record.playedTime}`); 288 + console.log(` URL: ${record.originUrl}`); 289 + 290 + // Show MusicBrainz IDs if available 291 + const mbids = []; 292 + if (record.artists[0]?.artistMbId) 293 + mbids.push(`Artist: ${record.artists[0].artistMbId}`); 294 + if (record.recordingMbId) mbids.push(`Recording: ${record.recordingMbId}`); 295 + if (record.releaseMbId) mbids.push(`Release: ${record.releaseMbId}`); 296 + 297 + if (mbids.length > 0) { 298 + console.log(` MBIDs: ${mbids.join(', ')}`); 299 + } 300 + console.log(''); 301 + } 302 + 303 + if (totalRecords > previewCount) { 304 + console.log(`... and ${totalRecords - previewCount} more records\n`); 305 + } 306 + 307 + console.log('=== DRY RUN COMPLETE ==='); 308 + console.log('No records were actually published.'); 309 + console.log('Remove --dry-run flag to publish for real.\n'); 310 + 311 + return { successCount: totalRecords, errorCount: 0, cancelled: false }; 312 + } 313 + 314 + /** 315 + * Handle cancellation 316 + */ 317 + function handleCancellation( 318 + successCount: number, 319 + errorCount: number, 320 + totalRecords: number 321 + ): PublishResult { 322 + console.log(`\n🛑 Import cancelled by user`); 323 + console.log(` Processed: ${successCount}/${totalRecords} records`); 324 + console.log(` Remaining: ${totalRecords - successCount} records\n`); 325 + return { successCount, errorCount, cancelled: true }; 326 + }
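Review note: `publishRecords` and `handleDryRun` both consume a daily schedule from `calculateDailySchedule`, which lives in `src/utils/rate-limiter.ts` and is not part of the hunks shown here. The sketch below shows one way such a schedule could be built. The field names are the ones the publisher reads (`day`, `recordsStart`, `recordsEnd`, `recordsCount`, `pauseAfter`, `pauseDuration`), but the function itself, its two-parameter signature, and the fixed 24-hour pause are assumptions.

```typescript
// Shape inferred from how publisher.ts reads each schedule entry.
interface DaySchedule {
  day: number;
  recordsStart: number;  // inclusive index into the records array
  recordsEnd: number;    // exclusive index, suitable for Array.prototype.slice
  recordsCount: number;
  pauseAfter: boolean;
  pauseDuration: number; // milliseconds
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical simplification: splits the import into per-day slices and
// pauses after every day except the last. Expects an integer recordsPerDay.
function sketchDailySchedule(totalRecords: number, recordsPerDay: number): DaySchedule[] {
  if (recordsPerDay <= 0) {
    throw new Error('recordsPerDay must be positive');
  }
  const schedule: DaySchedule[] = [];
  for (let start = 0, day = 1; start < totalRecords; start += recordsPerDay, day++) {
    const end = Math.min(start + recordsPerDay, totalRecords);
    const isLastDay = end >= totalRecords;
    schedule.push({
      day,
      recordsStart: start,
      recordsEnd: end,
      recordsCount: end - start,
      pauseAfter: !isLastDay,
      pauseDuration: isLastDay ? 0 : DAY_MS,
    });
  }
  return schedule;
}
```

With 2,500 records and a budget of 900 per day, this yields three entries covering indices 0-900, 900-1800, and 1800-2500, with a pause after the first two only. Because `recordsEnd` is exclusive, `records.slice(day.recordsStart, day.recordsEnd)` in the publisher picks up exactly `recordsCount` items.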
+71
src/types.ts
··· 1 + import { AtpAgent as Agent } from '@atproto/api'; 2 + 3 + /** 4 + * Type alias for the ATProto Agent, used for clarity in the project. 5 + */ 6 + export type AtpAgent = Agent; 7 + 8 + export interface LastFmCsvRecord { 9 + artist: string; 10 + track: string; 11 + album: string; 12 + uts: string; 13 + artist_mbid?: string; 14 + album_mbid?: string; 15 + track_mbid?: string; 16 + } 17 + 18 + export interface PlayRecordArtist { 19 + artistName: string; 20 + artistMbId?: string; 21 + } 22 + 23 + export interface PlayRecord { 24 + $type: string; 25 + trackName: string; 26 + artists: PlayRecordArtist[]; 27 + playedTime: string; 28 + submissionClientAgent: string; 29 + musicServiceBaseDomain: string; 30 + releaseName?: string; 31 + releaseMbId?: string; 32 + recordingMbId?: string; 33 + originUrl: string; 34 + } 35 + 36 + export interface CommandLineArgs { 37 + help?: boolean; 38 + file?: string; 39 + identifier?: string; 40 + password?: string; 41 + 'batch-size'?: string; 42 + 'batch-delay'?: string; 43 + yes?: boolean; 44 + 'dry-run'?: boolean; 45 + 'reverse-chronological'?: boolean; 46 + } 47 + 48 + export interface PublishResult { 49 + successCount: number; 50 + errorCount: number; 51 + cancelled: boolean; 52 + } 53 + 54 + export interface Config { 55 + MIN_RECORDS_FOR_SCALING: number; 56 + BASE_BATCH_SIZE: number; 57 + MAX_BATCH_SIZE: number; 58 + SCALING_FACTOR: number; 59 + DEFAULT_BATCH_DELAY: number; 60 + 61 + CLIENT_AGENT: string; 62 + 63 + DEFAULT_BATCH_SIZE: number; // from rate limiter 64 + MIN_BATCH_DELAY: number; // from rate limiter 65 + RECORDS_PER_DAY_LIMIT: number; 66 + SAFETY_MARGIN: number; 67 + 68 + SLINGSHOT_RESOLVER: string; 69 + 70 + RECORD_TYPE: string; 71 + }
-63
src/utils/helpers.js
··· 1 - /** 2 - * Utility functions for the Last.fm importer 3 - */ 4 - 5 - /** 6 - * Format duration in human-readable format 7 - */ 8 - export function formatDuration(milliseconds) { 9 - const seconds = Math.floor(milliseconds / 1000); 10 - const minutes = Math.floor(seconds / 60); 11 - const hours = Math.floor(minutes / 60); 12 - 13 - if (hours > 0) { 14 - const mins = minutes % 60; 15 - return `${hours}h ${mins}m`; 16 - } else if (minutes > 0) { 17 - const secs = seconds % 60; 18 - return `${minutes}m ${secs}s`; 19 - } else { 20 - return `${seconds}s`; 21 - } 22 - } 23 - 24 - /** 25 - * Calculate optimal batch size based on total records and rate limits 26 - * Uses a logarithmic scaling approach to balance throughput with API safety 27 - */ 28 - export function calculateOptimalBatchSize(totalRecords, batchDelay, config) { 29 - const { 30 - MIN_RECORDS_FOR_SCALING, 31 - BASE_BATCH_SIZE, 32 - MAX_BATCH_SIZE, 33 - SCALING_FACTOR, 34 - DEFAULT_BATCH_DELAY 35 - } = config; 36 - 37 - const delay = batchDelay || DEFAULT_BATCH_DELAY; 38 - 39 - // For very small datasets, use minimal batches 40 - if (totalRecords <= 50) { 41 - return 3; 42 - } 43 - 44 - // For small to medium datasets, use conservative batching 45 - if (totalRecords <= MIN_RECORDS_FOR_SCALING) { 46 - return BASE_BATCH_SIZE; 47 - } 48 - 49 - // Logarithmic scaling 50 - const logScale = Math.log2(totalRecords / MIN_RECORDS_FOR_SCALING); 51 - const calculatedSize = Math.floor(BASE_BATCH_SIZE + (logScale * SCALING_FACTOR)); 52 - 53 - // Apply maximum cap 54 - let optimalSize = Math.min(calculatedSize, MAX_BATCH_SIZE); 55 - 56 - // Adjust based on batch delay 57 - if (delay < 1500 && optimalSize > 15) { 58 - optimalSize = Math.floor(optimalSize * 0.75); 59 - } 60 - 61 - // Ensure batch size is at least 3 62 - return Math.max(3, optimalSize); 63 - }
+88
src/utils/helpers.ts
··· 1 + /** 2 + * Utility functions for the Last.fm importer 3 + */ 4 + import type { Config } from '../types.js'; 5 + 6 + /** 7 + * Format duration in human-readable format 8 + */ 9 + export function formatDuration(milliseconds: number): string { 10 + const seconds = Math.floor(milliseconds / 1000); 11 + const minutes = Math.floor(seconds / 60); 12 + const hours = Math.floor(minutes / 60); 13 + 14 + if (hours > 0) { 15 + const mins = minutes % 60; 16 + return `${hours}h ${mins}m`; 17 + } else if (minutes > 0) { 18 + const secs = seconds % 60; 19 + return `${minutes}m ${secs}s`; 20 + } else { 21 + return `${seconds}s`; 22 + } 23 + } 24 + 25 + /** 26 + * Calculate optimal batch size based on total records and rate limits 27 + * Uses a logarithmic scaling approach to balance throughput with API safety 28 + */ 29 + export function calculateOptimalBatchSize(totalRecords: number, batchDelay: number, config: Config): number { 30 + const { 31 + MIN_RECORDS_FOR_SCALING, 32 + BASE_BATCH_SIZE, 33 + MAX_BATCH_SIZE, 34 + SCALING_FACTOR, 35 + DEFAULT_BATCH_DELAY 36 + } = config; 37 + 38 + const delay = batchDelay || DEFAULT_BATCH_DELAY; 39 + 40 + // For very small datasets, use minimal batches 41 + if (totalRecords <= 50) { 42 + return 3; 43 + } 44 + 45 + // For small to medium datasets, use conservative batching 46 + if (totalRecords <= MIN_RECORDS_FOR_SCALING) { 47 + return BASE_BATCH_SIZE; 48 + } 49 + 50 + // Logarithmic scaling 51 + const logScale = Math.log2(totalRecords / MIN_RECORDS_FOR_SCALING); 52 + const calculatedSize = Math.floor(BASE_BATCH_SIZE + (logScale * SCALING_FACTOR)); 53 + 54 + // Apply maximum cap 55 + let optimalSize = Math.min(calculatedSize, MAX_BATCH_SIZE); 56 + 57 + // Adjust based on batch delay 58 + if (delay < 1500 && optimalSize > 15) { 59 + optimalSize = Math.floor(optimalSize * 0.75); 60 + } 61 + 62 + // Ensure batch size is at least 3 63 + return Math.max(3, optimalSize); 64 + } 65 + 66 + /** 67 + * Logs rate limiting and batching information 
to the console. 68 + */ 69 + export function showRateLimitInfo( 70 + totalRecords: number, 71 + batchSize: number, 72 + batchDelay: number, 73 + estimatedDays: number, 74 + dailyLimit: number 75 + ): void { 76 + console.log('\n📊 Rate Limiting Information:'); 77 + console.log(` Total records: ${totalRecords.toLocaleString()}`); 78 + console.log(` Daily limit: ${dailyLimit.toLocaleString()} records/day`); 79 + console.log(` Estimated duration: ${estimatedDays} day${estimatedDays > 1 ? 's' : ''}`); 80 + console.log(` Batch size: ${batchSize} records`); 81 + console.log(` Batch delay: ${(batchDelay / 1000).toFixed(1)}s`); 82 + 83 + if (estimatedDays > 1) { 84 + console.log('\n The import will automatically pause between days.'); 85 + console.log(' You can safely close and restart the importer - it will resume from where it left off.'); 86 + } 87 + console.log(''); 88 + }
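To make the scaling in `calculateOptimalBatchSize` concrete, here is a minimal sketch of its logarithmic step and maximum cap only. The `exampleConfig` values and the name `scaledBatchSize` are assumptions for illustration, since the real `Config` defaults are not part of this diff. The small-dataset branch and the batch-delay adjustment are left out.

```typescript
// Hypothetical values; the real Config defaults are not shown in this diff.
const exampleConfig = {
  MIN_RECORDS_FOR_SCALING: 100,
  BASE_BATCH_SIZE: 10,
  MAX_BATCH_SIZE: 50,
  SCALING_FACTOR: 5,
};

// Mirrors only the logarithmic step and the cap from calculateOptimalBatchSize.
function scaledBatchSize(totalRecords: number, cfg: typeof exampleConfig): number {
  if (totalRecords <= cfg.MIN_RECORDS_FOR_SCALING) {
    return cfg.BASE_BATCH_SIZE;
  }
  // Each doubling of the dataset adds SCALING_FACTOR records to the batch.
  const logScale = Math.log2(totalRecords / cfg.MIN_RECORDS_FOR_SCALING);
  const size = Math.floor(cfg.BASE_BATCH_SIZE + logScale * cfg.SCALING_FACTOR);
  return Math.min(size, cfg.MAX_BATCH_SIZE);
}

for (const total of [100, 500, 2000, 200000]) {
  console.log(`${total} records -> batch size ${scaledBatchSize(total, exampleConfig)}`);
}
// With the assumed values above this prints batch sizes 10, 21, 31 and 50.
```

Under these assumed values the batch size grows slowly: a 2,000x larger dataset only reaches the cap of 50, which is the point of scaling on the logarithm rather than on the record count itself.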
+12 -12
src/utils/input.js src/utils/input.ts
··· 3 3 /** 4 4 * Read user input from command line with proper password masking 5 5 */ 6 - export function prompt(question, hideInput = false) { 6 + export function prompt(question: string, hideInput = false): Promise<string> { 7 7 return new Promise((resolve) => { 8 8 if (hideInput) { 9 9 // For password input, use raw mode 10 10 const stdin = process.stdin; 11 11 const wasRaw = stdin.isRaw; 12 - 12 + 13 13 // Set raw mode to capture individual keystrokes 14 14 if (stdin.isTTY) { 15 15 stdin.setRawMode(true); 16 16 } 17 - 17 + 18 18 stdin.resume(); 19 19 stdin.setEncoding('utf8'); 20 - 20 + 21 21 process.stdout.write(question); 22 - 22 + 23 23 let password = ''; 24 - const onData = (char) => { 25 - char = char.toString(); 26 - 27 - switch (char) { 24 + const onData = (char: Buffer | string) => { 25 + const charStr = char.toString(); 26 + 27 + switch (charStr) { 28 28 case '\n': 29 29 case '\r': 30 30 case '\u0004': // Ctrl-D ··· 49 49 } 50 50 break; 51 51 default: 52 - password += char; 52 + password += charStr; 53 53 process.stdout.write('*'); 54 54 break; 55 55 } 56 56 }; 57 - 57 + 58 58 stdin.on('data', onData); 59 59 } else { 60 60 const rl = readline.createInterface({ 61 61 input: process.stdin, 62 62 output: process.stdout, 63 63 }); 64 - 64 + 65 65 rl.question(question, (answer) => { 66 66 rl.close(); 67 67 resolve(answer);
-35
src/utils/killswitch.js
··· 1 - // Global state for killswitch 2 - let importCancelled = false; 3 - let gracefulShutdown = false; 4 - 5 - /** 6 - * Setup killswitch handler for graceful shutdown 7 - */ 8 - export function setupKillswitch() { 9 - process.on('SIGINT', () => { 10 - if (gracefulShutdown) { 11 - console.log('\n\n⚠️ Force quit detected. Exiting immediately...'); 12 - process.exit(1); 13 - } 14 - 15 - gracefulShutdown = true; 16 - importCancelled = true; 17 - console.log('\n\n🛑 Killswitch activated! Stopping after current batch...'); 18 - console.log(' Press Ctrl+C again to force quit immediately.\n'); 19 - }); 20 - } 21 - 22 - /** 23 - * Check if import has been cancelled 24 - */ 25 - export function isImportCancelled() { 26 - return importCancelled; 27 - } 28 - 29 - /** 30 - * Reset killswitch state (useful for testing) 31 - */ 32 - export function resetKillswitch() { 33 - importCancelled = false; 34 - gracefulShutdown = false; 35 - }
+22
src/utils/killswitch.ts
··· 1 + let cancelled = false; 2 + 3 + // Flip the killswitch when the user hits CTRL-C 4 + process.on('SIGINT', () => { 5 + console.log('\nCaught CTRL-C — stopping import…'); 6 + cancelled = true; 7 + }); 8 + 9 + /** 10 + * Manually cancel the import if needed. 11 + */ 12 + export function cancelImport() { 13 + cancelled = true; 14 + } 15 + 16 + /** 17 + * Check whether the import should stop. 18 + * Call this inside loops, batch processors, etc. 19 + */ 20 + export function isImportCancelled(): boolean { 21 + return cancelled; 22 + }
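The doc comment on `isImportCancelled` says to call it inside loops and batch processors. A minimal sketch of that pattern follows. `processBatches` and `publishBatch` are hypothetical names, not functions from this repository, and the flag is inlined here so the example stands alone instead of importing `killswitch.ts`.

```typescript
// Inlined stand-in for the flag exported by killswitch.ts.
let cancelled = false;
function cancelImport(): void {
  cancelled = true;
}
function isImportCancelled(): boolean {
  return cancelled;
}

// Hypothetical batch loop: the flag is checked before each batch starts,
// so a batch that is already in flight always runs to completion.
async function processBatches<T>(
  records: T[],
  batchSize: number,
  publishBatch: (batch: T[]) => Promise<void>
): Promise<number> {
  let processed = 0;
  for (let i = 0; i < records.length; i += batchSize) {
    if (isImportCancelled()) {
      break;
    }
    const batch = records.slice(i, i + batchSize);
    await publishBatch(batch);
    processed += batch.length;
  }
  return processed;
}
```

Returning the processed count gives the caller a position to persist, which is one way the resume behaviour described in the README could be supported. How the importer actually stores its position is not shown in this diff.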
+166
src/utils/rate-limiter.ts
··· 1 + import type { Config } from '../types.js'; 2 + 3 + /** 4 + * Calculate rate-limited batch parameters 5 + * Ensures we don't exceed daily limits while maintaining efficiency 6 + */ 7 + export function calculateRateLimitedBatches( 8 + totalRecords: number, 9 + config: Config 10 + ): { 11 + batchSize: number; 12 + batchDelay: number; 13 + estimatedDays: number; 14 + recordsPerDay: number; 15 + needsRateLimiting: boolean; 16 + } { 17 + const dailyLimit = Math.floor(config.RECORDS_PER_DAY_LIMIT * config.SAFETY_MARGIN); 18 + 19 + // Check if we need rate limiting 20 + const needsRateLimiting = totalRecords > dailyLimit; 21 + 22 + if (!needsRateLimiting) { 23 + // Can import everything in one go 24 + return { 25 + batchSize: config.DEFAULT_BATCH_SIZE, 26 + batchDelay: config.DEFAULT_BATCH_DELAY, 27 + estimatedDays: 1, 28 + recordsPerDay: totalRecords, 29 + needsRateLimiting: false, 30 + }; 31 + } 32 + 33 + // Calculate how many days are needed; round the per-day share up so the schedule fits in estimatedDays 34 + const estimatedDays = Math.ceil(totalRecords / dailyLimit); 35 + const recordsPerDay = Math.ceil(totalRecords / estimatedDays); 36 + 37 + // Calculate batch parameters 38 + // We want to spread records evenly throughout the day 39 + const minutesPerDay = 24 * 60; 40 + const batchesPerDay = Math.ceil(recordsPerDay / config.DEFAULT_BATCH_SIZE); 41 + const delayBetweenBatches = Math.floor((minutesPerDay * 60 * 1000) / batchesPerDay); 42 + 43 + // Ensure batch delay is at least minimum 44 + const batchDelay = Math.max(delayBetweenBatches, config.MIN_BATCH_DELAY); 45 + 46 + // Adjust batch size if needed to hit the target 47 + const adjustedBatchSize = Math.min( 48 + Math.ceil(recordsPerDay / Math.floor((minutesPerDay * 60 * 1000) / batchDelay)), 49 + config.MAX_BATCH_SIZE 50 + ); 51 + 52 + return { 53 + batchSize: adjustedBatchSize, 54 + batchDelay, 55 + estimatedDays, 56 + recordsPerDay, 57 + needsRateLimiting: true, 58 + }; 59 + } 60 + 61 + /** 62 + * Calculate daily batches and pause times 63 + */ 64 + export function 
calculateDailySchedule( 65 + totalRecords: number, 66 + batchSize: number, 67 + batchDelay: number, 68 + recordsPerDay: number 69 + ) { 70 + const schedule = []; 71 + 72 + // How many batches fit into a 24h window using the actual delay? 73 + const batchesPerDay = Math.floor((24 * 60 * 60 * 1000) / batchDelay); 74 + 75 + // Max records we could process in one day given the spacing 76 + const maxRecordsPerDay = batchesPerDay * batchSize; 77 + 78 + // Respect the external rate limit (recordsPerDay) 79 + const dailyCap = Math.min(maxRecordsPerDay, recordsPerDay); 80 + 81 + let processed = 0; 82 + let day = 1; 83 + 84 + while (processed < totalRecords) { 85 + const recordsStart = processed; 86 + const dailyCount = Math.min(dailyCap, totalRecords - processed); 87 + const recordsEnd = recordsStart + dailyCount; 88 + const isLastDay = recordsEnd >= totalRecords; 89 + 90 + schedule.push({ 91 + day, 92 + recordsStart, 93 + recordsEnd, 94 + recordsCount: dailyCount, 95 + pauseAfter: !isLastDay, 96 + pauseDuration: isLastDay ? 
0 : 24 * 60 * 60 * 1000 97 + }); 98 + 99 + processed = recordsEnd; 100 + day++; 101 + } 102 + 103 + return schedule; 104 + } 105 + 106 + 107 + /** 108 + * Format time duration in human-readable format 109 + */ 110 + export function formatTimeRemaining(ms: number): string { 111 + const days = Math.floor(ms / (24 * 60 * 60 * 1000)); 112 + const hours = Math.floor((ms % (24 * 60 * 60 * 1000)) / (60 * 60 * 1000)); 113 + const minutes = Math.floor((ms % (60 * 60 * 1000)) / (60 * 1000)); 114 + 115 + if (days > 0) { 116 + return `${days}d ${hours}h ${minutes}m`; 117 + } else if (hours > 0) { 118 + return `${hours}h ${minutes}m`; 119 + } else if (minutes > 0) { 120 + return `${minutes}m`; 121 + } else { 122 + return '< 1m'; 123 + } 124 + } 125 + 126 + /** 127 + * Display rate limit warning 128 + */ 129 + export function displayRateLimitWarning(): void { 130 + console.log('\n⚠️ ═══════════════════════════════════════════════════════════════════════════════'); 131 + console.log('⚠️ IMPORTANT: Bluesky AppView Rate Limits'); 132 + console.log('⚠️ ═══════════════════════════════════════════════════════════════════════════════'); 133 + console.log('⚠️'); 134 + console.log('⚠️ Exceeding 10K records per day can rate limit your ENTIRE PDS on Bluesky\'s'); 135 + console.log('⚠️ AppView. 
This affects ALL users on your PDS, not just your account!'); 136 + console.log('⚠️'); 137 + console.log('⚠️ This importer automatically limits imports to 1K records per day by default'); 138 + console.log('⚠️ with automatic batching and pauses to stay within safe limits.'); 139 + console.log('⚠️'); 140 + console.log('⚠️ See: https://docs.bsky.app/blog/rate-limits-pds-v3'); 141 + console.log('⚠️ ═══════════════════════════════════════════════════════════════════════════════\n'); 142 + } 143 + 144 + /** 145 + * Display rate limiting info 146 + */ 147 + export function displayRateLimitInfo( 148 + totalRecords: number, 149 + batchSize: number, 150 + batchDelay: number, 151 + estimatedDays: number, 152 + recordsPerDay: number 153 + ): void { 154 + console.log('\n📊 Rate Limiting Information:'); 155 + console.log(` Total records: ${totalRecords.toLocaleString()}`); 156 + console.log(` Daily limit: ${recordsPerDay.toLocaleString()} records/day`); 157 + console.log(` Estimated duration: ${estimatedDays} day${estimatedDays > 1 ? 's' : ''}`); 158 + console.log(` Batch size: ${batchSize} records`); 159 + console.log(` Batch delay: ${(batchDelay / 1000).toFixed(1)}s`); 160 + 161 + if (estimatedDays > 1) { 162 + console.log('\n The import will automatically pause between days.'); 163 + console.log(' You can safely close and restart the importer - it will resume from where it left off.'); 164 + } 165 + console.log(''); 166 + }
+27
tsconfig.json
··· 1 + { 2 + "compilerOptions": { 3 + "target": "ES2022", 4 + "module": "node16", 5 + "moduleResolution": "node16", 6 + "lib": ["ES2022"], 7 + "outDir": "./dist", 8 + "rootDir": "./src", 9 + "strict": true, 10 + "esModuleInterop": true, 11 + "skipLibCheck": true, 12 + "forceConsistentCasingInFileNames": true, 13 + "resolveJsonModule": true, 14 + "declaration": true, 15 + "declarationMap": true, 16 + "sourceMap": true, 17 + "noImplicitAny": true, 18 + "strictNullChecks": true, 19 + "strictFunctionTypes": true, 20 + "noUnusedLocals": true, 21 + "noUnusedParameters": true, 22 + "noImplicitReturns": true, 23 + "noFallthroughCasesInSwitch": true 24 + }, 25 + "include": ["src/**/*"], 26 + "exclude": ["node_modules", "dist"] 27 + }