Import your Last.fm and Spotify listening history to the AT Protocol network using the fm.teal.alpha.feed.play lexicon.

chore: update README to be more clear

+256 -298
README.md
 # Last.fm to ATProto Importer

-Import your Last.fm listening history to the AT Protocol network using the `fm.teal.alpha.feed.play` lexicon.
+Import your Last.fm and Spotify listening history to the AT Protocol network using the `fm.teal.alpha.feed.play` lexicon.

-(Also [on Tangled!](https://tangled.org/@did:plc:ofrbh253gwicbkc5nktqepol/atproto-lastfm-importer))
+[Also available on Tangled](https://tangled.org/@did:plc:ofrbh253gwicbkc5nktqepol/atproto-lastfm-importer)

-## Features
+## ⚠️ Important: Rate Limits

-- ✅ **Structured Logging**: Color-coded output with debug/verbose modes
-- ✅ **Batch Operations**: Uses `com.atproto.repo.applyWrites` for efficient batch publishing (up to 200 records per call)
-- ✅ **Spotify Support**: Import from Spotify Extended Streaming History (JSON format)
-- ✅ **Combined Import**: Merge Last.fm and Spotify exports, automatically deduplicating overlapping plays
-- ✅ **TID-based Record Keys**: Records use timestamp-based identifiers for chronological ordering
-- ✅ **Re-Sync Mode**: Check existing Teal records and only import new scrobbles (no duplicates!)
-- ✅ **Rate Limiting**: Automatically limits imports to 1K records per day to prevent rate limiting your entire PDS
-- ✅ **Multi-Day Imports**: Large imports (>1K records) automatically span multiple days with 24-hour pauses
-- ✅ **Resume Support**: Safe to stop (Ctrl+C) and restart - continues from where it left off
-- ✅ **Graceful Cancellation**: Press Ctrl+C to stop after the current batch completes
-- ✅ **Identity Resolution**: Resolves ATProto handles/DIDs using Slingshot
-- ✅ **PDS Auto-Discovery**: Automatically connects to your personal PDS
-- ✅ **Dry Run Mode**: Preview records without publishing
-- ✅ **Batch Processing**: Configurable batching with rate limit safety
-- ✅ **Progress Tracking**: Real-time progress with time estimates
-- ✅ **Error Handling**: Continues on errors with detailed reporting
-- ✅ **MusicBrainz Support**: Preserves MusicBrainz IDs when available (Last.fm only)
-- ✅ **Chronological Ordering**: Processes oldest first (or newest with `-r` flag)
+**CRITICAL**: Bluesky's AppView has rate limits on PDS instances. Exceeding 10K records per day can rate limit your **ENTIRE PDS**, affecting all users on your instance.

-## Important: Rate Limits
-
-⚠️ **CRITICAL**: Bluesky's AppView has rate limits on PDS instances. Exceeding 10K records per day can rate limit your **ENTIRE PDS**, affecting all users on your instance!
-
-This importer automatically:
-- Limits imports to **1,000 records per day** (90% of safe limit)
-- Calculates optimal batch sizes and delays
-- Pauses 24 hours between days for large imports
-- Shows clear progress and time estimates
+This importer automatically protects your PDS by:
+- Limiting imports to **1,000 records per day** (with 75% safety margin)
+- Calculating optimal batch sizes and delays
+- Pausing 24 hours between days for large imports
+- Providing clear progress tracking and time estimates

-See: [Bluesky Rate Limits Documentation](https://docs.bsky.app/blog/rate-limits-pds-v3)
+For more details, see the [Bluesky Rate Limits Documentation](https://docs.bsky.app/blog/rate-limits-pds-v3).

-## Setup
+## Quick Start

 ```bash
+# Install dependencies
 npm install
+
+# Build the project
 npm run build
+
+# Run with interactive prompts
+npm start
+
+# Or run with command line arguments
+npm start -- -i lastfm.csv -h alice.bsky.social -p xxxx-xxxx-xxxx-xxxx -y
 ```

-## Usage
+## Features
+
+### Import Capabilities
+- ✅ **Last.fm Import**: Full support for Last.fm CSV exports with MusicBrainz IDs
+- ✅ **Spotify Import**: Import Extended Streaming History JSON files
+- ✅ **Combined Import**: Merge Last.fm and Spotify exports with intelligent deduplication
+- ✅ **Re-Sync Mode**: Import only new scrobbles without creating duplicates
+- ✅ **Duplicate Removal**: Clean up accidentally imported duplicate records
+
+### Performance & Safety
+- ✅ **Batch Operations**: Uses `com.atproto.repo.applyWrites` for efficient batch publishing (up to 200 records per call)
+- ✅ **Rate Limiting**: Automatic daily limits prevent PDS rate limiting
+- ✅ **Multi-Day Imports**: Large imports automatically span multiple days with 24-hour pauses
+- ✅ **Resume Support**: Safe to stop (Ctrl+C) and restart - continues from where it left off
+- ✅ **Graceful Cancellation**: Press Ctrl+C to stop after the current batch completes
+
+### User Experience
+- ✅ **Structured Logging**: Color-coded output with debug/verbose modes
+- ✅ **Progress Tracking**: Real-time progress with time estimates
+- ✅ **Dry Run Mode**: Preview records without publishing
+- ✅ **Interactive Mode**: Simple prompts guide you through the process
+- ✅ **Command Line Mode**: Full automation support for scripting

-### Combined Import Mode
+### Technical Features
+- ✅ **TID-based Record Keys**: Timestamp-based identifiers for chronological ordering
+- ✅ **Identity Resolution**: Resolves ATProto handles/DIDs using Slingshot
+- ✅ **PDS Auto-Discovery**: Automatically connects to your personal PDS
+- ✅ **MusicBrainz Support**: Preserves MusicBrainz IDs when available (Last.fm)
+- ✅ **Chronological Ordering**: Processes oldest first (or newest with `-r` flag)
+- ✅ **Error Handling**: Continues on errors with detailed reporting
+
+## Usage Examples
+
+### Combined Import (Last.fm + Spotify)

 Merge your Last.fm and Spotify listening history into a single, deduplicated import:

···
 npm start -- -i lastfm.csv --spotify-input spotify-export/ -m combined -h alice.bsky.social -p xxxx-xxxx-xxxx-xxxx -y
 ```

-Combined mode will:
-1. Parse both Last.fm CSV and Spotify JSON exports
-2. Normalize track names and artist names for comparison
-3. Identify duplicate plays (same track within 5 minutes)
-4. Choose the best version of each play:
-   - Prefers Last.fm records with MusicBrainz IDs
-   - Otherwise prefers Spotify for better metadata quality
-5. Merge into a single chronological timeline
-6. Show detailed statistics about the merge
+**What combined mode does:**
+1. Parses both Last.fm CSV and Spotify JSON exports
+2. Normalizes track names and artist names for comparison
+3. Identifies duplicate plays (same track within 5 minutes)
+4. Chooses the best version of each play (prefers Last.fm with MusicBrainz IDs)
+5. Merges into a single chronological timeline
+6. Shows detailed statistics about the merge

-This is perfect for:
-- Getting complete listening history from both services
-- Filling gaps where one service was used more than the other
-- Ensuring the best metadata quality for each play
-- Avoiding duplicate entries when both services tracked the same play
-
-**Example Output:**
+**Example output:**
 ```
 📊 Merge Statistics
 ═══════════════════════════════════════════
···

 ### Re-Sync Mode

-If you've already imported scrobbles before and want to sync your Last.fm export with Teal without creating duplicates:
+Sync your Last.fm export with Teal without creating duplicates:

 ```bash
 # Preview what will be synced
···
 npm start -- -i lastfm.csv -h alice.bsky.social -p xxxx-xxxx-xxxx-xxxx -m sync -y
 ```

-Sync mode will:
-1. Fetch all existing play records from your Teal feed
-2. Compare them against your Last.fm export
-3. Identify gaps (scrobbles in Last.fm that aren't in Teal)
-4. Only import the missing records
-5. Show detailed statistics about duplicates and new records
-
-This is perfect for:
+**Perfect for:**
 - Re-running imports with updated Last.fm exports
 - Recovering from interrupted imports
 - Adding recent scrobbles without duplicating old ones

 **Note:** Sync mode requires authentication even in dry-run mode to fetch existing records.

-### Remove Duplicates Mode
+### Remove Duplicates

-If you accidentally imported duplicate records, you can clean them up:
+Clean up accidentally imported duplicate records:

 ```bash
 # Preview duplicates (dry run)
···
 npm start -- -m deduplicate -h alice.bsky.social -p xxxx-xxxx-xxxx-xxxx
 ```

-This will:
-1. Fetch all existing records from Teal
-2. Identify duplicate plays (same track, artist, and timestamp)
-3. Keep the first occurrence of each duplicate
-4. Delete the rest
+### Import from Spotify

-### Interactive Mode
+```bash
+# Import single Spotify JSON file
+npm start -- -i Streaming_History_Audio_2021-2023_0.json -m spotify -h alice.bsky.social -p xxxx-xxxx-xxxx-xxxx -y

-The simplest way to use the importer - just run it and follow the prompts:
-
-```bash
-npm start
+# Import directory with multiple Spotify files (recommended)
+npm start -- -i '/path/to/Spotify Extended Streaming History' -m spotify -h alice.bsky.social -p xxxx-xxxx-xxxx-xxxx -y
 ```

-### Command Line Mode
-
-For automation or scripting, provide all parameters via flags:
+### Import from Last.fm

 ```bash
-# Full automation (Last.fm)
+# Standard Last.fm import
 npm start -- -i lastfm.csv -h alice.bsky.social -p xxxx-xxxx-xxxx-xxxx -y
-
-# Import from Spotify (single file)
-npm start -- -i Streaming_History_Audio_2021-2023_0.json -m spotify -h alice.bsky.social -p xxxx-xxxx-xxxx-xxxx -y
-
-# Import from Spotify (directory with multiple files - recommended)
-npm start -- -i '/path/to/Spotify Extended Streaming History' -m spotify -h alice.bsky.social -p xxxx-xxxx-xxxx-xxxx -y
-
-# Combined import (merge Last.fm and Spotify)
-npm start -- -i lastfm.csv --spotify-input '/path/to/Spotify Extended Streaming History' -m combined -h alice.bsky.social -p xxxx-xxxx-xxxx-xxxx -y

 # Preview without publishing
 npm start -- -i lastfm.csv --dry-run

-# Preview with verbose debug output
+# Process newest tracks first
+npm start -- -i lastfm.csv -h alice.bsky.social -r -y
+
+# Verbose debug output
 npm start -- -i lastfm.csv --dry-run -v

 # Quiet mode (only warnings and errors)
 npm start -- -i lastfm.csv -h alice.bsky.social -p xxxx-xxxx-xxxx-xxxx -q -y
+```

-# Custom batch settings (advanced users)
+### Advanced Options
+
+```bash
+# Custom batch settings (advanced users only)
 npm start -- -i lastfm.csv -h alice.bsky.social -b 20 -d 3000

-# Process newest tracks first
-npm start -- -i lastfm.csv -h alice.bsky.social -r -y
+# Full automation with all flags
+npm start -- -i lastfm.csv -h alice.bsky.social -p xxxx-xxxx-xxxx-xxxx -y -q
 ```

 ## Command Line Options

-### Authentication
-| Option | Short | Description |
-|--------|-------|-------------|
-| `--handle <handle>` | `-h` | ATProto handle or DID (e.g., alice.bsky.social) |
-| `--password <pass>` | `-p` | ATProto app password |
+### Required Options
+
+| Option | Short | Description | Example |
+|--------|-------|-------------|---------|
+| `--input <path>` | `-i` | Path to Last.fm CSV or Spotify JSON file/directory | `-i lastfm.csv` |
+| `--handle <handle>` | `-h` | ATProto handle or DID | `-h alice.bsky.social` |
+| `--password <pass>` | `-p` | ATProto app password | `-p xxxx-xxxx-xxxx-xxxx` |

-### Input
-| Option | Short | Description |
-|--------|-------|-------------|
-| `--input <path>` | `-i` | Path to Last.fm CSV or Spotify JSON file/directory |
-| `--spotify-input <path>` | | Path to Spotify export (for combined mode) |
+### Import Mode

-### Mode
 | Option | Short | Description | Default |
 |--------|-------|-------------|---------|
 | `--mode <mode>` | `-m` | Import mode | `lastfm` |

 **Available modes:**
 - `lastfm` - Import Last.fm export only
-- `spotify` - Import Spotify export only
+- `spotify` - Import Spotify export only
 - `combined` - Merge Last.fm + Spotify exports
 - `sync` - Skip existing records (sync mode)
 - `deduplicate` - Remove duplicate records

-### Batch Configuration
-| Option | Short | Description | Default |
-|--------|-------|-------------|---------|
-| `--batch-size <num>` | `-b` | Records per batch | Auto-calculated |
-| `--batch-delay <ms>` | `-d` | Delay between batches in ms | 500 (min: 500) |
+### Additional Options

-### Import Options
 | Option | Short | Description | Default |
 |--------|-------|-------------|---------|
-| `--reverse` | `-r` | Process newest first | false (oldest first) |
-| `--yes` | `-y` | Skip confirmation prompts | false |
-| `--dry-run` | | Preview without importing | false |
-
-### Output
-| Option | Short | Description | Default |
-|--------|-------|-------------|---------|
-| `--verbose` | `-v` | Enable debug logging | false |
-| `--quiet` | `-q` | Suppress non-essential output | false |
+| `--spotify-input <path>` | | Path to Spotify export (for combined mode) | - |
+| `--reverse` | `-r` | Process newest first | `false` |
+| `--yes` | `-y` | Skip confirmation prompts | `false` |
+| `--dry-run` | | Preview without importing | `false` |
+| `--verbose` | `-v` | Enable debug logging | `false` |
+| `--quiet` | `-q` | Suppress non-essential output | `false` |
+| `--batch-size <num>` | `-b` | Records per batch (1-200) | Auto-calculated |
+| `--batch-delay <ms>` | `-d` | Delay between batches in ms | `500` (min) |
 | `--help` | | Show help message | - |

 ### Legacy Flags (Backwards Compatible)

-For backwards compatibility, the following old flags still work:
-- `--file` → Use `--input` instead
-- `--identifier` → Use `--handle` instead
-- `--spotify-file` → Use `--spotify-input` instead
-- `--reverse-chronological` → Use `--reverse` instead
-- `--spotify` → Use `--mode spotify` instead
-- `--combined` → Use `--mode combined` instead
-- `--sync` → Use `--mode sync` instead
-- `--remove-duplicates` → Use `--mode deduplicate` instead
-
-### Batch Settings
-
-The importer automatically calculates optimal batch settings based on your total record count and rate limits. You generally **don't need** to specify batch settings unless you have specific requirements.
-
-**Automatic behavior:**
-- For imports < 1K records: Uses default settings (200 records/batch, 500ms delay)
-- For imports > 1K records: Automatically calculates settings to spread across multiple days
-
-**Manual override** (advanced):
-- `--batch-size`: Number of records processed per batch (1-200, PDS maximum)
-- `--batch-delay`: Milliseconds to wait between batches (min: 500)
-
-⚠️ Lower delays increase speed but risk hitting rate limits. The automatic calculation is recommended.
-
-## Logging and Output
-
-The importer includes a structured logging system with color-coded output:
-
-- **Green (✓)**: Success messages
-- **Cyan (→)**: Progress updates
-- **Yellow (⚠️)**: Warnings
-- **Red (✗)**: Errors
-- **Bold Red (🛑)**: Fatal errors
-
-### Verbosity Levels
-
-**Default Mode**: Shows standard operational messages
-```bash
-npm start -- -i lastfm.csv -h alice.bsky.social -p pass
-```
-
-**Verbose Mode** (`-v`): Shows detailed debug information including batch timing, API calls, etc.
-```bash
-npm start -- -i lastfm.csv -h alice.bsky.social -p pass -v
-```
-
-**Quiet Mode** (`-q`): Only shows warnings and errors
-```bash
-npm start -- -i lastfm.csv -h alice.bsky.social -p pass -q
-```
+These old flags still work but are deprecated:
+- `--file` → Use `--input`
+- `--identifier` → Use `--handle`
+- `--spotify-file` → Use `--spotify-input`
+- `--reverse-chronological` → Use `--reverse`
+- `--spotify` → Use `--mode spotify`
+- `--combined` → Use `--mode combined`
+- `--sync` → Use `--mode sync`
+- `--remove-duplicates` → Use `--mode deduplicate`

 ## Getting Your Data

 ### Last.fm Export

-1. Go to <https://lastfm.ghan.nl/export/>
+1. Visit [Last.fm Export Tool](https://lastfm.ghan.nl/export/)
 2. Request your data export in CSV format
 3. Download the CSV file when ready
-4. Use the CSV file path with this script
+4. Use the CSV file path with this importer

 ### Spotify Export

-1. Go to your [Spotify Privacy Settings](https://www.spotify.com/account/privacy/)
-2. Scroll down to "Download your data" and request your data
-3. Select "Extended streaming history" (this can take up to 30 days)
+1. Go to [Spotify Privacy Settings](https://www.spotify.com/account/privacy/)
+2. Scroll to "Download your data" and request your data
+3. Select "Extended streaming history" (can take up to 30 days)
 4. When ready, download and extract the ZIP file
 5. Use either:
    - A single JSON file: `Streaming_History_Audio_2021-2023_0.json`
    - The entire extracted directory (recommended)

-**Note**: Spotify exports include multiple JSON files. The importer automatically:
+**Note:** The importer automatically:
 - Reads all `Streaming_History_Audio_*.json` files in a directory
-- Filters out podcasts, audiobooks, and other non-music content
+- Filters out podcasts, audiobooks, and non-music content
 - Combines all music tracks into a single import

-## What Gets Imported
+## Data Format

-Each scrobble (from Last.fm or Spotify) becomes an `fm.teal.alpha.feed.play` record with:
+Each scrobble becomes an `fm.teal.alpha.feed.play` record with:

 ### Required Fields
 - **trackName**: The name of the track
 - **artists**: Array of artist objects (requires `artistName`, optional `artistMbId` for Last.fm)
 - **playedTime**: ISO 8601 timestamp of when you listened
 - **submissionClientAgent**: Identifies this importer (`lastfm-importer/v0.6.0`)
-- **musicServiceBaseDomain**: Set to `last.fm` or `spotify.com` depending on source
+- **musicServiceBaseDomain**: Set to `last.fm` or `spotify.com`

-### Optional Fields (when available)
+### Optional Fields
 - **releaseName**: Album/release name
 - **releaseMbId**: MusicBrainz release ID (Last.fm only)
 - **recordingMbId**: MusicBrainz recording/track ID (Last.fm only)
···
 }
 ```

-## Processing Order
+## How It Works

-By default, records are processed **oldest first** (chronological order). This means your earliest scrobbles will appear first in your ATProto feed.
-
-Use the `--reverse` or `-r` flag to process **newest first** instead.
+### Processing Flow
+1. **Parses input file(s)**:
+   - Last.fm: CSV using `csv-parse` library
+   - Spotify: JSON files (single or multiple in directory)
+2. **Filters data**:
+   - Spotify: Automatically removes podcasts, audiobooks, and non-music content
+3. **Converts to schema**: Maps to `fm.teal.alpha.feed.play` format
+4. **Sorts records**: Chronologically (oldest first) or reverse with `-r` flag
+5. **Generates TID-based keys**: From `playedTime` for chronological ordering
+6. **Validates fields**: Ensures required fields are present
+7. **Publishes in batches**: Uses `com.atproto.repo.applyWrites` (up to 200 records per call)

-## Multi-Day Imports
+### Rate Limiting Algorithm
+1. Calculates safe daily limit (75% of 10K = 7,500 records/day by default)
+2. Determines how many days needed for your import
+3. Calculates optimal batch size and delay to spread records evenly
+4. Enforces minimum delay between batches
+5. Shows clear schedule before starting

-For imports exceeding 1,000 records (after applying the 90% safety margin), the importer automatically:
+### Multi-Day Imports

+For imports exceeding the daily limit, the importer automatically:
 1. **Calculates a schedule**: Splits your import across multiple days
 2. **Shows the plan**: Displays which records will be imported each day
 3. **Processes Day 1**: Imports the first batch of records
 4. **Pauses 24 hours**: Waits a full day before continuing
 5. **Repeats**: Continues until all records are imported

+**Example output for a 20,000 record import:**
+```
+📊 Rate Limiting Information:
+   Total records: 20,000
+   Daily limit: 7,500 records/day
+   Estimated duration: 3 days
+   Batch size: 200 records
+   Batch delay: 11.52s
+```
+
 **Important notes:**
-- You can safely stop (Ctrl+C) and restart the importer
-- Progress is preserved - it continues where it left off
+- You can safely stop (Ctrl+C) and restart
+- Progress is preserved - continues where it left off
 - Each day's progress is clearly displayed
 - Time estimates account for multi-day duration

-Example output for a 5,000 record import:
-```
-📊 Rate Limiting Information:
-   Total records: 5,000
-   Daily limit: 900 records/day
-   Estimated duration: 6 days
-   Batch size: 50 records
-   Batch delay: 1920.0s
-```
+## Logging and Output
+
+The importer uses color-coded output for clarity:
+
+- **Green (✓)**: Success messages
+- **Cyan (→)**: Progress updates
+- **Yellow (⚠️)**: Warnings
+- **Red (✗)**: Errors
+- **Bold Red (🛑)**: Fatal errors

-## Dry Run Mode
+### Verbosity Levels

-Preview what will be imported without actually publishing:
+**Default Mode**: Standard operational messages
+```bash
+npm start -- -i lastfm.csv -h alice.bsky.social -p pass
+```

+**Verbose Mode** (`-v`): Detailed debug information including batch timing and API calls
 ```bash
-npm start -- -i lastfm.csv --dry-run
+npm start -- -i lastfm.csv -h alice.bsky.social -p pass -v
 ```

-Dry run shows:
-- Total record count
-- Rate limiting schedule (if applicable)
-- Multi-day import plan (if needed)
-- Preview of first 5 records with full details
-- MusicBrainz IDs when available
+**Quiet Mode** (`-q`): Only warnings and errors
+```bash
+npm start -- -i lastfm.csv -h alice.bsky.social -p pass -q
+```

 ## Error Handling

 The importer is designed to be resilient:

-- **Network errors**: Records that fail are logged but don't stop the import
+- **Network errors**: Failed records are logged but don't stop the import
 - **Invalid data**: Skipped with error messages
 - **Authentication issues**: Clear error messages with suggested fixes
 - **Rate limit hits**: Automatic adjustment and retry logic
 - **Ctrl+C handling**: Gracefully stops after current batch

-Failed records are logged but don't prevent the rest of your import from completing.
+## Troubleshooting
+
+### Authentication Issues
+
+**"Handle not found"**
+- Verify your ATProto handle is correct (e.g., `alice.bsky.social`)
+- Ensure you're using a valid DID or handle
+
+**"Invalid credentials"**
+- Use an **app password**, not your main account password
+- Generate app passwords in your account settings
+
+### Performance Issues
+
+**"Rate limit exceeded"**
+- The importer should prevent this automatically
+- If you see this, wait 24 hours before retrying
+- Consider reducing batch size with `-b` flag
+
+**Import seems stuck**
+- Check progress messages - large imports take time
+- Multi-day imports pause for 24 hours between days
+- You can safely stop (Ctrl+C) and resume later
+- Use `--verbose` flag to see detailed progress
+
+### Connection Issues
+
+**"Connection refused"**
+- Check your internet connection
+- Verify your PDS is accessible
+- Some PDSs may have firewall rules
+
+### Output Control
+
+**Too much output**
+- Use `--quiet` flag to suppress non-essential messages
+- Only warnings and errors will be shown
+
+**Need more details**
+- Use `--verbose` flag to see debug-level information
+- Shows batch timing, API calls, and detailed progress
+
+## Development
+
+```bash
+# Type checking
+npm run type-check
+
+# Build
+npm run build
+
+# Development mode (rebuild + run)
+npm run dev
+
+# Run tests
+npm run test
+
+# Clean build artifacts
+npm run clean
+```

 ## Project Structure

···
 │ │ ├── merge.ts # Combined import deduplication
 │ │ └── sync.ts # Re-sync mode & duplicate detection
 │ ├── utils/
-│ │ ├── logger.ts # Structured logging system (NEW!)
+│ │ ├── logger.ts # Structured logging system
 │ │ ├── helpers.ts # Utility functions (timing, formatting)
 │ │ ├── input.ts # User input handling (prompts, passwords)
 │ │ ├── rate-limiter.ts # Rate limiting calculations
···
 │ └── play.json # Play record schema
 ├── package.json
 ├── tsconfig.json
-├── CLI_IMPROVEMENTS.md # Detailed CLI documentation
 └── README.md
 ```

-## Development
-
-```bash
-# Type checking
-npm run type-check
-
-# Build
-npm run build
-
-# Development mode (rebuild + run)
-npm run dev
-
-# Clean build artifacts
-npm run clean
-```
-
 ## Technical Details

 ### Authentication
···
 - Requires an ATProto app password (not your main password)
 - Automatically configures the agent for your personal PDS

-### Rate Limiting Algorithm
-1. Calculates safe daily limit (90% of 1K = 900 records/day)
-2. Determines how many days needed for your import
-3. Calculates optimal batch size and delay to spread records evenly
-4. Enforces minimum 500ms delay between batches
-5. Shows clear schedule before starting
-
-### Record Processing
-1. Parses input file(s):
-   - **Last.fm**: CSV using `csv-parse` library
-   - **Spotify**: JSON files (single or multiple in directory)
-2. Filters data:
-   - **Spotify**: Automatically removes podcasts, audiobooks, and non-music content
-3. Converts to `fm.teal.alpha.feed.play` schema
-4. Sorts records chronologically (or reverse if `-r` flag)
-5. Generates TID-based record keys from `playedTime` for chronological ordering
-6. Validates required fields
-7. Publishes in batches using `com.atproto.repo.applyWrites` (up to 200 records per call, the PDS maximum)
-
-**Note:** The batch publishing uses `applyWrites` instead of individual `createRecord` calls for dramatically improved performance (up to 20x faster).
+### Batch Publishing
+- Uses `com.atproto.repo.applyWrites` for efficiency (up to 20x faster than individual calls)
+- Batches up to 200 records per API call (PDS maximum)
+- Automatically adjusts batch size based on total record count
+- Enforces minimum delays between batches for rate limit safety

 ### Data Mapping

 **Last.fm:**
-- **Track info**: Direct mapping from CSV columns
-- **Timestamps**: Converts Unix timestamps to ISO 8601
-- **MusicBrainz IDs**: Preserved when present in CSV
-- **URLs**: Generated from artist/track names
-- **Artists**: Wrapped in array format with optional MBID
+- Direct mapping from CSV columns
+- Converts Unix timestamps to ISO 8601
+- Preserves MusicBrainz IDs when present
+- Generates URLs from artist/track names
+- Wraps artists in array format with optional MBID

 **Spotify:**
-- **Track info**: Extracted from JSON fields
-- **Timestamps**: Already in ISO 8601 format (`ts` field)
-- **URLs**: Generated from `spotify_track_uri` field
-- **Artists**: Extracted from `master_metadata_album_artist_name`
-- **Albums**: Extracted from `master_metadata_album_album_name`
-- **Filtering**: Non-music content automatically excluded
+- Extracts data from JSON fields
+- Already in ISO 8601 format (`ts` field)
+- Generates URLs from `spotify_track_uri`
+- Automatically filters non-music content
+- Extracts artist and album from metadata fields

-## Lexicon Reference
+### Lexicon Reference

 This importer follows the official `fm.teal.alpha` lexicon defined in `/lexicons/fm.teal.alpha/feed/play.json`.

-The lexicon defines:
-- Required and optional field types
-- String length constraints
-- Array formats
-- Timestamp formatting
-- URL validation
-
-## Troubleshooting
-
-### "Handle not found"
-- Verify your ATProto handle is correct (e.g., `alice.bsky.social`)
-- Make sure you're using a valid DID or handle
-
-### "Invalid credentials"
-- Use an **app password**, not your main account password
-- Generate app passwords in your account settings
-
-### "Rate limit exceeded"
-- The importer should prevent this automatically
-- If you see this, wait 24 hours before retrying
-- Consider reducing batch size or increasing delay
-
-### "Connection refused"
-- Check your internet connection
-- Verify your PDS is accessible
-- Some PDSs may have firewall rules
-
-### Import seems stuck
-- Check progress messages - large imports take time
-- Multi-day imports pause for 24 hours between days
-- You can safely stop (Ctrl+C) and resume later
-- Use `--verbose` flag to see detailed progress
-
-### Too much output
-- Use `--quiet` flag to suppress non-essential messages
-- Only warnings and errors will be shown
-
-### Need more details
-- Use `--verbose` flag to see debug-level information
-- Shows batch timing, API calls, and detailed progress
+The lexicon defines required and optional field types, string length constraints, array formats, timestamp formatting, and URL validation.

 ## Contributing

-Contributions welcome! Please:
+Contributions are welcome! Please:
 1. Fork the repository
 2. Create a feature branch
 3. Make your changes with tests
 4. Submit a pull request

-See `CLI_IMPROVEMENTS.md` for developer documentation on the logging system and CLI structure.
-
 ## License

 AGPL-3.0-only - See LICENCE file for details
···
 - Identity resolution via [Slingshot](https://slingshot.danner.cloud)
 - Follows the `fm.teal.alpha` lexicon standard
 - Colored output via [chalk](https://www.npmjs.com/package/chalk)
+- Progress indicators via [ora](https://www.npmjs.com/package/ora) and [cli-progress](https://www.npmjs.com/package/cli-progress)

 ---

-**Note**: This tool is for personal use. Respect the terms of service and rate limits when exporting your data.
+**Note**: This tool is for personal use. Respect the terms of service and rate limits when importing your data.
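
For readers following the "Rate Limiting Algorithm" steps in the updated README (safe daily limit, number of days, batch size and delay, minimum delay), here is a minimal TypeScript sketch of how such a schedule could be computed. It is not the importer's actual `rate-limiter.ts`: the function name `planImport`, the `ImportSchedule` shape, and the rounding rules are assumptions made for illustration, and it does not try to reproduce the exact `Batch delay` figure shown in the README's sample output.

```typescript
// Hypothetical sketch of the scheduling steps described in the README.
// All names and rounding choices here are assumptions, not the project's code.

interface ImportSchedule {
  dailyLimit: number;   // records allowed per day after the safety margin
  days: number;         // number of days the import spans
  batchSize: number;    // records per applyWrites call
  batchDelayMs: number; // pause between batches
}

const MAX_BATCH_SIZE = 200;           // per-call cap quoted in the README
const MIN_BATCH_DELAY_MS = 500;       // minimum delay quoted in the README
const DAY_MS = 24 * 60 * 60 * 1000;

function planImport(
  totalRecords: number,
  pdsDailyCap: number = 10_000, // daily cap quoted in the README
  safetyFactor: number = 0.75,  // "75% of 10K" from the README
): ImportSchedule {
  // Step 1: safe daily limit.
  const dailyLimit = Math.floor(pdsDailyCap * safetyFactor);

  // Step 2: how many days the import needs.
  const days = Math.max(1, Math.ceil(totalRecords / dailyLimit));

  // Step 3: batch size, capped at the per-call maximum.
  const batchSize = Math.min(MAX_BATCH_SIZE, Math.max(1, totalRecords));

  // Single-day imports fit under the limit, so the minimum delay suffices.
  if (days === 1) {
    return { dailyLimit, days, batchSize, batchDelayMs: MIN_BATCH_DELAY_MS };
  }

  // Multi-day imports: spread one day's batches evenly across 24 hours.
  const batchesPerDay = Math.ceil(dailyLimit / batchSize);

  // Step 4: never go below the minimum delay.
  const batchDelayMs = Math.max(
    MIN_BATCH_DELAY_MS,
    Math.floor(DAY_MS / batchesPerDay),
  );

  return { dailyLimit, days, batchSize, batchDelayMs };
}

// Step 5: show the schedule before starting.
console.log(planImport(20_000));
```

Under these assumptions a 20,000-record import yields a 7,500 records/day limit spread over 3 days in batches of 200, which agrees with the daily limit, duration, and batch size in the README's sample output; the delay value depends on rounding choices the README does not spell out.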