statuses-lookup command, closes #13

+86 -2
+23 -2
README.md
···

     $ twitter-to-sqlite users-lookup users.db simonw cleopaws

-You can pass user IDs instead usincg the `--ids` option:
+You can pass user IDs instead using the `--ids` option:

     $ twitter-to-sqlite users-lookup users.db 12497 3166449535 --ids
+
+This command also accepts `--sql` and `--attach` options, documented below.
+
+## Retrieve tweets in bulk
+
+If you have a list of tweet IDS you can bulk fetch them using the `statuses-lookup` command:
+
+    $ twitter-to-sqlite statuses-lookup tweets.db 1122154819815239680 1122154178493575169
+
+The `--sql` and `--attach` options are supported.
+
+Here's a recipe to retrieve any tweets that existing tweets are in-reply-to which have not yet been stored in your database:
+
+    $ twitter-to-sqlite statuses-lookup tweets.db \
+        --sql='
+            select in_reply_to_status_id
+            from tweets
+            where in_reply_to_status_id is not null' \
+        --skip-existing
+
+The `--skip-existing` option means that tweets that have already been stored in the database will not be fetched again.

 ## Retrieving Twitter followers

···

 This option is available for some subcommands - run `twitter-to-sqlite command-name --help` to check.

-You can provide Twitter screen names (or user IDs) directly as command-line arguments, or you can provide those screen names or IDs by executing a SQL query.
+You can provide Twitter screen names (or user IDs or tweet IDs) directly as command-line arguments, or you can provide those screen names or IDs by executing a SQL query.

 For example: consider a SQLite database with an `attendees` table listing names and Twitter accounts - something like this:
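The README recipe relies on `--skip-existing` to drop IDs that are already stored. As a rough sketch of what the query selects, and of how the same filtering could be pushed into SQL instead, the following uses Python's standard `sqlite3` module. The two-column `tweets` schema and the sample rows are made up for illustration, and the `not in` subquery is an alternative suggested here, not something the README documents.

```python
import sqlite3

# Hypothetical minimal schema: the real tweets table has many more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table tweets (id integer primary key, in_reply_to_status_id integer)"
)
conn.executemany(
    "insert into tweets (id, in_reply_to_status_id) values (?, ?)",
    [(1, None), (2, 1), (3, 99)],  # tweet 3 replies to tweet 99, which is not stored
)

# The README query: every tweet ID that some stored tweet replies to.
reply_targets = sorted(
    row[0]
    for row in conn.execute(
        "select in_reply_to_status_id from tweets "
        "where in_reply_to_status_id is not null"
    )
)

# Alternative (an assumption, not from the README): exclude stored tweets in SQL.
missing = sorted(
    row[0]
    for row in conn.execute(
        "select distinct in_reply_to_status_id from tweets "
        "where in_reply_to_status_id is not null "
        "and in_reply_to_status_id not in (select id from tweets)"
    )
)

print(reply_targets)  # [1, 99]
print(missing)        # [99]
```

With the README query alone, tweet 1 would be requested again unless `--skip-existing` is passed; the subquery variant returns only the IDs that are genuinely absent.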
+44
twitter_to_sqlite/cli.py
···
         utils.save_users(db, batch)


+@cli.command(name="statuses-lookup")
+@click.argument(
+    "db_path",
+    type=click.Path(file_okay=True, dir_okay=False, allow_dash=False),
+    required=True,
+)
+@add_identifier_options
+@click.option(
+    "-a",
+    "--auth",
+    type=click.Path(file_okay=True, dir_okay=False, allow_dash=True, exists=True),
+    default="auth.json",
+    help="Path to auth.json token file",
+)
+@click.option(
+    "--skip-existing", is_flag=True, help="Skip tweets that are already in the DB"
+)
+@click.option("--silent", is_flag=True, help="Disable progress bar")
+def statuses_lookup(db_path, identifiers, attach, sql, auth, skip_existing, silent):
+    "Fetch tweets by their IDs"
+    auth = json.load(open(auth))
+    session = utils.session_for_auth(auth)
+    db = sqlite_utils.Database(db_path)
+    identifiers = utils.resolve_identifiers(db, identifiers, attach, sql)
+    if skip_existing:
+        existing_ids = set(
+            r[0] for r in db.conn.execute("select id from tweets").fetchall()
+        )
+        identifiers = [i for i in identifiers if int(i) not in existing_ids]
+    if silent:
+        for batch in utils.fetch_status_batches(session, identifiers):
+            utils.save_tweets(db, batch)
+    else:
+        # Do it with a progress bar
+        count = len(identifiers)
+        with click.progressbar(
+            length=count,
+            label="Importing {:,} tweet{}".format(count, "" if count == 1 else "s"),
+        ) as bar:
+            for batch in utils.fetch_status_batches(session, identifiers):
+                utils.save_tweets(db, batch)
+                bar.update(len(batch))
+
+
 @cli.command(name="list-members")
 @click.argument(
     "db_path",
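The `--skip-existing` branch in this diff queries `tweets` unconditionally, so it would presumably raise an error against a database that does not yet have that table. A minimal sketch of the same filter with a guard for that case follows. The helper name `filter_new_ids` and the `sqlite_master` check are illustrative assumptions, not part of the commit, and the sketch uses the standard `sqlite3` module rather than `sqlite_utils`.

```python
import sqlite3


def filter_new_ids(conn, identifiers, table="tweets"):
    """Return the identifiers whose integer value is not already an id in `table`.

    Hypothetical helper: mirrors the skip-existing logic in the diff, but
    returns everything unchanged when the table has not been created yet.
    """
    table_exists = conn.execute(
        "select 1 from sqlite_master where type = 'table' and name = ?", (table,)
    ).fetchone()
    if not table_exists:
        return list(identifiers)
    existing_ids = {row[0] for row in conn.execute(f"select id from [{table}]")}
    # Identifiers arrive as strings from the command line, hence int(i).
    return [i for i in identifiers if int(i) not in existing_ids]


conn = sqlite3.connect(":memory:")
requested = ["1122154819815239680", "1122154178493575169"]

# No tweets table yet: nothing is filtered out.
before = filter_new_ids(conn, requested)

conn.execute("create table tweets (id integer primary key)")
conn.execute("insert into tweets (id) values (1122154819815239680)")

# One of the two IDs is now stored, so only the other remains.
after = filter_new_ids(conn, requested)
print(after)  # ['1122154178493575169']
```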
+19
twitter_to_sqlite/utils.py
···
         time.sleep(sleep)


+def fetch_status_batches(session, tweet_ids, sleep=1):
+    # Yields lists of up to 100 tweets
+    batches = []
+    batch = []
+    for id in tweet_ids:
+        batch.append(id)
+        if len(batch) == 100:
+            batches.append(batch)
+            batch = []
+    if batch:
+        batches.append(batch)
+    url = "https://api.twitter.com/1.1/statuses/lookup.json"
+    for batch in batches:
+        args = {"id": ",".join(map(str, batch)), "tweet_mode": "extended"}
+        tweets = session.get(url, params=args).json()
+        yield tweets
+        time.sleep(sleep)
+
+
 def resolve_identifiers(db, identifiers, attach, sql):
     if sql:
         if attach:
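`fetch_status_batches` builds the full list of batches before making any request. One possible variation, shown here only as a sketch, is to chunk the IDs lazily with `itertools.islice`. The helper names `chunked` and `lookup_params` are hypothetical; the parameter dictionary matches the one constructed in the diff, and no network call is made.

```python
from itertools import islice


def chunked(ids, size=100):
    """Yield successive lists of at most `size` items from `ids`."""
    iterator = iter(ids)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def lookup_params(batch):
    """Build the query parameters the diff passes to statuses/lookup.json."""
    return {"id": ",".join(map(str, batch)), "tweet_mode": "extended"}


sizes = [len(batch) for batch in chunked(range(250))]
print(sizes)  # [100, 100, 50]

params = lookup_params([1122154819815239680, 1122154178493575169])
print(params["id"])  # 1122154819815239680,1122154178493575169
```

Because `chunked` accepts any iterable, it would also work with IDs streamed from a SQL cursor, although the progress bar in `statuses_lookup` still needs the total count up front.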