My own corner of monopam

linkedin: match LinkedIn's current public-Pulse DOM

Article body now lives under [data-test-id="publishing-text-block"]
inside <article class="article-main">, and engagement counts are
exposed as data-num-reactions / data-num-comments attributes on
<div data-test-id="social-actions__*"> elements. The old class-based
selectors no longer match, so the body extractor 404'd on real
articles.
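
As a rough illustration of the markup shape described above, the sketch below parses a hand-written fragment with Lambda Soup (it assumes the lambdasoup package is installed). The HTML is an invented stand-in, far simpler than a real LinkedIn page; only the selectors and attribute names are taken from this change.

```ocaml
(* Sketch only: requires the lambdasoup package. The HTML below is a
   hand-written stand-in, not real LinkedIn output. *)
let html =
  {|<article class="article-main">
  <div class="article-main__content" data-test-id="publishing-text-block">
    <p>Body text.</p>
  </div>
</article>
<div data-test-id="social-actions__reactions" data-num-reactions="142"></div>|}

let soup = Soup.parse html

(* An old class-based selector finds nothing in the new markup... *)
let old_match = Soup.( $? ) soup "article .article__body"

(* ...while the data-test-id selector finds the body element. *)
let new_match = Soup.( $? ) soup "[data-test-id=\"publishing-text-block\"]"

(* The reaction count is read from the attribute value, not from text. *)
let reactions =
  Option.bind
    (Soup.( $? ) soup "[data-num-reactions]")
    (Soup.attribute "data-num-reactions")

let () =
  Printf.printf "old selector matched: %b\n" (Option.is_some old_match);
  (match new_match with
   | Some el -> print_endline (String.concat " " (Soup.trimmed_texts el))
   | None -> print_endline "no body found");
  match reactions with
  | Some s -> Printf.printf "data-num-reactions = %s\n" s
  | None -> print_endline "no reaction attribute found"
```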

Reorder body selectors to try the current LinkedIn shape first, then
fall back to historical class names. Switch the engagement extractor to
read the attribute values directly, taking the max across the page
because each article has several social-actions widgets. Update the HTML
fixture to use the modern shape so tests exercise the primary chain.
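
The "max across the page" step can be sketched without any HTML library. In the sketch below, `first_int` is a hypothetical stand-in for the helper of the same name that the diff calls but does not show (its real behaviour may differ), and `max_count` is an illustrative name that works on a plain list of attribute values instead of parsed nodes.

```ocaml
(* Hypothetical stand-in for [first_int]: returns the first run of decimal
   digits in [s] as an int, or 0 when there is none. *)
let first_int s =
  let n = String.length s in
  let is_digit c = c >= '0' && c <= '9' in
  let rec skip i = if i < n && not (is_digit s.[i]) then skip (i + 1) else i in
  let rec take i = if i < n && is_digit s.[i] then take (i + 1) else i in
  let start = skip 0 in
  let stop = take start in
  if stop = start then 0
  else
    Option.value ~default:0
      (int_of_string_opt (String.sub s start (stop - start)))

(* Fold over every attribute value found on the page and keep the largest
   count; an empty list yields 0, which signals "try the fallbacks". *)
let max_count values =
  List.fold_left (fun acc s -> max acc (first_int s)) 0 values

let () =
  (* Illustrative values: top widget, a nested comment widget, bottom widget. *)
  let reactions = [ "142"; "3"; "142" ] in
  Printf.printf "%d\n" (max_count reactions)
```

Starting the fold at 0 means a page with no matching attributes is indistinguishable from a page whose counts are all zero, which is why the change falls back to the selector chain only when the result is not positive.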

Also reuse Vlog's --json flag instead of defining a second one;
cmdliner was rejecting the duplicate registration at startup with
"option name --json defined twice".

+51 -30
+2 -6
ocaml-linkedin/bin/main.ml
@@ -305,10 +305,6 @@
   print_string s.url;
   print_char '\n'
 
-let json_flag =
-  let doc = "Output as JSON." in
-  Arg.(value & flag & info [ "json" ] ~doc)
-
 let print_item_json i =
   print_string (Json.to_string Linkedin.Item.json i);
   print_char '\n'
@@ -347,7 +343,7 @@
   let info = Cmd.info "get" ~doc ~man in
   Cmd.v info
     Term.(
-      const run' $ setup $ li_at_t $ jsessionid_t $ url_t $ json_flag
+      const run' $ setup $ li_at_t $ jsessionid_t $ url_t $ Vlog.json
       $ const ())
 
 let feed_cmd =
@@ -389,7 +385,7 @@
   Cmd.v info
     Term.(
       const run' $ setup $ li_at_t $ jsessionid_t $ id_or_url_t $ count_t
-      $ json_flag $ const ())
+      $ Vlog.json $ const ())
 
 let cookies_cmd =
   let run' () () =
+44 -19
ocaml-linkedin/lib/pulse.ml
@@ -263,6 +263,15 @@
   let parts = Soup.trimmed_texts el in
   String.concat " " parts
 
+let max_count_from_attr soup attr =
+  let nodes = Soup.($$) soup (Fmt.str "[%s]" attr) |> Soup.to_list in
+  List.fold_left
+    (fun acc el ->
+      match Soup.attribute attr el with
+      | Some s -> max acc (first_int s)
+      | None -> acc)
+    0 nodes
+
 let extract_count_from_selectors soup selectors =
   List.find_map
     (fun sel ->
@@ -280,38 +289,54 @@
     selectors
   |> Option.value ~default:0
 
+(** The article body has several [data-num-reactions] attributes (one per
+    social-actions widget, at article top and bottom, plus nested ones inside
+    comments). The article's own count is the max across the page. *)
 let extract_num_likes soup =
-  extract_count_from_selectors soup
-    [
-      "[data-test-id=\"social-counts-reactions\"]";
-      ".social-details-social-counts__reactions-count";
-      ".social-counts-reactions";
-      "[aria-label*=\"reactions\"]";
-      "[aria-label*=\"Reactions\"]";
-    ]
+  let from_attr = max_count_from_attr soup "data-num-reactions" in
+  if from_attr > 0 then from_attr
+  else
+    extract_count_from_selectors soup
+      [
+        "[data-test-id=\"social-actions__reaction-count\"]";
+        "[data-test-id=\"social-actions__reactions\"]";
+        ".social-details-social-counts__reactions-count";
+        "[aria-label*=\"reactions\"]";
+        "[aria-label*=\"Reactions\"]";
+      ]
 
 let extract_num_comments soup =
-  extract_count_from_selectors soup
-    [
-      "[data-test-id=\"social-counts-comments\"]";
-      ".social-details-social-counts__comments";
-      ".social-counts-comments";
-      "[aria-label*=\"comments\"]";
-      "[aria-label*=\"Comments\"]";
-    ]
+  let from_attr = max_count_from_attr soup "data-num-comments" in
+  if from_attr > 0 then from_attr
+  else
+    extract_count_from_selectors soup
+      [
+        "[data-test-id=\"social-actions__comments\"]";
+        ".social-details-social-counts__comments";
+        "[aria-label*=\"comments\"]";
+        "[aria-label*=\"Comments\"]";
+      ]
 
-(** Find the article body element. Tries a ranked list of selectors; returns the
-    first match. *)
+(** Find the article body element. Tries a ranked list of selectors, starting
+    with LinkedIn's current public-Pulse markup and falling back to historical
+    shapes. *)
 let extract_body soup =
   let selectors =
     [
+      (* Current public Pulse (2026): <article class="article-main"><div
+         class="article-main__content" data-test-id="publishing-text-block">
+         ...body... </div></article> *)
+      "[data-test-id=\"publishing-text-block\"]";
+      "article.article-main .article-main__content";
+      ".article-main__content";
+      "article.article-main";
+      (* Historical / in-app shapes *)
       "article .reader-article-content";
       "div.reader-article-content";
       "article .article__body";
       "div.article-body";
       "main article";
       "article";
-      "div[data-testid=\"article-body\"]";
     ]
   in
   List.find_map (fun sel -> Soup.( $? ) soup sel) selectors
+5 -5
ocaml-linkedin/test/fixtures/pulse_article.html
@@ -13,8 +13,8 @@
   <header>
     <a href="https://www.linkedin.com/in/parsimoni" class="author-link">Parsimoni</a>
   </header>
-  <article>
-    <div class="article__body">
+  <article class="article-main relative flex-grow pulse">
+    <div class="article-main__content" data-test-id="publishing-text-block">
       <p>Satellite operators are leaving significant <strong>economic value</strong>
       in orbit. Three patterns recur:</p>
       <ul>
@@ -42,9 +42,9 @@
       <p>See <code>orbit_value</code> above.</p>
     </div>
   </article>
-  <div class="social-details-social-counts">
-    <span class="social-details-social-counts__reactions-count" aria-label="142 reactions">142</span>
-    <span class="social-details-social-counts__comments">37 comments</span>
+  <div class="social-actions">
+    <div data-test-id="social-actions__reactions" data-num-reactions="142" aria-label="142 Reactions">Reactions</div>
+    <div data-test-id="social-actions__comments" data-num-comments="37" aria-label="37 Comments">Comments</div>
   </div>
 </body>
 </html>