A fork of https://github.com/crosspoint-reader/crosspoint-reader

perf: Replace std::list with std::vector in text layout (#1038)

## Summary

_Revision to @blindbat's #802. Description comes from the original PR._

- Replace `std::list` with `std::vector` for word storage in `TextBlock`
and `ParsedText`
- Use index-based access (`words[i]`) instead of iterator advancement
(`std::advance(it, n)`)
- Remove the separate `continuesVec` copy that was built from
`wordContinues` for O(1) access. It is now unnecessary because
`wordContinues` is itself a `std::vector<bool>` with O(1) indexing
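
To illustrate the access-pattern change listed above, here is a minimal sketch. The helper names `nthWordList`, `nthWordVector`, and `attachesToPrevious` are hypothetical and do not appear in the PR; they only contrast walking a list with indexing a vector.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <string>
#include <vector>

// Hypothetical helper: reaching the n-th word of a list walks n nodes, so it is O(n).
inline const std::string& nthWordList(const std::list<std::string>& words, std::size_t n) {
  assert(n < words.size());
  auto it = words.begin();
  std::advance(it, static_cast<std::ptrdiff_t>(n));  // steps node by node
  return *it;
}

// Hypothetical helper: reaching the n-th word of a vector is a direct O(1) index.
inline const std::string& nthWordVector(const std::vector<std::string>& words, std::size_t n) {
  assert(n < words.size());
  return words[n];
}

// Hypothetical helper: a std::vector<bool> of continuation flags can be read by index
// directly, so no separate indexed copy is needed.
inline bool attachesToPrevious(const std::vector<bool>& wordContinues, std::size_t n) {
  assert(n < wordContinues.size());
  return wordContinues[n];
}
```

Both word helpers return the same element for the same index; only the cost of reaching it differs.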

## Why

`std::list` allocates each node individually on the heap, with two
prev/next pointers of overhead per node (16 bytes on a 64-bit target,
8 bytes on a 32-bit one). For text layout with many small words, this
means:
- Scattered heap allocations instead of contiguous memory
- Poor cache locality during iteration (each node can be anywhere in
memory)
- Per-node malloc/free overhead during construction and destruction
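
The memory-layout points above can be sketched roughly as follows. `ListNodeModel` is an assumed, simplified model of a doubly linked list node (real standard-library layouts are implementation-defined), and `isContiguous` and `contiguousSpanBytes` are hypothetical helpers, not part of the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed, simplified model of a doubly linked list node. Real std::list nodes are
// implementation-defined, and each one also carries allocator bookkeeping.
template <typename T>
struct ListNodeModel {
  ListNodeModel* prev;
  ListNodeModel* next;
  T value;
};

// Link overhead per node in this model: two pointers, whatever the pointer width is.
constexpr std::size_t kLinkOverhead = 2 * sizeof(void*);

// Returns true when every element sits directly after its predecessor in memory.
// Not usable with std::vector<bool>, whose elements are packed bits behind a proxy.
template <typename T>
bool isContiguous(const std::vector<T>& v) {
  for (std::size_t i = 1; i < v.size(); ++i) {
    if (&v[i] != &v[i - 1] + 1) return false;
  }
  return true;
}

// Bytes spanned by the vector's elements: one block, with no per-element link overhead.
template <typename T>
std::size_t contiguousSpanBytes(const std::vector<T>& v) {
  assert(isContiguous(v));
  return v.size() * sizeof(T);
}
```

For a small element type such as `uint16_t`, the two link pointers in the model already outweigh the payload several times over.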

`std::vector` stores elements contiguously, giving better cache
performance during the tight rendering and layout loops. The
`extractLine` function also benefits: list splice relinked nodes without
copying any elements, but it required advancing and maintaining three
parallel iterators, while vector range construction with move iterators
is simpler and still efficient for the small line-sized chunks involved.
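
A minimal sketch of the range-construction pattern described above, assuming half-open index ranges. The helpers `takeLine` and `dropConsumed` are hypothetical names for illustration; the PR performs these steps inline.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: moves words [first, last) out of `words` into a new vector.
// The moved-from strings remain in `words` (valid but unspecified) until erased.
inline std::vector<std::string> takeLine(std::vector<std::string>& words, std::size_t first, std::size_t last) {
  assert(first <= last && last <= words.size());
  const auto begin = words.begin() + static_cast<std::ptrdiff_t>(first);
  const auto end = words.begin() + static_cast<std::ptrdiff_t>(last);
  return std::vector<std::string>(std::make_move_iterator(begin), std::make_move_iterator(end));
}

// Hypothetical helper: drops the first `consumed` entries in one pass after all lines
// have been extracted, so size() again reflects only the remaining words.
inline void dropConsumed(std::vector<std::string>& words, std::size_t consumed) {
  assert(consumed <= words.size());
  words.erase(words.begin(), words.begin() + static_cast<std::ptrdiff_t>(consumed));
}
```

Erasing once at the end, rather than after every line, keeps the indices stored in the line-break list valid while lines are still being extracted.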

## Files changed

- `lib/Epub/Epub/blocks/TextBlock.h` / `.cpp`
- `lib/Epub/Epub/ParsedText.h` / `.cpp`

## AI Usage

YES

## Test plan

- [ ] Open an EPUB with mixed formatting (bold, italic, underline) —
verify text renders correctly
- [ ] Open a book with justified text — verify word spacing is correct
- [ ] Open a book with hyphenation enabled — verify words break
correctly at hyphens
- [ ] Navigate through pages rapidly — verify no rendering glitches or
crashes
- [ ] Open a book with long paragraphs — verify text layout matches
pre-change behavior

---------

Co-authored-by: Kuanysh Bekkulov <kbekkulov@gmail.com>

Authored by Zach Nelson and Kuanysh Bekkulov; committed by GitHub.
36967945 c1fad16e

+50 -93
+32 -66
lib/Epub/Epub/ParsedText.cpp
···
 #include <algorithm>
 #include <cmath>
 #include <functional>
-#include <iterator>
 #include <limits>
 #include <vector>

···
   const int spaceWidth = renderer.getSpaceWidth(fontId);
   auto wordWidths = calculateWordWidths(renderer, fontId);

-  // Build indexed continues vector from the parallel list for O(1) access during layout
-  std::vector<bool> continuesVec(wordContinues.begin(), wordContinues.end());
-
   std::vector<size_t> lineBreakIndices;
   if (hyphenationEnabled) {
     // Use greedy layout that can split words mid-loop when a hyphenated prefix fits.
-    lineBreakIndices = computeHyphenatedLineBreaks(renderer, fontId, pageWidth, spaceWidth, wordWidths, continuesVec);
+    lineBreakIndices = computeHyphenatedLineBreaks(renderer, fontId, pageWidth, spaceWidth, wordWidths, wordContinues);
   } else {
-    lineBreakIndices = computeLineBreaks(renderer, fontId, pageWidth, spaceWidth, wordWidths, continuesVec);
+    lineBreakIndices = computeLineBreaks(renderer, fontId, pageWidth, spaceWidth, wordWidths, wordContinues);
   }
   const size_t lineCount = includeLastLine ? lineBreakIndices.size() : lineBreakIndices.size() - 1;

   for (size_t i = 0; i < lineCount; ++i) {
-    extractLine(i, pageWidth, spaceWidth, wordWidths, continuesVec, lineBreakIndices, processLine);
+    extractLine(i, pageWidth, spaceWidth, wordWidths, wordContinues, lineBreakIndices, processLine);
+  }
+
+  // Remove consumed words so size() reflects only remaining words
+  if (lineCount > 0) {
+    const size_t consumed = lineBreakIndices[lineCount - 1];
+    words.erase(words.begin(), words.begin() + consumed);
+    wordStyles.erase(wordStyles.begin(), wordStyles.begin() + consumed);
+    wordContinues.erase(wordContinues.begin(), wordContinues.begin() + consumed);
   }
 }

 std::vector<uint16_t> ParsedText::calculateWordWidths(const GfxRenderer& renderer, const int fontId) {
-  const size_t totalWordCount = words.size();
-
   std::vector<uint16_t> wordWidths;
-  wordWidths.reserve(totalWordCount);
+  wordWidths.reserve(words.size());

-  auto wordsIt = words.begin();
-  auto wordStylesIt = wordStyles.begin();
-
-  while (wordsIt != words.end()) {
-    wordWidths.push_back(measureWordWidth(renderer, fontId, *wordsIt, *wordStylesIt));
-
-    std::advance(wordsIt, 1);
-    std::advance(wordStylesIt, 1);
+  for (size_t i = 0; i < words.size(); ++i) {
+    wordWidths.push_back(measureWordWidth(renderer, fontId, words[i], wordStyles[i]));
   }

   return wordWidths;
···
     // First word needs to fit in reduced width if there's an indent
     const int effectiveWidth = i == 0 ? pageWidth - firstLineIndent : pageWidth;
     while (wordWidths[i] > effectiveWidth) {
-      if (!hyphenateWordAtIndex(i, effectiveWidth, renderer, fontId, wordWidths, /*allowFallbackBreaks=*/true,
-                                &continuesVec)) {
+      if (!hyphenateWordAtIndex(i, effectiveWidth, renderer, fontId, wordWidths, /*allowFallbackBreaks=*/true)) {
         break;
       }
     }
···
       const int availableWidth = effectivePageWidth - lineWidth - spacing;
       const bool allowFallbackBreaks = isFirstWord;  // Only for first word on line

-      if (availableWidth > 0 && hyphenateWordAtIndex(currentIndex, availableWidth, renderer, fontId, wordWidths,
-                                                     allowFallbackBreaks, &continuesVec)) {
+      if (availableWidth > 0 &&
+          hyphenateWordAtIndex(currentIndex, availableWidth, renderer, fontId, wordWidths, allowFallbackBreaks)) {
         // Prefix now fits; append it to this line and move to next line
         lineWidth += spacing + wordWidths[currentIndex];
         ++currentIndex;
···
 // available width.
 bool ParsedText::hyphenateWordAtIndex(const size_t wordIndex, const int availableWidth, const GfxRenderer& renderer,
                                       const int fontId, std::vector<uint16_t>& wordWidths,
-                                      const bool allowFallbackBreaks, std::vector<bool>* continuesVec) {
+                                      const bool allowFallbackBreaks) {
   // Guard against invalid indices or zero available width before attempting to split.
   if (availableWidth <= 0 || wordIndex >= words.size()) {
     return false;
   }

-  // Get iterators to target word and style.
-  auto wordIt = words.begin();
-  auto styleIt = wordStyles.begin();
-  std::advance(wordIt, wordIndex);
-  std::advance(styleIt, wordIndex);
-
-  const std::string& word = *wordIt;
-  const auto style = *styleIt;
+  const std::string& word = words[wordIndex];
+  const auto style = wordStyles[wordIndex];

   // Collect candidate breakpoints (byte offsets and hyphen requirements).
   auto breakInfos = Hyphenator::breakOffsets(word, allowFallbackBreaks);
···

   // Split the word at the selected breakpoint and append a hyphen if required.
   std::string remainder = word.substr(chosenOffset);
-  wordIt->resize(chosenOffset);
+  words[wordIndex].resize(chosenOffset);
   if (chosenNeedsHyphen) {
-    wordIt->push_back('-');
+    words[wordIndex].push_back('-');
   }

   // Insert the remainder word (with matching style and continuation flag) directly after the prefix.
-  auto insertWordIt = std::next(wordIt);
-  auto insertStyleIt = std::next(styleIt);
-  words.insert(insertWordIt, remainder);
-  wordStyles.insert(insertStyleIt, style);
+  words.insert(words.begin() + wordIndex + 1, remainder);
+  wordStyles.insert(wordStyles.begin() + wordIndex + 1, style);

   // Continuation flag handling after splitting a word into prefix + remainder.
   //
···
   //
   // This lets the backtracking loop keep the entire prefix group ("200 Quadrat-") on one
   // line, while "kilometer" moves to the next line.
-  auto continuesIt = wordContinues.begin();
-  std::advance(continuesIt, wordIndex);
-  // *continuesIt is intentionally left unchanged — the prefix keeps its original attachment.
-  const auto insertContinuesIt = std::next(continuesIt);
-  wordContinues.insert(insertContinuesIt, false);
-
-  // Keep the indexed vector in sync if provided.
-  if (continuesVec) {
-    // (*continuesVec)[wordIndex] stays unchanged — prefix keeps its attachment.
-    continuesVec->insert(continuesVec->begin() + wordIndex + 1, false);
-  }
+  // wordContinues[wordIndex] is intentionally left unchanged — the prefix keeps its original attachment.
+  wordContinues.insert(wordContinues.begin() + wordIndex + 1, false);

   // Update cached widths to reflect the new prefix/remainder pairing.
   wordWidths[wordIndex] = static_cast<uint16_t>(chosenWidth);
···

   // Pre-calculate X positions for words
   // Continuation words attach to the previous word with no space before them
-  std::list<uint16_t> lineXPos;
+  std::vector<uint16_t> lineXPos;
+  lineXPos.reserve(lineWordCount);

   for (size_t wordIdx = 0; wordIdx < lineWordCount; wordIdx++) {
     const uint16_t currentWordWidth = wordWidths[lastBreakAt + wordIdx];
···
     xpos += currentWordWidth + (nextIsContinuation ? 0 : spacing);
   }

-  // Iterators always start at the beginning as we are moving content with splice below
-  auto wordEndIt = words.begin();
-  auto wordStyleEndIt = wordStyles.begin();
-  auto wordContinuesEndIt = wordContinues.begin();
-  std::advance(wordEndIt, lineWordCount);
-  std::advance(wordStyleEndIt, lineWordCount);
-  std::advance(wordContinuesEndIt, lineWordCount);
-
-  // *** CRITICAL STEP: CONSUME DATA USING SPLICE ***
-  std::list<std::string> lineWords;
-  lineWords.splice(lineWords.begin(), words, words.begin(), wordEndIt);
-  std::list<EpdFontFamily::Style> lineWordStyles;
-  lineWordStyles.splice(lineWordStyles.begin(), wordStyles, wordStyles.begin(), wordStyleEndIt);
-
-  // Consume continues flags (not passed to TextBlock, but must be consumed to stay in sync)
-  std::list<bool> lineContinues;
-  lineContinues.splice(lineContinues.begin(), wordContinues, wordContinues.begin(), wordContinuesEndIt);
+  // Build line data by moving from the original vectors using index range
+  std::vector<std::string> lineWords(std::make_move_iterator(words.begin() + lastBreakAt),
+                                     std::make_move_iterator(words.begin() + lineBreak));
+  std::vector<EpdFontFamily::Style> lineWordStyles(wordStyles.begin() + lastBreakAt, wordStyles.begin() + lineBreak);

   for (auto& word : lineWords) {
     if (containsSoftHyphen(word)) {
+4 -6
lib/Epub/Epub/ParsedText.h
···
 #include <EpdFontFamily.h>

 #include <functional>
-#include <list>
 #include <memory>
 #include <string>
 #include <vector>
···
 class GfxRenderer;

 class ParsedText {
-  std::list<std::string> words;
-  std::list<EpdFontFamily::Style> wordStyles;
-  std::list<bool> wordContinues;  // true = word attaches to previous (no space before it)
+  std::vector<std::string> words;
+  std::vector<EpdFontFamily::Style> wordStyles;
+  std::vector<bool> wordContinues;  // true = word attaches to previous (no space before it)
   BlockStyle blockStyle;
   bool extraParagraphSpacing;
   bool hyphenationEnabled;
···
                                                   int spaceWidth, std::vector<uint16_t>& wordWidths,
                                                   std::vector<bool>& continuesVec);
   bool hyphenateWordAtIndex(size_t wordIndex, int availableWidth, const GfxRenderer& renderer, int fontId,
-                            std::vector<uint16_t>& wordWidths, bool allowFallbackBreaks,
-                            std::vector<bool>* continuesVec = nullptr);
+                            std::vector<uint16_t>& wordWidths, bool allowFallbackBreaks);
   void extractLine(size_t breakIndex, int pageWidth, int spaceWidth, const std::vector<uint16_t>& wordWidths,
                    const std::vector<bool>& continuesVec, const std::vector<size_t>& lineBreakIndices,
                    const std::function<void(std::shared_ptr<TextBlock>)>& processLine);
+8 -15
lib/Epub/Epub/blocks/TextBlock.cpp
···
     return;
   }

-  auto wordIt = words.begin();
-  auto wordStylesIt = wordStyles.begin();
-  auto wordXposIt = wordXpos.begin();
   for (size_t i = 0; i < words.size(); i++) {
-    const int wordX = *wordXposIt + x;
-    const EpdFontFamily::Style currentStyle = *wordStylesIt;
-    renderer.drawText(fontId, wordX, y, wordIt->c_str(), true, currentStyle);
+    const int wordX = wordXpos[i] + x;
+    const EpdFontFamily::Style currentStyle = wordStyles[i];
+    renderer.drawText(fontId, wordX, y, words[i].c_str(), true, currentStyle);

     if ((currentStyle & EpdFontFamily::UNDERLINE) != 0) {
-      const std::string& w = *wordIt;
+      const std::string& w = words[i];
       const int fullWordWidth = renderer.getTextWidth(fontId, w.c_str(), currentStyle);
       // y is the top of the text line; add ascender to reach baseline, then offset 2px below
       const int underlineY = y + renderer.getFontAscenderSize(fontId) + 2;
···

       renderer.drawLine(startX, underlineY, startX + underlineWidth, underlineY, true);
     }
-
-    std::advance(wordIt, 1);
-    std::advance(wordStylesIt, 1);
-    std::advance(wordXposIt, 1);
   }
 }

···

 std::unique_ptr<TextBlock> TextBlock::deserialize(FsFile& file) {
   uint16_t wc;
-  std::list<std::string> words;
-  std::list<uint16_t> wordXpos;
-  std::list<EpdFontFamily::Style> wordStyles;
+  std::vector<std::string> words;
+  std::vector<uint16_t> wordXpos;
+  std::vector<EpdFontFamily::Style> wordStyles;
   BlockStyle blockStyle;

   // Word count
   serialization::readPod(file, wc);

-  // Sanity check: prevent allocation of unreasonably large lists (max 10000 words per block)
+  // Sanity check: prevent allocation of unreasonably large vectors (max 10000 words per block)
   if (wc > 10000) {
     LOG_ERR("TXB", "Deserialization failed: word count %u exceeds maximum", wc);
     return nullptr;
+6 -6
lib/Epub/Epub/blocks/TextBlock.h
···
 #include <EpdFontFamily.h>
 #include <HalStorage.h>

-#include <list>
 #include <memory>
 #include <string>
+#include <vector>

 #include "Block.h"
 #include "BlockStyle.h"
···
 // Represents a line of text on a page
 class TextBlock final : public Block {
  private:
-  std::list<std::string> words;
-  std::list<uint16_t> wordXpos;
-  std::list<EpdFontFamily::Style> wordStyles;
+  std::vector<std::string> words;
+  std::vector<uint16_t> wordXpos;
+  std::vector<EpdFontFamily::Style> wordStyles;
   BlockStyle blockStyle;

  public:
-  explicit TextBlock(std::list<std::string> words, std::list<uint16_t> word_xpos,
-                     std::list<EpdFontFamily::Style> word_styles, const BlockStyle& blockStyle = BlockStyle())
+  explicit TextBlock(std::vector<std::string> words, std::vector<uint16_t> word_xpos,
+                     std::vector<EpdFontFamily::Style> word_styles, const BlockStyle& blockStyle = BlockStyle())
       : words(std::move(words)),
         wordXpos(std::move(word_xpos)),
         wordStyles(std::move(word_styles)),