Real-time index of opencode sessions
0
fork

Configure Feed

Select the types of activity you want to include in your feed.

Add function breakdown analysis to discovery docs

rektide 37092096 db417d89

+348
+348
doc/discovery/breakdown.md
··· 1 + # Function Breakdown Analysis 2 + 3 + This document identifies opportunities to refactor large functions into smaller, more focused methods, following the pattern established in `SessionIndex::build` which was split into `build_session` and `build_message`. 4 + 5 + ## Summary of Refactoring Pattern: Extract Method 6 + 7 + The **Extract Method** refactoring involves: 8 + 1. Identifying a cohesive block of code within a larger function 9 + 2. Creating a new method with a descriptive name 10 + 3. Moving the code block to the new method 11 + 4. Calling the new method from the original location 12 + 13 + **Benefits:** 14 + - Improved readability through self-documenting method names 15 + - Easier testing of isolated functionality 16 + - Reduced cognitive load when understanding code 17 + - Better code reuse opportunities 18 + 19 + --- 20 + 21 + ## File-by-File Analysis 22 + 23 + ### `src/storage/reader.rs` 24 + 25 + **Issue:** Three nearly identical list methods with duplicated logic 26 + 27 + The methods `list_sessions`, `list_messages`, and `list_parts` share the same pattern: 28 + 1. Get directory path 29 + 2. Check if directory exists (return empty vec if not) 30 + 3. Read directory entries 31 + 4. Filter for `.json` files 32 + 5. Parse IDs from filenames 33 + 6. Sort results 34 + 35 + **Lines affected:** 73-140 (67 lines total, 3 methods ~20 lines each) 36 + 37 + **Recommendation:** Extract a generic `list_ids_in_dir` helper 38 + 39 + ```rust 40 + // BEFORE: Three separate methods with duplicated logic 41 + 42 + pub fn list_sessions(&self, project_id: &str) -> Result<Vec<SessionId>> { 43 + let dir = self.paths.session_dir(project_id); 44 + if !dir.exists() { 45 + return Ok(Vec::new()); 46 + } 47 + 48 + let mut sessions = Vec::new(); 49 + for entry in std::fs::read_dir(&dir)? { 50 + let entry = entry?; 51 + let path = entry.path(); 52 + if path.extension().map(|e| e == "json").unwrap_or(false) { 53 + if let Some(filename) = path.file_name().and_then(|n| n.to_str()) { 54 + if let Some(id) = SessionId::from_filename(filename) { 55 + sessions.push(id); 56 + } 57 + } 58 + } 59 + } 60 + 61 + sessions.sort_by(|a, b| b.as_str().cmp(a.as_str())); 62 + Ok(sessions) 63 + } 64 + 65 + pub fn list_messages(&self, session_id: &SessionId) -> Result<Vec<MessageId>> { 66 + // Nearly identical implementation... 67 + } 68 + 69 + pub fn list_parts(&self, message_id: &MessageId) -> Result<Vec<PartId>> { 70 + // Nearly identical implementation... 71 + } 72 + ``` 73 + 74 + ```rust 75 + // AFTER: Generic helper with specific wrappers 76 + 77 + fn list_ids_in_dir<T, F>(&self, dir: &Path, parse_fn: F, sort_desc: bool) -> Result<Vec<T>> 78 + where 79 + F: Fn(&str) -> Option<T>, 80 + { 81 + if !dir.exists() { 82 + return Ok(Vec::new()); 83 + } 84 + 85 + let mut ids = Vec::new(); 86 + for entry in std::fs::read_dir(dir)? { 87 + let entry = entry?; 88 + let path = entry.path(); 89 + if path.extension().map(|e| e == "json").unwrap_or(false) { 90 + if let Some(filename) = path.file_name().and_then(|n| n.to_str()) { 91 + if let Some(id) = parse_fn(filename) { 92 + ids.push(id); 93 + } 94 + } 95 + } 96 + } 97 + 98 + if sort_desc { 99 + ids.sort_by(|a, b| b.to_string().cmp(&a.to_string())); 100 + } else { 101 + ids.sort_by(|a, b| a.to_string().cmp(&b.to_string())); 102 + } 103 + Ok(ids) 104 + } 105 + 106 + pub fn list_sessions(&self, project_id: &str) -> Result<Vec<SessionId>> { 107 + let dir = self.paths.session_dir(project_id); 108 + self.list_ids_in_dir(&dir, SessionId::from_filename, true) 109 + } 110 + 111 + pub fn list_messages(&self, session_id: &SessionId) -> Result<Vec<MessageId>> { 112 + let dir = self.paths.message_dir(session_id); 113 + self.list_ids_in_dir(&dir, MessageId::from_filename, false) 114 + } 115 + 116 + pub fn list_parts(&self, message_id: &MessageId) -> Result<Vec<PartId>> { 117 + let dir = self.paths.part_dir(message_id); 118 + self.list_ids_in_dir(&dir, PartId::from_filename, false) 119 + } 120 + ``` 121 + 122 + **Impact:** Reduces ~67 lines to ~40 lines, eliminates duplication 123 + 124 + --- 125 + 126 + ### `src/storage/paths.rs` 127 + 128 + **Issue:** `project_dirs` method has nested conditionals 129 + 130 + **Lines affected:** 95-111 (17 lines) 131 + 132 + **Recommendation:** Extract directory entry filtering into a helper 133 + 134 + ```rust 135 + // BEFORE: Nested conditionals in project_dirs 136 + 137 + pub fn project_dirs(&self) -> Result<Vec<String>> { 138 + let mut projects = Vec::new(); 139 + let entries = std::fs::read_dir(&self.session).map_err(Error::Io)?; 140 + 141 + for entry in entries { 142 + let entry = entry.map_err(Error::Io)?; 143 + if entry.file_type().map_err(Error::Io)?.is_dir() { 144 + if let Some(name) = entry.file_name().to_str() { 145 + if !name.starts_with('.') { 146 + projects.push(name.to_string()); 147 + } 148 + } 149 + } 150 + } 151 + 152 + Ok(projects) 153 + } 154 + ``` 155 + 156 + ```rust 157 + // AFTER: Helper method for filtering 158 + 159 + fn is_visible_directory(entry: &std::fs::DirEntry) -> Option<String> { 160 + let file_type = entry.file_type().ok()?; 161 + if !file_type.is_dir() { 162 + return None; 163 + } 164 + let name = entry.file_name().to_str()?.to_string(); 165 + if name.starts_with('.') { 166 + return None; 167 + } 168 + Some(name) 169 + } 170 + 171 + pub fn project_dirs(&self) -> Result<Vec<String>> { 172 + let entries = std::fs::read_dir(&self.session)?; 173 + let projects: Vec<String> = entries 174 + .filter_map(|e| e.ok()) 175 + .filter_map(|e| Self::is_visible_directory(&e)) 176 + .collect(); 177 + Ok(projects) 178 + } 179 + ``` 180 + 181 + **Impact:** Cleaner code, easier to test the filtering logic 182 + 183 + --- 184 + 185 + ### `src/storage/mmap.rs` 186 + 187 + **Issue:** `MappedFileCache::get` has nested scopes for lock management 188 + 189 + **Lines affected:** 53-69 (17 lines) 190 + 191 + **Recommendation:** Extract cache lookup into a separate method 192 + 193 + ```rust 194 + // BEFORE: Nested scopes with explicit lock management 195 + 196 + pub fn get(&self, path: &Path) -> Result<Arc<MappedFile>> { 197 + { 198 + let files = self.files.read(); 199 + if let Some((_, cached)) = files.iter().find(|(p, _)| p == path) { 200 + return Ok(Arc::clone(cached)); 201 + } 202 + } 203 + 204 + let mapped = Arc::new(MappedFile::open(path)?); 205 + 206 + { 207 + let mut files = self.files.write(); 208 + files.push((path.to_path_buf(), Arc::clone(&mapped))); 209 + } 210 + 211 + Ok(mapped) 212 + } 213 + ``` 214 + 215 + ```rust 216 + // AFTER: Extract cache operations 217 + 218 + fn find_cached(&self, path: &Path) -> Option<Arc<MappedFile>> { 219 + let files = self.files.read(); 220 + files 221 + .iter() 222 + .find(|(p, _)| p == path) 223 + .map(|(_, cached)| Arc::clone(cached)) 224 + } 225 + 226 + fn insert(&self, path: PathBuf, file: Arc<MappedFile>) { 227 + let mut files = self.files.write(); 228 + files.push((path, file)); 229 + } 230 + 231 + pub fn get(&self, path: &Path) -> Result<Arc<MappedFile>> { 232 + if let Some(cached) = self.find_cached(path) { 233 + return Ok(cached); 234 + } 235 + 236 + let mapped = Arc::new(MappedFile::open(path)?); 237 + self.insert(path.to_path_buf(), Arc::clone(&mapped)); 238 + Ok(mapped) 239 + } 240 + ``` 241 + 242 + **Impact:** Clearer separation of read/write operations, easier to reason about 243 + 244 + --- 245 + 246 + ### `src/types/part.rs` 247 + 248 + **Issue:** `Part::id()` and `Part::message_id()` have repetitive match expressions 249 + 250 + **Lines affected:** 24-56 (33 lines, but repetitive) 251 + 252 + **Recommendation:** Use a macro or base trait to reduce boilerplate 253 + 254 + ```rust 255 + // BEFORE: Repetitive match expressions 256 + 257 + impl Part { 258 + pub fn id(&self) -> &PartId { 259 + match self { 260 + Part::Text(p) => &p.base.id, 261 + Part::Reasoning(p) => &p.base.id, 262 + Part::Tool(p) => &p.base.id, 263 + // ... 9 more variants 264 + } 265 + } 266 + 267 + pub fn message_id(&self) -> &MessageId { 268 + match self { 269 + Part::Text(p) => &p.base.message_id, 270 + Part::Reasoning(p) => &p.base.message_id, 271 + Part::Tool(p) => &p.base.message_id, 272 + // ... 9 more variants 273 + } 274 + } 275 + } 276 + ``` 277 + 278 + ```rust 279 + // AFTER: Macro to generate accessors 280 + 281 + macro_rules! part_accessors { 282 + ($self:ident, $field:ident) => { 283 + match $self { 284 + Part::Text(p) => &p.base.$field, 285 + Part::Reasoning(p) => &p.base.$field, 286 + Part::Tool(p) => &p.base.$field, 287 + Part::File(p) => &p.base.$field, 288 + Part::StepStart(p) => &p.base.$field, 289 + Part::StepFinish(p) => &p.base.$field, 290 + Part::Snapshot(p) => &p.base.$field, 291 + Part::Patch(p) => &p.base.$field, 292 + Part::Agent(p) => &p.base.$field, 293 + Part::Subtask(p) => &p.base.$field, 294 + Part::Compaction(p) => &p.base.$field, 295 + Part::Retry(p) => &p.base.$field, 296 + } 297 + }; 298 + } 299 + 300 + impl Part { 301 + pub fn id(&self) -> &PartId { 302 + part_accessors!(self, id) 303 + } 304 + 305 + pub fn message_id(&self) -> &MessageId { 306 + part_accessors!(self, message_id) 307 + } 308 + 309 + pub fn session_id(&self) -> &SessionId { 310 + part_accessors!(self, session_id) 311 + } 312 + } 313 + ``` 314 + 315 + **Impact:** Easier to add new accessors, reduced boilerplate, still type-safe 316 + 317 + --- 318 + 319 + ## Files That Are Already Well-Structured 320 + 321 + ### `src/index.rs` 322 + Already refactored with `build_session` and `build_message` extracted from `build`. This is the pattern to follow. 323 + 324 + ### `src/materializer.rs` 325 + All methods are small (<15 lines) and focused on single responsibilities. No changes needed. 326 + 327 + ### `src/loader.rs` 328 + Methods are appropriately sized and well-named. The delegation pattern to `FileReader` is clean. 329 + 330 + ### `src/error.rs`, `src/lib.rs`, `src/types/mod.rs`, `src/storage/mod.rs` 331 + These are simple module/struct definitions with no complex functions. 332 + 333 + --- 334 + 335 + ## Priority Ranking 336 + 337 + 1. **High:** `reader.rs` - Most duplication, clearest win 338 + 2. **Medium:** `mmap.rs` - Improves lock management clarity 339 + 3. **Low:** `paths.rs` - Minor improvement 340 + 4. **Low:** `part.rs` - Cosmetic, but reduces maintenance burden 341 + 342 + --- 343 + 344 + ## Recommended Next Steps 345 + 346 + 1. Start with `reader.rs` - extract `list_ids_in_dir` generic helper 347 + 2. Consider `mmap.rs` refactoring for better cache operation separation 348 + 3. Add `session_id()` accessor to `Part` using the macro pattern