this repo has no description
0
fork

Configure Feed

Select the types of activity you want to include in your feed.

internal/mod/modimports: avoid extra fs.Stat calls in AllModuleFiles

As reported by a user on a large repository with many hundreds
of directories, CUE_EXPERIMENT=modules now being the default
caused a sudden increase in the number of file operations being done
by cmd/cue when loading packages, causing a slow-down.

We can easily reproduce the issue via strace in our CUE repository,
using `cue fmt ./internal/ci` as an example to load one CUE package.
Where we used to only stat or open 8 cue.mod files, we now do 360:

$ CUE_EXPERIMENT=modules=0 strace -f -t -e trace=file cue fmt ./internal/ci |& grep 'cue\.mod"' | wc -l
8
$ strace -f -t -e trace=file cue fmt ./internal/ci |& grep 'cue\.mod"' | wc -l
360

The culprit turned out to be AllModuleFiles; the way it needs to skip
walking nested CUE modules did not fit well with the fs.WalkDir API.
Using fs.ReadDir and recursive func calls instead works better:

$ strace -f -t -e trace=file cue fmt ./internal/ci |& grep 'cue\.mod"' | wc -l
8

Updates #3155.

Signed-off-by: Daniel Martí <mvdan@mvdan.cc>
Change-Id: I32dc0c39ea795f795077feff88475d38cb324433
Reviewed-on: https://review.gerrithub.io/c/cue-lang/cue/+/1194921
Reviewed-by: Paul Jolly <paul@myitcv.io>
Unity-Result: CUE porcuepine <cue.porcuepine@gmail.com>
TryBot-Result: CUEcueckoo <cueckoo@cuelang.org>

+39 -38
+39 -38
internal/mod/modimports/modimports.go
··· 1 1 package modimports 2 2 3 3 import ( 4 - "errors" 5 4 "fmt" 6 5 "io/fs" 7 6 "path" ··· 88 87 // module at the given root. 89 88 func AllModuleFiles(fsys fs.FS, root string) func(func(ModuleFile, error) bool) { 90 89 return func(yield func(ModuleFile, error) bool) { 91 - fs.WalkDir(fsys, root, func(fpath string, d fs.DirEntry, err error) (_err error) { 92 - if err != nil { 93 - if !yield(ModuleFile{ 94 - FilePath: fpath, 95 - }, err) { 96 - return fs.SkipAll 97 - } 98 - return nil 99 - } 100 - if path.Base(fpath) == "cue.mod" { 101 - return fs.SkipDir 90 + yieldAllModFiles(fsys, root, true, yield) 91 + } 92 + } 93 + 94 + // yieldAllModFiles implements AllModuleFiles by recursing into directories. 95 + // 96 + // Note that we avoid [fs.WalkDir]; it yields directory entries in lexical order, 97 + // so we would walk `foo/bar.cue` before walking `foo/cue.mod/` and realizing 98 + // that `foo/` is a nested module that we should be ignoring entirely. 99 + // That could be avoided via extra `fs.Stat` calls, but those are extra fs calls. 100 + // Using [fs.ReadDir] avoids this issue entirely, as we can loop twice. 101 + func yieldAllModFiles(fsys fs.FS, fpath string, topDir bool, yield func(ModuleFile, error) bool) bool { 102 + entries, err := fs.ReadDir(fsys, fpath) 103 + if err != nil { 104 + if !yield(ModuleFile{ 105 + FilePath: fpath, 106 + }, err) { 107 + return false 108 + } 109 + } 110 + // Skip nested submodules entirely. 111 + if !topDir { 112 + for _, entry := range entries { 113 + if entry.Name() == "cue.mod" { 114 + return true 102 115 } 103 - if d.IsDir() { 104 - if fpath == root { 105 - return nil 106 - } 107 - base := path.Base(fpath) 108 - if strings.HasPrefix(base, ".") || strings.HasPrefix(base, "_") { 109 - return fs.SkipDir 110 - } 111 - _, err := fs.Stat(fsys, path.Join(fpath, "cue.mod")) 112 - if err == nil { 113 - // TODO is it enough to have a cue.mod directory 114 - // or should we look for cue.mod/module.cue too? 115 - return fs.SkipDir 116 - } 117 - if !errors.Is(err, fs.ErrNotExist) { 118 - // We haven't got a package file to produce with the 119 - // error here. Should we just ignore the error or produce 120 - // a ModuleFile with an empty path? 121 - yield(ModuleFile{}, err) 122 - return fs.SkipAll 123 - } 124 - return nil 116 + } 117 + } 118 + for _, entry := range entries { 119 + name := entry.Name() 120 + fpath := path.Join(fpath, name) 121 + if entry.IsDir() { 122 + if name == "cue.mod" || strings.HasPrefix(name, ".") || strings.HasPrefix(name, "_") { 123 + continue 125 124 } 126 - if !yieldPackageFile(fsys, fpath, "*", yield) { 127 - return fs.SkipAll 125 + if !yieldAllModFiles(fsys, fpath, false, yield) { 126 + return false 128 127 } 129 - return nil 130 - }) 128 + } else if !yieldPackageFile(fsys, fpath, "*", yield) { 129 + return false 130 + } 131 131 } 132 + return true 132 133 } 133 134 134 135 func yieldPackageFile(fsys fs.FS, fpath, pkgQualifier string, yield func(ModuleFile, error) bool) bool {