Commit a7d5bf4

fix: improve .gitignore iteration speed (scikit-build#1103)
Using `os.walk` instead of `Path().rglob` can be 10x faster in deeply nested directory structures. This is a significant saving on certain projects, where just this iteration can take up to 3 or 4 seconds.

An even better approach would be to parse gitignore files lazily, as the directory tree is traversed later in this function. Another optimization would be to prune ignored directories while walking the directory tree, so they aren't even explored.

However, just the single optimization in this PR already reduces the impact of this code significantly, bringing it from almost 3s in my profile to just 0.4s.
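
The pruning idea mentioned in the message can be sketched with the standard library alone. This is a minimal illustration, not code from this commit: `walk_pruned` and the `is_ignored_dir` callback are hypothetical names, and a real version would presumably consult the parsed gitignore specs instead of a plain predicate.

```python
import os
from pathlib import Path
from typing import Callable, Iterator


def walk_pruned(root: str, is_ignored_dir: Callable[[Path], bool]) -> Iterator[Path]:
    """Yield files under root, never descending into ignored directories.

    Hypothetical helper for illustration; not part of scikit-build-core.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        # Slice assignment edits the list in place. With the default
        # topdown=True, os.walk skips every directory removed here.
        dirnames[:] = [d for d in dirnames if not is_ignored_dir(Path(dirpath, d))]
        for filename in filenames:
            yield Path(dirpath, filename)
```

Because the ignored subtree is never listed, the cost of the walk scales with the number of kept directories, which is where the saving would come from on trees with large ignored folders.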
1 parent faad1bb commit a7d5bf4

1 file changed: +5 -4 lines changed

src/scikit_build_core/build/_file_processor.py

Lines changed: 5 additions & 4 deletions
@@ -46,11 +46,12 @@ def each_unignored_file(
             global_exclude_lines += f.readlines()
 
     nested_excludes = {
-        p.parent: pathspec.GitIgnoreSpec.from_lines(
-            p.read_text(encoding="utf-8").splitlines()
+        Path(dirpath): pathspec.GitIgnoreSpec.from_lines(
+            (Path(dirpath) / filename).read_text(encoding="utf-8").splitlines()
         )
-        for p in Path().rglob("**/.gitignore")
-        if p != Path(".gitignore")
+        for dirpath, _, filenames in os.walk(".")
+        for filename in filenames
+        if filename == ".gitignore" and dirpath != "."
     }
 
     exclude_build_dir = build_dir.format(**pyproject_format(dummy=True))
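
To see that the two traversal styles in the diff collect the same nested `.gitignore` files, here is a standalone sketch using only the standard library. The function names are hypothetical, the file contents are returned as plain line lists instead of `pathspec.GitIgnoreSpec` objects, and an explicit `root` argument replaces the current working directory used in the real code.

```python
from __future__ import annotations

import os
from pathlib import Path


def nested_gitignores_rglob(root: Path) -> dict[Path, list[str]]:
    """Old style: let pathlib search recursively for .gitignore files."""
    return {
        p.parent.relative_to(root): p.read_text(encoding="utf-8").splitlines()
        for p in root.rglob(".gitignore")
        if p.parent != root  # skip the top-level .gitignore
    }


def nested_gitignores_walk(root: Path) -> dict[Path, list[str]]:
    """New style: walk the tree once and pick out .gitignore by name."""
    return {
        Path(dirpath).relative_to(root): (Path(dirpath) / filename)
        .read_text(encoding="utf-8")
        .splitlines()
        for dirpath, _, filenames in os.walk(root)
        for filename in filenames
        if filename == ".gitignore" and Path(dirpath) != root
    }
```

Both functions key the result by the directory that contains each `.gitignore`, relative to `root`, which mirrors how `p.parent` and `Path(dirpath)` are used as keys in the diff.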
