Redesigned Components iterator to use front and back indexing instead mutating and subslicing path field#156496
Redesigned Components iterator to use front and back indexing instead mutating and subslicing path field#156496asder8215 wants to merge 9 commits into
Components iterator to use front and back indexing instead mutating and subslicing path field#156496Conversation
|
rustbot has assigned @Mark-Simulacrum. Use Why was this reviewer chosen?The reviewer was selected based on:
|
This comment has been minimized.
This comment has been minimized.
1627e2f to
33e69e1
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
33e69e1 to
ed9d33d
Compare
This comment has been minimized.
This comment has been minimized.
ed9d33d to
0b0f84c
Compare
This comment has been minimized.
This comment has been minimized.
0b0f84c to
8ed33ea
Compare
This comment was marked as outdated.
This comment was marked as outdated.
2151b8f to
83cdbed
Compare
This comment was marked as outdated.
This comment was marked as outdated.
83cdbed to
3921fff
Compare
0a25dda to
92e0132
Compare
|
New benchmarking results. You can see what the benchmark code looks like here and run it yourself to see if there are any difference in measurements on your end: This is the measurement of the current implementation of This is the measurement of the new implementation of Edit: Updated |
|
Here are the benchmark results with black box: From current From this Edit: Updated Edit 2: Took off Path ordering benchmark here since it was incorrect see below to see corrected path ordering benchmarks. |
92e0132 to
574d7f2
Compare
|
I'm confident this code works (passed CI in previous run, the current amended commit change I made doesn't change logic, but makes the code written in a more idiomatic way). In my opinion, the logic in this code should look more readable than how |
|
@rustbot label +I-libs-nominated Since |
| && prefix.is_verbatim() | ||
| && !path.inner.is_empty() | ||
| { | ||
| let comps = self.components(); |
There was a problem hiding this comment.
Not sure how likely the redundant prefix-checking is to get elided here in optimisation, but perhaps the fact that you need to recreate the components iterator again might be a reason to just start with that, maybe gated by HAS_PREFIXES.
There was a problem hiding this comment.
I have the components iterator created earlier in the Windows/HAS_PREFIXES path. Unix and all the other platforms with !HAS_PREFIXES don't need to rely on components iterator at all.
With redundant prefix checking, the only optimization I could think about that is to have a parse_prefix_unchecked method that constructs a Prefix<'_> given two OsStr and something like a prefix tag enum (same enum names as the members in Prefix just without the field info that these members have) (though maybe a better name than parse should be used here since it's constructing from given arguments instead of parsing).
|
So, after the investigation I wanted to do, it does seem that we have a bit of confusion regarding how Can probably move onto verifying perf after addressing a few comments. |
|
Will address these comments when I get back to my PC! I also wanted to check in with the team that handles Rust releases (maybe also Cargo or Rustup team) to verify if I'm building |
53dd52e to
2ae818f
Compare
This comment has been minimized.
This comment has been minimized.
… of mutating and subslicing path field; as a result, Components iterator memory size goes from 64 bytes to 40 bytes and as_path does not use cloning at all
…ity, added safety comments, and check for root dir after Prefix component (e.g., '\\?\checkout\src\tools' should produce Prefix, RootDir, Normal, Normal, None, ...) in Components::parse_single_component
…ng iter().position()/rposition()
…here to use iter().position()/.iter().rposition(), refactored code in compare_components, and removed stale comments
…omponents::normalize_back instead, refactored Components::as_path code
…e in previous implementation, but making it work with Components<'_> front index
…nt_front and consume_first_component_back. Also introduced aggressive inlining.
…comparison (normalizing paths if needed) and return Ordering Equal/Greater/Less if possible before needing to fall back on Iterator::cmp
2ae818f to
21ccd83
Compare
|
This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers. |
|
Just occurred to me that a pretty good benchmark to represent what I was doing (where the performance was so bad) would be to aggregate a few file lists based upon the contents of various filesystems (I'm thinking, docker images), shuffling them, and storing them in a (And if we want to only benchmark equality, not ordering, using I would be down to collect those file lists, and writing the code would be pretty easy for a benchmark. Apparently there are some benchmarks in the standard library via Will try to work on a proper benchmark crate later if I remember. |
… make it simpler or have it documented in the appropriate areas, and renamed a couple of methods/field members, and simplified Components::next_back relative first component branch
21ccd83 to
c10aaf6
Compare
That sounds good to me. I got some assistance from Jakub on Zulip a few days ago and I built On Debug build: rustc 1.98.0 nightly compiler On release build: rustc 1.98.0 nightly compiler There's a marginal improvement from the PGO optimized Components build on release (~3% improvement on average). However, I'm unsure if my cargo build time is a perfect comparison because it's possible that this PR rustc may not match with 1.98 rustc since my branch for this PR may be outdated in terms of other functions or features added to the standard library within 1.98.
Sure, I'll leave you to that. I'm curious to see if there would be any big performance differences. On a separate note, I was thinking about the ACP suggestion you mentioned earlier, and I was thinking that this Going back to the ACP suggestion idea, I think we can just do something similar to how In a future/separate PR, it would also be beneficial to put the |
View all comments
This PR entirely changes how
Components<'_>is implemented. Currently, theComponents<'_>iterator 'consumes' components through mutating its path field to a subslice that presents the left over unconsumed path components (this consumed path component is what's returned inComponents::nextorComponents::next_back). However, this PR keeps the path field alive/unmodified and uses front and back indexing strategy to extract consumed/unconsumed components.This PR benefits implementations like
Components::as_path, which is pretty used is multiple areas of the standard library. Previously,Components<'_>iterator was required to clone inside the function to present the unconsumed path because our originalComponent<'_>consuming behavior on path will not allow the returned&'a PathfromComponents::as_pathto last after aComponents::nextorComponents::next_backcall. Due to the current implementation ofComponentsiterator has a size of 64 bytes, if you're usingComponents::as_pathafter eachComponents::next/Components::next_back, then it's pretty unfortunate to be cloning 64 bytes again and again, especially if each of your path components are a few bytes (e.g., "foo/bar/baz").On the point of size, with the indexing strategy, this PR has further optimized the size of
Components<'_>from 64 bytes -> 40 bytes since a large chunk of theComponents<'_>was taken up by theOption<Prefix>(this takes up 40 bytes). Instead of holding a prefix field inComponents<'_>, we can encode the length of thePrefixwithin ourfrontfield index and use another enum calledFirstComponentto check whether our first component of the given path isPrefix(or something else). If it's aPrefix, we can useparse_prefixon the subsliceself.path[..self.front]since we know our front index encodes thePrefixlength.Due to not having the prefix
Option<Prefix>field insideComponents<'_>anymore, all the prefix functions inComponents<'_>have been removed in favor of callingparse_prefix,Prefix::is_verbatim,Prefix::is_drive, etc.I'm curious if this redesign of
Components<'_>improves Path equality as pointed out by @clarfonthey in #154521 with Path equality (not to be confused with Path ordering as mentioned in the issue, since that usesComponents:::compare_componentsand the example code shows equality) being slow.I haven't benchmarked this though.I have benchmarked the result and I can say that currently this implementation improves Path equality due toComponents::next_backrunning faster with this implementation than the current mutating path with a subslice implementation. However, Path ordering runs slightly slower. You can check the benchmark code I used here, and play around with the number of bytes in a component, the number of components, etc..Right now, when I tested it locally on my PC (Fedora OS), it passed all the standard library tests and rust analyzer didn't crash on me (had a few crash reports coming from rust analyzer early on when I messed around with
Components<'_>dealing something with threads usingPath::components, but now that's all resolved).I have not tested this on Windows yet, and I would probably need someone to help me test on this platform as my Windows VM is not working properly to run the standard library test suite.There's a lot of things being done here, and possibly there may be better approaches or ways I could improve this implementation or write the code in a neater way here. I am open to any advice or feedback on this approach.
Update: I got to testing some things out with Prefixes on my Windows VM manually, so the prefix component index encoded into the
Components<'_>front field seems to work out nicely. I've also accounted for root directory being able to exists after a Prefix component like "\?\checkout\src\tools" having the following components:PrefixVerbatim->RootDir->Normal->Normal->None(learnt this from the fail that occurred in miri tests, which is nice to see thisComponents<'_>implementation works on the Windows tests in CI).