-
-
Notifications
You must be signed in to change notification settings - Fork 31.9k
GH-113464: Generate a more efficient JIT #118512
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@savannahostrowski, I'd love to get your review of this if you have a few cycles. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of comments and questions but after sitting and reading through this code a bunch over the last week or two, I'm excited about how much more readable this will get with this change! 💆♀️
yield "" | ||
|
||
|
||
def dump(groups: dict[str, _stencils.StencilGroup]) -> typing.Iterator[str]: | ||
"""Yield a JIT compiler line-by-line as a C header file.""" | ||
yield from _dump_header() | ||
for opname, group in groups.items(): | ||
for opname, group in sorted(groups.items()): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason that this needs to be sorted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, I just like it that way (if you couldn't tell by now). ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding in the comment about the naming conventions - I think that helps! Otherwise, this looks pretty solid to me (barring some Windows CI failures). Lots of moving things into function but it's a whole lot more readable! 🎉
Windows JIT CI fixed in GH-118564. |
This breaks up the JIT into smaller functions, reduces a lot of branching in hot inner loops, and generally makes the C code cleaner (and probably faster).
Currently, we generate declarative structures at build time that we then loop over in order to emit the desired machine code at runtime. For example, the
_STORE_FAST
stencil looks like this:This very general approach means that we have a lot of complex logic in our hot inner loop to decode instructions and set up values for patching that may not even be needed. It's also very branchy, since we're essentially "interpreting" the array of holes for each instruction.
With this PR,
jit_stencils.h
instead contains the following function:This function is called directly to emit the machine code for every
_STORE_FAST
instruction, and hardcodes the logic for all of the necessary copies and patches. The result is one indirect call, no unnecessary branching, and (in my opinion) cleaner code, since a lot of the tricky logic is now hidden away in generated files.I know this is right before feature freeze, but I'd really like to get this in 3.13 since it will make backporting any fixes much easier. It doesn't change the actual jitted code in any way.
Note to reviewers: the diff is a bit messy, so it may make more sense to compare the before-vs-after files side-by-side instead.