This repository was archived by the owner on Nov 1, 2021. It is now read-only.

Commit 131d7fb

[PerformanceTips] Document various items folks have suggested
This could stand to be expanded - patches welcome! - but let's at least write them down so they don't get forgotten.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230995 91177308-0d34-0410-b5e6-96231b3b80d8
1 parent de83324 commit 131d7fb

1 file changed: +45 -0 lines changed

docs/Frontend/PerformanceTips.rst

@@ -47,6 +47,51 @@ operations for safety. If your source language provides information about
the range of the index, you may wish to manually extend indices to machine
register width using a zext instruction.

Other things to consider
=========================

#. Make sure that a DataLayout is provided (this will likely become required in
   the near future, but is certainly important for optimization).

#. Add nsw/nuw/fast-math flags as appropriate

#. Add noalias/align/dereferenceable/nonnull to function arguments and return
   values as appropriate

#. Mark functions as readnone/readonly/nounwind when known (especially for
   external functions)

#. Use ptrtoint/inttoptr sparingly (they interfere with pointer aliasing
   analysis); prefer GEPs

#. Use the lifetime.start/lifetime.end and invariant.start/invariant.end
   intrinsics where possible. Common profitable uses are for stack-like data
   structures (thus allowing dead store elimination) and for describing
   lifetimes of allocas (thus allowing smaller stack sizes).

#. Use pointer aliasing metadata, especially tbaa metadata, to communicate
   otherwise-non-deducible pointer aliasing facts

#. Use the "most-private" possible linkage types for the functions being defined
   (private, internal or linkonce_odr preferably)

#. Mark invariant locations using !invariant.load and TBAA's constant flags

#. Prefer globals over inttoptr of a constant address - this gives you
   dereferenceability information. In MCJIT, use getSymbolAddress to provide
   the actual address.

#. Be wary of ordered and atomic memory operations. They are hard to optimize
   and may not be well optimized by the current optimizer. Depending on your
   source language, you may consider using fences instead.

#. If your language uses range checks, consider using the IRCE pass. It is not
   currently part of the standard pass order.
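
As a rough illustration of a few of the items above (a DataLayout, nsw,
argument attributes, function attributes, GEPs and linkage), a frontend might
emit IR along the following lines. This is only a sketch: the function names,
the datalayout string (a typical x86-64 Linux one) and the attribute values are
assumptions made up for the example, and the typed-pointer syntax is the one in
use around the time of this commit; newer LLVM releases spell pointer types
differently.

.. code-block:: llvm

  ; Hypothetical module; names and values are illustrative only.
  target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

  ; External function assumed to only read memory and never unwind.
  declare i32 @lookup(i32*) readonly nounwind

  ; internal linkage: the function is not visible outside this module.
  define internal i32 @sum_pair(i32* noalias nonnull dereferenceable(8) align 4 %p) nounwind {
  entry:
    %a = load i32, i32* %p, align 4
    ; Address the second element with a GEP rather than ptrtoint/inttoptr.
    %q = getelementptr inbounds i32, i32* %p, i64 1
    %b = load i32, i32* %q, align 4
    ; nsw: the frontend is assumed to know this add cannot overflow as signed.
    %s = add nsw i32 %a, %b
    ret i32 %s
  }

Each flag or attribute here is a promise made by the frontend, so it should
only be emitted when the source language actually guarantees the property.
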

p.s. If you want to help improve this document, patches expanding any of the
above items into standalone sections of their own with a more complete
discussion would be very welcome.

Adding to this document
=======================
