Commit 8a1c45d

[IR] Reformulate LLVM's EH funclet IR
While we have successfully implemented a funclet-oriented EH scheme on
top of LLVM IR, our scheme has some notable deficiencies:
- catchendpad and cleanupendpad are necessary in the current design
  but they are difficult to explain to others, even to seasoned LLVM
  experts.
- catchendpad and cleanupendpad are optimization barriers. They cannot
  be split and force all potentially throwing call-sites to be invokes.
  This has a noticeable effect on the quality of our code generation.
- catchpad, while similar in some aspects to invoke, is fairly awkward.
  It is unsplittable, starts a funclet, and has control flow to other
  funclets.
- The nesting relationship between funclets is currently a property of
  control flow edges. Because of this, we are forced to carefully
  analyze the flow graph to see if there might potentially exist illegal
  nesting among funclets. While we have logic to clone funclets when
  they are illegally nested, it would be nicer if we had a
  representation which forbade them upfront.

Let's clean this up a bit by doing the following:
- Instead, make catchpad more like cleanuppad and landingpad: no control
  flow, just a bunch of simple operands; catchpad would be splittable.
- Introduce catchswitch, a control flow instruction designed to model
  the constraints of funclet oriented EH.
- Make funclet scoping explicit by having funclet instructions consume
  the token produced by the funclet which contains them.
- Remove catchendpad and cleanupendpad. Their presence can be inferred
  implicitly using coloring information.

N.B. The state numbering code for the CLR has been updated but the
veracity of its output cannot be spoken for. An expert should take a
look to make sure the results are reasonable.

Reviewers: rnk, JosephTremoulet, andrew.w.kaylor

Differential Revision: http://reviews.llvm.org/D15139

llvm-svn: 255422
1 parent a38312a commit 8a1c45d

106 files changed, +3078 -6086 lines changed

llvm/docs/ExceptionHandling.rst

+102-49
@@ -522,16 +522,12 @@ table.
 Exception Handling using the Windows Runtime
 =================================================
 
-(Note: Windows C++ exception handling support is a work in progress and is not
-yet fully implemented. The text below describes how it will work when
-completed.)
-
 Background on Windows exceptions
 ---------------------------------
 
-Interacting with exceptions on Windows is significantly more complicated than on
-Itanium C++ ABI platforms. The fundamental difference between the two models is
-that Itanium EH is designed around the idea of "successive unwinding," while
+Interacting with exceptions on Windows is significantly more complicated than
+on Itanium C++ ABI platforms. The fundamental difference between the two models
+is that Itanium EH is designed around the idea of "successive unwinding," while
 Windows EH is not.
 
 Under Itanium, throwing an exception typically involes allocating thread local
@@ -618,10 +614,11 @@ purposes.
 
 The following new instructions are considered "exception handling pads", in that
 they must be the first non-phi instruction of a basic block that may be the
-unwind destination of an invoke: ``catchpad``, ``cleanuppad``, and
-``terminatepad``. As with landingpads, when entering a try scope, if the
+unwind destination of an EH flow edge:
+``catchswitch``, ``catchpad``, ``cleanuppad``, and ``terminatepad``.
+As with landingpads, when entering a try scope, if the
 frontend encounters a call site that may throw an exception, it should emit an
-invoke that unwinds to a ``catchpad`` block. Similarly, inside the scope of a
+invoke that unwinds to a ``catchswitch`` block. Similarly, inside the scope of a
 C++ object with a destructor, invokes should unwind to a ``cleanuppad``. The
 ``terminatepad`` instruction exists to represent ``noexcept`` and throw
 specifications with one combined instruction. All potentially throwing calls in
@@ -634,26 +631,20 @@ generated funclet). A catch handler which reaches its end by normal execution
 executes a ``catchret`` instruction, which is a terminator indicating where in
 the function control is returned to. A cleanup handler which reaches its end
 by normal execution executes a ``cleanupret`` instruction, which is a terminator
-indicating where the active exception will unwind to next. A catch or cleanup
-handler which is exited by another exception being raised during its execution will
-unwind through a ``catchendpad`` or ``cleanuupendpad`` (respectively). The
-``catchendpad`` and ``cleanupendpad`` instructions are considered "exception
-handling pads" in the same sense that ``catchpad``, ``cleanuppad``, and
-``terminatepad`` are.
-
-Each of these new EH pad instructions has a way to identify which
-action should be considered after this action. The ``catchpad`` and
-``terminatepad`` instructions are terminators, and have a label operand considered
-to be an unwind destination analogous to the unwind destination of an invoke. The
-``cleanuppad`` instruction is different from the other two in that it is not a
-terminator. The code inside a cleanuppad runs before transferring control to the
-next action, so the ``cleanupret`` and ``cleanupendpad`` instructions are the
-instructions that hold a label operand and unwind to the next EH pad. All of
-these "unwind edges" may refer to a basic block that contains an EH pad instruction,
-or they may simply unwind to the caller. Unwinding to the caller has roughly the
-same semantics as the ``resume`` instruction in the ``landingpad`` model. When
-inlining through an invoke, instructions that unwind to the caller are hooked
-up to unwind to the unwind destination of the call site.
+indicating where the active exception will unwind to next.
+
+Each of these new EH pad instructions has a way to identify which action should
+be considered after this action. The ``catchswitch`` and ``terminatepad``
+instructions are terminators, and have a unwind destination operand analogous
+to the unwind destination of an invoke. The ``cleanuppad`` instruction is not
+a terminator, so the unwind destination is stored on the ``cleanupret``
+instruction instead. Successfully executing a catch handler should resume
+normal control flow, so neither ``catchpad`` nor ``catchret`` instructions can
+unwind. All of these "unwind edges" may refer to a basic block that contains an
+EH pad instruction, or they may unwind to the caller. Unwinding to the caller
+has roughly the same semantics as the ``resume`` instruction in the landingpad
+model. When inlining through an invoke, instructions that unwind to the caller
+are hooked up to unwind to the unwind destination of the call site.
 
 Putting things together, here is a hypothetical lowering of some C++ that uses
 all of the new IR instructions:
@@ -694,33 +685,95 @@ all of the new IR instructions:
     call void @"\01??_DCleanup@@QEAA@XZ"(%struct.Cleanup* nonnull %obj) nounwind
     br label %return
 
-  return: ; preds = %invoke.cont.2, %invoke.cont.3
-    %retval.0 = phi i32 [ 0, %invoke.cont.2 ], [ %9, %catch ]
+  return: ; preds = %invoke.cont.3, %invoke.cont.2
+    %retval.0 = phi i32 [ 0, %invoke.cont.2 ], [ %3, %invoke.cont.3 ]
     ret i32 %retval.0
 
-  ; EH scope code, ordered innermost to outermost:
-
-  lpad.cleanup: ; preds = %invoke.cont
-    %cleanup = cleanuppad []
-    call void @"\01??_DCleanup@@QEAA@XZ"(%struct.Cleanup* nonnull %obj) nounwind
-    cleanupret %cleanup unwind label %lpad.catch
+  lpad.cleanup: ; preds = %invoke.cont.2
+    %0 = cleanuppad within none []
+    call void @"\01??1Cleanup@@QEAA@XZ"(%struct.Cleanup* nonnull %obj) nounwind
+    cleanupret %0 unwind label %lpad.catch
 
-  lpad.catch: ; preds = %entry, %lpad.cleanup
-    %catch = catchpad [%rtti.TypeDescriptor2* @"\01??_R0H@8", i32 0, i32* %e]
-            to label %catch.body unwind label %catchend
+  lpad.catch: ; preds = %lpad.cleanup, %entry
+    %1 = catchswitch within none [label %catch.body] unwind label %lpad.terminate
 
   catch.body: ; preds = %lpad.catch
+    %catch = catchpad within %1 [%rtti.TypeDescriptor2* @"\01??_R0H@8", i32 0, i32* %e]
     invoke void @"\01?may_throw@@YAXXZ"()
-            to label %invoke.cont.3 unwind label %catchend
+            to label %invoke.cont.3 unwind label %lpad.terminate
 
   invoke.cont.3: ; preds = %catch.body
-    %9 = load i32, i32* %e, align 4
-    catchret %catch to label %return
+    %3 = load i32, i32* %e, align 4
+    catchret from %catch to label %return
+
+  lpad.terminate: ; preds = %catch.body, %lpad.catch
+    terminatepad within none [void ()* @"\01?terminate@@YAXXZ"] unwind to caller
+  }
+
+Funclet parent tokens
+-----------------------
 
-  catchend: ; preds = %lpad.catch, %catch.body
-    catchendpad unwind label %lpad.terminate
+In order to produce tables for EH personalities that use funclets, it is
+necessary to recover the nesting that was present in the source. This funclet
+parent relationship is encoded in the IR using tokens produced by the new "pad"
+instructions. The token operand of a "pad" or "ret" instruction indicates which
+funclet it is in, or "none" if it is not nested within another funclet.
 
-  lpad.terminate: ; preds = %catchend
-    terminatepad [void ()* @"\01?terminate@@YAXXZ"]
-            unwind to caller
+The ``catchpad`` and ``cleanuppad`` instructions establish new funclets, and
+their tokens are consumed by other "pad" instructions to establish membership.
+The ``catchswitch`` instruction does not create a funclet, but it produces a
+token that is always consumed by its immediate successor ``catchpad``
+instructions. This ensures that every catch handler modelled by a ``catchpad``
+belongs to exactly one ``catchswitch``, which models the dispatch point after a
+C++ try. The ``terminatepad`` instruction cannot contain lexically nested
+funclets inside the termination action, so it does not produce a token.
+
+Here is an example of what this nesting looks like using some hypothetical
+C++ code:
+
+.. code-block:: c
+
+  void f() {
+    try {
+      throw;
+    } catch (...) {
+      try {
+        throw;
+      } catch (...) {
+      }
+    }
   }
+
+.. code-block:: llvm
+  define void @f() #0 personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
+  entry:
+    invoke void @_CxxThrowException(i8* null, %eh.ThrowInfo* null) #1
+            to label %unreachable unwind label %catch.dispatch
+
+  catch.dispatch: ; preds = %entry
+    %0 = catchswitch within none [label %catch] unwind to caller
+
+  catch: ; preds = %catch.dispatch
+    %1 = catchpad within %0 [i8* null, i32 64, i8* null]
+    invoke void @_CxxThrowException(i8* null, %eh.ThrowInfo* null) #1
+            to label %unreachable unwind label %catch.dispatch2
+
+  catch.dispatch2: ; preds = %catch
+    %2 = catchswitch within %1 [label %catch3] unwind to caller
+
+  catch3: ; preds = %catch.dispatch2
+    %3 = catchpad within %2 [i8* null, i32 64, i8* null]
+    catchret from %3 to label %try.cont
+
+  try.cont: ; preds = %catch3
+    catchret from %1 to label %try.cont6
+
+  try.cont6: ; preds = %try.cont
+    ret void
+
+  unreachable: ; preds = %catch, %entry
+    unreachable
+  }
+
+The "inner" ``catchswitch`` consumes ``%1`` which is produced by the outer
+catchswitch.
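
As a rough illustration of how the ``within`` tokens in the nested example encode funclet nesting, here is a minimal Python sketch. It is not part of the commit or of LLVM: the regular expression and the ``funclet_parents`` and ``enclosing_funclet`` helpers are hypothetical, and they only handle the simple textual pad forms that appear in the example. It relies on the rule stated in the documentation text that ``catchpad`` and ``cleanuppad`` establish funclets while ``catchswitch`` does not. In the example IR, ``%1`` is defined by a ``catchpad`` instruction, which is what the sketch records as the parent of the inner ``catchswitch``.

```python
import re

# Hypothetical, simplified matcher for the textual pad forms in the example.
# Real LLVM IR should be inspected through LLVM itself, not with a regex.
PAD_RE = re.compile(
    r"(%[\w.]+)\s*=\s*(catchswitch|catchpad|cleanuppad)\s+within\s+(none|%[\w.]+)"
)


def funclet_parents(ir_text):
    """Map each pad token to (opcode, parent token or None for 'none')."""
    parents = {}
    for token, opcode, parent in PAD_RE.findall(ir_text):
        parents[token] = (opcode, None if parent == "none" else parent)
    return parents


def enclosing_funclet(token, parents):
    """Walk parent tokens outward to the nearest funclet-creating pad.

    Only catchpad and cleanuppad establish funclets; a catchswitch is
    skipped and the walk continues with its own parent token.
    """
    _, parent = parents[token]
    while parent is not None:
        opcode, grandparent = parents[parent]
        if opcode in ("catchpad", "cleanuppad"):
            return parent
        parent = grandparent
    return None


# The four pad instructions from the nested try/catch example.
EXAMPLE = """
%0 = catchswitch within none [label %catch] unwind to caller
%1 = catchpad within %0 [i8* null, i32 64, i8* null]
%2 = catchswitch within %1 [label %catch3] unwind to caller
%3 = catchpad within %2 [i8* null, i32 64, i8* null]
"""

parents = funclet_parents(EXAMPLE)
for tok in sorted(parents):
    print(tok, parents[tok][0], "enclosed by", enclosing_funclet(tok, parents))
```

Under these assumptions, the outer ``catchswitch`` and ``catchpad`` are not enclosed by any funclet, while the inner ``catchswitch`` and ``catchpad`` both resolve to the funclet started by ``%1``.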
