
Commit 8b6c314

[TableGen][SubtargetEmitter] Add the ability for processor models to describe dependency breaking instructions.
This patch adds the ability for processor models to describe dependency breaking instructions.

Different processors may specify a different set of dependency-breaking instructions. That means we cannot assume that all processors of the same target use the same rules to classify dependency breaking instructions.

The main goal of this patch is to provide the means to describe dependency breaking instructions directly via tablegen, and to have the following TargetSubtargetInfo hooks overridden by the tablegen'd XXXGenSubtargetInfo classes (here, XXX is a Target name).

```
virtual bool isZeroIdiom(const MachineInstr *MI, APInt &Mask) const {
  return false;
}

virtual bool isDependencyBreaking(const MachineInstr *MI, APInt &Mask) const {
  return isZeroIdiom(MI, Mask);
}
```

An instruction MI is a dependency-breaking instruction if a call to method isDependencyBreaking(MI) on the STI (TargetSubtargetInfo object) evaluates to true. Similarly, an instruction MI is a special case of zero-idiom dependency breaking instruction if a call to STI.isZeroIdiom(MI) returns true. The extra APInt is used for those targets that may want to select which machine operands have their dependency broken (see comments in code).

Note that by default, subtargets don't know about the existence of dependency-breaking instructions. In the absence of external information, those method calls always return false.

A new tablegen class named STIPredicate has been added by this patch to let processor models classify instructions that have properties in common. The idea is that an MCInstPredicate definition can be used to "generate" an instruction equivalence class, where instructions of the same class all have a property in common.

STIPredicate definitions are essentially a collection of instruction equivalence classes. Also, different processor models can specify a different variant of the same STIPredicate with different rules (i.e. predicates) to classify instructions.

Tablegen backends (in this particular case, the SubtargetEmitter) will be able to process STIPredicate definitions, and automatically generate functions in XXXGenSubtargetInfo.

This patch introduces two special kinds of STIPredicate classes named IsZeroIdiomFunction and IsDepBreakingFunction in tablegen. It also adds a definition for those in the BtVer2 scheduling model only.

This patch supersedes the one committed at r338372 (phabricator review: D49310).

The main advantages are:
- We can describe subtarget predicates via tablegen using STIPredicates.
- We can describe zero-idioms / dep-breaking instructions directly via tablegen in the scheduling models.

In future, the STIPredicates framework can be used for solving other problems. Examples of future developments are:
- Teach how to identify optimizable register-register moves.
- Teach how to identify slow LEA instructions (each subtarget defining its own concept of "slow" LEA).
- Teach how to identify instructions that have undocumented false dependencies on the output registers on some processors only.

It is also (in my opinion) an elegant way to expose knowledge to both external tools like llvm-mca, and codegen passes. For example, machine schedulers in LLVM could reuse that information when internally constructing the data dependency graph for a code region.

This new design feature is also an "opt-in" feature. Processor models don't have to use the new STIPredicates. It has all been designed to be as unintrusive as possible.

Differential Revision: https://reviews.llvm.org/D52174
llvm-svn: 342555
1 parent 4fd2e2a commit 8b6c314

13 files changed, +1164 -92 lines changed

llvm/include/llvm/CodeGen/TargetSubtargetInfo.h

Lines changed: 26 additions & 0 deletions
@@ -14,6 +14,7 @@
 #ifndef LLVM_CODEGEN_TARGETSUBTARGETINFO_H
 #define LLVM_CODEGEN_TARGETSUBTARGETINFO_H

+#include "llvm/ADT/APInt.h"
 #include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringRef.h"
@@ -144,6 +145,31 @@ class TargetSubtargetInfo : public MCSubtargetInfo {
     return 0;
   }

+  /// Returns true if \param MI is a dependency breaking zero-idiom instruction
+  /// for the subtarget.
+  ///
+  /// This function also sets bits in \param Mask related to input operands that
+  /// are not in a data dependency relationship. There is one bit for each
+  /// machine operand; implicit operands follow explicit operands in the bit
+  /// representation used for \param Mask. An empty \param Mask (i.e. a mask
+  /// with all bits cleared) means: data dependencies are "broken" for all the
+  /// explicit input machine operands of \param MI.
+  virtual bool isZeroIdiom(const MachineInstr *MI, APInt &Mask) const {
+    return false;
+  }
+
+  /// Returns true if \param MI is a dependency breaking instruction for the
+  /// subtarget.
+  ///
+  /// Similar in behavior to `isZeroIdiom`. However, it knows how to identify
+  /// all dependency breaking instructions (i.e. not just zero-idioms).
+  ///
+  /// As for `isZeroIdiom`, this method returns a mask of "broken" dependencies.
+  /// (See method `isZeroIdiom` for a detailed description of \param Mask).
+  virtual bool isDependencyBreaking(const MachineInstr *MI, APInt &Mask) const {
+    return isZeroIdiom(MI, Mask);
+  }
+
   /// True if the subtarget should run MachineScheduler after aggressive
   /// coalescing.
   ///
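
The mask convention documented in the hunk above can be illustrated with a small standalone sketch. It does not use LLVM: `OperandMask` is a hypothetical stand-in for `llvm::APInt`, and `isDependencyBroken` is an illustrative helper that is not part of this patch. It assumes the hook has already returned true, and that operands without a corresponding bit remain data dependent (as stated in the MC-layer comments of this commit).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for llvm::APInt: one entry per input operand,
// explicit operands first, implicit operands after them.
using OperandMask = std::vector<bool>;

// Illustrative helper (not part of the patch). Returns true if the data
// dependency on input operand `Index` is broken, given the mask produced by
// a hook that already returned true. `NumExplicit` is the number of explicit
// input operands of the instruction.
bool isDependencyBroken(const OperandMask &Mask, std::size_t NumExplicit,
                        std::size_t Index) {
  bool AllZero = true;
  for (bool Bit : Mask) {
    if (Bit) {
      AllZero = false;
      break;
    }
  }

  // Special case: a mask with all bits cleared (or no bits at all) means
  // that every explicit input operand is independent.
  if (AllZero)
    return Index < NumExplicit;

  // Operands without a corresponding bit are assumed to be data dependent.
  if (Index >= Mask.size())
    return false;

  return Mask[Index];
}
```

With this reading, a client that builds a dependency graph would skip the read-after-write edge for every input operand for which the helper returns true.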

llvm/include/llvm/MC/MCInstrAnalysis.h

Lines changed: 42 additions & 7 deletions
@@ -88,18 +88,53 @@ class MCInstrAnalysis {
                                     const MCInst &Inst,
                                     APInt &Writes) const;

-  /// Returns true if \param Inst is a dependency breaking instruction for the
+  /// Returns true if \param MI is a dependency breaking zero-idiom for the
   /// given subtarget.
   ///
+  /// \param Mask is used to identify input operands that have their dependency
+  /// broken. Each bit of the mask is associated with a specific input operand.
+  /// Bits associated with explicit input operands are laid out first in the
+  /// mask; implicit operands come after explicit operands.
+  ///
+  /// Dependencies are broken only for operands that have their corresponding bit
+  /// set. Operands that have their bit cleared, or that don't have a
+  /// corresponding bit in the mask don't have their dependency broken.
+  /// Note that \param Mask may not be big enough to describe all operands.
+  /// The assumption for operands that don't have a correspondent bit in the
+  /// mask is that those are still data dependent.
+  ///
+  /// The only exception to the rule is for when \param Mask has all zeroes.
+  /// A zero mask means: dependencies are broken for all explicit register
+  /// operands.
+  virtual bool isZeroIdiom(const MCInst &MI, APInt &Mask,
+                           unsigned CPUID) const {
+    return false;
+  }
+
+  /// Returns true if \param MI is a dependency breaking instruction for the
+  /// subtarget associated with \param CPUID.
+  ///
   /// The value computed by a dependency breaking instruction is not dependent
   /// on the inputs. An example of dependency breaking instruction on X86 is
   /// `XOR %eax, %eax`.
-  /// TODO: In future, we could implement an alternative approach where this
-  /// method returns `true` if the input instruction is not dependent on
-  /// some/all of its input operands. An APInt mask could then be used to
-  /// identify independent operands.
-  virtual bool isDependencyBreaking(const MCSubtargetInfo &STI,
-                                    const MCInst &Inst) const;
+  ///
+  /// If \param MI is a dependency breaking instruction for subtarget \param
+  /// CPUID, then \param Mask can be inspected to identify independent operands.
+  ///
+  /// Essentially, each bit of the mask corresponds to an input operand.
+  /// Explicit operands are laid out first in the mask; implicit operands follow
+  /// explicit operands. Bits are set for operands that are independent.
+  ///
+  /// Note that the number of bits in Mask may not be equivalent to the sum of
+  /// explicit and implicit operands in \param MI. Operands that don't have a
+  /// corresponding bit in Mask are assumed "not independente".
+  ///
+  /// The only exception is for when \param Mask is all zeroes. That means:
+  /// explicit input operands of \param MI are independent.
+  virtual bool isDependencyBreaking(const MCInst &MI, APInt &Mask,
+                                    unsigned CPUID) const {
+    return isZeroIdiom(MI, Mask, CPUID);
+  }

   /// Given a branch instruction try to get the address the branch
   /// targets. Return true on success, and the address in Target.
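
The default delegation from `isDependencyBreaking` to `isZeroIdiom`, together with the per-processor `CPUID` parameter, can be modeled without LLVM. The following is a minimal sketch under stated assumptions: `ToyInst`, `ToyOpcode`, `ToyAnalysisBase`, `ToyAnalysis`, and the value of `ModelWithZeroIdioms` are all hypothetical names invented for illustration, and a `std::uint64_t` replaces `APInt`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an instruction: an opcode and two register inputs.
struct ToyInst {
  unsigned Opcode;
  unsigned Src1;
  unsigned Src2;
};

enum ToyOpcode : unsigned { ToyXOR = 1, ToyADD = 2 };

// Mirrors the shape of the two default hooks: nothing is a zero-idiom, and
// dependency-breaking queries fall back to the zero-idiom query.
struct ToyAnalysisBase {
  virtual ~ToyAnalysisBase() = default;

  virtual bool isZeroIdiom(const ToyInst &, std::uint64_t &, unsigned) const {
    return false;
  }

  virtual bool isDependencyBreaking(const ToyInst &MI, std::uint64_t &Mask,
                                    unsigned CPUID) const {
    return isZeroIdiom(MI, Mask, CPUID);
  }
};

// A target that only overrides isZeroIdiom, and only for one processor model.
struct ToyAnalysis : ToyAnalysisBase {
  static constexpr unsigned ModelWithZeroIdioms = 1; // hypothetical model ID

  bool isZeroIdiom(const ToyInst &MI, std::uint64_t &Mask,
                   unsigned CPUID) const override {
    if (CPUID != ModelWithZeroIdioms)
      return false;
    if (MI.Opcode != ToyXOR || MI.Src1 != MI.Src2)
      return false;
    Mask = 0; // Zero mask: all explicit inputs are independent.
    return true;
  }
};
```

Because of the delegation, a caller that only asks `isDependencyBreaking` still sees every zero-idiom, while a processor model that defines nothing keeps the conservative answer `false`.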

llvm/include/llvm/Target/TargetInstrPredicate.td

Lines changed: 98 additions & 0 deletions
@@ -68,6 +68,7 @@

 // Forward declarations.
 class Instruction;
+class SchedMachineModel;

 // A generic machine instruction predicate.
 class MCInstPredicate;
@@ -230,3 +231,100 @@ class CheckFunctionPredicate<string MCInstFn, string MachineInstrFn> : MCInstPre
   string MCInstFnName = MCInstFn;
   string MachineInstrFnName = MachineInstrFn;
 }
+
+// Used to classify machine instructions based on a machine instruction
+// predicate.
+//
+// Let IC be an InstructionEquivalenceClass definition, and MI a machine
+// instruction. We say that MI belongs to the equivalence class described by IC
+// if and only if the following two conditions are met:
+//  a) MI's opcode is in the `opcodes` set, and
+//  b) `Predicate` evaluates to true when applied to MI.
+//
+// Instances of this class can be used by processor scheduling models to
+// describe instructions that have a property in common. For example,
+// InstructionEquivalenceClass definitions can be used to identify the set of
+// dependency breaking instructions for a processor model.
+//
+// An (optional) list of operand indices can be used to further describe
+// properties that apply to instruction operands. For example, it can be used to
+// identify register uses of a dependency breaking instructions that are not in
+// a RAW dependency.
+class InstructionEquivalenceClass<list<Instruction> opcodes,
+                                  MCInstPredicate pred,
+                                  list<int> operands = []> {
+  list<Instruction> Opcodes = opcodes;
+  MCInstPredicate Predicate = pred;
+  list<int> OperandIndices = operands;
+}
+
+// Used by processor models to describe dependency breaking instructions.
+//
+// This is mainly an alias for InstructionEquivalenceClass. Input operand
+// `BrokenDeps` identifies the set of "broken dependencies". There is one bit
+// per each implicit and explicit input operand. An empty set of broken
+// dependencies means: "explicit input register operands are independent."
+class DepBreakingClass<list<Instruction> opcodes, MCInstPredicate pred,
+                       list<int> BrokenDeps = []>
+  : InstructionEquivalenceClass<opcodes, pred, BrokenDeps>;
+
+// A function descriptor used to describe the signature of a predicate methods
+// which will be expanded by the STIPredicateExpander into a tablegen'd
+// XXXGenSubtargetInfo class member definition (here, XXX is a target name).
+//
+// It describes the signature of a TargetSubtarget hook, as well as a few extra
+// properties. Examples of extra properties are:
+//  - The default return value for the auto-generate function hook.
+//  - A list of subtarget hooks (Delegates) that are called from this function.
+//
+class STIPredicateDecl<string name, MCInstPredicate default = FalsePred,
+                       bit overrides = 1, bit expandForMC = 1,
+                       bit updatesOpcodeMask = 0,
+                       list<STIPredicateDecl> delegates = []> {
+  string Name = name;
+
+  MCInstPredicate DefaultReturnValue = default;
+
+  // True if this method is declared as virtual in class TargetSubtargetInfo.
+  bit OverridesBaseClassMember = overrides;
+
+  // True if we need an equivalent predicate function in the MC layer.
+  bit ExpandForMC = expandForMC;
+
+  // True if the autogenerated method has a extra in/out APInt param used as a
+  // mask of operands.
+  bit UpdatesOpcodeMask = updatesOpcodeMask;
+
+  // A list of STIPredicates used by this definition to delegate part of the
+  // computation. For example, STIPredicateFunction `isDependencyBreaking()`
+  // delegates to `isZeroIdiom()` part of its computation.
+  list<STIPredicateDecl> Delegates = delegates;
+}
+
+// A predicate function definition member of class `XXXGenSubtargetInfo`.
+//
+// If `Declaration.ExpandForMC` is true, then SubtargetEmitter
+// will also expand another definition of this method that accepts a MCInst.
+class STIPredicate<STIPredicateDecl declaration,
+                   list<InstructionEquivalenceClass> classes> {
+  STIPredicateDecl Declaration = declaration;
+  list<InstructionEquivalenceClass> Classes = classes;
+  SchedMachineModel SchedModel = ?;
+}
+
+// Convenience classes and definitions used by processor scheduling models to
+// describe dependency breaking instructions.
+let UpdatesOpcodeMask = 1 in {
+
+def IsZeroIdiomDecl : STIPredicateDecl<"isZeroIdiom">;
+
+let Delegates = [IsZeroIdiomDecl] in
+def IsDepBreakingDecl : STIPredicateDecl<"isDependencyBreaking">;
+
+} // UpdatesOpcodeMask
+
+class IsZeroIdiomFunction<list<DepBreakingClass> classes>
+  : STIPredicate<IsZeroIdiomDecl, classes>;
+
+class IsDepBreakingFunction<list<DepBreakingClass> classes>
+  : STIPredicate<IsDepBreakingDecl, classes>;
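
The membership rule for an `InstructionEquivalenceClass` (opcode is listed, and the predicate holds) can be sketched in plain C++. This is only a model of the semantics described in the comments above, not the code that the SubtargetEmitter generates: `SketchInst`, `SketchEquivalenceClass`, `belongsTo`, and `evaluatePredicate` are hypothetical names, and the way the mask is derived from `OperandIndices` (one set bit per listed index, first matching class wins) is an assumption made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-in for a machine instruction with two register inputs.
struct SketchInst {
  unsigned Opcode;
  unsigned Src1;
  unsigned Src2;
};

// Mirrors the three fields of InstructionEquivalenceClass.
struct SketchEquivalenceClass {
  std::vector<unsigned> Opcodes;
  std::function<bool(const SketchInst &)> Predicate;
  std::vector<unsigned> OperandIndices; // each index is assumed to be < 64
};

// MI belongs to IC if and only if (a) its opcode is listed in IC and
// (b) IC's predicate evaluates to true on MI.
bool belongsTo(const SketchEquivalenceClass &IC, const SketchInst &MI) {
  bool OpcodeListed = std::find(IC.Opcodes.begin(), IC.Opcodes.end(),
                                MI.Opcode) != IC.Opcodes.end();
  return OpcodeListed && IC.Predicate(MI);
}

// Models an STIPredicate as an ordered list of classes. On a match, the mask
// gets one bit per listed operand index (assumption); an empty index list
// therefore yields the all-zero mask.
bool evaluatePredicate(const std::vector<SketchEquivalenceClass> &Classes,
                       const SketchInst &MI, std::uint64_t &Mask) {
  for (const SketchEquivalenceClass &IC : Classes) {
    if (!belongsTo(IC, MI))
      continue;
    Mask = 0;
    for (unsigned Index : IC.OperandIndices)
      Mask |= std::uint64_t{1} << Index;
    return true;
  }
  return false; // Mask is left untouched when no class matches.
}
```

A class declared with an empty operand list leaves the mask at zero, which lines up with the "empty set of broken dependencies" wording in the `DepBreakingClass` comment.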

llvm/lib/MC/MCInstrAnalysis.cpp

Lines changed: 0 additions & 5 deletions
@@ -24,11 +24,6 @@ bool MCInstrAnalysis::clearsSuperRegisters(const MCRegisterInfo &MRI,
   return false;
 }

-bool MCInstrAnalysis::isDependencyBreaking(const MCSubtargetInfo &STI,
-                                           const MCInst &Inst) const {
-  return false;
-}
-
 bool MCInstrAnalysis::evaluateBranch(const MCInst &Inst, uint64_t Addr,
                                      uint64_t Size, uint64_t &Target) const {
   if (Inst.getNumOperands() == 0 ||

llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp

Lines changed: 5 additions & 73 deletions
@@ -380,8 +380,9 @@ class X86MCInstrAnalysis : public MCInstrAnalysis {
 public:
   X86MCInstrAnalysis(const MCInstrInfo *MCII) : MCInstrAnalysis(MCII) {}

-  bool isDependencyBreaking(const MCSubtargetInfo &STI,
-                            const MCInst &Inst) const override;
+#define GET_STIPREDICATE_DECLS_FOR_MC_ANALYSIS
+#include "X86GenSubtargetInfo.inc"
+
   bool clearsSuperRegisters(const MCRegisterInfo &MRI, const MCInst &Inst,
                             APInt &Mask) const override;
   std::vector<std::pair<uint64_t, uint64_t>>
@@ -390,77 +391,8 @@ class X86MCInstrAnalysis : public MCInstrAnalysis {
                  const Triple &TargetTriple) const override;
 };

-bool X86MCInstrAnalysis::isDependencyBreaking(const MCSubtargetInfo &STI,
-                                              const MCInst &Inst) const {
-  if (STI.getCPU() == "btver2") {
-    // Reference: Agner Fog's microarchitecture.pdf - Section 20 "AMD Bobcat and
-    // Jaguar pipeline", subsection 8 "Dependency-breaking instructions".
-    switch (Inst.getOpcode()) {
-    default:
-      return false;
-    case X86::SUB32rr:
-    case X86::SUB64rr:
-    case X86::SBB32rr:
-    case X86::SBB64rr:
-    case X86::XOR32rr:
-    case X86::XOR64rr:
-    case X86::XORPSrr:
-    case X86::XORPDrr:
-    case X86::VXORPSrr:
-    case X86::VXORPDrr:
-    case X86::ANDNPSrr:
-    case X86::VANDNPSrr:
-    case X86::ANDNPDrr:
-    case X86::VANDNPDrr:
-    case X86::PXORrr:
-    case X86::VPXORrr:
-    case X86::PANDNrr:
-    case X86::VPANDNrr:
-    case X86::PSUBBrr:
-    case X86::PSUBWrr:
-    case X86::PSUBDrr:
-    case X86::PSUBQrr:
-    case X86::VPSUBBrr:
-    case X86::VPSUBWrr:
-    case X86::VPSUBDrr:
-    case X86::VPSUBQrr:
-    case X86::PCMPEQBrr:
-    case X86::PCMPEQWrr:
-    case X86::PCMPEQDrr:
-    case X86::PCMPEQQrr:
-    case X86::VPCMPEQBrr:
-    case X86::VPCMPEQWrr:
-    case X86::VPCMPEQDrr:
-    case X86::VPCMPEQQrr:
-    case X86::PCMPGTBrr:
-    case X86::PCMPGTWrr:
-    case X86::PCMPGTDrr:
-    case X86::PCMPGTQrr:
-    case X86::VPCMPGTBrr:
-    case X86::VPCMPGTWrr:
-    case X86::VPCMPGTDrr:
-    case X86::VPCMPGTQrr:
-    case X86::MMX_PXORirr:
-    case X86::MMX_PANDNirr:
-    case X86::MMX_PSUBBirr:
-    case X86::MMX_PSUBDirr:
-    case X86::MMX_PSUBQirr:
-    case X86::MMX_PSUBWirr:
-    case X86::MMX_PCMPGTBirr:
-    case X86::MMX_PCMPGTDirr:
-    case X86::MMX_PCMPGTWirr:
-    case X86::MMX_PCMPEQBirr:
-    case X86::MMX_PCMPEQDirr:
-    case X86::MMX_PCMPEQWirr:
-      return Inst.getOperand(1).getReg() == Inst.getOperand(2).getReg();
-    case X86::CMP32rr:
-    case X86::CMP64rr:
-      return Inst.getOperand(0).getReg() == Inst.getOperand(1).getReg();
-    }
-  }
-
-  return false;
-}
+#define GET_STIPREDICATE_DEFS_FOR_MC_ANALYSIS
+#include "X86GenSubtargetInfo.inc"

 bool X86MCInstrAnalysis::clearsSuperRegisters(const MCRegisterInfo &MRI,
                                               const MCInst &Inst,

llvm/lib/Target/X86/X86ScheduleBtVer2.td

Lines changed: 62 additions & 0 deletions
@@ -687,4 +687,66 @@ def JSlowLEA16r : SchedWriteRes<[JALU01]> {

 def : InstRW<[JSlowLEA16r], (instrs LEA16r)>;

+///////////////////////////////////////////////////////////////////////////////
+// Dependency breaking instructions.
+///////////////////////////////////////////////////////////////////////////////
+
+def : IsZeroIdiomFunction<[
+  // GPR Zero-idioms.
+  DepBreakingClass<[ SUB32rr, SUB64rr, XOR32rr, XOR64rr ], ZeroIdiomPredicate>,
+
+  // MMX Zero-idioms.
+  DepBreakingClass<[
+    MMX_PXORirr, MMX_PANDNirr, MMX_PSUBBirr,
+    MMX_PSUBDirr, MMX_PSUBQirr, MMX_PSUBWirr,
+    MMX_PCMPGTBirr, MMX_PCMPGTDirr, MMX_PCMPGTWirr
+  ], ZeroIdiomPredicate>,
+
+  // SSE Zero-idioms.
+  DepBreakingClass<[
+    // fp variants.
+    XORPSrr, XORPDrr, ANDNPSrr, ANDNPDrr,
+
+    // int variants.
+    PXORrr, PANDNrr,
+    PSUBBrr, PSUBWrr, PSUBDrr, PSUBQrr,
+    PCMPGTBrr, PCMPGTDrr, PCMPGTQrr, PCMPGTWrr
+  ], ZeroIdiomPredicate>,
+
+  // AVX Zero-idioms.
+  DepBreakingClass<[
+    // xmm fp variants.
+    VXORPSrr, VXORPDrr, VANDNPSrr, VANDNPDrr,
+
+    // xmm int variants.
+    VPXORrr, VPANDNrr,
+    VPSUBBrr, VPSUBWrr, VPSUBDrr, VPSUBQrr,
+    VPCMPGTBrr, VPCMPGTWrr, VPCMPGTDrr, VPCMPGTQrr,
+
+    // ymm variants.
+    VXORPSYrr, VXORPDYrr, VANDNPSYrr, VANDNPDYrr
+  ], ZeroIdiomPredicate>
+]>;
+
+def : IsDepBreakingFunction<[
+  // GPR
+  DepBreakingClass<[ SBB32rr, SBB64rr ], ZeroIdiomPredicate>,
+  DepBreakingClass<[ CMP32rr, CMP64rr ], CheckSameRegOperand<0, 1> >,
+
+  // MMX
+  DepBreakingClass<[
+    MMX_PCMPEQBirr, MMX_PCMPEQDirr, MMX_PCMPEQWirr
+  ], ZeroIdiomPredicate>,
+
+  // SSE
+  DepBreakingClass<[
+    PCMPEQBrr, PCMPEQWrr, PCMPEQDrr, PCMPEQQrr
+  ], ZeroIdiomPredicate>,
+
+  // AVX
+  DepBreakingClass<[
+    VPCMPEQBrr, VPCMPEQWrr, VPCMPEQDrr, VPCMPEQQrr
+  ], ZeroIdiomPredicate>
+]>;
+
 } // SchedModel
