
Commit dad0a64

IR: Add COMDATs to the IR
This new IR facility allows us to represent the object-file semantics of a COMDAT group.

COMDATs allow us to tie together sections and make the inclusion of one dependent on another. This is required to implement features like MS ABI VFTables and optimizing away certain kinds of initialization in C++.

This functionality is only representable in COFF and ELF; Mach-O has no similar mechanism.

Differential Revision: http://reviews.llvm.org/D4178

llvm-svn: 211920
1 parent 3260478 commit dad0a64


56 files changed, +1512 -105 lines changed

llvm/docs/LangRef.rst

+91-4
@@ -562,6 +562,8 @@ is zero. The address space qualifier must precede any other attributes.

 LLVM allows an explicit section to be specified for globals. If the
 target supports it, it will emit globals to the section specified.
+Additionally, the global can be placed in a comdat if the target has the necessary
+support.

 By default, global initializers are optimized by assuming that global
 variables defined within the module are not modified from their
@@ -627,8 +629,9 @@ an optional ``unnamed_addr`` attribute, a return type, an optional
 :ref:`parameter attribute <paramattrs>` for the return type, a function
 name, a (possibly empty) argument list (each with optional :ref:`parameter
 attributes <paramattrs>`), optional :ref:`function attributes <fnattrs>`,
-an optional section, an optional alignment, an optional :ref:`garbage
-collector name <gc>`, an optional :ref:`prefix <prefixdata>`, an opening
+an optional section, an optional alignment,
+an optional :ref:`comdat <langref_comdats>`,
+an optional :ref:`garbage collector name <gc>`, an optional :ref:`prefix <prefixdata>`, an opening
 curly brace, a list of basic blocks, and a closing curly brace.

 LLVM function declarations consist of the "``declare``" keyword, an
@@ -658,6 +661,7 @@ predecessors, it also cannot have any :ref:`PHI nodes <i_phi>`.

 LLVM allows an explicit section to be specified for functions. If the
 target supports it, it will emit functions to the section specified.
+Additionally, the function can be placed in a COMDAT.

 An explicit alignment may be specified for a function. If not present,
 or if the alignment is set to zero, the alignment of the function is set
@@ -673,8 +677,8 @@ Syntax::
     define [linkage] [visibility] [DLLStorageClass]
            [cconv] [ret attrs]
            <ResultType> @<FunctionName> ([argument list])
-           [unnamed_addr] [fn Attrs] [section "name"] [align N]
-           [gc] [prefix Constant] { ... }
+           [unnamed_addr] [fn Attrs] [section "name"] [comdat $<ComdatName>]
+           [align N] [gc] [prefix Constant] { ... }

 .. _langref_aliases:

@@ -716,6 +720,89 @@ some can only be checked when producing an object file:
 * No global value in the expression can be a declaration, since that
   would require a relocation, which is not possible.

+.. _langref_comdats:
+
+Comdats
+-------
+
+Comdat IR provides access to COFF and ELF object file COMDAT functionality.
+
+Comdats have a name which represents the COMDAT key. All global objects which
+specify this key will only end up in the final object file if the linker chooses
+that key over some other key. Aliases are placed in the same COMDAT that their
+aliasee computes to, if any.
+
+Comdats have a selection kind to provide input on how the linker should
+choose between keys in two different object files.
+
+Syntax::
+
+    $<Name> = comdat SelectionKind
+
+The selection kind must be one of the following:
+
+``any``
+    The linker may choose any COMDAT key; the choice is arbitrary.
+``exactmatch``
+    The linker may choose any COMDAT key but the sections must contain the
+    same data.
+``largest``
+    The linker will choose the section containing the largest COMDAT key.
+``noduplicates``
+    The linker requires that only one section with this COMDAT key exist.
+``samesize``
+    The linker may choose any COMDAT key but the sections must contain the
+    same amount of data.
+
+Note that the Mach-O platform doesn't support COMDATs and ELF only supports
+``any`` as a selection kind.
+
+Here is an example of a COMDAT group where a function will only be selected if
+the COMDAT key's section is the largest:
+
+.. code-block:: llvm
+
+   $foo = comdat largest
+   @foo = global i32 2, comdat $foo
+
+   define void @bar() comdat $foo {
+     ret void
+   }
+
+In a COFF object file, this will create a COMDAT section with selection kind
+``IMAGE_COMDAT_SELECT_LARGEST`` containing the contents of the ``@foo`` symbol
+and another COMDAT section with selection kind
+``IMAGE_COMDAT_SELECT_ASSOCIATIVE`` which is associated with the first COMDAT
+section and contains the contents of the ``@bar`` symbol.
+
+There are some restrictions on the properties of the global object.
+It, or an alias to it, must have the same name as the COMDAT group when
+targeting COFF.
+The contents and size of this object may be used during link-time to determine
+which COMDAT groups get selected depending on the selection kind.
+Because the name of the object must match the name of the COMDAT group, the
+linkage of the global object must not be local; local symbols can get renamed
+if a collision occurs in the symbol table.
+
+The combined use of COMDATs and section attributes may yield surprising results.
+For example:
+
+.. code-block:: llvm
+
+   $foo = comdat any
+   $bar = comdat any
+   @g1 = global i32 42, section "sec", comdat $foo
+   @g2 = global i32 42, section "sec", comdat $bar
+
+From the object file perspective, this requires the creation of two sections
+with the same name. This is necessary because both globals belong to different
+COMDAT groups and COMDATs, at the object file level, are represented by
+sections.
+
+Note that certain IR constructs like global variables and functions may create
+COMDATs in the object file in addition to any which are specified using COMDAT
+IR. This arises, for example, when a global variable has linkonce_odr linkage.
+
 .. _namedmetadatastructure:

 Named Metadata
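
The selection kinds documented in the LangRef hunk above can be read as a rule for resolving two candidate definitions that share a COMDAT key. The following is a minimal C++ sketch of that rule, not LLVM or linker code. `Candidate`, `Pick`, and `resolve` are hypothetical names, section contents are modeled as plain strings, and reporting an error on a mismatch is an assumption about linker behavior rather than something this commit states.

```cpp
#include <cassert>
#include <string>

// Mirrors the five selection kinds named in the LangRef text.
enum class SelectionKind { Any, ExactMatch, Largest, NoDuplicates, SameSize };

// Hypothetical stand-in for one object file's section under a COMDAT key.
struct Candidate {
  std::string Data;
};

// Outcome of comparing two candidates that share a key.
enum class Pick { First, Second, Error };

// Decide which of two same-key candidates survives, or report a conflict.
inline Pick resolve(SelectionKind Kind, const Candidate &A,
                    const Candidate &B) {
  switch (Kind) {
  case SelectionKind::Any:
    return Pick::First; // The choice is arbitrary; this sketch keeps the first.
  case SelectionKind::ExactMatch:
    return A.Data == B.Data ? Pick::First : Pick::Error;
  case SelectionKind::Largest:
    return B.Data.size() > A.Data.size() ? Pick::Second : Pick::First;
  case SelectionKind::NoDuplicates:
    return Pick::Error; // A second definition is never accepted.
  case SelectionKind::SameSize:
    return A.Data.size() == B.Data.size() ? Pick::First : Pick::Error;
  }
  assert(false && "unknown selection kind");
  return Pick::Error;
}
```

Under these assumptions, `resolve(SelectionKind::Largest, Small, Big)` picks the second candidate whenever `Big.Data` is longer, which corresponds to the `comdat largest` example in the hunk.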

llvm/include/llvm/ADT/UniqueVector.h

+18-1
@@ -22,13 +22,18 @@ namespace llvm {
 /// class should have an implementation of operator== and of operator<.
 /// Entries can be fetched using operator[] with the entry ID.
 template<class T> class UniqueVector {
+public:
+  typedef typename std::vector<T> VectorType;
+  typedef typename VectorType::iterator iterator;
+  typedef typename VectorType::const_iterator const_iterator;
+
 private:
   // Map - Used to handle the correspondence of entry to ID.
   std::map<T, unsigned> Map;

   // Vector - ID ordered vector of entries. Entries can be indexed by ID - 1.
   //
-  std::vector<T> Vector;
+  VectorType Vector;

 public:
   /// insert - Append entry to the vector if it doesn't already exist. Returns
@@ -68,6 +73,18 @@ template<class T> class UniqueVector {
     return Vector[ID - 1];
   }

+  /// \brief Return an iterator to the start of the vector.
+  iterator begin() { return Vector.begin(); }
+
+  /// \brief Return an iterator to the start of the vector.
+  const_iterator begin() const { return Vector.begin(); }
+
+  /// \brief Return an iterator to the end of the vector.
+  iterator end() { return Vector.end(); }
+
+  /// \brief Return an iterator to the end of the vector.
+  const_iterator end() const { return Vector.end(); }
+
   /// size - Returns the number of entries in the vector.
   ///
   size_t size() const { return Vector.size(); }
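
The new `begin()`/`end()` accessors let callers walk the entries in ID order with a range-based loop. A self-contained sketch of that usage follows. `UniqueVectorSketch` is a simplified stand-in written for illustration, not the LLVM class, and `joinInIdOrder` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Simplified stand-in for llvm::UniqueVector, reduced to what the hunk touches.
template <class T> class UniqueVectorSketch {
public:
  typedef typename std::vector<T> VectorType;
  typedef typename VectorType::const_iterator const_iterator;

  // Append Entry if it is new; return its 1-based ID either way.
  unsigned insert(const T &Entry) {
    unsigned &Val = Map[Entry]; // A new key starts out mapped to 0.
    if (Val)
      return Val;
    Val = static_cast<unsigned>(Vector.size()) + 1;
    Vector.push_back(Entry);
    return Val;
  }

  // Fetch an entry by its 1-based ID.
  const T &operator[](unsigned ID) const {
    assert(ID - 1 < Vector.size() && "ID is 0 or out of range!");
    return Vector[ID - 1];
  }

  const_iterator begin() const { return Vector.begin(); }
  const_iterator end() const { return Vector.end(); }
  size_t size() const { return Vector.size(); }

private:
  std::map<T, unsigned> Map; // Entry -> ID.
  VectorType Vector;         // Entries ordered by ID.
};

// Hypothetical helper: concatenate entries in ID order via begin()/end().
inline std::string joinInIdOrder(const UniqueVectorSketch<std::string> &UV) {
  std::string Out;
  for (const std::string &S : UV) {
    if (!Out.empty())
      Out += ",";
    Out += S;
  }
  return Out;
}
```

Iteration order here is insertion order, which matches the "ID ordered vector" comment in the header; duplicates never appear because `insert` returns the existing ID instead of appending.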

llvm/include/llvm/Bitcode/LLVMBitCodes.h

+10-1
@@ -71,7 +71,8 @@ namespace bitc {
     // MODULE_CODE_PURGEVALS: [numvals]
     MODULE_CODE_PURGEVALS = 10,

-    MODULE_CODE_GCNAME = 11 // GCNAME: [strchr x N]
+    MODULE_CODE_GCNAME = 11, // GCNAME: [strchr x N]
+    MODULE_CODE_COMDAT = 12, // COMDAT: [selection_kind, name]
   };

   /// PARAMATTR blocks have code for defining a parameter attribute set.
@@ -376,6 +377,14 @@ namespace bitc {
     ATTR_KIND_JUMP_TABLE = 40
   };

+  enum ComdatSelectionKindCodes {
+    COMDAT_SELECTION_KIND_ANY = 1,
+    COMDAT_SELECTION_KIND_EXACT_MATCH = 2,
+    COMDAT_SELECTION_KIND_LARGEST = 3,
+    COMDAT_SELECTION_KIND_NO_DUPLICATES = 4,
+    COMDAT_SELECTION_KIND_SAME_SIZE = 5,
+  };
+
 } // End bitc namespace
 } // End llvm namespace
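
The bitcode codes above start at 1, while the in-memory `SelectionKind` enum declared in llvm/IR/Comdat.h in this commit starts at 0, so the two cannot be cast to each other directly. The sketch below shows one way such a translation could look. `encodeSelectionKind` and `decodeSelectionKind` are hypothetical names chosen for illustration; the reader and writer code of this commit is not shown on this page, so this should not be taken as its implementation.

```cpp
#include <cassert>

// In-memory kinds, in the order Comdat.h declares them (Any == 0).
enum SelectionKind { Any, ExactMatch, Largest, NoDuplicates, SameSize };

// On-disk codes, copied from the ComdatSelectionKindCodes enum in the hunk.
enum ComdatSelectionKindCodes {
  COMDAT_SELECTION_KIND_ANY = 1,
  COMDAT_SELECTION_KIND_EXACT_MATCH = 2,
  COMDAT_SELECTION_KIND_LARGEST = 3,
  COMDAT_SELECTION_KIND_NO_DUPLICATES = 4,
  COMDAT_SELECTION_KIND_SAME_SIZE = 5,
};

// Hypothetical helper: translate an in-memory kind to its bitcode code.
inline unsigned encodeSelectionKind(SelectionKind SK) {
  switch (SK) {
  case Any:
    return COMDAT_SELECTION_KIND_ANY;
  case ExactMatch:
    return COMDAT_SELECTION_KIND_EXACT_MATCH;
  case Largest:
    return COMDAT_SELECTION_KIND_LARGEST;
  case NoDuplicates:
    return COMDAT_SELECTION_KIND_NO_DUPLICATES;
  case SameSize:
    return COMDAT_SELECTION_KIND_SAME_SIZE;
  }
  assert(false && "invalid selection kind");
  return 0;
}

// Hypothetical helper: translate a bitcode code back to an in-memory kind.
// Returns false and leaves Out untouched when the code is not recognized.
inline bool decodeSelectionKind(unsigned Code, SelectionKind &Out) {
  switch (Code) {
  case COMDAT_SELECTION_KIND_ANY:
    Out = Any;
    return true;
  case COMDAT_SELECTION_KIND_EXACT_MATCH:
    Out = ExactMatch;
    return true;
  case COMDAT_SELECTION_KIND_LARGEST:
    Out = Largest;
    return true;
  case COMDAT_SELECTION_KIND_NO_DUPLICATES:
    Out = NoDuplicates;
    return true;
  case COMDAT_SELECTION_KIND_SAME_SIZE:
    Out = SameSize;
    return true;
  default:
    return false;
  }
}
```

An explicit switch keeps the on-disk numbering stable even if the in-memory enum is later reordered, and a code of 0 is rejected instead of being silently read as `Any`.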

llvm/include/llvm/IR/Comdat.h

+66
@@ -0,0 +1,66 @@
+//===-- llvm/IR/Comdat.h - Comdat definitions -------------------*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+//
+/// @file
+/// This file contains the declaration of the Comdat class, which represents a
+/// single COMDAT in LLVM.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_IR_COMDAT_H
+#define LLVM_IR_COMDAT_H
+
+#include "llvm/ADT/StringRef.h"
+#include "llvm/Support/Compiler.h"
+
+namespace llvm {
+
+class raw_ostream;
+template <typename ValueTy> class StringMapEntry;
+
+// This is a Name X SelectionKind pair. The reason for having this be an
+// independent object instead of just adding the name and the SelectionKind
+// to a GlobalObject is that it is invalid to have two Comdats with the same
+// name but different SelectionKind. This structure makes that unrepresentable.
+class Comdat {
+public:
+  enum SelectionKind {
+    Any, ///< The linker may choose any COMDAT.
+    ExactMatch, ///< The data referenced by the COMDAT must be the same.
+    Largest, ///< The linker will choose the largest COMDAT.
+    NoDuplicates, ///< No other Module may specify this COMDAT.
+    SameSize, ///< The data referenced by the COMDAT must be the same size.
+  };
+
+  Comdat(Comdat &&C);
+  SelectionKind getSelectionKind() const { return SK; }
+  void setSelectionKind(SelectionKind Val) { SK = Val; }
+  StringRef getName() const;
+  void print(raw_ostream &OS) const;
+  void dump() const;
+
+private:
+  friend class Module;
+  Comdat();
+  Comdat(SelectionKind SK, StringMapEntry<Comdat> *Name);
+  Comdat(const Comdat &) LLVM_DELETED_FUNCTION;
+
+  // Points to the map in Module.
+  StringMapEntry<Comdat> *Name;
+  SelectionKind SK;
+};
+
+inline raw_ostream &operator<<(raw_ostream &OS, const Comdat &C) {
+  C.print(OS);
+  return OS;
+}
+
+} // end llvm namespace
+
+#endif
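
The comment on `Comdat` explains why it is a separate object: keeping one entry per name in a module-level table makes "same name, different selection kind" impossible to express. The sketch below models that idea with standard containers only. `ComdatSketch`, `ModuleSketch`, `GlobalSketch`, and `sharesOneEntry` are hypothetical stand-ins, `std::map` substitutes for `StringMap`, and defaulting a new entry to `Any` is an assumption of this sketch rather than documented behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

enum class SelectionKind { Any, ExactMatch, Largest, NoDuplicates, SameSize };

// One Name x SelectionKind pair, owned by the module-level table.
struct ComdatSketch {
  std::string Name;
  SelectionKind SK;
};

class ModuleSketch {
public:
  // Return the entry with this name, creating it if it does not exist yet.
  // Assumption of this sketch: a freshly created entry starts as Any.
  ComdatSketch *getOrInsertComdat(const std::string &Name) {
    auto Result = Table.emplace(Name, ComdatSketch{Name, SelectionKind::Any});
    return &Result.first->second; // std::map keeps element addresses stable.
  }
  std::size_t numComdats() const { return Table.size(); }

private:
  std::map<std::string, ComdatSketch> Table;
};

// A global refers to its comdat by pointer, as GlobalObject does in the diff.
struct GlobalSketch {
  ComdatSketch *ObjComdat = nullptr;
  bool hasComdat() const { return ObjComdat != nullptr; }
};

// Two globals naming the same comdat end up sharing a single table entry.
inline bool sharesOneEntry() {
  ModuleSketch M;
  GlobalSketch Foo, Bar;
  Foo.ObjComdat = M.getOrInsertComdat("foo");
  Foo.ObjComdat->SK = SelectionKind::Largest;
  Bar.ObjComdat = M.getOrInsertComdat("foo");
  assert(M.numComdats() == 1);
  return Bar.ObjComdat == Foo.ObjComdat &&
         Bar.ObjComdat->SK == SelectionKind::Largest;
}
```

Because both globals hold a pointer to the same entry, changing the selection kind through one is visible through the other, so the two can never disagree.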

llvm/include/llvm/IR/GlobalAlias.h

+15
@@ -87,6 +87,21 @@ class GlobalAlias : public GlobalValue, public ilist_node<GlobalAlias> {
     return getOperand(0);
   }

+  const GlobalObject *getBaseObject() const {
+    return const_cast<GlobalAlias *>(this)->getBaseObject();
+  }
+  GlobalObject *getBaseObject() {
+    return dyn_cast<GlobalObject>(getAliasee()->stripInBoundsOffsets());
+  }
+
+  const GlobalObject *getBaseObject(const DataLayout &DL, APInt &Offset) const {
+    return const_cast<GlobalAlias *>(this)->getBaseObject(DL, Offset);
+  }
+  GlobalObject *getBaseObject(const DataLayout &DL, APInt &Offset) {
+    return dyn_cast<GlobalObject>(
+        getAliasee()->stripAndAccumulateInBoundsConstantOffsets(DL, Offset));
+  }
+
   static bool isValidLinkage(LinkageTypes L) {
     return isExternalLinkage(L) || isLocalLinkage(L) ||
       isWeakLinkage(L) || isLinkOnceLinkage(L);

llvm/include/llvm/IR/GlobalObject.h

+8-2
@@ -20,7 +20,7 @@
 #include "llvm/IR/GlobalValue.h"

 namespace llvm {
-
+class Comdat;
 class Module;

 class GlobalObject : public GlobalValue {
@@ -29,11 +29,12 @@ class GlobalObject : public GlobalValue {
 protected:
   GlobalObject(Type *Ty, ValueTy VTy, Use *Ops, unsigned NumOps,
                LinkageTypes Linkage, const Twine &Name)
-    : GlobalValue(Ty, VTy, Ops, NumOps, Linkage, Name) {
+    : GlobalValue(Ty, VTy, Ops, NumOps, Linkage, Name), ObjComdat(nullptr) {
     setGlobalValueSubClassData(0);
   }

   std::string Section; // Section to emit this into, empty means default
+  Comdat *ObjComdat;
 public:
   unsigned getAlignment() const {
     return (1u << getGlobalValueSubClassData()) >> 1;
@@ -44,6 +45,11 @@ class GlobalObject : public GlobalValue {
   const char *getSection() const { return Section.c_str(); }
   void setSection(StringRef S);

+  bool hasComdat() const { return getComdat() != nullptr; }
+  const Comdat *getComdat() const { return ObjComdat; }
+  Comdat *getComdat() { return ObjComdat; }
+  void setComdat(Comdat *C) { ObjComdat = C; }
+
   void copyAttributesFrom(const GlobalValue *Src) override;

   // Methods for support type inquiry through isa, cast, and dyn_cast:

llvm/include/llvm/IR/GlobalValue.h

+7
@@ -23,6 +23,7 @@

 namespace llvm {

+class Comdat;
 class PointerType;
 class Module;

@@ -110,6 +111,12 @@ class GlobalValue : public Constant {
   bool hasUnnamedAddr() const { return UnnamedAddr; }
   void setUnnamedAddr(bool Val) { UnnamedAddr = Val; }

+  bool hasComdat() const { return getComdat() != nullptr; }
+  Comdat *getComdat();
+  const Comdat *getComdat() const {
+    return const_cast<GlobalValue *>(this)->getComdat();
+  }
+
   VisibilityTypes getVisibility() const { return VisibilityTypes(Visibility); }
   bool hasDefaultVisibility() const { return Visibility == DefaultVisibility; }
   bool hasHiddenVisibility() const { return Visibility == HiddenVisibility; }

llvm/include/llvm/IR/Module.h

+16
@@ -16,6 +16,7 @@
 #define LLVM_IR_MODULE_H

 #include "llvm/ADT/iterator_range.h"
+#include "llvm/IR/Comdat.h"
 #include "llvm/IR/DataLayout.h"
 #include "llvm/IR/Function.h"
 #include "llvm/IR/GlobalAlias.h"
@@ -123,6 +124,8 @@ class Module {
   typedef iplist<GlobalAlias> AliasListType;
   /// The type for the list of named metadata.
   typedef ilist<NamedMDNode> NamedMDListType;
+  /// The type of the comdat "symbol" table.
+  typedef StringMap<Comdat> ComdatSymTabType;

   /// The Global Variable iterator.
   typedef GlobalListType::iterator global_iterator;
@@ -197,6 +200,7 @@ class Module {
   NamedMDListType NamedMDList; ///< The named metadata in the module
   std::string GlobalScopeAsm; ///< Inline Asm at global scope.
   ValueSymbolTable *ValSymTab; ///< Symbol table for values
+  ComdatSymTabType ComdatSymTab; ///< Symbol table for COMDATs
   std::unique_ptr<GVMaterializer>
   Materializer; ///< Used to materialize GlobalValues
   std::string ModuleID; ///< Human readable identifier for the module
@@ -403,6 +407,14 @@ class Module {
   /// Remove the given NamedMDNode from this module and delete it.
   void eraseNamedMetadata(NamedMDNode *NMD);

+/// @}
+/// @name Comdat Accessors
+/// @{
+
+  /// Return the Comdat in the module with the specified name. It is created
+  /// if it didn't already exist.
+  Comdat *getOrInsertComdat(StringRef Name);
+
 /// @}
 /// @name Module Flags Accessors
 /// @{
@@ -504,6 +516,10 @@ class Module {
   const ValueSymbolTable &getValueSymbolTable() const { return *ValSymTab; }
   /// Get the Module's symbol table of global variable and function identifiers.
   ValueSymbolTable &getValueSymbolTable() { return *ValSymTab; }
+  /// Get the Module's symbol table for COMDATs (constant).
+  const ComdatSymTabType &getComdatSymbolTable() const { return ComdatSymTab; }
+  /// Get the Module's symbol table for COMDATs.
+  ComdatSymTabType &getComdatSymbolTable() { return ComdatSymTab; }

 /// @}
 /// @name Global Variable Iteration

llvm/include/llvm/Linker/Linker.h

+2
@@ -15,6 +15,8 @@

 namespace llvm {

+class Comdat;
+class GlobalValue;
 class Module;
 class StringRef;
 class StructType;
