Bill Nell d74997c3cc Indirect call promotion optimization.
Summary:
Perform indirect call promotion optimization in BOLT.

The code scans the instructions during CFG creation for all
indirect calls.  Right now indirect tail calls are not handled
since the functions are marked not simple.  The offsets of the
indirect calls are stored for later use by the ICP pass.

The indirect call promotion pass visits each indirect call and
examines the BranchData for each.  If the most frequent targets
from that callsite exceed the specified threshold (default 90%),
the call is promoted.  Otherwise, it is ignored.  By default,
only one target is considered at each callsite.

When an candiate callsite is processed, we modify the callsite
to test for the most common call targets before calling through
the original generic call mechanism.

The CFG and layout are modified by ICP.

A few new command line options have been added:
-indirect-call-promotion
-indirect-call-promotion-threshold=<percentage>
-indirect-call-promotion-topn=<int>

The threshold is the minimum frequency of a call target needed
before ICP is triggered.

The topn option controls the number of targets to consider for
each callsite, e.g. ICP is triggered if topn=2 and the total
requency of the top two call targets exceeds the threshold.

Example of ICP:

C++ code:

  int B_count = 0;
  int C_count = 0;

  struct A { virtual void foo() = 0; }
  struct B : public A { virtual void foo() { ++B_count; }; };
  struct C : public A { virtual void foo() { ++C_count; }; };

  A* a = ...
  a->foo();
  ...

original:
  400863:	49 8b 07             	mov    (%r15),%rax
  400866:	4c 89 ff             	mov    %r15,%rdi
  400869:	ff 10                	callq  *(%rax)
  40086b:	41 83 e6 01          	and    $0x1,%r14d
  40086f:	4d 89 e6             	mov    %r12,%r14
  400872:	4c 0f 44 f5          	cmove  %rbp,%r14
  400876:	4c 89 f7             	mov    %r14,%rdi
  ...

after ICP:
  40085e:	49 8b 07             	mov    (%r15),%rax
  400861:	4c 89 ff             	mov    %r15,%rdi
  400864:	49 ba e0 0b 40 00 00 	movabs $0x400be0,%r10
  40086b:	00 00 00
  40086e:	4c 3b 10             	cmp    (%rax),%r10
  400871:	75 29                	jne    40089c <main+0x9c>
  400873:	41 ff d2             	callq  *%r10
  400876:	41 83 e6 01          	and    $0x1,%r14d
  40087a:	4d 89 e6             	mov    %r12,%r14
  40087d:	4c 0f 44 f5          	cmove  %rbp,%r14
  400881:	4c 89 f7             	mov    %r14,%rdi
  ...

  40089c:	ff 10                	callq  *(%rax)
  40089e:	eb d6                	jmp    400876 <main+0x76>

(cherry picked from FBD3612218)
2016-09-07 18:59:23 -07:00
Description
No description provided
5.2 GiB
Languages
LLVM 41.5%
C++ 31.7%
C 13%
Assembly 9.1%
MLIR 1.5%
Other 2.8%