llvm/bolt/CacheMetrics.h
spupyrev a599fe1bbc [BOLT] a new block reordering algorithm
Summary:
A new block reordering algorithm, cache+, that is designed to optimize
i-cache performance.

At a high level, this algorithm is a greedy heuristic that merges
clusters (ordered sequences) of basic blocks, similarly to how it is
done in OptimizeCacheReorderAlgorithm. There are two important
differences: (a) the metric that is optimized in the procedure, and
(b) how two clusters are merged together.
Initially all clusters are isolated basic blocks. On every iteration,
we pick a pair of clusters whose merging yields the biggest increase
in the ExtTSP metric (see CacheMetrics.cpp for exact implementation),
which models how i-cache "friendly" a specific cluster is. The pair of
clusters giving the maximum gain is merged into a new cluster. The
procedure stops when there is only one cluster left, or when merging
does not increase ExtTSP. In the latter case, the remaining clusters
are sorted by density.
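
The greedy loop described above can be sketched as follows. This is a
minimal illustration, not BOLT's implementation: Cluster, ScoreFn, and
greedyMerge are hypothetical names, the score is passed in as a callback
standing in for ExtTSP, only plain concatenation of two clusters is tried,
and the final density sort is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical types for illustration; these are not BOLT's data structures.
using Cluster = std::vector<int>;                        // ordered block ids
using ScoreFn = std::function<double(const Cluster &)>;  // stand-in for ExtTSP

// Start from single-block clusters and repeatedly merge the pair with the
// largest positive score gain. Stops when one cluster is left or when no
// merge increases the score.
std::vector<Cluster> greedyMerge(std::size_t NumBlocks, const ScoreFn &Score) {
  assert(NumBlocks > 0 && "expected at least one basic block");
  std::vector<Cluster> Clusters;
  for (std::size_t I = 0; I < NumBlocks; ++I)
    Clusters.push_back({static_cast<int>(I)});

  while (Clusters.size() > 1) {
    double BestGain = 0.0;
    std::size_t BestX = 0, BestY = 0;
    for (std::size_t X = 0; X < Clusters.size(); ++X) {
      for (std::size_t Y = 0; Y < Clusters.size(); ++Y) {
        if (X == Y)
          continue;
        // Candidate: cluster X followed by cluster Y.
        Cluster Merged = Clusters[X];
        Merged.insert(Merged.end(), Clusters[Y].begin(), Clusters[Y].end());
        const double Gain =
            Score(Merged) - Score(Clusters[X]) - Score(Clusters[Y]);
        if (Gain > BestGain) {
          BestGain = Gain;
          BestX = X;
          BestY = Y;
        }
      }
    }
    if (BestGain <= 0.0)
      break; // no merge increases the score
    Clusters[BestX].insert(Clusters[BestX].end(), Clusters[BestY].begin(),
                           Clusters[BestY].end());
    Clusters.erase(Clusters.begin() + static_cast<std::ptrdiff_t>(BestY));
  }
  return Clusters;
}
```

The quadratic scan over all pairs on every iteration keeps the sketch short;
a practical implementation would likely cache gains between iterations.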
An important aspect is the way two clusters are merged. Unlike earlier
algorithms (e.g., OptimizeCacheReorderAlgorithm or Pettis-Hansen), two
clusters, X and Y, are first split into three, X1, X2, and Y. Then we
consider all possible ways of gluing the three clusters (e.g., X1YX2,
X1X2Y, X2X1Y, X2YX1, YX1X2, YX2X1) and choose the one producing the
largest score. This improves the quality of the final result (the
search space is larger) while keeping the implementation sufficiently
fast.
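
The split-and-glue merge described above can be illustrated with a short
sketch. It assumes a hypothetical ScoreFn callback in place of ExtTSP and
tries every split point of X together with all six orders of X1, X2, and Y.
The names Cluster, ScoreFn, and bestMerge are invented for this example and
do not come from the BOLT sources.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>
#include <vector>

// Hypothetical types for illustration; these are not BOLT's data structures.
using Cluster = std::vector<int>;                        // ordered block ids
using ScoreFn = std::function<double(const Cluster &)>;  // stand-in for ExtTSP

// Split X into a prefix X1 and a suffix X2 at every position, glue X1, X2,
// and Y together in each of the six possible orders, and return the
// highest-scoring result.
Cluster bestMerge(const Cluster &X, const Cluster &Y, const ScoreFn &Score) {
  assert(!X.empty() && !Y.empty());
  Cluster Best;
  double BestScore = -std::numeric_limits<double>::infinity();
  for (std::size_t Split = 0; Split < X.size(); ++Split) {
    const auto Mid = X.begin() + static_cast<std::ptrdiff_t>(Split);
    const Cluster X1(X.begin(), Mid); // empty when Split == 0
    const Cluster X2(Mid, X.end());
    const Cluster *Parts[3] = {&X1, &X2, &Y};
    int Order[3] = {0, 1, 2}; // sorted, so next_permutation visits all six
    do {
      Cluster Candidate;
      for (int P : Order)
        Candidate.insert(Candidate.end(), Parts[P]->begin(), Parts[P]->end());
      const double S = Score(Candidate);
      if (S > BestScore) {
        BestScore = S;
        Best = Candidate;
      }
    } while (std::next_permutation(Order, Order + 3));
  }
  return Best;
}
```

With Split == 0 the prefix X1 is empty, so the candidates reduce to the plain
concatenations XY and YX; larger split points add the interleaved layouts.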

(cherry picked from FBD6466264)
2017-12-01 16:54:08 -08:00


//===- CacheMetrics.h - Interface for instruction cache evaluation --===//
//
// Functions to show metrics of cache lines
//
//
//===----------------------------------------------------------------------===//
//
//===----------------------------------------------------------------------===//
#ifndef LLVM_TOOLS_LLVM_BOLT_CACHEMETRICS_H
#define LLVM_TOOLS_LLVM_BOLT_CACHEMETRICS_H
#include "BinaryFunction.h"
#include <cstdint>
#include <vector>
namespace llvm {
namespace bolt {
namespace CacheMetrics {
/// Calculate various metrics related to instruction cache performance.
void printAll(const std::vector<BinaryFunction *> &BinaryFunctions);
/// Calculate Extended-TSP metric, which quantifies the expected number of
/// i-cache misses for a given pair of basic blocks. The parameters are:
/// - SrcAddr is the address of the source block;
/// - SrcSize is the size of the source block;
/// - DstAddr is the address of the destination block;
/// - Count is the number of jumps between the pair of blocks.
double extTSPScore(uint64_t SrcAddr,
                   uint64_t SrcSize,
                   uint64_t DstAddr,
                   uint64_t Count);
} // namespace CacheMetrics
} // namespace bolt
} // namespace llvm
#endif // LLVM_TOOLS_LLVM_BOLT_CACHEMETRICS_H
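
The header only declares extTSPScore; the definition lives in
CacheMetrics.cpp and is not shown here. As a rough illustration of how a
score with these four parameters could behave, the sketch below rewards
fallthroughs fully and short forward or backward jumps partially, with the
reward shrinking linearly with distance. The weights and distance limits are
assumptions chosen for this example, and the sketch namespace is
hypothetical; none of it is taken from the actual implementation.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Illustrative constants (assumptions, not values read from BOLT).
constexpr double ForwardWeight = 0.1;
constexpr double BackwardWeight = 0.1;
constexpr uint64_t ForwardDistance = 1024; // bytes
constexpr uint64_t BackwardDistance = 640; // bytes

// Score one jump from the source block to the destination block.
// Higher is better: a fallthrough earns the full Count, a nearby jump earns
// a fraction of it, and a distant jump earns nothing.
double extTSPScore(uint64_t SrcAddr, uint64_t SrcSize, uint64_t DstAddr,
                   uint64_t Count) {
  assert(SrcAddr <= UINT64_MAX - SrcSize && "source block end overflows");
  const uint64_t SrcEnd = SrcAddr + SrcSize;

  // Fallthrough: the destination starts right where the source ends.
  if (SrcEnd == DstAddr)
    return static_cast<double>(Count);

  // Forward jump: the destination lies after the end of the source.
  if (SrcEnd < DstAddr) {
    const uint64_t Dist = DstAddr - SrcEnd;
    if (Dist > ForwardDistance)
      return 0.0;
    const double Prob = 1.0 - static_cast<double>(Dist) / ForwardDistance;
    return ForwardWeight * Prob * static_cast<double>(Count);
  }

  // Backward jump: the destination lies before the end of the source.
  const uint64_t Dist = SrcEnd - DstAddr;
  if (Dist > BackwardDistance)
    return 0.0;
  const double Prob = 1.0 - static_cast<double>(Dist) / BackwardDistance;
  return BackwardWeight * Prob * static_cast<double>(Count);
}

} // namespace sketch
```

Under these assumed constants, a jump taken 100 times scores 100 as a
fallthrough, about 5 when it skips 512 bytes forward, and 0 when it skips
more than 1024 bytes.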