mirror of
https://github.com/intel/llvm.git
synced 2026-02-06 15:18:53 +08:00
This is a recommit of r290149, which was reverted in r290169 due to msan failures. msan was failing because we were calling `isMostDerivedAnUnsizedArray` on an invalid designator, which caused us to read uninitialized memory. To fix this, the logic of the caller of said function was simplified, and we now have a `!Invalid` assert in `isMostDerivedAnUnsizedArray`, so we can catch this particular bug more easily in the future. Fingers crossed that this patch sticks this time. :) Original commit message: This patch does three things: - Gives us the alloc_size attribute in clang, which lets us infer the number of bytes handed back to us by malloc/realloc/calloc/any user functions that act in a similar manner. - Teaches our constexpr evaluator that evaluating some `const` variables is OK sometimes. This is why we have a change in test/SemaCXX/constant-expression-cxx11.cpp and other seemingly unrelated tests. Richard Smith okay'ed this idea some time ago in person. - Uniques some Blocks in CodeGen, which was reviewed separately at D26410. Lack of uniquing only really shows up as a problem when combined with our new eagerness in the face of const. llvm-svn: 290297
IRgen optimization opportunities. //===---------------------------------------------------------------------===// The common pattern of -- short x; // or char, etc (x == 10) -- generates an zext/sext of x which can easily be avoided. //===---------------------------------------------------------------------===// Bitfields accesses can be shifted to simplify masking and sign extension. For example, if the bitfield width is 8 and it is appropriately aligned then is is a lot shorter to just load the char directly. //===---------------------------------------------------------------------===// It may be worth avoiding creation of alloca's for formal arguments for the common situation where the argument is never written to or has its address taken. The idea would be to begin generating code by using the argument directly and if its address is taken or it is stored to then generate the alloca and patch up the existing code. In theory, the same optimization could be a win for block local variables as long as the declaration dominates all statements in the block. NOTE: The main case we care about this for is for -O0 -g compile time performance, and in that scenario we will need to emit the alloca anyway currently to emit proper debug info. So this is blocked by being able to emit debug information which refers to an LLVM temporary, not an alloca. //===---------------------------------------------------------------------===// We should try and avoid generating basic blocks which only contain jumps. At -O0, this penalizes us all the way from IRgen (malloc & instruction overhead), all the way down through code generation and assembly time. On 176.gcc:expr.ll, it looks like over 12% of basic blocks are just direct branches! //===---------------------------------------------------------------------===//