Bruce Simpson wrote:
I notice we aren't even building with -O1 in the default case.
This is probably the suck for template expansions, although I haven't
measured the gristle.
A simple std::vector<uint8_t> example does generate a lot of gristle; I
speculate we probably see the same in most builds.
Given that we're usually building shared libraries, we probably want to
combine template expansions as much as possible.
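One way to combine template expansions, sketched here as an assumption
rather than anything XORP currently does, is explicit instantiation: a
shared header declares the instantiation 'extern', and exactly one source
file in the shared library provides it. 'extern template' is standard
since C++11 and a GNU extension before that. The helper resized_length()
is hypothetical and only stands in for ordinary client code; both halves
are shown in one file for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <stdint.h>
#include <vector>

// Would live in a shared header: tells each translation unit that
// std::vector<uint8_t> is instantiated elsewhere, so it need not emit
// its own copy of the member functions.
extern template class std::vector<uint8_t>;

// Would live in exactly one .cc file of the shared library: the single
// explicit instantiation that everything else links against.
template class std::vector<uint8_t>;

// Hypothetical client code (not from the original post).
std::size_t
resized_length(std::size_t n)
{
    std::vector<uint8_t> vec;
    vec.resize(n);
    assert(vec.size() == n);
    return vec.size();
}
```

Whether this actually shrinks the libraries would need measuring in the
same way as the optimization flags.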
I stole a few cycles to try to figure out *exactly* why -O1 produces
much smaller code, for a simple STL template instantiation.
It turns out that gcc's -O1 enables a few optimizations which can't be
enabled with individual command line options.
These are:
ipa_pure_const
ipa_reference
tree_sink
tree_salias
The optimization is not coming from the C++ front end in gcc/cp/, but
rather from the tree-SSA passes; these operate on the GIMPLE tree form,
after the C++ front end has lowered the code, and well before RTL
generation and the assembler.
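These passes work on the compiler's intermediate form, not on C++ source,
but the effect of a sinking pass such as tree_sink can be sketched at
source level. The two functions below are a hand-written illustration
with made-up names, not actual compiler output: the second shows roughly
where a sinking pass may move the computation.

```cpp
#include <cassert>

// As written: 'scaled' is computed on every call, even when the
// result is never used.
int
before_sinking(int x, bool wanted)
{
    int scaled = x * 37 + 11;
    if (wanted)
        return scaled;
    return 0;
}

// Roughly what sinking achieves: the computation is moved into the
// only branch that uses it.
int
after_sinking(int x, bool wanted)
{
    if (wanted)
        return x * 37 + 11;
    return 0;
}

// Both forms must agree for every input; the transformation changes
// where the work happens, not the result.
void
check_equivalent(int x)
{
    assert(before_sinking(x, true) == after_sinking(x, true));
    assert(before_sinking(x, false) == after_sinking(x, false));
}
```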
The most interesting ones here are probably tree_sink and tree_salias. A
simple vector<uint8_t> instantiation is going to contain mostly mutable
methods, so the ipa_pure_const and ipa_reference stages aren't going to do
much (they detect functions without side effects and track references to
file-local static variables, respectively).
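To make the distinction concrete, here is a small sketch of the kinds of
function a pure/const analysis can and cannot classify. The names are
invented for illustration, and whether GCC actually marks each function
this way depends on the compiler version and flags.

```cpp
#include <cassert>
#include <cstddef>
#include <stdint.h>
#include <vector>

// File-local static that is only ever read below.
static int g_scale = 3;

// "const" in GCC's sense: the result depends only on the arguments.
int
triple(int x)
{
    return x * 3;
}

// "pure": reads global state but has no side effects.
int
scaled(int x)
{
    return x * g_scale;
}

// Neither: it mutates its argument, as most vector methods do, so the
// analysis has little to work with.
std::size_t
grow(std::vector<uint8_t>& vec, std::size_t extra)
{
    vec.resize(vec.size() + extra);
    assert(vec.size() >= extra);
    return vec.size();
}
```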
The attached example files are a 'straw man' test case. They show how to
turn off all but the essential optimization stages here -- i.e. the ones
which yield smaller compiled C++ code, without expensive GCC tree passes.
-O1 does not add -fomit-frame-pointer on x86, as omitting the frame
pointer there would interfere with debugging.
It yields a 30% reduction in binary size, although I haven't measured
the additional compile time across the tree.
I'm a bit happier now that I understand exactly what the compiler is
doing in the 'straw man' case.
thanks,
BMS
PROG_CXX= vec
SRCS= vec.cc
NO_MAN= defined
CLEANFILES+= vec.ii vec.s vec.cc.*
CFLAGS= -O1 -save-temps -v -fstats \
-fno-defer-pop \
-fno-delayed-branch \
-fno-guess-branch-probability \
-fno-cprop-registers \
-fno-if-conversion \
-fno-if-conversion2 \
-fno-tree-ccp \
-fno-tree-dce \
-fno-tree-dominator-opts \
-fno-tree-dse \
-fno-tree-ter \
-fno-tree-lrs \
-fno-tree-sra \
-fno-tree-copyrename \
-fno-tree-fre \
-fno-tree-ch \
-fno-unit-at-a-time \
-fno-merge-constants
.include <bsd.prog.mk>
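#include <stdint.h>	// uint8_t is not declared by the headers below

#include <memory>
#include <vector>
#include <cstdio>
#include <cstdlib>

using namespace std;

int
main(int argc, char **argv)
{
    // __func__ is "main" here, so sizeof(__func__) is 5 (with the NUL).
    const char* myfunc = __func__;	// unused; kept from the test case
    vector<uint8_t> vec;
    vec.resize(sizeof(__func__));
    return (0);
}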
_______________________________________________
Xorp-hackers mailing list
http://mailman.ICSI.Berkeley.EDU/mailman/listinfo/xorp-hackers