Bruce Simpson wrote:
> I notice we aren't even building with -O1 in the default case.
>
> This is probably the suck for template expansions, although I haven't measured the gristle. A simple std::vector<uint8_t> example does generate a lot of gristle, I speculate we probably see the same in most builds.
>
> Given that we're usually building shared libraries, we probably want to combine template expansions as much as possible.
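
One way to combine template expansions, sketched here as an illustration only: declare the instantiation `extern` in a shared header and emit it once in a single source file. The `Buffer` wrapper and `resized_size` helper are hypothetical names, not anything in the XORP tree, and `extern template` is standard only from C++11 (GCC accepted it earlier as an extension). Both declarations sit in one file here so that the sketch compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <stdint.h>
#include <vector>

// Hypothetical wrapper around std::vector, used only for illustration.
template <typename T>
class Buffer {
public:
    void resize(std::size_t n) { data_.resize(n); }
    std::size_t size() const { return data_.size(); }
private:
    std::vector<T> data_;
};

// Would live in the shared header: asks every includer not to
// instantiate Buffer<uint8_t> implicitly.
extern template class Buffer<uint8_t>;

// Would live in exactly one .cc file: emits the single shared copy.
template class Buffer<uint8_t>;

// Hypothetical helper that exercises the shared instantiation.
std::size_t resized_size(std::size_t n)
{
    Buffer<uint8_t> b;
    b.resize(n);
    assert(b.size() == n);
    return b.size();
}
```

Calling `resized_size(5)` returns 5. Whether this reduces the size of a real build depends on how much of the template the compiler still chooses to inline at each call site.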

I stole a few cycles to try to figure out *exactly* why -O1 produces much smaller code, for a simple STL template instantiation.

It turns out that gcc's -O1 enables a few optimizations which can't be enabled with individual command line options.
These are:
     ipa_pure_const
     ipa_reference
     tree_sink
     tree_salias

The optimization is not coming from the C++ front end in gcc/cp/, but rather from the tree-SSA passes; these operate on GIMPLE, after the C++ front end has lowered the code, and well before RTL generation and the assembler.

The most interesting ones here are probably tree_sink and tree_salias. A simple vector<uint8_t> instantiation is going to contain mostly mutable methods, so the ipa_pure_const and ipa_reference stages aren't going to do much (they look for side-effect-free functions and for static variables that functions never modify).
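
To make that distinction concrete, here is a small sketch of the kinds of code these passes are aimed at, as I understand them. The function names are made up for illustration, and the comments describe what a pass *may* do, not what any particular GCC version is guaranteed to do.

```cpp
#include <cassert>
#include <cstddef>
#include <stdint.h>
#include <vector>

// Depends only on its argument and touches no memory: the kind of
// function a pure/const analysis can mark as free of side effects.
int square(int x)
{
    return x * x;
}

// 't' is computed before the branch but used on only one path; a
// sinking pass may move the computation into the 'use' branch.
int sink_candidate(int a, int b, bool use)
{
    int t = a * b + 7;
    if (use)
        return t;
    return 0;
}

// Mutates its argument (and may reallocate), so it is neither pure
// nor const; most vector<uint8_t> methods look like this.
std::size_t grow(std::vector<uint8_t>& v, std::size_t n)
{
    v.resize(n);
    assert(v.size() == n);
    return v.size();
}
```

Most of what a vector<uint8_t> instantiation expands to resembles `grow()` rather than `square()`, which is why the pure/const analysis has little to work with in the straw man.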

The attached example files are a 'straw man' test case. They show how to turn off all but the essential optimization stages here -- i.e. the ones which yield smaller compiled C++ code, without expensive GCC tree passes.

-O1 does not add -fomit-frame-pointer on x86, as omitting the frame pointer there would interfere with debugging. -O1 yields a 30% reduction in binary size, although I haven't measured the additional compile time across the tree.

I'm a bit happier now that I understand exactly what the compiler is doing in the 'straw man' case.

thanks,
BMS

Makefile:

PROG_CXX= vec
SRCS= vec.cc
NO_MAN= defined

CLEANFILES+= vec.ii vec.s vec.cc.*

CFLAGS= -O1 -save-temps -v -fstats \
-fno-defer-pop \
-fno-delayed-branch \
-fno-guess-branch-probability \
-fno-cprop-registers \
-fno-if-conversion \
-fno-if-conversion2 \
-fno-tree-ccp \
-fno-tree-dce \
-fno-tree-dominator-opts \
-fno-tree-dse \
-fno-tree-ter \
-fno-tree-lrs \
-fno-tree-sra \
-fno-tree-copyrename \
-fno-tree-fre \
-fno-tree-ch \
-fno-unit-at-a-time \
-fno-merge-constants

.include <bsd.prog.mk>
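
vec.cc:

#include <stdint.h>	// uint8_t
#include <memory>
#include <vector>
#include <cstdio>
#include <cstdlib>

using namespace std;

int
main(int argc, char **argv)
{
    const char* myfunc = __func__;
    vector<uint8_t> vec;

    // __func__ is "main", so sizeof(__func__) is 5 (four characters
    // plus the terminating NUL). This forces resize() to be instantiated.
    vec.resize(sizeof(__func__));

    return (0);
}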
_______________________________________________
Xorp-hackers mailing list
[email protected]
http://mailman.ICSI.Berkeley.EDU/mailman/listinfo/xorp-hackers
