Author Topic: Link-time optimizations & GCC  (Read 5916 times)

daniel.santos

  • Guest
Link-time optimizations & GCC
« on: 6 March 2008, 08:29:43 »
I just wanted to double check with the community: does anybody know of a way to get link-time optimizations with GCC?  I read a little about this LLVM project that does this, but it appears to be a virtual machine that also does run-time optimizations.  The reason I ask is that there are a lot of very small functions (getters & setters) in Glest's cpp files, and without link-time optimizations (as m$'s linker now has), these are all going to be called as functions, which can add a significant amount of overhead when used in large loops.

martiño

  • Behemoth
  • *******
  • Posts: 1,095
(No subject)
« Reply #1 on: 6 March 2008, 10:37:18 »
Most of the getters and setters should be in .h files, and if they are not, please tell me which files they are in and I will change them. That would make the compiler's life much easier.

hailstone

  • GAE Team
  • Battle Machine
  • ********
  • Posts: 1,568
(No subject)
« Reply #2 on: 6 March 2008, 13:25:34 »
When functions are implemented in the .h files (in the class definition?) they are automatically inline, from what I remember, which I think is what martiño was saying. That means the statements in an inline function are put in place where it is called, rather than being linked as a function, which reduces the overhead. The negative side to doing this is that the program size increases, which is why it should only be done for small functions (e.g. accessors/mutators). I'm just expanding on it for my own benefit, let me know if I got it wrong.  :)
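To make that concrete, here is a minimal sketch of what I mean. The class and function names (UnitSketch, isAlive) are made up for illustration and are not Glest's actual code. Whether the compiler really inlines any of these calls is still its own decision.

```cpp
// unit_sketch.h -- hypothetical header, NOT Glest's actual Unit class
#ifndef UNIT_SKETCH_H
#define UNIT_SKETCH_H

class UnitSketch {
public:
    explicit UnitSketch(int hp) : hp(hp) {}

    // Defined inside the class body: implicitly inline, and the body is
    // visible to every .cpp file that includes this header.
    int getHp() const { return hp; }
    void setHp(int value) { hp = value; }

private:
    int hp;
};

// Defined outside the class body but still in the header: this needs the
// inline keyword so that several .cpp files can include the header
// without causing duplicate-definition errors at link time.
inline bool isAlive(const UnitSketch &unit) {
    return unit.getHp() > 0;
}

#endif
```

Any .cpp file that includes this header sees the function bodies, so calls to getHp() in a loop are at least candidates for inlining there.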
Glest Advanced Engine - Admin/Programmer
https://sourceforge.net/projects/glestae/

martiño

  • Behemoth
  • *******
  • Posts: 1,095
(No subject)
« Reply #3 on: 6 March 2008, 13:47:41 »
Well, most of the time compilers do whatever they want. Putting a function in a header or declaring it inline is more like a "hint" for the compiler, because it might still decide not to do it. Also VC.NET will inline functions in CPP files if you select "Whole program optimization" in the options.

daniel.santos

  • Guest
(No subject)
« Reply #4 on: 7 March 2008, 02:19:56 »
Yeah, exactly, and as far as I know, the GNU family of compile tools doesn't have these link-time optimizations.  I think it's cool that MS has it.  And the compiler does have the final say on whether or not it will inline a function.  However, if you don't have link-time optimizations (a.k.a. "Whole program optimization") then small functions in an object file aren't even candidates for inlining outside of that object file.

But as far as inlining vs. program size, I agree that it is a good idea to keep small functions that are rarely called in the implementation files, because they don't need to be inlined.  Inlining does increase the size of your executable.  But with the increasing size of processor caches, this impact is reduced, so it becomes a measure of how much speed you gain from not making the function call compared to how much you lose from having to load more code from main memory into the processor cache.  Usually, the majority of a program is fairly quickly evicted from the processor cache a few seconds into execution because all of the initialization code is only run once, so the code & data pages that are actually relevant to game play are what tend to reside in the processor cache (and sometimes non-relevant pages even get unloaded from main memory).  For Glest, a large part of this occurs when starting the game, while it's reading in all of the units & such.  After the game starts running, a lot of this code isn't needed anymore, and it's hopefully grouped together and less intermixed with the code you want to keep around.  One of the nice features of modern operating systems is that they only load the pages of the executable file that they need, and they let page faults cause whatever is missing to be loaded from disk, etc.  Sorry for the long spiel.

But as far as which files, it's really a lot of them throughout the project from shared_lib/sources/graphics/particle.cpp to glest_game/type_instances/unit.cpp.  The most important ones are the ones that are being called from loops.  I can't get the profiling Linux build to actually output a gprof log, but when I do I can find the culprits much better.

Personally, I will sometimes put functions several lines in size in the header file if it seems probable that they can be inlined.  And just because a function has several lines of text doesn't mean that it has all that many instructions once it's compiled.  This is especially true where a function is called with a hard-coded value.  If that value decides a branch, the compiler can completely remove the code for the branch that is never taken, and can then inline a function that would otherwise be too large, because a portion of it is excluded.  I would rather give the compiler a chance than not, unless it's a fat ugly function, especially one with a loop, that I'm pretty certain would not be a good candidate for inlining.  Then again, I do a lot of Java, so I'm used to seeing the implementation & definition in the same place.
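A rough sketch of the hard-coded value case, with made-up names (applyDamage and hitUnarmored are hypothetical, not from the Glest source). I'm assuming an optimizing build here; at -O0 nothing gets folded or inlined.

```cpp
// Hypothetical helper for illustration only.
inline int applyDamage(int hp, int damage, bool armored) {
    if (armored) {
        // Extra work that only matters when the flag is true.
        damage = damage / 2;
        if (damage < 1) {
            damage = 1;
        }
    }
    int result = hp - damage;
    return result < 0 ? 0 : result;
}

// The flag is a literal at this call site, so once applyDamage is
// inlined here the compiler is free to drop the whole `if (armored)`
// block and keep only the subtraction and the clamp.
inline int hitUnarmored(int hp, int damage) {
    return applyDamage(hp, damage, false);
}
```

So the function looks bigger in the source than what actually ends up at the call site.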

MatzeB

  • Guest
(No subject)
« Reply #5 on: 7 March 2008, 08:26:57 »
One word first: Before you start optimizing around, do a profile of the program and see where the time is spent. Understanding the performance of a big project like glest is impossible by guessing things like getters/setters could be slow. You will just waste time optimizing things that aren't worth it.

Anyway, back to the point: gcc cannot do link-time optimisation. However, you can often hack around that by writing a file which simply includes all the other .cpp files, so gcc can see the whole program at once. Anyway, martiño is right: if most getters and setters are in the headers, then the compiler can always see them when it has calls to them, so they can and will be inlined if profitable (the compiler will also inline without the inline keyword, it just might be a bit more aggressive if inline is used). And, well, LLVM's x86 backend doesn't come close to gcc yet, and the link-time optimisations don't make up for that. There's also the Intel compiler for Linux, which can do all that stuff and is sometimes faster, though in my experience icc vs. gcc doesn't make a big difference except when you compile the SPEC suite (icc is clearly SPEC-optimized).

martiño

  • Behemoth
  • *******
  • Posts: 1,095
(No subject)
« Reply #6 on: 7 March 2008, 10:27:20 »
I also wanted to point out, that for most computers, Glest should be GPU bound, meaning that the bottleneck should be the GPU and not the CPU, so optimizing CPU things like these should not make much difference.

hailstone

  • GAE Team
  • Battle Machine
  • ********
  • Posts: 1,568
(No subject)
« Reply #7 on: 9 March 2008, 23:36:51 »
Quote from: "daniel.santos"
I will sometimes put functions several lines in size in the header file if it seems probable that they can be inlined.

Can't you declare it inline in the implementation file?

I checked out http://www.parashift.com/c++-faq-lite/i ... tions.html and it seems that inlining can make a program faster and slower, and larger and smaller depending on the situation.
Glest Advanced Engine - Admin/Programmer
https://sourceforge.net/projects/glestae/

daniel.santos

  • Guest
(No subject)
« Reply #8 on: 10 March 2008, 02:40:48 »
Thanks for the response all.  I did finally get profiling to work.  I'm not sure what's going wrong in the configure script, but I have to manually add "-pg" to the LDFLAGS in the Jamconfig.  I'm using the following configure command:
Code:
./configure --with-x --with-vorbis=/usr --with-ogg=/usr --enable-profile

While I agree with your comment about making blind optimizations, un-inlined getters and setters aren't the type of thing that shows up easily in profiling (this coming from somebody with limited experience using profilers), unless, I guess, you look at the number of times a function is called instead of the amount of time spent in it. I hadn't tried that yet :) hehe.

I did run some profiling and I'm having a LOT of time spent finding paths.  I'm not sure if this is just my code or if it's happening in the mainline as well, but I suspect it's due to my changes in GAE.

@hailstone, you can declare something as inline in the implementation file, but it can't get inlined across implementation files unless you have link-time optimizations, or you do the hack that MatzeB was talking about.  Busybox used to be able to do this, but for some reason that functionality was removed from the build, which sucks.  If I understand correctly, putting it in the header file carries the same weight as sticking the inline keyword on it, but I guess that's up to the compiler implementation.

 
