First of all, I'm fully aware that Intel GPUs, especially the shared-memory models (actually, I'm not sure whether there are Intel models with dedicated GPU memory), are known to perform badly compared to dedicated gaming graphics cards. I also know that this topic has been discussed before and that bad performance in a 3D game is to be expected. But I have some new facts which I think make it worth discussing this topic again.
I found a nice utility called intel_gpu_top, a user space utility Intel developed to accompany their Linux drivers for the purpose of performance tuning OpenGL graphics and applications. This utility nicely shows how busy your GPU currently is. Unless the various performance indicators are busy, most of the performance will be claimed by an idle process (found at the very top of the output generated by this utility).
Then I recorded a video which shows three things:
- A local-only (no networking) Megaglest 3.3.5b11 game with 1 human and 7 AI players
- A large terminal window in the background showing the intel_gpu_top output
- A small terminal window in the front showing the output of the 'top' command
The video records the gameplay and game performance almost accurately: the game did not noticeably slow down while I was recording compared to before I started. The video seems to play things back just a little slower than they actually were. In general, however, the gameplay was already quite slow in this setup.
As you can see on the intel_gpu_top output, the GPU is mostly idle at the same time. Also, as can be seen on the 'top' output, only one of my two U7300 1.3 GHz CPU cores is fully loaded and my 4 GB of RAM is only slightly loaded. Nevertheless the gameplay is somewhat slow.
In the top left of the intel_gpu_top output it says "core clock: 533 Mhz"; this is cut off in the video.
Does it make sense that the gameplay is slow while GPU, CPU and RAM are just partially loaded? Disk I/O does not seem to be the bottleneck (and the game should not cause much, I think, but I will happily add iotop to the test if you think I should).
Is it maybe that threading doesn't work so well yet and the CPU (one core of which seems to be fully loaded) is the limiting factor? Is threading on Linux maybe optimized for 100% CPU load instead of (100% x number_of_cores) CPU load?
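To check whether a single core really is the limit, the per-core load can be measured separately from the combined figure. Below is a minimal sketch of that, assuming the Linux /proc/stat layout (user, nice, system, idle, iowait, ... per "cpuN" line). The function name core_load and the snapshot file names are made up for illustration and are not part of any existing tool.

```shell
#!/bin/sh
# Print the busy percentage of each CPU core between two /proc/stat snapshots.
# Usage: core_load BEFORE_FILE AFTER_FILE   (hypothetical helper)
core_load() {
    awk '
        # Only per-core lines (cpu0, cpu1, ...), not the aggregate "cpu" line.
        /^cpu[0-9]+ / {
            total = 0
            # Fields 2..9 are user, nice, system, idle, iowait, irq, softirq, steal.
            for (i = 2; i <= 9 && i <= NF; i++) total += $i
            idle = $5 + $6                  # idle + iowait
            if (FNR == NR) {                # first file: remember the baseline
                t0[$1] = total; i0[$1] = idle
            } else if (total > t0[$1]) {    # second file: print the difference
                dt = total - t0[$1]; di = idle - i0[$1]
                printf "%s %.0f%%\n", $1, 100 * (dt - di) / dt
            }
        }
    ' "$1" "$2"
}

# Example (run while the game is playing):
#   cat /proc/stat > stat.before; sleep 5; cat /proc/stat > stat.after
#   core_load stat.before stat.after
```

If one core shows close to 100% while the other stays low, the game is most likely limited by a single thread. Pressing '1' inside top gives a similar per-core view, and 'top -H' lists the individual threads.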
Thanks for your help!
Here's some more info on my GPU:
$ sudo lspci -nnvk -s 00:02.0
00:02.0 VGA compatible controller [0300]: Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller [8086:2a42] (rev 07)
Subsystem: ASUSTeK Computer Inc. Device [1043:1862]
Flags: bus master, fast devsel, latency 0, IRQ 28
Memory at fe400000 (64-bit, non-prefetchable) [size=4M]
Memory at d0000000 (64-bit, prefetchable) [size=256M]
I/O ports at dc00 [size=8]
Capabilities: [90] Message Signalled Interrupts: Mask- 64bit- Queue=0/0 Enable+
Capabilities: [d0] Power Management version 3
Kernel driver in use: i915
Kernel modules: i915
$ sudo lspci -nnvk -s 00:02.1
00:02.1 Display controller [0380]: Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller [8086:2a43] (rev 07)
Subsystem: ASUSTeK Computer Inc. Device [1043:1862]
Flags: bus master, fast devsel, latency 0
Memory at fe800000 (64-bit, non-prefetchable) [size=1M]
Capabilities: [d0] Power Management version 3
Edit: I've removed the link to the video previously referenced in this post since it will be offline soon.