Well, there's quite a bit to store in a G3D file: the location of every vertex, UV texture coordinate, line, and normal on each frame. G3D models are also stored uncompressed. Compression (such as LZMA) can reduce the file size by as much as 90%, although the model would have to be decompressed when loaded into memory, and since GAE already supports compressing entire mods, there's little to gain from compressing the models themselves. On the other hand, animation between frames is heavily interpolated. A two-frame animation won't look like a flip book; the movement of each vertex is smoothed out between the frames. So the biggest optimization a modder can currently make is to reduce the number of frames. Most models won't need more than eight frames, tops, and simpler models can get by with just three or four. The frame count also multiplies the cost of every vertex (the growth is multiplicative, not exponential): even if a vertex stays in the same location for all ten frames, the model still stores that exact location for each frame.
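To put rough numbers on that, here's a back-of-the-envelope sketch. The byte counts are my assumptions for illustration (32-bit floats, one position and one normal per vertex per frame), not figures taken from the G3D spec, and the vertex counts are made up. The last function shows what a format that stored unmoving vertices only once would cost, which G3D does not do.

```python
# Rough size estimate for per-frame vertex data.
# ASSUMPTIONS (illustrative, not read from the G3D spec):
#   - every value is a 32-bit float
#   - each vertex stores a position (x, y, z) and a normal (x, y, z) per frame
BYTES_PER_FLOAT = 4
FLOATS_PER_VERTEX_PER_FRAME = 6


def per_frame_bytes(vertex_count, frame_count):
    """Bytes needed when every vertex is stored again for every frame."""
    return (vertex_count * frame_count
            * FLOATS_PER_VERTEX_PER_FRAME * BYTES_PER_FLOAT)


def split_bytes(static_vertices, moving_vertices, frame_count):
    """Hypothetical layout: static vertices once, moving ones per frame."""
    return (per_frame_bytes(static_vertices, 1)
            + per_frame_bytes(moving_vertices, frame_count))


if __name__ == "__main__":
    vertices = 5000  # made-up model size
    for frames in (10, 4, 1):
        print(frames, "frames:", per_frame_bytes(vertices, frames), "bytes")
    # Same model, but only 200 of the vertices actually move.
    print("static stored once:", split_bytes(4800, 200, 10), "bytes")
```

The size scales in a straight line with the frame count, so going from ten frames to four cuts this part of the file by 60% without touching the mesh.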
So for your castle example, the flag may be the only animated part, but all the vertices in the castle are also stored for each of the ten frames. If any optimization were to occur, that would be the first place to change. However, if you ask me, the whole G3D format needs to go. The IQM format, currently under consideration for use in MegaGlest, is looking like the best way to go. In addition to being a more advanced and capable model format, it would shift the need to develop importers and exporters (not to mention potential G3D optimization) away from us. It would also allow several features (with coding on our end) that G3D currently cannot support, such as movable cannons, ragdolls, and interchangeable armatures.
As for tools that can optimize G3D models, G3DHack is the only one currently in existence. It's still officially an alpha, isn't truly cross-platform, and appears to be abandoned, but it's fully featured and contains a few optimizations, such as stripping unnecessary frames.
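For anyone curious what "stripping unnecessary frames" could mean in practice, here's a minimal sketch. This is not G3DHack's actual algorithm (I haven't read its source). It assumes frames are evenly spaced in time and that the engine interpolates linearly between them, and the function names are my own. The idea is that a frame is unnecessary if interpolating between the frames you keep reproduces it anyway.

```python
def lerp_frame(a, b, t):
    """Linearly interpolate two frames, each a list of (x, y, z) tuples."""
    return [tuple(pa + (pb - pa) * t for pa, pb in zip(va, vb))
            for va, vb in zip(a, b)]


def max_error(a, b):
    """Largest per-coordinate difference between two frames."""
    return max((abs(pa - pb) for va, vb in zip(a, b)
                for pa, pb in zip(va, vb)), default=0.0)


def reduce_frames(frames, tolerance=1e-4):
    """Keep every step-th frame if interpolation rebuilds the rest.

    ASSUMES evenly spaced frames and linear interpolation in the engine.
    Only evenly spaced subsets are tried, so the timing stays the same.
    """
    n = len(frames)
    if n < 2:
        return list(frames)
    # A model that never moves needs only one frame.
    if all(max_error(frames[0], f) <= tolerance for f in frames[1:]):
        return [frames[0]]
    last = n - 1
    for step in range(last, 1, -1):  # try the biggest saving first
        if last % step:
            continue  # kept frames must land exactly on the last frame
        ok = True
        for i in range(last):
            lo = i - i % step  # nearest kept frame at or before i
            guess = lerp_frame(frames[lo], frames[lo + step], (i % step) / step)
            if max_error(guess, frames[i]) > tolerance:
                ok = False
                break
        if ok:
            return frames[::step]
    return list(frames)
```

With one vertex sliding at constant speed over five frames, `reduce_frames` keeps just the first and last frame; if the motion accelerates, nothing can be dropped and all five come back unchanged.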