Well, in its simplest form, it would need to hold vertices, faces, normals, and so on. Realistically, though, you'd also have to hold texture vertices (UV coordinates), texture links, animation frames, colors, etc. More advanced model formats also include things like lighting, specular data, and multiple textures.
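To make that list a bit more concrete, here's a rough sketch of what such a format might hold in memory. This is only an illustration: the names (`Frame`, `Model`, `face_normal`, `validate`) and the layout are my own assumptions, not how Glest's format or any real format is actually laid out.

```python
from dataclasses import dataclass, field
from math import sqrt
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Vec2 = Tuple[float, float]


@dataclass
class Frame:
    """One animation frame: a full set of vertex positions (hypothetical layout)."""
    vertices: List[Vec3]
    normals: List[Vec3] = field(default_factory=list)


@dataclass
class Model:
    """Hypothetical minimal model: shared topology plus per-frame geometry."""
    faces: List[Tuple[int, int, int]]   # triangles, stored as vertex indices
    tex_coords: List[Vec2]              # one UV pair per vertex
    frames: List[Frame]                 # a static model has exactly one frame
    texture_path: str = ""              # the "texture link"
    color: Tuple[float, float, float, float] = (1.0, 1.0, 1.0, 1.0)  # RGBA


def face_normal(a: Vec3, b: Vec3, c: Vec3) -> Vec3:
    """Unit normal of triangle (a, b, c), using counter-clockwise winding."""
    ux, uy, uz = b[0] - a[0], b[1] - a[1], b[2] - a[2]
    vx, vy, vz = c[0] - a[0], c[1] - a[1], c[2] - a[2]
    # Cross product of the two edge vectors.
    nx = uy * vz - uz * vy
    ny = uz * vx - ux * vz
    nz = ux * vy - uy * vx
    length = sqrt(nx * nx + ny * ny + nz * nz)
    if length == 0.0:
        return (0.0, 0.0, 0.0)  # degenerate triangle, no meaningful normal
    return (nx / length, ny / length, nz / length)


def validate(model: Model) -> bool:
    """Check that every frame matches the UV count and every face index is in range."""
    if not model.frames:
        return False
    count = len(model.tex_coords)
    if any(len(frame.vertices) != count for frame in model.frames):
        return False
    return all(0 <= i < count for face in model.faces for i in face)
```

Even this toy version needs a consistency check (`validate`) before anything could be drawn, and it doesn't touch lighting, multiple textures, or how the data gets written to disk.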
However, it's a lot more complex than it sounds, and you would have to create a way to render the model, which would almost definitely need a customized or entirely custom-made engine. Just because Glest did it doesn't mean you have to. In fact, the Glest Team could probably have created Glest faster, added features like shaders and realistic water much more easily, and had Glest run faster, had they used a ready-made engine like OGRE to render everything. Overall, graphics is one of the hardest parts of programming a game, and the majority of production effort for games reportedly goes into graphics (presumably that's mostly the artwork itself, though a good rendering engine makes better graphics easier to achieve).
In fact, engines like Irrlicht are so simple to use that even I can load a 3D model, add bump maps, custom shaders, and a terrain, add "gravity", set a skybox, add boundaries, and place realistic lighting with shadows in relatively little time. I strongly recommend that you try either OGRE or Irrlicht (trying both is a good idea), as they are easily the most powerful and feature-rich free engines out there. And why develop on a non-free engine when the free ones can do just as well? No royalties FTW.
As for model formats, I think that .3ds and .x are some of the most common and most compatible. (.x may be DirectX's format, but it's compatible with most major graphics engines on most operating systems. After all, in its text encoding, the model format is just a plain text file.)