Author Topic: Let's fix the linux multiplayer problem! (new workaround!)  (Read 16921 times)

titi

  • MegaGlest Team
  • Airship
  • ********
  • Posts: 4,240
    • View Profile
    • http://www.titusgames.de
Let's fix the linux multiplayer problem! (new workaround!)
« on: 11 February 2008, 09:47:31 »
It looks like a lot of people have trouble running a multiplayer game without a crash. Let's all work together to find the bugs!

1. Play a local single-player game to make sure your setup is OK.
2. Make sure you all have the same binary and data (3.0.0 would be best).
3. For the moment, please play separate 32-bit and 64-bit Linux games (do not mix them).
4. Start from a console so you can see errors when it crashes.
5. Report crashes (and their output!) here. If possible, include which Linux distribution and hardware (graphics by ATI or NVIDIA) each player used.
6. !!!! Please also report successful games !!!!
7. Report the compiler version that was used to build the binary (gcc --version).

(Update:
Use this script to start Glest and you will get a log file for every crash:
http://www.titusgames.de/runglest.tar.gz )

(I never had any trouble playing with my son, but we had very similar hardware and the same Linux distribution. With others over the internet I have had some successful games and some crashed ones.)
« Last Edit: 13 February 2008, 21:57:37 by titi »
Try Megaglest! Improved Engine / New factions / New tilesets / New maps / New scenarios

AF

  • Guest
« Reply #1 on: 11 February 2008, 11:15:06 »
You may be interested in the work Tobi did for Spring on sync between Windows and Linux and between x86 and x64, and in the streflop library.

martiño

  • Behemoth
  • *******
  • Posts: 1,095
    • View Profile
« Reply #2 on: 11 February 2008, 11:53:32 »
Yeah, everything runs perfectly on Windows, though. Windows/Linux and 32/64-bit compatibility will be our next focus.

titi

« Reply #3 on: 11 February 2008, 12:27:53 »
That's great to hear!

Does it make sense for us to post our results here?

AF

« Reply #4 on: 11 February 2008, 13:02:33 »
It only works because every working Glest build is compiled from the same source, with the same compiler, on the same platform.

As soon as you pit a VS2005 build against a mingw32 build, or a newer gcc build against an older one, you get errors, because floating-point calculations are carried out slightly differently, with different operations and different precision. This generates tiny differences which desync the game, and as the game continues they compound into huge differences which can crash the game, depending on how network traffic is interpreted.

To deal with this, the Spring developers used streflop to pin down floating-point precision, separated synced from unsynced code, and built the Windows release with mingw32 for better compatibility with *nix gcc builds.
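
To make the "tiny differences" concrete, here is a small Python sketch. It is not Glest or Spring code, and the three values are made up. It only shows that adding the same numbers in a different order gives a different double, and evaluation order is one of the things that can differ between compilers and optimisation settings.

```python
# Illustration only: floating-point addition is not associative, so two
# builds that evaluate "the same" sum in a different order can disagree
# in the last bit.

def sum_left_to_right(values):
    total = 0.0
    for v in values:
        total += v
    return total

def sum_right_to_left(values):
    total = 0.0
    for v in reversed(values):
        total += v
    return total

steps = [0.1, 0.2, 0.3]  # made-up per-tick movement amounts
a = sum_left_to_right(steps)
b = sum_right_to_left(steps)

print(a == b)      # False
print(abs(a - b))  # a difference in the last bit of the result
```

A difference this small is invisible on screen, but once it feeds into a comparison (is the unit in range? did it reach the cell?), two machines can take different branches and the game states drift apart from there.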

martiño

« Reply #5 on: 11 February 2008, 13:34:01 »
Quote from: "titi"
That's great to hear!

Does it make sense for us to post our results here?


Yeah, that would be really useful. I am especially interested in the 32/64-bit problem and also in gcc 3 vs. gcc 4 issues.

We are aware that floating-point results are not deterministic, and we know this is not an issue on Windows because we provide our own binaries. The issue arises when people compile with different compilers and on different machines. We are thinking about a way to fix it; we might use streflop or just fixed-point maths.

Regards.

Martiño.

AF

« Reply #6 on: 11 February 2008, 15:17:45 »
Fixed-point maths may not be wise, as it entails a performance hit. Streflop is not the only piece of code out there for this, especially since it was a pre-existing library that was totally rewritten by the Spring devs, IIRC. Perhaps some research is in order?
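
For anyone who has not met fixed-point maths, a minimal Python sketch of the idea follows. The 16.16 format and all function names are assumptions chosen for illustration, not anything from Glest or streflop. Values are kept as plain integers, so the arithmetic gives identical results on every machine; whether the performance cost matters would have to be measured.

```python
# Minimal 16.16 fixed-point sketch: 16 integer bits, 16 fractional bits.
# All arithmetic is done on integers, which behave identically everywhere.

FRACTION_BITS = 16
ONE = 1 << FRACTION_BITS  # the integer that represents 1.0

def to_fixed(x):
    """Convert a float to fixed point (ideally only once, when loading data)."""
    return int(round(x * ONE))

def to_float(f):
    """Convert back to float, for display only."""
    return f / ONE

def fx_mul(a, b):
    # The raw product carries 32 fractional bits; shift back down to 16.
    return (a * b) >> FRACTION_BITS

def fx_div(a, b):
    # Pre-shift the dividend so the quotient keeps 16 fractional bits.
    return (a << FRACTION_BITS) // b

speed = to_fixed(1.5)     # made-up example values
duration = to_fixed(2.0)
print(to_float(fx_mul(speed, duration)))  # 3.0
```

Addition and subtraction are ordinary integer `+` and `-`; only multiplication and division need the extra shift.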

martiño

« Reply #7 on: 11 February 2008, 15:50:27 »
Yeah, we will investigate. I would like to avoid third-party libraries if I can, though; Glest already has a lot of external dependencies.

titi

« Reply #8 on: 11 February 2008, 17:27:17 »
If you use the following start script for Glest, you will get a log file for every crash.
Unpack it into your Glest installation and start Glest with the script runglest.sh instead of glest.
http://www.titusgames.de/runglest.tar.gz
This will create a log file in the Glest directory.

If you didn't install Glest in your user directory, see the last lines of the script and uncomment the parts you need.
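
For readers who cannot fetch the archive, the general idea of such a wrapper can be sketched in Python. This is not the contents of runglest.sh; the function name, the log format and the `./glest` path in the usage note are assumptions. It runs a command, echoes its output to the console and copies everything, including error output, into a log file.

```python
import subprocess
import sys
import time

def run_with_log(command, log_path):
    """Run `command`, echo its output and copy it to `log_path`.

    Returns the exit code of the process.
    """
    with open(log_path, "w") as log:
        log.write("started: %s\n" % time.strftime("%Y-%m-%d %H:%M:%S"))
        process = subprocess.Popen(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge error output into the same stream
            text=True,
        )
        for line in process.stdout:
            sys.stdout.write(line)  # still visible on the console
            log.write(line)         # and kept for the crash report
        exit_code = process.wait()
        log.write("exit code: %d\n" % exit_code)
    return exit_code
```

Called as `run_with_log(["./glest"], "glest_crash.log")` from the Glest directory, it would leave a log behind even when the game was not started from a terminal.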
« Last Edit: 13 February 2008, 08:36:44 by titi »

titi

« Reply #9 on: 11 February 2008, 17:42:54 »
So here we go, the first crash :(

Glest 3.0.0
Server: Ubuntu 7.10, 32-bit, NVIDIA graphics (binary compiled with gcc (GCC) 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2))
Client: Ubuntu 6.06, 32-bit, NVIDIA graphics (binary compiled with gcc (GCC) 4.0.3 (Ubuntu 4.0.3-1ubuntu5))


Everything starts up fine, but after 10 minutes the client crashes.
- No error message, because we started without a console (sorry, the next one will have a log!)
-------------------------

Next game: same computers, but client and server swapped roles.
Now with an error message:
Exception: Can not find command type with id: 8 in unit: daemon
-------------------------
« Last Edit: 11 February 2008, 17:58:05 by titi »

martiño

« Reply #10 on: 11 February 2008, 17:47:11 »
I think one critical thing might be which version of GCC was used.

AF

« Reply #11 on: 11 February 2008, 20:25:27 »
It's not quite a third-party lib; it's not as if you can just plug it in, and whatever you do could well have far-reaching effects across the code base. But if you ever want Windows vs. Linux games without Wine, you don't have many other options.

titi

« Reply #12 on: 11 February 2008, 21:46:49 »
AF, why are you always so aggressive? That's probably why no one wants to answer you. Calm down a bit! Choose a gentler way to say things.

ttsmj

  • Guest
« Reply #13 on: 12 February 2008, 20:36:33 »
Quote
gcc (GCC) 4.1.3
Quote
gcc (GCC) 4.0.3


Hey titi, next time we're going to use the same binary, OK? And what about testing the latest SVN revision?

By the way, there is this option: configure --enable-debug. What does it do when I compile this way? Will the game have more detailed terminal output? Or what's the difference?

AF

« Reply #14 on: 13 February 2008, 14:01:11 »
titi, if I wanted to be aggressive I'd use big text and flashy colours!

Duke

  • Guest
« Reply #15 on: 13 February 2008, 18:03:44 »
Agreed. The worst I could say about AF's style is that it is not polite, but it is not impolite either, and it is far from aggressive.

About the topic: I had this exception with the daemon once, even in single player.
I think the situation was that the unit was slain but did not fall down, and when I tried to move it the game couldn't find it.

So it could be that, aside from the desynchronisation, there is some kind of packet loss.

AF

« Reply #16 on: 13 February 2008, 18:20:14 »
Well, before we aim at fixing the sync problem, we should be able to prevent it from causing crashes. After all, a slew of console error messages and warnings is far more useful than a crash message, and it would help track down desync causes too.

As for me, I'd say to the point and perhaps blunt.

titi

« Reply #17 on: 13 February 2008, 19:22:44 »
Duke, did you really have this crash in single-player mode???
My sons and I play on Linux very often and we have never, never, never had a problem like this! But if that's really the case, it's probably just a "simple" bug and not the big sync trouble.

Another thing we should try here is my binary, built on Ubuntu Dapper 6.06 with glibc 2.3.6. It should (hopefully) run on a lot of systems. Let's all try to use this binary in multiplayer mode.

It's Glest 3.0.0
http://www.titusgames.de/linuxglest300.tar.gz

@AF: I looked at the code, and it's easy to ignore these errors and give warnings instead. That could be done easily. But let's wait a bit and see what martiño and matzeB will do. I think/hope they are on it and there will be some debug sessions soon.

AF

« Reply #18 on: 13 February 2008, 19:52:28 »
hmmm.

My NTai AI project had a buffer overrun bug in it for months. At the time I didn't know how to use a debugger, and it never crashed for me, which is why it went undetected. One day I learnt how to debug C++ programs and found it, and after the next release a large group of people started commenting that they had never been able to play before.

That was a long time ago, but it serves to show that sometimes a crash bug only affects some people, despite their using the same binary and OS. Weird ^_^

martiño

« Reply #19 on: 13 February 2008, 20:18:07 »
Quote from: "AF"
Well, before we aim at fixing the sync problem, we should be able to prevent it from causing crashes. After all, a slew of console error messages and warnings is far more useful than a crash message, and it would help track down desync causes too.

As for me, I'd say to the point and perhaps blunt.


I completely disagree. The crashes are intentional (it would be as difficult to let the game run). It is far better to get a crash and know that the game is desynchronized than to keep playing a game which is different on every machine.

As for a way of fixing the problem, we are considering distributing some kind of "reference binaries", so everybody has the same executable.
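
The "stop as soon as the game is known to be out of sync" idea can be sketched in Python as follows. This is an assumption-laden illustration, not how Glest detects desyncs: the unit tuple layout, the function names and the use of SHA-1 are all made up here. Each peer hashes its own game state every few frames and compares the result with the digest the other side sent.

```python
import hashlib
import struct

class DesyncError(Exception):
    """Raised when two peers disagree about the game state."""

def state_checksum(units):
    """Hash a list of (unit_id, x, y, hp) integer tuples into a hex digest."""
    digest = hashlib.sha1()
    for unit_id, x, y, hp in sorted(units):
        # Fixed-size little-endian packing, so every platform hashes the same bytes.
        digest.update(struct.pack("<iiii", unit_id, x, y, hp))
    return digest.hexdigest()

def check_in_sync(frame, local_units, remote_checksum):
    """Fail loudly if the remote peer's checksum differs from ours."""
    local_checksum = state_checksum(local_units)
    if local_checksum != remote_checksum:
        raise DesyncError(
            "frame %d: local %s != remote %s"
            % (frame, local_checksum, remote_checksum)
        )
```

Only the digest has to travel over the network, and the first frame on which it differs narrows down where the two simulations diverged.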

titi

IT WORKS!!!!!!!
« Reply #20 on: 13 February 2008, 21:50:52 »
NO CRASH WITH THE SAME BINARY!!!!
We played some games today using my binary and there were no more crashes!!!!!! It simply works!
Really different hardware this time, and no more errors!

So please use it and tell me about any errors:
http://www.titusgames.de/linuxglest300.tar.gz

My "reference" binary uses an old glibc (2.3.6), so it should run on most systems! (Built on Ubuntu 6.06 Dapper.)
Martiño, my binary could probably be the Linux reference; it's tested :)


-----------------------------------------
tested gaming:
host:
AMD Athlon(tm) 64 Processor 3400+
NVidia Geforce 6600GT
1GB ram
Ubuntu 6.06 Dapper

client:
Intel(R) Celeron(R) CPU 2.00GHz
GeForce 6200/AGP/SSE2

titi

« Reply #21 on: 13 February 2008, 22:48:37 »
martiño, when you release version 3.1.0 you should probably put your old binary check back in. This will help make sure that all players use the same binary.

If such an error is found, there is no need to quit the whole game (in my opinion). Just show an error message and go back to the main menu. That would be better!!
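
What I have in mind could look roughly like this Python sketch. It is not the old Glest check, whose details are not described here; the function names, the SHA-256 digest and the "main_menu"/"start_game" return values are assumptions for illustration. The client and server each hash their own executable, exchange the digests when connecting, and on a mismatch the client returns to the menu with a message instead of exiting.

```python
import hashlib

def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def join_game(local_digest, server_digest):
    """Decide what the client does after exchanging binary digests.

    Returns ("start_game", None) on a match and ("main_menu", message)
    on a mismatch, so the program keeps running either way.
    """
    if local_digest == server_digest:
        return "start_game", None
    message = "Binary mismatch: server %s..., client %s..." % (
        server_digest[:8],
        local_digest[:8],
    )
    return "main_menu", message
```

The check runs before the game starts, so players find out about a mismatch at once rather than after some minutes of play.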

Duke

« Reply #22 on: 14 February 2008, 01:22:29 »
When I think about it, I'm not sure it was exactly THIS bug, but the one I'm referring to was definitely in single player, since I haven't played multiplayer yet.

It is just a hint that it could be a very rare bug that becomes much less rare because of the desync.
Maybe the client is commanding a unit that does not exist on the server?

Of course, reference binaries are a solution to the desync caused by slightly different mathematics.
But titi only tested on a LAN, right?

Is there code to catch a desync caused by lag from high ping over the internet? I somehow doubt it, because as I understand it such code should be able to handle the other kind of desync as well.

titi

« Reply #23 on: 14 February 2008, 10:00:54 »
No, it was tested in real life with high ping over the internet. The client had some speed-ups to get back into sync (when the ping was really bad), but that's all.
We played for about 3 hours without a crash. When we used different binaries, we only had to wait a few minutes until it crashed.

jrepan

  • Guest
« Reply #24 on: 14 February 2008, 14:28:28 »
A single binary is not a very good solution. It has many problems:
1. Linux and Windows (and FreeBSD) binaries are different anyway
2. You can't apply patches if you can't compile
3. Not everybody has the same library versions, and static binaries are a waste of space