
tests results sharing for "unstable" GG5

Posted: Sat Dec 20, 2008 12:53 pm
by -=k2=-
Hi guys,
I'd like to share some test results with you concerning the instability, i.e. the random crashes on map changes.

My initial config:
Debian 4.0 Linux server
EventScripts 2.0 beta RC250
Metamod 1.4.2.705 (started by gameinfo.txt)
Mani 1.2BetaS (started by Metamod)
GunGame utils
EST 0.420
GG5 RC277 configured with deathmatch, random spawnpoints, votemap, handicap, multi-level bonus, turbo, earn grenades, stats login, dissolver, knife pro
custom addon gg_winner
French translation

This config was "unstable", crashing randomly after a few maps.

So I have tried to identify or narrow down the possible causes, or at least rule out the things that are not causing the crashes.

I have tried a couple of things:

Speeding up map changes
I have set a very short weapon cycle of 4 weapons, no HE, no knife, so the server switches maps quickly, every 2-3 minutes.
All the maps used in the cycle are tested maps (used for 2-3 years on GG3 and GG4 servers in full-time production, and very stable).

Spawnpoint conversion from GG4 to GG5:
I have used the converter, but it gives me weird results if I convert all files in one shot:
the size of the spawnpoint files increases dramatically in sequence, from 2K up to 120K for the last one...
The unverified spawnpoint files were also causing crashes.
So now I'm using only a few spawnpoint files that I'm sure are correct and that work on a deathmatch 2.1 version.
I have verified them one by one, in format and content.
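
The one-by-one check described above could be automated. A minimal sketch, assuming each non-empty line of a spawnpoint file holds six whitespace-separated numbers (x y z plus three angles); that layout, the 50 KB size threshold and the function names are assumptions for illustration, not the documented GG5 format:

```python
import os

# ASSUMPTION: one spawnpoint per line, six numbers (x y z pitch yaw roll).
FIELDS_PER_LINE = 6
# ASSUMPTION: arbitrary size threshold to flag files bloated by the converter.
MAX_BYTES = 50 * 1024


def check_spawnpoint_lines(lines):
    """Return a list of (line_number, problem) for malformed lines."""
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            continue  # ignore blank lines
        fields = text.split()
        if len(fields) != FIELDS_PER_LINE:
            problems.append((number, "expected %d fields, got %d"
                             % (FIELDS_PER_LINE, len(fields))))
            continue
        try:
            [float(value) for value in fields]
        except ValueError:
            problems.append((number, "non-numeric value"))
    return problems


def check_spawnpoint_file(path):
    """Check the size and the content of one spawnpoint file."""
    problems = []
    if os.path.getsize(path) > MAX_BYTES:
        # Line number 0 marks a problem with the file as a whole.
        problems.append((0, "file larger than %d bytes" % MAX_BYTES))
    with open(path) as handle:
        problems.extend(check_spawnpoint_lines(handle))
    return problems
```

An empty result from `check_spawnpoint_file` only means the file passed these two basic checks; it says nothing about whether the coordinates are valid for the map.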


Vote testing:
A1) server with Mani votemap,
A2) GG votemap,
A3) without votemap,
No change: still crashing randomly after a few maps.

Plugins
B1) run the server without EST, just gg_utils
B2) run the server without Mani
B3) run the server without Mani and without Metamod, just EventScripts RC250
No change: still crashing randomly after a few maps.

Translations
Reverted to the "original" English version, no French.
Still crashing, but less. I have removed all ' characters from the translation and used only the éàè French characters.
So I'm convinced there is something there, most probably due to my translation and some characters that are not "escaped" in the translation wording. I suspect ' and () are being interpreted as command or variable operators or delimiters.
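
To hunt for the suspect characters, the translation strings can be scanned mechanically. A minimal sketch, assuming the translations are available as a plain dict of key to text; the dict layout, the character set and the function name are illustrative assumptions, not the GunGame translation API:

```python
# ASSUMPTION: the characters suspected above of being read as delimiters.
SUSPECT_CHARS = set("'()")


def find_suspect_strings(translations):
    """Map each translation key to the sorted suspect characters it contains."""
    report = {}
    for key, text in translations.items():
        found = sorted(SUSPECT_CHARS.intersection(text))
        if found:
            report[key] = found
    return report
```

Strings that show up in the report are candidates for rewording or escaping; strings that do not can be ruled out for this particular suspicion.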


gg_winner plugin
I have seen this error many times in the error log:

File "/home/ftp/doozer_VI/cstrike/addons/eventscripts/gungame/custom_addons/gg_winner_display/gg_winner_display.py", line 55, in sendDisplay
usermsg.motd(userid, 2, 'GunGame Winner', '%s?winnerName=%s&loserName=%s&wins=%s&place=%s&totalPlaces=%s' % (gungamelib.getVariableValue('gg_winner_display_page'), attackername, username, wins, place, totalPlaces))
File "/home/ftp/doozer_VI/cstrike/addons/eventscripts/_libs/python/usermsg.py", line 65, in motd
showVGUIPanel(users, 'info', visible, data)
NameError: global name 'users' is not defined

I have disabled the custom addon gg_winner and it seems to improve stability again; testing is underway.
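
The NameError in that traceback is the kind Python only raises when the faulty line actually runs, so a library can load fine and still fail on the first call. A minimal sketch of the pattern, with stand-in functions (this is not the actual usermsg.py code; `show_panel`, `call_safely` and both `motd_*` helpers are hypothetical):

```python
def show_panel(users, name, visible, data):
    """Stand-in for a panel-sending helper; just returns its arguments."""
    return (users, name, visible, data)


def motd_broken(userid, visible, data):
    # BUG: the parameter is called `userid`, but the body refers to `users`,
    # which is defined nowhere. Python only notices when this line executes.
    return show_panel(users, "info", visible, data)


def motd_fixed(userid, visible, data):
    # Use the name that was actually passed in.
    return show_panel(userid, "info", visible, data)


def call_safely(func, *args):
    """Call func and return (result, error_text) instead of raising."""
    try:
        return func(*args), None
    except NameError as error:
        return None, str(error)
```

This also explains why the error is counted many times in the log: every call that reaches the broken line fails the same way.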

GG version
Downgraded from RC277 to RC257.
I moved to this version before I disabled the gg_winner addon.


Config currently under test:
EventScripts 2.0 RC250 (from RC248c)
GG5 RC257
gg_utils
No EST, Metamod or Mani admin anymore

So to summarize, for the folks with an unstable config, I would advise looking at:
- spawnpoint files (remove them if not verified)
- plugins (start with a minimal setup)
- translation (go back to English only if the translation is doubtful)
- custom addons (just use the included native addons and disable as many of them as possible to start with)

Now running as a free public test server at the following address: 91.121.181.138:27090

I would suggest adding an option to the debugger to trace map changes and keep track of the pending map when a crash occurs. This may help to tune the server, as there are many things that can screw up: bad maps, spawnpoint files, etc.
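
A minimal sketch of such a tracer, assuming some hook calls `record_pending_map` just before each map change; the hook, the file name and the function names are hypothetical, not an existing GunGame or EventScripts option. It appends the pending map to a file and forces the write to disk, so that the entry survives a crash:

```python
import os
import time

# ASSUMPTION: hypothetical trace file; any writable path would do.
TRACE_FILE = "mapchange_trace.log"


def record_pending_map(map_name, path=TRACE_FILE):
    """Append the map about to be loaded, forcing it to disk before the change."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    with open(path, "a") as handle:
        handle.write("%s pending %s\n" % (stamp, map_name))
        handle.flush()
        os.fsync(handle.fileno())  # make sure the line survives a crash


def last_pending_map(path=TRACE_FILE):
    """Return the last map recorded before the server stopped, or None."""
    if not os.path.exists(path):
        return None
    with open(path) as handle:
        lines = [line.split() for line in handle if line.strip()]
    # The map name is the last field of the last recorded line.
    return lines[-1][-1] if lines else None
```

After a crash, `last_pending_map()` gives the map that was being loaded, which can then be matched against its spawnpoint file.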

Is there anything you want me to test? I have full control of the game server and I'm root on the Linux server.
I'll have 2-3 weeks free during Christmas for debugging.

let me know!

cheers!

-=k2=-

Posted: Sat Dec 20, 2008 2:01 pm
by Saul
Memory leak?

That's the only thing I can think of at the moment...

Posted: Sat Dec 20, 2008 4:32 pm
by -=k2=-
Hi Saul,

I don't think it is a memory leak, because I have 4+1 other servers on the same machine running flawlessly; they are very stable and online 24/7.
I also do not have memory trap errors in the Linux kernel.

I'll keep you informed about my various tests.

cheers

-=k2=-

Posted: Sun Dec 21, 2008 5:38 pm
by your-name-here
It's a memory leak.

Tnbporsche911 has been telling me quite a bit about how his servers are consuming ludicrous amounts of RAM. He is using a Windows server. So Saul, your assertion would be correct.

Posted: Sun Dec 21, 2008 9:42 pm
by RideGuy
It's not a memory leak. The server crashes if the memory footprint hits 1.5GB. There is no way the server is reaching that after only a few maps.

We have coded instability into one of the later revisions of GunGame. None of our servers (both Windows and Linux) crash, but given the number of Linux server owners reporting crashes, there has to be a problem.

We have asked a couple of our testers who have servers that crash to revert to an earlier version. If that works, we might revert everything and then slowly reintroduce the last couple of changes.


RideGuy

Posted: Mon Dec 22, 2008 12:52 am
by -=k2=-
Concerning the memory, I have 4GB on the hardware server and only 4 servers running on it: 2 public servers of 24 slots, never full at the same time, and 2 war servers of 12 slots.
I have never reached more than 2GB of memory usage; see the graphs here: http://91.121.181.138/munin/ovh.net/ns363699.ovh.net-memory.html

Concerning the GunGame mod, look here for the 27090 server, an RC577 that worked without problems for 9 hours today and then suddenly saw its fps drop from 800 to 0 and ate all the CPU, up to 200%; we had to shut it down. It also allocated too much memory, estimated at 500K for this server alone.
This event has been captured in the graphs here: http://91.121.181.138/munin/ovh.net/ns363699.ovh.net.html#Sourceds
I have attached and documented this in issue log id 98.

The only type of error I'm getting in the log is:
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Exception caught: 21/12/2008 @ [19:38:17] [Occurences: 219]
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Traceback (most recent call last):
  File "/home/ftp/doozer_VI/cstrike/addons/eventscripts/es.py", line 207, in tick
    x()
  File "/home/ftp/doozer_VI/cstrike/addons/eventscripts/_libs/python/gamethread.py", line 179, in tick
    _executenode(task)
  File "/home/ftp/doozer_VI/cstrike/addons/eventscripts/_libs/python/gamethread.py", line 156, in _executenode
    function(*a, **kw)
  File "/home/ftp/doozer_VI/cstrike/addons/eventscripts/_libs/python/testrepeat.py", line 276, in fire
    raise RepeatError('Cannot fire repeat: \"%s\" does not exist.' %repeatName)
RepeatError: Cannot fire repeat: "gungameWarmupTimer" does not exist.
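
The RepeatError above says that something tried to fire a named timer that does not exist, for instance because it had already been deleted. A minimal sketch of guarding against that, using a stand-in registry rather than the real EventScripts repeat library; the `TimerRegistry` class and its methods are hypothetical:

```python
class TimerRegistry:
    """Stand-in for a named-timer store (not the EventScripts repeat API)."""

    def __init__(self):
        self._timers = {}

    def create(self, name, callback):
        self._timers[name] = callback

    def delete(self, name):
        # Deleting an unknown timer is a no-op rather than an error.
        self._timers.pop(name, None)

    def fire(self, name):
        """Unguarded fire: raises KeyError if the timer is gone."""
        return self._timers[name]()

    def fire_if_exists(self, name):
        """Guarded fire: returns None instead of raising for a missing timer."""
        callback = self._timers.get(name)
        if callback is None:
            return None
        return callback()
```

A guard like this would only silence the log entry; whether the missing warmup timer is related to the crashes is a separate question.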


Cheers

-=k2=-

Posted: Sat Jan 03, 2009 4:07 pm
by Saul
-=k2=-, are there any lines in the console like:
Assertion Failed: IsIdxValid(i)

Right before the crash (the last line in the console)?

Posted: Tue Jan 06, 2009 8:30 pm
by -=k2=-
Hi Saul,

Not at all, this is the first time I've heard of this kind of error.

Nevertheless, I have continued my investigations, and here are some findings/confirmations and open questions...

Certainties/confirmations:
    I am now certain that spawnpoint files play a role in the crashes, maybe not for all maps but for some; just making a new spawnpoint file for some maps solved the problem.
    When I do not use deathmatch and the associated spawnpoint management, no crash occurs; the same is true if the spawnpoint file is removed for the particular map. I would thus say the deathmatch module is not directly the cause.

Open questions:
    I am still facing a weird situation between my two servers, one in production and the other for testing.
- Some maps, despite the regeneration of the spawnpoint file, still crash the production server... BUT NOT the test server.

The fact is that all the maps, materials, sounds, etc. are exactly the same for the two servers: as I own the dedicated Linux server, I made symbolic links to the maps, materials and sound directories for all my servers, so I only have to install the content once... héhéhé, clever, isn't it?
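
The shared-content setup can be sketched as follows, assuming one master content directory and one directory per server; the directory names and the function name are made up for illustration:

```python
import os

# ASSUMPTION: names of the directories shared between the servers.
SHARED_DIRS = ("maps", "materials", "sound")


def link_shared_content(master_dir, server_dir, names=SHARED_DIRS):
    """Symlink each shared directory of master_dir into server_dir.

    Existing entries in server_dir are left untouched; returns the links made.
    """
    created = []
    for name in names:
        source = os.path.join(master_dir, name)
        target = os.path.join(server_dir, name)
        if os.path.lexists(target):
            continue  # never overwrite real content or an existing link
        os.symlink(source, target)
        created.append(target)
    return created
```

With this layout both servers read the very same files, so a map that crashes only one of them points at something other than the map content itself.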

The only differences currently between my test and my production servers are:
- RC577 for production (failing on some maps) and RC585 for testing
- the French language file translation is available in RC577, not in RC585
- RC577 has only gg_utils while RC585 has gg_utils and EST, but both are configured for GG to use gg_utils

Apart from that:
- both use the same modules: deathmatch, spawnpoints, earn nades, intro message, dissolver, turbo and knife rookie
- both servers use mani_admin and Metamod, but I was having crashes even without them; adding them did not change the behavior. EventScripts is RC250
- so with the setup I have on both servers, I can say they are stable (estimated on a daily basis, because I reboot the Linux server every single day)
- still, the maps and their spawnpoint files, or the included spawnpoints addon, are driving the "instability" in my case :(

If you want me to run some tests, let me know; I still have a few days to work on this, and after that I'll be back at work :cry:
Maybe talking about this on TeamSpeak would be easier for both of us, let me know.

Cheers

-=k2=-