Hazeron Forums
Automated server crash detection & reboot - Printable Version



Automated server crash detection & reboot - Mal - 03-07-2020

Hi!

As the title suggests, I was wondering if you have ever thought of implementing scripts to detect when a server stops responding and bring it back online automatically?

I have been getting 'limbo'd' due to server crashes almost every night since the restart--in fact both of my avatars have been unable to log in since last night--and it's gotten to a point where I think a system like this could be hugely beneficial. Would it be feasible to write some shell scripts to accomplish this? I think it would go a long way towards improving the game experience for everyone.


RE: Automated server crash detection & reboot - Deantwo - 03-07-2020

http://hazeron.com/wiki/index.php/Limbo
Quote:Nowadays however, the cause of limbo is commonly due to a server crash. The servers are all running in GDB, which causes them to lockup when a crash happens and wait for inspection before being able to resume or restart. This helps a lot with fixing server crashing bugs.



RE: Automated server crash detection & reboot - Xantheose - 03-08-2020

It is possible if it's a crash. There are many ways to do it, for example in Python:

https://stackoverflow.com/questions/22370928/is-it-possible-to-automaticly-restart-program-in-gdb-after-exception-segfault
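
A minimal sketch of that idea in Python, under stated assumptions: the `./server` path is a placeholder, and `gdb_batch_command` and `supervise` are illustrative names, not anything from Hazeron. Running GDB with `-batch -ex run -ex bt --args ...` makes it run the program, print a backtrace if the program stops on a crash, and then exit instead of waiting at a prompt, so a wrapper can keep the output and start the server again.

```python
import subprocess
import time


def gdb_batch_command(server_cmd):
    """Build a GDB command line that runs server_cmd non-interactively.

    With -batch, GDB executes the -ex commands in order and then exits,
    so a crash produces a backtrace on stdout instead of a waiting prompt.
    """
    return ["gdb", "-batch", "-ex", "run", "-ex", "bt", "--args", *server_cmd]


def supervise(cmd, max_restarts=3, delay=5.0, on_exit=None):
    """Run cmd and start it again each time it exits.

    A server is not expected to exit at all, so every exit counts as a
    failure. on_exit (optional) receives each CompletedProcess, e.g. to
    save the captured backtrace. Output is buffered in memory here, which
    is fine for a sketch but not for a long-running production server.
    Returns the number of times cmd was started (at most max_restarts + 1).
    """
    starts = 0
    while True:
        result = subprocess.run(cmd, capture_output=True, text=True)
        starts += 1
        if on_exit is not None:
            on_exit(result)
        if starts > max_restarts:
            return starts  # give up and leave the rest to a human
        time.sleep(delay)
```

It could be called as `supervise(gdb_batch_command(["./server"]), on_exit=save_backtrace)`, where `save_backtrace` is a hypothetical callback that writes `result.stdout` to a file. Capping the restarts keeps a server that crashes immediately on startup from looping forever.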

But some limbos might be due to deadlocks rather than crashes, in which case it will be a little more difficult (not that difficult, but a little more, with an alive ping or whatever).
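
The alive ping could look roughly like the sketch below. The `PING`/`PONG` exchange is a made-up protocol for illustration, and `is_alive` is a hypothetical helper; the real servers would need some equivalent request that only a live, non-deadlocked process can answer.

```python
import socket


def is_alive(host, port, timeout=2.0):
    """Return True if the server answers an application-level ping in time.

    A plain TCP connect is not enough: the operating system can complete
    the connection on behalf of a deadlocked process, so this check waits
    for an actual reply from the server code.
    """
    try:
        # The timeout applies to the connect and to the later send/recv.
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"PING\n")
            return conn.recv(16).startswith(b"PONG")
    except OSError:
        # Covers refused connections, timeouts, and connection resets.
        return False
```

A watchdog would call this every so often and treat a few consecutive `False` results as a hang, rather than reacting to a single slow reply.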

In both cases yes, an automatic solution should be found.


RE: Automated server crash detection & reboot - AnrDaemon - 03-08-2020

(03-07-2020, 11:39 PM)Mal Wrote: As the title suggests, I was wondering if you have ever thought of implementing scripts to detect when a server stops responding and bring it back online automatically?

If a server stopped responding, it means something bad happened. If you blindly reboot it, the chances are high that it will happen again. Or whatever bad thing happened could spread to other servers.

This question has been raised over and over again for the last seven years, always with the same result.
The only thing that has changed is that all servers now run under GDB, so when they crash they don't just vanish but are held in their last state by the debugger, so a programmer can take a look and probably fix the root issue later.


RE: Automated server crash detection & reboot - Xantheose - 03-08-2020

(03-08-2020, 06:16 PM)AnrDaemon Wrote: If a server stopped responding, it means something bad happened. If you blindly reboot it, the chances are high that it will happen again. Or whatever bad thing happened could spread to other servers.

In almost every case, a restarted server works just like it did when you started it the first time. It SHOULD restart, and log why and where it crashed. Will it happen again? Okay, it will, in a few hours, but at least players can play instead of sometimes waiting days for a manual server reboot.
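
As a rough sketch of the "log why it crashed" part, assuming a server process started directly from Python with `subprocess` (under GDB the return code would be GDB's own, not the server's): `describe_exit` and `log_crash` are hypothetical helpers, and the backtrace text is whatever a debugger printed, if anything.

```python
import signal
import time


def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable reason.

    On POSIX, subprocess reports a child killed by signal N as -N.
    """
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"  # number with no known name
        return f"killed by {name}"
    return f"exited with status {returncode}"


def log_crash(log_path, returncode, backtrace=""):
    """Append a timestamped crash record to log_path.

    Returns the summary line that was written, so the caller can also
    print it or send it somewhere else.
    """
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    line = f"{stamp} server {describe_exit(returncode)}"
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(line + "\n")
        if backtrace:
            log.write(backtrace.rstrip() + "\n")
    return line
```

Appending to a log before each restart keeps a record of every crash, so the restart does not throw away all of the evidence a developer would want later.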


RE: Automated server crash detection & reboot - Norm49 - 03-08-2020

We NEED this!


RE: Automated server crash detection & reboot - Nitro2030ce - 03-08-2020

Imagine if a game like World of Warcraft used a system like this to handle crashes (and I'm sure their servers do crash).


RE: Automated server crash detection & reboot - Fedorya - 03-09-2020

(03-08-2020, 11:27 PM)Nitro2030ce Wrote: Imagine if a game like World of Warcraft used a system like this to handle crashes (and I'm sure their servers do crash).

The problem here is that you are comparing a game with only one person working on it to one with a team of hundreds...
I agree that an automatic restart comes with trouble, namely that the problem isn't solved and the server crashes again after some time.
But! When you are alone doing the code, the level design, the server management, the community management, etc., you need to compromise on some points.
If Hazeron were still free, personally I wouldn't mind, but right now we are paying for a product, and a lot of our play time is wasted by these "Limbo everywhere" situations.
I don't like bringing money into the discussion, but it's a fact: once you pay for a service, it needs to do its utmost to satisfy its customers.
Don't get me wrong, I enjoy Hazeron a lot, and I'm happy to support its development, but like every human being on this planet, I don't like seeing my right to access a service I subscribe to being denied because of server issues that are years old and still not fixed... Especially when it can last a full day.


RE: Automated server crash detection & reboot - Deantwo - 03-09-2020

(03-09-2020, 02:50 AM)Fedorya Wrote: I don't like bringing money into the discussion, but it's a fact: once you pay for a service, it needs to do its utmost to satisfy its customers.
... I don't like seeing my right to access a service I subscribe to being denied because of server issues that are years old and still not fixed...

If the servers are down for a lengthy amount of time, Haxus normally credits a little game time to everyone to compensate.

But no, the server issue isn't years old. I don't remember Haxus saying he has any major server-crashing issue that has been eluding him for years.


RE: Automated server crash detection & reboot - AlrianneG - 03-09-2020

Having been able to play only ~3h over the past 5 days because of limbo, I think a solution must be found.