Automated server crash detection & reboot
#1
Hi!

As the title suggests, I was wondering if you ever thought of implementing scripts to detect when a server stops responding, and bring them back online automatically?

I have been getting 'limbo'd' due to server crashes almost every night since the restart--in fact both of my avatars have been unable to log in since last night--and it's gotten to a point where I think a system like this could be hugely beneficial. Would it be feasible to write some shell scripts to accomplish this? I think it would go a long way towards improving the game experience for everyone.
#2
http://hazeron.com/wiki/index.php/Limbo
Quote:Nowadays however, the cause of limbo is commonly due to a server crash. The servers are all running in GDB, which causes them to lockup when a crash happens and wait for inspection before being able to resume or restart. This helps a lot with fixing server crashing bugs.
Hazeron Forum and Wiki Moderator
hazeron.com/wiki/User:Deantwo
#3
It is possible if it's a crash; there are many ways to do it, for example in Python:

https://stackoverflow.com/questions/2237...n-segfault
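
To make the idea concrete, here is a minimal sketch of a supervisor that notices an abnormal exit (including a segfault) and starts the process again. It assumes the server is an ordinary child process launched from the script; `describe_exit` and `run_until_clean_exit` are names made up for this example and have nothing to do with how the Hazeron servers are actually started.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable reason."""
    if returncode == 0:
        return "clean exit"
    if returncode < 0:
        # On POSIX, a negative return code means the child was killed
        # by that signal number (e.g. -11 for SIGSEGV on Linux).
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "unknown signal"
        return f"killed by signal {-returncode} ({name})"
    return f"exit code {returncode}"


def run_until_clean_exit(cmd, max_restarts=3):
    """Run cmd, restarting it after each abnormal exit, up to max_restarts times.

    Returns the list of exit reasons, one per run.
    """
    reasons = []
    for _ in range(max_restarts + 1):
        returncode = subprocess.run(cmd).returncode
        reasons.append(describe_exit(returncode))
        if returncode == 0:
            break
    return reasons


if __name__ == "__main__":
    # Hypothetical usage: python supervisor.py ./server --some-flag
    for reason in run_until_clean_exit(sys.argv[1:]):
        print(reason)
```

The restart cap is there so that a server which dies immediately on startup does not get relaunched forever.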

But some limbos might be due to deadlocks rather than crashes, in which case it will be a little more difficult (not that difficult, but a little more, needing an alive ping or something similar).
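
For the deadlock case, one possible form of "alive ping" is a heartbeat file: the server touches a file every few seconds from its main loop, and a watchdog treats the process as hung if the file gets too old. This is only a sketch under that assumption. The heartbeat file, `heartbeat_is_stale` and `watch` are all invented for the example, and the real servers would need a change to emit such a heartbeat.

```python
import os
import time


def heartbeat_is_stale(path, max_age_seconds, now=None):
    """Return True if the heartbeat file is missing or older than max_age_seconds."""
    now = time.time() if now is None else now
    try:
        last_beat = os.path.getmtime(path)
    except OSError:
        return True  # no heartbeat file at all counts as "not alive"
    return (now - last_beat) > max_age_seconds


def watch(proc, path, max_age_seconds, poll_seconds=1.0):
    """Poll a running subprocess.Popen object and kill it if its heartbeat goes stale.

    Returns "exited" if the process ended on its own, "killed" if the
    watchdog had to kill it. The caller is expected to create the heartbeat
    file before starting the process, so a slow startup is not mistaken
    for a hang.
    """
    while proc.poll() is None:
        if heartbeat_is_stale(path, max_age_seconds):
            proc.kill()
            proc.wait()
            return "killed"
        time.sleep(poll_seconds)
    return "exited"
```

A process that is deadlocked never exits by itself, which is why the watchdog has to kill it before anything can restart it.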

In both cases yes, an automatic solution should be found.
#4
(03-07-2020, 11:39 PM)Mal Wrote: As the title suggests, I was wondering if you ever thought of implementing scripts to detect when a server stops responding, and bring them back online automatically?

If a server stopped responding, it means something bad happened. If you blindly reboot it, the chances are high that this will happen again. Or whatever bad thing happened could spread to other servers.

This question has been raised over and over again for the last seven years, always with the same result.
The only thing that has changed is that all servers now run under GDB, so when they crash they don't just vanish but are held in their last state by the debugger, so a programmer can take a look and probably fix the root issue later.
#5
(03-08-2020, 06:16 PM)AnrDaemon Wrote: If a server stopped responding, it means something bad happened. If you blindly reboot it, the chances are high that this will happen again. Or whatever bad thing happened could spread to other servers.

In almost every case, it works like you started it the first time. It SHOULD restart, and log why and where it crashed. Will it happen again? Okay, it will in a few hours, but at least players can play instead of sometimes waiting for days until a manual server reboot is done.
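
A rough sketch of what "restart and log" could look like, with a guard against endless crash loops: each crash is logged with the exit status and the last lines the process wrote to stderr, restarts back off exponentially, and the supervisor gives up if there are too many crashes within a time window. The function names, the limits and the assumption that the server writes useful diagnostics to stderr are all illustrative, not a description of the actual Hazeron setup.

```python
import collections
import logging
import subprocess
import time

log = logging.getLogger("watchdog")


def run_and_log(cmd, stderr_tail_lines=20):
    """Run cmd once; on failure, log the exit status and the tail of its stderr."""
    proc = subprocess.Popen(cmd, stderr=subprocess.PIPE, text=True)
    # Reading stderr line by line until EOF; the deque keeps only the last lines.
    tail = collections.deque(proc.stderr, maxlen=stderr_tail_lines)
    returncode = proc.wait()
    if returncode != 0:
        log.error("server exited with %s; last stderr lines:\n%s",
                  returncode, "".join(tail))
    return returncode, list(tail)


def supervise(cmd, max_crashes=5, window_seconds=600, base_delay=1.0):
    """Restart cmd after crashes, backing off; give up if it crashes too often."""
    crash_times = collections.deque()
    while True:
        returncode, _ = run_and_log(cmd)
        if returncode == 0:
            return "clean exit"
        now = time.monotonic()
        crash_times.append(now)
        # Forget crashes that fell out of the sliding window.
        while crash_times and now - crash_times[0] > window_seconds:
            crash_times.popleft()
        if len(crash_times) >= max_crashes:
            log.critical("%d crashes within %ss, giving up",
                         len(crash_times), window_seconds)
            return "gave up"
        # Exponential backoff: 1x, 2x, 4x ... the base delay.
        time.sleep(base_delay * 2 ** (len(crash_times) - 1))
```

Giving up after repeated crashes leaves the decision to a human in exactly the cases where a blind reboot would keep hitting the same bug.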
#6
We NEED this!
#7
Imagine if a game like World of Warcraft used a system like this to handle crashes (and I'm sure their servers do crash).
#8
(03-08-2020, 11:27 PM)Nitro2030ce Wrote: Imagine if a game like World of Warcraft used a system like this to handle crashes (and I'm sure their servers do crash).

The problem here is that you are comparing a game with only one person working on it to one with a team of hundreds...
I agree that an automatic restart comes with trouble, i.e. problems that are not solved and so cause another crash after some time.
But! When you are alone doing the code, the level design, the server management, the community management, etc., you need to compromise on some points.
If Hazeron were still free, personally I wouldn't mind, but right now we are paying for a product, and a lot of our play time is wasted by these "limbos everywhere".
I sure don't like bringing money talk to the table, but it's a fact: once you pay for a service, it needs to do its utmost to satisfy its customers.
Don't get me wrong, I enjoy Hazeron a lot, and I'm happy to support its development, but like every human being on this planet, I don't like seeing my right to access a service I subscribe to being denied because of server issues that are years old and still not fixed... Especially when it can last a full day.
#9
(03-09-2020, 02:50 AM)Fedorya Wrote: I sure don't like bringing money talk to the table, but it's a fact: once you pay for a service, it needs to do its utmost to satisfy its customers.
... I don't like seeing my right to access a service I subscribe to being denied because of server issues that are years old and still not fixed...

If the servers are down for a lengthy amount of time, Haxus normally credits a little game time to everyone to compensate.

But no, the server issue isn't years old. I don't remember Haxus saying he has any major server-crashing issues that have been eluding him for years.
#10
I've only been able to play ~3h over the past 5 days because of limbo; a solution must be found.