HellsReapersKO - BETA CLOSED - RELAUNCH ANNOUNCED FOR NOV. 8TH. 100% Stability. (Private Servers forum, Knight Online category, ko4life.com)
This server will never be up or it will be one big fail with 20ppl online ... just have fun guys ^^
Yeah, why open the server if you're not even gonna have it open for the first 2 weeks?
The logging errors Penguin was talking about are with the item logging in the Ebenezer shared memory. You had a fix for that a while ago, but we never thought to implement it because we never had a problem with fewer than 40 or so players, so when the server started seeing a lot more activity we got the errors and were really confused at first. The Death Logger I wrote doesn't really "read" the whole file every 500 ms. It jumps to the last place it checked in the file, then looks for new lines that have been written since. That check happens every 500 ms, and even then it doesn't lock the file for read/write access, and it gets the rest of the information straight from memory. The big issue right now is getting a decent ODBC driver to work on Debian Linux so we can have a stable webserver that won't crash when the CPU hits 100%. Windows is really good about crashing when anything gets above 90%.
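For anyone following along, here is a rough Python sketch of the "jump to the last place, then read only new lines" idea described above. It is not the actual Death Logger code; the class name `LogFollower`, the helper `follow`, and the file layout are all made up for illustration.

```python
import time


class LogFollower:
    """Remember the last read offset and return only lines appended since."""

    def __init__(self, path):
        self.path = path
        self.offset = 0      # byte position where the previous poll stopped
        self.partial = b""   # tail of a line whose newline has not arrived yet

    def poll(self):
        # Open read-only on every poll; no lock is requested on the file.
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            data = f.read()
            self.offset = f.tell()
        chunks = (self.partial + data).split(b"\n")
        self.partial = chunks.pop()  # empty when data ended on a newline
        return [c.decode("utf-8", "replace").rstrip("\r") for c in chunks]


def follow(path, interval=0.5):
    """Yield new lines forever, checking every `interval` seconds (500 ms here)."""
    follower = LogFollower(path)
    while True:
        for line in follower.poll():
            yield line
        time.sleep(interval)
```

Each poll only costs a seek plus a read of whatever was appended, so the work per check scales with the new lines rather than with the size of the file.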
FreeTDS was my first option, but I was having trouble implementing it on the website because of how picky it is. It will throw an error if you don't read in all the lines, or if you have more than one connection at a time, or if you don't clear the connection properly between queries, etc., and I had no luck getting it working on the website. So I have now installed the official Linux port of MS SQL's ODBC driver (built for Red Hat, but there was a workaround for Debian). The only problem now is getting PHP to use the DSN or driver (it works in the console).
Well, last night I was carefully watching the server and its processes while asking people to log in. Here is what happened when the server started lagging. This is pulled from Aujard.
2013-11-3 4:26, *** 08S01, 121, [Microsoft][SQL Server Native Client 10.0]TCP Provider: The semaphore timeout period has expired.
Then after more people logged in, it would hit this.
2013-11-3 4:26, *** 08S01, 121, [Microsoft][SQL Server Native Client 10.0]Communication link failure, 68 ***
At this point monsters would become unattackable, players couldn't talk to NPCs, and our CPU was being maxed out because Aujard was about to crash.
Then this would happen.
2013-11-3 4:27, *** 08S01, 10054, [Microsoft][SQL Server Native Client 10.0]TCP Provider: An existing connection was forcibly closed by the remote host.
At this point everyone online would get DC'd. I assume it was the game server booting people off before Aujard crashed. Once it booted everyone, the CPU was fine, Aujard was fine, and the server was perfectly stable. I told people to get back on. 12 people were fine, but once we got to about 20 people, our TCP was maxing out at 70, and then it all happened again in the exact same order.
These are just my observations. Dark is trying to figure it out, but some insight from twostars would be really appreciated at this point. This is what is making us unstable.
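To line up the timing of errors like the three above, a small parser can help. This is only a sketch in Python: the regular expression is guessed from the three lines quoted in this post, so the real Aujard log may contain formats it does not cover, and `parse_line` / `summarize` are names invented here.

```python
import re
from collections import Counter

# Layout assumed from the quoted lines:
#   <date> <time>, *** <SQLSTATE>, <native error>, [source][driver]<message>
LINE_RE = re.compile(
    r"^(?P<when>\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{2}), \*\*\* "
    r"(?P<sqlstate>\w{5}), (?P<native>\d+), "
    r"(?P<source>(?:\[[^\]]*\])+)(?P<message>.*)$"
)

SAMPLE = [
    "2013-11-3 4:26, *** 08S01, 121, [Microsoft][SQL Server Native Client 10.0]"
    "TCP Provider: The semaphore timeout period has expired.",
    "2013-11-3 4:26, *** 08S01, 121, [Microsoft][SQL Server Native Client 10.0]"
    "Communication link failure, 68 ***",
    "2013-11-3 4:27, *** 08S01, 10054, [Microsoft][SQL Server Native Client 10.0]"
    "TCP Provider: An existing connection was forcibly closed by the remote host.",
]


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    return m.groupdict() if m else None


def summarize(lines):
    """Count matching errors per minute and per native error code."""
    per_minute = Counter()
    per_native = Counter()
    for line in lines:
        rec = parse_line(line)
        if rec is None:
            continue  # skip anything that is not in the assumed format
        per_minute[rec["when"]] += 1
        per_native[rec["native"]] += 1
    return per_minute, per_native
```

Running `summarize` over the whole log and comparing the per-minute counts with the player count at the time would show whether the errors really start at the same load each time.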
And you know I've googled this all night until I went to sleep. Most people couldn't give any kind of good answer, and the ones that did just sort of said "Seems like a networking issue, seems like whatever you're doing your network can't handle it."
Oh and here's something interesting, our monsters have become non-attackable, but players aren't lagging. You can talk to one another, skills still use MP, and I killed a worm about 5 minutes ago, and just now leveled up. Aujard isn't getting an error. Nothing is.
Last edited by TheREALPenguin; 11-03-2013 at 04:31 PM.
CPU maxed out by what... Aujard? I'm thinking something might be making an obscene number of connections to something (the database server?).
And yeah, FreeTDS is pretty fussy. But once you've identified all its kinks, it is, at least, reliable.
That's fucked up.
a. It should only be using a single core (your server has 1 core?).
b. It shouldn't have anything to do with kicking players out (unless all connections to everything dies as a result of the CPU maxing out).
c. Monsters not being able to be attacked suggests the AI server desynced (if it's not completely disconnected right now). It won't remove them from Ebenezer on disconnection, and when the AI server's restarted it'll use new IDs for everything, so it'll get totally confused and stuff breaks.
I'd probably start by killing off anything you've implemented (e.g. the log stuff), and then monitoring incoming traffic / connections. If it remains stable, most likely it is indeed your stuff. So, at this point, you'd need to assess what it is exactly you're doing with your stuff (vague, but all I know about is the death stuff) & identify any bottlenecks/performance issues associated with it, test to verify, and attempt to resolve them (rinse and repeat, obviously).
The CPU being maxed out really does bother me though. Which CPU are you using?
Edit:
Hm, then it's probably just extremely slow.
You guys don't by any chance have your AI server / MSSQL ports open, right? Just game / web / login (/ FTP)?
And there's no SYN flooding?
The server is good. I don't care about the comments. I'm just having fun, and that's what I'm looking for. It's a really good server for me. Idc.
Our main server files (ai, ebenez, auj, login) are running on a dedicated instance with 1 core for computing.
The only problem I could see with the Death Logs is the frequency of SQL insert queries on deaths, but even then it's not enough to bottleneck any network protocols. It should be able to handle many, many queries per second, and it's thread-safe. Our network and CPU are stable, which is why we were thinking it was either a security patch we missed or some type of loop or error the server files aren't handling properly.
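If the insert frequency ever did turn out to matter, one common way to cut round trips to the database is to buffer deaths and write them in batches. The sketch below is Python with `sqlite3` standing in for MS SQL purely so it runs anywhere; the `death_log` table, its columns, and the `DeathLogBuffer` class are hypothetical and not taken from the real Death Logger.

```python
import sqlite3
import threading


class DeathLogBuffer:
    """Collect death records and insert them in batches instead of one by one."""

    def __init__(self, conn, batch_size=50):
        self.conn = conn
        self.batch_size = batch_size
        self.pending = []
        self.lock = threading.Lock()  # guards `pending` and the connection

    def add(self, killer, victim):
        with self.lock:
            self.pending.append((killer, victim))
            if len(self.pending) >= self.batch_size:
                self._flush_locked()

    def flush(self):
        """Write whatever is still buffered, e.g. on a timer or at shutdown."""
        with self.lock:
            self._flush_locked()

    def _flush_locked(self):
        if not self.pending:
            return
        # One executemany call and one commit per batch.
        self.conn.executemany(
            "INSERT INTO death_log (killer, victim) VALUES (?, ?)",
            self.pending,
        )
        self.conn.commit()
        self.pending = []
```

The trade-off is that up to `batch_size` deaths can sit in memory before they reach the database, so a periodic `flush()` is needed if the website should stay close to real time. With `sqlite3` specifically, sharing one connection across threads would also need `check_same_thread=False`.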
The webserver is on a separate instance, and so is the SQL server. Both are running in the cloud, and the only way to access SQL is from the server IPs that I allow from within the cloud network. Flooding is definitely something we are looking at, but it's not as apparent when looking at JUST the instance with the server files, and it would take a bit more work to get a flood working against the server files than against the webserver or some other easily accessible protocol. Not only that, but I would expect a flood to take down the login server or Ebenezer, not Aujard or the AI server.
Yeah, but you're talking about Windows. Aujard's easy as hell to take down; just introduce database connection instability (as above), and there you go. It'll eventually crash.
The real question is why it's struggling to connect to the database server, what's causing this to happen? SYN floods are a really good example of this, as they'll leave a phenomenal amount of half-open connections which sit there, being useless, using up sockets, preventing other connections from being made, and consequently things failing.
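A quick way to look for those half-open connections is to count sockets sitting in the SYN-received state, grouped by remote address. The Python sketch below assumes you paste in the text of `netstat -an` (state shown as `SYN_RECEIVED` on Windows and `SYN_RECV` on Linux, with the state in the last column); the addresses in `SAMPLE` are made-up documentation addresses, not real output from this server.

```python
from collections import Counter

HALF_OPEN_STATES = {"SYN_RECV", "SYN_RECEIVED"}

# Invented example input in the `netstat -an` layout (not real data).
SAMPLE = """
  Proto  Local Address          Foreign Address        State
  TCP    0.0.0.0:15001          0.0.0.0:0              LISTENING
  TCP    192.0.2.1:15001        198.51.100.7:50211     SYN_RECEIVED
  TCP    192.0.2.1:15001        198.51.100.7:50212     SYN_RECEIVED
  TCP    192.0.2.1:15001        192.0.2.10:40100       SYN_RECEIVED
  TCP    192.0.2.1:15001        203.0.113.5:51000      ESTABLISHED
"""


def half_open_by_remote(netstat_text):
    """Count half-open TCP connections per remote IP address."""
    counts = Counter()
    for line in netstat_text.splitlines():
        parts = line.split()
        if len(parts) < 4 or not parts[0].lower().startswith("tcp"):
            continue  # header, blank, or non-TCP line
        if parts[-1] in HALF_OPEN_STATES:
            # Foreign address is the column just before the state.
            remote_ip = parts[-2].rsplit(":", 1)[0]
            counts[remote_ip] += 1
    return counts
```

A handful of entries is normal on a busy server. A large and growing count, especially spread over many remote addresses that never complete the handshake, is the pattern a SYN flood would leave.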
Having said that, using 1 core is a very, very bad idea. The AI server was never designed to be run like that; it exists primarily to be run on a separate box (as single core servers were the norm back then), as it's a considerable resource hog. So much so, it deems it necessary to give itself higher priority. On modern servers, we usually get by because context switching between multiple cores alleviates the strain a lot (though, try to debug it, and watch everything die).
Honestly, if there really is 1 core being allocated for those servers, I'm guessing that's your main problem. More players means more NPC activity (they're "awake" and moving about), which means more threads (the AI server will most likely be running at least 200 threads, most likely more) hogging CPU time, and everything slowing to a crawl. :/
You make a very valid point, and upgrading to 2 cores or even putting the server files on a separate instance is a very viable option right now. Has there been much success with separating the AI server and Ebenezer onto 2 separate instances? If not, I will probably just stick with 2 cores.