The Generic Game server development blog: Stress testing

Here is a little video showing us stress testing the GGS prototype.

Here are the conclusions we reached today, the stuff we need to work on, and
the thoughts that came up when we tested the application.

*Can ping be used to measure the network saturation?*
Quite possibly yes; ping measures the round-trip time of one ICMP packet (as
far as I know), and "if all goes well", the RTT will be the same for all
packages. It is possible, though, that some harwdare (read: the switch)
prioritizes ICMP packages through QoS (I think it's called that). It's also
possible that the OS treats pings differently than other traffic. This is
the main "problem" I have with using ping to measure saturation. We should
look these things up, if pings are not specially treated, they are a good
measurement of saturation.

*Optimize network traffic by sending less data*
This is obvious. Less traffic means less saturation. We should/could/may
send less data, and therefore increase performance both by removing the
(potential) network bottleneck, and remove some processing load from the
network cards / network stacks of the OS:es.

*Are screen print-outs a performance issue?*
We can possibly lower the IO bottleneck by removing printouts, we did this.
There is a information loss here for us when debugging, so a tradeoff has to
be made. A logging system which is customizable would be ideal.

*Load average*
On Niklas' machine the load was *really* low, on my machine it never went
below 1, and never above 1.3. On Richard's Mac it was like 4.. Is this
really a good measurement? Are there others which we can use? Erlang has a
notion of "reductions" - maybe we can use that? Maybe we can measure the
processor time consumed?

*# sockets limit*
Common machines seem to have limits on open files, and this includes
sockets. We should look at how web servers are configured here. Is it common
to raise this limit.. Are there penalties / alternatives to this? On Linux,
change /etc/security/security.conf to increase the limit.

*Warning; Mnesia is overloaded!*
Can we use the DB less? Exactly what does this mean? Can we use a different
machine as DB server?

*Can we get super-accurate CPU stats?*
Can we get stats from /proc/ instead of top and the likes? Can we compare
different CPUs easily? Remember that the important thing is how the system *
scales*. How much more load do we have when there are 20 players compared to
10, for example.

These are the notes I kept when we did the tests. I think we can discuss
these topics in the practice section of the report, once we've figured them
all out!

The Generic Game server development blog

Thursday, April 14, 2011

Stress testing

1 comment: