Monday, May 30, 2011

Report and presentation

Hi everyone!

We are nearly done with our bachelor's thesis, yeeha!

You can download our thesis from here:

A Generic Game Server - Bachelor's Thesis (PDF)

Today we also gave a 20-minute presentation; here are the presentation slides.

Sunday, May 1, 2011

JavaScript Virtual Machine in GGS

In short: it works bitches!

Oh yeah, thanks to Mattias' work this week we are now able to use JavaScript as a programming language for our games. There is still a little bit of work left, but it no longer hangs when we try to put something into or get something from the database. Check this out:

1> GameVm = ggs_gamevm:start_link("test_table").
2> ggs_gamevm:call_js(GameVm, "GGS.jeena.setItem('a','foo')").
3> ggs_gamevm:call_js(GameVm, "GGS.jeena.getItem('a')").

This is so awesome, I won't be able to sleep tonight :D I think I need beer or something :D

What needs to be done now is to adapt the code so that the first client, who will act as a kind of host, can upload the server source code (JavaScript) at the beginning, and other clients can then connect to that one table.

Tuesday, April 26, 2011

Second testing session

Today we had an online testing session. Jeena started an instance of the GGS on his server in Germany. Mattias, Jeena and I (Niklas) connected Pong clients written in Erlang to the server in order to test how much data and how many clients the GGS could handle. In the last testing session we measured how many clients the server could handle, but this seems like a bad way of testing a system like the GGS. Jeena found out that a good way of testing servers is to measure how many messages they can handle per second, so in today's session we focused on that instead of on how many clients could be connected. To be able to measure the number of messages per second, we added a few minor things to the GGS.

Once the GGS was operational we started to connect bots. The rate reached 5000 messages per second when about 200 bots were connected. However, it soon decreased to about 3500 messages per second even though we added more bots. We continued testing for a while and the server never processed more than 5000 messages per second; most of the time it hovered between 3000 and 4000. The conclusion we can draw is that the GGS is currently limited to around 5000 messages per second, and when this number is exceeded the server queues messages. A short message queue is acceptable, but when it becomes too long the clients experience delays, and the game becomes impossible to play if too many clients are added to the same server.

The blue line is the number of clients, the yellow one is how many messages per second the clients send, and the red one is how many messages per second the server is sending to the clients.
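
The kind of messages-per-second measurement used in this session can be sketched roughly as follows. This is only an illustration in Ruby, not the code we added to the GGS (which is written in Erlang); the class name ThroughputMeter and its methods are made up for this example. It counts messages in one-second buckets and reports the busiest second.

```ruby
# Hypothetical sketch: count messages per one-second bucket.
# Not the actual GGS instrumentation; names are illustrative only.
class ThroughputMeter
  def initialize
    # Maps a Unix timestamp (whole seconds) to the number of messages seen.
    @counts = Hash.new(0)
  end

  # Record one message. `now` can be injected to make testing deterministic.
  def record(now = Time.now)
    @counts[now.to_i] += 1
  end

  # Messages recorded during the second that contains `time`.
  def rate_at(time)
    @counts[time.to_i]
  end

  # Highest messages-per-second value seen so far (0 if nothing was recorded).
  def peak
    @counts.values.max || 0
  end
end

meter = ThroughputMeter.new
3.times { meter.record(Time.at(100)) }
meter.record(Time.at(101))
puts meter.peak # busiest second so far
```

Calling record for every message the server handles and sampling peak (or rate_at for the previous second) at regular intervals would give numbers comparable to the lines in the graph.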

Thursday, April 14, 2011

Stress testing

Here is a little video showing us stress testing the GGS prototype.

Here are the conclusions we reached today, the stuff we need to work on, and
the thoughts that came up when we tested the application.

*Can ping be used to measure the network saturation?*
Quite possibly yes; ping measures the round-trip time of one ICMP packet (as
far as I know), and "if all goes well", the RTT will be the same for all
packets. It is possible, though, that some hardware (read: the switch)
prioritizes ICMP packets through QoS (I think it's called that). It's also
possible that the OS treats pings differently than other traffic. This is
the main "problem" I have with using ping to measure saturation. We should
look these things up; if pings are not treated specially, they are a good
measurement of saturation.

*Optimize network traffic by sending less data*
This is obvious. Less traffic means less saturation. We should/could/may
send less data, and thereby increase performance both by removing the
(potential) network bottleneck and by removing some processing load from
the network cards / network stacks of the OSes.

*Are screen print-outs a performance issue?*
We can possibly reduce the IO bottleneck by removing printouts; we did this.
We lose information for debugging this way, so a tradeoff has to be made. A
logging system which is customizable would be ideal.

*Load average*
On Niklas' machine the load was *really* low, on my machine it never went
below 1, and never above 1.3. On Richard's Mac it was around 4. Is this
really a good measurement? Are there others which we can use? Erlang has a
notion of "reductions" - maybe we can use that? Maybe we can measure the
processor time consumed?

*# sockets limit*
Common machines seem to have limits on open files, and this includes
sockets. We should look at how web servers are configured here. Is it common
to raise this limit? Are there penalties / alternatives to this? On Linux,
change /etc/security/limits.conf to increase the limit.
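
To see what the limit actually is on a given machine before starting a
test, something like the following Ruby sketch could be used. It relies on
Process.getrlimit from the Ruby standard library (Unix only); the helper
names open_file_limits and fits_under_limit? and the reserve of 32
descriptors are assumptions made up for this example.

```ruby
# Hypothetical helpers for checking the per-process open file limit.
# Process.getrlimit is standard Ruby on Unix-like systems.

# Returns the soft and hard limits on open file descriptors
# (sockets count against this limit too).
def open_file_limits
  soft, hard = Process.getrlimit(Process::RLIMIT_NOFILE)
  { soft: soft, hard: hard }
end

# Rough check whether `clients` sockets fit under `soft_limit`, keeping
# `reserve` descriptors free for stdin/stdout, log files and so on.
# The default reserve of 32 is an arbitrary assumption.
def fits_under_limit?(clients, soft_limit, reserve = 32)
  clients + reserve <= soft_limit
end

limits = open_file_limits
puts "soft limit: #{limits[:soft]}, hard limit: #{limits[:hard]}"
puts "room for 3000 clients? #{fits_under_limit?(3000, limits[:soft])}"
```

The soft limit is the one a process hits first; it can be raised up to the
hard limit without special privileges.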

*Warning; Mnesia is overloaded!*
Can we use the DB less? Exactly what does this mean? Can we use a different
machine as DB server?

*Can we get super-accurate CPU stats?*
Can we get stats from /proc/ instead of top and the like? Can we compare
different CPUs easily? Remember that the important thing is how the system
*scales*. How much more load do we have when there are 20 players compared
to 10, for example?
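
One way to read CPU usage from /proc directly is sketched below in Ruby.
It assumes a Linux machine, where the first line of /proc/stat holds the
aggregate counters (user, nice, system, idle, iowait, irq, softirq, steal,
...) in clock ticks. The function names parse_cpu_line and cpu_utilization
are made up for this example, and we have not used this in our tests.

```ruby
# Hypothetical sketch: derive CPU utilization from two /proc/stat samples.

# Parse the aggregate "cpu" line of /proc/stat into busy and total ticks.
def parse_cpu_line(stat_text)
  line = stat_text.each_line.find { |l| l.start_with?("cpu ") }
  raise ArgumentError, "no aggregate cpu line found" unless line

  # Use the first 8 counters only; the guest counters that may follow are
  # already included in user/nice and would be counted twice.
  fields = line.split.drop(1).first(8).map(&:to_i)
  idle = fields[3] + (fields[4] || 0) # idle + iowait
  total = fields.sum
  { busy: total - idle, total: total }
end

# Fraction of time (0.0..1.0) the CPU was busy between two samples.
def cpu_utilization(before, after)
  total = after[:total] - before[:total]
  return 0.0 if total.zero?

  (after[:busy] - before[:busy]).to_f / total
end

if File.readable?("/proc/stat")
  first = parse_cpu_line(File.read("/proc/stat"))
  sleep 1
  second = parse_cpu_line(File.read("/proc/stat"))
  puts "CPU busy: #{(cpu_utilization(first, second) * 100).round(1)}%"
end
```

Taking one such measurement with 10 players and one with 20 would give a
rough idea of how the load scales, which is what we are really after.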

These are the notes I kept when we did the tests. I think we can discuss
these topics in the practice section of the report, once we've figured them
all out!

Wednesday, April 13, 2011

First attempt to get some statistics

Today was a great day: we were able to fix the bugs that prevented us from running a larger number of clients. After that we tested running more and more clients. At the end of the day we were able to run ca. 3000 clients, all playing Pong against each other; we also had one human player who played against a bot.

We had a look at the load of the computer that ran the server and it was not really stressed; the load only went up to ca. 0.31. The problem seemed to be the network we were working on: an ad hoc WLAN between all our laptops, which isn't the fastest. And since we use TCP, which resends all the packets that get lost, we got quite a large amount of traffic and therefore a high round-trip time ("lag" in game speak).

In the future we want to try to get a gigabit network switch and connect our computers via cable. It's a pity that we will not have the time to implement UDP to get it working with lower lag.

Another bottleneck was running all the bots, which are written in Ruby (and each bot is a real process), on our limited number of laptops (we had only 3). Jonatan was able to start ca. 2500 on his Linux machine, but I, on OS X, was only able to start ca. 200; after that it said I was not allowed to start more processes. Therefore, before we can produce any real statistics, we will implement the bots in Erlang, where processes are cheap.

But it was a great feeling to see 1500 games of Pong, which is indeed a real-time game, running on our GGS prototype without performance problems other than the network bottleneck.

Bots playing Pong

We have come quite far with our GGS prototype now. I'm not sure if we'll be able to add the Erlang<->JS stuff because it doesn't seem to work the way we intend. Anyway, we have to focus on some statistics for now, so I just implemented a bot in Ruby which can play the Pong game we have already written in Erlang. So now we can run bots that play Pong against each other on our GGS prototype.

We have to do some more work before we can run thousands of games simultaneously, but I think it will work next week or so. Then we will produce some nice statistics for our report.

You can take a look at the code; it is implemented in Ruby, and I divided it into different "modules" so we can reuse GGSNetwork in other games too:

Writing the report is going quite well too; we already have 15 of the minimum 30 pages with actually relevant content. We'll have to work on the language though.

Thursday, March 24, 2011

Some rewrites, report, status

I'm working on some rewrites now. I am rewriting ggs_player to be a gen_server because that is just much easier to handle, and I am rewriting the ggs_protocol module because it didn't take into account that TCP is a stream: gen_tcp does not do the work of separating the different messages a client sends, it only notifies us when a TCP packet arrives. Therefore sometimes only half of a message arrives, and stuff like that. I had to add something like an accumulator to handle all of that.
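
The accumulator idea can be sketched like this. It is a Ruby illustration, not the actual ggs_protocol code (which is Erlang), and it assumes that every message ends with a newline; the real GGS protocol may frame messages differently. The class name MessageAccumulator is made up for this example.

```ruby
# Hypothetical sketch of a TCP message accumulator.
# Assumption: each message is terminated by "\n"; the real GGS framing
# may differ.
class MessageAccumulator
  DELIMITER = "\n"

  def initialize
    @buffer = +"" # unfrozen, mutable string holding incomplete data
  end

  # Feed one chunk exactly as it was read from the socket.
  # Returns an array with all messages completed by this chunk.
  def feed(chunk)
    @buffer << chunk
    messages = []
    while (index = @buffer.index(DELIMITER))
      # Cut the complete message (including delimiter) off the buffer.
      message = @buffer.slice!(0, index + DELIMITER.length)
      messages << message.chomp(DELIMITER)
    end
    messages
  end

  # Data received so far that does not yet form a complete message.
  def pending
    @buffer.dup
  end
end

acc = MessageAccumulator.new
p acc.feed("hel")           # half a message: nothing complete yet
p acc.feed("lo\nwor")       # completes the first message, starts a second
p acc.feed("ld\nfoo\n")     # completes two more messages
```

The important point is that a chunk boundary says nothing about message boundaries, so incomplete data has to be kept until the rest arrives.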

Perhaps the more important thing is that we changed the direction of our thesis from "just" implementing a game server to researching the scalability and fault tolerance of game servers at an academic level. Currently we are reading many scientific reports and books about scalability and fault tolerance, and we try to be as scientific as possible when we write our report. You can actually take a look at it:

We still want a working Generic Game Server, but it won't be a real-world product; it will be more of a prototype on which we can test our hypotheses about performance, fault tolerance and scalability if we need to get our hands dirty.

To be able to test such stuff we need to write some bots which will play games via our server. And we need to write code which collects data during the games.

To make the code more stable we are trying to replace our unit tests with QuickCheck tests.

Hope the project is still interesting for you who have been following it until now :-)