Scaling TYPO3 with a loadbalancer

Depending on your application, sometimes a simple webserver isn’t going to cut it anymore. One way to fix this is scaling. There are two ways to scale: up or out (there might be some exotic scaling methods, but let’s leave it at that).

Basics to think about

It is really easy to think: go go scale. To be honest, one of the most difficult things to figure out is where your bottlenecks are. Sometimes your application is so badly made code-wise that you can scale indefinitely without any effect. First of all you must be sure that the application can actually be scaled. What are the bottlenecks, and which resources are you short on, so that scaling would actually help? If the server only uses 50% of its resources but it’s slow, maybe the problem is somewhere else.

Scaling-up

What mostly happens is scaling-up. The reason is fairly simple: it actually is simple. The downside is that you can hit a ceiling early on, and then scaling-up does not work anymore. A simple example is having a server with 2 CPUs and 2 GB of memory and increasing both values to 4. This could work if you were hitting the limit on CPU, memory, or both. Some applications only grow slowly and eventually hit some sort of maximum on the current server/stack. A simple fix is indeed to scale-up: just add some memory or CPUs to keep up with this small growth.

Scaling-out

When scaling out, instead of increasing the number of CPUs, you simply go wider by adding an extra server. Either this server is a 1-on-1 copy of the first server doing the exact same job, or it takes over a specific process, so that server 1 does process X and server 2 does process Y.

The real advantage of scaling-out is that you can practically always scale further. I’m sure there are servers with 128 cores, but one way or another you will hit a limit with scaling-up. With scaling-out you can add server after server without really ever hitting a limit.

Another advantage is that you can build a stack that’s specialized in one type of process. Having 1 server doing ‘all’ the jobs for the application can result in bottlenecks just because it has to do everything. For instance, PHP does not really need a lot of memory, while SQL works great with a load of memory. Having PHP on a dedicated server with a nice number of CPU cores, while SQL runs on a dedicated server with a good amount of memory, will result in much better performance than 1 huge server having both a load of memory and CPUs. To keep it semi-simple I am leaving out other things such as I/O 😉

Scaling TYPO3

So at my work we wanted to move an application to the cloud. It used to rest on a fairly big managed server. We are using SOLR (indexation), MySQL (database) with PHP (script processor) and Apache2 (webserver).

It was actually working pretty well, but far from efficiently. Every part I just named would be better off on its own dedicated server. It is not really a big thing to move SOLR or MySQL to another server. That’s basically changing the hostname in the application to use a “remote” server instead of the “127.0.0.1 / localhost” host.

The more challenging problem is using multiple web-servers or database-servers. The reason is sessions and local storage, or simply put: data sync/validation.

Sessions

When a client creates a session, this will mostly be handled by the web-server. That is a problem when you have 2 or more web-servers. How does web-server X know that a session was made on web-server Y? The great part of TYPO3 (some might disagree) is that sessions are stored in the database. All the web-servers will fetch this from our (for now) single database-server.

Even though this part is super simple since you don’t have to do anything about it, I still wrote it down. How these sessions work is something you should always keep in mind.

Especially if you want to scale a lot further, these session requests to the database can kill performance. This is the case when you have a lot of visitors with a session. A solution can be file-based session storage that is shared between all web-servers.

Local storage

The other thing is local storage: files created by TYPO3 (or whatever application) that should be available on every web-server. For instance, what if a (backend) user uploads a new image? It will get stored on server X, while server Y should also serve that image.

There are a few solutions, like a CDN (content delivery network) or proxy settings so that files are uploaded to and served from 1 web-server only. The other option is adding a disk mount that can be used by all web-servers. In my opinion this is the easiest solution.

Load balancer

With nginx and Apache you can make a load balancer. The essentials are pretty easy: simply add your web-servers as upstreams.

Personally, I find nginx a bit more mature as a load balancer.

http {
    upstream myapp1 {
        server srv1.example.com;
        server srv2.example.com;
        server srv3.example.com;
    }

    server {
        listen 80;

        location / {
            proxy_pass http://myapp1;
        }
    }
}

The config above is about all you need. Obviously you need to tweak a bunch of settings, but the basics are fairly simple!
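As a rough idea of what such tweaks could look like, the sketch below writes a slightly extended version of the same config to a file. The file name typo3-lb.conf, the least_conn balancing method and the max_fails/fail_timeout values are assumptions for illustration only, not the settings of the setup described in this post. The extra headers are there so the web-servers still see the original host name and client address.

```shell
#!/bin/sh
# Sketch only: writes an example load-balancer config into a directory.
# The file name and all tuning values below are assumptions, not measured settings.
write_lb_conf() {
    conf_dir="$1"
    # The quoted 'EOF' keeps nginx variables such as $host from being expanded by the shell.
    cat > "$conf_dir/typo3-lb.conf" <<'EOF'
upstream myapp1 {
    least_conn;    # prefer the server with the fewest active connections
    server srv1.example.com max_fails=3 fail_timeout=30s;
    server srv2.example.com max_fails=3 fail_timeout=30s;
    server srv3.example.com max_fails=3 fail_timeout=30s;
}

server {
    listen 80;

    location / {
        proxy_pass http://myapp1;
        # Pass the original host and client address on to the web-servers.
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
EOF
}

# Write to a scratch directory by default so the sketch is safe to run anywhere.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"
write_lb_conf "$CONF_DIR"
echo "wrote $CONF_DIR/typo3-lb.conf"
```

On the balancer itself the file would go wherever your nginx.conf includes extra configs from inside its http block (on Ubuntu that is usually /etc/nginx/conf.d/). Running nginx -t before reloading nginx catches syntax errors.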

NFS mount

And back to mounting (in this case it’s for Ubuntu 14/16):

/home/apache-user/shared/typo3temp 11.22.33.0/24(rw,sync,no_subtree_check)

The example above shows how you could share the typo3temp folder with any server in the 11.22.33.* internal IP range (written as 11.22.33.0/24, since wildcards are not reliable for IP addresses in /etc/exports). You place this in /etc/exports and then run service nfs-kernel-server restart.

On each web-server you should mount this share, using the same path that is exported on the NFS server:

mount 123.123.123.123:/home/apache-user/shared/typo3temp /home/apache-user/typo3temp

After checking with df -h you should see the newly made NFS mount. You should do this for the typo3temp, uploads and fileadmin folders.
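A mount made by hand with the mount command is gone after a reboot. A common way to make it stick is an entry in /etc/fstab on each web-server. The sketch below only prints such entries for the three folders. It assumes that uploads and fileadmin are exported next to typo3temp under /home/apache-user/shared on the NFS server, which is not stated above, and the defaults,_netdev options are a generic starting point rather than tuned values.

```shell
#!/bin/sh
# Sketch: print /etc/fstab entries for the shared TYPO3 folders.
# Assumption: all three folders are exported from the same parent directory.
print_fstab_entries() {
    nfs_host="$1"      # NFS server, e.g. 123.123.123.123
    remote_base="$2"   # exported parent directory on the NFS server
    local_base="$3"    # where the folders should appear on the web-server
    for dir in typo3temp uploads fileadmin; do
        # _netdev marks the mount as needing the network to be up first
        printf '%s:%s/%s %s/%s nfs defaults,_netdev 0 0\n' \
            "$nfs_host" "$remote_base" "$dir" "$local_base" "$dir"
    done
}

print_fstab_entries 123.123.123.123 /home/apache-user/shared /home/apache-user
```

After reviewing the output you could append it to /etc/fstab and run mount -a to mount everything listed there.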

Another way is to create one single folder containing these 3 folders and then create symlinks for your application.

For instance:

ln -s /home/apache-user/shared/typo3temp typo3temp

That way you only need one mount point for the “shared” folder, which gives you a bit more options and is easier to mount on the other servers.
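The symlink step for all three folders could be sketched as below. The function name link_shared_dirs and the document root path are made up for this example, and the demo runs in a scratch directory so nothing real is touched. It deliberately skips any folder that already exists as a real directory, since its content would have to be moved to the share first.

```shell
#!/bin/sh
# Sketch: link the three shared TYPO3 folders into a document root.
link_shared_dirs() {
    shared="$1"    # the single NFS mount point, e.g. /home/apache-user/shared
    docroot="$2"   # the TYPO3 document root (path is an assumption)
    for dir in typo3temp uploads fileadmin; do
        target="$docroot/$dir"
        # Never replace a real directory: its content must be moved to the share first.
        if [ -e "$target" ] && [ ! -L "$target" ]; then
            echo "skipping $target: exists and is not a symlink" >&2
            continue
        fi
        # -n replaces an existing symlink instead of creating a link inside it
        ln -sfn "$shared/$dir" "$target"
    done
}

# Demo in a scratch directory.
demo=$(mktemp -d)
mkdir -p "$demo/shared/typo3temp" "$demo/shared/uploads" "$demo/shared/fileadmin" "$demo/docroot"
link_shared_dirs "$demo/shared" "$demo/docroot"
ls -l "$demo/docroot"
```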

Result

I have redacted some names, but below is a setup for a website that has around 60,000 visitors each day.

[Figure: Grafana monitoring]

The database-server has 4 cores and 8 GB of memory. Each web-server has 3 cores and 3 GB of memory.

The server load is around 1 for each web-server, while it could easily go towards 2-3 if needed. The load of the database is only around 0.5, while it could easily go towards 4.
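To put those load numbers in perspective, it helps to divide the load average by the number of cores. As a rule of thumb, a value near 100% roughly means the cores are fully busy. The helper name load_per_core is made up for this sketch; the inputs are the figures quoted above.

```shell
#!/bin/sh
# Sketch: express a load average as a percentage of the available cores.
load_per_core() {
    # $1 = load average, $2 = number of cores
    LC_ALL=C awk -v load="$1" -v cores="$2" \
        'BEGIN { printf "%.1f\n", load / cores * 100 }'
}

load_per_core 1 3     # web-server: load 1 on 3 cores
load_per_core 0.5 4   # database-server: load 0.5 on 4 cores
```

That works out to roughly a third of the capacity on each web-server and an eighth on the database-server, which is the headroom the paragraph above describes.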

Not only can we scale-out (multiple web-servers), we could also scale-up if needed. If anything, the database-server can be scaled-up big time before hitting a ceiling, but that will most likely never be needed. If it is really needed we can add extra database-servers in a master-slave setup, but that would be huge overkill for now.

Even the web-servers are fairly simple. Either we simply add more web-servers or we upscale the servers with more CPU cores.

Conclusion

No matter what application you are using or what techniques you want to scale with, you should always think first. In the best case you want to keep your options open: scale-up for an expected spike, scale-out for expected growth, and scale whatever if something unexpected occurs.

The basics should be simple and able to withstand spikes while allowing you to invest in further growth. Don’t overcomplicate things with a major CDN setup or a huge database farm when you are able to grow 10 times on your current setup, yet do not close the gate on that option. Keep it available in case you actually need it in a year or two.
