A developer friend of mine, who works for a mobile operator in West Africa, needed advice because their database setup suffered severe congestion over Valentine’s Day. I recalled us discussing the same problem back when we studied together, and my advice then was to replace MySQL with Percona. Eventually MariaDB was chosen as the replacement, which is also a great choice.
Talks with government officials are underway to migrate their databases to a specialist PaaS (Platform as a Service), which should ultimately ensure ongoing scalability. In the meantime, each developer will be provisioned with a virtual system installed with MariaDB 5.5. Workloads and tweaks can then be isolated, but resources are scarce because of limited hardware.
My personal memory allocator of choice has always been jemalloc, simply because it operates with leaner memory utilisation. Memory consumption can be difficult to measure when an optimised memory allocator is used, because the reading can differ greatly from one extremely short interval to the next. My recommendation was jemalloc because the developer servers won’t be single purpose, but he wished to know my opinion of TCMalloc.
TCMalloc is a seriously fast memory allocator which is part of gperftools, but there are common misconceptions surrounding its memory handling. Many claim that TCMalloc never releases memory back to the system, but this is not accurate. The release behaviour can even be controlled by adjusting the TCMALLOC_RELEASE_RATE and TCMALLOC_HEAP_LIMIT_MB environment variables, as outlined in page_heap.cc. The default TCMALLOC_RELEASE_RATE value is 1.0, which means memory is released, albeit slowly.
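As a rough sketch of how those two variables might be applied, the snippet below builds the environment for a MariaDB start and prints the resulting command instead of running it. The library path, the values 5.0 and 768, and the helper name tcmalloc_env are all assumptions for illustration; the --malloc-lib option of mysqld_safe is used to preload the allocator.

```shell
#!/bin/sh
# Hypothetical library path; adjust to wherever gperftools is installed.
TCMALLOC_LIB="${TCMALLOC_LIB:-/usr/lib64/libtcmalloc_minimal.so.4}"

# Print the TCMalloc tuning variables as NAME=value pairs.
tcmalloc_env() {
    release_rate="$1"    # higher values return memory faster, 0 never releases
    heap_limit_mb="$2"   # heap size limit in MiB, 0 means no limit
    printf 'TCMALLOC_RELEASE_RATE=%s TCMALLOC_HEAP_LIMIT_MB=%s\n' \
        "$release_rate" "$heap_limit_mb"
}

# Dry run: show the command line rather than starting a server.
echo "env $(tcmalloc_env 5.0 768) mysqld_safe --malloc-lib=$TCMALLOC_LIB"
```

The values shown are placeholders rather than recommendations; a sensible release rate depends on how much memory the other workloads on the server need back.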
There is no denying that TCMalloc can outperform jemalloc overall, but in many cases only by a small margin. It has been some time since I last checked, so I decided to run some benchmarks on very limited virtual servers to test the benefits of both when compared with glibc.
Whether an alternative memory allocator is in use can be determined in several ways, such as checking the MariaDB log file after startup, or manually with pmap, lsof or /proc as in the examples below:
[root@benchmark ~]# pmap $(pidof mysqld) | grep malloc
00007f7e1404f000 196K r-x-- libjemalloc.so.1
00007f7e14080000 2044K ----- libjemalloc.so.1
00007f7e1427f000 8K r---- libjemalloc.so.1
00007f7e14281000 4K rw--- libjemalloc.so.1
[root@benchmark ~]# grep malloc /proc/$(pidof mysqld)/maps
7f7e1404f000-7f7e14080000 r-xp 00000000 fd:01 27453423 /usr/lib64/libjemalloc.so.1
7f7e14080000-7f7e1427f000 ---p 00031000 fd:01 27453423 /usr/lib64/libjemalloc.so.1
7f7e1427f000-7f7e14281000 r--p 00030000 fd:01 27453423 /usr/lib64/libjemalloc.so.1
7f7e14281000-7f7e14282000 rw-p 00032000 fd:01 27453423 /usr/lib64/libjemalloc.so.1
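The same check can be wrapped in a small helper for scripting. This is a minimal sketch, and the function name detect_allocator is my own. It reads a file in the /proc/PID/maps format and assumes glibc whenever neither library is mapped.

```shell
#!/bin/sh
# Report which allocator library appears in a /proc/<pid>/maps style file.
# Falls back to "glibc" when neither jemalloc nor tcmalloc is mapped.
detect_allocator() {
    maps_file="$1"
    if grep -q 'libjemalloc' "$maps_file"; then
        echo jemalloc
    elif grep -q 'libtcmalloc' "$maps_file"; then
        echo tcmalloc
    else
        echo glibc
    fi
}

# Usage against a live server (needs read access to the maps file):
#   detect_allocator "/proc/$(pidof mysqld)/maps"
```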
Each benchmark is based on a default MariaDB installation with sysbench 0.4.12 and is executed four times. The standard complex OLTP (Online Transaction Processing) benchmarks are used instead of the newer customisable Lua workloads. The following arguments are specified for each test, preceded by a table drop / recreation and a reboot:
--oltp-table-size=2000000
--max-time=300
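For context, a full sysbench 0.4.12 invocation built around those two arguments might look like the sketch below. It is a dry run that only prints the commands. The thread counts, the sbtest database and user, and the --max-requests=0 setting (which stops the default request cap from ending a run before the time limit) are my assumptions, not details recorded from the original tests.

```shell
#!/bin/sh
# Build the sysbench 0.4.12 command line for one step (prepare, run or cleanup).
# Database name and user are placeholders.
sysbench_cmd() {
    step="$1"
    threads="$2"
    echo "sysbench --test=oltp --oltp-test-mode=complex" \
         "--oltp-table-size=2000000 --max-time=300 --max-requests=0" \
         "--num-threads=$threads --mysql-db=sbtest --mysql-user=sbtest $step"
}

# Dry run: print the run command for a few illustrative thread counts.
for threads in 1 4 16; do
    sysbench_cmd run "$threads"
done
```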
Two separate CentOS 7 virtual servers were used, both on E5645 CPUs at the same clock speed but with differing memory and core count assignments:
- 1 core at 2.4 GHz with 2 GiB of memory
- 4 cores at 2.4 GHz with 1 GiB of memory
The results show a benefit for InnoDB with either TCMalloc or jemalloc over glibc, even on low-end specifications. When glibc is replaced, the performance gap between the memory-constrained server and the CPU-constrained server narrows as thread counts increase.
MariaDB 10 is now built with jemalloc by default.