Why you should update your gcc (and c++ library)

Consider the code below:

It compiles on CentOS 5 (gcc-4.1.2), but dumps core at runtime.

The gdb backtrace shows that the crash happens in string_hashfunc::operator():

Let’s look at the source code behind “ext/hash_map” in /usr/include/c++/4.1.2/ext/hashtable.h:

And in the implementation of _M_bkt_num():

It uses _M_hash() to compute the bucket number of the key, and _M_hash() is actually string_hashfunc::operator(). The reason is clear now: to advance, the iterator calls operator++() -> _M_bkt_num() -> _M_bkt_num_key() -> _M_hash() -> string_hashfunc::operator(), which can no longer read the key because it has already been freed by “delete it->first”.
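One way to avoid depending on how the iterator advances is to remove the node from the table while its key is still alive, and free the key only afterwards. The sketch below uses std::unordered_map (C++11); the key type (heap-allocated std::string), the names PtrHash, PtrEq, Map and free_keys are all assumptions for illustration, since the original program's types are not shown here.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical hasher: hashes the string the key pointer refers to,
// so it must never run on a pointer that has already been freed.
struct PtrHash {
    std::size_t operator()(const std::string* s) const {
        return std::hash<std::string>()(*s);
    }
};

// Hypothetical comparator: compares the pointed-to strings.
struct PtrEq {
    bool operator()(const std::string* a, const std::string* b) const {
        return *a == *b;
    }
};

typedef std::unordered_map<const std::string*, int, PtrHash, PtrEq> Map;

// Frees every key and empties the map. Returns how many keys were freed.
std::size_t free_keys(Map& m) {
    std::size_t freed = 0;
    for (Map::iterator it = m.begin(); it != m.end(); ) {
        const std::string* key = it->first;  // remember the key pointer
        it = m.erase(it);                    // unlink the node while the key is valid
        delete key;                          // only now release the key
        ++freed;
    }
    return freed;
}
```

With this ordering, any hashing the container does during erase() sees a live key, so the loop does not rely on implementation details of operator++().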

How about a newer g++ and C++ library? Let’s write the same program on CentOS 7 (gcc-4.8.5), changing “ext/hash_map” to “unordered_map” (from the C++11 standard):

Then build it:

The program runs without crashing, because the new implementation of the C++ library follows the “_M_nxt” pointer to reach the next hash node instead of calling the hash function (see /usr/include/c++/4.8.5/bits/hashtable_policy.h).

Why you should update your gcc

Consider this C++ code:

I first compiled it on CentOS 5, where the gcc version is 4.1.2, and it reported:

Can anyone spot the problem at first glance in this mess of a report? C++ template error messages are terribly difficult to understand, and have been ever since I first used templates 9 years ago.
Then I tried compiling the source on CentOS 6 with gcc-4.4.6:

It looks almost the same. How about CentOS 7 with gcc-4.8.5?

Aha, much better: it tells us the exact position of the problem. “vec.begin()” returns a “const_iterator”, which cannot be converted to “iterator”.
To save time debugging C++ template code and enjoy life, please update your gcc.
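For readers who have not met this error before, here is a minimal sketch of how it typically arises and how to fix it. It assumes the vector is reached through a const reference, which is the usual reason begin() yields a const_iterator; the function name sum is hypothetical and the original code may differ.

```cpp
#include <vector>

// Sums the elements of a vector that is only available as const.
int sum(const std::vector<int>& vec) {
    int total = 0;
    // The next line would not compile: vec is const, so vec.begin()
    // returns a const_iterator, which does not convert to iterator.
    // std::vector<int>::iterator it = vec.begin();

    // Declaring the loop variable as const_iterator fixes it.
    for (std::vector<int>::const_iterator it = vec.begin();
         it != vec.end(); ++it) {
        total += *it;
    }
    return total;
}
```

The conversion only works in one direction: an iterator converts to a const_iterator, never the other way around.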

Performance bottleneck in Jedis

I wrote a test program that uses Jedis to read from and write to Redis Cluster. It creates 32 threads, and every thread initializes its own JedisCluster instance. But it takes more than half a minute to create all 32 JedisCluster instances.
Tracking the problem down, I found that the bottleneck is in setNodeIfNotExist():

In the setNodeIfNotExist() method of class JedisClusterInfoCache, “new JedisPool()” takes a lot of time because it uses Apache Commons Pool, and Commons Pool registers an MBean with the JMX MBean server. This JMX registration is the bottleneck.

The first solution is to disable JMX when constructing JedisCluster:

The second solution is to create one JedisCluster instance and share it among all threads. After I submitted a patch to Jedis that disables JMX by default, Marcos Nils reminded me that JedisCluster is thread-safe, since it uses commons-pool to manage its connection resources.

Upgrade to kernel-4.4.1 on CentOS 7

After compiling and installing kernel-4.4.1 (from kernel.org) on my CentOS 7 machine, I rebooted it. But it could not boot up correctly.

After extracting the contents of the initramfs and checking them, I found that the ‘mpt2sas’ kernel driver had not been added to the initramfs, so the /boot partition could not be loaded.

This problem seems to be common. Because changing the dracut source code or configuration file on every server is not viable, I chose to add a command to my kernel rpm spec file:

This adds the drivers to the corresponding initramfs file.

But the kernel still could not boot. This time, I found that the kernel command line in GRUB2 looked like:

It looks like we should change it to use a UUID, so I added another command to the kernel rpm spec file:

This gets the UUID of the boot disk from /proc/cmdline and writes it into the GRUB2 configuration file.

Now kernel-4.4.1 boots up correctly on CentOS 7.

The importance of Declaration in c language

Let’s create two C source files,

and compile & link them:

But after running “./test”, the result is

Why does the result of myNumber() become different in the main() function?

Let’s look at the assembly generated for test.c (gcc -S test.c):

It reads only the 32-bit result of the function ‘myNumber’ (the %eax register is just 32 bits wide, smaller than a “long long”). We forgot to declare myNumber() in test.c, so the compiler falls back to an implicit declaration that returns int and treats the result as a 32-bit value.
After adding the declaration of myNumber() to test.c, we can see that the generated assembly has changed:

(The %rax register is 64 bits wide.)
And the result of running ‘./test’ is now correct.
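The effect can be reproduced without a second source file. C++ refuses to call an undeclared function, so the sketch below emulates what the caller does when it believes the function returns int: keep the low 32 bits of the real 64-bit value and sign-extend them. The function name seen_as_int and the sample values in the note are hypothetical, not the ones from the original test.

```cpp
// Emulates a caller that reads a 64-bit return value as a 32-bit int
// and then widens it back to long long.
long long seen_as_int(long long real_value) {
    // Keep only the low 32 bits, i.e. what %eax holds.
    unsigned long long low =
        static_cast<unsigned long long>(real_value) & 0xFFFFFFFFULL;
    // Sign-extend: bit 31 set means the int is negative.
    if (low >= 0x80000000ULL) {
        return static_cast<long long>(low) - 0x100000000LL;
    }
    return static_cast<long long>(low);
}
```

A value that fits in 32 bits survives unchanged, which is why this kind of bug can stay hidden for a long time. A value such as 0x100000001 comes back as 1, and 0x1FFFFFFFF comes back as -1. Compiling the C files with -Wall makes gcc warn about the implicit declaration.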

Perf to the detail of stack frame

Use perf to profile the redis daemon:

The report shows:

I could see only that the function “ziplistFind” is very hot in redis, but not which code path reaches “ziplistFind”. Searching around, I found this article. So I ran perf as:

The report now shows more detail about the program:

LZ4 is faster, but not better

I need to compress small pieces of data in my project without too heavy an impact on performance. Many people recommended LZ4 to me, since it is almost the fastest compression algorithm available at present.

Compression ratio is not the most important factor here, but it still needs to be evaluated. So I created a 4KB file containing normal text (from the README of an open source project) and compressed it with LZ4 and GZIP.

Compression ratio: LZ4 1.57, GZIP 2.24

Hmm, the compression ratio of LZ4 does not look bad. But when I ran the test with this special content (changed from here):

the result became interesting:

Compression ratio: LZ4 0.99, GZIP 2.25

GZIP compresses this content just as well as before, but LZ4 does not. So LZ4 can’t compress some special content (such as numbers), even though it is faster than most other compression algorithms.
Somebody may ask: why not use a specialized algorithm to compress numeric content? The answer is that in real situations we cannot know the type of our small data segments in advance.
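A likely explanation for the gap: LZ4 only replaces repeated byte sequences (at least 4 bytes long) and has no entropy-coding stage, while GZIP's DEFLATE adds Huffman coding on top of its match finder. Digits that rarely repeat in long runs give LZ4 nothing to match, but they still use only 10 of the 256 possible byte values, which Huffman coding can exploit. The sketch below measures that with the Shannon entropy of the byte distribution. The test input is an assumption (pseudo-random decimal digits from a small generator), not the content used above, and byte_entropy and random_digits are hypothetical helper names.

```cpp
#include <cmath>
#include <cstddef>
#include <string>

// Shannon entropy of the byte distribution, in bits per byte.
// 8.0 means incompressible for a pure entropy coder; lower is better.
double byte_entropy(const std::string& data) {
    if (data.empty()) return 0.0;
    std::size_t counts[256] = {0};
    for (std::size_t i = 0; i < data.size(); ++i) {
        ++counts[static_cast<unsigned char>(data[i])];
    }
    double bits = 0.0;
    for (int b = 0; b < 256; ++b) {
        if (counts[b] == 0) continue;
        double p = static_cast<double>(counts[b]) / data.size();
        bits -= p * std::log2(p);
    }
    return bits;
}

// Hypothetical test input: n pseudo-random decimal digits produced by
// a simple linear congruential generator (deterministic on purpose).
std::string random_digits(std::size_t n) {
    std::string out;
    out.reserve(n);
    unsigned long state = 12345UL;
    for (std::size_t i = 0; i < n; ++i) {
        state = (state * 1103515245UL + 12345UL) % 2147483648UL;
        out.push_back(static_cast<char>('0' + (state >> 16) % 10));
    }
    return out;
}
```

For digit-only input the entropy cannot exceed log2(10), about 3.32 bits per byte, so an ideal entropy coder could reach a ratio of roughly 8 / 3.32, or about 2.4. That is in the same range as the GZIP figure above, and it is a gain LZ4 cannot get without repeated sequences.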

Import data to Redis from JSON format

Redis uses rdb files to persist data. Yesterday, I used redis-rdb-tools to dump data from an rdb file to JSON format. After that, I wrote Scala code to read the data from the JSON file and put it into Redis.

First, I found that the JSON file was almost 50% bigger than the rdb file. After checking the whole JSON file, I confirmed that the root cause was not the redundant symbols in JSON, such as braces and brackets, but the Unicode escapes in the JSON output, especially “\u0001” for the byte 0x01, which turns a single byte into six characters. So I wrote code to replace it:

Then the size of the JSON file became normal.

There was still another problem. To read the JSON file line by line, I used code from http://naildrivin5.com/blog/2010/01/26/reading-a-file-in-scala-ruby-java.html:

But this code loads all the data from the file before running “foreach”. As my file is bigger than 100GB, that costs too much time and RAM.
The best way is to use the I/O stream classes in Java:

This really does read the file line by line.

From scala Array[String] / Seq[String] to java varargs

While testing the performance of redis these days, I needed to use the mset() interface of jedis (a Java redis client). But the prototype of mset() in jedis is:

At first I wrote my Scala code like this:

But it reported compile errors:

After searching through many Scala/Java documents on Google, I finally found the answer: http://docs.scala-lang.org/style/types.html. So, let’s write the code this way:

Now the Scala Array[String] is passed as Java varargs. This also works for Seq[String].