128-bit atomic operations on arm64

On the x86_64 architecture, we can use cmpxchg16b to compare and exchange a 128-bit value in one atomic operation. But how do we operate atomically on a 128-bit value on an aarch64 (arm64) machine? The answer is the compiler builtin __atomic_compare_exchange(), as in the code below:
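A minimal sketch of such a compare-and-exchange, assuming GCC or Clang on a 64-bit target where `unsigned __int128` is available. The names `u128`, `make_u128`, `u128_hi`, `u128_lo` and `cas_u128` are illustrative only and are not taken from the original test.c:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 128-bit unsigned integer; __int128 is a GCC/Clang extension (assumption:
 * the compiler targets a 64-bit platform that provides it). */
typedef unsigned __int128 u128;

static_assert(sizeof(u128) == 16, "u128 must be 128 bits wide");

/* Hypothetical helpers to build and split a 128-bit value. */
static inline u128 make_u128(uint64_t hi, uint64_t lo)
{
    return ((u128)hi << 64) | lo;
}

static inline uint64_t u128_hi(u128 v) { return (uint64_t)(v >> 64); }
static inline uint64_t u128_lo(u128 v) { return (uint64_t)v; }

/*
 * Atomically replace *target with desired if *target equals *expected.
 * Returns true on success. On failure, *expected is overwritten with the
 * value currently stored in *target, so the caller can retry.
 */
static inline bool cas_u128(u128 *target, u128 *expected, u128 desired)
{
    return __atomic_compare_exchange(target, expected, &desired,
                                     false,            /* strong (non-weak) CAS */
                                     __ATOMIC_SEQ_CST, /* ordering on success */
                                     __ATOMIC_SEQ_CST  /* ordering on failure */);
}
```

A caller would typically read the current value into a local `expected`, compute the new value, and call `cas_u128()` in a loop until it returns true. Whether the compiler emits inline instructions or a library call for the 16-byte case depends on the compiler version and target.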

Use "gcc test.c -o test" to compile the code, but it reports:

Run aarch64 binary on x86_64 machines

If we use full-system QEMU emulation (qemu-system-aarch64) directly, too much time is spent on I/O and system calls. So I tried SUSE's user-mode QEMU, which only translates the arm64 instructions to x86_64 and passes all system calls through to the local host. This installation manual for user-mode qemu-arm64 has been tested on Debian 7.7.0. Step 1,