Category Archives: ops

A successful rescue for a remote server

After installing CUDA-9.2 on a remote server, I found that the system couldn’t load nvidia.ko (the kernel module); dmesg showed:

The reason is that the kernel currently running on my system was built with the CONFIG_CC_STACKPROTECTOR option turned on. Therefore I changed the default grub2 entry and rebooted the server, for… Read more »
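To check whether a given kernel was built with an option such as CONFIG_CC_STACKPROTECTOR, one approach is to read that kernel's config file. This is a minimal sketch, not the method used in the post: the helper name `option_enabled` is hypothetical, and the path `/boot/config-<release>` is an assumption that holds on many distributions but not all.

```python
import platform
from pathlib import Path


def option_enabled(config_text: str, option: str) -> bool:
    """Return True if `option` is set to y or m in kernel config text."""
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as comments: "# CONFIG_FOO is not set".
        if line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name == option:
            return value in ("y", "m")
    return False


if __name__ == "__main__":
    # Assumed location of the running kernel's config file.
    path = Path(f"/boot/config-{platform.release()}")
    if path.exists():
        print(option_enabled(path.read_text(), "CONFIG_CC_STACKPROTECTOR"))
    else:
        print(f"{path} not found; the config may live elsewhere on this system")
```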

Finding core-dump file

On a new server, my program crashed with a ‘core dump’, but I couldn’t find the core-dump file in the current directory as usual. First I checked the ‘ulimit’ configuration:

It seems OK: the system will generate a core-dump file when the program crashes. But where is it? Eventually, I found out the… Read more »
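The same core-file limit that ‘ulimit’ reports can also be read from inside a process. The sketch below uses Python's standard `resource` module; the helper name `describe_core_limit` is hypothetical. Note that `getrlimit` reports the limit in bytes, while the shell's `ulimit -c` usually reports it in blocks.

```python
import resource


def describe_core_limit(soft: int) -> str:
    """Turn a soft RLIMIT_CORE value into a readable description."""
    if soft == resource.RLIM_INFINITY:
        return "unlimited"
    if soft == 0:
        return "disabled"
    return f"{soft} bytes"


# Soft and hard limits on core file size for the current process.
soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
print("core file size (soft):", describe_core_limit(soft))
```

A soft limit of 0 means no core file is written at all, so this is a reasonable first thing to rule out.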

Finding the lost memory

We found a strange phenomenon on a production server. The “free” command shows that there is no free memory on this server. But when we add up the memory allocated by all processes:

it shows that all processes together use only 60GB of memory (the total physical memory of this server is 126GB)…. Read more »
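One way to add up per-process memory is to sum the `VmRSS` field of every `/proc/<pid>/status` file. This is a rough sketch, not necessarily the command used in the post; the function names are hypothetical, and summing RSS counts shared pages more than once, so the total is only an upper estimate of what processes hold.

```python
from pathlib import Path


def rss_kib(status_text: str) -> int:
    """Extract VmRSS (in KiB) from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return 0  # kernel threads have no VmRSS line


def total_rss_kib(proc_root: str = "/proc") -> int:
    """Sum VmRSS over every numeric (per-process) directory in /proc."""
    total = 0
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue
        try:
            total += rss_kib((entry / "status").read_text())
        except OSError:
            continue  # process exited or is not readable
    return total


if __name__ == "__main__":
    print(f"total RSS: {total_rss_kib() / 1024 / 1024:.1f} GiB")
```

If this total is far below what “free” reports as used, the difference has to be held somewhere outside process memory, which is the kind of gap described above.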

Run docker on centos6

Docker uses device mapper thin provisioning as its default storage, so if we want to run docker on centos6, we should update the kernel first. I used linux kernel 4.11 and noticed that these kernel options should be set:

After building the kernel and rebooting, I still couldn’t launch the docker service, and… Read more »

puppet 3 certification problem on centos 7

I configured the puppet master and agent by following these steps. But when I ran “puppet agent -t”, it reported an error:

My OS version is “Centos 7” and the puppet version is “3.7.5”. After I tried the solution given in the answer on this page, the problem still existed. Therefore, I write… Read more »

How to set the value of “$releasever” permanently for yum

On a test server I typed “sudo yum update”, and it reported errors like:

Then I found this page through google, which explains how to get the value of “$releasever”, but it does not tell us how to set “$releasever” permanently. Therefore, I had to search for words like “releasever” in… Read more »

Problems about using zookeeper

Problem 1: The zookeeper cluster had been running well for half a year. But today, after I reconfigured it and ran the command

it failed to start up and reported

The key is the last message, “Invalid config” (the log4j lines are just warnings); therefore I reviewed zoo.cfg many times but found no… Read more »