We use a pipe in our program and ran into a new problem: writing 16MB of data into the pipe in a single call fails. It looks like a pipe has a limited size, but what exactly is that size? Searching the web gave inconsistent answers: some say 16KB, others say 64KB. So I had to read the kernel code myself to find the correct answer.
Since all the servers in my company run ali_kernel, which is based on the CentOS 2.6.32 kernel, I traced the call path in that source:

sys_pipe() --> sys_pipe2() --> do_pipe_flags() --> create_write_pipe():

struct file *create_write_pipe(int flags)
{
......
        path.dentry->d_flags &= ~DCACHE_UNHASHED;
        d_instantiate(path.dentry, inode);

        err = -ENFILE;
        f = alloc_file(&path, FMODE_WRITE, &write_pipefifo_fops);
        if (!f)
                goto err_dentry;
        f->f_mapping = inode->i_mapping;
......

It looks like all write operations on the pipe are handled by “write_pipefifo_fops”. Let’s look inside:

const struct file_operations write_pipefifo_fops = {
        .llseek         = no_llseek,
        .read           = bad_pipe_r,
        .write          = do_sync_write,
        .aio_write      = pipe_write,
        .poll           = pipe_poll,
        .unlocked_ioctl = pipe_ioctl,
        .open           = pipe_write_open,
        .release        = pipe_write_release,
        .fasync         = pipe_write_fasync,
};

Clearly, pipe_write() is responsible for writing: .write points to the generic do_sync_write(), which in turn calls the .aio_write handler. Let’s keep going.

static ssize_t
pipe_write(struct kiocb *iocb, const struct iovec *_iov,
            unsigned long nr_segs, loff_t ppos)
{
......
        for (;;) {
                int bufs;

                if (!pipe->readers) {
                        send_sig(SIGPIPE, current, 0);
                        if (!ret)
                                ret = -EPIPE;
                        break;
                }
                bufs = pipe->nrbufs;
                if (bufs < PIPE_BUFFERS) {
                        int newbuf = (pipe->curbuf + bufs) & (PIPE_BUFFERS-1);
                        struct pipe_buffer *buf = pipe->bufs + newbuf;
                        struct page *page = pipe->tmp_page;
                        char *src;
                        int error, atomic = 1;

                        if (!page) {
                                page = alloc_page(GFP_HIGHUSER);
                                if (unlikely(!page)) {
                                        ret = ret ? : -ENOMEM;
                                        break;
                                }
                                pipe->tmp_page = page;
                        }
......
                        pipe->nrbufs = ++bufs;
                        pipe->tmp_page = NULL;

                        total_len -= chars;
                        if (!total_len)
                                break;
                }
......
                pipe_wait(pipe);
......

As shown above, when a write arrives and the data does not fit into the pages already in the pipe, the kernel allocates a new page. Each time it adds a page it increments ‘pipe->nrbufs’, and once ‘nrbufs’ reaches PIPE_BUFFERS the writer calls pipe_wait() and blocks, which means the write() system call waits until a reader drains the pipe. ‘PIPE_BUFFERS’ is set to 16 and a page in the Linux kernel is 4KB, so a pipe in ali_kernel can hold 64KB (16 * 4KB) of data at a time.
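
The 16 * 4KB figure can also be checked from user space without reading kernel code. The following is only a minimal sketch, not something taken from the kernel source: it assumes a Linux system, and the function name measure_pipe_capacity() is made up for this example. It switches the write end of a fresh pipe to non-blocking mode and keeps writing 4KB chunks until write() reports EAGAIN, i.e. the point where a blocking writer would end up in pipe_wait().

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: fill an empty pipe with non-blocking 4KB writes
 * until the kernel refuses more data. Returns the number of bytes the
 * pipe accepted, or -1 on error. */
long measure_pipe_capacity(void)
{
    int fds[2];
    char chunk[4096] = {0};   /* one 4KB page worth of data per write */
    long total = 0;
    int flags;

    if (pipe(fds) == -1)
        return -1;

    /* Make the write end non-blocking so a full pipe yields EAGAIN
     * instead of putting this process to sleep. */
    flags = fcntl(fds[1], F_GETFL);
    if (flags == -1 || fcntl(fds[1], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    for (;;) {
        ssize_t n = write(fds[1], chunk, sizeof chunk);
        if (n > 0) {
            total += n;
            continue;
        }
        if (n == -1 && errno == EINTR)
            continue;          /* interrupted by a signal, retry */
        if (n == -1 && errno == EAGAIN)
            break;             /* pipe is full */
        total = -1;            /* unexpected error */
        break;
    }

    close(fds[0]);
    close(fds[1]);
    return total;
}
```

If the analysis above holds, calling measure_pipe_capacity() from a small main() on a kernel like this one should report 65536, but the exact value depends on the kernel version and page size of the machine.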
This changed in kernel version 2.6.35, which added a new proc entry, ‘/proc/sys/fs/pipe-max-size’.
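
On kernels that have this proc entry, the size is no longer a compile-time constant, so it is easier to ask the kernel than to guess. The sketch below is an illustration under some assumptions: it relies on the Linux-specific fcntl() commands F_GETPIPE_SZ and F_SETPIPE_SZ, which are not mentioned above but were introduced together with the proc entry, and the helper names read_pipe_max_size() and pipe_capacity() are invented for this example.

```c
#define _GNU_SOURCE            /* exposes F_GETPIPE_SZ / F_SETPIPE_SZ in glibc */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Fallback for older headers: these are the values used by the Linux ABI. */
#ifndef F_SETPIPE_SZ
#define F_SETPIPE_SZ 1031
#define F_GETPIPE_SZ 1032
#endif

/* Hypothetical helper: read the system-wide limit up to which an
 * unprivileged process may grow a pipe. Returns -1 if the proc entry
 * is missing or unreadable (e.g. on a pre-2.6.35 kernel). */
long read_pipe_max_size(void)
{
    long max = -1;
    FILE *fp = fopen("/proc/sys/fs/pipe-max-size", "r");

    if (!fp)
        return -1;
    if (fscanf(fp, "%ld", &max) != 1)
        max = -1;
    fclose(fp);
    return max;
}

/* Hypothetical helper: create a pipe, optionally ask the kernel to
 * resize it to `wanted` bytes (0 keeps the default), and return the
 * capacity the kernel reports afterwards, or -1 on error. */
long pipe_capacity(long wanted)
{
    int fds[2];
    long size;

    if (pipe(fds) == -1)
        return -1;

    if (wanted > 0 && fcntl(fds[1], F_SETPIPE_SZ, (int)wanted) == -1)
        size = -1;             /* resize refused, e.g. above the limit */
    else
        size = fcntl(fds[1], F_GETPIPE_SZ);

    close(fds[0]);
    close(fds[1]);
    return size;
}
```

pipe_capacity(0) reports the default size of a new pipe, and pipe_capacity(1 << 20) tries to grow it to 1MB; the kernel may round the request up, or refuse it if it exceeds the value returned by read_pipe_max_size(). On a 2.6.32-based kernel such as ali_kernel both helpers are expected to fail, since neither the fcntl() commands nor the proc entry exist there.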