Unix Pipes Are Not Buffered (But Everything Else Is)
If you've ever piped a command into another and wondered why the output seems to "lag" or arrive in chunks, this one's for you. The pipe isn't the problem. It never was.
Pipes are just a kernel FIFO
When the shell sets up cmd1 | cmd2, it does roughly this:
int fds[2];
pipe(fds); // fds[0] = read end, fds[1] = write end
// fork cmd1, dup2(fds[1], STDOUT)
// cmd1's stdout writes into the pipe
// fork cmd2, dup2(fds[0], STDIN)
// cmd2's stdin reads from the pipe
The pipe itself is a dumb byte queue in the kernel. No
buffering strategy, no flushing, no opinions. Bytes written
to the write end are immediately available on the read end.
It has a capacity (64KB on Linux, varies elsewhere) and
write() blocks if it's full. That's your backpressure.
Think of it like a bounded tokio::sync::mpsc::channel but
for raw bytes instead of typed messages. One side writes,
the other reads, the kernel handles the queue.
So where does the buffering come from?
The C standard library (libc / glibc). Specifically, its
FILE* stream layer (the thing behind printf, puts,
fwrite to stdout, etc.).
When a C program starts up, before your main() even runs,
libc's runtime initializes stdout with this rule:
| stdout points to... | Buffering mode |
|---|---|
| A terminal (tty) | Line-buffered: flushes on every \n |
| A pipe or file | Fully buffered: flushes when the internal buffer fills (~4-8KB) |
This detection happens via isatty(STDOUT_FILENO). The
program checks if its stdout is a terminal and picks a
buffering strategy accordingly.
This is not a decision the shell makes. The shell just wires up the pipe. The program decides to buffer based on what it sees on the other end.
The classic surprise
# Works fine. stdout is a terminal, line-buffered, lines
# appear immediately.
tail -f /var/log/something
# Seems to lag. stdout is a pipe, fully buffered, lines
# arrive in 4KB chunks.
tail -f /var/log/something | grep error
The pipe between tail and grep is instant. But tail
detects its stdout is a pipe, switches to full buffering,
and holds onto output until its internal buffer fills. So
grep sits there waiting for a 4KB chunk instead of getting
lines one at a time.
Same deal with any stdio-based command. awk, sed, and cut
all run the same isatty check.
The workarounds
stdbuf: override libc's buffering choice
stdbuf -oL tail -f /var/log/something | grep error
-oL means "force stdout to line-buffered." It works by
LD_PRELOADing a shim library that overrides libc's
initialization. This only works for dynamically-linked
programs that use libc's stdio (most things, but not
everything).
unbuffer (from expect)
unbuffer tail -f /var/log/something | grep error
Creates a pseudo-terminal (pty) so the program thinks
it's talking to a terminal and uses line buffering. Heavier
than stdbuf but works on programs that don't use libc's
stdio.
In your own code: just don't add buffering
In Rust, raw std::fs::File writes are unbuffered. Every
.write() call goes straight to the kernel via the write
syscall:
use std::io::Write;
// Inside an async fn; `write_file` is the pipe's write end
// wrapped in a std::fs::File.
// Immediately available on the read end. No flush needed.
write_file.write_all(b"first line\n")?;
// Reader already has that line. Do whatever.
tokio::time::sleep(Duration::from_secs(1)).await;
// This also lands immediately.
write_file.write_all(b"second line\n")?;
If you wrap it in BufWriter, now you've opted into the
same buffering libc does:
use std::io::{BufWriter, Write};
let mut writer = BufWriter::new(write_file);
writer.write_all(b"first line\n")?;
// NOT visible yet. Sitting in an 8KB userspace buffer.
writer.flush()?;
// NOW visible on the read end.
Rust's println! writes through a line-buffered stdout (a
LineWriter under the hood), regardless of whether it's a
terminal, so there's no libc-style isatty switch. If you need
guaranteed unbuffered writes, use the raw fd or explicitly
flush.
How pikl uses this
In pikl's test helpers, we create a pipe to feed action
scripts to the --action-fd flag:
let mut fds = [0i32; 2];
unsafe { libc::pipe(fds.as_mut_ptr()) };
let [read_fd, write_fd] = fds;
// Wrap the write end in a File. Raw, unbuffered.
let mut write_file =
unsafe { std::fs::File::from_raw_fd(write_fd) };
// Write the script. Immediately available on read_fd.
write_file.write_all(script.as_bytes())?;
// Close the write end so the reader gets EOF.
drop(write_file);
Then in the child process, we remap the read end to the expected fd:
// pre_exec runs in the child after fork(), before exec()
cmd.pre_exec(move || {
if read_fd != target_fd {
// make target_fd point to the pipe's read end
libc::dup2(read_fd, target_fd);
// close the original (now redundant)
libc::close(read_fd);
}
Ok(())
});
For streaming/async scenarios (like feeding items to pikl
over time), the same approach works. Just don't drop the
write end. Each write_all call pushes bytes through the
pipe immediately, and the reader picks them up as they
arrive. No flush needed because File doesn't buffer.
tl;dr
- Pipes are instant. They're a kernel FIFO with zero buffering.
- The "buffering" you see is libc's FILE* layer choosing full buffering when stdout isn't a terminal.
- Use stdbuf -oL or unbuffer to fix other people's programs.
- In your own code, use raw File (not BufWriter) and every write lands immediately.
- It was always libc. Bloody libc.