doc: Lesson about fd handling to avoid buffered pipes.

2026-03-13 22:00:00 -04:00
parent fdfb4eaab5
commit 6187b83f26
1 changed files with 186 additions and 0 deletions
--- a/docs/lessons/unix-pipes-and-buffering.md
+++ b/docs/lessons/unix-pipes-and-buffering.md
@@ -0,0 +1,186 @@
+# Unix Pipes Are Not Buffered (But Everything Else Is)
+
+If you've ever piped a command into another and wondered why
+the output seems to "lag" or arrive in chunks, this one's
+for you. The pipe isn't the problem. It never was.
+
+## Pipes are just a kernel FIFO
+
+When the shell sets up `cmd1 | cmd2`, it does roughly this:
+
+```c
+int fds[2];
+pipe(fds);          // fds[0] = read end, fds[1] = write end
+// fork cmd1, dup2(fds[1], STDOUT)
+//   cmd1's stdout writes into the pipe
+// fork cmd2, dup2(fds[0], STDIN)
+//   cmd2's stdin reads from the pipe
+```
+
+The pipe itself is a dumb byte queue in the kernel. No
+buffering strategy, no flushing, no opinions. Bytes written
+to the write end are immediately available on the read end.
+It has a capacity (64 KiB by default on Linux, varies
+elsewhere) and `write()` blocks if it's full. That's your
+backpressure.
+
+Think of it like a bounded `tokio::sync::mpsc::channel` but
+for raw bytes instead of typed messages. One side writes,
+the other reads, the kernel handles the queue.
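+
+A minimal Rust sketch of that claim, assuming a Unix box with
+`cat` on the `PATH`. `round_trip_through_cat` is a
+hypothetical helper, not part of pikl. It pushes one line
+through two pipes and reads it back while the write end is
+still open:
+
+```rust
+use std::io::{self, BufRead, BufReader, Write};
+use std::process::{Command, Stdio};
+
+/// Sends one line through `cat` and reads it back without
+/// closing or flushing the write end first.
+fn round_trip_through_cat(line: &str) -> io::Result<String> {
+    let mut child = Command::new("cat")
+        .stdin(Stdio::piped())
+        .stdout(Stdio::piped())
+        .spawn()?;
+
+    let mut to_child = child.stdin.take().expect("stdin was piped");
+    let mut from_child =
+        BufReader::new(child.stdout.take().expect("stdout was piped"));
+
+    // ChildStdin is unbuffered: this goes straight into the pipe.
+    writeln!(to_child, "{line}")?;
+
+    // The write end is still open, yet the line is readable.
+    let mut echoed = String::new();
+    from_child.read_line(&mut echoed)?;
+
+    // Closing the write end gives cat EOF so it can exit.
+    drop(to_child);
+    child.wait()?;
+    Ok(echoed)
+}
+```
+
+If the pipe itself held data back, `read_line` would hang
+here until the write end closed.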
+
+## So where does the buffering come from?
+
+The C standard library (`libc` / `glibc`). Specifically, its
+`FILE*` stream layer (the thing behind `printf`, `puts`,
+`fwrite` to stdout, etc.).
+
+libc picks stdout's buffering mode on its own, without your
+`main()` ever asking for it (glibc settles it the first time
+stdout is used), following this rule:
+
+| stdout points to... | Buffering mode             |
+|----------------------|----------------------------|
+| A terminal (tty)     | **Line-buffered**: flushes on every `\n` |
+| A pipe or file       | **Fully buffered**: flushes when the internal buffer fills (~4-8KB) |
+
+This detection happens via `isatty(STDOUT_FILENO)`. The
+program checks if its stdout is a terminal and picks a
+buffering strategy accordingly.
+
+**This is not a decision the shell makes.** The shell just
+wires up the pipe. The *program* decides to buffer based on
+what it sees on the other end.
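+
+The same decision, sketched in Rust with
+`std::io::IsTerminal`. `BufferingMode`, `mode_for` and
+`stdout_mode` are made-up names that mirror the table above;
+this is an illustration, not libc's actual code:
+
+```rust
+use std::io::{self, IsTerminal};
+
+#[derive(Debug, PartialEq)]
+enum BufferingMode {
+    LineBuffered,
+    FullyBuffered,
+}
+
+// Mirrors the table: a terminal gets line buffering,
+// anything else (pipe, file) gets full buffering.
+fn mode_for(is_terminal: bool) -> BufferingMode {
+    if is_terminal {
+        BufferingMode::LineBuffered
+    } else {
+        BufferingMode::FullyBuffered
+    }
+}
+
+// What a libc-style runtime would pick for this process.
+fn stdout_mode() -> BufferingMode {
+    mode_for(io::stdout().is_terminal())
+}
+```
+
+Run a program that prints `stdout_mode()` directly and then
+through `| cat`, and the answer should flip. The shell did
+nothing different except hand over another kind of fd.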
+
+## The classic surprise
+
+```bash
+# Works fine. grep's stdout is a terminal, line-buffered,
+# matches appear immediately.
+tail -f /var/log/something | grep error
+
+# Seems to lag. grep's stdout is now a pipe, fully buffered,
+# matches arrive in ~4KB chunks.
+tail -f /var/log/something | grep error | tee errors.log
+```
+
+Both pipes are instant, and `tail -f` flushes after each
+batch of new lines. But `grep` detects its stdout is a pipe,
+switches to full buffering, and holds onto matches until its
+internal buffer fills. So `tee` sits there waiting for a
+~4KB chunk instead of getting lines one at a time.
+
+Same deal with most stdio-based filters. `awk`, `sed`,
+`cut`, they all get the same isatty-driven default.
+
+## The workarounds
+
+### `stdbuf`: override libc's buffering choice
+
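+```bash
+# Prefix whichever stage is buffering, e.g. a grep that
+# feeds another pipe.
+tail -f /var/log/something | stdbuf -oL grep error | tee errors.log
+```
+
+`-oL` means "force stdout to line-buffered." It works by
+`LD_PRELOAD`ing a shim library that calls `setvbuf` before
+the program's `main()` runs. This only works for
+dynamically-linked programs that use libc's stdio and don't
+set their own buffering (most things, but not everything).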
+
+### `unbuffer` (from `expect`)
+
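+```bash
+# -p lets unbuffer sit in the middle of a pipeline and
+# still pass stdin through.
+tail -f /var/log/something | unbuffer -p grep error | tee errors.log
+```
+
+Creates a pseudo-terminal (pty) so the program *thinks*
+it's talking to a terminal and uses line buffering. Heavier
+than `stdbuf` but works on anything that keys off `isatty`,
+including statically-linked programs and runtimes with
+their own I/O layer.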
+
+### In your own code: just don't add buffering
+
+In Rust, raw `std::fs::File` writes are unbuffered. Every
+`.write()` call goes straight to the kernel via the `write`
+syscall:
+
+```rust
+use std::io::Write;
+use std::time::Duration;
+
+// Immediately available on the read end. No flush needed.
+write_file.write_all(b"first line\n")?;
+
+// Reader already has that line. Do whatever.
+tokio::time::sleep(Duration::from_secs(1)).await;
+
+// This also lands immediately.
+write_file.write_all(b"second line\n")?;
+```
+
+If you wrap it in `BufWriter`, now you've opted into the
+same buffering libc does:
+
+```rust
+use std::io::{BufWriter, Write};
+
+let mut writer = BufWriter::new(write_file);
+writer.write_all(b"first line\n")?;
+// NOT visible yet. Sitting in an 8KB userspace buffer.
+writer.flush()?;
+// NOW visible on the read end.
+```
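+
+To watch the buffer hold bytes back without needing a real
+pipe, this sketch swaps the write end for an in-memory
+`Vec<u8>`. `visible_before_and_after_flush` is a hypothetical
+helper, and it assumes the payload is smaller than
+`BufWriter`'s default capacity:
+
+```rust
+use std::io::{self, BufWriter, Write};
+
+/// Returns how many bytes the sink has received before and
+/// after `flush()`.
+fn visible_before_and_after_flush(data: &[u8]) -> io::Result<(usize, usize)> {
+    // The Vec stands in for the pipe's write end.
+    let mut writer = BufWriter::new(Vec::new());
+
+    writer.write_all(data)?;
+    // Small writes stay in BufWriter's own buffer.
+    let before = writer.get_ref().len();
+
+    writer.flush()?;
+    // Only now has the sink seen the bytes.
+    let after = writer.get_ref().len();
+
+    Ok((before, after))
+}
+```
+
+For `b"first line\n"` this reports `(0, 11)`: nothing
+reaches the sink until `flush()`.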
+
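+Rust's `println!` and `stdout().lock()` are different again:
+stdout sits behind a `LineWriter`, which flushes on `\n`
+whether or not it's a terminal. A partial line still waits
+in the buffer. If you need guaranteed unbuffered writes, use
+the raw fd or explicitly flush.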
+
+## How pikl uses this
+
+In pikl's test helpers, we create a pipe to feed action
+scripts to the `--action-fd` flag:
+
+```rust
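+use std::io::Write;
+use std::os::unix::io::FromRawFd;
+
+let mut fds = [0i32; 2];
+// pipe() returns -1 on failure; don't wrap garbage fds.
+if unsafe { libc::pipe(fds.as_mut_ptr()) } == -1 {
+    return Err(std::io::Error::last_os_error().into());
+}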
+let [read_fd, write_fd] = fds;
+
+// Wrap the write end in a File. Raw, unbuffered.
+let mut write_file =
+    unsafe { std::fs::File::from_raw_fd(write_fd) };
+
+// Write the script. Immediately available on read_fd.
+write_file.write_all(script.as_bytes())?;
+// Close the write end so the reader gets EOF.
+drop(write_file);
+```
+
+Then in the child process, we remap the read end to the
+expected fd:
+
+```rust
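+use std::os::unix::process::CommandExt;
+
+// pre_exec runs in the child after fork(), before exec().
+// It is an unsafe fn: keep the closure to simple syscalls.
+unsafe {
+    cmd.pre_exec(move || {
+        if read_fd != target_fd {
+            // make target_fd (e.g. fd 3) point to the pipe
+            if libc::dup2(read_fd, target_fd) == -1 {
+                return Err(std::io::Error::last_os_error());
+            }
+            // close the original (now redundant)
+            libc::close(read_fd);
+        }
+        Ok(())
+    });
+}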
+```
+
+For streaming/async scenarios (like feeding items to pikl
+over time), the same approach works. Just don't drop the
+write end. Each `write_all` call pushes bytes through the
+pipe immediately, and the reader picks them up as they
+arrive. No flush needed because `File` doesn't buffer.
+
+## tl;dr
+
+- Pipes are instant. They're a kernel FIFO with a fixed
+  capacity and no flushing policy of their own.
+- The "buffering" you see is libc's `FILE*` layer choosing
+  full buffering when stdout isn't a terminal.
+- `stdbuf -oL` or `unbuffer` to fix other people's
+  programs.
+- In your own code, use raw `File` (not `BufWriter`) and
+  every write lands immediately.
+- It was always libc. Bloody libc.