Memory mapping of pipes under Linux

 

Description

Memory mapping (API function mmap()) is an alternative way of accessing files. Compared to the IO functions read() and write(), it avoids unnecessary copy operations. The method is not widely used, though, because its standard implementations cannot be applied to stream objects such as pipes, sockets or (pseudo-)terminal devices. It is the consistent applicability of the standard IO functions to all kinds of objects that makes them more attractive.
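To make the difference between the two access methods concrete, here is a minimal sketch that counts the newlines in an ordinary file, once with read() and once with mmap(). It is an illustration only, not code from the patch or the thesis; the function names count_newlines_read and count_newlines_mmap are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Count newlines with read(): the kernel copies each chunk into buf. */
long count_newlines_read(const char *path)
{
    char buf[4096];
    long lines = 0;
    ssize_t n;
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        for (ssize_t i = 0; i < n; i++)
            if (buf[i] == '\n')
                lines++;
    close(fd);
    return n < 0 ? -1 : lines;
}

/* Count newlines with mmap(): the file's pages are accessed in place. */
long count_newlines_mmap(const char *path)
{
    struct stat st;
    long lines = 0;
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {          /* mmap() rejects a zero length */
        close(fd);
        return 0;
    }
    const char *p = mmap(NULL, (size_t)st.st_size, PROT_READ,
                         MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping stays valid after close() */
    if (p == MAP_FAILED)
        return -1;
    for (off_t i = 0; i < st.st_size; i++)
        if (p[i] == '\n')
            lines++;
    munmap((void *)p, (size_t)st.st_size);
    return lines;
}
```

Both functions return the same count for an ordinary file. The mmap() variant needs the file size from fstat() before it can map anything, which is exactly the information a stream object cannot provide.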

To eliminate this disadvantage, I present the model of an extended mmap() call that allows arbitrary stream objects to be mapped into a program's address space. A program using this extended call can adhere to a single paradigm for interacting with external objects, that of mmap(), and still avoid unnecessary copying.

The properties of the memory object created by mapping a stream object are modeled after the behavior of a mapped file, though the object still keeps some properties of the original; for example, the size of a pipe or a socket is by definition not known. Although mmap() cannot change this, the large virtual address spaces of current architectures make the model very useful.
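For contrast, the sketch below probes what the standard mmap() does when it is handed the read end of a pipe. It is an illustration only and assumes an unpatched kernel; the helper name try_mmap_pipe is hypothetical. On the stock Linux kernels I know of, the call fails (typically with ENODEV), but the code reports whatever errno it sees rather than assuming a particular value.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Try to map the read end of a fresh pipe with the standard mmap().
 * Returns 0 if the mapping succeeded, otherwise the errno that mmap()
 * reported (or that pipe() reported if no pipe could be created).
 */
int try_mmap_pipe(size_t len)
{
    int fds[2];
    int err = 0;

    assert(len > 0);                 /* mmap() rejects a zero length */
    if (pipe(fds) < 0)
        return errno;

    void *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fds[0], 0);
    if (p == MAP_FAILED)
        err = errno;                 /* expected path on an unpatched kernel */
    else
        munmap(p, len);              /* only reached if pipes can be mapped */

    close(fds[0]);
    close(fds[1]);
    return err;
}
```

Note that the caller has to choose len itself, because a pipe has no size to query. A nonzero return value from try_mmap_pipe(4096) means the kernel refused the mapping.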

Using memory-mapped pipes as the example, I specify the model of extended mmap() in detail and have implemented it in Linux.

Extended memory mapping, especially the shift from page granularity to byte granularity, requires some refinements in virtual memory management.
 

Results

I have adapted some simple commands (cat, wc and grep) and compared their performance with that of the unmodified versions, which are implemented on the basis of standard IO.

As expected, the versions using mmap() on ordinary files performed better, the difference being mostly in the system time used. In the measurements, the standard IO implementations required more than twice as much time as the mmap() versions.

A second experiment measured the run times of programs communicating via a pipe. The access method was either read()/write() or mmap(). Here the result was not as expected: the run time with mmap() was slightly higher, by a factor of 1.1. This increase can be attributed to the sample implementation not being optimized.
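The read()/write() side of such a pipe experiment can be sketched as follows. This is not the benchmark program used for the measurements, only a minimal illustration of the standard IO baseline; the function name pipe_transfer_read_write and the 4096-byte buffer size are assumptions.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Send `total` bytes from a child process to the parent through a pipe
 * using write() and read(). Returns the number of bytes the parent
 * received, or -1 on error.
 */
long pipe_transfer_read_write(size_t total)
{
    char buf[4096];
    int fds[2];
    int status;
    long received = 0;
    ssize_t n;
    pid_t pid;

    assert(total <= (size_t)LONG_MAX);    /* result must fit in a long */
    if (pipe(fds) < 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {                       /* child: producer */
        close(fds[0]);
        memset(buf, 'x', sizeof buf);
        while (total > 0) {
            size_t chunk = total < sizeof buf ? total : sizeof buf;
            n = write(fds[1], buf, chunk);
            if (n < 0)
                _exit(1);
            total -= (size_t)n;
        }
        close(fds[1]);
        _exit(0);
    }

    close(fds[1]);                        /* parent: consumer */
    while ((n = read(fds[0], buf, sizeof buf)) > 0)
        received += n;
    close(fds[0]);

    if (waitpid(pid, &status, 0) < 0 || n < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return received;
}
```

Every byte is copied twice here, once from the producer's buffer into the pipe and once from the pipe into the consumer's buffer. Those are the copies a mapped pipe is meant to avoid. To compare run times, a call such as pipe_transfer_read_write(1 << 20) could be timed with time(1) or getrusage().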

The whole table of measurements can be found on this site.

To read all about extended mmap(), take a look at my thesis.

Installation

Apply the patch to the kernel source tree (it works only on a 2.0.36 kernel), then configure, compile and run the new kernel. For information about the patch, read the README.

Use one of the modified programs (wc, grep) or one of the test programs (cat, pipemap) to learn more about memory mapping of pipes.

P.Syrowatka