Trace rewriting for Python

Jakob Sievers

Python is an object-oriented, dynamic programming language, implemented by a switch-based bytecode interpreter. The generality of Python's virtual machine instruction set necessitates a large number of run-time checks, which place a noticeable performance burden on the current implementation. This thesis investigates the use of run-time information to specialize frequently executed traces of VM instructions at the bytecode level, as well as the effectiveness of traditional VM optimizations in the context of Python.

A variant of the Python virtual machine based on modern implementation techniques such as threaded code and superinstructions was developed. Special VM instructions that use inline caching to avoid much of the run-time resolution overhead of stock Python were added to this virtual machine. It was then extended to identify and record hot instruction sequences at run time and to rewrite them using the new inline-caching instructions, together with special guard instructions that ensure cache integrity.

The effect of these transformations was evaluated on a number of benchmarks, Python libraries, and full applications. Speedups of up to about 20% were achieved for certain classes of programs; for other applications, the gains did not even recoup the overhead caused by the recording machinery. Classical VM optimizations were found to have a considerable impact on typical benchmarks, but only a marginal influence on the running times of large applications.
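The combination of inline caching and guards mentioned above can be illustrated with a minimal sketch. It is not the mechanism implemented in the thesis: `VersionedDict`, `CachedGlobalLoad`, and the version-counter scheme are hypothetical names and assumptions chosen for illustration, and the guard is folded into the caching instruction here, whereas the thesis describes separate guard instructions.

```python
class VersionedDict:
    """Hypothetical namespace that bumps a version counter on every write."""

    def __init__(self):
        self._data = {}
        self.version = 0

    def __setitem__(self, key, value):
        self._data[key] = value
        self.version += 1

    def __getitem__(self, key):
        return self._data[key]


class CachedGlobalLoad:
    """Hypothetical inline-caching instruction for a name lookup."""

    def __init__(self, name):
        self.name = name
        self.cached_value = None
        self.cached_version = -1  # never matches a real version
        self.hits = 0
        self.misses = 0

    def execute(self, namespace):
        # Guard: the cached value is valid only if the namespace
        # has not been written to since the cache was filled.
        if namespace.version == self.cached_version:
            self.hits += 1
            return self.cached_value
        # Guard failed: do the generic lookup and refill the cache.
        self.misses += 1
        self.cached_value = namespace[self.name]
        self.cached_version = namespace.version
        return self.cached_value


ns = VersionedDict()
ns["x"] = 1
load_x = CachedGlobalLoad("x")
results = [load_x.execute(ns) for _ in range(3)]  # first call fills the cache
ns["x"] = 2                                       # write invalidates the cache
results.append(load_x.execute(ns))                # guard fails, value reloaded
```

In this sketch, repeated executions skip the dictionary lookup as long as the guard holds, which is the kind of run-time resolution overhead the specialized instructions aim to avoid. Any write to the namespace makes the guard fail, so a stale value is never returned.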