### VU2 185.324

#### Compilation Techniques for VLIW Architectures

Dietmar Ebner ebner@complang.tuwien.ac.at Florian Brandner brandner@complang.tuwien.ac.at

http://complang.tuwien.ac.at/cd/vliw







# Phases of an ILP Oriented Compiler



### Instruction Scheduling

- Reorder instructions to
  - Minimize the execution time
  - Maximize the utilization of computational resources
  - Data dependencies need to be preserved

#### • For VLIWs

- Essential for correct code & high performance
- Often combined with instruction bundling
- Accounting for clusters

05/16/08 Ebner, Brandner | Compilation Techniques for VLIWs | SS08 Slide #7

### Phase Ordering

- Interacts heavily with other optimizations
  - Cluster Assignment
    - · Limits freedom of the scheduler
  - Register allocation (RA)
  - Before scheduling
    - May limit freedom of the scheduler
    - Introduces false dependencies
  - After scheduling
  - Scheduling may increase register pressure
    May lead to spilling



- Clustering adds significant complexity to scheduling
  - Usually done beforehand
  - Resembles a min-cut problem
    - But different constraints
  - Do not minimize the moves, but execution time
- A promising approach
  - Color the DDG and find partial connected components (PCC)
- Iterative refinement by reassigning PCCs
- 16/08 Ebner, Brandner | Compilation Techniques for VLIWs | SS08 Slide #9



- Typically schedule pre & post RA
- Integrated Prepass Scheduling
- Two scheduling modes
- CSR: Schedule to minimize register usage
- CSP: Schedule to minimize pipeline delays
- Estimate the current register usage
- Integrated techniques
  - Solve both problems simultaneously

































- · States of the automaton
  - Capture the architecture state
  - Other constraints (encoding, etc.)
- Transitions
  - Model the scheduling of instructions
  - Prohibit transition for instructions causing hazards or violating some constraint







- The scope of basic blocks is too limited



# Trace Scheduling (1)

- Extend instruction scheduling to traces
  - Traces allow to schedule across basic blocks
  - Developed for micro-code compaction by J. Fisher
- A sequence of basic blocks form a trace
  - Build a linear path through the CFG
  - May contain several entries (joins)
  - May contain several exits (splits)













- Forming profitable traces depends heavily on profiling information
  - Selection of hot blocks
  - Rating candidate blocks during trace expansion
- It is beneficial if the program executes mostly the same (few) paths and is predictable
- It is hard to find traces if all execution paths have equal frequencies













# Limitations of Trace scheduling

- Sensitive to static branch prediction accuracy
  - Off-trace paths are penalized
  - Favor shorter traces in these cases
- Code size growth
  - Caused by compensation code
  - May hurt instruction cache
  - Can be controlled using thresholds
- Handling loops
  - Trace scheduling is an acyclic technique

