Optimizing indirect branches in a system-level dynamic binary translator

Toshihiko Koju; Xin Tong; Ali Ijaz Sheikh; Moriyoshi Ohara; Toshio Nakatani

doi:10.1145/2367589.2367599

SYSTOR 2012

Conference paper

22 Oct 2012

Optimizing indirect branches in a system-level dynamic binary translator

View publication

Abstract

A dynamic binary translator (DBT) is a runtime system that translates binary code on the fly, for example to emulate the execution of the binary code on a processor with a different instruction set. One of the major sources of the overhead is the resolution of the branch target addresses for indirect branch instructions. Previous work has addressed this problem for a single virtual address space, but none has addressed it for multiple virtual address spaces in the context of the system-level DBT. This is challenging for compiler optimizations because the compiler cannot compute the virtual addresses of the branch targets for indirect branches at compile-time since they are affected by the runtime states of the emulated TLB. In this paper, we propose a new compiler optimization technique to address the problem for a system-level DBT. Our key idea is to use an offset from the virtual address of each page that contains a branch instruction, since this offset is not affected by the emulated TLB. We found that the compiler can often compute the offset using compile-time constants and that this approach significantly simplifies the guard code necessary for an indirect branch. We implemented this technique in a compiler of a system-level DBT for the z/Architecture. Our experimental results showed our technique can reduce the execution times of the CBW2 benchmarks, part of the standard LSPR benchmark, by up to 5.9% and 2.5% on average. Our analysis indicated that our technique was able to optimize 3.8% of the total dynamic instructions in the original binary code, while completely removing the guard code for 98.9% of these indirect branches. © 2012 ACM.

Conference paper