What is Just-In-Time Compilation? Why is it important?

I have been working on a compiler that supports Just-In-Time (JIT) compilation for 3 years. This was the question that confused me most at the beginning, and the one I have been asked most often since. In this article, I will try to explain JIT and its main use cases without tying the explanation to a specific language. Hopefully, this helps you grasp the idea and motivation behind Just-in-Time compilation.

There are a lot of good answers on the Internet. Most of them are tied to a specific language such as Java, JavaScript, Pascal, or C#, and to specific use cases. I strongly recommend reading those explanations once you finish this one.

https://aboullaite.me/understanding-jit-compiler-just-in-time-compiler/ (Java)

https://devblogs.nvidia.com/cuda-pro-tip-understand-fat-binaries-jit-caching/ (CUDA)

https://hacks.mozilla.org/2017/02/a-crash-course-in-just-in-time-jit-compilers/ (JavaScript)

What is JIT

According to Wikipedia:

In computing, just-in-time (JIT) compilation is a way of executing computer code that involves compilation during execution of a program — at run time — rather than before execution.

The typical compilation process (non-JIT, or Ahead-of-Time compilation) translates a high-level programming language to machine code at compile time. During execution of the program (runtime), the machine code is fixed. Just-in-Time compilation is a technique to generate or regenerate machine code at runtime. Implementations vary: the machine code can be generated from an intermediate representation saved at compile time, or regenerated from a less-optimized version of the machine code.
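As a loose analogy for generating code at runtime, here is a minimal JavaScript sketch. JavaScript does not expose machine code, so generated source text stands in for it here. The helper name makePower is made up for illustration; the Function constructor is a standard part of the language.

```javascript
// Analogy only: generated JavaScript source stands in for machine code.
// makePower is a hypothetical helper, not part of any JIT API.
function makePower(exponent) {
  // Build a body such as "return x * x * x;" for an exponent known only at run time.
  const factors = Array(exponent).fill("x").join(" * ") || "1";
  return new Function("x", `return ${factors};`);
}

// The function below does not exist until the program is already running.
const cube = makePower(3);
console.log(cube(4)); // 64
```

A real JIT compiler does the same kind of thing one level lower: it emits machine instructions into memory instead of source text.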

Why is it useful? There are 2 main motivations to use JIT today:

1. Use runtime data to find optimization opportunities.

2. Support more hardware architectures.

Use Case 1: Use runtime data to find optimization opportunities

Programs contain static parts and dynamic parts. The dynamic parts are not known at compile time. To see what this means, take the following JavaScript code:

function add(a, b) {
  return a + b;
}

The above code adds the variables a and b. What is unknown at compile time:

  1. The types and value ranges of a and b.
  2. How many times the function gets called.
  3. How much memory is needed to store the result.
  4. Whether the function is used at all.

The compiler could make smarter optimization decisions if it knew any of these. This information is not available at compile time because the user didn’t provide it in the code and the compiler can’t easily infer it. However, it may become available at runtime, and with it the program can be optimized further.
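To make the idea of runtime information concrete, here is a minimal sketch that records which argument types the add function actually receives. The profile wrapper and its seen property are hypothetical names made up for this illustration; real engines collect this kind of feedback internally and in far more detail.

```javascript
// Hypothetical helper: wraps a function and counts the argument types seen at run time.
function profile(fn) {
  const seen = new Map();
  const wrapped = (...args) => {
    const key = args.map((a) => typeof a).join(",");
    seen.set(key, (seen.get(key) || 0) + 1);
    return fn(...args);
  };
  wrapped.seen = seen;
  return wrapped;
}

function add(a, b) {
  return a + b;
}

const profiledAdd = profile(add);
profiledAdd(1, 2);
profiledAdd(3, 4);
profiledAdd("a", "b");

// Pick the most frequent type signature; a JIT could specialize for it first.
const hottest = [...profiledAdd.seen.entries()].sort((x, y) => y[1] - x[1])[0];
console.log(hottest[0], hottest[1]); // number,number 2
```

With this data in hand, a compiler could generate a version of add that assumes two numbers and fall back to the generic version when the assumption fails.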

Let’s look at a specific problem and see how JIT can reduce execution time dramatically.

Suppose a program contains “hot code” that accounts for most of the execution time. Optimizing the “hot code” can significantly speed up the program. But without running the code, the compiler does not know where the hot code is.

In the baseline case, where the “hot code” is not identified and further optimized, the program processes 1 input every second. The initial compilation takes 2 seconds. The total execution time for 100 inputs is:

2 + 1 * 100 inputs = 102 seconds.

With JIT optimization enabled, the runtime first uses the initially compiled version (2 seconds) and runs it on the first 10 inputs with profiling, which takes 1 * 10 = 10 seconds. It then looks at the profiling data and decides to optimize the “hot code” aggressively using JIT, which takes 5 seconds and reduces the processing time for one input to 0.1 seconds. The total runtime is reduced to:

2 + 1 * 10 inputs + 5 + 0.1 * 90 inputs = 26 seconds.
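The same arithmetic can be written as a small cost model, which also makes it easy to see when JIT pays off. This is only a sketch: the constants are the made-up numbers from the example above, and the function and constant names are chosen here for illustration.

```javascript
// Constants taken from the example in the text (all in seconds).
const INITIAL_COMPILE = 2;
const SLOW_PER_INPUT = 1;
const PROFILED_INPUTS = 10;
const JIT_COMPILE = 5;
const FAST_PER_INPUT = 0.1;

function baselineTime(inputs) {
  return INITIAL_COMPILE + SLOW_PER_INPUT * inputs;
}

function jitTime(inputs) {
  // Assumption: with too few inputs, profiling never finishes and JIT never kicks in.
  if (inputs <= PROFILED_INPUTS) return baselineTime(inputs);
  return (
    INITIAL_COMPILE +
    SLOW_PER_INPUT * PROFILED_INPUTS +
    JIT_COMPILE +
    FAST_PER_INPUT * (inputs - PROFILED_INPUTS)
  );
}

console.log(baselineTime(100)); // 102
console.log(jitTime(100)); // 26

// Smallest number of inputs for which the JIT version is faster.
let breakEven = 1;
while (jitTime(breakEven) >= baselineTime(breakEven)) breakEven++;
console.log(breakEven); // 16
```

Under these numbers, JIT only wins once the program processes at least 16 inputs; for shorter runs, the 5 seconds spent optimizing are never earned back.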

In the real world, it’s more complicated than this, but you get the idea. Gathering runtime information, deciding which code needs to be heavily optimized, and choosing which specific optimizations to apply is a sophisticated process, and people have invested decades of research into it. All of that work is driven by one strong and simple idea: leveraging runtime information to enable JIT optimizations.

Use Case 2: Support more hardware architectures

To understand this problem, we need to look at history and at the problems computer scientists were facing at the time.

When hardware architects design CPUs, they expose a programming interface called “machine code”, or ISA (instruction set architecture), as the CPU’s input. Different CPUs use different machine code as their programming interface.

A compiler translates a high-level programming language to machine code. However, machine code is different on every CPU architecture. To support a language on N types of architectures, developers have to build N compilers, which quickly becomes impractical. This also makes it really hard for users to port their applications across different CPU architectures.

To solve this problem, the concept of a “virtual machine” was introduced. In simple words, it is a software layer that sits on top of different CPU architectures and exposes the same programming interface (a virtual machine ISA) to the applications running on top of it. The most famous virtual machine ISA is Java bytecode.

When JIT is used, only virtual machine ISA code is generated at compile time. The executable can be ported to any machine that has runtime support, where the virtual machine ISA is JIT-compiled to machine code at runtime (“Write Once, Run Anywhere”).
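The flow described above can be sketched in miniature. The four-instruction stack bytecode below is invented for this example and is not Java bytecode; the jitCompile function is likewise hypothetical. It translates the portable bytecode into JavaScript source at runtime and hands it to the host engine, which stands in for the machine-code generation a real virtual machine would perform.

```javascript
// Hypothetical virtual-machine ISA: a stack machine with four instructions.
// This program computes (arg0 + arg1) * 2 and is the portable "executable".
const program = [
  ["push_arg", 0],
  ["push_arg", 1],
  ["add"],
  ["push_const", 2],
  ["mul"],
];

// "JIT" step: translate bytecode into JavaScript source at run time.
function jitCompile(bytecode, argCount) {
  const stack = []; // holds source-expression strings, not runtime values
  for (const [op, operand] of bytecode) {
    switch (op) {
      case "push_arg":
        stack.push(`a${operand}`);
        break;
      case "push_const":
        stack.push(String(operand));
        break;
      case "add":
      case "mul": {
        const right = stack.pop();
        const left = stack.pop();
        stack.push(`(${left} ${op === "add" ? "+" : "*"} ${right})`);
        break;
      }
      default:
        throw new Error(`unknown opcode: ${op}`);
    }
  }
  const params = Array.from({ length: argCount }, (_, i) => `a${i}`);
  return new Function(...params, `return ${stack.pop()};`);
}

const compiled = jitCompile(program, 2);
console.log(compiled(3, 4)); // 14
```

The program array never changes from machine to machine; only jitCompile, the runtime part, would need a different backend on different hardware.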

Not only is this a good idea for application developers, it’s also an amazing invention for language creators and hardware vendors. The existence of virtual machines (and other intermediate representations) cleanly separates the high-level compilation process from the low-level, target-specific transformations. Imagine how hard it would be for a new language to support every architecture. It would be practically impossible for a hardware vendor to support every programming language directly.

In addition, “more architectures” today is not limited to CPU architectures. In heterogeneous computing and deep learning, where a compiler can target not only CPUs but also GPUs and other processors, JIT techniques let an executable be compiled once and run optimized on a variety of processors.

Summary

To summarize, JIT is a technique that involves compilation at execution time.

The main use cases for JIT compilation today are:

  1. Leverage runtime information to optimize code.
  2. Abstract away the complexity of the underlying hardware at runtime.

Reference

[1] https://en.wikipedia.org/wiki/Just-in-time_compilation
