Introduction

Emulators are cool pieces of technology that allow users to run a totally different system on top of another one. Here, we'll explore how a CPU works and how software can emulate one by implementing a simple machine that runs programs written for a fantasy CPU.

Emulators have a wide range of applications, like running x86 programs on ARM devices, running ARM Android applications on x86 Windows desktops, or even running your favorite retro game on an emulated computer/console on a Raspberry Pi.

When a full system is emulated, the emulator software needs to handle all the hardware devices of that system. This may include not only the CPU but also the video system, input devices, etc. However, the core of an emulator is still the emulation of the CPU.

Programs are just a series of bytes

It's well known that CPUs are the core of a machine. Their main job is to execute programs, and programs are nothing more than a series of instructions in the computer's memory.

At this point, you may be tempted to believe that your CPU knows JavaScript. Although a tempting concept, this is not the case. Imagine changing a CPU every time a new version of JavaScript appeared!

In reality, a CPU understands only a finite set of instructions: the ones the CPU engineering team designed it to understand. Users can create programs by using these instructions directly, or by coding in a higher-level language and then using a compiler/system to translate the program into the set of instructions understood by the CPU.

Regardless of how the user created the program, the instructions end up in memory as a series of bytes like these:

    11, 0, 10, 42, 6, 255, 30, 0, 11, 0, 0, 11, 1, 1, 11, 3, 1, 60, 1,
    10, 2, 0, 20, 2, 1, 60, 2, 10, 0, 1, 10, 1, 2, 11, 2, 1, 20, 3, 2,
    31, 2, 30, 2, 41, 3, 2, 19, 31, 0, 50

The CPU's role is to fetch these instructions from memory and execute them.
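To make the "series of bytes" idea a bit more concrete, here is a rough illustration of how the first six bytes of that program could be read. It relies on the instruction encodings specified later in this article; the name `firstBytes` is made up for this sketch.

```javascript
// The first six bytes of the program, grouped one instruction per line.
// The comments show the human-readable form of each group.
const firstBytes = [
    11, 0, 10,   // MOVV R0, 10  -> load the value 10 into register R0
    42, 6,       // CALL 6       -> call the routine that starts at address 6
    255          // HALT         -> stop the machine
];

console.log(firstBytes.length); // 6
```

The same six numbers are far easier to follow once each group has a name attached to it.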
Using mnemonics makes the task of writing programs much easier for humans. Someone who writes programs using instruction mnemonics is said to code in assembly language. It only takes a simple step to transform programs written in assembly language into binary instructions (i.e. machine language).

The instructions and mnemonics of our fantasy CPU, described below, are quite intuitive, and we need to know them before we can emulate the CPU in software.

Note: The number used to encode each instruction is listed next to it. Registers R0, R1, R2, R3 are encoded as the numbers 0, 1, 2, 3.

    MOVR reg_dst, reg_src        (MOVR = 10)
        Loads the value from reg_src into reg_dst, i.e. reg_dst = reg_src.

    MOVV reg_dst, value          (MOVV = 11)
        Loads the numeric value into register reg_dst, i.e. reg_dst = value.

    ADD reg_dst, reg_src         (ADD = 20)
        Adds the value of reg_src to the value of reg_dst and stores the result in reg_dst.

    SUB reg_dst, reg_src         (SUB = 21)
        Subtracts the value of reg_src from the value of reg_dst and stores the result in reg_dst.

    PUSH reg_src                 (PUSH = 30)
        Pushes the value of reg_src onto the stack.

    POP reg_dst                  (POP = 31)
        Pops the last value from the stack and loads it into register reg_dst.

    JP addr                      (JP = 40)
        Jumps to address addr. Similar to a GOTO!

    JL reg_1, reg_2, addr        (JL = 41)
        Jumps to address addr only if the value of reg_1 is less than the value of reg_2
        (IF reg_1 < reg_2 THEN JP addr).

    CALL addr                    (CALL = 42)
        Pushes onto the stack the address of the instruction that follows CALL, then jumps to address addr.

    RET                          (RET = 50)
        Pops the last number from the stack, assumes it is an address, and jumps to that address.

    PRINT reg                    (PRINT = 60)
        Prints on the screen the value contained in register reg.

    HALT                         (HALT = 255)
        Stops our VM. The virtual CPU doesn't execute any more instructions once HALT is encountered.

Emulating the CPU

Armed with the fantasy CPU specs, we can now move ahead and emulate the CPU in software, in JavaScript.
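Before writing the emulator itself, it can help to capture the spec as data. The table below is only a sketch: the names `OPCODES` and `instructionLength` are hypothetical, and it assumes that every operand occupies exactly one byte, which is consistent with the sample program shown earlier.

```javascript
// Sketch of the instruction set as a lookup table.
// Keys are the opcode numbers from the spec; `operands` describes
// the bytes that follow the opcode byte in memory.
const OPCODES = {
    10:  { name: "MOVR",  operands: ["reg", "reg"] },
    11:  { name: "MOVV",  operands: ["reg", "value"] },
    20:  { name: "ADD",   operands: ["reg", "reg"] },
    21:  { name: "SUB",   operands: ["reg", "reg"] },
    30:  { name: "PUSH",  operands: ["reg"] },
    31:  { name: "POP",   operands: ["reg"] },
    40:  { name: "JP",    operands: ["addr"] },
    41:  { name: "JL",    operands: ["reg", "reg", "addr"] },
    42:  { name: "CALL",  operands: ["addr"] },
    50:  { name: "RET",   operands: [] },
    60:  { name: "PRINT", operands: ["reg"] },
    255: { name: "HALT",  operands: [] }
};

// Size in bytes of an instruction: the opcode byte plus one byte per operand.
function instructionLength(opcode) {
    const spec = OPCODES[opcode];
    if (!spec) {
        throw new Error(`Unknown opcode: ${opcode}`);
    }
    return 1 + spec.operands.length;
}

console.log(instructionLength(11)); // 3 (MOVV reg_dst, value)
console.log(instructionLength(41)); // 4 (JL reg_1, reg_2, addr)
```

Knowing the length of each instruction is what lets an emulator, or any other tool that reads the bytes, find where the next instruction begins.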
As presented above, on real machines programs are stored in memory. In our emulator, we will emulate the memory using a simple array structure. As a matter of fact, we'll only place a single program in the memory.

    let program = [
        11, 0, 10, 42, 6, 255, 30, 0, 11, 0, 0, 11, 1, 1, 11, 3, 1, 60, 1,
        10, 2, 0, 20, 2, 1, 60, 2, 10, 0, 1, 10, 1, 2, 11, 2, 1, 20, 3, 2,
        31, 2, 30, 2, 41, 3, 2, 19, 31, 0, 50
    ];

Our virtual CPU will need to fetch instructions one by one from this array and execute them. The CPU keeps track of the instruction it needs to fetch next by using a special-purpose register named "PC" (Program Counter).

Fact: the PC register is a real register on many physical CPU architectures.

The core of the CPU emulator is just a big "switch" statement that processes each instruction according to the specs.

    let pc = 0;
    let halted = false;
    let regs = [0, 0, 0, 0]; // registers R0..R3

    function run()
    {
        while (!halted)
        {
            runone();
        }
    }

    function runone()
    {
        if (halted)
            return;

        let instr = program[pc];

        switch (instr)
        {
            // handle each instruction according to specs
            // also advance pc to prepare for the next fetch
            // ...
        }
    }

That's all! This is the structure of our glorious CPU emulator.

Processing instructions is also a very simple task. It requires only careful reading and implementation of each instruction's specs.

    switch (instr)
    {
        // movr rdst, rsrc
        case 10: {
            pc++;                         // skip the opcode byte
            let rdst = program[pc++];     // destination register index
            let rsrc = program[pc++];     // source register index
            regs[rdst] = regs[rsrc];
            break;
        }

        // movv rdst, val
        case 11: {
            pc++;                         // skip the opcode byte
            let rdst = program[pc++];     // destination register index
            let val = program[pc++];      // immediate value
            regs[rdst] = val;
            break;
        }

        ...
    }

Take your time, read the specifications, and try to see if you can complete this virtual CPU implementation. You'll know that you did a great job when you see the results of executing the program.

If you coded carefully, the program should display the first 10 Fibonacci numbers.
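If you'd like something to compare your attempt against, the following is one possible sketch of a complete emulator. It is not the playground's implementation. It assumes the stack is a separate JavaScript array rather than part of program memory, and that PRINT appends to an `output` array instead of drawing on the screen. The names `createVM`, `fetch`, `output`, and `fibo` are made up for illustration.

```javascript
// Minimal sketch of the fantasy CPU described by the spec above.
function createVM(program) {
    const regs = [0, 0, 0, 0];  // registers R0..R3
    const stack = [];           // assumption: stack kept outside program memory
    const output = [];          // assumption: PRINT collects values here
    let pc = 0;
    let halted = false;

    // Read the byte at pc, then advance pc to the next byte.
    const fetch = () => program[pc++];

    function runOne() {
        if (halted) return;
        const instr = fetch();
        switch (instr) {
            case 10: { const d = fetch(), s = fetch(); regs[d] = regs[s]; break; }   // MOVR
            case 11: { const d = fetch(), v = fetch(); regs[d] = v; break; }         // MOVV
            case 20: { const d = fetch(), s = fetch(); regs[d] += regs[s]; break; }  // ADD
            case 21: { const d = fetch(), s = fetch(); regs[d] -= regs[s]; break; }  // SUB
            case 30: stack.push(regs[fetch()]); break;                               // PUSH
            case 31: regs[fetch()] = stack.pop(); break;                             // POP
            case 40: pc = fetch(); break;                                            // JP
            case 41: {                                                               // JL
                const a = fetch(), b = fetch(), addr = fetch();
                if (regs[a] < regs[b]) pc = addr;
                break;
            }
            case 42: {                                                               // CALL
                const addr = fetch();
                stack.push(pc);   // pc now points at the instruction after CALL
                pc = addr;
                break;
            }
            case 50: pc = stack.pop(); break;                                        // RET
            case 60: output.push(regs[fetch()]); break;                              // PRINT
            case 255: halted = true; break;                                          // HALT
            default:
                throw new Error(`Unknown opcode ${instr} at address ${pc - 1}`);
        }
    }

    // Execute instructions until HALT, then return everything PRINT produced.
    function run() {
        while (!halted) runOne();
        return output;
    }

    return { run, runOne, regs, output };
}

const fibo = [
    11, 0, 10, 42, 6, 255, 30, 0, 11, 0, 0, 11, 1, 1, 11, 3, 1, 60, 1,
    10, 2, 0, 20, 2, 1, 60, 2, 10, 0, 1, 10, 1, 2, 11, 2, 1, 20, 3, 2,
    31, 2, 30, 2, 41, 3, 2, 19, 31, 0, 50
];

console.log(createVM(fibo).run().join(" ")); // 1 1 2 3 5 8 13 21 34 55
```

Routing every memory read through `fetch` keeps the program counter bookkeeping in one place, so each `case` only has to express what the instruction does.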
You can also consult our version in this playground: https://codeguppy.com/code.html?simple_vm/vm0_fibonacci

Loading the program

As you saw in the above playground, our emulator implementation lives in a simple virtual machine that first loads the program from an array of bytes and then asks the fantasy CPU to execute it.

    vm.load([
        11, 0, 10, 42, 6, 255, 30, 0, 11, 0, 0, 11, 1, 1, 11, 3, 1, 60, 1,
        10, 2, 0, 20, 2, 1, 60, 2, 10, 0, 1, 10, 1, 2, 11, 2, 1, 20, 3, 2,
        31, 2, 30, 2, 41, 3, 2, 19, 31, 0, 50
    ]);

Loading like this is fine if your only intent is to load the program we've provided. But if you want to design your own programs in machine language and execute them, this is neither intuitive nor efficient.

Here's a small technique that can be used to load programs using human-friendly CPU instruction mnemonics. It only takes defining a few constants:

    const MOVR  = 10;
    const MOVV  = 11;
    const ADD   = 20;
    const SUB   = 21;
    const PUSH  = 30;
    const POP   = 31;
    const JP    = 40;
    const JL    = 41;
    const CALL  = 42;
    const RET   = 50;
    const PRINT = 60;
    const HALT  = 255;

    const R0 = 0;
    const R1 = 1;
    const R2 = 2;
    const R3 = 3;

    vm.load([
        MOVV, R0, 10,
        CALL, 6,
        HALT,

        // PrintFibo: (addr = 6)
        PUSH, R0,
        MOVV, R0, 0,
        MOVV, R1, 1,
        MOVV, R3, 1,
        PRINT, R1,

        // continue: (addr = 19)
        MOVR, R2, R0,
        ADD, R2, R1,
        PRINT, R2,
        MOVR, R0, R1,
        MOVR, R1, R2,
        MOVV, R2, 1,
        ADD, R3, R2,
        POP, R2,
        PUSH, R2,
        JL, R3, R2, 19,
        POP, R0,
        RET
    ]);

With this trick, it's like you're programming in assembly language inside JavaScript.

You can find the new program with mnemonics in this playground: https://codeguppy.com/code.html?simple_vm/vm1_fibonacci

Now, we'll take the opportunity to quickly cover a few tools commonly used when working with machine language code.

Assembler

Perhaps the most important tool when working with machine language programs is an assembler program.
This program lets users write programs as text files, in easier-to-use assembly language. Then, the assembler does the heavy lifting by converting the assembly source code into binary data understood by the CPU.

We will attempt to build a simplified assembler program that does the basic job. It will work like this:

    let code = `
    // Loads value 10 in R0
    // and calls Fibonacci routine

    MOVV R0, 10
    CALL 6
    HALT

    ...
    `;

    let bytes = asm.assemble(code);

If you want to see the code, please check this playground: https://codeguppy.com/code.html?simple_vm/vm3_assembler

Disassembler

Another useful tool for working with machine language is a disassembler. A disassembler takes a program in binary format and outputs its listing in a human-readable way, i.e. in assembly language.

When run on the following program...

    let src = asm.disassemble([
        11, 0, 10, 42, 6, 255, 30, 0, 11, 0, 0, 11, 1, 1, 11, 3, 1, 60, 1,
        10, 2, 0, 20, 2, 1, 60, 2, 10, 0, 1, 10, 1, 2, 11, 2, 1, 20, 3, 2,
        31, 2, 30, 2, 41, 3, 2, 19, 31, 0, 50
    ]);

...our disassembler should output this:

    0     11 0 10       MOVV R0, 10
    3     42 6          CALL 6
    5     255           HALT
    6     30 0          PUSH R0
    8     11 0 0        MOVV R0, 0
    11    11 1 1        MOVV R1, 1
    14    11 3 1        MOVV R3, 1
    17    60 1          PRINT R1
    19    10 2 0        MOVR R2, R0
    22    20 2 1        ADD R2, R1
    25    60 2          PRINT R2
    27    10 0 1        MOVR R0, R1
    30    10 1 2        MOVR R1, R2
    33    11 2 1        MOVV R2, 1
    36    20 3 2        ADD R3, R2
    39    31 2          POP R2
    41    30 2          PUSH R2
    43    41 3 2 19     JL R3, R2, 19
    47    31 0          POP R0
    49    50            RET

The listing contains the memory address, the binary encoding, and the associated mnemonic for each instruction in our program.

To see a very simplistic implementation of a disassembler, please check this playground: https://codeguppy.com/code.html?simple_vm/vm2_disassembler

About the Fibonacci program

You're probably wondering how we came up with the assembly program that prints the Fibonacci numbers.
We first wrote the algorithm in JavaScript and then converted it step by step to assembly.

Once you're experienced with writing in assembly, this task can be done in one step. It just requires practice!

Conclusion

Even if there is still a lot of work before emulating a full system, hopefully this provided a basic overview of what it takes to emulate the core of a system: the CPU.

All the programs and explanations in this article are included in the following playground: https://codeguppy.com/code.html?simple_vm/vm

As a next step, I invite you to create additional programs for this fantasy CPU and, if needed, extend the CPU instruction set with new instructions to support your programs.

Looking forward to your feedback and comments. Please let me know if you want to see more about this subject, perhaps even the implementation of a full emulator.

For more recreational coding programs, please feel free to browse codeguppy.com