Hardware Virtualization Naveed Mahmud University of Kansas (KU)


Hardware Virtualization
Naveed Mahmud
University of Kansas (KU)
EECS 768: Virtual Machines
Friday, April 12th, 2019

Outline
- Introduction
- x86 Virtualization
- Hardware Accelerated Virtualization
- Virtualizing using Reconfigurable HW

Introduction: What is hardware virtualization?
- The underlying technology supporting virtual servers: ESX, Xen, and Hyper-V.
- Different from operating system virtualization, where the OS provides only weak isolation.
- Hence the need for a hypervisor or virtual machine monitor (VMM).
- The VMM manages virtual machines (guest OS plus applications) much as an OS manages processes and threads.

Introduction: Virtualization challenge in the past
- Trap-and-emulate virtualization on mainframes (IBM 370).
- x86 virtualization impossible?
  » Inherited architectural quirks.
  » Retains compatibility with legacy code.
- Solution in 1998: binary translation based VMM.
- Trap-and-emulate virtualization of x86
  » 2005: Intel hardware support for ISA virtualization, VT-x.
  » 2007: AMD hardware support for memory virtualization, RVI.

Introduction: Hardware virtualization provides
- Server consolidation
- Fault tolerance
- Security and resource management
- OS development and deployment
- Cloud computing


x86 Virtualization: Hypervisor and VMM, narrower definitions
- VMM: an entity specifically virtualizing a given architecture (ISA, memory, interrupts, I/O).
- Hypervisor: combines an OS with a VMM.
  » VMM part: virtualizes the x86 architecture.
  » OS part: boot loader, HW abstraction layer, I/O stacks, schedulers.
Figure 1: The ESX hypervisor: one vmkernel per host, and one VMM per virtual machine. Image source: Agesen et al. [1]

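x86 Virtualization: Virtualizing x86
- General approach: trap-and-emulate
  » Guest code runs directly on the CPU with reduced privilege.
  » A guest executing a privileged instruction generates a trap.
  » The CPU transfers control to the VMM.
  » The VMM emulates the instruction using an interpreter.
  » Execution of guest code resumes.
- x86 was initially considered non-virtualizable
  » POPF instruction: stack → %eflags (pops the top of the stack into %eflags).
  » A guest kernel trying to clear the interrupt flag using POPF: at reduced privilege the change to the interrupt flag is silently ignored and no trap is generated, so the VMM never gets a chance to emulate it.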
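The trap-and-emulate steps, and the way POPF slips past them, can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any real VMM is written: the instruction names, `VirtualCPU`, `execute_deprivileged`, `emulate`, and `run_guest` are all hypothetical, and "direct execution" is simulated by a function that raises `Trap` only for privileged instructions.

```python
from dataclasses import dataclass, field

IF_FLAG = 1 << 9             # interrupt-enable bit in %eflags
PRIVILEGED = {"cli", "sti"}  # these trap when run at reduced privilege


class Trap(Exception):
    """Raised when deprivileged guest code hits a privileged instruction."""


@dataclass
class VirtualCPU:
    eflags: int = IF_FLAG                        # virtual %eflags, interrupts enabled
    trapped: list = field(default_factory=list)  # instructions the VMM emulated


def execute_deprivileged(instr):
    """Stand-in for direct execution on the real CPU with reduced privilege."""
    if instr in PRIVILEGED:
        raise Trap(instr)
    # Everything else, including "popf_clear_if", runs without trapping.
    # The real hardware silently drops POPF's change to the interrupt flag.


def emulate(instr, vcpu):
    """VMM side: apply the instruction's effect to the virtual CPU state."""
    vcpu.trapped.append(instr)
    if instr == "cli":
        vcpu.eflags &= ~IF_FLAG
    elif instr == "sti":
        vcpu.eflags |= IF_FLAG


def run_guest(program, vcpu):
    for instr in program:
        try:
            execute_deprivileged(instr)
        except Trap:
            emulate(instr, vcpu)  # control transferred to the VMM
    return vcpu


ok = run_guest(["add", "cli"], VirtualCPU())
broken = run_guest(["add", "popf_clear_if"], VirtualCPU())
```

In this model `cli` is caught and the virtual interrupt flag is cleared, while the POPF variant never reaches the VMM, so the virtual flag stays set even though the guest believes it disabled interrupts.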

x86 Virtualization: Binary translation
- Handles the entire x86 ISA (x86-to-x86 translator).
- Intercepts sensitive instructions without requiring trap semantics.
- Translates privileged instructions to safer user-mode code.
- Replaces sensitive code with code that manipulates virtual HW.
- Maintains translation caches for speedup.
Image source: Agesen et al. [1]
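A minimal sketch of the idea, assuming instructions are plain strings and a "block" is a list of them. The `Translator` class, the `SENSITIVE` table, and the `call vmm_emulate_*` replacement names are hypothetical and only illustrate the two points on the slide: sensitive instructions are rewritten rather than trapped, and translated blocks are cached by guest address.

```python
# Hypothetical rewrite table: sensitive instruction -> safe replacement
# that manipulates virtual hardware state instead of the real CPU.
SENSITIVE = {
    "popf": "call vmm_emulate_popf",
    "cli": "call vmm_emulate_cli",
    "sti": "call vmm_emulate_sti",
}


class Translator:
    def __init__(self):
        self.cache = {}   # guest block address -> translated block
        self.misses = 0   # how many blocks actually had to be translated

    def translate_block(self, addr, block):
        """Return the translated block, reusing the cached copy if present."""
        if addr in self.cache:
            return self.cache[addr]
        self.misses += 1
        # Innocuous instructions are copied unchanged (x86 to x86);
        # sensitive ones are replaced, so no trap is ever needed.
        translated = tuple(SENSITIVE.get(instr, instr) for instr in block)
        self.cache[addr] = translated
        return translated


t = Translator()
first = t.translate_block(0x1000, ["mov", "popf", "add"])
second = t.translate_block(0x1000, ["mov", "popf", "add"])  # served from the cache
```

The second call costs only a dictionary lookup, which is the speedup the translation cache is meant to provide for hot code.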

x86 Virtualization: Memory management
- Support for virtual memory: HW TLB and page table walker.
- VMM's MMU
  » Maintains isolation.
  » Maps guest virtual addresses (gVA) to guest physical addresses (gPA).
  » Maps guest physical addresses (gPA) to host physical addresses (hPA).
- Provide native-speed memory access for direct execution
  » The mapping gVA → hPA must reside in the HW TLB.
  » Point the page table root control register (%cr3) to a shadow page table.
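The two mappings compose into the shadow page table that %cr3 points to. The sketch below assumes single-level page tables modeled as dictionaries from page number to page number and 4 KiB pages; `build_shadow` and `translate` are hypothetical helper names.

```python
PAGE = 4096  # assumed page size in bytes


def build_shadow(guest_pt, pmap):
    """Compose gVA -> gPA (guest_pt) with gPA -> hPA (pmap) into gVA -> hPA.

    Guest pages the VMM has not backed with host memory are left out, so an
    access to them faults into the VMM (a "hidden" fault).
    """
    return {vpn: pmap[gppn] for vpn, gppn in guest_pt.items() if gppn in pmap}


def translate(shadow, gva):
    """What the hardware walker does with the shadow table: one lookup."""
    vpn, offset = divmod(gva, PAGE)
    return shadow[vpn] * PAGE + offset


guest_pt = {0: 5, 1: 7, 2: 9}  # guest virtual page -> guest physical page
pmap = {5: 100, 7: 42}         # guest physical page -> host physical page
shadow = build_shadow(guest_pt, pmap)
hpa = translate(shadow, 1 * PAGE + 12)
```

Because the hardware only ever sees the composed table, a TLB hit costs the same as on a native system; the price is paid when the shadow has to be kept in sync with the guest's own table.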

x86 Virtualization: Memory management (shadow page tables)
- Shadow page tables translate the virtual addresses of the guest OS into the real physical pages.
- The most used parts of the page table are cached in the TLB.
- Shadow page table adjustment costs performance.
  » Takes 3 to 400 (!) times more cycles than in the native situation.
  » Memory-intensive applications suffer.
- The VMM's MMU needs to detect when the guest modifies a page table entry
  » Traced faults
Image source: Gelas [2]
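One way to picture traced faults: the VMM write-protects the guest's page table, so every guest write to a page table entry faults into the VMM, which applies the write and fixes up the shadow entry. The sketch below is a simplified model with hypothetical names (`TracedPageTable`, `guest_write_pte`), dictionary page tables, and a counter standing in for the cost of each fault.

```python
class TracedPageTable:
    """Toy model of a write-protected (traced) guest page table."""

    def __init__(self, pmap):
        self.pmap = pmap        # gPA page -> hPA page, owned by the VMM
        self.guest_pt = {}      # gVA page -> gPA page, written by the guest
        self.shadow = {}        # gVA page -> hPA page, used by the hardware
        self.trace_faults = 0   # each one is a round trip into the VMM

    def guest_write_pte(self, vpn, gppn):
        # The guest's store hits a write-protected page and faults.
        self.trace_faults += 1
        # The VMM applies the write on the guest's behalf...
        self.guest_pt[vpn] = gppn
        # ...and updates the shadow entry so the two stay consistent.
        if gppn in self.pmap:
            self.shadow[vpn] = self.pmap[gppn]
        else:
            self.shadow.pop(vpn, None)


pt = TracedPageTable(pmap={5: 100, 7: 42})
pt.guest_write_pte(0, 5)
pt.guest_write_pte(0, 7)   # remapping the same page faults again
pt.guest_write_pte(1, 9)   # no host backing yet: no shadow entry
```

Every page table update costs a fault here, which is why workloads that create processes or remap memory frequently suffer most under shadow paging.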


Hardware Accelerated Virtualization
- The idea is to fix the problem that x86 cannot be virtualized without binary translation (BT).
  » Trap all exceptions and privileged instructions.
  » Transition between guest OS ↔ VMM (VMexit/VMentry).
  » Does not reduce the overheads.
- Advantage
  » The VMM runs at a higher privilege level than the guest OS.
  » System calls are not always intercepted.
  » So the guest OS can provide kernel services to applications.

Hardware Accelerated Virtualization
- Problem: each transition requires a large number of CPU cycles (2-3 orders of magnitude).
- Intel VT-x or AMD-V (first gen).
- Intel added Extended Page Tables (EPT).
- AMD added Rapid Virtualization Indexing (RVI).
- System call overhead is less impacted.
- Simpler interceptions cost more
  » Creating processes
  » Context switches
  » Page table updates
- BT and para-virtualization are faster!
Image source: Gelas [2]

Hardware Accelerated Virtualization: Instruction set virtualization (VT-x and AMD-V)
- Virtual machine control block (VMCB)
  » In-memory data structure.
  » Combines control state with a subset of the guest VCPU state.
- Guest mode
  » The guest OS can run without interference from the VMM.
  » Less-privileged execution mode.
  » Supports direct execution of guest code, including privileged code.
- vmrun
  » HW loads guest state from the VMCB.
  » Execution continues in guest mode.
  » The stop condition is set by control bits of the VMCB.
  » HW performs the exit operation.

Hardware Accelerated Virtualization: Instruction set virtualization (VT-x and AMD-V)
- vmexit
  » Guest state is saved to the VMCB.
  » VMM state is loaded.
  » Execution resumes in host mode in the VMM.
- Guest mode allows the VM to
  » Issue system calls
  » Switch between kernel and user mode
  » Use segmentation
  » Run sensitive instructions like POPF
  » Take faults without causing exits to the VMM
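The vmrun/vmexit cycle can be modeled as a loop in which the VMM repeatedly enters the guest and handles whatever caused the exit. This is a simulation under assumptions, not real VT-x or AMD-V code: the `VMCB` fields, the event names, and the functions `vmrun` and `run_vm` are hypothetical, and guest execution is reduced to walking a list of events.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VMCB:
    guest_rip: int = 0                              # saved guest state (subset)
    intercepts: frozenset = frozenset({"io", "hlt"})  # control bits: what exits
    exit_reason: Optional[str] = None               # filled in by the exit


def vmrun(vmcb, guest_code):
    """Simulated vmrun: load guest state, run until an intercepted event."""
    vmcb.exit_reason = None
    while vmcb.guest_rip < len(guest_code):
        event = guest_code[vmcb.guest_rip]
        vmcb.guest_rip += 1
        if event in vmcb.intercepts:
            # vmexit: guest state (here just guest_rip) is already saved in
            # the VMCB; control returns to the VMM in host mode.
            vmcb.exit_reason = event
            return
        # System calls, POPF, and ordinary instructions stay in guest mode.
    vmcb.exit_reason = "end"


def run_vm(guest_code):
    """VMM main loop: enter the guest, handle the exit, re-enter."""
    vmcb = VMCB()
    exits = []
    while True:
        vmrun(vmcb, guest_code)
        exits.append(vmcb.exit_reason)
        if vmcb.exit_reason in ("hlt", "end"):
            return exits


exits = run_vm(["syscall", "popf", "io", "add", "io", "hlt"])
```

Only the two I/O events and the final halt leave guest mode in this run; the system call and POPF execute without involving the VMM, which is the behavior the slide lists for guest mode.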

Hardware Accelerated Virtualization: Instruction set virtualization (VT-x and AMD-V)
- Performance
  » Early implementations had high overhead (exit costs).
  » Improved with later generations of processors.
  » SW cost of handling an exit.
- Overhead is a function of exit frequency and the average cost of handling an exit.
- Optimization: reduce the frequency of exits.
Image source: Gelas [2]
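As a back-of-the-envelope model of that relationship, the fraction of CPU time lost to exits is the exit rate times the average cost per exit, divided by the clock rate. The function name and the numbers below are illustrative assumptions, not measurements from the slides or from any processor.

```python
def exit_overhead_fraction(exits_per_second, cycles_per_exit, clock_hz):
    """Fraction of all CPU cycles spent on exits (hardware plus handler)."""
    return exits_per_second * cycles_per_exit / clock_hz


# Assumed workload: 100,000 exits/s at 2,000 cycles each on a 2 GHz core.
baseline = exit_overhead_fraction(100_000, 2_000, 2_000_000_000)

# Halving the exit frequency halves the overhead, as does halving the cost.
fewer_exits = exit_overhead_fraction(50_000, 2_000, 2_000_000_000)
```

With these assumed inputs the baseline comes to 10% of the core and the reduced exit rate to 5%, which is why cutting exit frequency is the optimization the slide singles out.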

Hardware Accelerated Virtualization: Instruction set virtualization (VT-x and AMD-V)
- Exit frequency optimization
- Certain types of exits can be buffered in the VMCB
- Example: POPF
  » Guest mode execution of POPF would trigger exits.
  » High frequency of POPF means an unacceptable exit rate.
  » The VMCB keeps a HW-maintained shadow of the guest %eflags register.
  » Instructions changing %eflags operate on the shadow.
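The effect of the shadow can be illustrated by counting exits with and without it. This is a toy comparison with hypothetical names (`GuestFlagsState`, `guest_popf`, the `shadowed` switch); real hardware has no such switch, and the unshadowed path is included only to show what the optimization avoids.

```python
from dataclasses import dataclass

IF_FLAG = 1 << 9  # interrupt-enable bit in %eflags


@dataclass
class GuestFlagsState:
    shadow_eflags: int = IF_FLAG  # guest-visible %eflags kept in the VMCB
    exits: int = 0                # exits to the VMM caused by POPF


def guest_popf(state, value, shadowed=True):
    """Guest executes POPF with `value` on top of its stack."""
    if not shadowed:
        # Without a shadow, each POPF would need the VMM to emulate it.
        state.exits += 1
    # Either way the guest-visible flags end up holding the popped value;
    # with the shadow, hardware updates it directly in guest mode.
    state.shadow_eflags = value


with_shadow = GuestFlagsState()
without_shadow = GuestFlagsState()
for _ in range(1000):
    guest_popf(with_shadow, 0, shadowed=True)
    guest_popf(without_shadow, 0, shadowed=False)
```

Both guests see interrupts disabled afterwards, but only the unshadowed one paid for a thousand round trips to the VMM.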

Hardware Accelerated Virtualization: Memory virtualization (RVI and EPT)
- The VMM must write-protect primary page tables.
- The VMM must request exits on page faults to distinguish between hidden faults and true faults.
- The VMM must request exits on guest context switches to update the page table.
- The second generation of HW support targets memory optimization.

Hardware Accelerated Virtualization: Memory virtualization (RVI and EPT)
- The VMM maintains a hardware-controlled nested page table.
- Translates gPAs to hPAs.
- A "super" TLB keeps track of both the guest OS and the VMM memory management.
Image source: Agesen et al. [1]
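A simplified model of nested paging, assuming single-level dictionary page tables and 4 KiB pages (real walks are multi-level in both dimensions). The `NestedMMU` class and its method names are hypothetical. The point is that the guest edits its own page table freely, the VMM owns only the gPA → hPA table, and the TLB caches the combined gVA → hPA result.

```python
PAGE = 4096  # assumed page size in bytes


class NestedMMU:
    def __init__(self, guest_pt, nested_pt):
        self.guest_pt = guest_pt    # gVA page -> gPA page, owned by the guest
        self.nested_pt = nested_pt  # gPA page -> hPA page, owned by the VMM
        self.tlb = {}               # gVA page -> hPA page (combined result)
        self.walks = 0              # TLB misses that needed both tables

    def translate(self, gva):
        vpn, offset = divmod(gva, PAGE)
        if vpn not in self.tlb:
            self.walks += 1
            gppn = self.guest_pt[vpn]            # first dimension: guest table
            self.tlb[vpn] = self.nested_pt[gppn]  # second dimension: nested table
        return self.tlb[vpn] * PAGE + offset

    def guest_remap(self, vpn, gppn):
        """Guest updates its own page table and flushes the entry; no VMM exit."""
        self.guest_pt[vpn] = gppn
        self.tlb.pop(vpn, None)


mmu = NestedMMU(guest_pt={0: 5, 1: 7}, nested_pt={5: 100, 7: 42})
a = mmu.translate(1 * PAGE + 12)
b = mmu.translate(1 * PAGE + 16)   # TLB hit, no second walk
mmu.guest_remap(1, 5)
c = mmu.translate(1 * PAGE)
```

Compared with shadow paging, page table updates no longer involve the VMM, while each TLB miss becomes more expensive because two tables have to be consulted.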

Hardware Accelerated Virtualization: CPUs with virtualization support
- Processor list: https://ark.intel.com/content/www/us/en/ark.html
Image source: Gelas [2]

Hardware Accelerated Virtualization: Intel Processor Identification Utility
- https://downloadcenter.intel.com/download/28539
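On a Linux host, the same information can usually be read from the CPU feature flags in /proc/cpuinfo, where `vmx` indicates Intel VT-x, `svm` indicates AMD-V, and `ept` and `npt` indicate EPT and nested paging (RVI). The sketch below is an alternative to the utility above, not part of the original slides; the function name `virtualization_flags` is hypothetical, and it parses text passed in so that it can be tried on any platform.

```python
def virtualization_flags(cpuinfo_text):
    """Report hardware virtualization features found in /proc/cpuinfo text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        # Each logical CPU has a line of the form "flags : fpu vme ... vmx ...".
        if line.startswith("flags") and ":" in line:
            flags.update(line.split(":", 1)[1].split())
    return {
        "VT-x": "vmx" in flags,
        "AMD-V": "svm" in flags,
        "EPT": "ept" in flags,
        "RVI": "npt" in flags,
    }


# Made-up sample input for illustration; on Linux one would instead pass
# the contents of /proc/cpuinfo, e.g. open("/proc/cpuinfo").read().
sample = "processor\t: 0\nflags\t\t: fpu vme de pse vmx ept\n"
report = virtualization_flags(sample)
```

A flag can be absent even on a capable processor if virtualization is disabled in firmware or the code runs inside a VM that does not expose it, so a negative result is only a hint.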


Virtualizing using Reconfigurable HW: Heterogeneous architecture
Image source: El-Araby et al. [3]

Virtualizing using Reconfigurable HW: Virtualization infrastructure
Image source: El-Araby et al. [3]

Virtualizing using Reconfigurable HW: Speedup using Partial Runtime Reconfiguration (PRTR)
Image source: El-Araby et al. [3]

References
[1] Ole Agesen, Alex Garthwaite, Jeffrey Sheldon, and Pratap Subrahmanyam. 2010. The evolution of an x86 virtual machine monitor. SIGOPS Oper. Syst. Rev. 44, 4 (December 2010), 3-18. DOI: https://doi.org/10.1145/1899928.1899930
[2] J. D. Gelas. 2008. Hardware Virtualization: the Nuts and Bolts. Anandtech, March 17, 2008. Available online: https://www.anandtech.com/show/2480
[3] E. El-Araby, I. Gonzalez, and T. El-Ghazawi. 2008. Virtualizing and Sharing Reconfigurable Resources in High-Performance Reconfigurable Computing Systems. In Second International Workshop on High-Performance Reconfigurable Computing Technology and Applications (HPRCTA'08), held in conjunction with SC'08, Austin, TX, USA, November 2008.
