
SANS ISC: Guest Diary (Etay Nir) Kernel Hooking Basics - SANS Internet Storm Center SANS ISC InfoSec Forums


Guest Diary (Etay Nir) Kernel Hooking Basics

A note from HOD: We are recruiting, Etay is mostly through the roadmap. If you are interested in becoming a handler please check out our handler roadmap!

https://isc.sans.edu/handlerroadmap.html

The basic requirements to be considered:

  • GIAC Certification or significant contribution to the field
  • Open participation in the security community
  • Shown an ability and willingness to write (GIAC Gold, blogs or other published papers and articles)
  • Able to publish on your own quickly without review by others (for example, an employer; you may recuse yourself if you find that a topic presents a conflict with your day job)
  • Vetted by existing handlers

Please check the roadmap for more details!

 

 

Kernel Hooking Basics


The kernel, whether in Windows or Linux, is a subject that few people take an interest in, and it is not widely taught. It is, however, a place where various types of malicious activity occur, often out of plain sight and not always easy to detect, in many cases because kernels are not properly understood. This paper gives an overview of what a kernel is, its architecture, and common concepts and nomenclature, and finally explains how kernel hooking works. The focus is mainly on the Microsoft Windows kernel, although the Linux kernel is mentioned throughout.

 

The Kernel – “Made as simple as possible, but no simpler”

In simple terms, a kernel is the part of the operating system that mediates access to system resources. It is responsible for enabling multiple applications to share the hardware effectively by controlling access to CPU, memory, disk I/O, and networking, while the operating system is the kernel plus the applications that enable users to get something done (e.g. compiler, text editor, window manager). The critical code of the kernel is usually loaded into a protected area of memory, which prevents it from being overwritten by applications or by other, less critical parts of the operating system. The kernel performs its tasks, such as running processes and handling interrupts, in kernel space/mode. In contrast, everything a user does, such as writing text in a text editor or running programs in a GUI, happens in user space/mode. This separation keeps user data and kernel data from interfering with each other, which would otherwise cause instability and slowness and open the door to malicious activity.

 

The primary function of the kernel is to mediate access to the computer's resources, such as:

  • The CPU: The kernel takes responsibility for deciding at any time which of the many running programs should be allocated to the processor or processors (each of which can usually run only one program at a time).
  • RAM: Used to store both program instructions and data. Typically, both need to be present in memory in order for a program to execute. Often multiple programs will want access to memory, frequently demanding more memory than the computer has available. The kernel is responsible for deciding which memory each process can use, and determining what to do when not enough memory is available.
  • I/O Devices: Include such peripherals as keyboards, mice, disk drives, printers, USB devices, network adapters, and display devices. The kernel allocates requests from applications to perform I/O to an appropriate device and provides convenient methods for using the device (typically abstracted to the point where the application does not need to know implementation details of the device).
  • Memory and device Management
  • System Calls

 

User mode and kernel mode

Before we continue any further with the kernel overview, two very important components were mentioned that need proper explanation before we move on: Kernel mode and user mode.

A processor in a computer running Windows has two different modes: user mode and kernel mode. The processor switches between the two modes depending on what type of code is running on the processor. Applications run in user mode, and core operating system components run in kernel mode. While many drivers run in kernel mode, some drivers may run in user mode[2]. To further explain the difference between the two:

User Mode

When you start a user-mode application, Windows creates a process for the application. The process provides the application with a private virtual address space and a private handle table. Because an application's virtual address space is private, one application cannot alter data that belongs to another application. Each application runs in isolation, and if an application crashes, the crash is limited to that one application. Other applications and the operating system are not affected by the crash. In addition to being private, the virtual address space of a user-mode application is limited. A processor running in user mode cannot access virtual addresses that are reserved for the operating system. Limiting the virtual address space of a user-mode application prevents the application from altering, and possibly damaging, critical operating system data.

Kernel Mode

All code that runs in kernel mode shares a single virtual address space. This means that a kernel-mode driver is not isolated from other drivers or from the operating system itself. If a kernel-mode driver accidentally writes to the wrong virtual address, data that belongs to the operating system or another driver could be compromised. If a kernel-mode driver crashes, the entire operating system crashes.

Memory Management

The last two items in the list of kernel functions deserve more focus. Let’s start with memory management. The kernel has full access to the system's memory and must allow processes to safely access this memory as they require it. Often the first step in doing this is virtual addressing [3]. When a processor reads or writes to a memory location, it uses a virtual address supplied by the program, and as part of the operation it translates that virtual address to a physical address.

 

Accessing memory through a virtual address has the following three advantages:

  • A program can use a contiguous range of virtual addresses to access a large memory buffer that is not contiguous in physical memory.
  • A program can use a range of virtual addresses to access a memory buffer that is larger than the available physical memory. As the supply of physical memory becomes small, the memory manager saves pages of physical memory (typically 4 kilobytes in size) to a disk file. Pages of data or code are moved between physical memory and the disk as needed.
  • The virtual addresses used by different processes are isolated from each other. The code in one process cannot alter the physical memory that is being used by another process or the operating system.

The range of virtual addresses that is available to a process is called the virtual address space for the process. Each user-mode process has its own private virtual address space. For a 32-bit process, the virtual address space is usually the 2-gigabyte range 0x00000000 through 0x7FFFFFFF. For a 64-bit process, the virtual address space is the 8-terabyte range 0x000'00000000 through 0x7FF'FFFFFFFF. A range of virtual addresses is sometimes called a range of virtual memory.

The diagram shows the virtual address spaces for two 64-bit processes: Notepad.exe and MyApp.exe. Each process has its own virtual address space that goes from 0x000'00000000 through 0x7FF'FFFFFFFF. Each shaded block represents one page (4 kilobytes in size) of virtual or physical memory. Notice that the Notepad process uses three contiguous pages of virtual addresses, starting at 0x7F7'93950000. But those three contiguous pages of virtual addresses are mapped to noncontiguous pages in physical memory. Also notice that both processes use a page of virtual memory beginning at 0x7F7'93950000, but those virtual pages are mapped to different pages of physical memory.
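The mapping described above can be modeled in a few lines. This is only a sketch: a real page table is a multi-level hardware structure, and the frame numbers below are made up for illustration. It shows two processes using the same virtual address while landing on different physical pages.

```python
PAGE_SIZE = 4096  # 4-kilobyte pages, as in the text


def translate(page_table, virtual_address):
    """Translate a virtual address through one process's page table.

    page_table maps a virtual page number to a physical frame number.
    """
    page_number, offset = divmod(virtual_address, PAGE_SIZE)
    if page_number not in page_table:
        raise LookupError(f"page fault at {virtual_address:#x}")
    return page_table[page_number] * PAGE_SIZE + offset


BASE = 0x7F7_93950000  # the virtual address used in the diagram description
FIRST_PAGE = BASE // PAGE_SIZE

# Hypothetical frame numbers: three contiguous virtual pages for Notepad
# backed by noncontiguous physical frames, and one page for MyApp.
notepad_table = {FIRST_PAGE: 0x1A, FIRST_PAGE + 1: 0x07, FIRST_PAGE + 2: 0x42}
myapp_table = {FIRST_PAGE: 0x33}

notepad_physical = translate(notepad_table, BASE + 0x10)
myapp_physical = translate(myapp_table, BASE + 0x10)
print(hex(notepad_physical), hex(myapp_physical))
```

Because each process has its own table, neither process can name a physical page that belongs to the other.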

 

Device Management

We are getting closer to explaining kernel hooking, and device management ties together what we have covered so far. In order to perform useful functions, processes need access to the peripherals connected to the computer, which are controlled by the kernel through device drivers. A device driver is a computer program that enables the operating system to interact with a hardware device. It provides the operating system with information on how to control and communicate with a certain piece of hardware, which makes the driver a vital piece for any application that uses that hardware. The design goal of a driver is abstraction; the function of the driver is to translate the OS-mandated function calls (programming calls) into device-specific calls. In theory, the device should work correctly with a suitable driver. Device drivers are used for such things as video cards, sound cards, printers, scanners, modems, and LAN cards. The common levels of abstraction of device drivers are:

  • On the hardware side:
      1. Interfacing directly.
      2. Using a high-level interface (Video BIOS).
      3. Using a lower-level device driver (file drivers using disk drivers).
      4. Simulating work with hardware, while doing something entirely different.
  • On the software side:
      1. Allowing the operating system direct access to hardware resources.
      2. Implementing only primitives.
      3. Implementing an interface for non-driver software.
      4. Implementing a language, sometimes high-level.

System Calls

A system call is how a process requests a service from an operating system's kernel that it does not normally have permission to run. System calls provide the interface between a process and the operating system. Most operations that interact with the system require permissions not available to a user-level process. For example, I/O performed with a device present on the system, or any form of communication with other processes, requires the use of system calls.

A system call is a mechanism that is used by the application program to request a service from the operating system. System calls use a machine-code instruction that causes the processor to change mode, for example from user mode to supervisor (kernel) mode. This is where the operating system performs actions like accessing hardware devices or the memory management unit. Generally, the operating system provides a library that sits between the operating system and normal programs. Usually it is a C library such as glibc, or the Windows API. The library handles the low-level details of passing information to the kernel and switching to supervisor mode. System calls include close, open, read, wait and write.

To actually perform useful work, a process must be able to access the services provided by the kernel. This is implemented differently by each kernel, but most provide a C library or an API, which in turn invokes the related kernel functions.
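As a rough illustration of that layering, the sketch below uses Python's os module, whose open, write, read and close functions are thin wrappers over the platform's C library, which in turn issues the system calls. The file name and payload are placeholders.

```python
import os
import tempfile

O_BINARY = getattr(os, "O_BINARY", 0)  # only defined (and needed) on Windows


def write_then_read(path, payload):
    """Write payload to path and read it back through raw file descriptors."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | O_BINARY, 0o600)
    try:
        written = os.write(fd, payload)   # library wrapper -> kernel write
    finally:
        os.close(fd)                      # library wrapper -> kernel close
    fd = os.open(path, os.O_RDONLY | O_BINARY)
    try:
        data = os.read(fd, len(payload))  # library wrapper -> kernel read
    finally:
        os.close(fd)
    return written, data


with tempfile.TemporaryDirectory() as tmp:
    result = write_then_read(os.path.join(tmp, "demo.bin"), b"hello kernel")
print(result)
```

The process never touches the disk itself. Each call crosses into kernel mode, the kernel does the privileged work, and control returns to user mode with the result.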

Protection Ring

Protection rings are the last fundamental to tackle before proceeding to kernel hooking.

Protection rings [4], or hierarchical protection domains, are mechanisms to protect data and functionality from faults and malicious behavior. As we have learned from the previous sections, computer operating systems provide different levels of access to resources. A protection ring is one of two or more hierarchical levels or layers of privilege within the architecture of a computer system. This is generally hardware-enforced by CPU architectures that provide different CPU modes at the hardware level. Rings are arranged in a hierarchy from most privileged (most trusted, usually numbered zero) to least privileged (least trusted, usually with the highest ring number). On most operating systems, Ring 0 is the level with the most privileges and interacts most directly with the physical hardware such as the CPU and memory.

Special gates between rings are provided to allow an outer ring to access an inner ring's resources in a predefined manner, as opposed to allowing arbitrary usage. Correctly gating access between rings can improve security by preventing programs from one ring or privilege level from misusing resources intended for programs in another. For example, spyware running as a user program in Ring 3 should be prevented from turning on a web camera without informing the user, since hardware access should be a Ring 1 function reserved for device drivers. Programs such as web browsers running in higher numbered rings must request access to the network, a resource restricted to a lower numbered ring. A privilege level in the x86 and x86-64 instruction sets controls the access of the program currently running on the processor to resources such as memory regions, I/O ports, and special instructions. There are four privilege levels, ranging from 0, which is the most privileged, to 3, which is the least privileged. Most modern operating systems use level 0 for the kernel/executive and level 3 for application programs. Any resource available to level n is also available to levels 0 to n, so the privilege levels are "rings". When a less privileged process tries to access a resource reserved for a more privileged level, a General Protection Fault is reported to the OS.

It is not necessary to use all four privilege levels. Current operating systems such as Windows and Linux mostly rely on paging, and paging has only one bit to specify the privilege level, which is either Supervisor or User (the U/S bit). Windows NT uses this two-level system. Real mode programs on x86 execute at level 0 (the highest privilege level), whereas virtual 8086 mode executes all programs at level 3.
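The rule that a resource available to level n is also available to levels 0 to n comes down to one comparison. The sketch below is a toy model only: real enforcement happens in the CPU, and the function and resource names are invented for illustration.

```python
KERNEL_RING = 0  # most privileged
USER_RING = 3    # least privileged


def can_access(current_ring, resource_ring):
    """A resource that requires ring n is available to code at rings 0..n."""
    return current_ring <= resource_ring


def access(current_ring, resource_ring, name):
    """Model the check the hardware performs before a privileged access."""
    if not can_access(current_ring, resource_ring):
        raise PermissionError(
            f"general protection fault: ring {current_ring} touched {name}"
        )
    return f"ring {current_ring} accessed {name}"


print(access(KERNEL_RING, KERNEL_RING, "I/O port"))
print(access(KERNEL_RING, USER_RING, "user buffer"))
try:
    access(USER_RING, KERNEL_RING, "I/O port")
except PermissionError as fault:
    print(fault)
```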

The following diagram shows privilege rings for the x86 architecture:

Kernel Hooking

The term hooking [5] covers a range of techniques used to alter or augment the behavior of an operating system, an application, or any other software component by intercepting function calls, messages, and events passed between software components. The code that performs the interception of function calls, events, or messages is called a hook. Typically hooks are inserted while software is already running, but hooking can also be employed before the application is started.

 

The two main methods of hooking are:

  • Physical modification: Achieved by physically modifying an executable or library before an application is running. Through techniques of reverse engineering, you can also achieve hooking. This is typically used to intercept function calls to either monitor them or replace them entirely. For example, by using a disassembler, the entry point of a function within a module can be found. It can then be altered to instead dynamically load some other library module and then have it execute desired methods within that loaded library. If applicable, another related approach by which hooking can be achieved is by altering the import table of an executable. This table can be modified to load any additional library modules as well as changing what external code is invoked when a function is called by the application. An alternative method for achieving function hooking is by intercepting function calls through a wrapper library. When creating a wrapper, you make your own version of a library that an application loads with all the same functionality of the original library that it will replace. That is, all the functions that are accessible are essentially the same between the original and the replacement. This wrapper library can be designed to call any of the functionality from the original library, or replace it with an entirely new set of logic.
  • Runtime modification: Operating systems and software may provide the means to easily insert event hooks at runtime. Microsoft Windows for example, allows you to insert hooks that can be used to process or modify system events and application events for dialogs, scrollbars, and menus as well as other items. It also allows a hook to insert, remove, process or modify keyboard and mouse events. Linux provides another example where hooks can be used in a similar manner to process network events within the kernel through NetFilter.
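The wrapper idea from the list above can be shown without any operating system support. The following is a minimal sketch in which a dictionary stands in for a library's table of functions. The names install_hook, read_config and library are hypothetical and not part of any real API.

```python
import functools


def install_hook(namespace, name, before):
    """Replace namespace[name] with a wrapper that runs `before` first.

    Returns the original function so the hook can be removed later.
    """
    original = namespace[name]

    @functools.wraps(original)
    def hooked(*args, **kwargs):
        before(name, args, kwargs)          # the interception point
        return original(*args, **kwargs)    # same behavior as the original

    namespace[name] = hooked
    return original


def read_config(path):
    return f"contents of {path}"


library = {"read_config": read_config}  # stand-in for a library's functions
calls = []
original = install_hook(
    library, "read_config", lambda name, args, kwargs: calls.append((name, args))
)
output = library["read_config"]("/etc/demo.conf")
print(output, calls)
```

The caller sees the same result as before, which is what makes a hook useful for monitoring and also hard to notice.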

 

Hooking and Malware

 

When it comes to malware and malware analysis, three main categories emerge:

  • Malware that is non-intrusive, which means that the malware does not modify the OS or the processes within the OS in any way.
  • Malware that is intrusive by modifying things which should never be modified (for example kernel code, the BIOS, which has its hash stored in the TPM, MSR registers, and so on). This type of malware is in general easier to spot.
  • Malware that is intrusive by modifying things which are designed to be modified (DATA sections). This category of malware is in general much harder to spot.

 

Hooking Techniques

There are about five main hooking techniques:

  • IAT Hooks
  • Inline Hooks
  • SSDT Hooks
  • SYSENTER_EIP Hook
  • IRP Major Function Hook

IAT Hooks

The Import Address Table (IAT) is a lookup table used when an application calls a function in a different module. Imports can be by ordinal or by name. Because a compiled program cannot know the memory location of the libraries it depends upon, an indirect jump is required whenever an API call is made. The general format looks like the following: "jmp dword ptr ds:[address]". Because the addresses of functions in DLLs change, instead of calling a DLL function directly, the application makes a call to the relevant jmp in its own jump table. When the application is executed, the loader places a pointer to each required DLL function at the appropriate address in the IAT.

The following diagram visualizes an Import Address Table:

 

Based on the explanation above, if a rootkit for example would inject itself inside the application and modify the addresses in the IAT, that rootkit would be able to gain control every time a target function is called.
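A toy model may make this concrete. In the sketch below, a dictionary named memory plays the role of the address space, exports plays the role of a DLL's export table, and iat is the application's import table. All addresses and helper names are hypothetical.

```python
memory = {}  # address -> callable; a stand-in for the process address space


def place(address, func):
    """Pretend to load `func` at `address` and return that address."""
    memory[address] = func
    return address


def create_file(name):
    return f"handle:{name}"


# Export table of a hypothetical DLL: function name -> address.
exports = {"CreateFile": place(0x7000_1000, create_file)}

# The loader copies the resolved addresses into the application's IAT.
iat = dict(exports)


def call_import(name, *args):
    """Every API call is an indirect jump through the IAT entry."""
    return memory[iat[name]](*args)


before = call_import("CreateFile", "a.txt")

seen = []


def malicious_create_file(name):
    seen.append(name)                           # observe the arguments
    return memory[exports["CreateFile"]](name)  # forward to the real code


# The rootkit overwrites a single pointer in the IAT.
iat["CreateFile"] = place(0x0040_5000, malicious_create_file)
after = call_import("CreateFile", "b.txt")
print(before, after, seen)
```

Nothing in the DLL itself changed. Only the application's pointer to it did.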

Bypass - Detection

Since the Export Address Table (EAT) of each DLL remains intact, an application could easily bypass IAT hooks by calling ‘GetProcAddress’ to get the real address of each DLL function. To prevent this type of bypass, a rootkit would likely hook ‘GetProcAddress’/‘LdrGetProcedureAddress’ and use it to return fake addresses. To bypass these types of hooks, an application can manually implement a local ‘GetProcAddress’ and use that local function to get the real function address.
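The detection side of this can be sketched as a comparison between each IAT entry and the address that the owning DLL's export table reports. The function find_iat_hooks and the sample addresses are hypothetical, and the sketch assumes the export addresses were obtained in a trustworthy way (for example by parsing the export table locally rather than through a possibly hooked API).

```python
def find_iat_hooks(iat, exports_by_module):
    """Return IAT entries that disagree with the exporting module.

    iat maps (module, function) to the address stored in the import table.
    exports_by_module maps module -> {function: exported address}.
    """
    hooked = {}
    for (module, name), address in iat.items():
        expected = exports_by_module[module][name]
        if address != expected:
            hooked[(module, name)] = (address, expected)
    return hooked


# Hypothetical addresses for illustration only.
exports_by_module = {
    "kernel32.dll": {"CreateFileW": 0x7C80_1A28, "ReadFile": 0x7C80_1812},
}
iat = {
    ("kernel32.dll", "CreateFileW"): 0x0040_5000,  # redirected entry
    ("kernel32.dll", "ReadFile"): 0x7C80_1812,     # untouched entry
}
hooks = find_iat_hooks(iat, exports_by_module)
print({key: tuple(map(hex, value)) for key, value in hooks.items()})
```

A real tool would also need to account for forwarded exports, where a legitimate IAT entry points into a different DLL than the one named in the import.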

Inline Hooks

Inline hooks, also called trampoline or detours hooks, are a method of receiving control when a function is called, before the function has done its job. The flow of execution is redirected by modifying the first few (usually five) bytes of a target function. A standard way to achieve this is to overwrite the first five bytes of the function with a jump to a malicious block of code. The malicious code can then read the function arguments and do whatever it needs. If the malicious code requires results from the original function (the one it hooked), it may call the function by executing the five bytes that were overwritten and then jumping five bytes into the original function, which skips the malicious jump/call and avoids infinite recursion or redirection.
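The five bytes in question are typically an x86 near jump: the opcode E9 followed by a 32-bit offset measured from the end of the jump instruction. The sketch below builds such a patch over a byte buffer that stands in for a loaded module. All addresses are hypothetical, and it assumes the first five bytes of the target are whole, position-independent instructions, which real hooking code has to verify with a disassembler.

```python
import struct

JMP_REL32 = 0xE9  # x86 opcode for a near jump with a 32-bit relative offset
PATCH_LEN = 5     # one opcode byte plus four offset bytes


def make_jmp(source, target):
    """Encode `jmp target` as it would appear at address `source`."""
    # The offset is relative to the instruction that follows the jump.
    return struct.pack("<Bi", JMP_REL32, target - (source + PATCH_LEN))


def install_inline_hook(image, base, function, hook, trampoline):
    """Patch `function` to jump to `hook` and return the trampoline bytes.

    The trampoline holds the overwritten bytes followed by a jump back to
    function + 5, so the hook can still run the original code.
    """
    start = function - base
    saved = bytes(image[start:start + PATCH_LEN])
    image[start:start + PATCH_LEN] = make_jmp(function, hook)
    return saved + make_jmp(trampoline + PATCH_LEN, function + PATCH_LEN)


PROLOGUE = bytes.fromhex("8bff558bec")  # mov edi, edi / push ebp / mov ebp, esp
BASE = 0x1000_0000        # hypothetical load address of the target module
HOOK = 0x2000_0000        # hypothetical address of the hook code
TRAMPOLINE = 0x2000_0100  # hypothetical address of the trampoline

image = bytearray(PROLOGUE + b"\x90" * 11)
trampoline_code = install_inline_hook(image, BASE, BASE, HOOK, TRAMPOLINE)
print(image[:PATCH_LEN].hex(), trampoline_code.hex())
```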

The following diagram illustrates the concept of inline hooks:

Bypass – Detection on Ring 3

In user mode, inline hooks are usually placed inside functions that are exported by a DLL. The most accurate way to detect and bypass these hooks is to compare each DLL against the original code. First, the program needs to get a list of each DLL that is loaded, find the original file and read it, align and load the sections into memory, and then perform base relocation. Once the new copy of the DLL is loaded into memory, the application can walk the export address table and compare each function against the one in the original DLL. To bypass hooks, an application can then either replace the overwritten code using the code from the newly loaded DLL, or resolve imports in the newly loaded DLL and use it instead. It is important to be aware that some DLLs will not work if more than one instance is loaded.

The above method of bypassing DLL hooks practically involves writing your own implementation of LoadLibrary. The same manual DLL loading can also be used to detect or even fix EAT hooks, although EAT hooks are very uncommon.
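Once a clean copy has been mapped and relocated to the same base, the comparison step itself is simple. The sketch below assumes that work is already done and that both images are available as byte buffers. The function name, export names and offsets are hypothetical.

```python
def find_patched_exports(loaded, clean, exports, length=16):
    """Return names of exports whose first `length` bytes differ.

    loaded and clean are byte buffers of the same module, mapped and
    relocated to the same base; exports maps name -> offset (RVA).
    """
    return sorted(
        name
        for name, rva in exports.items()
        if loaded[rva:rva + length] != clean[rva:rva + length]
    )


clean = bytes(range(256))                        # stand-in for the file on disk
loaded = bytearray(clean)                        # stand-in for the live module
loaded[0x20:0x25] = bytes.fromhex("e9fbffff0f")  # a five-byte jump patch
exports = {"FuncA": 0x20, "FuncB": 0x60}         # hypothetical export offsets

patched = find_patched_exports(loaded, clean, exports)
print(patched)
```

Comparing only the first bytes of each export is a shortcut. A hook placed deeper in the function would need a comparison of the whole code section.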

Bypass – Detection on Ring 0

Inter-modular jumps occur rarely in Kernel mode. Hooks in ntoskrnl can usually be detected by disassembling each instruction in each function, then searching for jumps or calls that point outside of ntoskrnl. It is also possible to use the same method explained for user mode hook detection by using a driver that would be able to read each ntoskrnl module from disk, load it into memory, and then compare the instructions against the original.
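A much reduced version of the first approach is sketched below. Instead of disassembling every instruction, it only looks for a near jump (opcode E9) at the entry of each function and checks whether the destination falls outside the module. The base address, offsets and destination are hypothetical, and the buffer stands in for the in-memory image of ntoskrnl.

```python
import struct


def entry_jump_target(image, base, rva):
    """Return the destination of a jmp rel32 at `rva`, or None if absent."""
    if image[rva] != 0xE9:
        return None
    (offset,) = struct.unpack_from("<i", image, rva + 1)
    return base + rva + 5 + offset


def find_outbound_entry_jumps(image, base, functions):
    """Map function name -> jump destination for jumps leaving the module."""
    end = base + len(image)
    suspicious = {}
    for name, rva in functions.items():
        target = entry_jump_target(image, base, rva)
        if target is not None and not (base <= target < end):
            suspicious[name] = target
    return suspicious


BASE = 0x8040_0000                 # hypothetical kernel image base
image = bytearray(b"\x90" * 0x100)
functions = {"NtOpenFile": 0x10, "NtCreateFile": 0x40, "NtClose": 0x80}

image[0x10:0x15] = bytes.fromhex("8bff558bec")  # ordinary prologue
# Jump out of the module (hooked) and a jump that stays inside it (benign).
struct.pack_into("<Bi", image, 0x40, 0xE9, 0xF7A0_1000 - (BASE + 0x40 + 5))
struct.pack_into("<Bi", image, 0x80, 0xE9, (BASE + 0x10) - (BASE + 0x80 + 5))

suspicious = find_outbound_entry_jumps(image, BASE, functions)
print({name: hex(target) for name, target in suspicious.items()})
```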

For inline hooks within drivers, scanning for jmp/call instructions that point outside of the driver body is much more likely to result in false positives. However, non-standard drivers that are the target of jumps/calls inside standard kernel drivers should raise a red flag. It is also possible to read drivers from disk. As drivers generally do not export many functions, and IRP major function pointers are only initialized at runtime, it would probably be necessary to compare the entire code section of the original and new driver. Note that relative calls/jumps are susceptible to changes during relocation. This means that there will be some differences between the original and new driver, but each pair of corresponding relative calls/jumps should point to the same place.

SSDT Hooks

The System Service Dispatch Table (SSDT) is a table of pointers to various Zw/Nt [6] functions that are callable from user mode. A malicious application can replace pointers in the SSDT with pointers to its own code.

Detection

All pointers in the SSDT should point to code within ntoskrnl. If any pointer points outside of ntoskrnl, it is likely hooked. It is possible that a rootkit could modify ntoskrnl.exe or one of the related modules in memory and slip some code into an empty space, in which case the pointer would still point within ntoskrnl. As far as I'm aware, functions starting with "Zw" are intercepted by SSDT hooks, but those beginning with "Nt" are not. An application should therefore be able to detect SSDT hooks by comparing Nt* function addresses with the equivalent pointers in the SSDT.
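The first check, that every pointer lands inside ntoskrnl, can be sketched as a range test against the list of loaded modules. The table contents, module bases and sizes below are made up, and a real implementation would run in a driver and read the live table.

```python
def owner_of(address, modules):
    """Return the name of the module that contains `address`, if any."""
    for name, (base, size) in modules.items():
        if base <= address < base + size:
            return name
    return None


def find_ssdt_hooks(ssdt, modules, kernel="ntoskrnl.exe"):
    """Return (index, pointer, owner) for entries outside the kernel image."""
    hooks = []
    for index, pointer in enumerate(ssdt):
        owner = owner_of(pointer, modules)
        if owner != kernel:
            hooks.append((index, pointer, owner))
    return hooks


# Hypothetical module list: name -> (base address, size in bytes).
modules = {
    "ntoskrnl.exe": (0x8040_0000, 0x0020_0000),
    "evil.sys": (0xF7A0_0000, 0x0000_8000),
}
# Hypothetical service table with one redirected entry.
ssdt = [0x8048_1000, 0x8049_2230, 0xF7A0_1000, 0x8051_0040]

hooks = find_ssdt_hooks(ssdt, modules)
print([(index, hex(pointer), owner) for index, pointer, owner in hooks])
```

As the text notes, this test misses a rootkit that places its code in unused space inside ntoskrnl, so it is a first filter rather than proof of a clean table.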

Bypass

A simple way to bypass SSDT hooks would be to call only Nt* functions instead of the Zw* equivalents. It is also possible to find the original SSDT by loading ntoskrnl.exe (this can be done with LoadLibraryEx in user mode), then finding the export "KeServiceDescriptorTable" and using it to calculate the offset of KiServiceTable within the disk image (user mode applications can use NtQuerySystemInformation to get the kernel base address). A kernel driver would be required to replace the SSDT.

SYSENTER_EIP Hook

SYSENTER_EIP points to the code to be executed when the SYSENTER instruction is used. User mode applications use SYSENTER to transition into kernel mode and call a kernel function (those beginning with Nt/Zw). Usually it points to KiFastCallEntry, but it can be replaced in order to hook all user mode calls to kernel functions.

Detection / Bypass


SYSENTER_EIP hooking does not affect kernel mode drivers, and cannot be bypassed from user mode. To allow user mode applications to bypass this hook, a kernel driver must set SYSENTER_EIP back to its original value (KiFastCallEntry). This can be done using the WRMSR instruction. However, since KiFastCallEntry is not exported by ntoskrnl, getting the address would not be simple.

 

IRP Major Function Hook
 

The driver object of each driver contains a table of twenty-eight function pointers. These pointers are called by other drivers via IoCallDriver and correspond to operations such as read/write (IRP_MJ_READ/IRP_MJ_WRITE). These pointers can easily be replaced by another driver.


Detection


In general, all IRP major function pointers for a driver should point to code within the driver's address space. This is not always the case, but is a good start to identifying malicious drivers which have redirected the IRP major functions of legitimate drivers to their own code.
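This check can be sketched as follows. The driver base, size and pointer values are hypothetical. The allowed set reflects one legitimate exception I am aware of: entries a driver does not implement are left pointing at a default handler inside the kernel image, so its address (made up here) has to be permitted.

```python
IRP_MJ_MAXIMUM_FUNCTION = 0x1B  # highest major function code: 28 entries
IRP_MJ_NAMES = {
    0x00: "IRP_MJ_CREATE",
    0x02: "IRP_MJ_CLOSE",
    0x03: "IRP_MJ_READ",
    0x04: "IRP_MJ_WRITE",
    0x0E: "IRP_MJ_DEVICE_CONTROL",
}


def find_irp_hooks(major_functions, driver_base, driver_size, allowed=()):
    """Return {major function name: pointer} for pointers outside the driver."""
    if len(major_functions) != IRP_MJ_MAXIMUM_FUNCTION + 1:
        raise ValueError("expected one pointer per major function code")
    suspicious = {}
    for code, pointer in enumerate(major_functions):
        inside = driver_base <= pointer < driver_base + driver_size
        if not inside and pointer not in allowed:
            name = IRP_MJ_NAMES.get(code, f"IRP_MJ_{code:#04x}")
            suspicious[name] = pointer
    return suspicious


DEFAULT_HANDLER = 0x8050_1234  # hypothetical address of the kernel's default
DRIVER_BASE, DRIVER_SIZE = 0xF700_0000, 0x8000  # hypothetical driver image

table = [DEFAULT_HANDLER] * (IRP_MJ_MAXIMUM_FUNCTION + 1)
table[0x00] = DRIVER_BASE + 0x100  # the driver's own create handler
table[0x03] = DRIVER_BASE + 0x200  # the driver's own read handler
table[0x04] = 0xF9A0_0400          # write handler redirected elsewhere

suspicious = find_irp_hooks(table, DRIVER_BASE, DRIVER_SIZE, {DEFAULT_HANDLER})
print({name: hex(pointer) for name, pointer in suspicious.items()})
```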


Bypass


Because IRP major function pointers are initialized from within the driver entry point at runtime, it is not really possible to get the original values by reading the original driver from disk. There are also issues with loading a new copy of the driver, due to collisions. The only way to bypass these types of hooks would be to call a lower driver. Drivers are generally stacked: the top driver passes the data to the driver below it, and so on. If the lowest driver isn't hooked, an application could send the request directly to the lowest driver.

 

Resources

[1]  https://en.wikipedia.org/wiki/Kernel_(operating_system)  

[2]  https://docs.microsoft.com/en-us/windows-hardware/drivers/gettingstarted/user-mode-and-kernel-mode

[3]  https://docs.microsoft.com/en-us/windows-hardware/drivers/gettingstarted/virtual-address-spaces

[4] https://en.wikipedia.org/wiki/Protection_ring

[5] https://en.wikipedia.org/wiki/Hooking     

[6] https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/using-nt-and-zw-versions-of-the-native-system-services-routines

 

Richard

160 Posts
ISC Handler
