eBPF (Extended Berkeley Packet Filter) lets programmers load and execute lightweight programs within the Linux kernel without restarting it.
The kernel controls everything from processing data to communicating over networks. Traditionally, if you wanted to inspect or tweak how the kernel operates, you had to modify the kernel source directly or load specialized kernel modules. That process is time-consuming and risky, and it often requires a system reboot, which is far from ideal.
However, with eBPF, you can attach eBPF programs to different hook points within the kernel. These programs can then observe and modify the kernel's behavior without changing the kernel code.
In this article, we’ll explore eBPF in detail.
eBPF is made up of three core concepts. Here’s what they are:
Hooks are predefined locations within the kernel code. They correspond to specific events or actions, such as network packet processing, system call entry/exit, or tracepoint events.
eBPF programs encapsulate your desired functionality. They are typically written in a restricted subset of C and compiled into bytecode that runs within the kernel.
Maps are data structures that let eBPF programs and user-space applications share data efficiently. They store key-value pairs in structures such as hash tables and arrays, and they keep data available across separate executions of an eBPF program.
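To make the role of maps concrete, here is a minimal pure-Python sketch. It does not touch the kernel or any eBPF library: `ToyHashMap`, `count_event`, and the capacity of 4 are hypothetical names and values chosen purely for illustration. The only point it tries to show is that a program keeps no state of its own between runs, while entries in the map persist and can be read later by user space.

```python
class ToyHashMap:
    """Toy stand-in for an eBPF hash map: a key-value store with a fixed capacity."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._data = {}

    def lookup(self, key):
        # Returns None when the key is absent.
        return self._data.get(key)

    def update(self, key, value):
        # Refuse to grow past the capacity declared at creation time.
        if key not in self._data and len(self._data) >= self.max_entries:
            raise MemoryError("map is full")
        self._data[key] = value

    def items(self):
        # Hand back a copy so callers cannot mutate the map by accident.
        return dict(self._data)


def count_event(event_name, counts):
    """Toy 'eBPF program': runs once per event and keeps no state of its own."""
    current = counts.lookup(event_name) or 0
    counts.update(event_name, current + 1)


counts = ToyHashMap(max_entries=4)
for name in ["open", "read", "open", "write", "open"]:
    count_event(name, counts)

# "User space" reads the shared map after the programs have run.
print(counts.items())
```

Each call to `count_event` starts from scratch, so the running totals exist only because they live in the map rather than in the program.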
Here’s how eBPF works: Developers write eBPF programs in a restricted subset of C.
The eBPF program is compiled into bytecode using the LLVM compiler infrastructure and its BPF backend. Before the program can run, the kernel's verifier checks this bytecode to ensure that it:
Adheres to memory safety rules.
Does not perform unauthorized operations.
Terminates within a finite number of instructions.
Verification takes place when the eBPF bytecode is loaded into the kernel through the dedicated bpf() system call, and only programs that pass are accepted. In this phase:
Any required maps are created and initialized to facilitate data sharing between eBPF programs and user-space applications.
The loaded eBPF program is attached to one or more hooks within the kernel, specifying the events or actions that will trigger its execution.
When the specified events or actions occur, the attached eBPF program runs in kernel context. User-space applications can then retrieve and analyze the data that the program generated or modified through the shared maps.
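The load, attach, and trigger steps can be sketched as a small pure-Python simulation. Nothing here is a real kernel interface: `ToyKernel`, `accept_all`, `count_syscall`, the hook names, and the integer handles are all assumptions made for illustration, and the placeholder verifier only checks that the program is callable.

```python
from collections import defaultdict


class ToyKernel:
    """Pure-Python stand-in for the load/attach/run steps; not a real kernel API."""

    def __init__(self):
        self._programs = {}              # handle -> program callable
        self._hooks = defaultdict(list)  # hook name -> attached handles
        self._next_handle = 1

    def load(self, program, verifier):
        # Loading succeeds only if the verifier accepts the program.
        if not verifier(program):
            raise PermissionError("program rejected by verifier")
        handle = self._next_handle
        self._next_handle += 1
        self._programs[handle] = program
        return handle

    def attach(self, handle, hook):
        self._hooks[hook].append(handle)

    def detach(self, handle, hook):
        self._hooks[hook].remove(handle)

    def fire(self, hook, context):
        # Called when the event behind a hook occurs: run every attached program.
        for handle in list(self._hooks[hook]):
            self._programs[handle](context)


calls_per_pid = {}  # stands in for a map shared with user space


def count_syscall(context):
    pid = context["pid"]
    calls_per_pid[pid] = calls_per_pid.get(pid, 0) + 1


def accept_all(program):
    # Placeholder verifier (assumption): a real one analyzes the bytecode.
    return callable(program)


kernel = ToyKernel()
handle = kernel.load(count_syscall, verifier=accept_all)
kernel.attach(handle, "sys_enter")

kernel.fire("sys_enter", {"pid": 10})
kernel.fire("sys_enter", {"pid": 10})
kernel.fire("sys_enter", {"pid": 22})
kernel.fire("packet_received", {"pid": 10})  # nothing is attached to this hook

kernel.detach(handle, "sys_enter")
kernel.fire("sys_enter", {"pid": 22})        # ignored after detach

print(calls_per_pid)
```

The program only runs while it is attached to a hook whose event fires, and the "kernel" object keeps running across the attach and detach calls.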
eBPF incorporates strong security measures to ensure programs are executed safely within the kernel. This way, it maintains system integrity and prevents potential vulnerabilities.
During verification, a static verifier analyzes the eBPF bytecode and checks several safety properties, including:
Memory safety: The verifier ensures that eBPF programs can’t access or modify kernel memory outside of their designated memory regions. It prevents buffer overflows and other memory-related vulnerabilities.
Termination guarantee: eBPF programs must terminate within a finite number of instructions to avoid infinite loops or resource exhaustion attacks.
Restricted operations: eBPF programs are restricted from disabling interrupts or modifying critical kernel data structures to maintain system stability and security.
Bounded loops: Loop iterations and recursion depths are bounded to prevent excessive resource consumption or denial-of-service attacks.
If an eBPF program fails to meet these safety requirements, the verifier rejects it and the kernel never executes it. As a result, verified eBPF programs are prevented from compromising the integrity or stability of the kernel, which gives you a controlled environment for executing user-defined code.
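The checks listed above can be illustrated with a toy static checker. This is a deliberately simplified sketch, not the kernel's verifier: the tuple-based instruction format, the limits `MAX_INSNS` and `REGION_SIZE`, and the helper names in `ALLOWED_HELPERS` are all invented for the example, and the code assumes well-formed instruction tuples. It is also more conservative than the real thing, since it rejects every backward jump instead of reasoning about loop bounds.

```python
MAX_INSNS = 16      # hypothetical instruction limit
REGION_SIZE = 64    # hypothetical size of the program's own memory region
ALLOWED_HELPERS = {"map_lookup", "map_update"}  # hypothetical helper allow-list


def verify(program):
    """Return (ok, reason) for a list of toy instructions, without running them."""
    if not program or len(program) > MAX_INSNS:
        return False, "program is empty or too long"
    if program[-1] != ("exit",):
        return False, "last instruction must be exit"
    for index, insn in enumerate(program):
        op = insn[0]
        if op in ("load", "store"):
            # Memory safety: every access must stay inside the region.
            if not 0 <= insn[1] < REGION_SIZE:
                return False, f"insn {index}: out-of-bounds access"
        elif op == "jump":
            # Termination: forward-only jumps inside the program cannot loop.
            if not index < insn[1] < len(program):
                return False, f"insn {index}: backward or out-of-range jump"
        elif op == "call":
            # Restricted operations: only allow-listed helpers may be called.
            if insn[1] not in ALLOWED_HELPERS:
                return False, f"insn {index}: helper not allowed"
        elif op != "exit":
            return False, f"insn {index}: unknown instruction"
    return True, "ok"


good = [("load", 0), ("call", "map_update"), ("jump", 3), ("exit",)]
out_of_bounds = [("store", 64), ("exit",)]
looping = [("load", 0), ("jump", 0), ("exit",)]
forbidden = [("call", "disable_interrupts"), ("exit",)]

for name, prog in [("good", good), ("out_of_bounds", out_of_bounds),
                   ("looping", looping), ("forbidden", forbidden)]:
    print(name, verify(prog))
```

Because jumps can only move forward and the final instruction must be `exit`, any program this toy checker accepts finishes in at most `MAX_INSNS` steps.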
eBPF programs are typically written in C and then compiled into bytecode using the LLVM compiler infrastructure and its BPF backend. Several tools and frameworks support this workflow, such as:
BCC (BPF Compiler Collection) is a popular toolkit for creating kernel tracing and manipulation programs using eBPF. It provides Python bindings, so you can embed the C source of an eBPF program in a Python script that compiles it to bytecode, loads it, and reads its results.
bpftrace is a high-level tracing language for Linux based on eBPF. It allows developers to write concise scripts and trace kernel and user-space events without low-level C programming.
libbpf is a library from the Linux kernel project that provides a C API for loading and interacting with eBPF programs from user-space applications.
bpftool is a command-line tool for managing and inspecting eBPF programs, maps, and links.
These tools and frameworks abstract some of the complexities of writing and compiling eBPF programs. They provide higher-level interfaces and simplify the development process.
(Splunk is a proud participant in the OpenTelemetry open standards project, which is why we donated the OpenTelemetry eBPF collector.)
eBPF has several benefits that make it an attractive choice for a range of use cases. Here are some of them:
eBPF enables kernel-level tracing and profiling without the need for kernel modules. You can attach small eBPF programs to various kernel hooks for deep visibility into kernel activities and system behavior. This helps with:
Performance analysis
Debugging
Monitoring
eBPF programs are expressed in a virtual instruction set and run within a sandboxed environment. This sandboxing mitigates potential security risks by preventing eBPF programs from directly modifying kernel data structures or executing privileged operations.
As a result, the overall security posture of the system is preserved.
You can dynamically load, unload, and update eBPF programs without rebooting the system or restarting the kernel. This flexibility allows on-the-fly adjustments and gives you real-time monitoring and analysis capabilities.
eBPF is widely used in observability and monitoring tools, such as:
System and application profilers
eBPF provides a powerful mechanism for collecting fine-grained data from the kernel and user-space applications, which gives deep insight into system behavior.
While eBPF offers numerous benefits, its adoption and usage require a certain level of expertise and understanding of the underlying kernel and system internals, which can be tricky for new users.
eBPF is a powerful technology that allows developers to run custom programs safely within the Linux kernel. While it has significant advantages like kernel tracing and monitoring, some level of expertise is required to use it appropriately.