April 09, 2024

What is eBPF?

By Laiba Siddiqui

eBPF (Extended Berkeley Packet Filter) lets programmers load and execute lightweight programs within the Linux kernel without restarting it.

The kernel controls everything from processing data to communicating over networks. Traditionally, if you wanted to inspect or tweak how the kernel operates, you had to modify the kernel source directly or load specialized kernel modules. Changing the kernel source is time-consuming and requires a rebuild and reboot, and a buggy kernel module can crash the whole system, which is far from ideal.

However, with eBPF, you can attach eBPF programs to different hook points within the kernel. These programs can then observe and modify the kernel's behavior without changing the kernel code.

In this article, we’ll explore eBPF in detail.

Defining eBPF & how it works

eBPF is made up of three core concepts. Here’s what they are:

Hooks

Hooks are predefined locations within the kernel code. They correspond to specific events or actions, such as network packet processing, system call entry/exit, or tracepoint events.

eBPF programs

eBPF programs encapsulate your desired functionality. These programs are typically written in a restricted subset of C and compiled into bytecode instructions that the kernel can execute.

Maps

Maps are data structures that facilitate efficient data sharing and communication between eBPF programs and user-space applications. They store key-value data in structures such as hash tables and arrays, and they persist across invocations, so a program can keep state from one execution to the next.
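
To make the idea of a shared map more concrete, here is a toy, pure-Python model of a bounded hash map that a "kernel-side" program writes to and "user space" reads from. It is only a sketch of the concept, not the real eBPF map API: the class `ToyHashMap`, its methods, and the `count_event` function are hypothetical names. The three flag values mirror the kernel's `BPF_ANY`, `BPF_NOEXIST`, and `BPF_EXIST` constants.

```python
# Toy model of an eBPF hash map; NOT the real kernel API.
# ToyHashMap and its method names are hypothetical, for illustration only.

BPF_ANY = 0      # create the entry or update it
BPF_NOEXIST = 1  # create only if the key is absent
BPF_EXIST = 2    # update only if the key is present


class ToyHashMap:
    """A key-value store with a fixed capacity, like a map's max_entries."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._data = {}

    def update(self, key, value, flags=BPF_ANY):
        exists = key in self._data
        if flags == BPF_NOEXIST and exists:
            raise KeyError(f"key {key!r} already exists")
        if flags == BPF_EXIST and not exists:
            raise KeyError(f"key {key!r} not found")
        if not exists and len(self._data) >= self.max_entries:
            raise OverflowError("map is full")
        self._data[key] = value

    def lookup(self, key):
        # Returns None for a missing key, like a NULL lookup result.
        return self._data.get(key)

    def delete(self, key):
        del self._data[key]

    def items(self):
        return list(self._data.items())


def count_event(counts, pid):
    """Stand-in for an eBPF program: bump a per-process counter."""
    current = counts.lookup(pid)
    counts.update(pid, (current or 0) + 1)


counts = ToyHashMap(max_entries=2)
for pid in (100, 200, 100):
    count_event(counts, pid)

# "User space" reads the same map the program wrote to.
print(sorted(counts.items()))
```

The fixed capacity is the main design point: a map's size is declared up front, so a program cannot grow kernel memory without limit.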

How eBPF works

Here’s how eBPF works: Developers write eBPF programs in a restricted subset of C, using eBPF helper functions and data structures such as maps.

The eBPF program is compiled into bytecode instructions using the LLVM compiler infrastructure and the BPF backend. Before execution, this bytecode is verified to ensure that it:

Adheres to memory safety rules.
Does not perform unauthorized operations.
Terminates within a finite number of instructions.

The eBPF bytecode is loaded into the kernel using the dedicated bpf() system call, and the verification described above runs as part of this load. In this phase:

Any required maps are created and initialized to facilitate data sharing between eBPF programs and user-space applications.
The loaded eBPF program is attached to one or more hooks within the kernel, specifying the events or actions that will trigger its execution.

From then on, whenever one of the specified events or actions occurs, the attached eBPF program runs within the kernel context. User-space applications can retrieve and analyze the data the program generates or modifies through the shared maps.
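
The load, attach, trigger, and read steps can be sketched with a small simulation. This is a toy model in plain Python, not the kernel interface: `ToyKernel`, `load`, `attach`, `detach`, `fire`, the hook name `packet_received`, and the `count_bytes` program are all hypothetical, and the verification step is left out here.

```python
# Toy model of the load/attach/trigger/read cycle; NOT the real kernel API.
# ToyKernel, load, attach, detach, and fire are hypothetical names.
from collections import defaultdict


class ToyKernel:
    def __init__(self):
        self._programs = {}               # handle -> program function
        self._hooks = defaultdict(list)   # hook name -> attached handles
        self._next_handle = 3             # arbitrary starting number

    def load(self, program):
        """Register a program and return a handle (like a file descriptor)."""
        handle = self._next_handle
        self._next_handle += 1
        self._programs[handle] = program
        return handle

    def attach(self, hook, handle):
        self._hooks[hook].append(handle)

    def detach(self, hook, handle):
        self._hooks[hook].remove(handle)

    def fire(self, hook, ctx):
        """Simulate an event: run every program attached to the hook."""
        for handle in self._hooks[hook]:
            self._programs[handle](ctx)


# The shared "map": the program writes to it, user space reads it.
bytes_per_port = defaultdict(int)


def count_bytes(ctx):
    bytes_per_port[ctx["port"]] += ctx["length"]


kernel = ToyKernel()
handle = kernel.load(count_bytes)
kernel.attach("packet_received", handle)

# Events occur; the attached program runs each time.
kernel.fire("packet_received", {"port": 443, "length": 1500})
kernel.fire("packet_received", {"port": 443, "length": 500})
kernel.fire("packet_received", {"port": 53, "length": 80})

# Detaching stops further updates without restarting anything.
kernel.detach("packet_received", handle)
kernel.fire("packet_received", {"port": 53, "length": 80})

print(dict(bytes_per_port))
```

The program never runs on its own schedule; it runs only when an event fires on a hook it is attached to, which is the property the model is meant to show.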

Verifying programs for security purposes

eBPF incorporates strong security measures to ensure programs are executed safely within the kernel. This way, it maintains system integrity and prevents potential vulnerabilities.

When a program is loaded, the kernel's static verifier analyzes the eBPF bytecode and checks various safety properties, including:

Memory safety: The verifier ensures that eBPF programs can’t access or modify kernel memory outside of their designated memory regions. It prevents buffer overflows and other memory-related vulnerabilities.
Termination guarantee: eBPF programs must terminate within a finite number of instructions to avoid infinite loops or resource exhaustion attacks.
Restricted operations: eBPF programs are restricted from disabling interrupts or modifying critical kernel data structures to maintain system stability and security.
Bounded loops: Loop iteration counts and call depths must be bounded to prevent excessive resource consumption or denial-of-service attacks.

If an eBPF program fails to meet these safety requirements, the verifier rejects it, and the program is never loaded or executed in the kernel. This greatly reduces the risk that an eBPF program compromises the integrity or stability of the kernel, and it provides a controlled environment for executing user-defined code.
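
The kind of checks listed above can be illustrated with a toy static verifier for a made-up instruction set. This is a deliberately simplified sketch: the opcodes, the limits `MAX_INSNS` and `BUF_SIZE`, and the helper names are assumptions for illustration, and the real verifier does far more, such as tracking register state along each execution path. Rejecting all backward jumps is the simplest way to guarantee termination; newer kernels also accept loops that the verifier can prove are bounded.

```python
# Toy static verifier for a made-up instruction set; NOT the real eBPF verifier.
# The opcodes, limits, and helper names below are assumptions for illustration.

MAX_INSNS = 16                       # hypothetical program-size limit
BUF_SIZE = 64                        # size of the program's memory region
ALLOWED_HELPERS = {"map_lookup", "map_update"}


def verify(program):
    """Return a list of error strings; an empty list means 'accepted'."""
    errors = []
    if not program:
        return ["empty program"]
    if len(program) > MAX_INSNS:
        errors.append(f"too many instructions: {len(program)}")
    if program[-1][0] != "exit":
        errors.append("program does not end with exit")

    for index, insn in enumerate(program):
        op = insn[0]
        if op in ("load", "store"):
            offset = insn[1]
            if not 0 <= offset < BUF_SIZE:
                errors.append(f"insn {index}: offset {offset} out of bounds")
        elif op == "jump":
            target = insn[1]
            # Forward-only jumps rule out loops, so the program must terminate.
            if not index < target < len(program):
                errors.append(f"insn {index}: bad jump target {target}")
        elif op == "call":
            if insn[1] not in ALLOWED_HELPERS:
                errors.append(f"insn {index}: helper {insn[1]!r} not allowed")
        elif op != "exit":
            errors.append(f"insn {index}: unknown opcode {op!r}")
    return errors


good = [("load", 0), ("call", "map_update"), ("jump", 3), ("exit",)]
bad = [("store", 64), ("jump", 0), ("call", "reboot"), ("exit",)]

print(verify(good))
for error in verify(bad):
    print(error)
```

Note that the check happens before anything runs: the `bad` program is rejected for its out-of-bounds store, its backward jump, and its disallowed helper call without a single instruction being executed.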

Writing and compiling eBPF programs

eBPF programs are usually written in C and then compiled into bytecode instructions using the LLVM compiler infrastructure and its BPF backend. You can do this using different tools and frameworks, such as:

BCC (BPF Compiler Collection) is a popular toolkit for creating kernel tracing and manipulation programs using eBPF. Its Python front end lets you embed an eBPF program written in C inside a Python script, which compiles and loads the program at run time and processes its output.
bpftrace is a high-level tracing language for Linux based on eBPF. It allows developers to write concise scripts and trace kernel and user-space events without low-level C programming.
libbpf is a library from the Linux kernel project that provides a C API for loading and interacting with eBPF programs from user-space applications.
bpftool is a command-line tool for inspecting and managing eBPF programs, maps, and links.

These tools and frameworks abstract some of the complexities of writing and compiling eBPF programs. They provide higher-level interfaces and simplify the development process.
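
As an illustration of what these front ends look like in practice, here is a minimal sketch of a BCC script that counts clone() system calls per process. It assumes the `bcc` Python package and kernel headers are installed and that the script runs with root privileges; the names `PROGRAM`, `count_clone`, `counts`, and `run` are chosen for this example. The eBPF part is still C, held in a string, while Python handles compiling, loading, attaching, and reading the map.

```python
# Sketch of a BCC script: counts clone() calls per process ID.
# Assumes the bcc Python package, kernel headers, and root privileges.

# The eBPF program itself is still written in restricted C.
PROGRAM = r"""
BPF_HASH(counts, u32, u64);

int count_clone(void *ctx) {
    u32 pid = bpf_get_current_pid_tgid() >> 32;
    counts.increment(pid);
    return 0;
}
"""


def run(seconds=5):
    """Compile, load, and attach PROGRAM, then print the map contents."""
    import time
    from bcc import BPF  # imported here so the file loads without bcc

    b = BPF(text=PROGRAM)  # compiles the C text and loads it into the kernel
    b.attach_kprobe(event=b.get_syscall_fnname("clone"), fn_name="count_clone")
    time.sleep(seconds)
    for pid, count in b["counts"].items():
        print(f"pid {pid.value}: {count.value} clone() calls")
```

Calling `run()` as root on a machine with BCC installed should print one line per process that called clone() during the sampling window. Nothing is attached until `run()` is called.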

(Splunk is a proud participant in the OpenTelemetry open standards project, which is why we donated the OpenTelemetry eBPF collector.)

Benefits of using eBPF

eBPF has several benefits that make it an attractive choice for a range of use cases. Here are some of them:

Efficient kernel tracing and profiling

eBPF enables kernel-level tracing and profiling without the need for kernel modules. You can attach small eBPF programs to various kernel hooks for deep visibility into kernel activities and system behavior. This helps with:

Performance analysis
Debugging
Monitoring

Better security

eBPF programs use a virtual instruction set and run within a sandboxed environment inside the kernel. This sandboxing mitigates potential security risks by preventing eBPF programs from directly modifying kernel data structures or performing arbitrary privileged operations.

As a result, loading an eBPF program does not weaken the overall security posture of the system the way an unchecked kernel module could.

Flexible to use

You can dynamically load, unload, and replace eBPF programs without rebooting the system. This flexibility lets you adjust or update monitoring and analysis on the fly while the system keeps running.

Observability and monitoring

eBPF is widely used in observability and monitoring tools. It provides a powerful mechanism for collecting fine-grained data from the kernel and user-space applications, which gives deep insight into system behavior.

While eBPF offers numerous benefits, its adoption and usage require a certain level of expertise and understanding of the underlying kernel and system internals, which can be tricky for new users.

Wrapping up

eBPF is a powerful technology that allows developers to run custom programs safely within the Linux kernel. While it has significant advantages like kernel tracing and monitoring, some level of expertise is required to use it appropriately.

Laiba Siddiqui

Laiba Siddiqui is an SEO writer who loves simplifying complex topics. She has helped companies like Data World, DataCamp, and Rask AI create engaging and informative content for their audiences. You can connect with her on LinkedIn.
