Journey to libbpf 1.0
Libbpf 1.0 release is here!
It has been a pretty long journey to get to libbpf 1.0, so to commemorate this event I decided to write a post that highlights the main features and API changes (especially breaking ones) and shows the large amount of work by the libbpf community that went into improving the usability and robustness of libbpf 1.0.
The idea to clean up and future-proof libbpf, and to shed some cruft that had grown organically over time, was born almost 1.5 years ago. At that time libbpf had been in active development for a while already, and all the main conventions and concepts on how to structure and work with BPF programs and maps, collectively grouped into "BPF objects", were more or less formulated and stabilized, so it felt like a good time to start cleaning up and setting up a good base for the future without dragging along suboptimal early decisions.
The journey started with a public discussion on what should be changed and dropped from libbpf APIs to improve the library's usability and long-term maintainability. The resulting plan for all the backwards-incompatible changes was recorded in the "Libbpf: the road to v1.0" wiki page, an initial 46 tracking issues were filed, and since then the libbpf community has been diligently working towards the libbpf 1.0 release.
During this time, libbpf went through five minor version releases (v0.4 – v0.8). We've developed a transitioning plan and mechanisms for early adoption of new (stricter) behaviors, deprecated lots of APIs, and for many of those we added better and cleaner alternatives.
But it wasn't just about removing stuff. A lot of new features were implemented and added, closing some of the long standing gaps in functionality (e.g., as compared to BCC), making many typical scenarios simpler for end users, as well as adding enough control and flexibility to support more advanced scenarios. A lot of work went into helping users to deal with unavoidable differences between different kernel versions, Clang versions, and sometimes even special Linux distro quirks. Whenever possible this was done in a completely transparent and robust way to free end users from such distracting details.
We also expanded and improved support for various CPU architectures beyond the most popular and actively used in production x86-64 (amd64) architecture.
Of course, lots of bugs were reported and fixed by BPF community as we went along, together making a better BPF loader library for everyone.
Lastly, we've started an effort to improve libbpf documentation. An automated pipeline was set up and we now host libbpf docs here. This is an ongoing effort and could always use more active contributions, of course. I'm sure with time and community effort we'll get to the point where libbpf documentation will be exemplary and a self-sufficient way for BPF newbies to start in an exciting BPF world. But while we are working towards that, we've also started a companion libbpf-bootstrap repository with simple and clean examples of using libbpf to build BPF applications of various kinds, which seems to be quite popular with users.
Libbpf 1.0 marks the release in which all the deprecated APIs and features are physically removed from the code base and new stricter behaviors are mandatory and not an opt-in anymore. This both signifies maturity of libbpf and sets it up for better future functionality with less maintenance burden for future backwards-compatible v1.x libbpf versions.
Oh, and, as you might have noticed already, libbpf now has its own logo to give the project a bit more personality! I hope you like it!
On behalf of libbpf project I'd like to thank all the contributors to the overall libbpf ecosystem, which, besides libbpf itself, includes also:
- BPF selftests;
- BPF CI and related infrastructure;
- bpftool;
- libbpf-bootstrap;
- libbpf-tools project of many libbpf-based observability tools, inspiring multitude of libbpf features;
- libbpf-rs Rust library companion to libbpf;
- libbpf-sys Rust low-level bindings.
Thanks a lot for your continuing help and contributions, without which libbpf wouldn't be where it is today!
To make this not just a celebratory post, in the rest of it I'll try to describe major breaking changes users should be aware of and summarize various new additions to libbpf functionality. Separately, I'll try to highlight libbpf functionality that most people might not be aware of, because it generally does its job very well and stays out of the way, shielding BPF users from pain or handling various quirks of kernels, compilers, and Linux distros. This is not intended as an exhaustive list of features and changes; it's just things I chose to highlight, so please don't be offended if I missed or skipped a favorite feature of yours.
Breaking changes
Error reporting streamlining
Libbpf started out as an internal Linux kernel project and inherited some kernel-specific traits and conventions. One of the most prominent and pervasive throughout the API was a convention to return an error code embedded into the pointer. Almost all pointer-returning APIs in libbpf would, on failure, return an error code (one of many -Exxx values) encoded as a pointer. And while convenient for those in the know, it was way too easy to forget about this and perform a natural but incorrect NULL error check.
For integer-returning APIs, the situation with returning errors wasn't completely straightforward either. Some APIs would return -1 on error and set errno to the actual Exxx error code, just like a typical libc API would do. Other APIs would return -Exxx directly as the return value, usually not caring about setting errno at all. And what's worse, some APIs did both, depending on the specific error. Quite a mess, indeed.
Libbpf 1.0 breaks away from the internal kernel convention of encoding error codes in pointers and sets up clear and consistently followed error reporting rules across all APIs:
- any pointer-returning API returns NULL on error and sets errno to the actual (positive) Exxx error code;
- any other API that might fail returns a (negative) -Exxx error code directly as an int result and sets (positive) errno to Exxx.
This makes it possible to rely consistently on errno for extracting the error code across any type of API. It also makes an intuitive and natural NULL check safe and correct. And for integer-returning APIs, the user can capture the -Exxx error code directly from the return value (so no more useless -1 error codes, confusingly translated as -EPERM) and doesn't need to be too careful about capturing and not clobbering errno.
Note also that any destructor-like API in libbpf (e.g., bpf_object__close(), btf__free(), etc.) always safely accepts NULL pointers, so you don't need to guard them with extra NULL checks. It's a small, but nice, usability improvement that adds up in a big code base.
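To make these rules concrete, here is a minimal, libbpf-free sketch of a tiny API that follows the same conventions, together with a caller. All demo_* names are hypothetical stand-ins invented for illustration (they are not libbpf APIs); only the error-reporting pattern itself is taken from the rules above.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical resource type, standing in for something like a BPF object. */
struct demo_obj {
	int id;
};

/* Pointer-returning API: NULL on error, errno set to a positive Exxx code. */
struct demo_obj *demo_obj_open(int id)
{
	struct demo_obj *obj;

	if (id < 0) {
		errno = EINVAL;
		return NULL;
	}
	obj = calloc(1, sizeof(*obj));
	if (!obj) {
		errno = ENOMEM;
		return NULL;
	}
	obj->id = id;
	return obj;
}

/* Integer-returning API: -Exxx is returned directly, errno is set to Exxx. */
int demo_obj_load(struct demo_obj *obj)
{
	if (!obj) {
		errno = EINVAL;
		return -EINVAL;
	}
	return 0;
}

/* Destructor-like API: accepts NULL, so callers need no extra guard. */
void demo_obj_close(struct demo_obj *obj)
{
	free(obj);
}

/* Caller: a plain NULL check is correct, and the error code is always
 * available, either from errno or straight from the return value.
 */
int demo_run(int id)
{
	struct demo_obj *obj;
	int err;

	obj = demo_obj_open(id);
	if (!obj)
		return -errno;

	err = demo_obj_load(obj);
	demo_obj_close(obj);
	return err;
}
```

The same shape should carry over to real libbpf calls: check pointer results against NULL and read errno, and treat the int result of any other API as the -Exxx code.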
BPF program SEC() annotation streamlining
Libbpf expects BPF programs to be annotated with the SEC() macro, where the string argument passed into SEC() determines the BPF program type and, optionally, additional attach parameters, like the kernel function name to attach to for kprobe programs or the hook type for cgroup programs. This SEC() definition ends up being recorded as the ELF section name.
Initially libbpf only allowed one BPF program for each unique SEC() definition, so you couldn't have two BPF programs with, say, SEC("xdp") annotation. This led to two undesirable consequences:
- such section names were used as unique identifiers for BPF programs. This convention made it all the way to generic tools like bpftool and iproute2, which expected section names to identify a specific BPF program out of potentially many programs in a single BPF ELF object file;
- people started adding random suffixes (e.g., SEC("xdp_my_prog1")) to create a unique identifier both as a convenience and as a workaround for the single BPF program per SEC() limitation.
Both were problematic. First, since those ancient times libbpf started supporting as many BPF programs with the same SEC() annotation as the user needs. Second, non-uniform use of SEC() annotations was harming the overall ecosystem, especially confusing newcomers as to what exact SEC() annotation is correct and appropriate.
Libbpf 1.0 is breaking with this legacy and sloppy approach. Libbpf normalized a set of supported SEC() annotations and doesn't support random suffixes anymore. For the above example, no matter how many XDP programs one has in a BPF object file, all of them should be annotated strictly as SEC("xdp"). Not SEC("xdp1"), or SEC("xdp_prog1"), or SEC("xdp/prog1").
As for unique identification of BPF program for generic tools, please use BPF program name (i.e., its C function name) to identify BPF programs uniquely, if your application or tool isn't doing that already.
Note also, that as of libbpf 1.0, BPF program name is used when pinning BPF program, while previously BPF program's (non-unique) section name was used. This is another breaking change to keep in mind, if you rely on libbpf's pinning logic.
Another important change related to SEC() handling is that a lot of BPF programs that used to always require an attach target (e.g., SEC("kprobe/__x64_sys_bpf")) no longer do, and it's completely supported to specify just the BPF program type (i.e., SEC("kprobe")). In the latter case, BPF program auto-attachment through BPF skeleton or through bpf_program__attach() won't be supported, but otherwise the BPF program will be marked with the correct program type and loaded into the kernel during the BPF object loading phase.
API extensibility through OPTS framework
Libbpf 1.0 got rid of a bunch of APIs that were not extendable without breaking backwards (i.e., newer application dynamically linked against older libbpf) and/or forward (i.e., older application dynamically linked against newer libbpf) compatibility. Such APIs historically used fixed-size structs to pass extra arguments, but we've learned the hard way that this approach doesn't work well.
To solve these problems, we've developed a so-called OPTS-based framework, and a lot of APIs (e.g., bpf_object__open_file()) accept optional struct xxx_opts arguments. Use of the OPTS approach lets libbpf deal with backwards and forward compatibility transparently without relying on cumbersome complexities of ELF symbol versioning or defining entire families of very similar APIs, each differing just slightly from each other to accommodate some new optional argument. Please utilize the LIBBPF_OPTS() macro to simplify instantiation of such OPTS structs, e.g.:
    char log_buf[64 * 1024];
    LIBBPF_OPTS(bpf_object_open_opts, opts,
        .kernel_log_buf = log_buf,
        .kernel_log_size = sizeof(log_buf),
        .kernel_log_level = 1,
    );
    struct bpf_object *obj;

    obj = bpf_object__open_file("/path/to/file.bpf.o", &opts);
    if (!obj)
        /* error handling */
    ...
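For the curious, the sketch below illustrates the general idea that makes size-prefixed option structs extensible. It is not libbpf's actual implementation: struct demo_opts, DEMO_OPTS(), DEMO_OPTS_HAS(), DEMO_OPTS_GET(), and demo_effective_retries() are all hypothetical names. The one thing it borrows from libbpf is that the struct starts with a size field recording how big the caller's version of the struct is, so the library can tell which fields the caller actually knows about.

```c
#include <stddef.h>

/* Hypothetical options struct. The leading size field records how large
 * the struct was when the calling application was compiled.
 */
struct demo_opts {
	size_t sz;      /* size of this struct as seen by the caller */
	int log_level;  /* present since the first version */
	int retries;    /* added in a later version */
};

/* Caller side: zero-initialize all fields, record the struct size, then
 * apply the designated initializers passed by the user.
 */
#define DEMO_OPTS(name, ...) \
	struct demo_opts name = { .sz = sizeof(struct demo_opts), __VA_ARGS__ }

/* Library side: true if the caller's struct is big enough to hold 'field'. */
#define DEMO_OPTS_HAS(opts, field) \
	((opts) && (opts)->sz >= offsetof(struct demo_opts, field) + \
				 sizeof((opts)->field))

/* Library side: read 'field', or use 'fallback' if opts is NULL or the
 * caller was built against an older, smaller version of the struct.
 */
#define DEMO_OPTS_GET(opts, field, fallback) \
	(DEMO_OPTS_HAS(opts, field) ? (opts)->field : (fallback))

/* Example consumer: the default of 3 retries is an arbitrary choice. */
int demo_effective_retries(const struct demo_opts *opts)
{
	return DEMO_OPTS_GET(opts, retries, 3);
}
```

With this shape, an application compiled before the retries field existed passes a smaller sz, and the library quietly falls back to the default instead of reading past the end of the caller's struct.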
Other API clean ups, renames, removals, etc
We used the libbpf 1.0 milestone to clean up the libbpf API surface significantly. We removed, renamed, or consolidated a good chunk of APIs:
- AF_XDP-related parts of libbpf (APIs from xsk.h) were consolidated into libxdp.
- We streamlined naming for getter and setter high-level APIs: getters don't add "get_" prefix (e.g., bpf_program__type()), while setters always have "set_" in the name (e.g., bpf_program__set_type()).
- An entire family of low-level APIs for BPF program loading was consolidated into a single extensible bpf_prog_load() API; similarly for BPF map creation, the bpf_map_create() API superseded a small zoo of six very similar APIs; a similar sort of API normalization was done for other APIs.
- Support for legacy BPF map definitions in SEC("maps") was completely dropped. It was the original and limited way to define BPF maps, with no clean mechanism to provide additional key and value BTF type information associated with a BPF map. We've long since switched to BTF-defined map definitions in SEC(".maps"); please use them instead, if you haven't done so already.
- A whole set of very niche and specialized APIs that were only used by a handful of applications (like Linux's perf tool, or the BCC framework) were moved or removed, giving more freedom to change libbpf internals. In turn, libbpf gained more general APIs that make it possible to implement complex scenarios more generically (e.g., support for custom SEC() annotations) and that cover existing niche use cases well. Typically, users are unlikely to ever notice this, but if you find some APIs missing, please reach out on the BPF mailing list.
To familiarize yourself with finalized set of libbpf 1.0 APIs, please consult the following public headers:
- user-space APIs:
- BPF-side APIs:
Graceful degradation and taking care of kernel differences
While new features are exciting and most visible, I think it's important to appreciate the small and mostly invisible things that libbpf does for users to simplify their life and hide as many quirks and limitations of different (especially older) kernel and Clang versions (and even some gotchas of particular Linux distros) as possible. It is an unsung part of libbpf, and a lot of thought and collective work went into making sure that things that can be abstracted away and handled transparently without user involvement "just work". Here's a list of just some of the stuff that libbpf is doing on behalf of users, so they don't have to.
- bpf_probe_read_{kernel, user}() is automatically "downgraded" to bpf_probe_read() on old kernels, so user should freely use bpf_probe_read_kernel() in their BPF code and not worry about backwards compatibility problems.
- bpf_printk() macro is conservatively using bpf_trace_printk() BPF helper, supported since oldest kernels, if possible, but automatically and transparently switching to newer and more powerful bpf_trace_vprintk() BPF helper, available only on newer kernels, if user needs to log more arguments, so same bpf_printk() macro can be used universally with largest possible backwards kernel compatibility.
- libbpf will automatically set RLIMIT_MEMLOCK to infinity (user can override the limit), but only if kernel is old enough to require that. On newer kernels, RLIMIT_MEMLOCK is not used and so doesn't have to be increased to do anything useful with BPF. RLIMIT_MEMLOCK has been a bump in the road for lots of BPF newbies, and now it's taken care of by libbpf automatically (and only if necessary).
- libbpf is taking care of automatically "sanitizing" BPF object's BTF information so it's not rejected by older kernels. User doesn't have to worry about which BTF features kernel and compiler support and whether they are compatible. BTF will be automatically sanitized to satisfy Linux kernel's level of BTF support. One less thing to worry about.
- libbpf APIs overall are smart enough to drop optional features, if kernel is too old to support them and features themselves are not critically important for correct functioning of BPF applications (e.g., BPF program/map name for bpf_prog_load()/bpf_map_create(); optional BTF/BTF.ext data associated with BPF program/map).
- libbpf will post-process BPF verifier logs to augment it with more useful and relevant information for well-known and common situations (e.g., extending verifier log with information about failed BPF CO-RE relocation). This significantly improves usefulness of verification failure log.
- for cases when it's safe to do so, libbpf will auto-adjust BPF map parameters, if necessary. E.g., BPF ringbuf size will be rounded up to a proper multiple of page size on host system. Or key/value BTF type information will be dropped, if libbpf believes that specific BPF map doesn't support specifying BTF type ID; it will still calculate and provide correct key/value sizes, though.
- on BPF side, macros like BPF_KPROBE(), BPF_PROG(), BPF_USDT(), BPF_KSYSCALL() were developed to make it much easier and ergonomic to write tracing BPF programs. Use of such macro both improves code readability and maintainability, as well as hides some of the nasty kernel- and architecture-specific quirks, so that typical user doesn't have to care or know about them for typical use cases.
These are just a few examples of what goes on behind the scenes in libbpf in the name of better user experience. Of course, not all kernel- or architecture-specific differences can be hidden and handled automatically, but whenever they can be, libbpf strives to do it transparently, efficiently, and correctly.
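As one small illustration of this kind of auto-adjustment, here is a sketch of the ring buffer size rounding mentioned in the list. It assumes that a "proper" size means a power-of-two multiple of the page size; that assumption, along with the function name demo_round_ringbuf_sz(), is mine and is not a description of libbpf's exact code. The page size is passed in as a parameter to keep the sketch self-contained; in a real program it would typically come from sysconf(_SC_PAGE_SIZE).

```c
#include <stddef.h>
#include <stdint.h>

/* Round sz up to the smallest power-of-two multiple of page_sz that is
 * greater than or equal to sz. A zero sz or page_sz is returned unchanged,
 * and so is a size too large to round up without overflowing size_t.
 */
size_t demo_round_ringbuf_sz(size_t sz, size_t page_sz)
{
	size_t adjusted = page_sz;

	if (sz == 0 || page_sz == 0)
		return sz;

	while (adjusted < sz) {
		if (adjusted > SIZE_MAX / 2)
			return sz; /* doubling would overflow; leave as-is */
		adjusted *= 2;
	}
	return adjusted;
}
```

So, with 4096-byte pages, a requested size of 4097 bytes becomes 8192, while a size that is already valid (say, 4096 or 16384) is left alone.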
New functionality
Libbpf is developed in lockstep with Linux kernel BPF support. This was the case before and is going to be the case going forward. Any new BPF kernel feature gets necessary APIs and overall support in libbpf at the time of that feature landing upstream into Linux kernel. This means features like unstable BPF helpers (exposed as extern __ksym functions), safe typed kernel pointers (__kptr and __kptr_ref), and lots of other features are ready to be used through libbpf from the day they are accepted into the kernel, no matter how bleeding edge they are.
This is par for the course for libbpf, so instead of concentrating on all the new kernel functionality supported and exposed through libbpf, I'll highlight new functionality that required additional purely user-space code to be added on the way to libbpf 1.0.
C language constructs support
Libbpf has come a long way since its early days in terms of supporting all the typical C language features one would expect from user-space code base:
- There are no restrictions on number of BPF programs and their SEC() annotation uniqueness.
- Users don't have to __always_inline their C functions (a.k.a. BPF subprograms) anymore, libbpf is smart enough to figure out which ones are used by each BPF program and perform code transformations to make sure BPF verifier gets correct final BPF assembly instructions.
- C global variables are supported as well and provide a tremendous usability improvements for a lot of typical use cases (e.g., configuring BPF program logic from user-space; see below on BPF skeleton as well);
- Static linking of object files is now supported. There is no more restrictions on keeping entire BPF-side logic within a single .bpf.c file. Libbpf implements BPF static linker functionality and allows to compile each individual .bpf.c file separately and then link them all together into a final .bpf.o file. Static linking is normally performed through bpftool gen object command, but it is also available as public APIs for programmatic use in more advanced applications. Static linking means that BPF static libraries are now possible and supported! All this allows to structure user's application in the way that makes sense for long-term maintainability and code reuse, instead of cramming all the code into single text file (even if through #include-ing .c files as if they were C headers). This also means extern subprograms, maps, and variables declarations are supported and are resolved as one'd expect with usual user-space C application.
Beyond that, BPF skeleton improves logistics of deploying, loading and interacting with BPF ELF object files. BPF skeleton allows to embed final BPF object file for ease of distribution, load it at runtime, set and configure all the programs, maps, and global variables from user-space conveniently to prepare BPF object for loading it into the kernel, and afterward to keep interfacing with it at runtime. If you haven't tried BPF skeleton, please consider giving it a go, it quite profoundly changes how one structures and interacts with their BPF-side functionality in BPF application.
It's been a gradual work towards supporting all the typical constructs of user-space C code and by libbpf 1.0 there shouldn't be many things that can't be expressed with BPF-side C, as long as it is supported by BPF verifier.
Improved customizability by user
Lots of small features and APIs were added and implemented to allow users more precise control over which parts of their BPF applications are loaded (for BPF programs) or created (for BPF maps). With the use of SEC("?...") pattern it's possible to opt-out from auto-loading BPF program declaratively in BPF-side C code. Libbpf also provides equivalent programmatic controls with bpf_program__set_autoload() API. For BPF maps, there is equivalent bpf_map__set_autocreate() API to disable automatic creation of BPF map, if that's undesirable at particular host system. And with bpf_program__set_autoattach() it's possible to further control which BPF programs will be automatically attached by BPF skeleton logic, at a granular level.
Look for all the getters and setters for bpf_object, bpf_program, and bpf_map to see what else can be tuned and adjusted at runtime to implement the most portable BPF applications possible.
To improve debuggability, we've also added the ability to flexibly capture the BPF verifier log into user-provided log buffers with the help of kernel_log_buf, kernel_log_size, and kernel_log_level open options, passed into bpf_object__open_file() and bpf_object__open_mem(). To get even more control, each BPF program's log buffer can be set and retrieved with bpf_program__[set_]log_buf() and bpf_program__[set_]log_level() APIs. This control of BPF verifier log output comes in very handy during active development and troubleshooting.
As another example of libbpf allowing more control and customizability, libbpf now supports defining custom SEC() handler callbacks to implement more advanced application-specific BPF program loading scenarios. As a proof of concept, this was used by perf application to support their highly-specialized (and not supported by libbpf 1.0) way of defining custom tracing BPF programs. This functionality is clearly for advanced users, but it's good to have it when the need comes.
Tracing support improvements
Using BPF for tracing kernel and user-space functionality is one of the main BPF and libbpf use cases. Libbpf has significantly boosted its tracing support since its early days.
USDT tracing support. A long-standing feature request was the ability to trace USDT (User Statically-Defined Tracing) probes with libbpf. This is now possible and is an integral part of libbpf, adding support for SEC("usdt")-annotated BPF programs. Check BPF-side API, and also note the BPF_USDT() macro, which lets you declaratively define expected USDT arguments for ease of use and better readability.
Make sure to check a simple USDT BPF example from libbpf-bootstrap (usdt.c and usdt.bpf.c).
Furthermore, thanks to the community, libbpf has support for five CPU architectures from day one:
- x86-64 (amd64);
- x86 (i386);
- s390x;
- ARM64 (aarch64);
- RISC-V (riscv).
Improved old kernel support. Libbpf gained support for attaching kprobes, uprobes, and tracepoints on much older kernels, falling back from the more modern and preferred perf_event_open()-based attachment mechanism to the older tracefs-based one. All these details are transparent to the user and are hidden behind the standard bpf_link interface. Just make sure to call bpf_link__destroy() to detach and clean up the system.
Syscall tracing support. Tracing kernel syscalls is trickier than tracing a typical kernel function due to differences between host architectures and kernel versions, such as the exact naming of kernel functions and the use of a syscall wrapper mechanism (see CONFIG_ARCH_HAS_SYSCALL_WRAPPER if you are curious). Instead of expecting users to figure all this out (e.g., is it __x64_sys_close or __se_sys_close? Is PT_REGS_SYSCALL_REGS() indirection necessary or not?), libbpf now supports special SEC("ksyscall/<syscall>") (and corresponding SEC("kretsyscall") for retprobes) BPF programs. In combination with the BPF_KSYSCALL() macro, this makes tracing syscalls much easier. There are various advanced corner cases which still require user care, though, so please see the documentation for the bpf_program__attach_ksyscall() API for more details. But common scenarios are covered and well supported.
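To give a feel for the naming problem, here is a small sketch that builds the candidate kernel function name for a syscall. It assumes that kernels using the syscall wrapper expose the function as __<arch>_sys_<name> and other kernels as __se_sys_<name>, matching the two names quoted above. The function demo_ksyscall_func_name() is hypothetical, and the caller supplies the architecture prefix (or NULL for the non-wrapper form), whereas libbpf figures this out on its own.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Build the kernel function name to probe for a given syscall.
 * arch_pfx is the architecture prefix used by syscall wrappers (e.g. "x64"),
 * or NULL if the kernel is assumed not to use syscall wrappers.
 * Returns 0 on success or a negative -Exxx code on failure.
 */
int demo_ksyscall_func_name(char *buf, size_t buf_sz,
			    const char *arch_pfx, const char *syscall_name)
{
	int n;

	if (!buf || !syscall_name || strlen(syscall_name) == 0)
		return -EINVAL;

	if (arch_pfx)
		n = snprintf(buf, buf_sz, "__%s_sys_%s", arch_pfx, syscall_name);
	else
		n = snprintf(buf, buf_sz, "__se_sys_%s", syscall_name);

	/* snprintf() reports the untruncated length; treat truncation as an error. */
	if (n < 0 || (size_t)n >= buf_sz)
		return -ENAMETOOLONG;
	return 0;
}
```

With SEC("ksyscall/close") none of this has to live in your application; the sketch only shows the kind of detail that is being hidden.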
Improved uprobes. User-space tracing (uprobes) with libbpf used to require user to do pretty much all the hard work themselves: figuring out exact binary paths, calculating function offsets within target process, manually attaching BPF programs, etc. Not anymore! Libbpf now supports specifying target function by name and will do all the necessary calculations automatically. Additionally, libbpf is smart enough to figure out absolute path to system-wide libraries and binaries, so just annotating your BPF uprobe program as SEC("uprobe/libc.so.6:malloc") will allow to auto-attach to malloc() in your system's C runtime library. You still get full control, if you need to, of course, with bpf_program__attach_uprobe() API. Check uprobe example in libbpf-bootstrap (uprobe.bpf.c and uprobe.c).
Expanded set of supported architectures for kprobes. Thanks to community expertise, libbpf supports quite a variety of different CPU architectures when it comes to low-level tracing and fetching kernel function arguments. For kprobes, here's the current list of architectures libbpf supports and for which it provides tracing helpers:
- x86-64 (amd64);
- x86 (i386);
- arm64 (aarch64);
- arm;
- s390x;
- mips;
- riscv;
- powerpc;
- sparc;
- arc.
Networking support improvements
While tracing is perhaps the area that got the biggest boost in new features, BPF networking support wasn't forgotten either.
Libbpf now provides its own dedicated API for creating TC (Traffic Control) hooks and attaching BPF programs to them. Previously this was only possible through either shelling out to external iproute2 tool or implementing custom netlink-based functionality on your own. Both can be a significant burden for BPF networking applications. Now, with bpf_tc_hook_create(), bpf_tc_hook_destroy(), bpf_tc_attach(), bpf_tc_detach(), and bpf_tc_query() APIs there is no need to take extra dependency on iproute2 just to attach BPF TC programs. All the batteries are included with libbpf.
Similar in naming and spirit bpf_xdp_attach(), bpf_xdp_detach(), and bpf_xdp_query() APIs abstract away netlink-based XDP attachment details. Note that there is also a safer-by-default alternative bpf_link-based XDP attachment API, bpf_program__attach_xdp(), which is a preferred way of performing XDP attachment on newer kernels.
Summary
It's been a long journey for libbpf to get to 1.0, but it was worth it. By taking the time to get here, with community help and involvement, we got a more thought-out, user-friendly, and full-featured library. Libbpf 1.0 now provides a battle-tested foundation for building any kind of BPF application. It also sets a good base for future libbpf releases with more exciting functionality, while maintaining backwards compatibility across minor version releases and keeping maintainability in focus.
A big "Thank you!" goes to hundreds of contributors and bug reporters across entire libbpf family of projects for all your work and support! Congratulations on the long-awaited v1.0!