Finishing leftover tasks from Google Summer of Code
Over the past month, I was coordinating and coding the remaining post-GSoC tasks. This mostly covers work around honggfuzz and sanitizers.
honggfuzz ptrace(2) features
I've introduced new ptrace(2) tests verifying attaching to a stopped process. This is an important scenario in debuggers, the ability to call a ptrace(2) operation with the PT_ATTACH argument with a process id (PID) of a process entity that is stopped. In typical circumstances PT_ATTACH causes an executing process to stop and emit SIGSTOP for the tracer. An already stopped process is a special case as we cannot stop it again. Not every UNIX-like kernel can handle this scenario in a sensible way, and the modern solution is to keep the process stopped (rather than e.g... resumed) and emit a new signal SIGSTOP to the debugger (rather than e.g. not emitting anything). There used to be complex workarounds for mainstream kernels in debuggers such as GDB to workaround kernel bugs of attaching to a stopped process.
honggfuzz is a security oriented, feedback-driven, evolutionary, easy-to-use fuzzer with interesting analysis options. This piece of software is developed by a Google employee, however the product is not an official Google software. honggfuzz uses on featured platforms ptrace(2) to monitor crash signals in traced processes. I've implemented a new backed in the fuzzer for NetBSD using its ptrace(2) API. The backend is designed to follow the existing scenarios and features in Linux & Android:
- Attaching to a manually stopped process, optionally spawned by the fuzzer.
- Option to attach to a selected process over its PID.
- Crash instruction decoding with aid of a disassembler (capstone for NetBSD).
- Ability to monitor multiple processes with an arbitrary number of threads.
- Monitor forks(2) and vforks(2) events, however unused in the current fuzzing model.
- Concurrent execution of multiple processes.
- Timeout of long-lasting (hanging?) processes in persistent mode (limit: 0.25[sec]).
There are few missing features:
- Intel BTS - hardware assisted tracing
- Intel PT - hardware assisted tracing
- hp libunwind - unwinding stack of a traced process for better detection of unique crashes
Sanitizers
I've started researching Kernel Address Sanitizer, checking the runtime internals and differences between its version ABIs. My intention was to join efforts with Siddharth (GSoC student) and head with a sanitzier for EuroBSDCon 2018 in Romania. However, Maxime Villard decided to join the efforts a little bit earlier and he managed to get quickly a functional bare version for NetBSD/amd64. In the end we have decided to leave the kASan work to Maxime for now and let Siddharth to work on a kCov (SanitizerCoverage) device. SanCov is a feature of compilers, designed as an aid for fuzzers to ship interesting information from a fuzzing point of view of a number of function calls, comparisons, divisions etc. Successful userland fuzzers (such as AFL, honggfuzz) use this feature as an aid in determining of new code-paths. It's the same with a renowned kernel fuzzer - syzkaller.
While, I'm helping Siddharth to port a kcov(4) device to NetBSD, I've switched to the remaining pending tasks in userland sanitizers. I've managed to switch the sanitizers from syscall(2) and __syscall(2) - indirect system call API - calls to libc routines. The approach of using an indirect generic interface didn't work well in the NetBSD case, as there is the need to handle multiple ABIs, Endians, CPU architectures, and the C language ABI is not a good choice to serialize and deserialize arbitrary function arguments with various types through a generic interface. The discussion on the rationale is perhaps not the proper place, and every low-level C developer is well aware of the problems. It's better to restrict the discussion to the statement that it's not possible (not trivial) to call in a portable way all the needed syscalls, without the aid of per-case auxiliary switches and macros. There are also some cases (such as pipe(2)) when is is not possible to express the system call semantics with syscall(2)/__syscall(2).
I've switched these routines to use internal libc symbols when possible. In the remaining cases I've used a fallback to libc's versions of the routines, with aid of indirect function pointers. I'm trying to detect the addresses of real functions with dlsym(3) calls. In the result, I've switched all the uses of syscall(2) and __syscall(2) and observed no regressions in tests.
I'm also in the process of deduplication of local patches to sanitizers. My current main focus is to finish switching syscall(2) and __syscall(2) to libc routines (patch pending upstream), introduce a new internal version of sysctl(3) that bypasses interceptors (partially merged upstream) and introduces new interceptors for sysctl(3) calls. This is a convoluted process in the internals with the goal to make the sanitizers more reliable across NetBSD targets and manage to sanitize less trivial examples such as rumpkernels. The RUMP code uses internally a modified and private versions of sysctl*() operations and we still must keep the internals in order and properly handle the RUMP code.
Merged commits
The NetBSD sources:
- Merge FreeBSD improvements to the man-page of timespec_get(3)
- Remove unused symbols from sys/sysctl.h
- Add a new ATF ptrace(2) test: child_attach_to_its_stopped_parent
- Add await_stopped() in t_ptrace_wait.h
- Add a new ATF test parent_attach_to_its_stopped_child
- Add a new ATF ptrace(2) test: tracer_attach_to_unrelated_stopped_process
- Drop a duplicate instruction line [libpthread]
- Mark kernel-asan as done (by maxv)
- TODO.sanitizers: Mark switch of syscall(2)/__syscall(2) to libc done
The LLVM sources:
- Introduce new type for inteceptors UINTMAX_T
- Add internal_sysctl() used by FreeBSD, NetBSD, OpenBSD and MacOSX
- Improve portability of internal_sysctl()
- Try to fix internal_sysctl() for MacOSX
- Try to unbreak internal_sysctl() for MacOSX
Summary
I'm personally proud of the success of the reliability of the ptrace(2) backend in the renowned honggfuzz fuzzer. NetBSD was capable to handling all the needed features and support all of them with an issue-free manner. Once, I will address the remaining ptrace(2) issues on my TODO list - the NetBSD kernel will be capable to host more software in a similar fashion, and most importantly a fully featured debugger such as GDB and LLDB, however without the remaining hiccups.
We are also approaching another milestone with the sanitizers' runtime available in the compiler toolchain: sanitizing rumpkernels. It is already possible to execute the rump code against a homegrown uUBSan runtime, but we are heading now to execute the code under the default runtime for the remaining sanitizers (ASan, MSan, TSan).
For the record, it has been reported that kUBSan has been ported from NetBSD to at least two kernels: FreeBSD and XNU.
Plan for the next milestone
I'm in preparation for my visit to EuroBSDCon (Bucharest, Romania) in September and GSoC Mentor Summit & MeetBSDCa in October (California, the U.S.). I intend to rest during this month and still provide added value to the project, porting and researching missing software dedicated for developers. Among others, I'm planning to research the HP libunwind library and if possible, port it to NetBSD.
This work was sponsored by The NetBSD Foundation.
The NetBSD Foundation is a non-profit organization and welcomes any donations to help us continue funding projects and services to the open-source community. Please consider visiting the following URL, and chip in what you can:
http://netbsd.org/donations/#how-to-donate [0 comments]