Towards an optimal debugging library framework

This article is intended as an overview of debugging techniques and as motivation for a uniform execution representation and setup, so that the appropriate techniques for system-level debugging can be mixed and matched efficiently. The focus is on statically optimizing compiled languages, to keep complexity and scope limited. The author accepts the irony of such statements given “C has no ABI”/many systems in practice having no stable ABI, but reality is simplified in this text for brevity and sanity.

Theory of debugging

A program can be represented as an (often non-deterministic) state machine, such that a bug is a bad transition rule between those states. It is usually assumed that the developer/user knows correct and incorrect (bad) system states and that the code represents a somewhat correct model of the intended semantics. An execution witness is then the set of states and state transitions encountered on a specific program run. If the execution witness shows a “bad state”, then there must be a bug. Thus a debugger can be seen as a query engine over the states and transitions of a buggy execution witness.
A frequent operation is isolating the bug source to deterministic components, where encapsulating non-determinism usually simplifies the process. Concurrent code, in contrast, is tricky to debug, because one must trace multiple execution flows to estimate where the incorrect state originated.
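
To make the state-machine view concrete, here is a minimal, hypothetical C sketch (all names invented for illustration): a traffic light with one bad transition rule, where an assertion over a known invariant turns the first bad state of an execution witness into a detectable failure.

    #include <assert.h>
    #include <stdio.h>

    /* Toy state machine: the light must cycle RED -> GREEN -> YELLOW -> RED. */
    enum light { RED, GREEN, YELLOW };

    /* Transition rule with a bug: YELLOW returns to GREEN instead of RED. */
    static enum light next(enum light s) {
        switch (s) {
        case RED:    return GREEN;
        case GREEN:  return YELLOW;
        case YELLOW: return GREEN; /* bug: should be RED */
        }
        return RED;
    }

    int main(void) {
        enum light s = RED;
        for (int step = 0; step < 6; step++) {
            enum light t = next(s);
            /* Invariant over the execution witness: YELLOW must go to RED. */
            if (s == YELLOW)
                assert(t == RED); /* fires on the first bad transition */
            printf("step %d: %d -> %d\n", step, s, t);
            s = t;
        }
        return 0;
    }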

One can generally categorize methods into the following list (asoul: automate, simplify, observe, understand, learn):

  • automate the process to minimize errors and oversights during debugging, to guard against probabilistic errors, to document the process, etc.
  • simplify and isolate system components and changes over time
  • observe the system while running it to trace state or state changes
  • understand the expected and actual code semantics to the degree necessary
  • learn, extend and ensure which system invariants are satisfied, and how, across the involved systems, for example userspace processes, kernel, build system, compiler, source code, linker, object code, assembly, hardware etc.

with the fundamental constraints being (feel)

  • finding out the correct semantics of system components
  • ensuring deterministic reproducibility of the problem
  • limited time and effort

Common debugging methods to “feel a soul”, with various tradeoffs ranging from compile-time to runtime debugging and from less to more runtime data collection, are:

  • Formal verification as ahead-of-time or compile-time invariant resolution.
  • Validation as runtime invariant checks (see the sketch after this list).
  • Testing as sample-based runtime invariant checks.
  • Stepping via a “classical debugger” to manipulate the task execution context and memory, optionally with source code location translation, via REPL commands, graphically, through scripting, or (rarely) in a freely programmable way.
  • Logging as dumping (a simplification of) state with context relevant to bugs (usually with timestamps in production systems).
  • Tracing as dumping (a simplification of) runtime behavior via temporal relations (usually timestamps).
  • Recording as encoded dumping of a runtime so that it can be replayed with the time and state determinism specified beforehand.
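
As a hedged illustration of the validation bullet above, the following C sketch (the ring buffer and its names are hypothetical) checks runtime invariants at function entry and exit, so a violated invariant aborts close to the bad transition instead of corrupting state silently:

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical ring buffer guarded by runtime invariant checks. */
    struct ring { uint8_t buf[8]; size_t head, len; };

    static void ring_invariant(const struct ring *r) {
        assert(r->head < sizeof r->buf); /* head stays inside the buffer */
        assert(r->len <= sizeof r->buf); /* never more elements than capacity */
    }

    static void ring_push(struct ring *r, uint8_t v) {
        ring_invariant(r); /* precondition */
        r->buf[(r->head + r->len) % sizeof r->buf] = v;
        r->len++;          /* overfilling makes len exceed the capacity */
        ring_invariant(r); /* postcondition catches the overflow */
    }

    int main(void) {
        struct ring r = {0};
        for (int i = 0; i < 9; i++) /* the 9th push violates the invariant */
            ring_push(&r, (uint8_t)i);
        return 0;
    }

Testing, by contrast, would sample such invariant checks over selected inputs instead of checking them on every call.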

Simplification and isolation mean applying the meaning of both words to all potential sub-components, including but not limited to hardware, code versioning (including dependencies), source system, compiler framework and target system. Typical methods are

  • Bisection via git or the actual binaries (a sketch of the underlying search follows this list).
  • Reduction via removal of system parts or trying to reproduce with a (minimal) example.
  • Statistical analysis from collected data on how the problem manifests on given environment(s) etc.
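
The search behind bisection is plain binary search over an ordered history, assuming the behavior flips from good to bad exactly once. A minimal C sketch, where is_bad() is a hypothetical stand-in for “build and test this version” (which git bisect automates in practice):

    #include <stdio.h>

    /* Hypothetical oracle: pretend the regression landed in version 13. */
    static int is_bad(int version) {
        return version >= 13;
    }

    int main(void) {
        int good = 0, bad = 100; /* known-good and known-bad endpoints */
        while (bad - good > 1) {
            int mid = good + (bad - good) / 2;
            if (is_bad(mid))
                bad = mid;  /* first bad version is at or before mid */
            else
                good = mid; /* first bad version is after mid */
        }
        printf("first bad version: %d\n", bad); /* prints 13 */
        return 0;
    }

This needs only about log2(100) ≈ 7 oracle runs instead of up to 100.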

Debugging is domain- and design-specific and relies on core component(s) of the system under debugging to provide the necessary debug functionality. For example, software-based hardware debugging relies on hardware interfaces like JTAG; kernel debugging relies on kernel compilation or configuration and elevated (user) permissions; and userspace debugging relies on process and user permissions, system configuration, or, on POSIX systems, on tracing a child process via ptrace.
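
A minimal sketch of the ptrace mechanism on Linux/x86_64 (other platforms differ; error handling omitted): the child requests tracing, the parent observes the stop after exec, reads the registers, and resumes it.

    #include <stdio.h>
    #include <sys/ptrace.h>
    #include <sys/types.h>
    #include <sys/user.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        pid_t child = fork();
        if (child == 0) {
            /* Child: ask to be traced by the parent, then exec. */
            ptrace(PTRACE_TRACEME, 0, NULL, NULL);
            execl("/bin/true", "true", (char *)NULL);
            _exit(1);
        }
        int status;
        waitpid(child, &status, 0); /* child stops at the exec */
    #ifdef __x86_64__
        struct user_regs_struct regs;
        ptrace(PTRACE_GETREGS, child, NULL, &regs);
        printf("child entry rip: %llx\n", (unsigned long long)regs.rip);
    #endif
        ptrace(PTRACE_CONT, child, NULL, NULL); /* resume until exit */
        waitpid(child, &status, 0);
        return 0;
    }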

Practical methods with tradeoffs

Usually semantics are not “set in stone” and/or do not offer sufficient tradeoffs, so formal verification is rarely an option, aside from using models as design and planning tools. Depending on the domain and environment, problematic behavior of hardware or software components must be more or less 1. avoided and 2. traceable, and various (domain-specific) metrics exist as decision helpers. Very well designed systems explain to users how to debug bugs in functional behavior and in time behavior, with internal and external system resources, up to the degree to which system usage and task execution correctness is intended. Access restrictions limit or rule out stepping, whereas storage limitations limit or rule out logging, tracing and recording.

  1. Hard(ware) problems Hardware design reviews with extensive focus on core components (power, battery, periphery, busses, memory/flash and debug/test infrastructure) to enable debugging and component tests against product and assembly defects are fundamental for software debugging, under the assumption that computing unit(s) and memory unit(s) can be trusted to work reliably enough. Depending on goals, timing channel analysis, formal methods to rule out logic errors, and fuzzing against bad temporal behavior (for example during speculative execution) are common methods, besides various testing strategies based on statistical analysis.
  2. Kernel and platform problems The managing environment the code is running on can vary a lot. As an example, the typical four phases of the Linux boot process (system startup, bootloader stage, kernel stage, and init process) each have their own debugging infrastructure and methods. Generally, working with (introspection-restricted) platforms requires 1. reverse engineering and “trying to find info” and/or 2. “using some tracing tool” and, 3. for open source, “adjusting the source and staring at kernel dumps/using a debugger”. Kernels are rarely designed for tracing, recording or formal verification due to internal complexity, and virtualization is slow and hides many classes of synchronization bugs. Being complex and moving targets, having no library design, design flaws and many performance tradeoffs, they are hard to fuzz test.
  3. Detectable Undefined Behavior TODO make table of tools. A UBSan sketch follows this list.
  4. Undetectable Undefined Behavior Staring at source code, at backend intermediate representation like LLVM IR, or at the resulting assembly, and reducing the problem. Unfortunately, backend optimizers like LLVM do not offer frontend language writers debug APIs and related tooling, because they were not designed for that purpose. A strict-aliasing sketch follows this list.
  5. Miscompilations Tools like Miri or Cerberus run the program in an interpreter, but they may not cover all possible program semantics due to ambiguity and may not be feasible to use, so the best remaining option is usually to reduce the problem.
  6. Memory problems Sanitizers, validators, simulators, tracers (TODO table with configs and costs): 1. out-of-bounds access sanitizer, 2. null pointer dereference sanitizer, 3. type confusion sanitizer, 4. integer overflow sanitizer, 5. use-after-free sanitizer, 6. invalid stack access sanitizer, 7. usage of uninitialized memory sanitizer, 8. data race sanitizer. A use-after-free sketch follows this list.
  7. Resource leaks (Freestanding/Kernel) TODO sanitizers
  8. Freezes (deadlocks, softlocks, signal safety, unbounded loops etc) TODO sanitizer, validator, stepping. A lock-order-inversion sketch follows this list.
  9. Performance problems TODO
  10. Logic problems TODO
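
For the detectable undefined behavior item above, a hedged example of the sanitizer approach (the file name is arbitrary): signed integer overflow is undefined in C, and building with UBSan inserts runtime checks that report it at the faulting line.

    /* Build with, for example:  cc -fsanitize=undefined -g ub.c
       At runtime UBSan reports a "signed integer overflow" at the line below. */
    #include <limits.h>
    #include <stdio.h>

    int main(void) {
        int x = INT_MAX;
        int y = x + 1; /* undefined behavior: signed overflow */
        printf("%d\n", y);
        return 0;
    }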
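For the undetectable undefined behavior item, a classic case that mainstream sanitizers typically do not flag is a strict-aliasing violation; the symptom often appears only at higher optimization levels, which is why one ends up staring at the IR or assembly. A hedged sketch:

    #include <stdio.h>

    /* Writes the same bytes through incompatible pointer types.
       An optimizer relying on strict aliasing may assume *f is unchanged by
       the int store and legally return the stale 1.0f. */
    static float reinterpret(int *i, float *f) {
        *f = 1.0f;
        *i = 42;   /* undefined behavior: incompatible effective type */
        return *f;
    }

    int main(void) {
        union { int i; float f; } u;
        /* Passing both members aliases the same storage. */
        printf("%f\n", (double)reinterpret(&u.i, &u.f));
        return 0;
    }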
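For the memory problems item, a hedged use-after-free example (file name arbitrary): AddressSanitizer poisons freed memory, so the read below aborts with a “heap-use-after-free” report including allocation and free stack traces.

    /* Build with, for example:  cc -fsanitize=address -g uaf.c */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        int *p = malloc(4 * sizeof *p);
        if (!p) return 1;
        p[0] = 1;
        free(p);
        printf("%d\n", p[0]); /* use after free: ASan reports here */
        return 0;
    }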
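For the freezes item, a hedged sketch of the classic ABBA deadlock: two threads take the same two mutexes in opposite orders. ThreadSanitizer may report a lock-order inversion (potential deadlock) even on runs where the threads happen not to block.

    /* Build with, for example:  cc -fsanitize=thread -g abba.c -lpthread */
    #include <pthread.h>

    static pthread_mutex_t a = PTHREAD_MUTEX_INITIALIZER;
    static pthread_mutex_t b = PTHREAD_MUTEX_INITIALIZER;

    static void *t1(void *arg) {
        (void)arg;
        pthread_mutex_lock(&a);
        pthread_mutex_lock(&b); /* order: a, then b */
        pthread_mutex_unlock(&b);
        pthread_mutex_unlock(&a);
        return NULL;
    }

    static void *t2(void *arg) {
        (void)arg;
        pthread_mutex_lock(&b);
        pthread_mutex_lock(&a); /* order: b, then a -> inversion */
        pthread_mutex_unlock(&a);
        pthread_mutex_unlock(&b);
        return NULL;
    }

    int main(void) {
        pthread_t x, y;
        pthread_create(&x, NULL, t1, NULL);
        pthread_create(&y, NULL, t2, NULL);
        pthread_join(x, NULL);
        pthread_join(y, NULL);
        return 0;
    }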