This text focuses on some non-obvious, easy-to-make mistakes that
inexperienced C programmers are likely to make and that tooling cannot
completely cover without going into edge cases relevant to performance:
Pointer semantics
Sequence points
Bit-fields
Compiler flags or implementation choices may provide workarounds for these problems
by preventing optimizations based on the introduced Undefined Behavior (UB).
Review which C compilers, compiler flags, tests, and platforms a piece of code was written for before
reusing it. The
SEI wiki covers these cases
and is more verbose on the rules for how pointers are allowed to be used,
but it does not cover compiler workarounds. Those workarounds become footguns
if the code is built by other compiler implementations or with different compiler flags.
If pointers only need to be compared, convert them to char* first; this lowers the alignment requirement to 1.
To erase type information for generic code, use a void* pointer.
You are responsible for calling a function that provides the following, or for providing it yourself:
Sufficient storage (the pointer must point to a valid object).
Sufficient padding (i.e. within structs).
Correct aliasing.
"Strict Aliasing Rule"
> Dereferencing a pointer that aliases an object that is not of a
> compatible type or one of the other types allowed by
> C 2011 6.5 paragraph 7 is undefined behavior.
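
A minimal sketch of what the rule allows and forbids, assuming a 32-bit IEEE 754 float. The helper names float_bits and first_byte are made up for illustration. Copying the bytes with memcpy and reading through a character type are both permitted; the direct cast is shown only as a comment because it is undefined.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Well-defined: copy the object representation into an object of the
 * target type. Compilers typically reduce this memcpy to a register move. */
static uint32_t float_bits(float f) {
    uint32_t bits;
    static_assert(sizeof bits == sizeof f, "float assumed to be 32 bits wide");
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Well-defined: character types may alias any object (C11 6.5p7). */
static unsigned char first_byte(const float *f) {
    return *(const unsigned char *)f;
}

/* Undefined behavior, shown only as a comment:
 *     uint32_t bits = *(uint32_t *)&f;   // float read through a uint32_t lvalue
 */
```

With -fno-strict-aliasing the commented-out cast usually works, which is exactly the kind of compiler-specific workaround that breaks when the code is built elsewhere.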
What this means in practice:
Each pointer has an associated "provenance": the region it is allowed to point into.
This means that a pointer ptr must uphold
ptr != 0 && &array[0] <= ptr && ptr < &array[len]
for access, with array (holding len elements) being the "memory origin range" on stack or heap.
The one-past-the-end pointer &array[len] may be formed and compared, but not dereferenced.
Pointers must point into the same array (or one past its end) when being used together for arithmetic or relational comparison.
Function arguments of identical pointer types are allowed to have
overlapping provenance regions, unless annotated with restrict (spelled __restrict__ as a GNU extension),
but pointers of different types (character types excepted) are not allowed to have those regions.
Pointer comparison must be done via identical alignments, for
example to compare a pointer against a pointer to 0 (usually
abbreviated via the macro NULL).
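The rules above can be sketched as follows. The names in_bounds and add_into are hypothetical, and in_bounds assumes that ptr was derived from array in the first place, because a relational comparison of pointers into different objects is itself undefined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True if ptr may be dereferenced as one of array[0] .. array[len - 1].
 * Assumption: ptr was derived from array. */
static bool in_bounds(const int *array, size_t len, const int *ptr) {
    return ptr != NULL && array <= ptr && ptr < array + len;
}

/* restrict promises the compiler that dst and src do not overlap, so
 * calling this with overlapping ranges is undefined behavior. */
static void add_into(int *restrict dst, const int *restrict src, size_t n) {
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++) {
        dst[i] += src[i];
    }
}
```

Without restrict, the compiler has to assume that dst and src may overlap because both point to int, and it reloads src[i] after each store.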
Pointer access in practice:
Provenance: the regions a pointer is allowed to point to for access.
Copy bytes with memcpy only between non-overlapping regions (otherwise use memmove).
Correct the alignment of pointers via a temporary, when necessary.
Ensure correct storage and padding size for pointers via sizeof.
Allowed aliasing of pointers.
Non-allowed aliasing of pointers: see example correct_alignment.c
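A short sketch of two of the points above, with hypothetical helper names load_u32 and shift_left: fixing alignment by copying into a temporary, and picking memmove when source and destination overlap. The sizes come from sizeof rather than from hard-coded numbers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a uint32_t from an arbitrary, possibly misaligned byte offset.
 * Casting buf + offset to uint32_t * would break both alignment and
 * aliasing rules; copying into a correctly aligned temporary does not. */
static uint32_t load_u32(const unsigned char *buf, size_t offset) {
    uint32_t tmp;
    assert(buf != NULL);
    memcpy(&tmp, buf + offset, sizeof tmp);
    return tmp;
}

/* Shift the contents of buf left by one element. Source and destination
 * overlap, so memmove is required; memcpy would be undefined here. */
static void shift_left(int *buf, size_t count) {
    if (count > 1) {
        memmove(buf, buf + 1, (count - 1) * sizeof buf[0]);
    }
}
```

The caller still has to guarantee that buf holds at least offset + sizeof(uint32_t) bytes; neither helper can check the provenance of its argument.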
The Exceptions.
Controlling the build system and compiler invocation to opt out.
Clang and GCC have -fno-strict-aliasing; MSVC and tcc do not implement strict-aliasing-based optimizations.
Optimizations, including those based on restrict, can be enabled or disabled per function in MSVC via #pragma optimize("", on/off); GCC and Clang provide their own pragmas (#pragma GCC optimize, #pragma clang optimize on/off).
restrict can also be disabled in all compilers via an empty #define restrict, by using an according optimization level
(typically -O1), or by separating header and implementation and disabling link-time optimizations.
POSIX extensions and Windows in practice enable dynamic linking by casting void * pointers to function pointers and back.
This also means that sizeof (function pointer) == sizeof (void *) must hold, which is not true for microcontrollers
with separate address spaces for code and data or for
CHERI in mixed capability mode/hybrid compilation mode.
Address space annotations are mandatory for this to work, and it is unfortunate that standards do not reflect this as of 2024-04-28.
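A sketch of the pattern, assuming a mainstream POSIX or Windows target. lookup_symbol is a hypothetical stand-in for dlsym or GetProcAddress, so the example runs without a shared library. The static_assert documents the size assumption and is expected to fail on the targets named above.

```c
#include <assert.h>

typedef int (*binary_op)(int, int);

/* Holds on mainstream POSIX and Windows targets only. */
static_assert(sizeof(binary_op) == sizeof(void *),
              "function pointers must fit into void * for dlsym-style APIs");

static int add(int a, int b) { return a + b; }

/* Hypothetical stand-in for dlsym(): hands out a function address as
 * void *. Only the cast matters here, not the lookup. */
static void *lookup_symbol(void) {
    return (void *)add; /* not guaranteed by ISO C, relied upon by POSIX */
}

/* Convert back to the real function pointer type before calling. */
static binary_op load_add(void) {
    return (binary_op)lookup_symbol();
}
```

Compilers in pedantic mode typically warn about both casts, because ISO C does not define conversions between function pointers and object pointers.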
Sequence Points in simple case and with storage lifetime extension.
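A minimal sketch of the simple case only. The function names are hypothetical and the undefined variants are shown as comments. The fix in both cases is to split the expression into separate statements, because each full expression ends in a sequence point.

```c
#include <assert.h>
#include <stddef.h>

/* Undefined behavior, shown only as comments:
 *     i = i++ + 1;      // i modified twice without sequencing
 *     a[i] = i++;       // i read and modified without sequencing
 */

/* Well-defined: store first, then advance the index in its own statement. */
static size_t push(int *stack, size_t top, int value) {
    assert(stack != NULL);
    stack[top] = value;
    top += 1;
    return top;
}

static int counter = 0;
static int next_id(void) { return ++counter; }

/* The evaluation order of function arguments is unspecified, so
 * f(next_id(), next_id()) may pass (1, 2) or (2, 1). Sequencing the
 * calls in separate statements fixes the order. */
static int ordered_difference(void) {
    int first = next_id();
    int second = next_id();
    return second - first;
}
```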
Use bit-fields only in code that does not need to be portable across compilers
and CPUs. Do not make assumptions regarding the layout of structures
with bit-fields, and use static_assert/_Static_assert on every such struct.
Keep bit-fields as simple as possible: prefer not to nest them, or otherwise
static_assert the nested layout as well. Reasons from ISO/IEC 9899:TC3:
> An implementation may allocate any addressable storage unit large enough to hold a
> bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a
> structure shall be packed into adjacent bits of the same unit. If insufficient space remains,
> whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is
> implementation-defined. The order of allocation of bit-fields within a unit (high-order to
> low-order or low-order to high-order) is implementation-defined. The alignment of the
> addressable storage unit is unspecified.
or in other words:
The order of allocation is not specified by the standard.
Which bit is the most significant one is not specified.
The alignment is not specified.
Implementations determine whether bit-fields cross a storage unit boundary.
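A sketch of both approaches, using a made-up register layout (mode in bits 0..2, enabled in bit 3). The static_assert assumes the common ABI behavior where the struct takes the size of one unsigned int. It only catches size changes, since bit order inside the unit cannot be checked at compile time. The mask-and-shift helpers are an alternative not taken from the text above; they give the same layout on every implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Bit-field version: layout is up to the implementation. */
struct flags {
    unsigned int mode : 3;
    unsigned int enabled : 1;
};

/* Assumption: both fields share one unsigned int storage unit. */
static_assert(sizeof(struct flags) == sizeof(unsigned int),
              "unexpected bit-field storage unit size");

/* Portable version: explicit masks define the layout. */
#define MODE_MASK   0x07u
#define ENABLED_BIT 0x08u

static uint8_t pack_flags(unsigned mode, unsigned enabled) {
    return (uint8_t)((mode & MODE_MASK) | (enabled ? ENABLED_BIT : 0u));
}

static unsigned unpack_mode(uint8_t raw) {
    return raw & MODE_MASK;
}

static unsigned unpack_enabled(uint8_t raw) {
    return (raw & ENABLED_BIT) != 0;
}
```

Only the packed uint8_t should cross a file, wire, or hardware boundary; struct flags stays an in-memory convenience.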