Posts tagged ‘C’

Signed zeroes and complex literals

IEEE-754 floats have the concept of a “signed zero”; 0.0 has a different bit representation to -0.0. In most cases, -0.0 behaves the same way as 0.0, and it compares equal in arithmetic operations. It becomes more obviously distinct in floating-point operations that involve some form of limiting behaviour. For example, x / 0.0 and x / -0.0 are opposite-signed infinities.

Along with the other IEEE-754 special values, like quiet/signalling NaN and infinities, these sorts of behaviours prevent compilers from making certain mathematical rewrites that would appear to be completely sound in regular arithmetic. (ab)-(a-b) in regular arithmetic over the reals can be written as bab-a, and despite floats having a symmetrical range of positive and negative values (unlike two’s complement integers), this is an invalid transformation in IEEE-754 arithmetic regardless of whether a symmetric rounding mode is in effect. In all rounding modes, the problem occurs at a = b; all finite IEEE-754 floats satisfy x - x = 0.0, therefore a - b and b - a are both 0.0, and negating one to make -0.0 makes it distinct.

In most uses, this is largely a curiosity and has little impact. It can be useful in general when the only information needed from the result of a long calculation is its sign. If the result were to underflow, beyond even the subnormal floats, the resulting sign of the zero would still be able to indicate the correct direction. The signed zero, then, can be thought of as representing the behaviour in the limit.

Where it becomes far more than a curiosity, with major impactful results, is when the signed zero becomes involved in a calculation with a discontinuity at zero. For real numbers, the most obvious of these is atan2(y, x), which is the floating-point version of arctan(y/x)\arctan(y/x) including the quadrant of the rotation. Since the float -0.0 is then interpreted as limx0x\lim_{x\to0^-} x—the limit as xx approaches zero by becoming less negative—there is a natural distinction between atan2(0.0, 0.0) and atan2(0.0, -0.0). Treating these as being calculated within limy0+\lim_{y\to0^+}, it becomes sensible that atan2(0.0, 0.0) would be a zero rotation, while atan2(0.0, -0.0) approximates π\pi.

Mutating Python tuples by C FFI abuse

Tuples are as immutable as things get in Python. They have special treatment being a core builtin, so unlike user-defined classes, there’s no private attribute that you can mutate if you really want to mess around. However, Python is an interpreted language with no baked-in access control, and (almost) everything really is mutable if you try hard enough.

All Python objects at the C level can be addressed by a pointer of type PyObject *, which (for most regular CPython builds) has the elements

struct _object {
    Py_ssize_t ob_refcnt;
    PyTypeObject *ob_type;
}
These objects do what they say on the tin. The different types of C object then have further elements after these, to specialise them.

Tuples are a type of “variable-length” object, in the sense that the same C struct definition is used for all sizes. With the head components inlined, this looks like

struct _tuple {
    Py_ssize_t ob_refcnt;
    PyTypeObject *ob_type;
    Py_ssize_t ob_size;
    PyObject *ob_item[1];
}
The first two arguments are the reference count and the Python type object, as before, then the next is the number of elements in the tuple, and the last one is an array of (pointers to) Python objects. The array is declared as size one, but Python guarantees that enough space will actually have been allocated to hold ob_size elements. The items in the tuple are stored in this array.

From Python space, tuple.__setitem__ raises a TypeError, but in principle at the C level, there is no such immutability. As of Python 3.11, the tupleobject.h header file even has a comment explaining how tuple can be used as a temporary array, and there is a PyTuple_SetItem function implementation, although this has slightly different reference-counting semantics to normal.

We can use the Python C FFI ctypes, and the CPython implementation detail that id returns the memory address of the Python object to cheekily define our own Python-space tuple_setitem. The PyObject type in ctypes doesn’t expose the internal fields of the object, but instead we can change the items we want by casting the integer output of id into an size_t (assuming, as in many systems that pointers have the same size as size_t), and manually assigning into the ob_item array. To be better citizens of Python—rampant pointer abuse aside—we should also increment the reference count of the object we are inserting into the tuple, and decrement the count for the object we are removing. This leaves us with the Python-space function

from ctypes import cast, c_size_t, c_ssize_t, POINTER
from typing import Any

def tuple_setitem(tup: tuple, index: int, new: Any):
    if -len(tup) <= index < 0:
        index = len(tup) + index
    if not 0 <= index < len(tup):
        raise IndexError("tuple index out of range")

    # Decrement the old item's refcount.
    old = tup[index]
    cast(id(old), POINTER(c_ssize_t))[0] -= 1

    # Increment the new item's refcount.
    cast(id(new), POINTER(c_ssize_t))[0] += 1

    # Replace the item.
    cast(id(tup), POINTER(c_size_t))[3 + index] = id(new)
Now let’s try it out:
>>> tup = (1, 2, 3)
>>> tuple_setitem(tup, 0, 7)
>>> tup
(7, 2, 3)

Contiguous memory and finding segfaults in Cython

One of the main advantages of using Cython in the algebraic core of scientific code is fast, non-bounds-checked access to contiguous memory. Of course, when you elide safety-checking, it’s not surprising when you start to get segfaults due to out-of-bounds access and use-after-free bugs in your Cython code.

This post talks a little about why we chose to use raw pointers to access our memory in the upcoming major version of QuTiP rather than other possible methods, and how I tracked down a couple of memory-safety bugs in my code which would have been caught had we been using safer alternatives.

Compiling OpenMP libraries on macOS

In QuTiP we have some optional OpenMP components, which can be used if the C extensions are built with OpenMP support at compile time. Typically this should be achievable just by adding the -fopenmp flag at compile and link time, but unfortunately the llvm clang distribution that Apple ship with macOS is not built with OpenMP support.