Bitbanging 1D Reversible Automata

2024-12-12 13:41:00
richiejp.com

One Dimensional Reversible Automata

I created a demo for the GFXPrim library. It
implements and displays a nearest-neighbor, one-dimensional, binary
cell automata. Additionally it implements a reversible automata,
which is almost identical except for a small change to make it
reversible. The automata is displayed over time in two dimensions,
time travels from top to bottom. Although in the reversible case
time could be played backwards.

A reversible rendition of rule 73 — A reversible rendition of rule
73

The automata works as follows:

Each cell has a state, which is on or off, black or white,
boolean etc.
At each time step, the state of a cell in the next step is
chosen by a rule.
The rule looks at a cell’s current value and the values of its
left and right neighbors.
There are 2³ = 8
possible state combinations (patterns) for 3 binary cells.
A rule states which patterns result in a black cell in the next
time step.
There are 2⁸ = 256
possible rules. That is, 256 unique combinations of patterns.

So a pattern is a 3 digit binary number, where each digit
corresponds to a cell. The middle digit is the center cell, the high
order bit the left cell, the low order bit the right cell. A rule
can be display by showing a row of patterns and a row of next
states.

Above is rule 110, 0x6e or
01101110. It essentially says to match patterns
110, 101, 011,
010, 001. Where a pattern match results in
the cell being set to 1 at the next time step. If no pattern is
matched or equivalently, an inactive pattern is matched, then the
cell will be set to 0.

Again note that each pattern resembles a 3bit binary number. Also
the values of the active patterns resemble an 8bit binary number. We
can use this to perform efficient matching of the patterns using
binary operations.

Let’s assume our CPU natively operates on 64bit integers (called
words). We can pack a 64 cell automata into a single 64bit
integer. Each bit corresponds to a cell. If a bit is 1
then it is a black cell and 0 for white. In this case
we are using integers as bit fields. We don’t care about the integer
number the bits can represent.

The CPU can perform bitwise operations on all 64bits in parallel
and without branching. This means we can perform a single operation
64 times in parallel.

If we rotate (wrapped >>) all bits to the right by one,
then we get a new integer where the left neighbor of a bit is now in
its position. Likewise if we shift all bits to the left, then we get
an integer representing the right neighbors. This gives us 3
integers where the left, center and right bits are in the same
position. For example, using only 8bits:

left:	0100 1011	`>>`
center:	1001 0110
right:	0010 1101

Each pattern can be represented as a 3bit number, plus a 4th bit
to say whether it is active in a given rule. As we want to operate
on all 64bits at once in the left, right and center bit fields. We
can generate 64bit long masks from the value of each bit in
a given pattern.

So if we have a pattern where the left cell should be one, then
we can create a 64bit mask of all ones. If it should be
zero, then all zeroes. Likewise for the center and right cells. The
masks can be xor’ed (^)
with the corresponding cell fields to show if no match occurred.
That is, if the pattern is one and the cell is zero or the cell is
one and the pattern is zero. We can invert this (~) to
give one when a match occurs.

To see whether all components (left, right, center) of a pattern
matches we can bitwise and (&) them
together. We can then bitwise or
(|) the result of the pattern matches together to
produce the final values.

Rule 193, inverted rule 110 — Rule 193, inverted rule
110

If we wish to operate on an automata larger than 64 cells, then
we can combine multiple integers into an array. After performing the
left and right shifts, we get the high or low bit from the next or
previous integers in the array. Then set the low and high bits of
the right and left bit fields. In other words we chain them together
using the end bits of the left and right bit fields.

For illustration purposes, below is the kernel of the
the automata algorithm.

/* If bit n is 1 then make all bits 1 otherwise 0 */
#define BIT_TO_MAX(b, n) (((b >> n) & 1) * ~0UL)

/* Numeric representation of the current update rule */
static uint8_t rule = 110;

/* Apply the current rule to a 64bit segment of a row */
static inline uint64_t ca1d_rule_apply(uint64_t c_prev,
                                       uint64_t c,
                                       uint64_t c_next,
                                       uint64_t c_prev_step)
{
    int i;
    /* These are wrapping shifts when c_prev == c or c_next == c */
    uint64_t l = (c >> 1) ^ (c_prev  63);
    uint64_t r = (c  1) ^ (c_next >> 63);
    uint64_t c_next_step = 0;

    for (i = 0; i  8; i++) {
        uint64_t active = BIT_TO_MAX(rule, i);
        uint64_t left   = BIT_TO_MAX(i, 2);
        uint64_t center = BIT_TO_MAX(i, 1);
        uint64_t right  = BIT_TO_MAX(i, 0);

        c_next_step |=
            active & ~(left ^ l) & ~(center ^ c) & ~(right ^ r);
    }

    /* The automata becomes reversible when we use c_prev_state */
    return c_next_step ^ c_prev_step;
}

Rule 193 again, but reversible — Rule 193 again, but
reversible

  active & ~(left ^ l) & ~(center ^ c) & ~(right ^ r);
= 0 & ~(0 ^ l) & ~(0 ^ c) & ~(0 ^ r);
= 0.`

   1 & ~(0 ^ l) & ~(0 ^ c) & ~(1 ^ r);
= ~l & ~c & r.

As expected pattern 001 matches
l=0, c=0, r=1. Let’s just list the remaining patterns
or’ed together in their reduced state. Then reduce that further.
Note that the for loop in ca1d_rule_apply
will be unrolled by the compiler when optimising for
performance. It’s also quite clear that c_next_step is
dependant on an expression from the previous iteration or zero. So
all the pattern match results will get or’ed together.

  l & c & ~r | l & ~c & ~r | ~l & c & r | ~l & ~c & r;
= l & ~r | ~l & r;
= l ^ r.

/* Draws a pixel */
static inline void shade_pixel(gp_pixmap *p, gp_coord x, gp_coord y,
                               gp_pixel bg, gp_pixel fg)
{
    gp_pixel px;
    size_t i = (x * (64 * width)) / p->w;
    size_t j = (y * height) / p->h;
    size_t k = 63 - (i & 63);
    uint64_t c = steps[gp_matrix_idx(width, j, i >> 6)];

    c = BIT_TO_MAX(c, k);
    px = (fg & c) | (bg & ~c);

    gp_putpixel_raw(p, x, y, px);
}

However, you may ask why we are even drawing on the CPU in the
first place? This is because GFXPrim targets embedded systems with
no graphics processor. Additionally the CPU may not even support
floating point natively. So integer division may actually be faster
in this case. Still better would be to limit the size of the pixmap
to be 2^x larger
than the dimensions of the automata, where x ∈ ℕ then we can use shifts
instead of division.

Keep your files stored safely and securely with the SanDisk 2TB Extreme Portable SSD. With over 69,505 ratings and an impressive 4.6 out of 5 stars, this product has been purchased over 8K+ times in the past month. At only $129.99, this Amazon’s Choice product is a must-have for secure file storage.

Help keep private content private with the included password protection featuring 256-bit AES hardware encryption. Order now for just $129.99 on Amazon!

Start your free Amazon Prime trial
today and unlock unlimited streaming and more!

Source Link

Support Techcratic

If you find value in Techcratic’s insights and articles, consider supporting us with Bitcoin. Your support helps me, as a solo operator, continue delivering high-quality content while managing all the technical aspects, from server maintenance to blog writing, future updates, and improvements. Support Innovation! Thank you.

Bitcoin Address:

bc1qlszw7elx2qahjwvaryh0tkgg8y68enw30gpvge

Please verify this address before sending funds.