Tuesday 28 February 2012

Tricking GCC into TCO

Recursion is good. Recursion is bad. It's good for expressing algorithms clearly, but sometimes it's bad when the machine starts a new stack frame instead of just jumping to near the start of the recursing function.

Can you see the difference between



int collinear(int a_0, int a_1, int b_0, int b_1)
{
        printf("%d/%d %d/%d\n", a_0, a_1, b_0, b_1);
        if (a_1 == 0 || b_1 == 0) {
                return a_1 == b_1;
        }
        return (a_0/a_1 == b_0/b_1 && collinear(a_1, a_0%a_1, b_1, b_0%b_1));
}

and

int collinear(int a_0, int a_1, int b_0, int b_1)
{
        printf("%d/%d %d/%d\n", a_0, a_1, b_0, b_1);
        if (a_1 == 0 || b_1 == 0) {
                return a_1 == b_1;
        }
        return (a_0/a_1 == b_0/b_1 ? collinear(a_1, a_0%a_1, b_1, b_0%b_1) : 0);
}

? GCC emits a call instruction for the former, and the desired jmp instruction for the latter.

It looks like GCC (my copy identifies itself as "gcc (Ubuntu/Linaro 4.4.4-14ubuntu5) 4.4.5") can't reason through the logical AND to conclude that all terminating returns from collinear already return a boolean value. I'll probably keep the logical AND as it seems to express the intent of the test more clearly, and in my application, the recursion will terminate quickly anyway. I think the worst case is O(log n) recursions.

(FWIW this relates to my question on StackOverflow; I'm hacking on my gEDA fork and I need to determine if an endpoint of one net connects to another, both of which may be diagonal.)