Sergei Dyshel qyron.private at gmail.com
Tue Sep 14 09:58:15 EDT 2010


What is the current status of this feature? The regalloc2.c file hasn't
been substantially updated in the last couple of years. Some quotes from the
file's comments section:

Focus was on correctness and easy debuggability so *performance is bad*

Bad relative to what? I've tried both schemes, the current allocator vs.
globalra, on some computationally intensive kernels, and the globally
allocated version always seems to run faster, with 2x-3x speedups in the
case of floating-point kernels. Is this the expected behavior?

Only works on amd64

Is this still true? I'm actually interested in the 32-bit x86 and PowerPC
back-ends. How much work is required to make globalra work for them too?
There are relatively few places where globalra is used in 'mini-amd64.c', so
it doesn't seem hard to port these changes to 'mini-x86.c'.

In my project I only need to execute very simple, small computational
kernels with no arguments and no calls to other functions (hence no need to
distinguish between callee-saved and caller-saved registers); only global
static arrays are used.

Any answers or comments would be greatly appreciated!
Sergei Dyshel
