[Mono-dev] Mono.Simd - slower than the normal implementation
Rodrigo Kumpera
kumpera at gmail.com
Sat Nov 15 10:50:13 EST 2008
Hi Alan,
There are a couple of issues with your code; let me go through them:
-Until recently (last night), getters were not accelerated, which causes a
 significant slowdown. I fixed this in r118899. The generated code is not as
 good as it could be, but this will be fixed eventually.
-Setters are still not accelerated. I'll work on this next week, so until
 then your code will suffer.
-Once you use a single non-accelerated method on a Vector variable, all
 operations on it become slower because of how our JIT works: they still
 use SSE instructions, but with a performance penalty.
-Getters and setters are a sign of poorly vectorized code. The last part of
 your unsafe code should use temps for the intermediate results.
-In the unsafe case you should use a Vector4ui store instead of extracting
 each element.
-For the safe case we still lack proper integration with arrays, in the form
 of methods to load vectors from them and store vectors to them.
Your code looks a bit strange, the Vector4ui constructor indexes in
particular. Have you checked that the output of the 3 methods is the same?
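
One way to check this is sketched below. It is only an illustration: Lanes4 is a hypothetical plain C# stand-in for a 4 x uint vector (it is not the Mono.Simd type, so it runs anywhere but says nothing about speed), and ScheduleCheck, FillScalar, FillLanes and SameOutput are made-up names. FillLanes rotates the whole vector through a temp, writes it back with a single store, and then repairs the last lane with scalar memory operations, which is one possible way to avoid per-element getters and setters.

```csharp
using System;

// Hypothetical stand-in for a 4 x uint vector, so the sketch runs without Mono.Simd.
struct Lanes4
{
    public uint X, Y, Z, W;

    public Lanes4(uint x, uint y, uint z, uint w) { X = x; Y = y; Z = z; W = w; }

    // Read four consecutive words starting at index i.
    public static Lanes4 Load(uint[] a, int i)
    {
        return new Lanes4(a[i], a[i + 1], a[i + 2], a[i + 3]);
    }

    // Write all four lanes back in one call (models a single vector store).
    public void Store(uint[] a, int i)
    {
        a[i] = X; a[i + 1] = Y; a[i + 2] = Z; a[i + 3] = W;
    }

    public static Lanes4 operator ^(Lanes4 a, Lanes4 b)
    {
        return new Lanes4(a.X ^ b.X, a.Y ^ b.Y, a.Z ^ b.Z, a.W ^ b.W);
    }

    public static Lanes4 operator |(Lanes4 a, Lanes4 b)
    {
        return new Lanes4(a.X | b.X, a.Y | b.Y, a.Z | b.Z, a.W | b.W);
    }

    public static Lanes4 operator <<(Lanes4 a, int n)
    {
        return new Lanes4(a.X << n, a.Y << n, a.Z << n, a.W << n);
    }

    public static Lanes4 operator >>(Lanes4 a, int n)
    {
        return new Lanes4(a.X >> n, a.Y >> n, a.Z >> n, a.W >> n);
    }
}

static class ScheduleCheck
{
    static uint Rol1(uint v) { return (v << 1) | (v >> 31); }

    // Scalar reference: the same recurrence as the original unrolled loop.
    public static void FillScalar(uint[] buff)
    {
        for (int t = 16; t < buff.Length; t++)
            buff[t] = Rol1(buff[t - 3] ^ buff[t - 8] ^ buff[t - 14] ^ buff[t - 16]);
    }

    // Four words per iteration. Assumes (buff.Length - 16) is a multiple of 4.
    public static void FillLanes(uint[] buff)
    {
        for (int t = 16; t < buff.Length; t += 4)
        {
            // Lane W of the load at t-3 reads buff[t], which is not computed yet.
            uint stale = buff[t];
            Lanes4 e = Lanes4.Load(buff, t - 16) ^ Lanes4.Load(buff, t - 14)
                     ^ Lanes4.Load(buff, t - 8) ^ Lanes4.Load(buff, t - 3);
            Lanes4 r = (e << 1) | (e >> 31);   // rotate all four lanes left by 1
            r.Store(buff, t);
            // Rotation distributes over XOR, so cancel the stale word and
            // fold in the freshly computed buff[t] instead.
            buff[t + 3] ^= Rol1(stale) ^ Rol1(buff[t]);
        }
    }

    // Runs both versions on identical pseudo-random input and compares every word.
    public static bool SameOutput(int seed)
    {
        var rng = new Random(seed);
        uint[] a = new uint[80];
        for (int i = 0; i < a.Length; i++)   // fill all 80 words so stale values are non-zero
            a[i] = (uint)rng.Next() ^ ((uint)rng.Next() << 16);
        uint[] b = (uint[])a.Clone();
        FillScalar(a);
        FillLanes(b);
        for (int i = 0; i < a.Length; i++)
            if (a[i] != b[i]) return false;
        return true;
    }
}
```

Calling ScheduleCheck.SameOutput with a few different seeds should return true each time. The same harness can be pointed at the real safe and unsafe methods by swapping them in for FillLanes.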
I'll work on the Mono.Simd issues next week: getting setters accelerated,
adding some methods to better integrate with arrays, and other things like
element extractors.
Rodrigo
On Sat, Nov 15, 2008 at 12:13 AM, Alan McGovern <alan.mcgovern at gmail.com> wrote:
> I found a bit of code in the SHA1 implementation which I thought was
> ideal for SIMD optimisations. However, unless I resort to unsafe code,
> it's actually substantially slower! I've attached three
> implementations of the method here: the original, the safe SIMD and
> the unsafe SIMD. The runtimes are as follows:
>
> Original: 600ms
> Unsafe Simd: 450ms
> Safe Simd: 1700ms
>
> Also, the method is always called with a uint[] of length 80.
>
> Is this just the wrong place to be using SIMD? It seemed ideal because
> I need 75% fewer XORs. If anyone has any ideas on whether SIMD could
> actually be useful for this case or not, let me know.
>
> Thanks,
> Alan.
>
>
> The original code is:
>
> private static void FillBuff(uint[] buff)
> {
>     uint val;
>     for (int i = 16; i < 80; i += 8)
>     {
>         val = buff[i - 3] ^ buff[i - 8] ^ buff[i - 14] ^ buff[i - 16];
>         buff[i] = (val << 1) | (val >> 31);
>
>         val = buff[i - 2] ^ buff[i - 7] ^ buff[i - 13] ^ buff[i - 15];
>         buff[i + 1] = (val << 1) | (val >> 31);
>
>         val = buff[i - 1] ^ buff[i - 6] ^ buff[i - 12] ^ buff[i - 14];
>         buff[i + 2] = (val << 1) | (val >> 31);
>
>         val = buff[i + 0] ^ buff[i - 5] ^ buff[i - 11] ^ buff[i - 13];
>         buff[i + 3] = (val << 1) | (val >> 31);
>
>         val = buff[i + 1] ^ buff[i - 4] ^ buff[i - 10] ^ buff[i - 12];
>         buff[i + 4] = (val << 1) | (val >> 31);
>
>         val = buff[i + 2] ^ buff[i - 3] ^ buff[i - 9] ^ buff[i - 11];
>         buff[i + 5] = (val << 1) | (val >> 31);
>
>         val = buff[i + 3] ^ buff[i - 2] ^ buff[i - 8] ^ buff[i - 10];
>         buff[i + 6] = (val << 1) | (val >> 31);
>
>         val = buff[i + 4] ^ buff[i - 1] ^ buff[i - 7] ^ buff[i - 9];
>         buff[i + 7] = (val << 1) | (val >> 31);
>     }
> }
>
> The unsafe SIMD code is:
> public unsafe static void FillBuff(uint[] buffb)
> {
>     fixed (uint* buff = buffb) {
>         Vector4ui e;
>         for (int t = 16; t < buffb.Length; t += 4)
>         {
>             e = *((Vector4ui*)&(buff [t-16])) ^
>                 *((Vector4ui*)&(buff [t-14])) ^
>                 *((Vector4ui*)&(buff [t- 8])) ^
>                 *((Vector4ui*)&(buff [t- 3]));
>             e.W ^= buff[t];
>
>             buff[t] = (e.X << 1) | (e.X >> 31);
>             buff[t + 1] = (e.Y << 1) | (e.Y >> 31);
>             buff[t + 2] = (e.Z << 1) | (e.Z >> 31);
>             // The rotate is parenthesized because ^ binds tighter than | in C#.
>             buff[t + 3] = ((e.W << 1) | (e.W >> 31)) ^ ((e.X << 2) | (e.X >> 30));
>         }
>     }
> }
>
> The safe simd code is:
> public static void FillBuff(uint[] buff)
> {
>     Vector4ui e;
>     for (int t = 16; t < buff.Length; t += 4)
>     {
>         e = new Vector4ui (buff [t-16], buff [t-15], buff [t-14], buff [t-13]) ^
>             new Vector4ui (buff [t-14], buff [t-13], buff [t-12], buff [t-11]) ^
>             new Vector4ui (buff [t-8],  buff [t-7],  buff [t-6],  buff [t-5]) ^
>             new Vector4ui (buff [t-3],  buff [t-2],  buff [t-1],  buff [t-0]);
>
>         e.W ^= buff[t];
>         buff[t] = (e.X << 1) | (e.X >> 31);
>         buff[t + 1] = (e.Y << 1) | (e.Y >> 31);
>         buff[t + 2] = (e.Z << 1) | (e.Z >> 31);
>         // The rotate is parenthesized because ^ binds tighter than | in C#.
>         buff[t + 3] = ((e.W << 1) | (e.W >> 31)) ^ ((e.X << 2) | (e.X >> 30));
>     }
> }