ExtensionCheck.X86:
   Use LLVM's Subtarget detection instead of CPUID
   in order to allow cross compilation.
   Unfortunately this detection is only available
   in the C++ interface of LLVM-2.6.

I am afraid that Storable Bool (size 4, alignment 4)
is not compatible with LLVM's treatment of Bool as i1.
   This should be solved with Memory.FirstClass.

Hide LLVM.IsSized with a type family.
   Then we can make Size a constraint of Memory.C,
   remove the IsSized constraints from the Memory.C instances
   and thus get rid of UndecidableInstances in Memory.
   This does not work completely,
   because in the instances for Array and Vector
   we cannot access the results of the Mul class.
   We also cannot access some types from the 'llvm' package
   like 'StructFields' and 'PtrSize'.

The functions in the Vector module were originally designed
for power-of-two vectors.
LLVM now allows arbitrary sizes,
but not all of our functions work properly
with sizes other than powers of two.
We have to check that.

Transposition of m vectors of size n
   m and n must be powers of two.
   The algorithm works as follows:
      v[1,0]     <- interleave (v[0,0])     (v[0,m/2])
      v[1,1]     <- interleave (v[0,1])     (v[0,m/2+1])
      ...
      v[1,m/2-1] <- interleave (v[0,m/2-1]) (v[0,m-1])
      v[2,0]     <- interleave (v[1,0])     (v[1,m/4])
      v[2,1]     <- interleave (v[1,1])     (v[1,m/4+1])
      ...
      v[2,m/4-1] <- interleave (v[1,m/4-1]) (v[1,m/2-1])
      v[log m, 0] <- interleave ...
   Finally chop v[log m, 0] into n chunks of size m.
   The good news:
      'interleave' is supported by SSE's UNPCK instructions.
   The bad news:
      The type-encoded vector size makes the implementation
      in Haskell a little difficult.
      We have to recurse over the types via type classes.
      In order to provide the base case,
      we have to enable overlapping instances.
      Alternatively we could express the exponent
      as a type-level Peano number.
      This way we would have clean handling of the base case.
   It is certainly simpler to fill large vectors with undefined values.
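
The interleave-based transposition described above can be tried out on plain Haskell lists before dealing with the type-level vector sizes. The following is only a sketch under assumptions: lists stand in for LLVM vectors, the names 'interleave', 'stage', 'chunksOf' and 'transposeByInterleave' are hypothetical and not taken from the Vector module, and the number of input vectors m is assumed to be a power of two (the sketch does not check this).

```haskell
-- Sketch only: plain lists stand in for LLVM vectors.
-- All names here are hypothetical, not part of the Vector module.

-- | Interleave two equally long lists:
--   [a0,a1,..] and [b0,b1,..] become [a0,b0,a1,b1,..].
interleave :: [a] -> [a] -> [a]
interleave xs ys = concat (zipWith (\x y -> [x, y]) xs ys)

-- | One stage of the algorithm: with k vectors at hand,
--   vector i is interleaved with vector i + k/2.
--   This halves the number of vectors and doubles their size.
stage :: [[a]] -> [[a]]
stage vs =
   let (front, back) = splitAt (length vs `div` 2) vs
   in  zipWith interleave front back

-- | Cut a list into consecutive chunks of the given size.
chunksOf :: Int -> [a] -> [[a]]
chunksOf _ [] = []
chunksOf k xs =
   let (chunk, rest) = splitAt k xs
   in  chunk : chunksOf k rest

-- | Transpose m lists of size n.
--   Assumption: m is a power of two; other values of m
--   terminate but silently drop data.
transposeByInterleave :: [[a]] -> [[a]]
transposeByInterleave [] = []
transposeByInterleave vs = go vs
   where
      m = length vs
      -- A single vector of size m*n is left: chop it into chunks of size m.
      go [v] = chunksOf m v
      -- Otherwise perform another interleaving stage.
      go ws  = go (stage ws)
```

Loaded into GHCi, 'transposeByInterleave ["ab","cd","ef","gh"]' evaluates to '["aceg","bdfh"]'. The recursion here runs over the length of a run-time list, which is exactly the part that needs type classes (or a type-level Peano exponent) once the vector size is encoded in the type.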