What is the fastest way to compute sin and cos together?
Modern Intel/AMD processors have the FSINCOS instruction for calculating the sine and cosine functions simultaneously. If you need strong optimization, perhaps you should use it.
Here is a small example: http://home.broadpark.no/~alein/fsincos.html
Here is another example (for MSVC): http://www.codeguru.com/forum/showthread.php?t=328669
Here is yet another example (with gcc): http://www.allegro.cc/forums/thread/588470
Hope one of them helps. (I haven't used this instruction myself, sorry.)
Since the instruction is implemented at the processor level, I expect it to be much faster than table lookups.
Edit:
Wikipedia suggests that FSINCOS was introduced with the 387 processors, so you can hardly find a processor which doesn't support it.
Edit:
Intel's documentation states that FSINCOS is just about 5 times slower than FDIV (i.e., floating point division).
Edit:
Please note that not all modern compilers optimize the calculation of sine and cosine into a call to FSINCOS. In particular, my VS 2008 didn't do it that way.
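If your compiler doesn't emit the instruction on its own, one option is to issue it directly. Below is a minimal sketch using GCC-style inline assembly; the function name `sincos_x87` is made up for illustration, and the `#if` guard and the fallback to plain `sin`/`cos` are my own assumptions about how you might want to keep the code portable. It won't work as written with MSVC, which uses a different inline assembly syntax. Also note that FSINCOS only accepts arguments with magnitude below 2^63, so a real implementation would need to handle larger inputs separately.

```c
#include <math.h>

/* Hypothetical helper: computes sin(x) and cos(x) with a single FSINCOS
 * on x86 when built with a GCC-compatible compiler; otherwise it falls
 * back to two ordinary libm calls.
 * Assumes |x| < 2^63, the range FSINCOS accepts. */
static void sincos_x87(double x, double *s, double *c)
{
#if (defined(__i386__) || defined(__x86_64__)) && defined(__GNUC__)
    double sin_val, cos_val;
    /* FSINCOS replaces ST(0) with the sine, then pushes the cosine,
     * so afterwards ST(0) holds cos(x) and ST(1) holds sin(x).
     * "=t" is the top of the x87 stack, "=u" the entry below it;
     * "0" feeds x in through the same register as the first output. */
    __asm__ ("fsincos" : "=t" (cos_val), "=u" (sin_val) : "0" (x));
    *s = sin_val;
    *c = cos_val;
#else
    *s = sin(x);
    *c = cos(x);
#endif
}
```

You would call it as `sincos_x87(angle, &s, &c);`. Whether this actually beats two separate library calls on your hardware is something to measure rather than assume.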
Edit:
The first example link is dead, but there is still a version at the Wayback Machine.
Modern x86 processors have an fsincos instruction which will do exactly what you're asking: calculate sin and cos at the same time. A good optimizing compiler should detect code which calculates sin and cos for the same value and use the fsincos instruction to execute this.
It took some twiddling of compiler flags for this to work, but:
    $ gcc --version
    i686-apple-darwin9-gcc-4.0.1 (GCC) 4.0.1 (Apple Inc. build 5488)
    Copyright (C) 2005 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

    $ cat main.c
    #include <math.h>

    struct Sin_cos {double sin; double cos;};

    struct Sin_cos fsincos(double val) {
      struct Sin_cos r;
      r.sin = sin(val);
      r.cos = cos(val);
      return r;
    }

    $ gcc -c -S -O3 -ffast-math -mfpmath=387 main.c -o main.s

    $ cat main.s
        .text
        .align 4,0x90
    .globl _fsincos
    _fsincos:
        pushl   %ebp
        movl    %esp, %ebp
        fldl    12(%ebp)
        fsincos
        movl    8(%ebp), %eax
        fstpl   8(%eax)
        fstpl   (%eax)
        leave
        ret $4
        .subsections_via_symbols
Tada, it uses the fsincos instruction!