···11+1.3.0 2012-07-18
22+ removed non-standard malloc.h from kiss_fft.h
33+44+ moved -lm to end of link line
55+66+ checked various return values
77+88+ converted python Numeric code to NumPy
99+1010+ fixed test of int32_t on 64 bit OS
1111+1212+ added padding in a couple of places to allow SIMD alignment of structs
1313+1414+1.2.9 2010-05-27
1515+ threadsafe ( including OpenMP )
1616+1717+ first edition of kissfft.hh the C++ template fft engine
1818+1919+1.2.8
2020+ Changed memory.h to string.h -- apparently more standard
2121+2222+ Added openmp extensions. This can have fairly linear speedups for larger FFT sizes.
2323+2424+1.2.7
2525+ Shrank the real-fft memory footprint. Thanks to Galen Seitz.
2626+2727+1.2.6 (Nov 14, 2006) The "thanks to GenArts" release.
2828+ Added multi-dimensional real-optimized FFT, see tools/kiss_fftndr
2929+ Thanks go to GenArts, Inc. for sponsoring the development.
3030+3131+1.2.5 (June 27, 2006) The "release for no good reason" release.
3232+ Changed some harmless code to make some compilers' warnings go away.
3333+ Added some more digits to pi -- why not.
3434+ Added kiss_fft_next_fast_size() function to help people decide how much to pad.
3535+ Changed multidimensional test from 8 dimensions to only 3 to avoid testing
3636+ problems with fixed point (sorry Buckaroo Banzai).
3737+3838+1.2.4 (Oct 27, 2005) The "oops, inverse fixed point real fft was borked" release.
3939+ Fixed scaling bug for inverse fixed point real fft -- also fixed test code that should've been failing.
4040+ Thanks to Jean-Marc Valin for bug report.
4141+4242+ Use sys/types.h for more portable types than short,int,long => int16_t,int32_t,int64_t
4343+ If your system does not have these, you may need to define them -- but at least it breaks in a
4444+ loud and easily fixable way -- unlike silently using the wrong size type.
4545+4646+ Hopefully tools/psdpng.c is fixed -- thanks to Steve Kellog for pointing out the weirdness.
4747+4848+1.2.3 (June 25, 2005) The "you want to use WHAT as a sample" release.
4949+ Added ability to use 32 bit fixed point samples -- requires a 64 bit intermediate result, a la 'long long'
5050+5151+ Added ability to do 4 FFTs in parallel by using SSE SIMD instructions. This is accomplished by
5252+ using the __m128 (vector of 4 floats) as kiss_fft_scalar. Define USE_SIMD to use this.
5353+5454+ I know, I know ... this is drifting a bit from the "kiss" principle, but the speed advantages
5555+ make it worth it for some. Also recent gcc makes it SOO easy to use vectors of 4 floats like a POD type.
5656+5757+1.2.2 (May 6, 2005) The Matthew release
5858+ Replaced fixed point division with multiply&shift. Thanks to Jean-Marc Valin for
5959+ discussions regarding. Considerable speedup for fixed-point.
6060+6161+ Corrected overflow protection in real fft routines when using fixed point.
6262+ Finder's Credit goes to Robert Oschler of robodance for pointing me at the bug.
6363+ This also led to the CHECK_OVERFLOW_OP macro.
6464+6565+1.2.1 (April 4, 2004)
6666+ compiles cleanly with just about every -W warning flag under the sun
6767+6868+ reorganized kiss_fft_state so it could be read-only/const. This may be useful for embedded systems
6969+ that are willing to predeclare twiddle factors, factorization.
7070+7171+ Fixed C_MUL,S_MUL on 16-bit platforms.
7272+7373+ tmpbuf will only be allocated if input & output buffers are same
7474+ scratchbuf will only be allocated for ffts that are not multiples of 2,3,5
7575+7676+ NOTE: The tmpbuf,scratchbuf changes may require synchronization code for multi-threaded apps.
7777+7878+7979+1.2 (Feb 23, 2004)
8080+ interface change -- cfg object is forward declaration of struct instead of void*
8181+ This maintains type saftey and lets the compiler warn/error about stupid mistakes.
8282+ (prompted by suggestion from Erik de Castro Lopo)
8383+8484+ small speed improvements
8585+8686+ added psdpng.c -- sample utility that will create png spectrum "waterfalls" from an input file
8787+ ( not terribly useful yet)
8888+8989+1.1.1 (Feb 1, 2004 )
9090+ minor bug fix -- only affects odd rank, in-place, multi-dimensional FFTs
9191+9292+1.1 : (Jan 30,2004)
9393+ split sample_code/ into test/ and tools/
9494+9595+ Removed 2-D fft and added N-D fft (arbitrary)
9696+9797+ modified fftutil.c to allow multi-d FFTs
9898+9999+ Modified core fft routine to allow an input stride via kiss_fft_stride()
100100+ (eased support of multi-D ffts)
101101+102102+ Added fast convolution filtering (FIR filtering using overlap-scrap method, with tail scrap)
103103+104104+ Add kfc.[ch]: the KISS FFT Cache. It takes care of allocs for you ( suggested by Oscar Lesta ).
105105+106106+1.0.1 (Dec 15, 2003)
107107+ fixed bug that occurred when nfft==1. Thanks to Steven Johnson.
108108+109109+1.0 : (Dec 14, 2003)
110110+ changed kiss_fft function from using a single buffer, to two buffers.
111111+ If the same buffer pointer is supplied for both in and out, kiss will
112112+ manage the buffer copies.
113113+114114+ added kiss_fft2d and kiss_fftr as separate source files (declarations in kiss_fft.h )
115115+116116+0.4 :(Nov 4,2003) optimized for radix 2,3,4,5
117117+118118+0.3 :(Oct 28, 2003) woops, version 2 didn't actually factor out any radices other than 2.
119119+ Thanks to Steven Johnson for finding this one.
120120+121121+0.2 :(Oct 27, 2003) added mixed radix, only radix 2,4 optimized versions
122122+123123+0.1 :(May 19 2003) initial release, radix 2 only
+11
COPYING
···11+Copyright (c) 2003-2010 Mark Borgerding
22+33+All rights reserved.
44+55+Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
66+77+ * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
88+ * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
99+ * Neither the author nor the names of any contributors may be used to endorse or promote products derived from this software without specific prior written permission.
1010+1111+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+33
Makefile
···11+KFVER=130
22+33+doc:
44+ @echo "Start by reading the README file. If you want to build and test lots of stuff, do a 'make testall'"
55+ @echo "but be aware that 'make testall' has dependencies that the basic kissfft software does not."
66+ @echo "It is generally unneeded to run these tests yourself, unless you plan on changing the inner workings"
77+ @echo "of kissfft and would like to make use of its regression tests."
88+99+testall:
1010+ # The simd and int32_t types may or may not work on your machine
1111+ make -C test DATATYPE=simd CFLAGADD="$(CFLAGADD)" test
1212+ make -C test DATATYPE=int32_t CFLAGADD="$(CFLAGADD)" test
1313+ make -C test DATATYPE=int16_t CFLAGADD="$(CFLAGADD)" test
1414+ make -C test DATATYPE=float CFLAGADD="$(CFLAGADD)" test
1515+ make -C test DATATYPE=double CFLAGADD="$(CFLAGADD)" test
1616+ echo "all tests passed"
1717+1818+tarball: clean
1919+ hg archive -r v$(KFVER) -t tgz kiss_fft$(KFVER).tar.gz
2020+ hg archive -r v$(KFVER) -t zip kiss_fft$(KFVER).zip
2121+2222+clean:
2323+ cd test && make clean
2424+ cd tools && make clean
2525+ rm -f kiss_fft*.tar.gz *~ *.pyc kiss_fft*.zip
2626+2727+asm: kiss_fft.s
2828+2929+kiss_fft.s: kiss_fft.c kiss_fft.h _kiss_fft_guts.h
3030+ [ -e kiss_fft.s ] && mv kiss_fft.s kiss_fft.s~ || true
3131+ gcc -S kiss_fft.c -O3 -mtune=native -ffast-math -fomit-frame-pointer -unroll-loops -dA -fverbose-asm
3232+ gcc -o kiss_fft_short.s -S kiss_fft.c -O3 -mtune=native -ffast-math -fomit-frame-pointer -dA -fverbose-asm -DFIXED_POINT
3333+ [ -e kiss_fft.s~ ] && diff kiss_fft.s~ kiss_fft.s || true
+134
README
···11+KISS FFT - A mixed-radix Fast Fourier Transform based up on the principle,
22+"Keep It Simple, Stupid."
33+44+ There are many great fft libraries already around. Kiss FFT is not trying
55+to be better than any of them. It only attempts to be a reasonably efficient,
66+moderately useful FFT that can use fixed or floating data types and can be
77+incorporated into someone's C program in a few minutes with trivial licensing.
88+99+USAGE:
1010+1111+ The basic usage for 1-d complex FFT is:
1212+1313+ #include "kiss_fft.h"
1414+1515+ kiss_fft_cfg cfg = kiss_fft_alloc( nfft ,is_inverse_fft ,0,0 );
1616+1717+ while ...
1818+1919+ ... // put kth sample in cx_in[k].r and cx_in[k].i
2020+2121+ kiss_fft( cfg , cx_in , cx_out );
2222+2323+ ... // transformed. DC is in cx_out[0].r and cx_out[0].i
2424+2525+ free(cfg);
2626+2727+ Note: frequency-domain data is stored from dc up to 2pi.
2828+ so cx_out[0] is the dc bin of the FFT
2929+ and cx_out[nfft/2] is the Nyquist bin (if exists)
3030+3131+ Declarations are in "kiss_fft.h", along with a brief description of the
3232+functions you'll need to use.
3333+3434+Code definitions for 1d complex FFTs are in kiss_fft.c.
3535+3636+You can do other cool stuff with the extras you'll find in tools/
3737+3838+ * multi-dimensional FFTs
3939+ * real-optimized FFTs (returns the positive half-spectrum: (nfft/2+1) complex frequency bins)
4040+ * fast convolution FIR filtering (not available for fixed point)
4141+ * spectrum image creation
4242+4343+The core fft and most tools/ code can be compiled to use float, double,
4444+ Q15 short or Q31 samples. The default is float.
4545+4646+4747+BACKGROUND:
4848+4949+ I started coding this because I couldn't find a fixed point FFT that didn't
5050+use assembly code. I started with floating point numbers so I could get the
5151+theory straight before working on fixed point issues. In the end, I had a
5252+little bit of code that could be recompiled easily to do ffts with short, float
5353+or double (other types should be easy too).
5454+5555+ Once I got my FFT working, I was curious about the speed compared to
5656+a well respected and highly optimized fft library. I don't want to criticize
5757+this great library, so let's call it FFT_BRANDX.
5858+During this process, I learned:
5959+6060+ 1. FFT_BRANDX has more than 100K lines of code. The core of kiss_fft is about 500 lines (cpx 1-d).
6161+ 2. It took me an embarrassingly long time to get FFT_BRANDX working.
6262+ 3. A simple program using FFT_BRANDX is 522KB. A similar program using kiss_fft is 18KB (without optimizing for size).
6363+ 4. FFT_BRANDX is roughly twice as fast as KISS FFT in default mode.
6464+6565+ It is wonderful that free, highly optimized libraries like FFT_BRANDX exist.
6666+But such libraries carry a huge burden of complexity necessary to extract every
6767+last bit of performance.
6868+6969+ Sometimes simpler is better, even if it's not better.
7070+7171+FREQUENTLY ASKED QUESTIONS:
7272+ Q: Can I use kissfft in a project with a ___ license?
7373+ A: Yes. See LICENSE below.
7474+7575+ Q: Why don't I get the output I expect?
7676+ A: The two most common causes of this are
7777+ 1) scaling : is there a constant multiplier between what you got and what you want?
7878+ 2) mixed build environment -- all code must be compiled with same preprocessor
7979+ definitions for FIXED_POINT and kiss_fft_scalar
8080+8181+ Q: Will you write/debug my code for me?
8282+ A: Probably not unless you pay me. I am happy to answer pointed and topical questions, but
8383+ I may refer you to a book, a forum, or some other resource.
8484+8585+8686+PERFORMANCE:
8787+ (on Athlon XP 2100+, with gcc 2.96, float data type)
8888+8989+ Kiss performed 10000 1024-pt cpx ffts in .63 s of cpu time.
9090+ For comparison, it took md5sum twice as long to process the same amount of data.
9191+9292+ Transforming 5 minutes of CD quality audio takes less than a second (nfft=1024).
9393+9494+DO NOT:
9595+ ... use Kiss if you need the Fastest Fourier Transform in the World
9696+ ... ask me to add features that will bloat the code
9797+9898+UNDER THE HOOD:
9999+100100+ Kiss FFT uses a time decimation, mixed-radix, out-of-place FFT. If you give it an input buffer
101101+ and output buffer that are the same, a temporary buffer will be created to hold the data.
102102+103103+ No static data is used. The core routines of kiss_fft are thread-safe (but not all of the tools directory).
104104+105105+ No scaling is done for the floating point version (for speed).
106106+ Scaling is done both ways for the fixed-point version (for overflow prevention).
107107+108108+ Optimized butterflies are used for factors 2,3,4, and 5.
109109+110110+ The real (i.e. not complex) optimization code only works for even length ffts. It does two half-length
111111+ FFTs in parallel (packed into real&imag), and then combines them via twiddling. The result is
112112+ nfft/2+1 complex frequency bins from DC to Nyquist. If you don't know what this means, search the web.
113113+114114+ The fast convolution filtering uses the overlap-scrap method, slightly
115115+ modified to put the scrap at the tail.
116116+117117+LICENSE:
118118+ Revised BSD License, see COPYING for verbiage.
119119+ Basically, "free to use&change, give credit where due, no guarantees"
120120+ Note this license is compatible with GPL at one end of the spectrum and closed, commercial software at
121121+ the other end. See http://www.fsf.org/licensing/licenses
122122+123123+ A commercial license is available which removes the requirement for attribution. Contact me for details.
124124+125125+126126+TODO:
127127+ *) Add real optimization for odd length FFTs
128128+ *) Document/revisit the input/output fft scaling
129129+ *) Make doc describing the overlap (tail) scrap fast convolution filtering in kiss_fastfir.c
130130+ *) Test all the ./tools/ code with fixed point (kiss_fastfir.c doesn't work, maybe others)
131131+132132+AUTHOR:
133133+ Mark Borgerding
134134+ Mark@Borgerding.net
+78
README.simd
···11+If you are reading this, it means you think you may be interested in using the SIMD extensions in kissfft
22+to do 4 *separate* FFTs at once.
33+44+Beware! Beyond here there be dragons!
55+66+This API is not easy to use, is not well documented, and breaks the KISS principle.
77+88+99+Still reading? Okay, you may get rewarded for your patience with a considerable speedup
1010+(2-3x) on intel x86 machines with SSE if you are willing to jump through some hoops.
1111+1212+The basic idea is to use the packed 4 float __m128 data type as a scalar element.
1313+This means that the format is pretty convoluted. It performs 4 FFTs per fft call on signals A,B,C,D.
1414+1515+For complex data, the data is interlaced as follows:
1616+rA0,rB0,rC0,rD0, iA0,iB0,iC0,iD0, rA1,rB1,rC1,rD1, iA1,iB1,iC1,iD1 ...
1717+where "rA0" is the real part of the zeroth sample for signal A
1818+1919+Real-only data is laid out:
2020+rA0,rB0,rC0,rD0, rA1,rB1,rC1,rD1, ...
2121+2222+Compile with gcc flags something like
2323+-O3 -mpreferred-stack-boundary=4 -DUSE_SIMD=1 -msse
2424+2525+Be aware of SIMD alignment. This is the most likely cause of segfaults.
2626+The code within kissfft uses scratch variables on the stack.
2727+With SIMD, these must have addresses on 16 byte boundaries.
2828+Search on "SIMD alignment" for more info.
2929+3030+3131+3232+Robin at Divide Concept was kind enough to share his code for formatting to/from the SIMD kissfft.
3333+I have not run it -- use it at your own risk. It appears to do 4xN and Nx4 transpositions
3434+(out of place).
3535+3636+void SSETools::pack128(float* target, float* source, unsigned long size128)
3737+{
3838+ __m128* pDest = (__m128*)target;
3939+ __m128* pDestEnd = pDest+size128;
4040+ float* source0=source;
4141+ float* source1=source0+size128;
4242+ float* source2=source1+size128;
4343+ float* source3=source2+size128;
4444+4545+ while(pDest<pDestEnd)
4646+ {
4747+ *pDest=_mm_set_ps(*source3,*source2,*source1,*source0);
4848+ source0++;
4949+ source1++;
5050+ source2++;
5151+ source3++;
5252+ pDest++;
5353+ }
5454+}
5555+5656+void SSETools::unpack128(float* target, float* source, unsigned long size128)
5757+{
5858+5959+ float* pSrc = source;
6060+ float* pSrcEnd = pSrc+size128*4;
6161+ float* target0=target;
6262+ float* target1=target0+size128;
6363+ float* target2=target1+size128;
6464+ float* target3=target2+size128;
6565+6666+ while(pSrc<pSrcEnd)
6767+ {
6868+ *target0=pSrc[0];
6969+ *target1=pSrc[1];
7070+ *target2=pSrc[2];
7171+ *target3=pSrc[3];
7272+ target0++;
7373+ target1++;
7474+ target2++;
7575+ target3++;
7676+ pSrc+=4;
7777+ }
7878+}
+39
TIPS
···11+Speed:
22+ * If you want to use multiple cores, then compile with -openmp or -fopenmp (see your compiler docs).
33+ Realize that larger FFTs will reap more benefit than smaller FFTs. This generally uses more CPU time, but
44+ less wall time.
55+66+ * experiment with compiler flags
77+ Special thanks to Oscar Lesta. He suggested some compiler flags
88+ for gcc that make a big difference. They shave 10-15% off
99+ execution time on some systems. Try some combination of:
1010+ -march=pentiumpro
1111+ -ffast-math
1212+ -fomit-frame-pointer
1313+1414+ * If the input data has no imaginary component, use the kiss_fftr code under tools/.
1515+ Real ffts are roughly twice as fast as complex.
1616+1717+ * If you can rearrange your code to do 4 FFTs in parallel and you are on a recent Intel or AMD machine,
1818+ then you might want to experiment with the USE_SIMD code. See README.simd
1919+2020+2121+Reducing code size:
2222+ * remove some of the butterflies. There are currently butterflies optimized for radices
2323+ 2,3,4,5. It is worth mentioning that you can still use FFT sizes that contain
2424+ other factors, they just won't be quite as fast. You can decide for yourself
2525+ whether to keep radix 2 or 4. If you do some work in this area, let me
2626+ know what you find.
2727+2828+ * For platforms where ROM/code space is more plentiful than RAM,
2929+ consider creating a hardcoded kiss_fft_state. In other words, decide which
3030+ FFT size(s) you want and make a structure with the correct factors and twiddles.
3131+3232+ * Frank van der Hulst offered numerous suggestions for smaller code size and correct operation
3333+ on embedded targets. "I'm happy to help anyone who is trying to implement KISSFFT on a micro"
3434+3535+ Some of these were rolled into the mainline code base:
3636+ - using long casts to promote intermediate results of short*short multiplication
3737+ - delaying allocation of buffers that are sometimes unused.
3838+ In some cases, it may be desirable to limit capability in order to better suit the target:
3939+ - predefining the twiddle tables for the desired fft size.
+164
_kiss_fft_guts.h
···11+/*
22+Copyright (c) 2003-2010, Mark Borgerding
33+44+All rights reserved.
55+66+Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
77+88+ * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
99+ * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
1010+ * Neither the author nor the names of any contributors may be used to endorse or promote products derived from this software without specific prior written permission.
1111+1212+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
1313+*/
1414+1515+/* kiss_fft.h
1616+ defines kiss_fft_scalar as either short or a float type
1717+ and defines
1818+ typedef struct { kiss_fft_scalar r; kiss_fft_scalar i; }kiss_fft_cpx; */
1919+#include "kiss_fft.h"
2020+#include <limits.h>
2121+2222+#define MAXFACTORS 32
2323+/* e.g. an fft of length 128 has 4 factors
2424+ as far as kissfft is concerned
2525+ 4*4*4*2
2626+ */
2727+2828+struct kiss_fft_state{
2929+ int nfft;
3030+ int inverse;
3131+ int factors[2*MAXFACTORS];
3232+ kiss_fft_cpx twiddles[1];
3333+};
3434+3535+/*
3636+ Explanation of macros dealing with complex math:
3737+3838+ C_MUL(m,a,b) : m = a*b
3939+ C_FIXDIV( c , div ) : if a fixed point impl., c /= div. noop otherwise
4040+ C_SUB( res, a,b) : res = a - b
4141+ C_SUBFROM( res , a) : res -= a
4242+ C_ADDTO( res , a) : res += a
4343+ * */
4444+#ifdef FIXED_POINT
4545+#if (FIXED_POINT==32)
4646+# define FRACBITS 31
4747+# define SAMPPROD int64_t
4848+#define SAMP_MAX 2147483647
4949+#else
5050+# define FRACBITS 15
5151+# define SAMPPROD int32_t
5252+#define SAMP_MAX 32767
5353+#endif
5454+5555+#define SAMP_MIN -SAMP_MAX
5656+5757+#if defined(CHECK_OVERFLOW)
5858+# define CHECK_OVERFLOW_OP(a,op,b) \
5959+ if ( (SAMPPROD)(a) op (SAMPPROD)(b) > SAMP_MAX || (SAMPPROD)(a) op (SAMPPROD)(b) < SAMP_MIN ) { \
6060+ fprintf(stderr,"WARNING:overflow @ " __FILE__ "(%d): (%d " #op" %d) = %ld\n",__LINE__,(a),(b),(SAMPPROD)(a) op (SAMPPROD)(b) ); }
6161+#endif
6262+6363+6464+# define smul(a,b) ( (SAMPPROD)(a)*(b) )
6565+# define sround( x ) (kiss_fft_scalar)( ( (x) + (1<<(FRACBITS-1)) ) >> FRACBITS )
6666+6767+# define S_MUL(a,b) sround( smul(a,b) )
6868+6969+# define C_MUL(m,a,b) \
7070+ do{ (m).r = sround( smul((a).r,(b).r) - smul((a).i,(b).i) ); \
7171+ (m).i = sround( smul((a).r,(b).i) + smul((a).i,(b).r) ); }while(0)
7272+7373+# define DIVSCALAR(x,k) \
7474+ (x) = sround( smul( x, SAMP_MAX/k ) )
7575+7676+# define C_FIXDIV(c,div) \
7777+ do { DIVSCALAR( (c).r , div); \
7878+ DIVSCALAR( (c).i , div); }while (0)
7979+8080+# define C_MULBYSCALAR( c, s ) \
8181+ do{ (c).r = sround( smul( (c).r , s ) ) ;\
8282+ (c).i = sround( smul( (c).i , s ) ) ; }while(0)
8383+8484+#else /* not FIXED_POINT*/
8585+8686+# define S_MUL(a,b) ( (a)*(b) )
8787+#define C_MUL(m,a,b) \
8888+ do{ (m).r = (a).r*(b).r - (a).i*(b).i;\
8989+ (m).i = (a).r*(b).i + (a).i*(b).r; }while(0)
9090+# define C_FIXDIV(c,div) /* NOOP */
9191+# define C_MULBYSCALAR( c, s ) \
9292+ do{ (c).r *= (s);\
9393+ (c).i *= (s); }while(0)
9494+#endif
9595+9696+#ifndef CHECK_OVERFLOW_OP
9797+# define CHECK_OVERFLOW_OP(a,op,b) /* noop */
9898+#endif
9999+100100+#define C_ADD( res, a,b)\
101101+ do { \
102102+ CHECK_OVERFLOW_OP((a).r,+,(b).r)\
103103+ CHECK_OVERFLOW_OP((a).i,+,(b).i)\
104104+ (res).r=(a).r+(b).r; (res).i=(a).i+(b).i; \
105105+ }while(0)
106106+#define C_SUB( res, a,b)\
107107+ do { \
108108+ CHECK_OVERFLOW_OP((a).r,-,(b).r)\
109109+ CHECK_OVERFLOW_OP((a).i,-,(b).i)\
110110+ (res).r=(a).r-(b).r; (res).i=(a).i-(b).i; \
111111+ }while(0)
112112+#define C_ADDTO( res , a)\
113113+ do { \
114114+ CHECK_OVERFLOW_OP((res).r,+,(a).r)\
115115+ CHECK_OVERFLOW_OP((res).i,+,(a).i)\
116116+ (res).r += (a).r; (res).i += (a).i;\
117117+ }while(0)
118118+119119+#define C_SUBFROM( res , a)\
120120+ do {\
121121+ CHECK_OVERFLOW_OP((res).r,-,(a).r)\
122122+ CHECK_OVERFLOW_OP((res).i,-,(a).i)\
123123+ (res).r -= (a).r; (res).i -= (a).i; \
124124+ }while(0)
125125+126126+127127+#ifdef FIXED_POINT
128128+# define KISS_FFT_COS(phase) floor(.5+SAMP_MAX * cos (phase))
129129+# define KISS_FFT_SIN(phase) floor(.5+SAMP_MAX * sin (phase))
130130+# define HALF_OF(x) ((x)>>1)
131131+#elif defined(USE_SIMD)
132132+# define KISS_FFT_COS(phase) _mm_set1_ps( cos(phase) )
133133+# define KISS_FFT_SIN(phase) _mm_set1_ps( sin(phase) )
134134+# define HALF_OF(x) ((x)*_mm_set1_ps(.5))
135135+#else
136136+# define KISS_FFT_COS(phase) (kiss_fft_scalar) cos(phase)
137137+# define KISS_FFT_SIN(phase) (kiss_fft_scalar) sin(phase)
138138+# define HALF_OF(x) ((x)*.5)
139139+#endif
140140+141141+#define kf_cexp(x,phase) \
142142+ do{ \
143143+ (x)->r = KISS_FFT_COS(phase);\
144144+ (x)->i = KISS_FFT_SIN(phase);\
145145+ }while(0)
146146+147147+148148+/* a debugging function */
149149+#define pcpx(c)\
150150+ fprintf(stderr,"%g + %gi\n",(double)((c)->r),(double)((c)->i) )
151151+152152+153153+#ifdef KISS_FFT_USE_ALLOCA
154154+// define this to allow use of alloca instead of malloc for temporary buffers
155155+// Temporary buffers are used in two case:
156156+// 1. FFT sizes that have "bad" factors. i.e. not 2,3 and 5
157157+// 2. "in-place" FFTs. Notice the quotes, since kissfft does not really do an in-place transform.
158158+#include <alloca.h>
159159+#define KISS_FFT_TMP_ALLOC(nbytes) alloca(nbytes)
160160+#define KISS_FFT_TMP_FREE(ptr)
161161+#else
162162+#define KISS_FFT_TMP_ALLOC(nbytes) KISS_FFT_MALLOC(nbytes)
163163+#define KISS_FFT_TMP_FREE(ptr) KISS_FFT_FREE(ptr)
164164+#endif
+408
kiss_fft.c
···11+/*
22+Copyright (c) 2003-2010, Mark Borgerding
33+44+All rights reserved.
55+66+Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
77+88+ * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
99+ * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
1010+ * Neither the author nor the names of any contributors may be used to endorse or promote products derived from this software without specific prior written permission.
1111+1212+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
1313+*/
1414+1515+1616+#include "_kiss_fft_guts.h"
1717+/* The guts header contains all the multiplication and addition macros that are defined for
1818+ fixed or floating point complex numbers. It also delares the kf_ internal functions.
1919+ */
2020+2121+static void kf_bfly2(
2222+ kiss_fft_cpx * Fout,
2323+ const size_t fstride,
2424+ const kiss_fft_cfg st,
2525+ int m
2626+ )
2727+{
2828+ kiss_fft_cpx * Fout2;
2929+ kiss_fft_cpx * tw1 = st->twiddles;
3030+ kiss_fft_cpx t;
3131+ Fout2 = Fout + m;
3232+ do{
3333+ C_FIXDIV(*Fout,2); C_FIXDIV(*Fout2,2);
3434+3535+ C_MUL (t, *Fout2 , *tw1);
3636+ tw1 += fstride;
3737+ C_SUB( *Fout2 , *Fout , t );
3838+ C_ADDTO( *Fout , t );
3939+ ++Fout2;
4040+ ++Fout;
4141+ }while (--m);
4242+}
4343+4444+static void kf_bfly4(
4545+ kiss_fft_cpx * Fout,
4646+ const size_t fstride,
4747+ const kiss_fft_cfg st,
4848+ const size_t m
4949+ )
5050+{
5151+ kiss_fft_cpx *tw1,*tw2,*tw3;
5252+ kiss_fft_cpx scratch[6];
5353+ size_t k=m;
5454+ const size_t m2=2*m;
5555+ const size_t m3=3*m;
5656+5757+5858+ tw3 = tw2 = tw1 = st->twiddles;
5959+6060+ do {
6161+ C_FIXDIV(*Fout,4); C_FIXDIV(Fout[m],4); C_FIXDIV(Fout[m2],4); C_FIXDIV(Fout[m3],4);
6262+6363+ C_MUL(scratch[0],Fout[m] , *tw1 );
6464+ C_MUL(scratch[1],Fout[m2] , *tw2 );
6565+ C_MUL(scratch[2],Fout[m3] , *tw3 );
6666+6767+ C_SUB( scratch[5] , *Fout, scratch[1] );
6868+ C_ADDTO(*Fout, scratch[1]);
6969+ C_ADD( scratch[3] , scratch[0] , scratch[2] );
7070+ C_SUB( scratch[4] , scratch[0] , scratch[2] );
7171+ C_SUB( Fout[m2], *Fout, scratch[3] );
7272+ tw1 += fstride;
7373+ tw2 += fstride*2;
7474+ tw3 += fstride*3;
7575+ C_ADDTO( *Fout , scratch[3] );
7676+7777+ if(st->inverse) {
7878+ Fout[m].r = scratch[5].r - scratch[4].i;
7979+ Fout[m].i = scratch[5].i + scratch[4].r;
8080+ Fout[m3].r = scratch[5].r + scratch[4].i;
8181+ Fout[m3].i = scratch[5].i - scratch[4].r;
8282+ }else{
8383+ Fout[m].r = scratch[5].r + scratch[4].i;
8484+ Fout[m].i = scratch[5].i - scratch[4].r;
8585+ Fout[m3].r = scratch[5].r - scratch[4].i;
8686+ Fout[m3].i = scratch[5].i + scratch[4].r;
8787+ }
8888+ ++Fout;
8989+ }while(--k);
9090+}
9191+9292+static void kf_bfly3(
9393+ kiss_fft_cpx * Fout,
9494+ const size_t fstride,
9595+ const kiss_fft_cfg st,
9696+ size_t m
9797+ )
9898+{
9999+ size_t k=m;
100100+ const size_t m2 = 2*m;
101101+ kiss_fft_cpx *tw1,*tw2;
102102+ kiss_fft_cpx scratch[5];
103103+ kiss_fft_cpx epi3;
104104+ epi3 = st->twiddles[fstride*m];
105105+106106+ tw1=tw2=st->twiddles;
107107+108108+ do{
109109+ C_FIXDIV(*Fout,3); C_FIXDIV(Fout[m],3); C_FIXDIV(Fout[m2],3);
110110+111111+ C_MUL(scratch[1],Fout[m] , *tw1);
112112+ C_MUL(scratch[2],Fout[m2] , *tw2);
113113+114114+ C_ADD(scratch[3],scratch[1],scratch[2]);
115115+ C_SUB(scratch[0],scratch[1],scratch[2]);
116116+ tw1 += fstride;
117117+ tw2 += fstride*2;
118118+119119+ Fout[m].r = Fout->r - HALF_OF(scratch[3].r);
120120+ Fout[m].i = Fout->i - HALF_OF(scratch[3].i);
121121+122122+ C_MULBYSCALAR( scratch[0] , epi3.i );
123123+124124+ C_ADDTO(*Fout,scratch[3]);
125125+126126+ Fout[m2].r = Fout[m].r + scratch[0].i;
127127+ Fout[m2].i = Fout[m].i - scratch[0].r;
128128+129129+ Fout[m].r -= scratch[0].i;
130130+ Fout[m].i += scratch[0].r;
131131+132132+ ++Fout;
133133+ }while(--k);
134134+}
135135+136136+static void kf_bfly5(
137137+ kiss_fft_cpx * Fout,
138138+ const size_t fstride,
139139+ const kiss_fft_cfg st,
140140+ int m
141141+ )
142142+{
143143+ kiss_fft_cpx *Fout0,*Fout1,*Fout2,*Fout3,*Fout4;
144144+ int u;
145145+ kiss_fft_cpx scratch[13];
146146+ kiss_fft_cpx * twiddles = st->twiddles;
147147+ kiss_fft_cpx *tw;
148148+ kiss_fft_cpx ya,yb;
149149+ ya = twiddles[fstride*m];
150150+ yb = twiddles[fstride*2*m];
151151+152152+ Fout0=Fout;
153153+ Fout1=Fout0+m;
154154+ Fout2=Fout0+2*m;
155155+ Fout3=Fout0+3*m;
156156+ Fout4=Fout0+4*m;
157157+158158+ tw=st->twiddles;
159159+ for ( u=0; u<m; ++u ) {
160160+ C_FIXDIV( *Fout0,5); C_FIXDIV( *Fout1,5); C_FIXDIV( *Fout2,5); C_FIXDIV( *Fout3,5); C_FIXDIV( *Fout4,5);
161161+ scratch[0] = *Fout0;
162162+163163+ C_MUL(scratch[1] ,*Fout1, tw[u*fstride]);
164164+ C_MUL(scratch[2] ,*Fout2, tw[2*u*fstride]);
165165+ C_MUL(scratch[3] ,*Fout3, tw[3*u*fstride]);
166166+ C_MUL(scratch[4] ,*Fout4, tw[4*u*fstride]);
167167+168168+ C_ADD( scratch[7],scratch[1],scratch[4]);
169169+ C_SUB( scratch[10],scratch[1],scratch[4]);
170170+ C_ADD( scratch[8],scratch[2],scratch[3]);
171171+ C_SUB( scratch[9],scratch[2],scratch[3]);
172172+173173+ Fout0->r += scratch[7].r + scratch[8].r;
174174+ Fout0->i += scratch[7].i + scratch[8].i;
175175+176176+ scratch[5].r = scratch[0].r + S_MUL(scratch[7].r,ya.r) + S_MUL(scratch[8].r,yb.r);
177177+ scratch[5].i = scratch[0].i + S_MUL(scratch[7].i,ya.r) + S_MUL(scratch[8].i,yb.r);
178178+179179+ scratch[6].r = S_MUL(scratch[10].i,ya.i) + S_MUL(scratch[9].i,yb.i);
180180+ scratch[6].i = -S_MUL(scratch[10].r,ya.i) - S_MUL(scratch[9].r,yb.i);
181181+182182+ C_SUB(*Fout1,scratch[5],scratch[6]);
183183+ C_ADD(*Fout4,scratch[5],scratch[6]);
184184+185185+ scratch[11].r = scratch[0].r + S_MUL(scratch[7].r,yb.r) + S_MUL(scratch[8].r,ya.r);
186186+ scratch[11].i = scratch[0].i + S_MUL(scratch[7].i,yb.r) + S_MUL(scratch[8].i,ya.r);
187187+ scratch[12].r = - S_MUL(scratch[10].i,yb.i) + S_MUL(scratch[9].i,ya.i);
188188+ scratch[12].i = S_MUL(scratch[10].r,yb.i) - S_MUL(scratch[9].r,ya.i);
189189+190190+ C_ADD(*Fout2,scratch[11],scratch[12]);
191191+ C_SUB(*Fout3,scratch[11],scratch[12]);
192192+193193+ ++Fout0;++Fout1;++Fout2;++Fout3;++Fout4;
194194+ }
195195+}
196196+197197+/* perform the butterfly for one stage of a mixed radix FFT */
198198+static void kf_bfly_generic(
199199+ kiss_fft_cpx * Fout,
200200+ const size_t fstride,
201201+ const kiss_fft_cfg st,
202202+ int m,
203203+ int p
204204+ )
205205+{
206206+ int u,k,q1,q;
207207+ kiss_fft_cpx * twiddles = st->twiddles;
208208+ kiss_fft_cpx t;
209209+ int Norig = st->nfft;
210210+211211+ kiss_fft_cpx * scratch = (kiss_fft_cpx*)KISS_FFT_TMP_ALLOC(sizeof(kiss_fft_cpx)*p);
212212+213213+ for ( u=0; u<m; ++u ) {
214214+ k=u;
215215+ for ( q1=0 ; q1<p ; ++q1 ) {
216216+ scratch[q1] = Fout[ k ];
217217+ C_FIXDIV(scratch[q1],p);
218218+ k += m;
219219+ }
220220+221221+ k=u;
222222+ for ( q1=0 ; q1<p ; ++q1 ) {
223223+ int twidx=0;
224224+ Fout[ k ] = scratch[0];
225225+ for (q=1;q<p;++q ) {
226226+ twidx += fstride * k;
227227+ if (twidx>=Norig) twidx-=Norig;
228228+ C_MUL(t,scratch[q] , twiddles[twidx] );
229229+ C_ADDTO( Fout[ k ] ,t);
230230+ }
231231+ k += m;
232232+ }
233233+ }
234234+ KISS_FFT_TMP_FREE(scratch);
235235+}
236236+237237+static
238238+void kf_work(
239239+ kiss_fft_cpx * Fout,
240240+ const kiss_fft_cpx * f,
241241+ const size_t fstride,
242242+ int in_stride,
243243+ int * factors,
244244+ const kiss_fft_cfg st
245245+ )
246246+{
247247+ kiss_fft_cpx * Fout_beg=Fout;
248248+ const int p=*factors++; /* the radix */
249249+ const int m=*factors++; /* stage's fft length/p */
250250+ const kiss_fft_cpx * Fout_end = Fout + p*m;
251251+252252+#ifdef _OPENMP
253253+ // use openmp extensions at the
254254+ // top-level (not recursive)
255255+ if (fstride==1 && p<=5)
256256+ {
257257+ int k;
258258+259259+ // execute the p different work units in different threads
260260+# pragma omp parallel for
261261+ for (k=0;k<p;++k)
262262+ kf_work( Fout +k*m, f+ fstride*in_stride*k,fstride*p,in_stride,factors,st);
263263+ // all threads have joined by this point
264264+265265+ switch (p) {
266266+ case 2: kf_bfly2(Fout,fstride,st,m); break;
267267+ case 3: kf_bfly3(Fout,fstride,st,m); break;
268268+ case 4: kf_bfly4(Fout,fstride,st,m); break;
269269+ case 5: kf_bfly5(Fout,fstride,st,m); break;
270270+ default: kf_bfly_generic(Fout,fstride,st,m,p); break;
271271+ }
272272+ return;
273273+ }
274274+#endif
275275+276276+ if (m==1) {
277277+ do{
278278+ *Fout = *f;
279279+ f += fstride*in_stride;
280280+ }while(++Fout != Fout_end );
281281+ }else{
282282+ do{
283283+ // recursive call:
284284+ // DFT of size m*p performed by doing
285285+ // p instances of smaller DFTs of size m,
286286+ // each one takes a decimated version of the input
287287+ kf_work( Fout , f, fstride*p, in_stride, factors,st);
288288+ f += fstride*in_stride;
289289+ }while( (Fout += m) != Fout_end );
290290+ }
291291+292292+ Fout=Fout_beg;
293293+294294+ // recombine the p smaller DFTs
295295+ switch (p) {
296296+ case 2: kf_bfly2(Fout,fstride,st,m); break;
297297+ case 3: kf_bfly3(Fout,fstride,st,m); break;
298298+ case 4: kf_bfly4(Fout,fstride,st,m); break;
299299+ case 5: kf_bfly5(Fout,fstride,st,m); break;
300300+ default: kf_bfly_generic(Fout,fstride,st,m,p); break;
301301+ }
302302+}
303303+304304+/* facbuf is populated by p1,m1,p2,m2, ...
305305+ where
306306+ p[i] * m[i] = m[i-1]
307307+ m0 = n */
308308+static
309309+void kf_factor(int n,int * facbuf)
310310+{
311311+ int p=4;
312312+ double floor_sqrt;
313313+ floor_sqrt = floor( sqrt((double)n) );
314314+315315+ /*factor out powers of 4, powers of 2, then any remaining primes */
316316+ do {
317317+ while (n % p) {
318318+ switch (p) {
319319+ case 4: p = 2; break;
320320+ case 2: p = 3; break;
321321+ default: p += 2; break;
322322+ }
323323+ if (p > floor_sqrt)
324324+ p = n; /* no more factors, skip to end */
325325+ }
326326+ n /= p;
327327+ *facbuf++ = p;
328328+ *facbuf++ = n;
329329+ } while (n > 1);
330330+}
331331+332332+/*
333333+ *
334334+ * User-callable function to allocate all necessary storage space for the fft.
335335+ *
336336+ * The return value is a contiguous block of memory, allocated with malloc. As such,
337337+ * It can be freed with free(), rather than a kiss_fft-specific function.
338338+ * */
339339+kiss_fft_cfg kiss_fft_alloc(int nfft,int inverse_fft,void * mem,size_t * lenmem )
340340+{
341341+ kiss_fft_cfg st=NULL;
342342+ size_t memneeded = sizeof(struct kiss_fft_state)
343343+ + sizeof(kiss_fft_cpx)*(nfft-1); /* twiddle factors*/
344344+345345+ if ( lenmem==NULL ) {
346346+ st = ( kiss_fft_cfg)KISS_FFT_MALLOC( memneeded );
347347+ }else{
348348+ if (mem != NULL && *lenmem >= memneeded)
349349+ st = (kiss_fft_cfg)mem;
350350+ *lenmem = memneeded;
351351+ }
352352+ if (st) {
353353+ int i;
354354+ st->nfft=nfft;
355355+ st->inverse = inverse_fft;
356356+357357+ for (i=0;i<nfft;++i) {
358358+ const double pi=3.141592653589793238462643383279502884197169399375105820974944;
359359+ double phase = -2*pi*i / nfft;
360360+ if (st->inverse)
361361+ phase *= -1;
362362+ kf_cexp(st->twiddles+i, phase );
363363+ }
364364+365365+ kf_factor(nfft,st->factors);
366366+ }
367367+ return st;
368368+}
369369+370370+371371+void kiss_fft_stride(kiss_fft_cfg st,const kiss_fft_cpx *fin,kiss_fft_cpx *fout,int in_stride)
372372+{
373373+ if (fin == fout) {
374374+ //NOTE: this is not really an in-place FFT algorithm.
375375+ //It just performs an out-of-place FFT into a temp buffer
376376+ kiss_fft_cpx * tmpbuf = (kiss_fft_cpx*)KISS_FFT_TMP_ALLOC( sizeof(kiss_fft_cpx)*st->nfft);
377377+ kf_work(tmpbuf,fin,1,in_stride, st->factors,st);
378378+ memcpy(fout,tmpbuf,sizeof(kiss_fft_cpx)*st->nfft);
379379+ KISS_FFT_TMP_FREE(tmpbuf);
380380+ }else{
381381+ kf_work( fout, fin, 1,in_stride, st->factors,st );
382382+ }
383383+}
384384+385385+void kiss_fft(kiss_fft_cfg cfg,const kiss_fft_cpx *fin,kiss_fft_cpx *fout)
386386+{
387387+ kiss_fft_stride(cfg,fin,fout,1);
388388+}
389389+390390+391391+void kiss_fft_cleanup(void)
392392+{
393393+ // nothing needed any more
394394+}
395395+396396+int kiss_fft_next_fast_size(int n)
397397+{
398398+ while(1) {
399399+ int m=n;
400400+ while ( (m%2) == 0 ) m/=2;
401401+ while ( (m%3) == 0 ) m/=3;
402402+ while ( (m%5) == 0 ) m/=5;
403403+ if (m<=1)
404404+ break; /* n is completely factorable by twos, threes, and fives */
405405+ n++;
406406+ }
407407+ return n;
408408+}
+124
kiss_fft.h
···11+#ifndef KISS_FFT_H
22+#define KISS_FFT_H
33+44+#include <stdlib.h>
55+#include <stdio.h>
66+#include <math.h>
77+#include <string.h>
88+99+#ifdef __cplusplus
1010+extern "C" {
1111+#endif
1212+1313+/*
1414+ ATTENTION!
1515+ If you would like a :
1616+ -- a utility that will handle the caching of fft objects
1717+ -- real-only (no imaginary time component ) FFT
1818+ -- a multi-dimensional FFT
1919+ -- a command-line utility to perform ffts
2020+ -- a command-line utility to perform fast-convolution filtering
2121+2222+ Then see kfc.h kiss_fftr.h kiss_fftnd.h fftutil.c kiss_fastfir.c
2323+ in the tools/ directory.
2424+*/
2525+2626+#ifdef USE_SIMD
2727+# include <xmmintrin.h>
2828+# define kiss_fft_scalar __m128
2929+#define KISS_FFT_MALLOC(nbytes) _mm_malloc(nbytes,16)
3030+#define KISS_FFT_FREE _mm_free
3131+#else
3232+#define KISS_FFT_MALLOC malloc
3333+#define KISS_FFT_FREE free
3434+#endif
3535+3636+3737+#ifdef FIXED_POINT
3838+#include <sys/types.h>
3939+# if (FIXED_POINT == 32)
4040+# define kiss_fft_scalar int32_t
4141+# else
4242+# define kiss_fft_scalar int16_t
4343+# endif
4444+#else
4545+# ifndef kiss_fft_scalar
4646+/* default is float */
4747+# define kiss_fft_scalar float
4848+# endif
4949+#endif
5050+5151+typedef struct {
5252+ kiss_fft_scalar r;
5353+ kiss_fft_scalar i;
5454+}kiss_fft_cpx;
5555+5656+typedef struct kiss_fft_state* kiss_fft_cfg;
5757+5858+/*
5959+ * kiss_fft_alloc
6060+ *
6161+ * Initialize a FFT (or IFFT) algorithm's cfg/state buffer.
6262+ *
6363+ * typical usage: kiss_fft_cfg mycfg=kiss_fft_alloc(1024,0,NULL,NULL);
6464+ *
6565+ * The return value from fft_alloc is a cfg buffer used internally
6666+ * by the fft routine or NULL.
6767+ *
6868+ * If lenmem is NULL, then kiss_fft_alloc will allocate a cfg buffer using malloc.
6969+ * The returned value should be free()d when done to avoid memory leaks.
7070+ *
7171+ * The state can be placed in a user supplied buffer 'mem':
7272+ * If lenmem is not NULL and mem is not NULL and *lenmem is large enough,
7373+ * then the function places the cfg in mem and the size used in *lenmem
7474+ * and returns mem.
7575+ *
7676+ * If lenmem is not NULL and ( mem is NULL or *lenmem is not large enough),
7777+ * then the function returns NULL and places the minimum cfg
7878+ * buffer size in *lenmem.
7979+ * */
8080+8181+kiss_fft_cfg kiss_fft_alloc(int nfft,int inverse_fft,void * mem,size_t * lenmem);
8282+8383+/*
8484+ * kiss_fft(cfg,in_out_buf)
8585+ *
8686+ * Perform an FFT on a complex input buffer.
8787+ * for a forward FFT,
8888+ * fin should be f[0] , f[1] , ... ,f[nfft-1]
8989+ * fout will be F[0] , F[1] , ... ,F[nfft-1]
9090+ * Note that each element is complex and can be accessed like
9191+ f[k].r and f[k].i
9292+ * */
9393+void kiss_fft(kiss_fft_cfg cfg,const kiss_fft_cpx *fin,kiss_fft_cpx *fout);
9494+9595+/*
9696+ A more generic version of the above function. It reads its input from every Nth sample.
9797+ * */
9898+void kiss_fft_stride(kiss_fft_cfg cfg,const kiss_fft_cpx *fin,kiss_fft_cpx *fout,int fin_stride);
9999+100100+/* If kiss_fft_alloc allocated a buffer, it is one contiguous
101101+ buffer and can be simply free()d when no longer needed*/
102102+#define kiss_fft_free free
103103+104104+/*
105105+ Cleans up some memory that gets managed internally. Not necessary to call, but it might clean up
106106+ your compiler output to call this before you exit.
107107+*/
108108+void kiss_fft_cleanup(void);
109109+110110+111111+/*
112112+ * Returns the smallest integer k, such that k>=n and k has only "fast" factors (2,3,5)
113113+ */
114114+int kiss_fft_next_fast_size(int n);
115115+116116+/* for real ffts, we need an even size */
117117+#define kiss_fftr_next_fast_size_real(n) \
118118+ (kiss_fft_next_fast_size( ((n)+1)>>1)<<1)
119119+120120+#ifdef __cplusplus
121121+}
122122+#endif
123123+124124+#endif
···11+#include <stdio.h>
22+#include <stdlib.h>
33+#include <fftw3.h>
44+#include <unistd.h>
55+#include "pstats.h"
66+77+#ifdef DATATYPEdouble
88+99+#define CPXTYPE fftw_complex
1010+#define PLAN fftw_plan
1111+#define FFTMALLOC fftw_malloc
1212+#define MAKEPLAN fftw_plan_dft_1d
1313+#define DOFFT fftw_execute
1414+#define DESTROYPLAN fftw_destroy_plan
1515+#define FFTFREE fftw_free
1616+1717+#elif defined(DATATYPEfloat)
1818+1919+#define CPXTYPE fftwf_complex
2020+#define PLAN fftwf_plan
2121+#define FFTMALLOC fftwf_malloc
2222+#define MAKEPLAN fftwf_plan_dft_1d
2323+#define DOFFT fftwf_execute
2424+#define DESTROYPLAN fftwf_destroy_plan
2525+#define FFTFREE fftwf_free
2626+2727+#endif
2828+2929+#ifndef CPXTYPE
3030+int main(void)
3131+{
3232+ fprintf(stderr,"Datatype not available in FFTW\n" );
3333+ return 0;
3434+}
3535+#else
3636+int main(int argc,char ** argv)
3737+{
3838+ int nfft=1024;
3939+ int isinverse=0;
4040+ int numffts=1000,i;
4141+4242+ CPXTYPE * in=NULL;
4343+ CPXTYPE * out=NULL;
4444+ PLAN p;
4545+4646+ pstats_init();
4747+4848+ while (1) {
4949+ int c = getopt (argc, argv, "n:ix:h");
5050+ if (c == -1)
5151+ break;
5252+ switch (c) {
5353+ case 'n':
5454+ nfft = atoi (optarg);
5555+ break;
5656+ case 'x':
5757+ numffts = atoi (optarg);
5858+ break;
5959+ case 'i':
6060+ isinverse = 1;
6161+ break;
6262+ case 'h':
6363+ case '?':
6464+ default:
6565+ fprintf(stderr,"options:\n-n N: complex fft length\n-i: inverse\n-x N: number of ffts to compute\n"
6666+ "");
6767+ }
6868+ }
6969+7070+ in=FFTMALLOC(sizeof(CPXTYPE) * nfft);
7171+ out=FFTMALLOC(sizeof(CPXTYPE) * nfft);
7272+ for (i=0;i<nfft;++i ) {
7373+ in[i][0] = rand() - RAND_MAX/2;
7474+ in[i][1] = rand() - RAND_MAX/2;
7575+ }
7676+7777+ if ( isinverse )
7878+ p = MAKEPLAN(nfft, in, out, FFTW_BACKWARD, FFTW_ESTIMATE);
7979+ else
8080+ p = MAKEPLAN(nfft, in, out, FFTW_FORWARD, FFTW_ESTIMATE);
8181+8282+ for (i=0;i<numffts;++i)
8383+ DOFFT(p);
8484+8585+ DESTROYPLAN(p);
8686+8787+ FFTFREE(in); FFTFREE(out);
8888+8989+ fprintf(stderr,"fftw\tnfft=%d\tnumffts=%d\n", nfft,numffts);
9090+ pstats_report();
9191+9292+ return 0;
9393+}
9494+#endif
+122
test/benchkiss.c
···11+#include <stdio.h>
22+#include <stdlib.h>
33+#include <sys/times.h>
44+#include <unistd.h>
55+#include "kiss_fft.h"
66+#include "kiss_fftr.h"
77+#include "kiss_fftnd.h"
88+#include "kiss_fftndr.h"
99+1010+#include "pstats.h"
1111+1212+static
1313+int getdims(int * dims, char * arg)
1414+{
1515+ char *s;
1616+ int ndims=0;
1717+ while ( (s=strtok( arg , ",") ) ) {
1818+ dims[ndims++] = atoi(s);
1919+ //printf("%s=%d\n",s,dims[ndims-1]);
2020+ arg=NULL;
2121+ }
2222+ return ndims;
2323+}
2424+2525+int main(int argc,char ** argv)
2626+{
2727+ int k;
2828+ int nfft[32];
2929+ int ndims = 1;
3030+ int isinverse=0;
3131+ int numffts=1000,i;
3232+ kiss_fft_cpx * buf;
3333+ kiss_fft_cpx * bufout;
3434+ int real = 0;
3535+3636+ nfft[0] = 1024;// default
3737+3838+ while (1) {
3939+ int c = getopt (argc, argv, "n:ix:r");
4040+ if (c == -1)
4141+ break;
4242+ switch (c) {
4343+ case 'r':
4444+ real = 1;
4545+ break;
4646+ case 'n':
4747+ ndims = getdims(nfft, optarg );
4848+ if (nfft[0] != kiss_fft_next_fast_size(nfft[0]) ) {
4949+ int ng = kiss_fft_next_fast_size(nfft[0]);
5050+ fprintf(stderr,"warning: %d might be a better choice for speed than %d\n",ng,nfft[0]);
5151+ }
5252+ break;
5353+ case 'x':
5454+ numffts = atoi (optarg);
5555+ break;
5656+ case 'i':
5757+ isinverse = 1;
5858+ break;
5959+ }
6060+ }
6161+ int nbytes = sizeof(kiss_fft_cpx);
6262+ for (k=0;k<ndims;++k)
6363+ nbytes *= nfft[k];
6464+6565+#ifdef USE_SIMD
6666+ numffts /= 4;
6767+ fprintf(stderr,"since SIMD implementation does 4 ffts at a time, numffts is being reduced to %d\n",numffts);
6868+#endif
6969+7070+ buf=(kiss_fft_cpx*)KISS_FFT_MALLOC(nbytes);
7171+ bufout=(kiss_fft_cpx*)KISS_FFT_MALLOC(nbytes);
7272+ memset(buf,0,nbytes);
7373+7474+ pstats_init();
7575+7676+ if (ndims==1) {
7777+ if (real) {
7878+ kiss_fftr_cfg st = kiss_fftr_alloc( nfft[0] ,isinverse ,0,0);
7979+ if (isinverse)
8080+ for (i=0;i<numffts;++i)
8181+ kiss_fftri( st ,(kiss_fft_cpx*)buf,(kiss_fft_scalar*)bufout );
8282+ else
8383+ for (i=0;i<numffts;++i)
8484+ kiss_fftr( st ,(kiss_fft_scalar*)buf,(kiss_fft_cpx*)bufout );
8585+ free(st);
8686+ }else{
8787+ kiss_fft_cfg st = kiss_fft_alloc( nfft[0] ,isinverse ,0,0);
8888+ for (i=0;i<numffts;++i)
8989+ kiss_fft( st ,buf,bufout );
9090+ free(st);
9191+ }
9292+ }else{
9393+ if (real) {
9494+ kiss_fftndr_cfg st = kiss_fftndr_alloc( nfft,ndims ,isinverse ,0,0);
9595+ if (isinverse)
9696+ for (i=0;i<numffts;++i)
9797+ kiss_fftndri( st ,(kiss_fft_cpx*)buf,(kiss_fft_scalar*)bufout );
9898+ else
9999+ for (i=0;i<numffts;++i)
100100+ kiss_fftndr( st ,(kiss_fft_scalar*)buf,(kiss_fft_cpx*)bufout );
101101+ free(st);
102102+ }else{
103103+ kiss_fftnd_cfg st= kiss_fftnd_alloc(nfft,ndims,isinverse ,0,0);
104104+ for (i=0;i<numffts;++i)
105105+ kiss_fftnd( st ,buf,bufout );
106106+ free(st);
107107+ }
108108+ }
109109+110110+ free(buf); free(bufout);
111111+112112+ fprintf(stderr,"KISS\tnfft=");
113113+ for (k=0;k<ndims;++k)
114114+ fprintf(stderr, "%d,",nfft[k]);
115115+ fprintf(stderr,"\tnumffts=%d\n" ,numffts);
116116+ pstats_report();
117117+118118+ kiss_fft_cleanup();
119119+120120+ return 0;
121121+}
122122+
+92
test/compfft.py
···11+#!/usr/bin/env python
22+33+# use FFTPACK as a baseline
44+import FFT
55+from Numeric import *
66+import math
77+import random
88+import sys
99+import struct
1010+import fft
1111+1212+pi=math.pi
1313+e=math.e
1414+j=complex(0,1)
1515+lims=(-32768,32767)
1616+1717+def randbuf(n,cpx=1):
1818+ res = array( [ random.uniform( lims[0],lims[1] ) for i in range(n) ] )
1919+ if cpx:
2020+ res = res + j*randbuf(n,0)
2121+ return res
2222+2323+def main():
2424+ from getopt import getopt
2525+ import popen2
2626+ opts,args = getopt( sys.argv[1:],'u:n:Rt:' )
2727+ opts=dict(opts)
2828+ exitcode=0
2929+3030+ util = opts.get('-u','./kf_float')
3131+3232+ try:
3333+ dims = [ int(d) for d in opts['-n'].split(',')]
3434+ cpx = opts.get('-R') is None
3535+ fmt=opts.get('-t','f')
3636+ except KeyError:
3737+ sys.stderr.write("""
3838+ usage: compfft.py
3939+ -n d1[,d2,d3...] : FFT dimension(s)
4040+ -u utilname : see sample_code/fftutil.c, default = ./kf_float
4141+ -R : real-optimized version\n""")
4242+ sys.exit(1)
4343+4444+ x = fft.make_random( dims )
4545+4646+ cmd = '%s -n %s ' % ( util, ','.join([ str(d) for d in dims]) )
4747+ if cpx:
4848+ xout = FFT.fftnd(x)
4949+ xout = reshape(xout,(size(xout),))
5050+ else:
5151+ cmd += '-R '
5252+ xout = FFT.real_fft(x)
5353+5454+ proc = popen2.Popen3( cmd , bufsize=len(x) )
5555+5656+ proc.tochild.write( dopack( x , fmt ,cpx ) )
5757+ proc.tochild.close()
5858+ xoutcomp = dounpack( proc.fromchild.read( ) , fmt ,1 )
5959+ #xoutcomp = reshape( xoutcomp , dims )
6060+6161+ sig = xout * conjugate(xout)
6262+ sigpow = sum( sig )
6363+6464+ diff = xout-xoutcomp
6565+ noisepow = sum( diff * conjugate(diff) )
6666+6767+ snr = 10 * math.log10(abs( sigpow / noisepow ) )
6868+ if snr<100:
6969+ print xout
7070+ print xoutcomp
7171+ exitcode=1
7272+ print 'NFFT=%s,SNR = %f dB' % (str(dims),snr)
7373+ sys.exit(exitcode)
7474+7575+def dopack(x,fmt,cpx):
7676+ x = reshape( x, ( size(x),) )
7777+ if cpx:
7878+ s = ''.join( [ struct.pack('ff',c.real,c.imag) for c in x ] )
7979+ else:
8080+ s = ''.join( [ struct.pack('f',c) for c in x ] )
8181+ return s
8282+8383+def dounpack(x,fmt,cpx):
8484+ uf = fmt * ( len(x) / 4 )
8585+ s = struct.unpack(uf,x)
8686+ if cpx:
8787+ return array(s[::2]) + array( s[1::2] )*j
8888+ else:
8989+ return array(s )
9090+9191+if __name__ == "__main__":
9292+ main()
+129
test/doit.c
···11+/* this program is in the public domain
22+ A program that helps the authors of the fine fftw library benchmark kiss
33+*/
44+55+#include "bench-user.h"
66+#include <math.h>
77+88+#include "kiss_fft.h"
99+#include "kiss_fftnd.h"
1010+#include "kiss_fftr.h"
1111+1212+BEGIN_BENCH_DOC
1313+BENCH_DOC("name", "kissfft")
1414+BENCH_DOC("version", "1.0.1")
1515+BENCH_DOC("year", "2004")
1616+BENCH_DOC("author", "Mark Borgerding")
1717+BENCH_DOC("language", "C")
1818+BENCH_DOC("url", "http://sourceforge.net/projects/kissfft/")
1919+BENCH_DOC("copyright",
2020+"Copyright (c) 2003,4 Mark Borgerding\n"
2121+"\n"
2222+"All rights reserved.\n"
2323+"\n"
2424+"Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:\n"
2525+"\n"
2626+" * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.\n"
2727+" * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.\n"
2828+" * Neither the author nor the names of any contributors may be used to endorse or promote products derived from this software without specific prior written permission.\n"
2929+"\n"
3030+ "THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n")
3131+END_BENCH_DOC
3232+3333+int can_do(struct problem *p)
3434+{
3535+ if (p->rank == 1) {
3636+ if (p->kind == PROBLEM_REAL) {
3737+ return (p->n[0] & 1) == 0; /* only even real is okay */
3838+ } else {
3939+ return 1;
4040+ }
4141+ } else {
4242+ return p->kind == PROBLEM_COMPLEX;
4343+ }
4444+}
4545+4646+static kiss_fft_cfg cfg=NULL;
4747+static kiss_fftr_cfg cfgr=NULL;
4848+static kiss_fftnd_cfg cfgnd=NULL;
4949+5050+#define FAILIF( c ) \
5151+ if ( c ) do {\
5252+ fprintf(stderr,\
5353+ "kissfft: " #c " (file=%s:%d errno=%d %s)\n",\
5454+ __FILE__,__LINE__ , errno,strerror( errno ) ) ;\
5555+ exit(1);\
5656+ }while(0)
5757+5858+5959+6060+void setup(struct problem *p)
6161+{
6262+ size_t i;
6363+6464+ /*
6565+ fprintf(stderr,"%s %s %d-d ",
6666+ (p->sign == 1)?"Inverse":"Forward",
6767+ p->kind == PROBLEM_COMPLEX?"Complex":"Real",
6868+ p->rank);
6969+ */
7070+ if (p->rank == 1) {
7171+ if (p->kind == PROBLEM_COMPLEX) {
7272+ cfg = kiss_fft_alloc (p->n[0], (p->sign == 1), 0, 0);
7373+ FAILIF(cfg==NULL);
7474+ }else{
7575+ cfgr = kiss_fftr_alloc (p->n[0], (p->sign == 1), 0, 0);
7676+ FAILIF(cfgr==NULL);
7777+ }
7878+ }else{
7979+ int dims[5];
8080+ for (i=0;i<p->rank;++i){
8181+ dims[i] = p->n[i];
8282+ }
8383+ /* multi-dimensional */
8484+ if (p->kind == PROBLEM_COMPLEX) {
8585+ cfgnd = kiss_fftnd_alloc( dims , p->rank, (p->sign == 1), 0, 0 );
8686+ FAILIF(cfgnd==NULL);
8787+ }
8888+ }
8989+}
9090+9191+void doit(int iter, struct problem *p)
9292+{
9393+ int i;
9494+ void *in = p->in;
9595+ void *out = p->out;
9696+9797+ if (p->in_place)
9898+ out = p->in;
9999+100100+ if (p->rank == 1) {
101101+ if (p->kind == PROBLEM_COMPLEX){
102102+ for (i = 0; i < iter; ++i)
103103+ kiss_fft (cfg, in, out);
104104+ } else {
105105+ /* PROBLEM_REAL */
106106+ if (p->sign == -1) /* FORWARD */
107107+ for (i = 0; i < iter; ++i)
108108+ kiss_fftr (cfgr, in, out);
109109+ else
110110+ for (i = 0; i < iter; ++i)
111111+ kiss_fftri (cfgr, in, out);
112112+ }
113113+ }else{
114114+ /* multi-dimensional */
115115+ for (i = 0; i < iter; ++i)
116116+ kiss_fftnd(cfgnd,in,out);
117117+ }
118118+}
119119+120120+void done(struct problem *p)
121121+{
122122+ free(cfg);
123123+ cfg=NULL;
124124+ free(cfgr);
125125+ cfgr=NULL;
126126+ free(cfgnd);
127127+ cfgnd=NULL;
128128+ UNUSED(p);
129129+}
···11+function maxabsdiff=tailscrap()
22+% test code for circular convolution with the scrapped portion
33+% at the tail of the buffer, rather than the front
44+%
55+% The idea is to rotate the zero-padded h (impulse response) buffer
66+% to the left nh-1 samples, rotating the junk samples as well.
77+% This could be very handy in avoiding buffer copies during fast filtering.
88+nh=10;
99+nfft=256;
1010+1111+h=rand(1,nh);
1212+x=rand(1,nfft);
1313+1414+hpad=[ h(nh) zeros(1,nfft-nh) h(1:nh-1) ];
1515+1616+% baseline comparison
1717+y1 = filter(h,1,x);
1818+y1_notrans = y1(nh:nfft);
1919+2020+% fast convolution
2121+y2 = ifft( fft(hpad) .* fft(x) );
2222+y2_notrans=y2(1:nfft-nh+1);
2323+2424+maxabsdiff = max(abs(y2_notrans - y1_notrans))
2525+2626+end
···11+/*
22+Copyright (c) 2003-2004, Mark Borgerding
33+44+All rights reserved.
55+66+Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
77+88+ * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
99+ * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
1010+ * Neither the author nor the names of any contributors may be used to endorse or promote products derived from this software without specific prior written permission.
1111+1212+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
1313+*/
1414+1515+#include <stdlib.h>
1616+#include <math.h>
1717+#include <stdio.h>
1818+#include <string.h>
1919+#include <unistd.h>
2020+2121+#include "kiss_fft.h"
2222+#include "kiss_fftndr.h"
2323+2424+static
2525+void fft_file(FILE * fin,FILE * fout,int nfft,int isinverse)
2626+{
2727+ kiss_fft_cfg st;
2828+ kiss_fft_cpx * buf;
2929+ kiss_fft_cpx * bufout;
3030+3131+ buf = (kiss_fft_cpx*)malloc(sizeof(kiss_fft_cpx) * nfft );
3232+ bufout = (kiss_fft_cpx*)malloc(sizeof(kiss_fft_cpx) * nfft );
3333+ st = kiss_fft_alloc( nfft ,isinverse ,0,0);
3434+3535+ while ( fread( buf , sizeof(kiss_fft_cpx) * nfft ,1, fin ) > 0 ) {
3636+ kiss_fft( st , buf ,bufout);
3737+ fwrite( bufout , sizeof(kiss_fft_cpx) , nfft , fout );
3838+ }
3939+ free(st);
4040+ free(buf);
4141+ free(bufout);
4242+}
4343+4444+static
4545+void fft_filend(FILE * fin,FILE * fout,int *dims,int ndims,int isinverse)
4646+{
4747+ kiss_fftnd_cfg st;
4848+ kiss_fft_cpx *buf;
4949+ int dimprod=1,i;
5050+ for (i=0;i<ndims;++i)
5151+ dimprod *= dims[i];
5252+5353+ buf = (kiss_fft_cpx *) malloc (sizeof (kiss_fft_cpx) * dimprod);
5454+ st = kiss_fftnd_alloc (dims, ndims, isinverse, 0, 0);
5555+5656+ while (fread (buf, sizeof (kiss_fft_cpx) * dimprod, 1, fin) > 0) {
5757+ kiss_fftnd (st, buf, buf);
5858+ fwrite (buf, sizeof (kiss_fft_cpx), dimprod, fout);
5959+ }
6060+ free (st);
6161+ free (buf);
6262+}
6363+6464+6565+6666+static
6767+void fft_filend_real(FILE * fin,FILE * fout,int *dims,int ndims,int isinverse)
6868+{
6969+ int dimprod=1,i;
7070+ kiss_fftndr_cfg st;
7171+ void *ibuf;
7272+ void *obuf;
7373+ int insize,outsize; // size in bytes
7474+7575+ for (i=0;i<ndims;++i)
7676+ dimprod *= dims[i];
7777+ insize = outsize = dimprod;
7878+ int rdim = dims[ndims-1];
7979+8080+ if (isinverse)
8181+ insize = insize*2*(rdim/2+1)/rdim;
8282+ else
8383+ outsize = outsize*2*(rdim/2+1)/rdim;
8484+8585+ ibuf = malloc(insize*sizeof(kiss_fft_scalar));
8686+ obuf = malloc(outsize*sizeof(kiss_fft_scalar));
8787+8888+ st = kiss_fftndr_alloc(dims, ndims, isinverse, 0, 0);
8989+9090+ while ( fread (ibuf, sizeof(kiss_fft_scalar), insize, fin) > 0) {
9191+ if (isinverse) {
9292+ kiss_fftndri(st,
9393+ (kiss_fft_cpx*)ibuf,
9494+ (kiss_fft_scalar*)obuf);
9595+ }else{
9696+ kiss_fftndr(st,
9797+ (kiss_fft_scalar*)ibuf,
9898+ (kiss_fft_cpx*)obuf);
9999+ }
100100+ fwrite (obuf, sizeof(kiss_fft_scalar), outsize,fout);
101101+ }
102102+ free(st);
103103+ free(ibuf);
104104+ free(obuf);
105105+}
106106+107107+static
108108+void fft_file_real(FILE * fin,FILE * fout,int nfft,int isinverse)
109109+{
110110+ kiss_fftr_cfg st;
111111+ kiss_fft_scalar * rbuf;
112112+ kiss_fft_cpx * cbuf;
113113+114114+ rbuf = (kiss_fft_scalar*)malloc(sizeof(kiss_fft_scalar) * nfft );
115115+ cbuf = (kiss_fft_cpx*)malloc(sizeof(kiss_fft_cpx) * (nfft/2+1) );
116116+ st = kiss_fftr_alloc( nfft ,isinverse ,0,0);
117117+118118+ if (isinverse==0) {
119119+ while ( fread( rbuf , sizeof(kiss_fft_scalar) * nfft ,1, fin ) > 0 ) {
120120+ kiss_fftr( st , rbuf ,cbuf);
121121+ fwrite( cbuf , sizeof(kiss_fft_cpx) , (nfft/2 + 1) , fout );
122122+ }
123123+ }else{
124124+ while ( fread( cbuf , sizeof(kiss_fft_cpx) * (nfft/2+1) ,1, fin ) > 0 ) {
125125+ kiss_fftri( st , cbuf ,rbuf);
126126+ fwrite( rbuf , sizeof(kiss_fft_scalar) , nfft , fout );
127127+ }
128128+ }
129129+ free(st);
130130+ free(rbuf);
131131+ free(cbuf);
132132+}
133133+134134+static
135135+int get_dims(char * arg,int * dims)
136136+{
137137+ char *p0;
138138+ int ndims=0;
139139+140140+ do{
141141+ p0 = strchr(arg,',');
142142+ if (p0)
143143+ *p0++ = '\0';
144144+ dims[ndims++] = atoi(arg);
145145+// fprintf(stderr,"dims[%d] = %d\n",ndims-1,dims[ndims-1]);
146146+ arg = p0;
147147+ }while (p0);
148148+ return ndims;
149149+}
150150+151151+int main(int argc,char ** argv)
152152+{
153153+ int isinverse=0;
154154+ int isreal=0;
155155+ FILE *fin=stdin;
156156+ FILE *fout=stdout;
157157+ int ndims=1;
158158+ int dims[32];
159159+ dims[0] = 1024; /*default fft size*/
160160+161161+ while (1) {
162162+ int c=getopt(argc,argv,"n:iR");
163163+ if (c==-1) break;
164164+ switch (c) {
165165+ case 'n':
166166+ ndims = get_dims(optarg,dims);
167167+ break;
168168+ case 'i':isinverse=1;break;
169169+ case 'R':isreal=1;break;
170170+ case '?':
171171+ fprintf(stderr,"usage options:\n"
172172+ "\t-n d1[,d2,d3...]: fft dimension(s)\n"
173173+ "\t-i : inverse\n"
174174+ "\t-R : real input samples, not complex\n");
175175+ exit (1);
176176+ default:fprintf(stderr,"bad %c\n",c);break;
177177+ }
178178+ }
179179+180180+ if ( optind < argc ) {
181181+ if (strcmp("-",argv[optind]) !=0)
182182+ fin = fopen(argv[optind],"rb");
183183+ ++optind;
184184+ }
185185+186186+ if ( optind < argc ) {
187187+ if ( strcmp("-",argv[optind]) !=0 )
188188+ fout = fopen(argv[optind],"wb");
189189+ ++optind;
190190+ }
191191+192192+ if (ndims==1) {
193193+ if (isreal)
194194+ fft_file_real(fin,fout,dims[0],isinverse);
195195+ else
196196+ fft_file(fin,fout,dims[0],isinverse);
197197+ }else{
198198+ if (isreal)
199199+ fft_filend_real(fin,fout,dims,ndims,isinverse);
200200+ else
201201+ fft_filend(fin,fout,dims,ndims,isinverse);
202202+ }
203203+204204+ if (fout!=stdout) fclose(fout);
205205+ if (fin!=stdin) fclose(fin);
206206+207207+ return 0;
208208+}
+116
tools/kfc.c
···11+#include "kfc.h"
22+33+/*
44+Copyright (c) 2003-2004, Mark Borgerding
55+66+All rights reserved.
77+88+Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
99+1010+ * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
1111+ * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
1212+ * Neither the author nor the names of any contributors may be used to endorse or promote products derived from this software without specific prior written permission.
1313+1414+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
1515+*/
1616+1717+1818+typedef struct cached_fft *kfc_cfg;
1919+2020+struct cached_fft
2121+{
2222+ int nfft;
2323+ int inverse;
2424+ kiss_fft_cfg cfg;
2525+ kfc_cfg next;
2626+};
2727+2828+static kfc_cfg cache_root=NULL;
2929+static int ncached=0;
3030+3131+static kiss_fft_cfg find_cached_fft(int nfft,int inverse)
3232+{
3333+ size_t len;
3434+ kfc_cfg cur=cache_root;
3535+ kfc_cfg prev=NULL;
3636+ while ( cur ) {
3737+ if ( cur->nfft == nfft && inverse == cur->inverse )
3838+ break;/*found the right node*/
3939+ prev = cur;
4040+ cur = prev->next;
4141+ }
4242+ if (cur== NULL) {
4343+ /* no cached node found, need to create a new one*/
4444+ kiss_fft_alloc(nfft,inverse,0,&len);
4545+#ifdef USE_SIMD
4646+ int padding = (16-sizeof(struct cached_fft)) & 15;
4747+ // make sure the cfg aligns on a 16 byte boundary
4848+ len += padding;
4949+#endif
5050+ cur = (kfc_cfg)KISS_FFT_MALLOC((sizeof(struct cached_fft) + len ));
5151+ if (cur == NULL)
5252+ return NULL;
5353+ cur->cfg = (kiss_fft_cfg)(cur+1);
5454+#ifdef USE_SIMD
5555+ cur->cfg = (kiss_fft_cfg) ((char*)(cur+1)+padding);
5656+#endif
5757+ kiss_fft_alloc(nfft,inverse,cur->cfg,&len);
5858+ cur->nfft=nfft;
5959+ cur->inverse=inverse;
6060+ cur->next = NULL;
6161+ if ( prev )
6262+ prev->next = cur;
6363+ else
6464+ cache_root = cur;
6565+ ++ncached;
6666+ }
6767+ return cur->cfg;
6868+}
6969+7070+void kfc_cleanup(void)
7171+{
7272+ kfc_cfg cur=cache_root;
7373+ kfc_cfg next=NULL;
7474+ while (cur){
7575+ next = cur->next;
7676+ free(cur);
7777+ cur=next;
7878+ }
7979+ ncached=0;
8080+ cache_root = NULL;
8181+}
8282+void kfc_fft(int nfft, const kiss_fft_cpx * fin,kiss_fft_cpx * fout)
8383+{
8484+ kiss_fft( find_cached_fft(nfft,0),fin,fout );
8585+}
8686+8787+void kfc_ifft(int nfft, const kiss_fft_cpx * fin,kiss_fft_cpx * fout)
8888+{
8989+ kiss_fft( find_cached_fft(nfft,1),fin,fout );
9090+}
9191+9292+#ifdef KFC_TEST
9393+static void check(int nc)
9494+{
9595+ if (ncached != nc) {
9696+ fprintf(stderr,"ncached should be %d,but it is %d\n",nc,ncached);
9797+ exit(1);
9898+ }
9999+}
100100+101101+int main(void)
102102+{
103103+ kiss_fft_cpx buf1[1024],buf2[1024];
104104+ memset(buf1,0,sizeof(buf1));
105105+ check(0);
106106+ kfc_fft(512,buf1,buf2);
107107+ check(1);
108108+ kfc_fft(512,buf1,buf2);
109109+ check(1);
110110+ kfc_ifft(512,buf1,buf2);
111111+ check(2);
112112+ kfc_cleanup();
113113+ check(0);
114114+ return 0;
115115+}
116116+#endif
+46
tools/kfc.h
···11+#ifndef KFC_H
22+#define KFC_H
33+#include "kiss_fft.h"
44+55+#ifdef __cplusplus
66+extern "C" {
77+#endif
88+99+/*
1010+KFC -- Kiss FFT Cache
1111+1212+Not needing to deal with kiss_fft_alloc and a config
1313+object may be handy for a lot of programs.
1414+1515+KFC uses the underlying KISS FFT functions, but caches the config object.
1616+The first time kfc_fft or kfc_ifft for a given FFT size, the cfg
1717+object is created for it. All subsequent calls use the cached
1818+configuration object.
1919+2020+NOTE:
2121+You should probably not use this if your program will be using a lot
2222+of various sizes of FFTs. There is a linear search through the
2323+cached objects. If you are only using one or two FFT sizes, this
2424+will be negligible. Otherwise, you may want to use another method
2525+of managing the cfg objects.
2626+2727+ There is no automated cleanup of the cached objects. This could lead
2828+to large memory usage in a program that uses a lot of *DIFFERENT*
2929+sized FFTs. If you want to force all cached cfg objects to be freed,
3030+call kfc_cleanup.
3131+3232+ */
3333+3434+/*forward complex FFT */
3535+void kfc_fft(int nfft, const kiss_fft_cpx * fin,kiss_fft_cpx * fout);
3636+/*reverse complex FFT */
3737+void kfc_ifft(int nfft, const kiss_fft_cpx * fin,kiss_fft_cpx * fout);
3838+3939+/*free all cached objects*/
4040+void kfc_cleanup(void);
4141+4242+#ifdef __cplusplus
4343+}
4444+#endif
4545+4646+#endif
+470
tools/kiss_fastfir.c
···11+/*
22+Copyright (c) 2003-2004, Mark Borgerding
33+44+All rights reserved.
55+66+Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
77+88+ * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
99+ * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
1010+ * Neither the author nor the names of any contributors may be used to endorse or promote products derived from this software without specific prior written permission.
1111+1212+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
1313+*/
1414+1515+#include "_kiss_fft_guts.h"
1616+1717+1818+/*
1919+ Some definitions that allow real or complex filtering
2020+*/
2121+#ifdef REAL_FASTFIR
2222+#define MIN_FFT_LEN 2048
2323+#include "kiss_fftr.h"
2424+typedef kiss_fft_scalar kffsamp_t;
2525+typedef kiss_fftr_cfg kfcfg_t;
2626+#define FFT_ALLOC kiss_fftr_alloc
2727+#define FFTFWD kiss_fftr
2828+#define FFTINV kiss_fftri
2929+#else
3030+#define MIN_FFT_LEN 1024
3131+typedef kiss_fft_cpx kffsamp_t;
3232+typedef kiss_fft_cfg kfcfg_t;
3333+#define FFT_ALLOC kiss_fft_alloc
3434+#define FFTFWD kiss_fft
3535+#define FFTINV kiss_fft
3636+#endif
3737+3838+typedef struct kiss_fastfir_state *kiss_fastfir_cfg;
3939+4040+4141+4242+kiss_fastfir_cfg kiss_fastfir_alloc(const kffsamp_t * imp_resp,size_t n_imp_resp,
4343+ size_t * nfft,void * mem,size_t*lenmem);
4444+4545+/* see do_file_filter for usage */
4646+size_t kiss_fastfir( kiss_fastfir_cfg cfg, kffsamp_t * inbuf, kffsamp_t * outbuf, size_t n, size_t *offset);
4747+4848+4949+5050+static int verbose=0;
5151+5252+5353+struct kiss_fastfir_state{
5454+ size_t nfft;
5555+ size_t ngood;
5656+ kfcfg_t fftcfg;
5757+ kfcfg_t ifftcfg;
5858+ kiss_fft_cpx * fir_freq_resp;
5959+ kiss_fft_cpx * freqbuf;
6060+ size_t n_freq_bins;
6161+ kffsamp_t * tmpbuf;
6262+};
6363+6464+6565+kiss_fastfir_cfg kiss_fastfir_alloc(
6666+ const kffsamp_t * imp_resp,size_t n_imp_resp,
6767+ size_t *pnfft, /* if <= 0, an appropriate size will be chosen */
6868+ void * mem,size_t*lenmem)
6969+{
7070+ kiss_fastfir_cfg st = NULL;
7171+ size_t len_fftcfg,len_ifftcfg;
7272+ size_t memneeded = sizeof(struct kiss_fastfir_state);
7373+ char * ptr;
7474+ size_t i;
7575+ size_t nfft=0;
7676+ float scale;
7777+ int n_freq_bins;
7878+ if (pnfft)
7979+ nfft=*pnfft;
8080+8181+ if (nfft<=0) {
8282+ /* determine fft size as next power of two at least 2x
8383+ the impulse response length*/
8484+ i=n_imp_resp-1;
8585+ nfft=2;
8686+ do{
8787+ nfft<<=1;
8888+ }while (i>>=1);
8989+#ifdef MIN_FFT_LEN
9090+ if ( nfft < MIN_FFT_LEN )
9191+ nfft=MIN_FFT_LEN;
9292+#endif
9393+ }
9494+ if (pnfft)
9595+ *pnfft = nfft;
9696+9797+#ifdef REAL_FASTFIR
9898+ n_freq_bins = nfft/2 + 1;
9999+#else
100100+ n_freq_bins = nfft;
101101+#endif
102102+ /*fftcfg*/
103103+ FFT_ALLOC (nfft, 0, NULL, &len_fftcfg);
104104+ memneeded += len_fftcfg;
105105+ /*ifftcfg*/
106106+ FFT_ALLOC (nfft, 1, NULL, &len_ifftcfg);
107107+ memneeded += len_ifftcfg;
108108+ /* tmpbuf */
109109+ memneeded += sizeof(kffsamp_t) * nfft;
110110+ /* fir_freq_resp */
111111+ memneeded += sizeof(kiss_fft_cpx) * n_freq_bins;
112112+ /* freqbuf */
113113+ memneeded += sizeof(kiss_fft_cpx) * n_freq_bins;
114114+115115+ if (lenmem == NULL) {
116116+ st = (kiss_fastfir_cfg) malloc (memneeded);
117117+ } else {
118118+ if (*lenmem >= memneeded)
119119+ st = (kiss_fastfir_cfg) mem;
120120+ *lenmem = memneeded;
121121+ }
122122+ if (!st)
123123+ return NULL;
124124+125125+ st->nfft = nfft;
126126+ st->ngood = nfft - n_imp_resp + 1;
127127+ st->n_freq_bins = n_freq_bins;
128128+ ptr=(char*)(st+1);
129129+130130+ st->fftcfg = (kfcfg_t)ptr;
131131+ ptr += len_fftcfg;
132132+133133+ st->ifftcfg = (kfcfg_t)ptr;
134134+ ptr += len_ifftcfg;
135135+136136+ st->tmpbuf = (kffsamp_t*)ptr;
137137+ ptr += sizeof(kffsamp_t) * nfft;
138138+139139+ st->freqbuf = (kiss_fft_cpx*)ptr;
140140+ ptr += sizeof(kiss_fft_cpx) * n_freq_bins;
141141+142142+ st->fir_freq_resp = (kiss_fft_cpx*)ptr;
143143+ ptr += sizeof(kiss_fft_cpx) * n_freq_bins;
144144+145145+ FFT_ALLOC (nfft,0,st->fftcfg , &len_fftcfg);
146146+ FFT_ALLOC (nfft,1,st->ifftcfg , &len_ifftcfg);
147147+148148+ memset(st->tmpbuf,0,sizeof(kffsamp_t)*nfft);
149149+ /*zero pad in the middle to left-rotate the impulse response
150150+ This puts the scrap samples at the end of the inverse fft'd buffer */
151151+ st->tmpbuf[0] = imp_resp[ n_imp_resp - 1 ];
152152+ for (i=0;i<n_imp_resp - 1; ++i) {
153153+ st->tmpbuf[ nfft - n_imp_resp + 1 + i ] = imp_resp[ i ];
154154+ }
155155+156156+ FFTFWD(st->fftcfg,st->tmpbuf,st->fir_freq_resp);
157157+158158+ /* TODO: this won't work for fixed point */
159159+ scale = 1.0 / st->nfft;
160160+161161+ for ( i=0; i < st->n_freq_bins; ++i ) {
162162+#ifdef USE_SIMD
163163+ st->fir_freq_resp[i].r *= _mm_set1_ps(scale);
164164+ st->fir_freq_resp[i].i *= _mm_set1_ps(scale);
165165+#else
166166+ st->fir_freq_resp[i].r *= scale;
167167+ st->fir_freq_resp[i].i *= scale;
168168+#endif
169169+ }
170170+ return st;
171171+}
172172+173173+static void fastconv1buf(const kiss_fastfir_cfg st,const kffsamp_t * in,kffsamp_t * out)
174174+{
175175+ size_t i;
176176+ /* multiply the frequency response of the input signal by
177177+ that of the fir filter*/
178178+ FFTFWD( st->fftcfg, in , st->freqbuf );
179179+ for ( i=0; i<st->n_freq_bins; ++i ) {
180180+ kiss_fft_cpx tmpsamp;
181181+ C_MUL(tmpsamp,st->freqbuf[i],st->fir_freq_resp[i]);
182182+ st->freqbuf[i] = tmpsamp;
183183+ }
184184+185185+ /* perform the inverse fft*/
186186+ FFTINV(st->ifftcfg,st->freqbuf,out);
187187+}
188188+189189+/* n : the size of inbuf and outbuf in samples
190190+ return value: the number of samples completely processed
191191+ n-retval samples should be copied to the front of the next input buffer */
192192+static size_t kff_nocopy(
193193+ kiss_fastfir_cfg st,
194194+ const kffsamp_t * inbuf,
195195+ kffsamp_t * outbuf,
196196+ size_t n)
197197+{
198198+ size_t norig=n;
199199+ while (n >= st->nfft ) {
200200+ fastconv1buf(st,inbuf,outbuf);
201201+ inbuf += st->ngood;
202202+ outbuf += st->ngood;
203203+ n -= st->ngood;
204204+ }
205205+ return norig - n;
206206+}
207207+208208+static
209209+size_t kff_flush(kiss_fastfir_cfg st,const kffsamp_t * inbuf,kffsamp_t * outbuf,size_t n)
210210+{
211211+ size_t zpad=0,ntmp;
212212+213213+ ntmp = kff_nocopy(st,inbuf,outbuf,n);
214214+ n -= ntmp;
215215+ inbuf += ntmp;
216216+ outbuf += ntmp;
217217+218218+ zpad = st->nfft - n;
219219+ memset(st->tmpbuf,0,sizeof(kffsamp_t)*st->nfft );
220220+ memcpy(st->tmpbuf,inbuf,sizeof(kffsamp_t)*n );
221221+222222+ fastconv1buf(st,st->tmpbuf,st->tmpbuf);
223223+224224+ memcpy(outbuf,st->tmpbuf,sizeof(kffsamp_t)*( st->ngood - zpad ));
225225+ return ntmp + st->ngood - zpad;
226226+}
227227+228228+size_t kiss_fastfir(
229229+ kiss_fastfir_cfg vst,
230230+ kffsamp_t * inbuf,
231231+ kffsamp_t * outbuf,
232232+ size_t n_new,
233233+ size_t *offset)
234234+{
235235+ size_t ntot = n_new + *offset;
236236+ if (n_new==0) {
237237+ return kff_flush(vst,inbuf,outbuf,ntot);
238238+ }else{
239239+ size_t nwritten = kff_nocopy(vst,inbuf,outbuf,ntot);
240240+ *offset = ntot - nwritten;
241241+ /*save the unused or underused samples at the front of the input buffer */
242242+ memcpy( inbuf , inbuf+nwritten , *offset * sizeof(kffsamp_t) );
243243+ return nwritten;
244244+ }
245245+}
246246+247247+#ifdef FAST_FILT_UTIL
248248+#include <unistd.h>
249249+#include <sys/types.h>
250250+#include <sys/mman.h>
251251+#include <assert.h>
252252+253253+static
254254+void direct_file_filter(
255255+ FILE * fin,
256256+ FILE * fout,
257257+ const kffsamp_t * imp_resp,
258258+ size_t n_imp_resp)
259259+{
260260+ size_t nlag = n_imp_resp - 1;
261261+262262+ const kffsamp_t *tmph;
263263+ kffsamp_t *buf, *circbuf;
264264+ kffsamp_t outval;
265265+ size_t nread;
266266+ size_t nbuf;
267267+ size_t oldestlag = 0;
268268+ size_t k, tap;
269269+#ifndef REAL_FASTFIR
270270+ kffsamp_t tmp;
271271+#endif
272272+273273+ nbuf = 4096;
274274+ buf = (kffsamp_t *) malloc ( sizeof (kffsamp_t) * nbuf);
275275+ circbuf = (kffsamp_t *) malloc (sizeof (kffsamp_t) * nlag);
276276+ if (!circbuf || !buf) {
277277+ perror("circbuf allocation");
278278+ exit(1);
279279+ }
280280+281281+ if ( fread (circbuf, sizeof (kffsamp_t), nlag, fin) != nlag ) {
282282+ perror ("insufficient data to overcome transient");
283283+ exit (1);
284284+ }
285285+286286+ do {
287287+ nread = fread (buf, sizeof (kffsamp_t), nbuf, fin);
288288+ if (nread <= 0)
289289+ break;
290290+291291+ for (k = 0; k < nread; ++k) {
292292+ tmph = imp_resp+nlag;
293293+#ifdef REAL_FASTFIR
294294+# ifdef USE_SIMD
295295+ outval = _mm_set1_ps(0);
296296+#else
297297+ outval = 0;
298298+#endif
299299+ for (tap = oldestlag; tap < nlag; ++tap)
300300+ outval += circbuf[tap] * *tmph--;
301301+ for (tap = 0; tap < oldestlag; ++tap)
302302+ outval += circbuf[tap] * *tmph--;
303303+ outval += buf[k] * *tmph;
304304+#else
305305+# ifdef USE_SIMD
306306+ outval.r = outval.i = _mm_set1_ps(0);
307307+#else
308308+ outval.r = outval.i = 0;
309309+#endif
310310+ for (tap = oldestlag; tap < nlag; ++tap){
311311+ C_MUL(tmp,circbuf[tap],*tmph);
312312+ --tmph;
313313+ C_ADDTO(outval,tmp);
314314+ }
315315+316316+ for (tap = 0; tap < oldestlag; ++tap) {
317317+ C_MUL(tmp,circbuf[tap],*tmph);
318318+ --tmph;
319319+ C_ADDTO(outval,tmp);
320320+ }
321321+ C_MUL(tmp,buf[k],*tmph);
322322+ C_ADDTO(outval,tmp);
323323+#endif
324324+325325+ circbuf[oldestlag++] = buf[k];
326326+ buf[k] = outval;
327327+328328+ if (oldestlag == nlag)
329329+ oldestlag = 0;
330330+ }
331331+332332+ if (fwrite (buf, sizeof (buf[0]), nread, fout) != nread) {
333333+ perror ("short write");
334334+ exit (1);
335335+ }
336336+ } while (nread);
337337+ free (buf);
338338+ free (circbuf);
339339+}
340340+341341+static
342342+void do_file_filter(
343343+ FILE * fin,
344344+ FILE * fout,
345345+ const kffsamp_t * imp_resp,
346346+ size_t n_imp_resp,
347347+ size_t nfft )
348348+{
349349+ int fdout;
350350+ size_t n_samps_buf;
351351+352352+ kiss_fastfir_cfg cfg;
353353+ kffsamp_t *inbuf,*outbuf;
354354+ int nread,nwrite;
355355+ size_t idx_inbuf;
356356+357357+ fdout = fileno(fout);
358358+359359+ cfg=kiss_fastfir_alloc(imp_resp,n_imp_resp,&nfft,0,0);
360360+361361+ /* use length to minimize buffer shift*/
362362+ n_samps_buf = 8*4096/sizeof(kffsamp_t);
363363+ n_samps_buf = nfft + 4*(nfft-n_imp_resp+1);
364364+365365+ if (verbose) fprintf(stderr,"bufsize=%d\n",(int)(sizeof(kffsamp_t)*n_samps_buf) );
366366+367367+368368+ /*allocate space and initialize pointers */
369369+ inbuf = (kffsamp_t*)malloc(sizeof(kffsamp_t)*n_samps_buf);
370370+ outbuf = (kffsamp_t*)malloc(sizeof(kffsamp_t)*n_samps_buf);
371371+372372+ idx_inbuf=0;
373373+ do{
374374+ /* start reading at inbuf[idx_inbuf] */
375375+ nread = fread( inbuf + idx_inbuf, sizeof(kffsamp_t), n_samps_buf - idx_inbuf,fin );
376376+377377+ /* If nread==0, then this is a flush.
378378+ The total number of samples in input is idx_inbuf + nread . */
379379+ nwrite = kiss_fastfir(cfg, inbuf, outbuf,nread,&idx_inbuf) * sizeof(kffsamp_t);
380380+ /* kiss_fastfir moved any unused samples to the front of inbuf and updated idx_inbuf */
381381+382382+ if ( write(fdout, outbuf, nwrite) != nwrite ) {
383383+ perror("short write");
384384+ exit(1);
385385+ }
386386+ }while ( nread );
387387+ free(cfg);
388388+ free(inbuf);
389389+ free(outbuf);
390390+}
391391+392392+int main(int argc,char**argv)
393393+{
394394+ kffsamp_t * h;
395395+ int use_direct=0;
396396+ size_t nh,nfft=0;
397397+ FILE *fin=stdin;
398398+ FILE *fout=stdout;
399399+ FILE *filtfile=NULL;
400400+ while (1) {
401401+ int c=getopt(argc,argv,"n:h:i:o:vd");
402402+ if (c==-1) break;
403403+ switch (c) {
404404+ case 'v':
405405+ verbose=1;
406406+ break;
407407+ case 'n':
408408+ nfft=atoi(optarg);
409409+ break;
410410+ case 'i':
411411+ fin = fopen(optarg,"rb");
412412+ if (fin==NULL) {
413413+ perror(optarg);
414414+ exit(1);
415415+ }
416416+ break;
417417+ case 'o':
418418+ fout = fopen(optarg,"w+b");
419419+ if (fout==NULL) {
420420+ perror(optarg);
421421+ exit(1);
422422+ }
423423+ break;
424424+ case 'h':
425425+ filtfile = fopen(optarg,"rb");
426426+ if (filtfile==NULL) {
427427+ perror(optarg);
428428+ exit(1);
429429+ }
430430+ break;
431431+ case 'd':
432432+ use_direct=1;
433433+ break;
434434+ case '?':
435435+ fprintf(stderr,"usage options:\n"
436436+ "\t-n nfft: fft size to use\n"
437437+ "\t-d : use direct FIR filtering, not fast convolution\n"
438438+ "\t-i filename: input file\n"
439439+ "\t-o filename: output(filtered) file\n"
440440+ "\t-n nfft: fft size to use\n"
441441+ "\t-h filename: impulse response\n");
442442+ exit (1);
443443+ default:fprintf(stderr,"bad %c\n",c);break;
444444+ }
445445+ }
446446+ if (filtfile==NULL) {
447447+ fprintf(stderr,"You must supply the FIR coeffs via -h\n");
448448+ exit(1);
449449+ }
450450+ fseek(filtfile,0,SEEK_END);
451451+ nh = ftell(filtfile) / sizeof(kffsamp_t);
452452+ if (verbose) fprintf(stderr,"%d samples in FIR filter\n",(int)nh);
453453+ h = (kffsamp_t*)malloc(sizeof(kffsamp_t)*nh);
454454+ fseek(filtfile,0,SEEK_SET);
455455+ if (fread(h,sizeof(kffsamp_t),nh,filtfile) != nh)
456456+ fprintf(stderr,"short read on filter file\n");
457457+458458+ fclose(filtfile);
459459+460460+ if (use_direct)
461461+ direct_file_filter( fin, fout, h,nh);
462462+ else
463463+ do_file_filter( fin, fout, h,nh,nfft);
464464+465465+ if (fout!=stdout) fclose(fout);
466466+ if (fin!=stdin) fclose(fin);
467467+468468+ return 0;
469469+}
470470+#endif
+193
tools/kiss_fftnd.c
···11+22+33+/*
44+Copyright (c) 2003-2004, Mark Borgerding
55+66+All rights reserved.
77+88+Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
99+1010+ * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
1111+ * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
1212+ * Neither the author nor the names of any contributors may be used to endorse or promote products derived from this software without specific prior written permission.
1313+1414+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
1515+*/
1616+1717+#include "kiss_fftnd.h"
1818+#include "_kiss_fft_guts.h"
1919+2020+struct kiss_fftnd_state{
2121+ int dimprod; /* dimsum would be mighty tasty right now */
2222+ int ndims;
2323+ int *dims;
2424+ kiss_fft_cfg *states; /* cfg states for each dimension */
2525+ kiss_fft_cpx * tmpbuf; /*buffer capable of hold the entire input */
2626+};
2727+2828+kiss_fftnd_cfg kiss_fftnd_alloc(const int *dims,int ndims,int inverse_fft,void*mem,size_t*lenmem)
2929+{
3030+ kiss_fftnd_cfg st = NULL;
3131+ int i;
3232+ int dimprod=1;
3333+ size_t memneeded = sizeof(struct kiss_fftnd_state);
3434+ char * ptr;
3535+3636+ for (i=0;i<ndims;++i) {
3737+ size_t sublen=0;
3838+ kiss_fft_alloc (dims[i], inverse_fft, NULL, &sublen);
3939+ memneeded += sublen; /* st->states[i] */
4040+ dimprod *= dims[i];
4141+ }
4242+ memneeded += sizeof(int) * ndims;/* st->dims */
4343+ memneeded += sizeof(void*) * ndims;/* st->states */
4444+ memneeded += sizeof(kiss_fft_cpx) * dimprod; /* st->tmpbuf */
4545+4646+ if (lenmem == NULL) {/* allocate for the caller*/
4747+ st = (kiss_fftnd_cfg) malloc (memneeded);
4848+ } else { /* initialize supplied buffer if big enough */
4949+ if (*lenmem >= memneeded)
5050+ st = (kiss_fftnd_cfg) mem;
5151+ *lenmem = memneeded; /*tell caller how big struct is (or would be) */
5252+ }
5353+ if (!st)
5454+ return NULL; /*malloc failed or buffer too small */
5555+5656+ st->dimprod = dimprod;
5757+ st->ndims = ndims;
5858+ ptr=(char*)(st+1);
5959+6060+ st->states = (kiss_fft_cfg *)ptr;
6161+ ptr += sizeof(void*) * ndims;
6262+6363+ st->dims = (int*)ptr;
6464+ ptr += sizeof(int) * ndims;
6565+6666+ st->tmpbuf = (kiss_fft_cpx*)ptr;
6767+ ptr += sizeof(kiss_fft_cpx) * dimprod;
6868+6969+ for (i=0;i<ndims;++i) {
7070+ size_t len;
7171+ st->dims[i] = dims[i];
7272+ kiss_fft_alloc (st->dims[i], inverse_fft, NULL, &len);
7373+ st->states[i] = kiss_fft_alloc (st->dims[i], inverse_fft, ptr,&len);
7474+ ptr += len;
7575+ }
7676+ /*
7777+Hi there!
7878+7979+If you're looking at this particular code, it probably means you've got a brain-dead bounds checker
8080+that thinks the above code overwrites the end of the array.
8181+8282+It doesn't.
8383+8484+-- Mark
8585+8686+P.S.
8787+The below code might give you some warm fuzzies and help convince you.
8888+ */
8989+ if ( ptr - (char*)st != (int)memneeded ) {
9090+ fprintf(stderr,
9191+ "################################################################################\n"
9292+ "Internal error! Memory allocation miscalculation\n"
9393+ "################################################################################\n"
9494+ );
9595+ }
9696+ return st;
9797+}
9898+9999+/*
100100+ This works by tackling one dimension at a time.
101101+102102+ In effect,
103103+ Each stage starts out by reshaping the matrix into a DixSi 2d matrix.
104104+ A Di-sized fft is taken of each column, transposing the matrix as it goes.
105105+106106+Here's a 3-d example:
107107+Take a 2x3x4 matrix, laid out in memory as a contiguous buffer
108108+ [ [ [ a b c d ] [ e f g h ] [ i j k l ] ]
109109+ [ [ m n o p ] [ q r s t ] [ u v w x ] ] ]
110110+111111+Stage 0 ( D=2): treat the buffer as a 2x12 matrix
112112+ [ [a b ... k l]
113113+ [m n ... w x] ]
114114+115115+ FFT each column with size 2.
116116+ Transpose the matrix at the same time using kiss_fft_stride.
117117+118118+ [ [ a+m a-m ]
119119+ [ b+n b-n]
120120+ ...
121121+ [ k+w k-w ]
122122+ [ l+x l-x ] ]
123123+124124+ Note fft([x y]) == [x+y x-y]
125125+126126+Stage 1 ( D=3) treats the buffer (the output of stage D=2) as an 3x8 matrix,
127127+ [ [ a+m a-m b+n b-n c+o c-o d+p d-p ]
128128+ [ e+q e-q f+r f-r g+s g-s h+t h-t ]
129129+ [ i+u i-u j+v j-v k+w k-w l+x l-x ] ]
130130+131131+ And perform FFTs (size=3) on each of the columns as above, transposing
132132+ the matrix as it goes. The output of stage 1 is
133133+ (Legend: ap = [ a+m e+q i+u ]
134134+ am = [ a-m e-q i-u ] )
135135+136136+ [ [ sum(ap) fft(ap)[0] fft(ap)[1] ]
137137+ [ sum(am) fft(am)[0] fft(am)[1] ]
138138+ [ sum(bp) fft(bp)[0] fft(bp)[1] ]
139139+ [ sum(bm) fft(bm)[0] fft(bm)[1] ]
140140+ [ sum(cp) fft(cp)[0] fft(cp)[1] ]
141141+ [ sum(cm) fft(cm)[0] fft(cm)[1] ]
142142+ [ sum(dp) fft(dp)[0] fft(dp)[1] ]
143143+ [ sum(dm) fft(dm)[0] fft(dm)[1] ] ]
144144+145145+Stage 2 ( D=4) treats this buffer as a 4*6 matrix,
146146+ [ [ sum(ap) fft(ap)[0] fft(ap)[1] sum(am) fft(am)[0] fft(am)[1] ]
147147+ [ sum(bp) fft(bp)[0] fft(bp)[1] sum(bm) fft(bm)[0] fft(bm)[1] ]
148148+ [ sum(cp) fft(cp)[0] fft(cp)[1] sum(cm) fft(cm)[0] fft(cm)[1] ]
149149+ [ sum(dp) fft(dp)[0] fft(dp)[1] sum(dm) fft(dm)[0] fft(dm)[1] ] ]
150150+151151+ Then FFTs each column, transposing as it goes.
152152+153153+ The resulting matrix is the 3d FFT of the 2x3x4 input matrix.
154154+155155+ Note as a sanity check that the first element of the final
156156+ stage's output (DC term) is
157157+ sum( [ sum(ap) sum(bp) sum(cp) sum(dp) ] )
158158+ , i.e. the summation of all 24 input elements.
159159+160160+*/
161161+void kiss_fftnd(kiss_fftnd_cfg st,const kiss_fft_cpx *fin,kiss_fft_cpx *fout)
162162+{
163163+ int i,k;
164164+ const kiss_fft_cpx * bufin=fin;
165165+ kiss_fft_cpx * bufout;
166166+167167+ /*arrange it so the last bufout == fout*/
168168+ if ( st->ndims & 1 ) {
169169+ bufout = fout;
170170+ if (fin==fout) {
171171+ memcpy( st->tmpbuf, fin, sizeof(kiss_fft_cpx) * st->dimprod );
172172+ bufin = st->tmpbuf;
173173+ }
174174+ }else
175175+ bufout = st->tmpbuf;
176176+177177+ for ( k=0; k < st->ndims; ++k) {
178178+ int curdim = st->dims[k];
179179+ int stride = st->dimprod / curdim;
180180+181181+ for ( i=0 ; i<stride ; ++i )
182182+ kiss_fft_stride( st->states[k], bufin+i , bufout+i*curdim, stride );
183183+184184+ /*toggle back and forth between the two buffers*/
185185+ if (bufout == st->tmpbuf){
186186+ bufout = fout;
187187+ bufin = st->tmpbuf;
188188+ }else{
189189+ bufout = st->tmpbuf;
190190+ bufin = fout;
191191+ }
192192+ }
193193+}
···11+/*
22+Copyright (c) 2003-2004, Mark Borgerding
33+44+All rights reserved.
55+66+Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
77+88+ * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
99+ * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
1010+ * Neither the author nor the names of any contributors may be used to endorse or promote products derived from this software without specific prior written permission.
1111+1212+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
1313+*/
1414+1515+#include "kiss_fftndr.h"
1616+#include "_kiss_fft_guts.h"
1717+#define MAX(x,y) ( ( (x)<(y) )?(y):(x) )
1818+1919+struct kiss_fftndr_state
2020+{
2121+ int dimReal;
2222+ int dimOther;
2323+ kiss_fftr_cfg cfg_r;
2424+ kiss_fftnd_cfg cfg_nd;
2525+ void * tmpbuf;
2626+};
2727+2828+static int prod(const int *dims, int ndims)
2929+{
3030+ int x=1;
3131+ while (ndims--)
3232+ x *= *dims++;
3333+ return x;
3434+}
3535+3636+kiss_fftndr_cfg kiss_fftndr_alloc(const int *dims,int ndims,int inverse_fft,void*mem,size_t*lenmem)
3737+{
3838+ kiss_fftndr_cfg st = NULL;
3939+ size_t nr=0 , nd=0,ntmp=0;
4040+ int dimReal = dims[ndims-1];
4141+ int dimOther = prod(dims,ndims-1);
4242+ size_t memneeded;
4343+4444+ (void)kiss_fftr_alloc(dimReal,inverse_fft,NULL,&nr);
4545+ (void)kiss_fftnd_alloc(dims,ndims-1,inverse_fft,NULL,&nd);
4646+ ntmp =
4747+ MAX( 2*dimOther , dimReal+2) * sizeof(kiss_fft_scalar) // freq buffer for one pass
4848+ + dimOther*(dimReal+2) * sizeof(kiss_fft_scalar); // large enough to hold entire input in case of in-place
4949+5050+ memneeded = sizeof( struct kiss_fftndr_state ) + nr + nd + ntmp;
5151+5252+ if (lenmem==NULL) {
5353+ st = (kiss_fftndr_cfg) malloc(memneeded);
5454+ }else{
5555+ if (*lenmem >= memneeded)
5656+ st = (kiss_fftndr_cfg)mem;
5757+ *lenmem = memneeded;
5858+ }
5959+ if (st==NULL)
6060+ return NULL;
6161+ memset( st , 0 , memneeded);
6262+6363+ st->dimReal = dimReal;
6464+ st->dimOther = dimOther;
6565+ st->cfg_r = kiss_fftr_alloc( dimReal,inverse_fft,st+1,&nr);
6666+ st->cfg_nd = kiss_fftnd_alloc(dims,ndims-1,inverse_fft, ((char*) st->cfg_r)+nr,&nd);
6767+ st->tmpbuf = (char*)st->cfg_nd + nd;
6868+6969+ return st;
7070+}
7171+7272+void kiss_fftndr(kiss_fftndr_cfg st,const kiss_fft_scalar *timedata,kiss_fft_cpx *freqdata)
7373+{
7474+ int k1,k2;
7575+ int dimReal = st->dimReal;
7676+ int dimOther = st->dimOther;
7777+ int nrbins = dimReal/2+1;
7878+7979+ kiss_fft_cpx * tmp1 = (kiss_fft_cpx*)st->tmpbuf;
8080+ kiss_fft_cpx * tmp2 = tmp1 + MAX(nrbins,dimOther);
8181+8282+ // timedata is N0 x N1 x ... x Nk real
8383+8484+ // take a real chunk of data, fft it and place the output at correct intervals
8585+ for (k1=0;k1<dimOther;++k1) {
8686+ kiss_fftr( st->cfg_r, timedata + k1*dimReal , tmp1 ); // tmp1 now holds nrbins complex points
8787+ for (k2=0;k2<nrbins;++k2)
8888+ tmp2[ k2*dimOther+k1 ] = tmp1[k2];
8989+ }
9090+9191+ for (k2=0;k2<nrbins;++k2) {
9292+ kiss_fftnd(st->cfg_nd, tmp2+k2*dimOther, tmp1); // tmp1 now holds dimOther complex points
9393+ for (k1=0;k1<dimOther;++k1)
9494+ freqdata[ k1*(nrbins) + k2] = tmp1[k1];
9595+ }
9696+}
9797+9898+void kiss_fftndri(kiss_fftndr_cfg st,const kiss_fft_cpx *freqdata,kiss_fft_scalar *timedata)
9999+{
100100+ int k1,k2;
101101+ int dimReal = st->dimReal;
102102+ int dimOther = st->dimOther;
103103+ int nrbins = dimReal/2+1;
104104+ kiss_fft_cpx * tmp1 = (kiss_fft_cpx*)st->tmpbuf;
105105+ kiss_fft_cpx * tmp2 = tmp1 + MAX(nrbins,dimOther);
106106+107107+ for (k2=0;k2<nrbins;++k2) {
108108+ for (k1=0;k1<dimOther;++k1)
109109+ tmp1[k1] = freqdata[ k1*(nrbins) + k2 ];
110110+ kiss_fftnd(st->cfg_nd, tmp1, tmp2+k2*dimOther);
111111+ }
112112+113113+ for (k1=0;k1<dimOther;++k1) {
114114+ for (k2=0;k2<nrbins;++k2)
115115+ tmp1[k2] = tmp2[ k2*dimOther+k1 ];
116116+ kiss_fftri( st->cfg_r,tmp1,timedata + k1*dimReal);
117117+ }
118118+}
+47
tools/kiss_fftndr.h
···11+#ifndef KISS_NDR_H
22+#define KISS_NDR_H
33+44+#include "kiss_fft.h"
55+#include "kiss_fftr.h"
66+#include "kiss_fftnd.h"
77+88+#ifdef __cplusplus
99+extern "C" {
1010+#endif
1111+1212+typedef struct kiss_fftndr_state *kiss_fftndr_cfg;
1313+1414+1515+kiss_fftndr_cfg kiss_fftndr_alloc(const int *dims,int ndims,int inverse_fft,void*mem,size_t*lenmem);
1616+/*
1717+ dims[0] must be even
1818+1919+ If you don't care to allocate space, use mem = lenmem = NULL
2020+*/
2121+2222+2323+void kiss_fftndr(
2424+ kiss_fftndr_cfg cfg,
2525+ const kiss_fft_scalar *timedata,
2626+ kiss_fft_cpx *freqdata);
2727+/*
2828+ input timedata has dims[0] X dims[1] X ... X dims[ndims-1] scalar points
2929+ output freqdata has dims[0] X dims[1] X ... X dims[ndims-1]/2+1 complex points
3030+*/
3131+3232+void kiss_fftndri(
3333+ kiss_fftndr_cfg cfg,
3434+ const kiss_fft_cpx *freqdata,
3535+ kiss_fft_scalar *timedata);
3636+/*
3737+ input and output dimensions are the exact opposite of kiss_fftndr
3838+*/
3939+4040+4141+#define kiss_fftr_free free
4242+4343+#ifdef __cplusplus
4444+}
4545+#endif
4646+4747+#endif
+159
tools/kiss_fftr.c
···11+/*
22+Copyright (c) 2003-2004, Mark Borgerding
33+44+All rights reserved.
55+66+Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
77+88+ * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
99+ * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
1010+ * Neither the author nor the names of any contributors may be used to endorse or promote products derived from this software without specific prior written permission.
1111+1212+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
1313+*/
1414+1515+#include "kiss_fftr.h"
1616+#include "_kiss_fft_guts.h"
1717+1818+struct kiss_fftr_state{
1919+ kiss_fft_cfg substate;
2020+ kiss_fft_cpx * tmpbuf;
2121+ kiss_fft_cpx * super_twiddles;
2222+#ifdef USE_SIMD
2323+ void * pad;
2424+#endif
2525+};
2626+2727+kiss_fftr_cfg kiss_fftr_alloc(int nfft,int inverse_fft,void * mem,size_t * lenmem)
2828+{
2929+ int i;
3030+ kiss_fftr_cfg st = NULL;
3131+ size_t subsize, memneeded;
3232+3333+ if (nfft & 1) {
3434+ fprintf(stderr,"Real FFT optimization must be even.\n");
3535+ return NULL;
3636+ }
3737+ nfft >>= 1;
3838+3939+ kiss_fft_alloc (nfft, inverse_fft, NULL, &subsize);
4040+ memneeded = sizeof(struct kiss_fftr_state) + subsize + sizeof(kiss_fft_cpx) * ( nfft * 3 / 2);
4141+4242+ if (lenmem == NULL) {
4343+ st = (kiss_fftr_cfg) KISS_FFT_MALLOC (memneeded);
4444+ } else {
4545+ if (*lenmem >= memneeded)
4646+ st = (kiss_fftr_cfg) mem;
4747+ *lenmem = memneeded;
4848+ }
4949+ if (!st)
5050+ return NULL;
5151+5252+ st->substate = (kiss_fft_cfg) (st + 1); /*just beyond kiss_fftr_state struct */
5353+ st->tmpbuf = (kiss_fft_cpx *) (((char *) st->substate) + subsize);
5454+ st->super_twiddles = st->tmpbuf + nfft;
5555+ kiss_fft_alloc(nfft, inverse_fft, st->substate, &subsize);
5656+5757+ for (i = 0; i < nfft/2; ++i) {
5858+ double phase =
5959+ -3.14159265358979323846264338327 * ((double) (i+1) / nfft + .5);
6060+ if (inverse_fft)
6161+ phase *= -1;
6262+ kf_cexp (st->super_twiddles+i,phase);
6363+ }
6464+ return st;
6565+}
6666+6767+void kiss_fftr(kiss_fftr_cfg st,const kiss_fft_scalar *timedata,kiss_fft_cpx *freqdata)
6868+{
6969+ /* input buffer timedata is stored row-wise */
7070+ int k,ncfft;
7171+ kiss_fft_cpx fpnk,fpk,f1k,f2k,tw,tdc;
7272+7373+ if ( st->substate->inverse) {
7474+ fprintf(stderr,"kiss fft usage error: improper alloc\n");
7575+ exit(1);
7676+ }
7777+7878+ ncfft = st->substate->nfft;
7979+8080+ /*perform the parallel fft of two real signals packed in real,imag*/
8181+ kiss_fft( st->substate , (const kiss_fft_cpx*)timedata, st->tmpbuf );
8282+ /* The real part of the DC element of the frequency spectrum in st->tmpbuf
8383+ * contains the sum of the even-numbered elements of the input time sequence
8484+ * The imag part is the sum of the odd-numbered elements
8585+ *
8686+ * The sum of tdc.r and tdc.i is the sum of the input time sequence.
8787+ * yielding DC of input time sequence
8888+ * The difference of tdc.r - tdc.i is the sum of the input (dot product) [1,-1,1,-1...
8989+ * yielding Nyquist bin of input time sequence
9090+ */
9191+9292+ tdc.r = st->tmpbuf[0].r;
9393+ tdc.i = st->tmpbuf[0].i;
9494+ C_FIXDIV(tdc,2);
9595+ CHECK_OVERFLOW_OP(tdc.r ,+, tdc.i);
9696+ CHECK_OVERFLOW_OP(tdc.r ,-, tdc.i);
9797+ freqdata[0].r = tdc.r + tdc.i;
9898+ freqdata[ncfft].r = tdc.r - tdc.i;
9999+#ifdef USE_SIMD
100100+ freqdata[ncfft].i = freqdata[0].i = _mm_set1_ps(0);
101101+#else
102102+ freqdata[ncfft].i = freqdata[0].i = 0;
103103+#endif
104104+105105+ for ( k=1;k <= ncfft/2 ; ++k ) {
106106+ fpk = st->tmpbuf[k];
107107+ fpnk.r = st->tmpbuf[ncfft-k].r;
108108+ fpnk.i = - st->tmpbuf[ncfft-k].i;
109109+ C_FIXDIV(fpk,2);
110110+ C_FIXDIV(fpnk,2);
111111+112112+ C_ADD( f1k, fpk , fpnk );
113113+ C_SUB( f2k, fpk , fpnk );
114114+ C_MUL( tw , f2k , st->super_twiddles[k-1]);
115115+116116+ freqdata[k].r = HALF_OF(f1k.r + tw.r);
117117+ freqdata[k].i = HALF_OF(f1k.i + tw.i);
118118+ freqdata[ncfft-k].r = HALF_OF(f1k.r - tw.r);
119119+ freqdata[ncfft-k].i = HALF_OF(tw.i - f1k.i);
120120+ }
121121+}
122122+123123+void kiss_fftri(kiss_fftr_cfg st,const kiss_fft_cpx *freqdata,kiss_fft_scalar *timedata)
124124+{
125125+ /* input buffer timedata is stored row-wise */
126126+ int k, ncfft;
127127+128128+ if (st->substate->inverse == 0) {
129129+ fprintf (stderr, "kiss fft usage error: improper alloc\n");
130130+ exit (1);
131131+ }
132132+133133+ ncfft = st->substate->nfft;
134134+135135+ st->tmpbuf[0].r = freqdata[0].r + freqdata[ncfft].r;
136136+ st->tmpbuf[0].i = freqdata[0].r - freqdata[ncfft].r;
137137+ C_FIXDIV(st->tmpbuf[0],2);
138138+139139+ for (k = 1; k <= ncfft / 2; ++k) {
140140+ kiss_fft_cpx fk, fnkc, fek, fok, tmp;
141141+ fk = freqdata[k];
142142+ fnkc.r = freqdata[ncfft - k].r;
143143+ fnkc.i = -freqdata[ncfft - k].i;
144144+ C_FIXDIV( fk , 2 );
145145+ C_FIXDIV( fnkc , 2 );
146146+147147+ C_ADD (fek, fk, fnkc);
148148+ C_SUB (tmp, fk, fnkc);
149149+ C_MUL (fok, tmp, st->super_twiddles[k-1]);
150150+ C_ADD (st->tmpbuf[k], fek, fok);
151151+ C_SUB (st->tmpbuf[ncfft - k], fek, fok);
152152+#ifdef USE_SIMD
153153+ st->tmpbuf[ncfft - k].i *= _mm_set1_ps(-1.0);
154154+#else
155155+ st->tmpbuf[ncfft - k].i *= -1;
156156+#endif
157157+ }
158158+ kiss_fft (st->substate, st->tmpbuf, (kiss_fft_cpx *) timedata);
159159+}
+46
tools/kiss_fftr.h
···11+#ifndef KISS_FTR_H
22+#define KISS_FTR_H
33+44+#include "kiss_fft.h"
55+#ifdef __cplusplus
66+extern "C" {
77+#endif
88+99+1010+/*
1111+1212+ Real optimized version can save about 45% cpu time vs. complex fft of a real seq.
1313+1414+1515+1616+ */
1717+1818+typedef struct kiss_fftr_state *kiss_fftr_cfg;
1919+2020+2121+kiss_fftr_cfg kiss_fftr_alloc(int nfft,int inverse_fft,void * mem, size_t * lenmem);
2222+/*
2323+ nfft must be even
2424+2525+ If you don't care to allocate space, use mem = lenmem = NULL
2626+*/
2727+2828+2929+void kiss_fftr(kiss_fftr_cfg cfg,const kiss_fft_scalar *timedata,kiss_fft_cpx *freqdata);
3030+/*
3131+ input timedata has nfft scalar points
3232+ output freqdata has nfft/2+1 complex points
3333+*/
3434+3535+void kiss_fftri(kiss_fftr_cfg cfg,const kiss_fft_cpx *freqdata,kiss_fft_scalar *timedata);
3636+/*
3737+ input freqdata has nfft/2+1 complex points
3838+ output timedata has nfft scalar points
3939+*/
4040+4141+#define kiss_fftr_free free
4242+4343+#ifdef __cplusplus
4444+}
4545+#endif
4646+#endif
+235
tools/psdpng.c
···11+/*
22+Copyright (c) 2003-2004, Mark Borgerding
33+44+All rights reserved.
55+66+Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
77+88+ * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
99+ * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
1010+ * Neither the author nor the names of any contributors may be used to endorse or promote products derived from this software without specific prior written permission.
1111+1212+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
1313+*/
1414+1515+#include <stdlib.h>
1616+#include <math.h>
1717+#include <stdio.h>
1818+#include <string.h>
1919+#include <unistd.h>
2020+#include <png.h>
2121+2222+#include "kiss_fft.h"
2323+#include "kiss_fftr.h"
2424+2525+int nfft=1024;
2626+FILE * fin=NULL;
2727+FILE * fout=NULL;
2828+2929+int navg=20;
3030+int remove_dc=0;
3131+int nrows=0;
3232+float * vals=NULL;
3333+int stereo=0;
3434+3535+static
3636+void config(int argc,char** argv)
3737+{
3838+ while (1) {
3939+ int c = getopt (argc, argv, "n:r:as");
4040+ if (c == -1)
4141+ break;
4242+ switch (c) {
4343+ case 'n': nfft=(int)atoi(optarg);break;
4444+ case 'r': navg=(int)atoi(optarg);break;
4545+ case 'a': remove_dc=1;break;
4646+ case 's': stereo=1;break;
4747+ case '?':
4848+ fprintf (stderr, "usage options:\n"
4949+ "\t-n d: fft dimension(s) [1024]\n"
5050+ "\t-r d: number of rows to average [20]\n"
5151+ "\t-a : remove average from each fft buffer\n"
5252+ "\t-s : input is stereo, channels will be combined before fft\n"
5353+ "16 bit machine format real input is assumed\n"
5454+ );
5555+ default:
5656+ fprintf (stderr, "bad %c\n", c);
5757+ exit (1);
5858+ break;
5959+ }
6060+ }
6161+ if ( optind < argc ) {
6262+ if (strcmp("-",argv[optind]) !=0)
6363+ fin = fopen(argv[optind],"rb");
6464+ ++optind;
6565+ }
6666+6767+ if ( optind < argc ) {
6868+ if ( strcmp("-",argv[optind]) !=0 )
6969+ fout = fopen(argv[optind],"wb");
7070+ ++optind;
7171+ }
7272+ if (fin==NULL)
7373+ fin=stdin;
7474+ if (fout==NULL)
7575+ fout=stdout;
7676+}
7777+7878+#define CHECKNULL(p) if ( (p)==NULL ) do { fprintf(stderr,"CHECKNULL failed @ %s(%d): %s\n",__FILE__,__LINE__,#p );exit(1);} while(0)
7979+8080+typedef struct
8181+{
8282+ png_byte r;
8383+ png_byte g;
8484+ png_byte b;
8585+} rgb_t;
8686+8787+static
8888+void val2rgb(float x,rgb_t *p)
8989+{
9090+ const double pi = 3.14159265358979;
9191+ p->g = (int)(255*sin(x*pi));
9292+ p->r = (int)(255*abs(sin(x*pi*3/2)));
9393+ p->b = (int)(255*abs(sin(x*pi*5/2)));
9494+ //fprintf(stderr,"%.2f : %d,%d,%d\n",x,(int)p->r,(int)p->g,(int)p->b);
9595+}
9696+9797+static
9898+void cpx2pixels(rgb_t * res,const float * fbuf,size_t n)
9999+{
100100+ size_t i;
101101+ float minval,maxval,valrange;
102102+ minval=maxval=fbuf[0];
103103+104104+ for (i = 0; i < n; ++i) {
105105+ if (fbuf[i] > maxval) maxval = fbuf[i];
106106+ if (fbuf[i] < minval) minval = fbuf[i];
107107+ }
108108+109109+ fprintf(stderr,"min ==%f,max=%f\n",minval,maxval);
110110+ valrange = maxval-minval;
111111+ if (valrange == 0) {
112112+ fprintf(stderr,"min == max == %f\n",minval);
113113+ exit (1);
114114+ }
115115+116116+ for (i = 0; i < n; ++i)
117117+ val2rgb( (fbuf[i] - minval)/valrange , res+i );
118118+}
119119+120120+static
121121+void transform_signal(void)
122122+{
123123+ short *inbuf;
124124+ kiss_fftr_cfg cfg=NULL;
125125+ kiss_fft_scalar *tbuf;
126126+ kiss_fft_cpx *fbuf;
127127+ float *mag2buf;
128128+ int i;
129129+ int n;
130130+ int avgctr=0;
131131+132132+ int nfreqs=nfft/2+1;
133133+134134+ CHECKNULL( cfg=kiss_fftr_alloc(nfft,0,0,0) );
135135+ CHECKNULL( inbuf=(short*)malloc(sizeof(short)*2*nfft ) );
136136+ CHECKNULL( tbuf=(kiss_fft_scalar*)malloc(sizeof(kiss_fft_scalar)*nfft ) );
137137+ CHECKNULL( fbuf=(kiss_fft_cpx*)malloc(sizeof(kiss_fft_cpx)*nfreqs ) );
138138+ CHECKNULL( mag2buf=(float*)malloc(sizeof(float)*nfreqs ) );
139139+140140+ memset(mag2buf,0,sizeof(mag2buf)*nfreqs);
141141+142142+ while (1) {
143143+ if (stereo) {
144144+ n = fread(inbuf,sizeof(short)*2,nfft,fin);
145145+ if (n != nfft )
146146+ break;
147147+ for (i=0;i<nfft;++i)
148148+ tbuf[i] = inbuf[2*i] + inbuf[2*i+1];
149149+ }else{
150150+ n = fread(inbuf,sizeof(short),nfft,fin);
151151+ if (n != nfft )
152152+ break;
153153+ for (i=0;i<nfft;++i)
154154+ tbuf[i] = inbuf[i];
155155+ }
156156+157157+ if (remove_dc) {
158158+ float avg = 0;
159159+ for (i=0;i<nfft;++i) avg += tbuf[i];
160160+ avg /= nfft;
161161+ for (i=0;i<nfft;++i) tbuf[i] -= (kiss_fft_scalar)avg;
162162+ }
163163+164164+ /* do FFT */
165165+ kiss_fftr(cfg,tbuf,fbuf);
166166+167167+ for (i=0;i<nfreqs;++i)
168168+ mag2buf[i] += fbuf[i].r * fbuf[i].r + fbuf[i].i * fbuf[i].i;
169169+170170+ if (++avgctr == navg) {
171171+ avgctr=0;
172172+ ++nrows;
173173+ vals = (float*)realloc(vals,sizeof(float)*nrows*nfreqs);
174174+ float eps = 1;
175175+ for (i=0;i<nfreqs;++i)
176176+ vals[(nrows - 1) * nfreqs + i] = 10 * log10 ( mag2buf[i] / navg + eps );
177177+ memset(mag2buf,0,sizeof(mag2buf[0])*nfreqs);
178178+ }
179179+ }
180180+181181+ free(cfg);
182182+ free(inbuf);
183183+ free(tbuf);
184184+ free(fbuf);
185185+ free(mag2buf);
186186+}
187187+188188+static
189189+void make_png(void)
190190+{
191191+ png_bytepp row_pointers=NULL;
192192+ rgb_t * row_data=NULL;
193193+ int i;
194194+ int nfreqs = nfft/2+1;
195195+196196+ png_structp png_ptr=NULL;
197197+ png_infop info_ptr=NULL;
198198+199199+ CHECKNULL( png_ptr = png_create_write_struct (PNG_LIBPNG_VER_STRING,0,0,0) );
200200+ CHECKNULL( info_ptr = png_create_info_struct(png_ptr) );
201201+202202+203203+ png_init_io(png_ptr, fout );
204204+ png_set_IHDR(png_ptr, info_ptr ,nfreqs,nrows,8,PNG_COLOR_TYPE_RGB,PNG_INTERLACE_NONE,PNG_COMPRESSION_TYPE_DEFAULT,PNG_FILTER_TYPE_DEFAULT );
205205+206206+207207+ row_data = (rgb_t*)malloc(sizeof(rgb_t) * nrows * nfreqs) ;
208208+ cpx2pixels(row_data, vals, nfreqs*nrows );
209209+210210+ row_pointers = realloc(row_pointers, nrows*sizeof(png_bytep));
211211+ for (i=0;i<nrows;++i) {
212212+ row_pointers[i] = (png_bytep)(row_data + i*nfreqs);
213213+ }
214214+ png_set_rows(png_ptr, info_ptr, row_pointers);
215215+216216+217217+ fprintf(stderr,"creating %dx%d png\n",nfreqs,nrows);
218218+ fprintf(stderr,"bitdepth %d \n",png_get_bit_depth(png_ptr,info_ptr ) );
219219+220220+ png_write_png(png_ptr, info_ptr, PNG_TRANSFORM_IDENTITY , NULL);
221221+222222+}
223223+224224+int main(int argc,char ** argv)
225225+{
226226+ config(argc,argv);
227227+228228+ transform_signal();
229229+230230+ make_png();
231231+232232+ if (fout!=stdout) fclose(fout);
233233+ if (fin!=stdin) fclose(fin);
234234+ return 0;
235235+}