This chapter describes the format in which multi-dimensional arrays are
stored. We felt that a detailed discussion of this topic was necessary,
since it is often a source of confusion among users and several
different formats are common. Although the comments below refer to
fftwnd
, they are also applicable to the rfftwnd
routines.
The multi-dimensional arrays passed to fftwnd
are expected to be
stored as a single contiguous block in row-major order (sometimes
called "C order"). Basically, this means that as you step through
adjacent memory locations, the first dimension's index varies most
slowly and the last dimension's index varies most quickly.
To be more explicit, let us consider an array of rank d whose dimensions are n1 x n2 x n3 x ... x nd. Now, we specify a location in the array by a sequence of (zero-based) indices, one for each dimension: (i1, i2, i3,..., id). If the array is stored in row-major order, then this element is located at the position id + nd * (id-1 + nd-1 * (... + n2 * i1)).
Note that each element of the array must be of type fftw_complex
;
i.e. a (real, imaginary) pair of (double-precision) numbers. Note also
that, in fftwnd
, the expression above is multiplied by the stride
to get the actual array index--this is useful in situations where each
element of the multi-dimensional array is actually a data structure or
another array, and you just want to transform a single field. In most
cases, however, you use a stride of 1.
Readers from the Fortran world are used to arrays stored in column-major order (sometimes called "Fortran order"). This is essentially the exact opposite of row-major order in that, here, the first dimension's index varies most quickly.
If you have an array stored in column-major order and wish to transform
it using fftwnd
, it is quite easy to do. When creating the plan,
simply pass the dimensions of the array to fftwnd_create_plan
in
reverse order. For example, if your array is a rank three
N x M x L
matrix in column-major order, you should pass the
dimensions of the array as if it were an L x M x N
matrix (which
it is, from perspective of fftwnd
).
Multi-dimensional arrays declared statically (that is, at compile time,
not necessarily with the static
keyword) in C are already
in row-major order. You don't have to do anything special to transform
them. (See Section Complex Multi-dimensional Transforms Tutorial, for an
example of this sort of code.)
Often, especially for large arrays, it is desirable to allocate the arrays dynamically, at runtime. This isn't too hard to do, although it is not as straightforward for multi-dimensional arrays as it is for one-dimensional arrays.
Creating the array is simple: using a dynamic-allocation routine like
malloc
, allocate an array big enough to store N fftw_complex
values, where N is the product of the sizes of the array dimensions
(i.e. the total number of complex values in the array). For example,
here is code to allocate a 5x12x27 rank 3 array:
fftw_complex *an_array; an_array = (fftw_complex *) malloc(5 * 12 * 27 * sizeof(fftw_complex));
Accessing the array elements, however, is more tricky--you can't simply
use multiple applications of the `[]' operator like you could for
static arrays. Instead, you have to explicitly compute the offset into
the array using the formula given earlier for row-major arrays. For
example, to reference the (i,j,k)-th element of the array
allocated above, you would use the expression an_array[k + 27 * (j
+ 12 * i)]
.
This pain can be alleviated somewhat by defining appropriate macros, or, in C++, creating a class and overloading the `()' operator.
A different method for allocating multi-dimensional arrays in C is often
suggested which, though occasionally more convenient, produces very poor
performance. More to the point, it is incompatible with fftwnd
;
using it will cause FFTW to die a painful death. We discuss the
technique here, however, because it is so commonly known and used. This
method is to create arrays of pointers of arrays of pointers
of ...etcetera. For example, the analogue in this method to the example
above is:
int i,j; fftw_complex ***a_slow_array; /* another way to make a 5x12x27 array */ a_slow_array = (fftw_complex ***) malloc(5 * sizeof(fftw_complex **)); for (i = 0; i < 5; ++i) { a_slow_array[i] = (fftw_complex **) malloc(12 * sizeof(fftw_complex *)); for (j = 0; j < 12; ++j) a_slow_array[i][j] = (fftw_complex *) malloc(27 * sizeof(fftw_complex)); }
As you can see, this sort of array is inconvenient to allocate (and
deallocate). On the other hand, it has the advantage that the
(i,j,k)-th element can be referenced simply by
a_slow_array[i][j][k]
.
Even if compatibility with FFTW were not a concern, you should not
allocate multi-dimensional arrays in this way when performance is
important. Expressions like a_slow_array[i][j][k]
really
correspond to *(*(*(a_slow_array + i) + j) + k)
, whereas
expressions like an_array[k + 27 * (j + 12 * i)]
from before
correspond to *(an_array + k + 27 * (j + 12 * i))
. As you can
see, the former expression substitutes two pointer dereferences for the
two multiplications in the latter expression. A pointer dereference,
however, translates into a load from memory and, as everyone should
know, a memory access is about the most expensive instruction you can
possibly execute--much more expensive than a multiplication, often even
if the memory access is in cache.
If you want to maximize convenience in accessing the array and
performance is not critical, but still want to pass the array to FFTW,
you can use a hybrid technique. Allocate the array as one contiguous
block, but also declare an array of arrays of pointers that points to
appropriate places in the block. That sort of trick is beyond the scope
of this documentation; for more information on multi-dimensional arrays
in C, see the comp.lang.c
FAQ.
Go to the first, previous, next, last section, table of contents.