The operations on arrays that were mentioned in the previous section (element-wise addition, multiplication, etc.) all share some features -- they all follow similar rules for broadcasting, coercion and "element-wise operation". Just like standard addition is available in Python through the add function in the operator module, array operations are available through callable objects as well. Thus, the following objects are available in the Numeric module:
All of these ufuncs can be used as functions. For example, to use add , which is a binary ufunc (i.e. it takes two arguments), one can do either of:
>>> a = arange(10)
>>> print add(a,a)
[ 0  2  4  6  8 10 12 14 16 18]
>>> print a + a
[ 0  2  4  6  8 10 12 14 16 18]
In other words, the + operator on arrays performs exactly the same thing as the add ufunc when operated on arrays. For a unary ufunc such as sin , one can do, e.g.:
>>> a = arange(10)
>>> print sin(a)
[ 0. 0.84147098 0.90929743 0.14112001 -0.7568025 -0.95892427
-0.2794155 0.6569866 0.98935825 0.41211849]
Unary ufuncs return arrays with the same shape as their arguments, in which each element is the result of applying the corresponding mathematical function to the matching input element (sin(0)=0, sin(1)=0.84147098, etc.).
There are three additional features of ufuncs which make them different from standard Python functions. They can operate on any Python sequence in addition to arrays; they can take an "output" argument; they have attributes which are themselves callable with arrays and sequences. Each of these will be described in turn.
Ufuncs have so far been described as callable objects which take either one or two arrays as arguments (depending on whether they are unary or binary). In fact, any Python sequence which can be the input to the array() constructor can be used. The return value from ufuncs is always an array. Thus:
In many computations with large sets of numbers, arrays are often used only once. For example, a computation on a large set of numbers could involve the following step:
dataset = dataset * 1.20
This operation as written needs to create a temporary array to store the results of the computation, and then eventually free the memory used by the original dataset array (provided there are no other references to the data it contains). It is more efficient, both in terms of memory and computation time, to do an "in-place" operation. This can be done by specifying an existing array as the place to store the result of the ufunc. In this example, one can write:
multiply(dataset, 1.20, dataset)
This is not a step to take lightly, however. For example, the "big and slow" version ( dataset = dataset * 1.20 ) and the "small and fast" version above will yield different results in two cases:
>>> a = arange(5, typecode=Float64)
array([ 4.8 , 3.6 , 2.4 , 4.32, 5.76])
This is because the ufunc does not know which arrays share which data, and in this case the overwriting of the data contents follows a different path through the shared data space of the two arrays, thus resulting in strangely distorted data.
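The distortion can be reproduced without Numeric. The sketch below is a plain-Python model (written for Python 3), not Numeric's actual implementation. It assumes that the ufunc fills the output from the first element to the last, and that the input was a reversed view of the output array, as in a call such as multiply(a[::-1], 1.2, a), which is consistent with the result shown above. The helper name multiply_reversed_in_place is hypothetical.

```python
def multiply_reversed_in_place(data, factor):
    """Model of writing factor * reversed(data) back into data itself.

    Hypothetical helper: element i of the output is computed from element
    n-1-i of the *same* storage, so later reads see earlier writes.
    """
    n = len(data)
    for i in range(n):
        data[i] = data[n - 1 - i] * factor
    return data


# Safe version: the reversed input is evaluated into a new list, so nothing
# is overwritten before it has been read.
original = [0.0, 1.0, 2.0, 3.0, 4.0]
safe = [x * 1.2 for x in reversed(original)]

# Aliased version: input and output share the same storage.
aliased = multiply_reversed_in_place([0.0, 1.0, 2.0, 3.0, 4.0], 1.2)

# Rounded for display; the last two entries differ from the safe result.
safe_rounded = [round(x, 2) for x in safe]        # [4.8, 3.6, 2.4, 1.2, 0.0]
aliased_rounded = [round(x, 2) for x in aliased]  # [4.8, 3.6, 2.4, 4.32, 5.76]
```

In the aliased run, the last two outputs are computed from elements that have already been multiplied once, which is why they come out as 3.6 * 1.2 and 4.8 * 1.2 instead of 1.2 and 0.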
If you don't know about the reduce function in Python, review section 5.1.1 of the Python Tutorial ( http://www.python.org/doc/tut/functional.html ). Briefly, reduce is most often used with two arguments: a callable object (such as a function) and a sequence. It calls the callable object with the first two elements of the sequence, then with the result of that operation and the third element, and so on, finally returning the "reduction" of the sequence to a single value by the specified callable object. Similarly, the reduce method of ufuncs is called with a sequence as an argument, and performs the reduction of that ufunc on the sequence. As an example, adding all of the elements in a rank-1 array can be done with:
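The folding behaviour of Python's own reduce, as described above, can be sketched with the standard library alone. This is an illustration of the general idea in Python 3 (where reduce lives in functools), not of the ufunc method itself, and the data values are made up for the example.

```python
from functools import reduce
import operator

values = [1, 2, 3, 4]  # illustrative data, not taken from the original text

# reduce calls operator.add on the first two elements, then on that result
# and the third element, and so on: ((1 + 2) + 3) + 4
total = reduce(operator.add, values)

# The same fold written out as an explicit loop.
running = values[0]
for item in values[1:]:
    running = operator.add(running, item)
```

Both total and running end up as 10, the sum of the four values.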
When applied to arrays which are of rank greater than one, the reduction proceeds by default along the first axis:
>>> b = array([[1,2,3,4],[6,7,8,9]])
A different axis of reduction can be specified with a second integer argument:
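What reducing along one axis or the other means can be sketched with nested lists. This is a plain-Python 3 model of the idea under the assumption that rows correspond to the first axis and columns to the second; it does not call Numeric.

```python
from functools import reduce
import operator

b = [[1, 2, 3, 4], [6, 7, 8, 9]]  # the same values as the array b above

# Reduction along the first axis: combine the rows with each other,
# column by column.
along_first = [reduce(operator.add, column) for column in zip(*b)]

# Reduction along the second axis: combine the elements within each row.
along_second = [reduce(operator.add, row) for row in b]
```

Here along_first is [7, 9, 11, 13] (one entry per column) and along_second is [10, 30] (one entry per row).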
The accumulate ufunc method is similar to reduce , except that it returns an array containing the intermediate results of the reduction:
[ 0 1 3 6 10 15 21 28 36 45] # 0, 0+1, 0+1+2, 0+1+2+3, ... 0+...+9
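The relationship between accumulating and reducing can be sketched with the Python 3 standard library. itertools.accumulate is used here as a stand-in for the ufunc method; it is an analogy, not Numeric's implementation.

```python
from functools import reduce
from itertools import accumulate
import operator

# Running totals of 0..9, one entry per intermediate result.
running_sums = list(accumulate(range(10), operator.add))

# The full reduction is simply the last intermediate result.
total = reduce(operator.add, range(10))
```

running_sums reproduces the ten values shown above, and its final entry equals total.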
Table 1 lists all the ufuncs. We will first discuss the mathematical ufuncs, which perform operations very similar to the functions in the math and cmath modules, albeit elementwise, on arrays. These come in two forms, unary and binary:
The following ufuncs apply the predictable functions on their single array arguments, one element at a time: arccos , arccosh , arcsin , arcsinh , arctan , arctanh , cos , cosh , exp , log , log10 , sin , sinh , sqrt , tan , tanh .
[ 1. 0.54030231 -0.41614684 -0.9899925 -0.65364362]
# not a bug, but wraparound: 2*pi%4 is 2.28318531
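The wraparound noted in the comment above can be checked with the standard math module. The sketch below (Python 3) applies the scalar functions one element at a time, under the assumption that the inputs are the integers 0 to 4; it models the element-wise behaviour and does not use Numeric.

```python
import math

xs = [0, 1, 2, 3, 4]

# Apply cos, then arccos, to each element in turn.
round_trip = [math.acos(math.cos(x)) for x in xs]

# acos only returns values between 0 and pi, so an input of 4 (which is
# greater than pi) comes back as 2*pi - 4 rather than as 4.
wrapped = 2 * math.pi - 4
```

The first four values survive the round trip (up to floating-point rounding), while the last one comes back as wrapped, roughly 2.28318531.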
The conjugate ufunc takes an array of complex numbers and returns an array whose entries are the complex conjugates of the corresponding entries in the input array. If it is called with real numbers, an unchanged copy of the array is returned.
These ufuncs take two arrays as arguments, and perform the specified mathematical operation on them, one pair of elements at a time: add , subtract , multiply , divide , remainder , power .
The "logical" ufuncs also perform their operations on arrays in elementwise fashion, just like the "mathematical" ones.
Two are special ( maximum and minimum ) in that they return arrays with entries taken from their input arrays:
The others all return arrays of 0's or 1's: logical_and , logical_or , logical_xor , logical_not , bitwise_and , bitwise_or , bitwise_xor , bitwise_not .
These are fairly self-explanatory, especially with the associated symbols from the standard Python version of the same operations in Table 1 above. The logical_* ufuncs perform their operations (and, or, etc.) using the truth value of the elements in the array (equality to 0 for numbers and the standard truth test for PyObject arrays). The bitwise_* ufuncs, on the other hand, can be used only with integer arrays (of any word size), and will return integer arrays of the larger bit size of the two input arrays:
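The difference between the two families can be sketched element by element in plain Python 3. The lists below are made-up integer data, and the list comprehensions only model what logical_and and bitwise_and are described as doing; they are not Numeric calls.

```python
x = [0, 1, 2, 6]  # illustrative integer data
y = [0, 3, 0, 3]

# Logical AND: only the truth value of each element matters, so every
# result is either 0 or 1.
logical = [int(bool(p) and bool(q)) for p, q in zip(x, y)]

# Bitwise AND: the bits of each pair of integers are combined, so results
# other than 0 and 1 are possible (6 & 3 is 0b110 & 0b011, which is 2).
bitwise = [p & q for p, q in zip(x, y)]
```

With these inputs, logical is [0, 1, 0, 1] and bitwise is [0, 1, 0, 2]; the two differ only in the last position.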
We've already discussed how to find out about the contents of arrays based on the indices in the arrays - that's what the various slice mechanisms are for. Often, especially when dealing with the result of computations or data analysis, one needs to "pick out" parts of matrices based on the content of those matrices. For example, it might be useful to find out which elements of an array are negative, and which are positive. The comparison ufuncs are designed for just this type of operation. Assume an array with various positive and negative numbers in it (for the sake of the example we'll generate it from scratch):
[[ 0. 0.84147098 0.90929743 0.14112001 -0.7568025 ]
[-0.95892427 -0.2794155 0.6569866 0.98935825 0.41211849]
[-0.54402111 -0.99999021 -0.53657292 0.42016704 0.99060736]
[ 0.65028784 -0.28790332 -0.96139749 -0.75098725 0.14987721]
[ 0.91294525 0.83665564 -0.00885131 -0.8462204 -0.90557836]]
>>> view(greater(greeceBW, .3))
# shows a binary image with white where the pixel value was greater than .3
The comparison functions equal , not_equal , greater , greater_equal , less , and less_equal are invoked by the operators ==, !=, >, >=, <, and <= respectively, but they can also be called directly as functions. Continuing with the preceding example,
This last example has 1's where the corresponding elements are less than or equal to 0, and 0's everywhere else.
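A 0/1 mask of this kind can be sketched in plain Python 3 with nested lists. The sketch assumes that the 5x5 array printed earlier holds the sines of the integers 0 to 24 laid out row by row, which matches the values shown; the comprehension models less_equal and does not call Numeric.

```python
import math

# Rebuild the 5x5 table of sines, row by row (assumed layout).
a = [[math.sin(5 * row + col) for col in range(5)] for row in range(5)]

# 1 where the element is less than or equal to 0, and 0 everywhere else.
mask = [[int(value <= 0) for value in row] for row in a]
```

The first row of mask is [1, 0, 0, 0, 1]: sin(0) is exactly 0 and sin(4) is negative, while the three values between them are positive.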
When an array a is compared with an object b, the result of the element-wise comparison is returned if b can be converted to an array; otherwise, zero is returned. In particular, one can compare arrays of type object. This means that comparing a list and comparing an array can return quite different answers. Since the functional forms such as equal will try to make arrays from their arguments, using equal can give a different result than using ==.
Numeric defines a few functions which correspond to common uses of ufuncs: for example, add.reduce() is synonymous with the sum() utility function:
>>> a = arange(5) # [0 1 2 3 4]
>>> print sum(a) # 0 + 1 + 2 + 3 + 4
10
Similarly, cumsum (for "cumulative sum") is equivalent to add.accumulate , product to multiply.reduce , and cumproduct to multiply.accumulate .
Additional "utility" functions which are often useful are alltrue and sometrue , which are defined as logical_and.reduce and logical_or.reduce respectively:
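How these two reductions behave can be sketched with functools.reduce in Python 3. The function names alltrue_sketch and sometrue_sketch are hypothetical stand-ins, written only to model the definitions given above for rank-1 data; they are not Numeric's implementations.

```python
from functools import reduce


def alltrue_sketch(seq):
    """Model of logical_and.reduce: 1 if every element is true, else 0."""
    return reduce(lambda p, q: int(bool(p) and bool(q)), seq, 1)


def sometrue_sketch(seq):
    """Model of logical_or.reduce: 1 if any element is true, else 0."""
    return reduce(lambda p, q: int(bool(p) or bool(q)), seq, 0)


all_nonzero = alltrue_sketch([1, 2, 3])   # every element is non-zero
has_zero = alltrue_sketch([1, 0, 3])      # one element is zero
any_nonzero = sometrue_sketch([0, 5, 0])  # one element is non-zero
none_nonzero = sometrue_sketch([0, 0, 0]) # no element is non-zero
```

The four results are 1, 0, 1 and 0 respectively. The starting values 1 and 0 passed to reduce are a choice made for this sketch so that the result is always 0 or 1, even for a one-element sequence.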