8. Array Functions

Most of the useful manipulations on arrays are done with functions. This might be surprising given Python's object-oriented framework, and that many of these functions could have been implemented using methods instead. Choosing functions means that the same procedures can be applied to arbitrary Python sequences, not just to arrays. For example, while transpose([[1,2],[3,4]]) works just fine, [[1,2],[3,4]].transpose() can't work. This approach also allows uniformity in interface between functions defined in the Numeric Python system, whether implemented in C or in Python, and functions defined in extension modules. The use of array methods is limited to functionality which depends critically on the implementation details of array objects. Array methods are discussed in the next chapter.

We've already covered two functions which operate on arrays, reshape and resize.

take(a, indices, axis=0)

take is in some ways like the slice operations. It selects the elements of the array it gets as first argument based on the indices it gets as a second argument. Unlike slicing, however, the array returned by take has the same rank as the input array. This is again much easier to understand with an illustration:

 

>>> print a

[[ 0 1 2 3 4]

[ 5 6 7 8 9]

[10 11 12 13 14]

[15 16 17 18 19]]

>>> print take(a, (0,)) # first row

[[0 1 2 3 4]]

>>> print take(a, (0,1)) # first and second row

[[0 1 2 3 4]

[5 6 7 8 9]]

>>> print take(a, (0,-1)) # first and last row

[[ 0 1 2 3 4]

[15 16 17 18 19]]

The optional third argument specifies the axis along which the selection occurs, and the default value (as in the examples above) is 0, the first axis. If you want another axis, then you can specify it:

>>> print take(a, (0,), 1) # first column

[[ 0]

[ 5]

[10]

[15]]

>>> print take(a, (0,1), 1) # first and second column

[[ 0 1]

[ 5 6]

[10 11]

[15 16]]

>>> print take(a, (0,-1), 1) # first and last column

[[ 0 4]

[ 5 9]

[10 14]

[15 19]]

This is considered to be a ``structural'' operation, because its result does not depend on the content of the arrays or the result of a computation on those contents but uniquely on the structure of the array. Like all such structural operations, the default axis is 0 (the first axis). I mention it here because later in this tutorial, we will see functions which have a default axis of -1.
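The row and column selections shown above can be imitated with ordinary nested lists. The following is only a sketch in plain Python 3, not the Numeric implementation; the helper name take_2d is hypothetical and handles just rank-2 input.

```python
def take_2d(rows, indices, axis=0):
    """Sketch of take() on a list of equal-length rows (a rank-2 'array')."""
    if axis == 0:
        # Select whole rows; the result is still a list of rows (rank 2).
        return [list(rows[i]) for i in indices]
    if axis == 1:
        # Select the requested columns within every row.
        return [[row[j] for j in indices] for row in rows]
    raise ValueError("this sketch only handles axis 0 or 1")


a = [list(range(start, start + 5)) for start in (0, 5, 10, 15)]
print(take_2d(a, (0, -1)))      # first and last row
print(take_2d(a, (0, -1), 1))   # first and last column
```

Because only the index pattern matters, the same helper works whatever values the rows hold, which is what makes the operation structural.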

take is often used to create multidimensional arrays with the indices from a rank-1 array. As in the earlier examples, the shape of the array returned by take() is a combination of the shape of its second argument (the indices) and the shape of the array that elements are "taken" from -- when that array is rank-1, the returned array has the same shape as the index sequence. This, as with many other facets of Numeric, is best understood by experiment.

>>> x = arange(10) * 100

>>> print x

[ 0 100 200 300 400 500 600 700 800 900]

>>> print take(x, [[2,4],[1,2]])

[[200 400]

[100 200]]
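The shape rule for a rank-1 source can be sketched in plain Python 3 with nested lists. The helper take_1d below is hypothetical (it is not a Numeric function) and only illustrates that the result mirrors the nesting of the index sequence.

```python
def take_1d(source, indices):
    """Sketch of take() for a rank-1 source: the result is shaped like `indices`."""
    if isinstance(indices, (list, tuple)):
        # Recurse so that nested index sequences give nested results.
        return [take_1d(source, i) for i in indices]
    # A plain integer index; negative values count from the end.
    return source[indices]


x = [i * 100 for i in range(10)]
print(take_1d(x, [[2, 4], [1, 2]]))   # [[200, 400], [100, 200]]
```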

A typical example of using take() is to replace the grey values in an image according to a "translation table". For example, let's consider a brightening of a greyscale image. The view() function defined in the NumTut package automatically scales the input arrays to use the entire range of grey values, except if the input arrays are of typecode 'b' (unsigned bytes). To test this brightening function, we therefore start by converting the greyscale floating point array to a greyscale byte array:

>>> BW = (greeceBW*256).astype('b')

>>> view(BW) # shows black and white picture

We then create a table mapping the integers 0-255 to integers 0-255 using a "compressive nonlinearity":

>>> table = (255- arange(256)**2 / 256).astype('b')

>>> view(table) # shows the conversion curve

To do the "taking" into an array of the right kind, we first create a blank image array with the same shape and typecode as the original array:

>>> BW2 = zeros(BW.shape, BW.typecode())

and then perform the take() operation

>>> BW2.flat[:] = take(table, BW.flat)

>>> view(BW2)

put (a, indices, values)

put is the opposite of take. The values of the array a at the locations specified in indices are set to the corresponding value of values. The array a must be a contiguous array. The argument indices can be any integer sequence object with values suitable for indexing into the flat form of a. The argument values must be any sequence of values that can be converted to the typecode of a.

>>> x = arange(6)

>>> put(x, [2,4], [20,40])

>>> print x

[ 0 1 20 3 40 5]

Note that the target array a is not required to be one-dimensional. Since a is contiguous and stored in row-major order, the array indices can be treated as indexing a's elements in storage order.

The routine put is thus equivalent to the following (although the loop is in C for speed):

ind = array(indices, copy=0)

v = array(values, copy=0).astype(a.typecode())

for i in range(len(ind)): a.flat[ind[i]] = v[i]

putmask (a, mask, values)

putmask sets those elements of a for which mask is true to the corresponding value in values. The array a must be contiguous. The argument mask must be an integer sequence of the same size (but not necessarily the same shape) as a. The argument values will be repeated as necessary; in particular it can be a scalar. The array values must be convertible to the type of a.

>>> x=arange(5)
>>> putmask(x, [1,0,1,0,1], [10,20,30,40,50])
>>> print x
[10 1 30 3 50]
>>> putmask(x, [1,0,1,0,1], [-1,-2])
>>> print x
[-1 1 -1 3 -1]

Note how in the last example, the third argument was treated as if it was [-1, -2, -1, -2, -1].
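The repetition of values can be sketched in plain Python 3 on a flat list. This is an illustration rather than the Numeric code; the name putmask_flat is hypothetical, and the sketch assumes mask has the same length as the target.

```python
from itertools import cycle


def putmask_flat(target, mask, values):
    """Sketch of putmask() on a flat list, modifying `target` in place."""
    if not isinstance(values, (list, tuple)):
        values = [values]   # a scalar is simply repeated everywhere
    # cycle() repeats `values` as often as needed to cover the whole target.
    for i, (flag, value) in enumerate(zip(mask, cycle(values))):
        if flag:
            target[i] = value


x = list(range(5))
putmask_flat(x, [1, 0, 1, 0, 1], [-1, -2])
print(x)   # [-1, 1, -1, 3, -1]
```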

transpose(a, axes=None)

transpose takes an array and returns a new array which corresponds to a with the order of axes specified by the second argument. The default corresponds to reversing the order of all the axes, so the shape of the result is a.shape[::-1] if a is the input array.

>>> print a

[[ 0 1 2 3 4]

[ 5 6 7 8 9]

[10 11 12 13 14]

[15 16 17 18 19]]

>>> print transpose(a)

[[ 0 5 10 15]

[ 1 6 11 16]

[ 2 7 12 17]

[ 3 8 13 18]

[ 4 9 14 19]]

>>> greece.shape # it's a 355x242 RGB picture
(355, 242, 3)

>>> view(greece)

# picture of greek street is shown

>>> view(transpose(greece, (1,0,2))) # swap x and y, not color axis!

# picture of greek street is shown sideways
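How the axes argument rearranges a shape can be sketched in plain Python 3. Both helpers below are hypothetical names introduced for illustration; transposed_shape only computes the resulting shape, and transpose_2d handles the rank-2 case on nested lists.

```python
def transposed_shape(shape, axes=None):
    """Shape of the result of transposing an array with the given shape."""
    if axes is None:
        axes = tuple(reversed(range(len(shape))))   # default: reverse all axes
    return tuple(shape[axis] for axis in axes)


def transpose_2d(rows):
    """Transpose a list of equal-length rows."""
    return [list(column) for column in zip(*rows)]


print(transposed_shape((355, 242, 3)))              # (3, 242, 355)
print(transposed_shape((355, 242, 3), (1, 0, 2)))   # (242, 355, 3)
print(transpose_2d([[0, 1, 2], [3, 4, 5]]))         # [[0, 3], [1, 4], [2, 5]]
```

The second call shows why (1,0,2) is used for the picture: the two spatial axes trade places while the color axis stays last.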

repeat(a, repeats, axis=0)

repeat takes an array and returns an array with each element in the input array repeated as often as indicated by the corresponding elements in the second array. It operates along the specified axis. So, to stretch an array evenly, one needs the repeats array to contain as many instances of the integer scaling factor as the size of the specified axis:

>>> view(repeat(greece, 2*ones(greece.shape[0]))) # double in X

>>> view(repeat(greece, 2*ones(greece.shape[1]), 1)) # double in Y

choose(a, (b0, ..., bn))

a is an array of integers between 0 and n. The resulting array will have the same shape as a, with each element selected from b0,...,bn as indicated by the value of the corresponding element in a.

Assume a is an array that you want to ``clip'' so that no values are greater than 100.0.

>>> choose(greater(a, 100.0), (a, 100.0))

Everywhere that greater(a, 100.0) is false (i.e. 0) this will ``choose'' the corresponding value in a. Everywhere else it will ``choose'' 100.0.

This works as well with arrays. Try to figure out what the following does:

>>> ret = choose(greater(a,b), (c,d))
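The selection rule behind the clipping example can be sketched in plain Python 3 for flat sequences. The name choose_flat is hypothetical, and the list comprehension standing in for greater(a, 100.0) is an assumption made so the sketch runs without Numeric.

```python
def choose_flat(selector, choices):
    """Sketch of choose() for flat sequences; scalars in `choices` are reused."""
    result = []
    for position, which in enumerate(selector):
        option = choices[which]
        # A sequence contributes its element at this position, a scalar itself.
        if isinstance(option, (list, tuple)):
            result.append(option[position])
        else:
            result.append(option)
    return result


a = [50.0, 150.0, 99.0, 100.5]
mask = [int(value > 100.0) for value in a]   # plays the role of greater(a, 100.0)
print(choose_flat(mask, (a, 100.0)))         # [50.0, 100.0, 99.0, 100.0]
```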

ravel(a)

returns the argument array a as a 1d array. It is equivalent to reshape(a, (-1,)) or a.flat. Unlike a.flat, however, ravel works with non-contiguous arrays.

>>> print x

[[ 0 1 2 3]

[ 5 6 7 8]

[10 11 12 13]]

>>> x.iscontiguous()

0

>>> x.flat

Traceback (innermost last):

File "<stdin>", line 1, in ?

ValueError: flattened indexing only available for contiguous array

>>> ravel(x)

array([ 0, 1, 2, 3, 5, 6, 7, 8, 10, 11, 12, 13])

nonzero(a)

nonzero() returns an array containing the indices of the elements in a that are nonzero. These indices only make sense for 1d arrays, so the function refuses to act on anything else. As of 1.0a5 this function does not work for complex arrays.

where(condition, x, y)

where(condition,x,y) returns an array shaped like condition, whose elements are taken from x where condition is true and from y where it is false.

compress(condition, a, axis=0)

returns those elements of a corresponding to those elements of condition that are nonzero. condition must be the same size as the given axis of a.

>>> print x

[0 1 2 3]

>>> print greater(x, 2)

[0 0 0 1]

>>> print compress(greater(x, 2), x)

[3]
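For rank-1 input, nonzero and compress can be sketched in plain Python 3 as follows. The helper names are hypothetical, and the comprehension used for the mask stands in for greater(x, 2).

```python
def nonzero_1d(values):
    """Indices of the nonzero entries of a flat sequence."""
    return [i for i, value in enumerate(values) if value != 0]


def compress_1d(condition, values):
    """Elements of `values` whose matching entry in `condition` is nonzero."""
    if len(condition) != len(values):
        raise ValueError("condition and values must have the same length")
    return [value for flag, value in zip(condition, values) if flag]


x = [0, 1, 2, 3]
mask = [int(value > 2) for value in x]   # plays the role of greater(x, 2)
print(mask)                   # [0, 0, 0, 1]
print(nonzero_1d(mask))       # [3]
print(compress_1d(mask, x))   # [3]
```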

diagonal(a, k=0, axis1=0, axis2=1)

returns the entries along the kth diagonal of a (k is an offset from the main diagonal). This is designed for 2d arrays. For larger arrays, it will return the diagonal of each 2d sub-array.

>>> print x

[[ 0 1 2 3 4]

[ 5 6 7 8 9]

[10 11 12 13 14]

[15 16 17 18 19]

[20 21 22 23 24]]

>>> print diagonal(x)

[ 0 6 12 18 24]

>>> print diagonal(x, 1)

[ 1 7 13 19]

>>> print diagonal(x, -1)

[ 5 11 17 23]

trace(a, k=0)

returns the sum of the elements in a along the kth diagonal.

>>> print x

[[ 0 1 2 3 4]

[ 5 6 7 8 9]

[10 11 12 13 14]

[15 16 17 18 19]

[20 21 22 23 24]]

>>> print trace(x) # 0 + 6 + 12 + 18 + 24

60

>>> print trace(x, -1) # 5 + 11 + 17 + 23

56

>>> print trace(x, 1) # 1 + 7 + 13 + 19

40
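The offset k used by diagonal and trace can be sketched in plain Python 3 for the rank-2 case. The names diagonal_2d and trace_2d are hypothetical; the sketch treats a positive k as a shift to the right of the main diagonal, which matches the outputs shown above.

```python
def diagonal_2d(rows, k=0):
    """Entries rows[i][i + k] that fall inside a list of equal-length rows."""
    n_cols = len(rows[0])
    return [row[i + k] for i, row in enumerate(rows) if 0 <= i + k < n_cols]


def trace_2d(rows, k=0):
    """Sum of the entries on the kth diagonal."""
    return sum(diagonal_2d(rows, k))


x = [list(range(start, start + 5)) for start in range(0, 25, 5)]
print(diagonal_2d(x, 1))    # [1, 7, 13, 19]
print(diagonal_2d(x, -1))   # [5, 11, 17, 23]
print(trace_2d(x), trace_2d(x, -1), trace_2d(x, 1))   # 60 56 40
```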

searchsorted(a, values)

Called with a rank-1 array sorted in ascending order, searchsorted() will return the indices of the positions in a where the corresponding values would fit.

>>> print bin_boundaries

[ 0. 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1. ]

>>> print data

[ 0.3029573 0.79585496 0.82714031 0.77993884 0.55069605 0.76043182

0.28511823 0.29987358 0.40286206 0.68617903]

>>> print searchsorted(bin_boundaries, data)

[4 8 9 8 6 8 3 3 5 7]

This can be used for example to write a simple histogramming function:

>>> def histogram(a, bins):

...     n = searchsorted(sort(a), bins)

...     n = concatenate([n, [len(a)]])

...     return n[1:]-n[:-1]

...

>>> print histogram([0,0,0,0,0,0,0,.33,.33,.33], arange(0,1.0,.1))

[7 0 0 3 0 0 0 0 0 0]

>>> print histogram(sin(arange(0,10,.2)), arange(-1.2, 1.2, .1))

[0 0 4 2 2 2 0 2 1 2 1 3 1 3 1 3 2 3 2 3 4 9 0 0]
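The same histogram idea can be sketched with the standard library's bisect module in plain Python 3. This assumes that searchsorted resolves ties the way bisect_left does, which is consistent with the outputs shown above; it is not a statement about how Numeric implements it.

```python
from bisect import bisect_left


def histogram(data, bins):
    """Count how many values fall at or above each bin edge and below the next."""
    ordered = sorted(data)
    # For each lower bin edge, count the values lying strictly below it.
    edges = [bisect_left(ordered, edge) for edge in bins]
    edges.append(len(ordered))
    # Neighbouring differences are the counts per bin.
    return [high - low for low, high in zip(edges, edges[1:])]


bins = [i / 10 for i in range(10)]
print(histogram([0] * 7 + [0.33] * 3, bins))   # [7, 0, 0, 3, 0, 0, 0, 0, 0, 0]
```

Values above the last edge all land in the final bin, because the total length is appended as the closing boundary.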

sort(a, axis=-1)

This function returns an array containing a copy of the data in a, with the same shape as a, but with the elements sorted along the specified axis. Thus, sort(a, 3) will be an array of the same shape as a, where the elements of a have been sorted along the fourth axis.

>>> print data

[[5 0 1 9 8]

[2 5 8 3 2]

[8 0 3 7 0]

[9 6 9 5 0]

[9 0 9 7 7]]

>>> print sort(data) # Axis -1 by default

[[0 1 5 8 9]

[2 2 3 5 8]

[0 0 3 7 8]

[0 5 6 9 9]

[0 7 7 9 9]]

>>> print sort(data, 0)

[[2 0 1 3 0]

[5 0 3 5 0]

[8 0 8 7 2]

[9 5 9 7 7]

[9 6 9 9 8]]

argsort(a, axis=-1)

argsort will return the indices of the elements of a needed to produce sort(a). In other words, for a rank-1 array, take(a, argsort(a)) == sort(a).

>>> print data

[5 0 1 9 8]

>>> print sort(data)

[0 1 5 8 9]

>>> print argsort(data)

[1 2 0 4 3]

>>> print take(data, argsort(data))

[0 1 5 8 9]
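The relation between argsort and sort can be sketched in plain Python 3 for a flat sequence. The name argsort_1d is hypothetical.

```python
def argsort_1d(values):
    """Indices that would put `values` in ascending order."""
    return sorted(range(len(values)), key=lambda i: values[i])


data = [5, 0, 1, 9, 8]
order = argsort_1d(data)
print(order)                      # [1, 2, 0, 4, 3]
print([data[i] for i in order])   # [0, 1, 5, 8, 9]
```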

argmax(a, axis=-1), argmin(a, axis=-1)

The argmax() function returns an array with the indices of the maximum values of its input array a along the given axis. The returned array will have one less dimension than a. argmin() is just like argmax(), except that it returns the indices of the minima along the given axis.

>>> print data

[[9 6 1 3 0]

[0 0 8 9 1]

[7 4 5 4 0]

[5 2 7 7 1]

[9 9 7 9 7]]

>>> print argmax(data)

[0 3 0 2 0]

>>> print argmax(data, 0)

[0 4 1 1 4]

>>> print argmin(data)

[4 0 4 4 2]

>>> print argmin(data, 0)

[1 1 0 0 0]
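The effect of the axis argument can be sketched in plain Python 3 by applying a rank-1 helper either to each row or to each column. The helper names are hypothetical, and the sketch assumes that ties go to the first occurrence, which agrees with the outputs shown above.

```python
def argmax_1d(values):
    """Index of the first largest entry."""
    return max(range(len(values)), key=lambda i: values[i])


def argmin_1d(values):
    """Index of the first smallest entry."""
    return min(range(len(values)), key=lambda i: values[i])


data = [[9, 6, 1, 3, 0],
        [0, 0, 8, 9, 1],
        [7, 4, 5, 4, 0],
        [5, 2, 7, 7, 1],
        [9, 9, 7, 9, 7]]

print([argmax_1d(row) for row in data])            # last axis: one index per row
print([argmax_1d(col) for col in zip(*data)])      # axis 0: one index per column
print([argmin_1d(row) for row in data])
print([argmin_1d(col) for col in zip(*data)])
```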

fromstring(string, typecode)

Returns the array formed by the binary data given in string, interpreted with the specified typecode. This is mainly used for moving binary data to and from files; it can also be used to exchange binary data with other modules that use Python strings as storage (e.g. PIL). Note that this representation depends on the byte order. If the data was written with the other byte order, use the byteswapped() method, described with the other array methods.

dot(m1, m2)

The dot() function returns the dot product of m1 and m2 . This is equivalent to matrix multiply for rank-2 arrays (without the transpose). Somebody who does more linear algebra really needs to do this function right some day!

matrixmultiply(m1, m2)

The matrixmultiply function multiplies matrices, or matrices and vectors, as matrices rather than elementwise. Compare:

>>> print a

[[0 1 2]

[3 4 5]]

>>> print b

[1 2 3]

>>> print a*b

[[ 0 2 6]

[ 3 8 15]]

>>> print matrixmultiply(a,b)

[ 8 26]
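The difference between the two results can be sketched in plain Python 3 with nested lists. The name matvec is hypothetical and covers only the matrix-times-vector case of the comparison above.

```python
def matvec(matrix, vector):
    """Matrix-vector product: one dot product per row of `matrix`."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


a = [[0, 1, 2], [3, 4, 5]]
b = [1, 2, 3]

# Elementwise product: b is multiplied into every row, entry by entry.
elementwise = [[m * v for m, v in zip(row, b)] for row in a]
print(elementwise)    # [[0, 2, 6], [3, 8, 15]]

# Matrix product: the entries of each elementwise row are summed.
print(matvec(a, b))   # [8, 26]
```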

clip(m, m_min, m_max)

The clip function creates an array with the same shape and typecode as m, but where every entry in m that is less than m_min is replaced by m_min, and every entry greater than m_max is replaced by m_max. Entries within the range [m_min, m_max] are left unchanged.

>>> a = arange(9, Float)

>>> clip(a, 1.5, 7.5)

1.5000 1.5000 2.0000 3.0000 4.0000 5.0000 6.0000 7.0000 7.5000

indices(shape, typecode=None)

The indices function returns an array corresponding to the shape given. The array returned is an array of a new shape which is based on the specified shape, but has an added dimension of length the number of dimensions in the specified shape. For example, if the shape specified by the shape argument is (3,4), then the shape of the array returned will be (2,3,4) since the length of (3,4) is 2. The contents of the returned arrays are such that the ith subarray (along index 0, the first dimension) contains the indices for that axis of the elements in the array. An example makes things clearer:

>>> i = indices((4,3))

>>> i.shape

(2, 4, 3)

>>> print i[0]

[[0 0 0]

[1 1 1]

[2 2 2]

[3 3 3]]

>>> print i[1]

[[0 1 2]

[0 1 2]

[0 1 2]

[0 1 2]]

So, i[0] has an array of the specified shape, and each element in that array specifies the index of that position in the subarray for axis 0. Similarly, each element in the subarray in i[1] contains the index of that position in the subarray for axis 1.
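For a two-dimensional shape, the construction can be sketched in plain Python 3 with nested lists. The name indices_2d is hypothetical and the sketch does not generalize to other ranks.

```python
def indices_2d(shape):
    """Sketch of indices() for a 2d shape: returns [row_indices, column_indices]."""
    n_rows, n_cols = shape
    # Subarray 0: every position holds its own row number.
    row_index = [[r for _ in range(n_cols)] for r in range(n_rows)]
    # Subarray 1: every position holds its own column number.
    col_index = [[c for c in range(n_cols)] for _ in range(n_rows)]
    return [row_index, col_index]


i = indices_2d((4, 3))
print(i[0])   # [[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3]]
print(i[1])   # [[0, 1, 2], [0, 1, 2], [0, 1, 2], [0, 1, 2]]
```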

swapaxes(a, axis1, axis2)

Returns a new array which shares the data of a, but which has the two axes specified by axis1 and axis2 swapped. If a is of rank 0 or 1, swapaxes simply returns a new reference to a.

>>> x = arange(10)

>>> x.shape = (5,2,1)

>>> print x

[[[0]

[1]]

[[2]

[3]]

[[4]

[5]]

[[6]

[7]]

[[8]

[9]]]

>>> y = swapaxes(x, 0, 2)

>>> print y.shape

(1, 2, 5)

>>> print y

[[[0 2 4 6 8]

[1 3 5 7 9]]]

concatenate((a0, a1, ... , an), axis=0)

Returns a new array containing copies of the data contained in all arrays a0 ... an. The arrays ai will be concatenated along the specified axis (0 by default). All arrays ai must have the same shape along every axis except for the one given. To concatenate arrays along a newly created axis, you can use array((a0, ..., an)) as long as all arrays have the same shape.

>>> print x

[[ 0 1 2 3]

[ 5 6 7 8]

[10 11 12 13]]

>>> print concatenate((x,x))

[[ 0 1 2 3]

[ 5 6 7 8]

[10 11 12 13]

[ 0 1 2 3]

[ 5 6 7 8]

[10 11 12 13]]

>>> print concatenate((x,x), 1)

[[ 0 1 2 3 0 1 2 3]

[ 5 6 7 8 5 6 7 8]

[10 11 12 13 10 11 12 13]]

>>> print array((x,x) )

[[[ 0 1 2 3]

[ 5 6 7 8]

[10 11 12 13]]

[[ 0 1 2 3]

[ 5 6 7 8]

[10 11 12 13]]]

innerproduct(a, b)

innerproduct produces the inner product of arrays a and b. It is equivalent to matrixmultiply(a, transpose(b)).

outerproduct(a,b)

outerproduct(a,b) produces the outer product of vectors a and b, that is result[i, j] = a[i] * b[j]
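For vectors, the two products can be sketched in plain Python 3 as follows. The helper names are hypothetical, and the inner product shown covers only the rank-1 case.

```python
def innerproduct_1d(a, b):
    """Sum of pairwise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def outerproduct_1d(a, b):
    """result[i][j] = a[i] * b[j]."""
    return [[x * y for y in b] for x in a]


print(innerproduct_1d([1, 2, 3], [4, 5, 6]))   # 32
print(outerproduct_1d([1, 2, 3], [10, 20]))    # [[10, 20], [20, 40], [30, 60]]
```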

array_repr()

See section on Textual Representations of arrays.

array_str()

See section on Textual Representations of arrays.

resize(a, new_shape)

The resize function takes an array and a shape, and returns a new array with the specified shape, and filled with the data in the input array. Unlike the reshape function, the new shape does not have to yield the same size as the original array. If the new size is less than that of the input array, the returned array contains the appropriate data from the "beginning" of the old array. If the new size is greater than that of the input array, the data in the input array is repeated as many times as needed to fill the new array.

>>> x = arange(10)

>>> y = resize(x, (4,2)) # note that 4*2 < 10

>>> print x

[0 1 2 3 4 5 6 7 8 9]

>>> print y

[[0 1]

[2 3]

[4 5]

[6 7]]

>>> print resize(array((0,1)), (5,5)) # note that 5*5 > 2

[[0 1 0 1 0]

[1 0 1 0 1]

[0 1 0 1 0]

[1 0 1 0 1]

[0 1 0 1 0]]
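Both the truncating and the repeating behaviour can be sketched in plain Python 3 with itertools. The name resize_2d is hypothetical, and the sketch only builds rank-2 results from a flat input.

```python
from itertools import cycle, islice


def resize_2d(values, shape):
    """Fill a rows-by-columns nested list from `values`, repeating or truncating."""
    n_rows, n_cols = shape
    # cycle() restarts the input when it runs out; islice() stops at the new size.
    flat = list(islice(cycle(values), n_rows * n_cols))
    return [flat[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]


print(resize_2d(range(10), (4, 2)))   # [[0, 1], [2, 3], [4, 5], [6, 7]]
print(resize_2d([0, 1], (2, 3)))      # [[0, 1, 0], [1, 0, 1]]
```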

diagonal(a, offset=0, axis1=0, axis2=1)

The diagonal function takes an array a, and returns an array of rank 1 containing all of the elements of a such that the difference between their indices along the specified axes is equal to the specified offset. With the default values, this corresponds to all of the elements of the diagonal of a along the first two axes.

repeat (a, counts, axis=0)

The repeat function uses repeated copies of the elements of a to create a result. The axis argument refers to the axis of a which will be replicated. The counts argument tells how many copies of each element to make. The length of counts must be shape(a)[axis].

In one dimension this is straightforward:

>>> y

array([0, 1, 2, 3, 4, 5])

>>> repeat(y, (1,2,0,2,2,3))

array([0, 1, 1, 3, 3, 4, 4, 5, 5, 5])

 

In more than one dimension it sometimes gets harder to understand. Consider for example this array x whose shape is (2,3).

>>> x

array([[0, 1, 2],

[3, 4, 5]])

 

>>> repeat(x, (2,6))

array([[0, 1, 2],

[0, 1, 2],

[3, 4, 5],

[3, 4, 5],

[3, 4, 5],

[3, 4, 5],

[3, 4, 5],

[3, 4, 5]])

 

>>> repeat(x, (6,3,2), 1)

array([[0, 0, 0, 0, 0, 0, 1, 1, 1, 2, 2],

[3, 3, 3, 3, 3, 3, 4, 4, 4, 5, 5]])
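The counting rule can be sketched in plain Python 3. The name repeat_along_first is hypothetical; it repeats the items of a sequence along its first axis only, so applied to a list of rows it repeats whole rows.

```python
def repeat_along_first(items, counts):
    """Repeat items[i] counts[i] times; one count is needed per item."""
    if len(items) != len(counts):
        raise ValueError("counts must have one entry per item")
    result = []
    for item, count in zip(items, counts):
        result.extend([item] * count)   # a count of 0 drops the item
    return result


print(repeat_along_first([0, 1, 2, 3, 4, 5], (1, 2, 0, 2, 2, 3)))
# [0, 1, 1, 3, 3, 4, 4, 5, 5, 5]
print(repeat_along_first([[0, 1, 2], [3, 4, 5]], (2, 1)))
# [[0, 1, 2], [0, 1, 2], [3, 4, 5]]
```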

convolve (a, v, mode=2)

The convolve function returns the linear convolution of two rank 1 arrays. The output is a rank 1 array whose length depends on the value of mode, which is 2 by default. Linear convolution can be used to find the response of a linear system to an arbitrary input. If the input arrays correspond to the coefficients of a polynomial and mode=2, the output of linear convolution corresponds to the coefficients of the product of the polynomials.

The mode parameter requires a bit of explanation. True linear convolution is only defined over infinite sequences. As both input arrays must represent finite sequences, the convolve operation assumes that the infinite sequences represented by the finite inputs are zero outside of their domain of definition. In other words, the sequences are zero-padded. If mode is 2, then the non-zero part of the full linear convolution is returned, so the output has length len (a)+len (v)-1. Call this output f. If mode is 0, then any part of f which was affected by the zero-padding is chopped from the result. In other words, let b be the input with smallest length and let c be the other input. The output when mode is 0 is the middle len (c)-len (b)+1 elements of f. When mode is 1, the output is the same size as c and is equal to the middle len (c) elements of f.
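The three modes can be sketched in plain Python 3 on flat lists. This is an illustration of the description above, not the Numeric code; the name convolve_sketch is hypothetical, and when the "middle" of f cannot be centred exactly in mode 1 the sketch rounds the starting position down, which is an assumption.

```python
def convolve_sketch(a, v, mode=2):
    """Linear convolution of two flat sequences with zero-padding."""
    # Full result f, of length len(a) + len(v) - 1.
    full = [0] * (len(a) + len(v) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(v):
            full[i + j] += x * y
    if mode == 2:
        return full
    longer, shorter = max(len(a), len(v)), min(len(a), len(v))
    if mode == 1:
        keep = longer                 # same size as the longer input
    elif mode == 0:
        keep = longer - shorter + 1   # only entries untouched by the padding
    else:
        raise ValueError("mode must be 0, 1 or 2")
    start = (len(full) - keep) // 2   # take the middle `keep` entries
    return full[start:start + keep]


# (1 + 2x) * (1 + 3x) = 1 + 5x + 6x^2
print(convolve_sketch([1, 2], [1, 3]))            # [1, 5, 6]
print(convolve_sketch([1, 2, 3], [0, 2, 1], 2))   # [0, 2, 5, 8, 3]
print(convolve_sketch([1, 2, 3], [0, 2, 1], 1))   # [2, 5, 8]
print(convolve_sketch([1, 2, 3], [0, 2, 1], 0))   # [5]
```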

cross_correlate (a, v, mode=0)

The cross_correlate function computes the cross_correlation between two rank 1 arrays. The output is a rank 1 array representing the inner product of a with shifted versions of v. This is very similar to convolution. The difference is that convolution reverses the axis of one of the input sequences but cross_correlation does not. In fact it is easy to verify that convolve (a, v, mode) = cross_correlate (a, v [::-1], mode)
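The identity relating the two functions can be checked with a small plain Python 3 sketch that computes only the full (mode 2) results. Both function names are hypothetical, and the ordering of the output lags in correlate_full is an assumption chosen so that the stated identity holds.

```python
def convolve_full(a, v):
    """Full linear convolution: out[i + j] accumulates a[i] * v[j]."""
    out = [0] * (len(a) + len(v) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(v):
            out[i + j] += x * y
    return out


def correlate_full(a, v):
    """Full cross-correlation: the same sums, but v is traversed backwards."""
    last = len(v) - 1
    out = [0] * (len(a) + len(v) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(v):
            out[i + last - j] += x * y
    return out


a = [1, 2, 3]
v = [0, 2, 1]
print(convolve_full(a, v))                                  # [0, 2, 5, 8, 3]
print(correlate_full(a, v))                                 # [1, 4, 7, 6, 0]
print(convolve_full(a, v) == correlate_full(a, v[::-1]))    # True
```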

where (condition, x, y)

The where function creates an array whose values are those of x at those indices where condition is true, and those of y otherwise. The shape of the result is the shape of condition. The type of the result is determined by the types of x and y. Either or both of x and y can be a scalar, which is then used for every element of the result that is drawn from it.
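The selection, including the scalar case, can be sketched in plain Python 3 for flat sequences. The name where_flat is hypothetical.

```python
def where_flat(condition, x, y):
    """Pick from x where condition is true and from y elsewhere."""
    def pick(source, position):
        # A sequence is indexed by position; a scalar is used as it is.
        if isinstance(source, (list, tuple)):
            return source[position]
        return source

    return [pick(x, i) if flag else pick(y, i)
            for i, flag in enumerate(condition)]


print(where_flat([1, 0, 1, 0], [10, 20, 30, 40], -1))   # [10, -1, 30, -1]
```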

identity(n)

The identity function returns an n by n array where the diagonal elements are 1, and the off-diagonal elements are 0.

>>> print identity(5)

[[1 0 0 0 0]

[0 1 0 0 0]

[0 0 1 0 0]

[0 0 0 1 0]

[0 0 0 0 1]]

sum(a, index=0)

The sum function is a synonym for the reduce method of the add ufunc. It returns the sum of all of the elements in the sequence given along the specified axis (first axis by default).

>>> print x

[[ 0 1 2 3]

[ 4 5 6 7]

[ 8 9 10 11]

[12 13 14 15]

[16 17 18 19]]

>>> print sum(x)

[40 45 50 55] # 0+4+8+12+16, 1+5+9+13+17, 2+6+10+14+18, ...

>>> print sum(x, 1)

[ 6 22 38 54 70] # 0+1+2+3, 4+5+6+7, 8+9+10+11, ...
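The two axis choices can be sketched in plain Python 3 with nested lists. This is only an illustration for rank-2 data; the variable names are assumptions made for the sketch.

```python
x = [list(range(start, start + 4)) for start in range(0, 20, 4)]

# Along the first axis: add the rows together, giving one total per column.
column_totals = [sum(column) for column in zip(*x)]
# Along the second axis: add up each row, giving one total per row.
row_totals = [sum(row) for row in x]

print(column_totals)   # [40, 45, 50, 55]
print(row_totals)      # [6, 22, 38, 54, 70]
```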

cumsum(a, index=0)

The cumsum function is a synonym for the accumulate method of the add ufunc.

product(a, index=0)

The product function is a synonym for the reduce method of the multiply ufunc.

cumproduct(a, index=0)

The cumproduct function is a synonym for the accumulate method of the multiply ufunc.

alltrue(a, index=0)

The alltrue function is a synonym for the reduce method of the logical_and ufunc.

sometrue(a, index=0)

The sometrue function is a synonym for the reduce method of the logical_or ufunc.
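The reduce and accumulate patterns behind these synonyms can be sketched with the Python 3 standard library on a flat sequence. The code uses functools.reduce and itertools.accumulate as stand-ins for the ufunc methods; it is not how Numeric implements them.

```python
from functools import reduce
from itertools import accumulate
import operator

values = [1, 2, 3, 4]
flags = [1, 0, 3]

total = reduce(operator.add, values)                       # like sum
running_total = list(accumulate(values, operator.add))     # like cumsum
product = reduce(operator.mul, values)                     # like product
running_product = list(accumulate(values, operator.mul))   # like cumproduct
every = reduce(lambda p, q: bool(p) and bool(q), flags)    # like alltrue
some = reduce(lambda p, q: bool(p) or bool(q), flags)      # like sometrue

print(total, running_total)       # 10 [1, 3, 6, 10]
print(product, running_product)   # 24 [1, 2, 6, 24]
print(every, some)                # False True
```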

allclose (x, y, rtol = 1.e-5, atol = 1.e-8)

This function tests whether or not arrays x and y of an integer or real type are equal subject to the given relative and absolute tolerances. The formula used is:

| x - y | < atol + rtol * | y |

This means essentially that both elements are small compared to atol or their difference divided by y's value is small compared to rtol .
