This chapter gives a high-level overview of the extensions and defines the key components of the system. The concepts defined here are used throughout the remaining sections.
Numeric Python consists of a set of modules:
The Numeric module defines two new object types, a set of functions which manipulate these objects, and functions which convert between them and other Python types. The two object types are the array object (technically called a multiarray object) and the universal function (technically a ufunc object).
Array objects are generally homogeneous collections of numbers, and they may hold a very large number of elements. All numbers in a multiarray are of the same kind (i.e., they share one number representation, such as double-precision floating point). Array objects must be full (no empty cells are allowed), and their size is immutable. The individual numbers within them can change throughout the life of the array.
Note: In some applications arrays of numbers may contain entries representing invalid or missing values. An optional package "MA" is available to represent such arrays. Attempting to do so by using NaN as a value may lead to disappointment or lack of portability.
Mathematical operations on arrays return new arrays containing the results of these operations performed elementwise on the arguments of the operation.
The size of an array is the total number of elements therein (it can be 0 or more). It does not change throughout the life of the array.
The shape of an array is the number of dimensions of the array and its extent in each of these dimensions (it can be 0, 1 or more). It can change throughout the life of the array. In Python terms, the shape of an array is a tuple of integers, one integer for each dimension that represents the extent in that dimension.
The rank of an array is the number of dimensions along which it is defined; in other words, the rank is the length of the shape. Like the shape, it can change throughout the life of the array.
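The definitions of size, shape, and rank can be made concrete with a small plain-Python sketch that operates on nested lists. It is an illustration only, not how Numeric is implemented: the helpers shape_of, rank_of, and size_of are hypothetical names introduced here, and the sketch assumes the nesting is full and rectangular.

```python
# Plain-Python model of shape, rank, and size for nested lists.
# shape_of, rank_of, and size_of are hypothetical helpers, not Numeric functions.

def shape_of(data):
    """Return the extent along each dimension as a tuple of integers."""
    if not isinstance(data, (list, tuple)):
        return ()  # a bare number has no dimensions (rank 0)
    if len(data) == 0:
        return (0,)
    # Assumes a full, rectangular nesting: every sub-list has the same shape.
    return (len(data),) + shape_of(data[0])

def rank_of(data):
    """The rank is the length of the shape."""
    return len(shape_of(data))

def size_of(data):
    """The size is the product of the extents (1 for a rank-0 value)."""
    total = 1
    for extent in shape_of(data):
        total *= extent
    return total

vector = [1, 2, 3, 4, 5]
matrix = [[0, 1], [1, 3]]

print(shape_of(vector))  # (5,)
print(shape_of(matrix))  # (2, 2)
print(rank_of(matrix))   # 2
print(size_of(matrix))   # 4
```

Because the size is the product of the extents, an array whose shape changes from (4,) to (2, 2) keeps the same size, which is why the shape and rank can change while the size cannot.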
The typecode of an array is a single character description of the kind of element it contains (number format, character or Python reference). It determines the itemsize of the array.
The itemsize of an array is the number of 8-bit bytes used to store a single element in the array. The total memory used by an array tends to its size times its itemsize, as the size goes to infinity (there is a fixed overhead per array, as well as a fixed overhead per dimension).
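The relation between typecode, itemsize, and memory use can be observed with the array module from the Python standard library. That module is a different, much simpler facility than Numeric's array object, and its typecodes are not guaranteed to match Numeric's, so treat this as an analogy rather than a description of Numeric.

```python
from array import array  # standard-library module, not Numeric's array

# 'd' is the standard-library typecode for C double-precision floats.
values = array('d', [1.0, 2.0, 3.0, 4.0])

size = len(values)           # number of elements
itemsize = values.itemsize   # bytes per element, fixed by the typecode
payload = size * itemsize    # bytes of element data, excluding fixed overhead

print(values.typecode)
print(itemsize)
print(payload == len(values.tobytes()))  # True
```

The product size * itemsize counts only the element data; the fixed per-array overhead mentioned above comes on top of it.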
To put this in more familiar mathematical language: A vector is a rank-1 array (it has only one dimension along which it can be indexed). A matrix as used in linear algebra is a rank-2 array (it has two dimensions along which it can be indexed). There are also rank-0 arrays, which can hold single scalars -- they have no dimension along which they can be indexed, but they contain a single number.
Here is an example of Python code using the array objects (lines beginning with >>> are user input; all other lines are computer output):
>>> vector1 = array((1,2,3,4,5))
>>> matrix1 = array(([0,1],[1,3]))
>>> print vector1.shape, matrix1.shape
(5,) (2, 2)
>>> print matrix1 * matrix1
[[0 1] # note that this is not the matrix
 [1 9]] # multiplication of linear algebra
If this example does not work for you because it complains of an unknown name "array", you forgot to begin your session with
>>> from Numeric import *
Just like all Python modules and packages, the Numeric module can be invoked using either the import Numeric form or the from Numeric import ... form. Because most of the functions we'll talk about are in the Numeric module, all of the code samples in this document assume that they have been preceded by the statement from Numeric import *.
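Returning to the array example, the comment on its last output points out that multiplying two arrays is elementwise and is not the matrix multiplication of linear algebra. The following plain-Python sketch contrasts the two operations on the same 2x2 data. The helper names elementwise_product and matrix_product are hypothetical and do not come from Numeric.

```python
# Contrast elementwise multiplication with linear-algebra matrix multiplication.
# Both helpers are hypothetical illustrations working on nested lists.

matrix1 = [[0, 1], [1, 3]]

def elementwise_product(a, b):
    """Multiply corresponding entries, as the array * operator does."""
    return [[x * y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]

def matrix_product(a, b):
    """Row-by-column product, as defined in linear algebra."""
    columns_of_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, column)) for column in columns_of_b]
            for row in a]

print(elementwise_product(matrix1, matrix1))  # [[0, 1], [1, 9]]
print(matrix_product(matrix1, matrix1))       # [[1, 3], [3, 10]]
```

The first result matches the output of the array example; the second is what a reader expecting linear-algebra semantics would have anticipated.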
Universal functions (ufuncs) are functions which operate on arrays and other sequences. Most ufuncs perform mathematical operations on their arguments elementwise, just as the mathematical operations on arrays described above do.
Here is an example of Python code using the ufunc objects:
>>> print sin([pi/2., pi/4., pi/6.])
[ 1.          0.70710678  0.5       ]
>>> print greater([1,2,4,5], [5,4,3,2])
[0 0 1 1]
>>> print add([1,2,4,5], [5,4,3,2])
[6 6 7 7]
>>> print add.reduce([1,2,4,5])
12
Ufuncs are covered in detail in Ufuncs.
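As a rough mental model, a binary ufunc can be thought of as an ordinary two-argument function that is applied pairwise across its arguments and that also offers a reduce method. The sketch below models this in plain Python. The class ElementwiseFunction is a hypothetical stand-in written for this illustration; it is not Numeric's ufunc type and it ignores array shapes, typecodes, and performance.

```python
import operator
from functools import reduce

class ElementwiseFunction:
    """Hypothetical stand-in for a binary ufunc (illustration only)."""

    def __init__(self, func):
        self.func = func  # an ordinary two-argument function

    def __call__(self, a, b):
        # Apply the function to corresponding elements of both sequences.
        return [self.func(x, y) for x, y in zip(a, b)]

    def reduce(self, a):
        # Fold the function over one sequence, left to right.
        return reduce(self.func, a)

add = ElementwiseFunction(operator.add)
greater = ElementwiseFunction(lambda x, y: int(x > y))

print(add([1, 2, 4, 5], [5, 4, 3, 2]))      # [6, 6, 7, 7]
print(greater([1, 2, 4, 5], [5, 4, 3, 2]))  # [0, 0, 1, 1]
print(add.reduce([1, 2, 4, 5]))             # 12
```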
In addition to the functions needed to create the objects above, the Numeric module provides a set of powerful functions to manipulate arrays, to select subsets of arrays based on the contents of other arrays, and to perform other array-processing operations.
>>> data = arange(10) # convenient homolog of builtin range()
>>> print where(greater(data, 5), -1, data)
[ 0 1 2 3 4 5 -1 -1 -1 -1] # selection facility
>>> data = resize(array((0,1)), (9, 9))
All of the functions which operate on NumPy arrays are described in Array Functions.
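To make the selection and resizing calls in the example above easier to picture, here is a plain-Python sketch of the behavior they rely on. The helpers where_sketch and resize_sketch are hypothetical simplifications: where_sketch accepts only a scalar replacement value, and resize_sketch handles only two-dimensional shapes, which is narrower than what the Numeric functions accept.

```python
from itertools import cycle, islice

def where_sketch(condition, if_true, data):
    """Pick if_true where condition holds, else the matching element of data."""
    return [if_true if flag else value for flag, value in zip(condition, data)]

def resize_sketch(values, shape):
    """Fill a rows-by-columns nested list by repeating values cyclically."""
    rows, columns = shape
    stream = cycle(values)  # endless repetition of the input values
    return [list(islice(stream, columns)) for _ in range(rows)]

data = list(range(10))
print(where_sketch([x > 5 for x in data], -1, data))
# [0, 1, 2, 3, 4, 5, -1, -1, -1, -1]

for row in resize_sketch((0, 1), (3, 3)):
    print(row)
# [0, 1, 0]
# [1, 0, 1]
# [0, 1, 0]
```

With an odd number of columns, repeating the pair (0, 1) produces a checkerboard pattern, which is what the (9, 9) call in the example builds on a larger scale.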