|
An Array Module for Python
Perry Greenfield, Todd Miller, Jin-Chung Hsu, Richard L. White
Space Telescope Science Institute,
3700 San Martin Dr.
Baltimore, MD 21218
Abstract:
Although Python has long had a module for numeric array
manipulations, it has had some shortcomings that prevent it from
being as useful for astronomy applications as it could be. We have
re-implemented the module to handle large arrays in a more
memory-efficient manner, and to support direct access of data in tables
and non-native data formats. The new module has been implemented
mostly in Python although the core computational loops are performed
in C for efficiency. The new approach allows arrays to be subclassed
as well as new kinds of array objects, such as record arrays,
to be created. This paper discusses the design and implementation
issues that were addressed in the development of the new array
module for Python and gives examples of its use.
Keywords:
Python, numeric, array
To be published in proceedings of Astronomical Data Analysis Software and Systems XI
Conference held in Victoria, BC, October 2001
The new module, numarray, is a Python module that provides array manipulation
facilities similar to those in IDL, Matlab, APL, Octave and other array
manipulation languages. We chose to develop a new module to replace
the
existing Python Numeric
module
(http://pfdubois.com/numpy/)
because: the existing code is difficult
to maintain and will not be accepted as part of the Python Standard Library
until it is rewritten; existing coercion rules waste memory when arrays are
combined with Python scalars in expressions;
efficient access of data in record arrays (i.e., tables) is not possible;
memory-mapped data cannot be used; there is no convenient means of using index
arrays (e.g., subscript arrays in IDL); temporary arrays are over-used thus
wasting memory; and there are no options for customizing the handling of
numeric exceptions (e.g., divide-by-zero, overflows, etc.).
Implementing a new module that addresses these issues poses some
design challenges. Expressions involving arrays of many
possible types must solve the problem of avoiding large amounts
of C code to handle all possible combinations of types without
requiring potentially large temporary arrays or introducing unacceptable
inefficiency within the computational loop. Furthermore when
data are stored in tables there arises the potential of
alignment problems. Some CPUs can only deal with multi-byte
types (e.g. floating point) if the data are appropriately
aligned in memory. For example, 4-byte floating point or
integer data may be required to start at memory addresses that
are divisible by 4. When tables mix columns of varying sizes,
that implies that some of the data may not be aligned. This
will happen, for example, when a table consists of two columns,
one Int16 and the other Float32. Many of the Float32 values will
be misaligned for some processsors (e.g., Sun Sparcs) and must
be copied to an aligned memory location before it can be accessed
as a float. Accessing such arrays embedded in tables requires
implicit copying; can this be done without using large amounts
of memory resulting from copying the whole array to a temporary
array? If memory mapping is to
be useful some data will be byteswapped on disk relative
to the CPU representation as will be the case for FITS files
on Intel CPUs. Again, how does one
prevent having to copy the whole array in memory and defeating
much of the advantage of memory mapping? This also exacerbates the
type handling issue because of the very large numbers of combinations
of types and representations. Finally, how does one allow for flexible
exception handling?
We approached the initial design and implementation with the following
principles in mind
- Write as much as possible in Python to speed development, more
easily test design ideas, and make design changes easier.
- Make performance of large arrays both speed and memory efficient.
All innermost computational loops on array elements must be coded
in C.
- Defer optimization of speed on smaller arrays (<10,000 elements).
- Eliminate redundant code and use code generators when appropriate.
- Incorporate extensive regression tests during development.
- Use evolutionary development by starting simple. Refactor
when new features are added; do not over-design in anticipation
of future features.
- Make available on all commonly available platforms (including Windows).
- Make it Open Source and host on Sourceforge.
- Make compatible with Python Numeric in the absence of a strong
reason to do differently. Do not make it look like a IDL or a Matlab clone
but instead retain a strong Python consistency.
- 100% compatiblity with Numeric is not a requirement.
We chose to split the implementation into two basic classes.
One, NDArray, handles all actions that do not require knowledge
of array contents or types. This includes all structural operations
from array indexing, reshaping, reading and writing, concatenation,
etc. This class can be used to develop more complex array classes,
such as arrays of records or even arbitrary Python objects,
without having to re-implement many
common array structural operations. The second array class,
NumArray, inherits from NDArray and handles all
numeric aspects such as type conversion, numeric operators and
functions, etc.
The initial set of numeric types consists of Bool, Int8, UInt8
Int16, UInt16, Int32, Float32, Float64, Complex64, and Complex128
as well as byte-swapped and non-aligned representations.
One may use index arrays of arbitrary shape for each
dimension of the array being indexed. Included in the initial
version are classes for RecArray (tables) and CharArray
(for character fields in tables)
There are various approaches to the handling of all the
possible types and representations, particularly with regard
to binary operations. We chose a hybrid approach for
handling all possible combinations of types and representations.
To avoid creating
many versions of binary functions we convert the input arrays
to a common type before calling the function. To avoid creating
large temporary arrays, we iterate over large input arrays in
sub-blocks. To illustrate, if two large 2Kx2K arrays were added,
one Int16 and the other Float32, the Int16 array would be converted
a few rows at a time to Float32 and then added to the corresponding
rows of the Float32 array. If the sub-block size is kept small
enough, the impact on memory usage is negligible and the data
can all be kept in the memory cache preventing a significant
performance problem. If the block size is large enough, the
overhead of iterating over blocks is not significant. In the
initial version of numarray, the block size is set at 10,000 bytes.
This mechanism also easily handles byteswapped or non-aligned
data.
Since much of the C code is of similar form, differing only
in the types involved or the operation being carried out, a code
generator was written in Python to dynamically generate most
of the C code for the module when it is built. Much of the
code generation is table driven.
Exception handling is performed by clearing the IEEE 754
floating point status flags before an array operation and
checking them after its completion. Numarray presents the
user with three options for handling exceptions: 1) ignore them,
2) print a warning message, or 3) raise a Python exception.
These options may be set independently for each of the
four types of floating point exceptions. Integer exceptions
can be similarly handled. However, there are no means for
identifying which array elements caused the exception
except for checking the individual elements of the results
(or the individual elements of the inputs).
We retain the ability of Numeric to generate mulitple arrays
that share all or parts of the same data buffer (as is done
when a slice of an array is requested). One must explicitly
ask for array data to be copied.
How numarray arrays work is best illustrated with a few examples
from a Python interactive session.
>>> a = array([1, 3, -2, 0]) # 4 element Int32 array
>>> a[1] # 0 indexed!
3
>>> b = arange(4) # Creates array with value=index
>>> print b
[0 1 2 3]
>>> a + b
array([1, 4, 0, 3])
>>> d = arange(12)
>>> d[a+b] # index array
array([1, 4, 0, 3])
>>> add.accumulate(d) # cumulative sum
array([0, 1, 3, 6, 10, 15, 21, 28, 36, 45, 55, 66])
>>> r = reshape(d, (2,6)) # convert to 2x6 2-d array
>>> r
array([[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11]])
>>> # all values in 1st dim, every other value
>>> # starting at 2nd element in the 2nd dim.
>>> r[:,1:6:2] # 'slicing'
array([[ 1, 3, 5],
[ 7, 9, 11]])
>>> add.reduce(r, dim=1) # sum along 2nd dim
array([15, 51])
Record Arrays:
Create 3-element record array of integer, float and text fields
>>> r = recarray.array([(100, 2.5, `abc'), \
(200, 3.5, `xyz'), \
(300, 4.1, `pqr')], names=`a,b,c')
>>> print r[0] # first row
(100, 2.5, `abc')
>>> columnb = r.field(`b') # Numeric array view of column b
>>> print columnb
[ 2.5 3.5 4.1]
>>> columna = r.field(`a') # Likewise for column a
>>> print columna * columnb
[ 250. 700. 1230.]
>>> columna[0] = 3000 # Change first element in column
>>> # illustrating shared nature of
>>>print r[0] # records and column data
(3000, 2.5, `abc')
- Benchmarking and improving performance for small arrays.
- More complete C API to aid writing C functions that use numarray data.
- Inclusion of standard libraries that exist in Numeric into numarray
(FFT, random number generators, etc.)
- Generating mask arrays from invalid floating point results.
- Inclusion of numarray into the Python Standard Library.
Last Updated 2001 Nov 1
|
 |
|