Universal functions (ufunc) are special NumPy functions that operate on ndarrays in an element-by-element fashion.
They represent a vast array of vectorized functions that perform much better than iterative implementations and let you write concise code. Most ufuncs achieve this by providing a Python wrapper around a C implementation.
In this article, we will learn about some of these functions and a related topic called array-oriented programming.
Let's get started!
It's about speed and clarity
Before exploring some of the functions you'll probably use most often, let's run a quick comparison between a universal function and an equivalent iterative implementation.
We will square every entry in a NumPy array using both a universal function and an iterative implementation. We will time each approach using the %%time cell magic in a Jupyter notebook.
import numpy as np
arr = np.arange(100000000)

%%time
squares_1 = np.square(arr)

CPU times: user 129 ms, sys: 200 ms, total: 329 ms
Wall time: 328 ms

%%time
squares_2 = np.empty(len(arr), dtype=np.int64)
for i in range(len(arr)):
    squares_2[i] = arr[i] ** 2

CPU times: user 32.7 s, sys: 133 ms, total: 32.8 s
Wall time: 32.8 s
Wow, that's about 100 times faster for the universal function case. Not only that, but the code is also much more straightforward and easy to understand.
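If you are not working in a notebook, a comparison along the same lines can be sketched with the standard-library timeit module. This is only a rough sketch: the helper names square_with_ufunc and square_with_loop are made up for illustration, the array is much smaller than the one above so the loop finishes quickly, and the timings will differ from machine to machine.

```python
import timeit

import numpy as np

# A much smaller array than in the article, so the loop version stays fast.
small = np.arange(100_000, dtype=np.int64)


def square_with_ufunc(values):
    """Square every element with the np.square universal function."""
    return np.square(values)


def square_with_loop(values):
    """Square every element one at a time in a Python loop."""
    out = np.empty(len(values), dtype=values.dtype)
    for i in range(len(values)):
        out[i] = values[i] ** 2
    return out


# Both approaches must produce the same result before we compare speed.
assert np.array_equal(square_with_ufunc(small), square_with_loop(small))

# Time five runs of each approach.
ufunc_seconds = timeit.timeit(lambda: square_with_ufunc(small), number=5)
loop_seconds = timeit.timeit(lambda: square_with_loop(small), number=5)
print(f"ufunc: {ufunc_seconds:.4f} s, loop: {loop_seconds:.4f} s")
```

The exact ratio depends on the array size and the hardware, so treat the printed numbers as an indication rather than a benchmark.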
Good, now that we know they are really cool things, let's explore some universal functions:
# We will use the following arrays for the examples
even = np.arange(0, 20, 2)
odd = np.arange(1, 20, 2)
print(even)
print(odd)

[ 0 2 4 6 8 10 12 14 16 18]
[ 1 3 5 7 9 11 13 15 17 19]
# add performs element-wise addition
result = np.add(even, odd)
print(result)

[ 1 5 9 13 17 21 25 29 33 37]

# subtract performs element-wise subtraction
result = np.subtract(even, odd)
print(result)

[-1 -1 -1 -1 -1 -1 -1 -1 -1 -1]

# multiply performs element-wise multiplication
result = np.multiply(odd, even)
print(result)

[ 0 6 20 42 72 110 156 210 272 342]
# divide performs element-wise division (first_argument / second_argument)
result = np.divide(even, odd)
print(result)

[0. 0.66666667 0.8 0.85714286 0.88888889 0.90909091 0.92307692 0.93333333 0.94117647 0.94736842]
# maximum returns an element-wise maximum
result = np.maximum(even, odd)
print(result)

[ 1 3 5 7 9 11 13 15 17 19]

# minimum returns an element-wise minimum
result = np.minimum(even, odd)
print(result)

[ 0 2 4 6 8 10 12 14 16 18]
# greater returns True where the value in the first argument is larger than the one in the second
# there is also greater_equal, which performs the comparison with >=
result = np.greater(even, odd)
print(result)

[False False False False False False False False False False]

# less returns True where the value in the first argument is smaller than the one in the second
# there is also less_equal, which performs the comparison with <=
result = np.less(even, odd)
print(result)

[ True True True True True True True True True True]
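The even and odd arrays never contain equal values at the same position, so they cannot show how the strict and non-strict comparisons differ. Here is a small sketch with two made-up arrays (a and b are assumptions for illustration, not part of the examples above) where one pair of elements is equal:

```python
import numpy as np

# Hypothetical arrays chosen so that the middle elements are equal.
a = np.array([1, 2, 3])
b = np.array([2, 2, 2])

# Strict comparisons are False where the elements are equal.
print(np.greater(a, b))        # [False False  True]
print(np.less(a, b))           # [ True False False]

# The *_equal variants are True where the elements are equal.
print(np.greater_equal(a, b))  # [False  True  True]
print(np.less_equal(a, b))     # [ True  True False]
```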
# sin calculates the trigonometric sine element-wise
result = np.sin(even)
print(result)

[ 0. 0.90929743 -0.7568025 -0.2794155 0.98935825 -0.54402111 -0.53657292 0.99060736 -0.28790332 -0.75098725]

# sqrt calculates the non-negative square root of an array element-wise
result = np.sqrt(even)
print(result)

[0. 1.41421356 2. 2.44948974 2.82842712 3.16227766 3.46410162 3.74165739 4. 4.24264069]

# cbrt calculates the cube root of an array element-wise
result = np.cbrt(even)
print(result)

[0. 1.25992105 1.58740105 1.81712059 2. 2.15443469 2.28942849 2.41014226 2.5198421 2.62074139]

# log calculates the natural logarithm element-wise
result = np.log(odd)
print(result)

[0. 1.09861229 1.60943791 1.94591015 2.19722458 2.39789527 2.56494936 2.7080502 2.83321334 2.94443898]

# log2 calculates the base 2 logarithm element-wise
result = np.log2(odd)
print(result)

[0. 1.5849625 2.32192809 2.80735492 3.169925 3.45943162 3.70043972 3.9068906 4.08746284 4.24792751]
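Many of these functions also have an operator form: on NumPy arrays, the arithmetic and comparison operators give the same results as the corresponding ufuncs. The short sketch below rebuilds the even and odd arrays and checks that equivalence for a few of the functions covered so far:

```python
import numpy as np

even = np.arange(0, 20, 2)
odd = np.arange(1, 20, 2)

# The functions used above are instances of NumPy's ufunc type.
assert isinstance(np.add, np.ufunc)
assert isinstance(np.sqrt, np.ufunc)

# Operators on arrays produce the same results as the named ufuncs.
assert np.array_equal(even + odd, np.add(even, odd))
assert np.array_equal(even - odd, np.subtract(even, odd))
assert np.array_equal(odd * even, np.multiply(odd, even))
assert np.array_equal(even / odd, np.divide(even, odd))
assert np.array_equal(even > odd, np.greater(even, odd))
assert np.array_equal(even <= odd, np.less_equal(even, odd))

print("operators and ufuncs agree")
```

Which form to use is mostly a matter of readability: operators keep formulas short, while the named functions make it explicit that a ufunc is being called.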
Array-oriented programming is the practice of replacing loops with vectorized operations. NumPy ships with a large number of functions you can use to express solutions without writing loops. This lets you solve problems using an intuitive syntax that other programmers and scientists will have a much easier time understanding.
As an example, imagine the following problem: you have two matrices, each holding the length of one leg for a collection of 9 right triangles. You are tasked with calculating a third matrix where each entry is the hypotenuse of the corresponding triangle.
sides_a = np.random.randint(low=1, high=10, size=9).reshape(3, 3)
sides_b = np.random.randint(low=1, high=10, size=9).reshape(3, 3)
print(sides_a)
print(sides_b)

[[5 6 9]
 [8 1 4]
 [2 8 6]]
[[5 9 6]
 [5 2 8]
 [4 6 9]]

# we can express the solution in a single line using intuitive syntax
hypotenuse = np.sqrt(sides_a**2 + sides_b**2)
print(hypotenuse)

[[ 7.07106781 10.81665383 10.81665383]
 [ 9.43398113  2.23606798  8.94427191]
 [ 4.47213595 10.         10.81665383]]
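To see what the vectorized line replaces, here is a sketch of the same calculation written with explicit loops. It hard-codes the two matrices printed above instead of drawing new random ones, and the name hypotenuse_loop is made up for illustration. It also checks the result against np.hypot, a NumPy ufunc that computes the hypotenuse directly.

```python
import math

import numpy as np

# The same matrices that were printed in the example above.
sides_a = np.array([[5, 6, 9], [8, 1, 4], [2, 8, 6]])
sides_b = np.array([[5, 9, 6], [5, 2, 8], [4, 6, 9]])

# Loop-based version: visit every entry and apply the Pythagorean theorem.
hypotenuse_loop = np.empty(sides_a.shape, dtype=float)
for row in range(sides_a.shape[0]):
    for col in range(sides_a.shape[1]):
        a = sides_a[row, col]
        b = sides_b[row, col]
        hypotenuse_loop[row, col] = math.sqrt(a**2 + b**2)

# Array-oriented version: one expression, no explicit loops.
hypotenuse = np.sqrt(sides_a**2 + sides_b**2)

# Both versions agree, and so does the dedicated np.hypot ufunc.
assert np.allclose(hypotenuse_loop, hypotenuse)
assert np.allclose(np.hypot(sides_a, sides_b), hypotenuse)
```

The loop version needs index bookkeeping and a preallocated result, while the array-oriented version reads almost like the formula itself.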
With some practice, you will be able to create incredibly intuitive code for solving numerical problems using NumPy. It's not only more efficient than loop-based solutions, but it's also much easier to read and understand!
The fun is not over yet
We explored just a very small subset of the functionality you can get from ufuncs. If you want to know what else you can do with them, check the official documentation.
Now you are entering the realm of advanced NumPy, and the range of problems you can solve is much wider. Next week, we will take a look at reductions (also known as aggregations), a set of very useful functions for statistical analysis.
Thank you for reading!
What to do next
- Share this article with friends and colleagues. Thank you for helping me reach people who might find this information useful.
- You can find the source code for this series in this repo.
- This article is based on Python for Data Analysis. These and other very helpful books can be found in the recommended reading list.
- Send me an email with questions, comments, or suggestions (the address is on the About Me page).