Linear algebra and functions with NumPy
The NumPy linear algebra functions rely on BLAS and LAPACK to provide efficient low-level implementations of standard linear algebra algorithms.
The Linear Algebra module of NumPy offers various methods to apply linear algebra on any NumPy array.
One can find:
- rank, determinant, trace, etc. of an array.
- eigenvalues of matrices
- matrix and vector products (dot, inner, outer, etc. product), matrix exponentiation
- solve linear or tensor equations.
The SciPy library also contains a linalg submodule, and there is overlap in the functionality provided by the SciPy and NumPy submodules.
Commonly used linear algebra functions (diag, dot, and trace live in the top-level numpy namespace; det, eig, and inv live in numpy.linalg):
- diag: Return the diagonal (or off-diagonal) elements of a square matrix as a 1D array, or convert a 1D array into a square matrix with zeros on the off-diagonal
- dot: Matrix multiplication
- trace: Compute the sum of the diagonal elements
- det: Compute the matrix determinant
- eig: Compute the eigenvalues and eigenvectors of a square matrix
- inv: Compute the inverse of a square matrix
# Importing numpy as np
import numpy as np

A = np.array([[5, 1, 2], [4, 1, 5], [2, 8, 7]])

# Rank of a matrix
print("Rank of A:", np.linalg.matrix_rank(A))

# Trace of matrix A
print("\nTrace of A:", np.trace(A))

# Determinant of a matrix
print("\nDeterminant of A:", np.linalg.det(A))

# Inverse of matrix A
print("\nInverse of A:\n", np.linalg.inv(A))

# Matrix A raised to power 2
print("\nMatrix A raised to power 2:\n", np.linalg.matrix_power(A, 2))
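The listed functions that the example above does not exercise (diag, dot, eig), plus solving a linear system, can be sketched as follows. This is a minimal illustration that reuses the same matrix A; the right-hand side b is a made-up vector chosen only for demonstration.

```python
import numpy as np

# Same matrix as in the example above.
A = np.array([[5, 1, 2], [4, 1, 5], [2, 8, 7]])
b = np.array([1, 2, 3])  # hypothetical right-hand side, for illustration only

# diag: extract the main diagonal, or build a diagonal matrix from a 1D array.
main_diag = np.diag(A)
D = np.diag([1, 2, 3])

# dot: matrix-vector product (A @ b gives the same result here).
Ab = np.dot(A, b)

# solve: find x such that A x = b without forming the inverse explicitly.
x = np.linalg.solve(A, b)

# eig: eigenvalues and eigenvectors (the eigenvectors are the columns of vecs).
vals, vecs = np.linalg.eig(A)

print("Diagonal of A:", main_diag)
print("Diagonal matrix:\n", D)
print("A . b:", Ab)
print("Solution x of A x = b:", x)
print("A x reproduces b:", np.allclose(A @ x, b))
print("Eigenvalues of A:", vals)
```

Using solve() rather than multiplying by inv(A) is the usual choice when only the solution of the system is needed, since it avoids computing the full inverse.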
What Is Vectorization?
“Vectorization” (simplified) is the process of rewriting a loop so that instead of processing a single element of an array N times, it processes (say) 4 elements of the array simultaneously N/4 times. It is faster because modern CPUs are optimized for such operations.
In NumPy, vectorized operations apply optimized, pre-compiled functions and mathematical operations to whole NumPy arrays and data sequences at once. This is generally faster than the equivalent element-by-element loop in Python.
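# importing the modules
import math
import timeit

import numpy as np

# vectorized operation, timed over 10000 runs
t_vectorized = timeit.timeit(lambda: np.exp(np.arange(100)), number=10000)
print("Time taken by vectorized operation:", t_vectorized)

# non-vectorized operation, timed over 10000 runs
t_loop = timeit.timeit(lambda: [math.exp(item) for item in range(100)], number=10000)
print("Time taken by non-vectorized operation:", t_loop)

The time taken by the vectorized operation is typically much less than that of the non-vectorized operation. (In IPython or Jupyter, the %timeit magic can be used for the same measurement.)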
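Besides being faster, the vectorized call should produce the same values as the explicit loop. The short check below is a sketch of that equivalence; the input array is an arbitrary small example.

```python
import math

import numpy as np

values = np.arange(5, dtype=float)  # arbitrary small input

# Explicit Python loop: one element at a time.
loop_result = np.array([math.exp(v) for v in values])

# Vectorized call: the loop over elements runs inside NumPy's compiled code.
vectorized_result = np.exp(values)

print("Same values:", np.allclose(loop_result, vectorized_result))
```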
What are ufuncs?
ufuncs stands for "universal functions" in NumPy. They operate element by element on arrays and are how NumPy implements vectorization, which is much faster than iterating over the elements in Python.
To create your own ufunc, define a function as you would any normal Python function, then wrap it with the frompyfunc() function, which returns a ufunc.
The frompyfunc() function takes the following arguments:
- function: the name of the function.
- inputs: the number of input arguments (arrays).
- outputs: the number of output arrays.
Example to create your own ufunc:
import numpy as np
def myadd(x, y):
    return x + y
myadd = np.frompyfunc(myadd, 2, 1)
print(myadd([1, 2, 3, 4], [5, 6, 7, 8]))
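One detail worth knowing about the example above: a ufunc built with frompyfunc() returns an array of Python objects (dtype object), not a numeric array. The sketch below shows this and converts the result with astype(); the name myadd_ufunc is only chosen here to keep the original function and the wrapped ufunc apart.

```python
import numpy as np

def myadd(x, y):
    return x + y

# Wrap the Python function: 2 inputs, 1 output.
myadd_ufunc = np.frompyfunc(myadd, 2, 1)

result = myadd_ufunc([1, 2, 3, 4], [5, 6, 7, 8])
print(result, result.dtype)  # the dtype is object

# Convert to a numeric dtype if numeric operations should follow.
as_int = result.astype(np.int64)

# The built-in ufunc np.add gives the same values with a numeric dtype directly.
builtin = np.add([1, 2, 3, 4], [5, 6, 7, 8])
print(np.array_equal(as_int, builtin))
```

Because the wrapped function is still called once per element in Python, a frompyfunc() ufunc offers broadcasting convenience rather than the speed of a built-in ufunc such as np.add.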
Simple arithmetic ufuncs:
Operator  Equivalent ufunc  Description
+         np.add            Addition (e.g., 1 + 1 = 2)
-         np.subtract       Subtraction (e.g., 3 - 2 = 1)
-         np.negative       Unary negation (e.g., -2)
*         np.multiply       Multiplication (e.g., 2 * 3 = 6)
/         np.divide         Division (e.g., 3 / 2 = 1.5)
//        np.floor_divide   Floor division (e.g., 3 // 2 = 1)
**        np.power          Exponentiation (e.g., 2 ** 3 = 8)
%         np.mod            Modulus/remainder (e.g., 9 % 4 = 1)
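The correspondence between operators and ufuncs can be checked directly. The following sketch applies each operator and its ufunc to a small example array and compares the results.

```python
import numpy as np

x = np.arange(1, 5)  # array([1, 2, 3, 4])

# Each pair holds (operator form, ufunc form); both should give the same array.
pairs = {
    "add": (x + 2, np.add(x, 2)),
    "subtract": (x - 2, np.subtract(x, 2)),
    "negative": (-x, np.negative(x)),
    "multiply": (x * 2, np.multiply(x, 2)),
    "divide": (x / 2, np.divide(x, 2)),
    "floor_divide": (x // 2, np.floor_divide(x, 2)),
    "power": (x ** 2, np.power(x, 2)),
    "mod": (x % 3, np.mod(x, 3)),
}

for name, (with_operator, with_ufunc) in pairs.items():
    print(name, with_operator, np.array_equal(with_operator, with_ufunc))
```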
Trigonometric functions:
These functions work in radians, so angles given in degrees must first be converted to radians by multiplying by pi/180 (or by calling deg2rad). They take an array as input.
- sin, cos, tan: compute the sine, cosine, and tangent of angles
- arcsin, arccos, arctan: calculate the inverse sine, cosine, and tangent
- hypot: calculate the hypotenuse of a given right triangle
- sinh, cosh, tanh: compute the hyperbolic sine, cosine, and tangent
- arcsinh, arccosh, arctanh: compute the inverse hyperbolic sine, cosine, and tangent
- deg2rad: convert degrees into radians
- rad2deg: convert radians into degrees
# Python code to demonstrate trigonometric functions
import numpy as np

# create an array of angles
angles = np.array([0, 30, 45, 60, 90, 180])

# convert degrees into radians using the deg2rad function
radians = np.deg2rad(angles)

# sine of angles
print('Sine of angles in the array:')
sine_value = np.sin(radians)
print(sine_value)

# inverse sine of sine values
print('Inverse Sine of sine values:')
print(np.rad2deg(np.arcsin(sine_value)))

# hyperbolic sine of angles
print('Sine hyperbolic of angles in the array:')
sineh_value = np.sinh(radians)
print(sineh_value)

# inverse sine hyperbolic (arcsinh recovers the angles in radians)
print('Inverse Sine hyperbolic:')
print(np.arcsinh(sineh_value))

# hypot function demonstration
base = 4
height = 3
print('hypotenuse of right triangle is:')
print(np.hypot(base, height))
Broadcasting and Shape manipulation
The term broadcasting refers to the ability of NumPy to treat arrays of different shapes during arithmetic operations. Arithmetic operations on arrays are usually done on corresponding elements. If two arrays are of exactly the same shape, then these operations are smoothly performed.
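If the dimensions of two arrays are dissimilar, plain element-to-element operations are not possible. NumPy can still operate on such arrays by broadcasting: the smaller array is stretched to the shape of the larger one so that the two become compatible.
Broadcasting is possible if the following rules are satisfied: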
- Array with smaller ndim than the other is prepended with ‘1’ in its shape.
- Size in each dimension of the output shape is the maximum of the input sizes in that dimension.
- An input can be used in calculation if its size in a particular dimension matches the output size or its value is exactly 1.
- If an input has a dimension size of 1, the first data entry in that dimension is used for all calculations along that dimension.
import numpy as np

a = np.array([[0.0, 0.0, 0.0], [10.0, 10.0, 10.0], [20.0, 20.0, 20.0], [30.0, 30.0, 30.0]])
b = np.array([1.0, 2.0, 3.0])

print('First array:')
print(a)
print()
print('Second array:')
print(b)
print()
print('First Array + Second Array')
print(a + b)

Output (exact spacing may vary with the NumPy version):
First array:
[[ 0.  0.  0.]
 [10. 10. 10.]
 [20. 20. 20.]
 [30. 30. 30.]]

Second array:
[1. 2. 3.]

First Array + Second Array
[[ 1.  2.  3.]
 [11. 12. 13.]
 [21. 22. 23.]
 [31. 32. 33.]]
This demonstrates how array b is broadcast to become compatible with a.
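The broadcasting rules can also be checked on shapes alone. The sketch below assumes NumPy 1.20 or later, where np.broadcast_shapes() is available; the arrays col and the length-4 vector are made-up examples, one compatible and one not.

```python
import numpy as np

a = np.zeros((4, 3))
b = np.array([1.0, 2.0, 3.0])  # shape (3,), treated as (1, 3)

# Resulting shape when (4, 3) and (3,) are broadcast together.
print(np.broadcast_shapes(a.shape, b.shape))

# A column of shape (4, 1) and a row of shape (3,) broadcast to (4, 3).
col = np.array([[0.0], [10.0], [20.0], [30.0]])
grid = col + b
print(grid)

# Shapes (4, 3) and (4,) are incompatible: the trailing sizes 3 and 4 differ
# and neither of them is 1, so NumPy raises a ValueError.
try:
    a + np.ones(4)
    compatible = True
except ValueError:
    compatible = False
print("(4, 3) + (4,) compatible:", compatible)
```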
Shape Manipulation
Shape manipulation changes the shape of a NumPy array, converting the initial array into an array or matrix of the required shape and size. This includes converting a one-dimensional array into a matrix and vice versa, and finding the transpose of a matrix, using different functions of the NumPy module.
Changing shape can be done with the following functions:
- reshape: Gives a new shape to an array without changing its data
- flat: A 1-D iterator over the array
- flatten: Returns a copy of the array collapsed into one dimension
- ravel: Returns a contiguous flattened array
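A short sketch of these four tools on a small example array follows. It also illustrates one practical difference: flatten() always copies, while ravel() returns a view when it can, so writing to the raveled array may change the original.

```python
import numpy as np

arr = np.arange(6)     # [0 1 2 3 4 5]
m = arr.reshape(2, 3)  # 2 rows, 3 columns, same data

flat_copy = m.flatten()  # always a copy
raveled = m.ravel()      # a view here, because m is contiguous

flat_copy[0] = 99  # does not affect m
raveled[1] = -1    # also changes m[0, 1], since raveled shares memory with m

# flat is an iterator over the elements in row-major order.
elements = [int(v) for v in m.flat]

print(m)
print("flatten shares memory with m:", np.shares_memory(flat_copy, m))
print("ravel shares memory with m:", np.shares_memory(raveled, m))
print("elements via flat:", elements)
```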
Transpose operations
- transpose: Permutes the dimensions of an array
- ndarray.T: Same as self.transpose()
- rollaxis: Rolls the specified axis backwards
- swapaxes: Interchanges the two axes of an array
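The effect of each transpose operation is easiest to see on the shape of a three-dimensional array. The following is a minimal sketch on an arbitrary (2, 3, 4) array.

```python
import numpy as np

t = np.arange(24).reshape(2, 3, 4)

shape_T = t.T.shape                               # all axes reversed
shape_perm = np.transpose(t, (1, 0, 2)).shape     # explicit axis order
shape_swapped = np.swapaxes(t, 0, 2).shape        # exchange axes 0 and 2
shape_rolled = np.rollaxis(t, 2).shape            # roll axis 2 back to position 0

print("original:", t.shape)
print("T:", shape_T)
print("transpose (1, 0, 2):", shape_perm)
print("swapaxes 0 and 2:", shape_swapped)
print("rollaxis 2:", shape_rolled)
```

All of these return views of the same data where possible; only the way the axes are indexed changes.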
Boolean Masking
Masking comes up when you want to extract, modify, count, or otherwise manipulate values in an array based on some criterion: for example, you might wish to count all values greater than a certain value, or perhaps remove all outliers that are above some threshold. In NumPy, Boolean masking is often the most efficient way to accomplish these types of tasks.
Example counting rainy days:
import numpy as np
import pandas as pd
# use pandas to extract rainfall inches as a NumPy array
rainfall = pd.read_csv('data/Seattle2014.csv')['PRCP'].values
inches = rainfall / 254.0 # 1/10mm -> inches
inches.shape
(365,)
The array contains 365 values, giving daily rainfall in inches from January 1 to December 31, 2014.
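Counting with a Boolean mask can be sketched as follows. Since the CSV file above may not be at hand, this example uses a small made-up array of seven daily values in place of the real rainfall data; the thresholds are likewise arbitrary.

```python
import numpy as np

# Hypothetical daily rainfall in inches (stand-in for the real data).
inches = np.array([0.0, 0.16, 0.59, 0.0, 0.02, 1.2, 0.0])

# A comparison produces a Boolean array (the mask).
rainy = inches > 0

# Summing a mask counts the True entries.
num_rainy = int(np.sum(rainy))
num_heavy = int(np.sum(inches > 0.5))

# Indexing with a mask keeps only the matching values.
median_rainy = np.median(inches[rainy])

# Masks combine with & (and), | (or), ~ (not); the parentheses are required.
light = inches[(inches > 0) & (inches < 0.2)]

print("Rainy days:", num_rainy)
print("Days with more than 0.5 inches:", num_heavy)
print("Median rainfall on rainy days:", median_rainy)
print("Light rain values:", light)
```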
Date and Time in Numpy
With the numpy.datetime64() function, we can store a date in a NumPy array in a particular format, i.e. year-month-day.
Syntax: numpy.datetime64(date)
Return: the date in the format 'yyyy-mm-dd'.

# import numpy
import numpy as np

# using numpy.datetime64()
abc = np.array(np.datetime64('2019-08-26'))
print(repr(abc))

Output:
array('2019-08-26', dtype='datetime64[D]')
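Because datetime64 values are stored as numbers with a unit, they also support ranges and arithmetic. The sketch below is a small illustration with arbitrarily chosen dates, not part of the original example.

```python
import numpy as np

# A range of consecutive days (the end date is excluded).
week = np.arange('2019-08-26', '2019-09-02', dtype='datetime64[D]')

# Subtracting two dates gives a timedelta64.
delta = np.datetime64('2019-09-01') - np.datetime64('2019-08-26')

# The same date stored at month resolution.
month = np.datetime64('2019-08-26', 'M')

print(week)
print("Difference:", delta)
print("Month:", month)
```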