4. NumPy
Core Python has no built-in multidimensional array type. Lists can be nested, and the standard-library array module offers only one-dimensional typed arrays.
NumPy is a Python library that handles multidimensional arrays with ease. It has a large collection of functions that make working with arrays easier.
It provides a multidimensional array object, and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more.
NumPy gives you an enormous range of fast and efficient ways of creating arrays and manipulating numerical data inside them. While a Python list can contain different data types within a single list, all of the elements in a NumPy array should be homogeneous. The mathematical operations that are meant to be performed on arrays would be extremely inefficient if the arrays weren’t homogeneous.
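As a small illustration of the homogeneity rule, the sketch below converts a mixed list into an array and checks the resulting data type. The variable names are made up for this example.

```python
import numpy as np

# A Python list can mix types freely.
mixed_list = [1, 2.5, 3]

# Converting it to an array forces one common dtype (here float64).
arr = np.array(mixed_list)
print(arr.dtype)  # float64

# A dtype can also be requested explicitly when the array is created.
ints = np.array([1, 2, 3], dtype=np.int32)

# Arithmetic is applied element-wise to the whole array in one call.
doubled = ints * 2
print(doubled)  # [2 4 6]
```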
NumPy arrays are generally faster and more compact than Python lists. Because every element has the same fixed-size type, an array stores its data in one contiguous block and uses less memory, and NumPy lets you specify the data type explicitly.
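A rough way to see the memory difference is to compare the bytes used by a list of integers with the data buffer of an equivalent array. This is only a sketch: it assumes CPython, where sys.getsizeof reports per-object sizes, and the exact list figure varies between Python versions.

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)

# Approximate list size: the list object plus every int object it refers to.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(v) for v in py_list)

# Array data buffer: n elements of 8 bytes each.
arr_bytes = np_arr.nbytes  # 8000

# Choosing a smaller dtype shrinks the buffer further.
small_bytes = np_arr.astype(np.int16).nbytes  # 2000

print(list_bytes, arr_bytes, small_bytes)
```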
The NumPy API is used extensively in Pandas, SciPy, Matplotlib, scikit-learn, scikit-image and most other data science and scientific Python packages.
4.1. Array
An array is a grid of values and it contains information about the raw data, how to locate an element, and how to interpret an element.
The rank of the array is the number of dimensions. The shape of the array is a tuple of integers giving the size of the array along each dimension.
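These properties can be read directly from an array through its ndim, shape, size and dtype attributes. The array below is a made-up example.

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

print(grid.ndim)   # 2, the number of dimensions (rank)
print(grid.shape)  # (2, 3), i.e. 2 rows and 3 columns
print(grid.size)   # 6, the total number of elements
print(grid.dtype)  # how each element is interpreted (an integer type)
print(grid[1, 2])  # 6, the element in row 1, column 2
```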
## Import the Numpy package
import numpy as np
## Create Array
x = np.array([2,4,6,8])
x
## Creating Zeroes
np.zeros(3)
## Creating ones
np.ones(5)
## Creating first 5 integers
np.arange(5)
## Creating integers over an interval with a given step
np.arange(1, 10, 3)
## Creating evenly spaced values over an interval
np.linspace(0, 5, num=3)
## Creating an identity matrix
np.eye(5)
## Sort the array
ar = np.array([7,9,2,6])
np.sort(ar)
## Merge arrays
x=np.array([2,4,6])
y=np.array([8,10,11])
np.concatenate((x,y))
## Reshaping an array
a = np.arange(4)
print(a)
b = a.reshape(2, 2)
print(b)
## Indexing and Slicing
data = np.array([1, 5, 9, 2])
data[1]
data[0:2]
data[1:]
data[::2]
## Reverse elements in an array
data[::-1]
data[2::-1]
## Select elements that satisfy a condition
a = np.array([[2, 4], [6, 8], [10, 12]])
print(a[a < 8])
greater = (a >= 6)
print(a[greater])
divisible_by_4 = a[a%4==0]
print(divisible_by_4)
# Named in_range so that the built-in range function is not shadowed
in_range = a[(a > 2) & (a < 12)]
print(in_range)
up = (a > 6) | (a == 6)
print(a[up])
## Stack arrays
a1 = np.array([[1, 2],
               [3, 4]])
a2 = np.array([[5, 6],
               [7, 8]])
np.vstack((a1, a2))
This is similar to the “rbind” function in R.
np.hstack((a1, a2))
This is similar to the “cbind” function in R.
## Split Arrays
x = np.arange(4).reshape((2, 2))
x
x1, x2 = np.vsplit(x, [1])
print(x1)
print(x2)
x1, x2 = np.hsplit(x, [1])
print(x1)
print(x2)
## Array functions
data = np.array([2, 4])
ones = np.ones(2, dtype=int)
data + ones
data - ones
data * data
data / 3
data//3
print("-data = ", -data)
## Power
print("data ** 2 = ", data ** 2)
## Modulus
print("data % 4 = ", data % 4)
## Summing an Array
a = np.array([2, 2, 2, 4])
a.sum()
## Summing along axis 0 (down the rows, giving one total per column)
b = np.array([[0, 1], [2, 3]])
b.sum(axis=0)
## Summing along axis 1 (across the columns, giving one total per row)
b.sum(axis=1)
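The axis argument names the dimension that is collapsed by the operation. The sketch below repeats the sums on a separate, made-up array m and also shows the keepdims option, which keeps the collapsed axis with length 1.

```python
import numpy as np

m = np.array([[0, 1],
              [2, 3]])

col_totals = m.sum(axis=0)  # collapses the rows: [0+2, 1+3]
row_totals = m.sum(axis=1)  # collapses the columns: [0+1, 2+3]

# keepdims=True keeps the collapsed axis, so the result stays 2-D.
row_totals_2d = m.sum(axis=1, keepdims=True)

print(col_totals)           # [2 4]
print(row_totals)           # [1 5]
print(row_totals_2d.shape)  # (2, 1)
```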
## Minimum
b.min()
## Maximum
b.max()
b.max(axis=0)
b.max(axis=1)
## Absolute Values
x = np.array([-2, -1, 0, 3, -4])
np.absolute(x)
## Aggregates
x = np.arange(1, 5)
np.add.reduce(x)
np.multiply.reduce(x)
np.add.accumulate(x)
np.multiply.accumulate(x)
np.multiply.outer(x, x)
## Random number generation
# default_rng() returns a Generator, the interface recommended for new code
rng = np.random.default_rng()
rng.random(3)
## Pulling unique values
a = np.array([1, 1, 2, 3, 4, 5, 2, 3, 1, 4, 8, 9])
np.unique(a)
np.unique(a, return_index=True)
np.unique(a, return_counts=True)
## Transpose of a matrix
b
np.transpose(b)
## Flip array
x=np.arange(6)
x
np.flip(x)
y=np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
np.flip(y)
np.flip(y, axis=0)
np.flip(y, axis=1)
y[1]=np.flip(y[1])
print(y)
y[:,1] = np.flip(y[:,1])
print(y)
## Flattening a multidimensional array
y=np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
y.flatten()
y1 = y.flatten()
y1[0] = 99
print(y)
print(y1)
y1 = y.ravel()
y1[0] = 99
print(y)
print(y1)
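The two print results differ because flatten always returns a copy, while ravel returns a view of the original data whenever it can. A minimal sketch with a separate, made-up array m, using np.shares_memory to check:

```python
import numpy as np

m = np.array([[1, 2], [3, 4]])

flat_copy = m.flatten()  # always a new, independent array
flat_view = m.ravel()    # a view here, because m is contiguous

print(np.shares_memory(m, flat_copy))  # False
print(np.shares_memory(m, flat_view))  # True

flat_copy[0] = 99  # m is unchanged
flat_view[1] = 77  # m[0, 1] becomes 77 as well
print(m)
```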
## Save and Load
np.save('data', y1)
np.load('data.npy')
#Deleting the created file
import os
os.remove('data.npy')
## Save as csv
np.savetxt('new_data.csv', y1)
np.loadtxt('new_data.csv')
#Deleting the created file
os.remove('new_data.csv')
## Copy array
y2=y1.copy()
y2
## Dot product
a = 2
b = 6
np.dot(a,b)
A = np.array([1, 2, 3, 4])
B = np.array([5, 6, 7, 8])
np.dot(A, B)
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
np.dot(A, B)
## Cross product
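# For two-element vectors the result is the scalar z-component: 1*4 - 2*3 = -2
# (NumPy 2.0 and later emit a DeprecationWarning for two-element inputs)
A = np.array([1, 2])
B = np.array([3, 4])
np.cross(A, B)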
A = np.array([1, 2, 3])
B = np.array([4, 5, 6])
np.cross(A, B)
## Square root
A = [4, 9, 16, 1, 25]
np.sqrt(A)
x = [4+1j, 9+16j]
np.sqrt(x)
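# A negative real input gives nan (with a RuntimeWarning); pass a complex value such as -4+0j to get 2j
y = [-4, 9]
np.sqrt(y)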
## Average
a = np.array([1, 2, 3, 4]).reshape(2,2)
np.average(a)
np.average(a, axis=0)
np.average(a, axis=1)
The “np.mean” function gives the same result when no weights are supplied.
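Where np.average differs from np.mean is its optional weights argument. The values and weights below are made up for illustration.

```python
import numpy as np

scores = np.array([1, 2, 3, 4])

plain = np.average(scores)  # same as np.mean(scores): 2.5

# Weighted average: (1*1 + 2*1 + 3*1 + 4*3) / (1 + 1 + 1 + 3) = 18 / 6
weighted = np.average(scores, weights=[1, 1, 1, 3])

print(plain, weighted)  # 2.5 3.0
```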
## Mean
np.mean(a)
## Standard Deviation
np.std(a)
np.std(a,axis=1)
## Percentiles and median
np.percentile(a, 25)
np.median(a)
np.percentile(a, 75)
## Converting from array to list
a.tolist()
## Converting from list to array
y = [1, 2, 3, 4, 5]
np.array(y)
## Check whether any element in an array is True
ar=np.array([[True,True],[False,False]])
np.any(ar)
## Check whether all elements in an array are True
ar=np.array([[True,True],[True,True]])
np.all(ar)
ar = np.array([[True,True],[False,False]])
np.all(ar)
ar = np.array([[True,True], [True,False], [True,False]])
np.all(ar, axis=1)
## Trigonometric functions
theta = np.linspace(1, np.pi, 2)
print("theta = ", theta)
print("sin(theta) = ", np.sin(theta))
print("cos(theta) = ", np.cos(theta))
print("tan(theta) = ", np.tan(theta))
## Inverse trigonometric functions
x=[-1,0]
print("arcsin(x) = ", np.arcsin(x))
print("arccos(x) = ", np.arccos(x))
print("arctan(x) = ", np.arctan(x))
## Exponentials
x = [1, 3]
print("e^x =", np.exp(x))
print("2^x =", np.exp2(x))
print("4^x =", np.power(4, x))
## Logarithms
x = [1, 2, 3]
print("ln(x) =", np.log(x))
print("log2(x) =", np.log2(x))
print("log10(x) =", np.log10(x))
## More precision for small inputs
x = [0.001, 0.01]
print("exp(x) - 1 =", np.expm1(x))
print("log(1 + x) =", np.log1p(x))