4. NumPy

Core Python offers lists, but it has no built-in multidimensional array type.

NumPy is a Python library that handles multidimensional arrays with ease. It has a large collection of functions that make working with arrays easy.

It provides a multidimensional array object, and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more.

NumPy gives you an enormous range of fast and efficient ways of creating arrays and manipulating numerical data inside them. While a Python list can contain different data types within a single list, all of the elements in a NumPy array share a single data type. The mathematical operations that are meant to be performed on arrays would be extremely inefficient if the arrays weren’t homogeneous.
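As a small illustration of the homogeneity rule, the sketch below (the variable names are only illustrative) converts a list that mixes integers and a float, and shows that NumPy stores every element with one common data type.

```python
import numpy as np

# A Python list can hold an int next to a float.
mixed_list = [1, 2.5, 3]

# Converting to an array upcasts every element to one common dtype.
mixed_array = np.array(mixed_list)

print(mixed_array)        # every value is now stored as a float
print(mixed_array.dtype)  # float64
```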

NumPy arrays are faster and more compact than Python lists. An array uses much less memory to store the same data, and NumPy provides a mechanism for specifying the data type of its elements.
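To make the memory point concrete, here is a minimal sketch that stores the same number of elements with two different data types and compares the bytes used. The array names are illustrative; `itemsize` and `nbytes` are standard array attributes.

```python
import numpy as np

# The same 1000 zeros, stored with two different element types.
wide = np.zeros(1000, dtype=np.float64)  # 8 bytes per element
narrow = np.zeros(1000, dtype=np.int8)   # 1 byte per element

print(wide.itemsize, wide.nbytes)      # 8 8000
print(narrow.itemsize, narrow.nbytes)  # 1 1000
```

Choosing a smaller data type saves memory, but it also narrows the range of values each element can hold.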

The NumPy API is used extensively in Pandas, SciPy, Matplotlib, scikit-learn, scikit-image and most other data science and scientific Python packages.

4.1. Array

An array is a grid of values and it contains information about the raw data, how to locate an element, and how to interpret an element.

The rank of the array is the number of dimensions. The shape of the array is a tuple of integers giving the size of the array along each dimension.
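The rank and shape described above can be read directly from an array. The following sketch uses an illustrative 2 x 3 array named `grid`.

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

print(grid.ndim)   # 2      -> number of dimensions (the rank)
print(grid.shape)  # (2, 3) -> 2 rows, 3 columns
print(grid.size)   # 6      -> total number of elements
```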

## Import the NumPy package

import numpy as np

## Create Array

x = np.array([2,4,6,8])
x

## Creating an array of zeros

np.zeros(3)

## Creating an array of ones

np.ones(5)

## Creating the first 5 integers (0 to 4)

np.arange(5)

## Creating integers in an interval with a given spacing

np.arange(1, 10, 3)

## Creating an array of evenly spaced values (linear space)

np.linspace(0, 5, num=3)

## Creating an identity matrix

np.eye(5)

## Sort the array

ar = np.array([7,9,2,6])
np.sort(ar)

## Merge arrays

x=np.array([2,4,6])
y=np.array([8,10,11])
np.concatenate((x,y))

## Reshaping an array

a = np.arange(4)
print(a)

b = a.reshape(2, 2)
print(b)

## Indexing and Slicing

data = np.array([1, 5, 9, 2])
data[1]

data[0:2]

data[1:]

data[::2]

## Reverse elements in an array

data[::-1] 

data[2::-1]

## Boolean indexing (selecting elements by condition)

a = np.array([[2, 4], [6, 8], [10, 12]])
print(a[a < 8])

greater = (a >= 6)
print(a[greater])

divisible_by_4 = a[a%4==0]
print(divisible_by_4)

# Avoid the name "range": it would shadow Python's built-in range()
in_range = a[(a > 2) & (a < 12)]
print(in_range)

up = (a > 6) | (a == 6)
print(a[up])
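The conditions used for selection are themselves arrays of booleans. The sketch below, with illustrative names, builds such a mask, counts how many elements satisfy it, and then uses it as an index. Note that conditions are combined with `&` and `|` rather than Python's `and` and `or`, because the comparison has to be applied element by element.

```python
import numpy as np

values = np.array([[2, 4], [6, 8], [10, 12]])

# Each comparison yields a boolean array with the same shape as values.
mask = (values > 2) & (values < 12)

print(mask.dtype)    # bool
print(mask.sum())    # 4 -> True counts as 1, so this counts the matches
print(values[mask])  # [ 4  6  8 10]
```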

## Stack arrays

a1 = np.array([[1, 2],
               [3, 4]])

a2 = np.array([[5, 6],
               [7, 8]])

np.vstack((a1, a2))

Similar to the “rbind” function in R.

np.hstack((a1, a2))

Similar to the “cbind” function in R.
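For 2-D arrays such as these, both kinds of stacking can also be written with `np.concatenate` and an explicit `axis`. This is a minimal sketch with illustrative names, not a replacement for the calls above.

```python
import numpy as np

top = np.array([[1, 2],
                [3, 4]])
bottom = np.array([[5, 6],
                   [7, 8]])

# axis=0 appends rows (like vstack); axis=1 appends columns (like hstack).
by_rows = np.concatenate((top, bottom), axis=0)
by_cols = np.concatenate((top, bottom), axis=1)

print(by_rows.shape)  # (4, 2)
print(by_cols.shape)  # (2, 4)
```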

## Split Arrays

x = np.arange(4).reshape((2, 2))
x

x1, x2 = np.vsplit(x, [1])
print(x1)
print(x2)

x1, x2 = np.hsplit(x, [1])
print(x1)
print(x2)

## Array functions

data = np.array([2, 4])
ones = np.ones(2, dtype=int)
data + ones

data - ones

data * data

data / 3

data//3

print("-data = ", -data)

## Power

print("data ** 2 = ", data ** 2)

## Modulus

print("data % 4 = ", data % 4)

## Summing an Array

a = np.array([2, 2, 2, 4])
a.sum()

## Summing along axis 0 (down the rows, giving one total per column)

b = np.array([[0, 1], [2, 3]])
b.sum(axis=0)

## Summing along axis 1 (across the columns, giving one total per row)

b.sum(axis=1)
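The meaning of `axis` is easier to see on a non-square array, because the result shapes differ. The sketch below uses an illustrative 2 x 3 array: the axis passed to `sum` is the one that gets collapsed.

```python
import numpy as np

table = np.array([[1, 2, 3],
                  [4, 5, 6]])

col_totals = table.sum(axis=0)  # collapses the 2 rows -> one value per column
row_totals = table.sum(axis=1)  # collapses the 3 columns -> one value per row

print(col_totals)  # [5 7 9]
print(row_totals)  # [ 6 15]
```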

## Minimum 

b.min()

## Maximum

b.max()

b.max(axis=0)

b.max(axis=1)

## Absolute Values

x = np.array([-2, -1, 0, 3, -4])
np.absolute(x)

## Aggregates

x = np.arange(1, 5)
np.add.reduce(x)

np.multiply.reduce(x)

np.add.accumulate(x)

np.multiply.accumulate(x)

np.multiply.outer(x, x)

## Random number generation

# default_rng() creates a Generator, the interface NumPy recommends for new code
rng = np.random.default_rng()
rng.random(3)
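Random draws differ on every run unless a seed is fixed. As a sketch, assuming reproducibility is wanted, two generators created with the same (arbitrary) seed produce identical numbers.

```python
import numpy as np

# Two independent generators started from the same seed.
rng_a = np.random.default_rng(seed=42)
rng_b = np.random.default_rng(seed=42)

draws_a = rng_a.random(3)
draws_b = rng_b.random(3)

print(np.array_equal(draws_a, draws_b))  # True
```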

## Pulling unique values

a = np.array([1, 1, 2, 3, 4, 5, 2, 3, 1, 4, 8, 9])
np.unique(a)

np.unique(a, return_index=True)

np.unique(a, return_counts=True)

## Transpose of a matrix
b

np.transpose(b)

## Flip array

x=np.arange(6)
x

np.flip(x)

y=np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
np.flip(y)

np.flip(y, axis=0)

np.flip(y, axis=1)

y[1]=np.flip(y[1])
print(y)

y[:,1] = np.flip(y[:,1])
print(y)

## Flattening a multidimensional array

y=np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
y.flatten()

# flatten() returns a copy, so changing y1 leaves y untouched
y1 = y.flatten()
y1[0] = 99
print(y)

print(y1)

# ravel() returns a view when possible, so changing y1 also changes y
y1 = y.ravel()
y1[0] = 99
print(y)

print(y1)
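The difference between `flatten` and `ravel` seen above comes down to whether the result shares memory with the original. A minimal way to check this, using illustrative names, is `np.shares_memory`.

```python
import numpy as np

matrix = np.array([[1, 2],
                   [3, 4]])

copied = matrix.flatten()  # always a new, independent array
viewed = matrix.ravel()    # a view here, because matrix is contiguous

print(np.shares_memory(matrix, copied))  # False
print(np.shares_memory(matrix, viewed))  # True

viewed[0] = 99
print(matrix[0, 0])  # 99 -> the change is visible through the original
```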

## Save and Load

np.save('data', y1)

np.load('data.npy')

## Delete the created file

import os

os.remove('data.npy')

## Save as CSV

np.savetxt('new_data.csv', y1)

np.loadtxt('new_data.csv')
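By default `np.savetxt` separates the values in a row with a space. For a 2-D array that should be written as comma-separated values, a `delimiter` can be passed to both the save and load calls. The sketch below writes to a temporary directory so that no file is left behind; the file name is illustrative.

```python
import os
import tempfile

import numpy as np

table = np.array([[1.0, 2.0],
                  [3.0, 4.0]])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "table.csv")
    np.savetxt(path, table, delimiter=",")    # one row per line, commas between values
    loaded = np.loadtxt(path, delimiter=",")  # read it back with the same delimiter

print(np.array_equal(table, loaded))  # True
```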

## Delete the created file

os.remove('new_data.csv')

## Copy array

y2=y1.copy()
y2

## Dot product

a = 2
b = 6
np.dot(a,b)

A = np.array([1, 2, 3, 4])
B = np.array([5, 6, 7, 8])
np.dot(A, B)

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
np.dot(A, B)
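For 2-D arrays like the pair above, `np.dot` performs matrix multiplication, which can also be written with the `@` operator. This sketch uses illustrative names and the same values.

```python
import numpy as np

left = np.array([[1, 2], [3, 4]])
right = np.array([[5, 6], [7, 8]])

product = left @ right  # matrix product, same result as np.dot(left, right)

print(product)
# [[19 22]
#  [43 50]]
```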

## Cross product

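# For 2-element vectors the result is a scalar (the z-component).
# Note: NumPy 2.0 deprecates 2-element inputs to np.cross; prefer 3-element vectors.
A = np.array([1, 2])
B = np.array([3, 4])
np.cross(A, B)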

A = np.array([1, 2, 3])
B = np.array([4, 5, 6])
np.cross(A, B)

## Square root

A = [4, 9, 16, 1, 25]
np.sqrt(A)

x = [4+1j, 9+16j]
np.sqrt(x)

# Negative real inputs give nan (with a RuntimeWarning)
y = [-4, 9]
np.sqrt(y)
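When the square root of a negative number is actually wanted, one option is to give the input a complex data type so that `np.sqrt` returns complex results. A minimal sketch with illustrative names:

```python
import numpy as np

values = np.array([-4, 9], dtype=complex)
roots = np.sqrt(values)

print(roots)  # [0.+2.j 3.+0.j]
```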

## Average

a = np.array([1, 2, 3, 4]).reshape(2,2)
np.average(a)

np.average(a, axis=0)

np.average(a, axis=1)

The same result can be obtained with the “np.mean” function.
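The two functions differ in one respect: `np.average` accepts an optional `weights` argument, while `np.mean` always weights every element equally. The values and weights below are illustrative.

```python
import numpy as np

scores = np.array([1, 2, 3, 4])
weights = np.array([4, 3, 2, 1])

print(np.mean(scores))                      # 2.5
print(np.average(scores))                   # 2.5 (no weights: same as the mean)
print(np.average(scores, weights=weights))  # 2.0 = (1*4 + 2*3 + 3*2 + 4*1) / 10
```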

## Mean

np.mean(a)

## Standard Deviation

np.std(a)

np.std(a,axis=1)

## Percentiles and median

np.percentile(a, 25)

np.median(a)

np.percentile(a, 75)
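The median is the 50th percentile, and `np.percentile` also accepts a list of percentiles so that several can be computed in one call. A short sketch with illustrative values, using the default linear interpolation:

```python
import numpy as np

values = np.array([1, 2, 3, 4])

quartiles = np.percentile(values, [25, 50, 75])

print(quartiles)          # [1.75 2.5  3.25]
print(np.median(values))  # 2.5, the same as the 50th percentile
```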

## Converting from array to list

a.tolist()

## Converting from list to array

y = [1, 2, 3, 4, 5]
np.array(y)

## Check whether any element in an array is true

ar=np.array([[True,True],[False,False]])
np.any(ar)

## Check whether all elements in an array are true

ar=np.array([[True,True],[True,True]])
np.all(ar)

ar = np.array([[True,True],[False,False]])
np.all(ar)

ar = np.array([[True,True], [True,False], [True,False]])
np.all(ar, axis=1)

## Trigonometric functions
    
theta = np.linspace(1, np.pi, 2)
print("theta = ", theta)
print("sin(theta) = ", np.sin(theta))
print("cos(theta) = ", np.cos(theta))
print("tan(theta) = ", np.tan(theta))

## Inverse trigonometric functions

x=[-1,0]
print("arcsin(x) = ", np.arcsin(x))
print("arccos(x) = ", np.arccos(x))
print("arctan(x) = ", np.arctan(x))

## Exponentials

x = [1, 3]
print("e^x =", np.exp(x))
print("2^x =", np.exp2(x))
print("4^x =", np.power(4, x))

## Logarithms

x = [1, 2, 3]
print("ln(x) =", np.log(x))
print("log2(x) =", np.log2(x))
print("log10(x) =", np.log10(x))

## More precision for small inputs

x = [0.001, 0.01]
print("exp(x) - 1 =", np.expm1(x))
print("log(1 + x) =", np.log1p(x))
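The benefit of `np.expm1` shows up for inputs much smaller than those above. In the sketch below (the input value is chosen only for illustration), computing `exp(x) - 1` directly loses digits to rounding, while `np.expm1` stays close to the true value, which is approximately `x` itself for tiny `x`.

```python
import numpy as np

tiny = 1e-10

naive = np.exp(tiny) - 1   # subtracting two nearly equal numbers loses precision
accurate = np.expm1(tiny)  # computed without the intermediate rounding

print(naive)
print(accurate)

# For tiny x the true value is close to x, so compare each result against it.
print(abs(naive - tiny) > abs(accurate - tiny))  # True
```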
