
Numerical Python (NumPy) is used for performing numerical computations in Python. Calculations on NumPy arrays are typically faster than the same calculations on ordinary Python lists. Further, pandas is built on top of NumPy arrays, so a better understanding of NumPy helps us use pandas more effectively.
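
To make the speed claim concrete, here is a rough sketch that doubles 100,000 values once with a list comprehension and once with a NumPy array. The array size and repeat count are arbitrary choices for illustration, and the timings depend on your machine, so treat the printed numbers as indicative only.

```python
import timeit

import numpy as np

n = 100_000  # arbitrary size chosen for this illustration
py_list = list(range(n))
np_arr = np.arange(n)

# The same computation done two ways: double every value.
list_result = [x * 2 for x in py_list]
arr_result = np_arr * 2

# Time each approach over 20 runs; results vary from machine to machine.
t_list = timeit.timeit(lambda: [x * 2 for x in py_list], number=20)
t_arr = timeit.timeit(lambda: np_arr * 2, number=20)
print(f"list: {t_list:.4f}s, ndarray: {t_arr:.4f}s")
```

Both approaches produce the same values; the array version usually wins because the loop runs in compiled code rather than in the Python interpreter.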

What is the NumPy Library?

NumPy, which stands for Numerical Python, is a powerful library in Python used for scientific computing and working with large, multi-dimensional arrays and matrices. It provides a collection of high-performance mathematical functions and tools that enable efficient manipulation, computation, and analysis of numerical data.

Also Check Out: Basics Of Pandas

Creating Arrays

import numpy as np
# this is a 1-D array
d = np.array([1, 2, 3])
d
array([1, 2, 3])
# multi dimensional array
nd = np.array([[1, 2, 3], [3, 4, 5], [10, 11, 12]])
nd
array([[ 1,  2,  3],
       [ 3,  4,  5],
       [10, 11, 12]])
nd.shape # shape of array
(3, 3)
nd.dtype # data type (int32 or int64, depending on platform and NumPy version)
dtype('int32')
# define zero matrix
np.zeros(3)
array([0., 0., 0.])
np.zeros([3, 2])
array([[0., 0.],
       [0., 0.],
       [0., 0.]])
# identity matrix (ones on the main diagonal)
e = np.eye(3)
e
array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])
# add 2 to e
e2 = e + 2
e2
array([[3., 2., 2.],
       [2., 3., 2.],
       [2., 2., 3.]])
# create a matrix of ones with the same shape as e2
o = np.ones_like(e2)
o
array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])
# convert a list of strings to floats
a = ['1', '2', '3']
a_arr = np.array(a) # convert list to ndarray of strings
af = a_arr.astype(float) # change ndarray type
af
array([1., 2., 3.])
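
Since the default integer type can differ between platforms, it can help to request a data type explicitly when creating an array. The sketch below is only an illustration; the variable names are made up for this example.

```python
import numpy as np

# Ask for a specific dtype instead of relying on the platform default.
ints = np.array([1, 2, 3], dtype=np.int64)
floats = np.array([1, 2, 3], dtype=np.float32)

# np.full creates an array of a given shape filled with one value.
filled = np.full((2, 3), 7)

# zeros_like copies the shape and dtype of an existing array.
zeros_like_ints = np.zeros_like(ints)

print(ints.dtype, floats.dtype, filled.shape, zeros_like_ints.dtype)
```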

Boolean Indexing

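Boolean indexing in NumPy allows you to filter and select elements from an array based on a Boolean condition. It uses a Boolean array (a mask) to specify which elements should be selected. The mask either has the same shape as the original array, in which case individual elements are selected, or has the same length as the first axis, in which case whole rows are selected.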

# accessing data with boolean indexing
data = np.random.randn(5, 3)
data
array([[-0.71813112,  0.81967389,  1.71596859],
       [-0.84896125,  0.80862737,  0.8190463 ],
       [ 0.45797312, -0.75678082, -0.6593751 ],
       [-1.01204652,  0.37236452, -0.77734252],
       [-1.22889579, -0.77685367, -1.83034324]])

np.random.randn(5, 3) generates a NumPy array of shape (5, 3), stored here as data, filled with random numbers drawn from a standard normal distribution (mean 0 and standard deviation 1). randn is a function in NumPy's random module. Because the values are random, your output will differ from the one shown here.

name = np.array(['a', 'b', 'c', 'a', 'b'])
name=='a'
array([ True, False, False,  True, False])
data[name=='a']
array([[-0.71813112,  0.81967389,  1.71596859],
       [-1.01204652,  0.37236452, -0.77734252]])
data[name != 'a']
array([[-0.84896125,  0.80862737,  0.8190463 ],
       [ 0.45797312, -0.75678082, -0.6593751 ],
       [-1.22889579, -0.77685367, -1.83034324]])
data[ (data > 1) & (data < 2) ]
array([1.71596859])
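
Masks can also be combined with | (or) and inverted with ~ (not), and a mask can be used on the left-hand side of an assignment. The following is a small sketch using fixed integer data instead of random numbers, so the results are reproducible; the names mask, selected, rest and clipped are chosen just for this example.

```python
import numpy as np

data = np.arange(15).reshape(5, 3)  # rows: [0 1 2], [3 4 5], ..., [12 13 14]
name = np.array(['a', 'b', 'c', 'a', 'b'])

# Combine two conditions with | ; the parentheses are required.
mask = (name == 'a') | (name == 'c')
selected = data[mask]   # rows 0, 2 and 3

# Invert the mask with ~ to get the remaining rows.
rest = data[~mask]      # rows 1 and 4

# Assign through a mask: cap every value above 10 at 10.
clipped = data.copy()
clipped[clipped > 10] = 10
```

Note that Python's and / or keywords do not work element-wise on arrays, which is why & and | are used.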

Reshaping Arrays

Reshaping arrays in NumPy refers to changing the shape or dimensions of an array while maintaining the same total number of elements. It allows you to reorganize the data within an array without altering the underlying values. NumPy provides the reshape() function to perform array reshaping.

a = np.arange(0, 20)
a
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19])

arange(): a NumPy function that creates an array of regularly spaced values. It takes the start value (inclusive), the end value (exclusive) and an optional step (default 1), and generates the sequence of numbers in between.
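
As a quick illustration of the step argument, the sketch below builds a few sequences; the variable names are arbitrary.

```python
import numpy as np

evens = np.arange(0, 10, 2)      # start 0, stop before 10, step 2
countdown = np.arange(5, 0, -1)  # a negative step counts downwards
halves = np.arange(0, 2, 0.5)    # the step may also be a float

print(evens)      # [0 2 4 6 8]
print(countdown)  # [5 4 3 2 1]
print(halves)     # [0.  0.5 1.  1.5]
```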

# reshape array a
a45 = a.reshape(4, 5)
a45
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

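Make sure the new shape accounts for exactly the same number of elements as the original array. Here 4 × 5 = 20 matches the 20 values in a; a shape that does not match raises a ValueError.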
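
A minimal sketch of both cases: passing -1 lets NumPy work out one dimension for you, while a shape with the wrong number of elements fails. The flag error_raised is just a helper for this example.

```python
import numpy as np

a = np.arange(20)

# -1 asks NumPy to infer that dimension: 20 elements / 4 rows = 5 columns.
a45 = a.reshape(4, -1)
print(a45.shape)  # (4, 5)

# 3 * 7 = 21 does not equal 20, so this reshape is rejected.
error_raised = False
try:
    a.reshape(3, 7)
except ValueError as err:
    error_raised = True
    print("reshape failed:", err)
```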

# select rows 2, 0 and 1 (in that order) from a45 and store them in b
b = a45[ [2, 0, 1] ]
b
array([[10, 11, 12, 13, 14],
       [ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9]])
# transpose array b
b.T
array([[10,  0,  5],
       [11,  1,  6],
       [12,  2,  7],
       [13,  3,  8],
       [14,  4,  9]])
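
One detail worth knowing when mixing these operations: reshaping a contiguous array like a gives a view that shares memory with the original, whereas selecting rows with a list of indices gives an independent copy. The sketch below checks this with np.shares_memory.

```python
import numpy as np

a = np.arange(20)
a45 = a.reshape(4, 5)   # a view onto the same data as a
b = a45[[2, 0, 1]]      # index-list selection makes a copy

print(np.shares_memory(a, a45))  # True
print(np.shares_memory(a45, b))  # False

b[0, 0] = -1     # changes only the copy
a45[0, 0] = 99   # also changes a, because a45 is a view of a

print(a[0], a45[2, 0], b[0, 0])  # 99 10 -1
print(b.T.shape)                 # (5, 3)
```

If you need an independent reshaped array, call .copy() on the result.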

Concatenating the Data

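We can combine the data of two arrays using the concatenate function. By default the arrays are joined along axis 0 (one below the other); with axis=1 they are joined side by side. Because arr holds integers and rn holds floats, the result is converted to float, which is why the integers appear in scientific notation in the output.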

arr = np.arange(12).reshape(3,4)
rn = np.random.randn(3, 4)
arr
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
rn
array([[-0.29285443, -0.00584242, -0.86937831,  0.08750389],
       [ 2.07061198, -1.54648838,  0.1185279 ,  0.23029147],
       [-0.47632604, -1.08762118, -0.32017944,  1.13810467]])
# merge data of rn below the arr
np.concatenate([arr, rn])
array([[ 0.00000000e+00,  1.00000000e+00,  2.00000000e+00,
         3.00000000e+00],
       [ 4.00000000e+00,  5.00000000e+00,  6.00000000e+00,
         7.00000000e+00],
       [ 8.00000000e+00,  9.00000000e+00,  1.00000000e+01,
         1.10000000e+01],
       [-2.92854435e-01, -5.84241659e-03, -8.69378309e-01,
         8.75038920e-02],
       [ 2.07061198e+00, -1.54648838e+00,  1.18527899e-01,
         2.30291468e-01],
       [-4.76326042e-01, -1.08762118e+00, -3.20179438e-01,
         1.13810467e+00]])
# merge data of rn on the right side of arr
np.concatenate([arr, rn], axis=1)
array([[ 0.00000000e+00,  1.00000000e+00,  2.00000000e+00,
         3.00000000e+00, -2.92854435e-01, -5.84241659e-03,
        -8.69378309e-01,  8.75038920e-02],
       [ 4.00000000e+00,  5.00000000e+00,  6.00000000e+00,
         7.00000000e+00,  2.07061198e+00, -1.54648838e+00,
         1.18527899e-01,  2.30291468e-01],
       [ 8.00000000e+00,  9.00000000e+00,  1.00000000e+01,
         1.10000000e+01, -4.76326042e-01, -1.08762118e+00,
        -3.20179438e-01,  1.13810467e+00]])
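
The sketch below looks at the shapes and data type of the concatenated results, and shows that np.vstack and np.hstack give the same results for 2-D arrays. A seeded generator (np.random.default_rng(0)) is used here instead of randn purely so the example is reproducible; that choice is an assumption of this sketch, not part of the code above.

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)
rn = np.random.default_rng(0).standard_normal((3, 4))

below = np.concatenate([arr, rn])            # joined along axis 0
beside = np.concatenate([arr, rn], axis=1)   # joined along axis 1

# Mixing int and float inputs upcasts the result to float64.
print(below.shape, beside.shape, below.dtype)  # (6, 4) (3, 8) float64

# vstack / hstack are shorthand for the two cases above.
same_rows = np.array_equal(below, np.vstack([arr, rn]))
same_cols = np.array_equal(beside, np.hstack([arr, rn]))
print(same_rows, same_cols)  # True True
```

The arrays must agree in every dimension except the one being joined; here both are (3, 4), so either axis works.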

Conclusion

NumPy is a fundamental library in Python for numerical computing and data manipulation. It provides powerful tools and functionalities for working with multi-dimensional arrays, performing mathematical operations efficiently, and enabling advanced data analysis and computation.

NumPy is widely used in various domains, including scientific research, data analysis, machine learning, and computational modeling. Its efficient array operations, extensive mathematical functions, and integration with other libraries make it a valuable tool for numerical computing tasks. Whether it’s performing complex mathematical calculations or handling large datasets, NumPy provides the foundation for efficient and scalable data manipulation and analysis in Python.

By Akshay Tekam

Software developer, data science enthusiast, content creator.
