NumPy is a fundamental package for scientific computing in Python. It provides support for arrays, matrices, and many mathematical functions to operate on these data structures efficiently.
Use np.array(), np.zeros(), np.ones(), np.empty(), and np.arange() to create arrays with specific values or shapes.
Broadcasting is a technique that allows NumPy to perform operations on arrays of different shapes by automatically expanding the smaller array to match the larger one.
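A minimal sketch of broadcasting, using made-up arrays a and b chosen only for illustration:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)   # shape (2, 3): [[0, 1, 2], [3, 4, 5]]
b = np.array([10, 20, 30])       # shape (3,)

# b is treated as shape (1, 3) and stretched across both rows of a
c = a + b
print(c.shape)  # (2, 3)
```

No data is physically copied for the stretch; the shapes only need to be compatible (equal, or one of them 1) when compared from the trailing axis backwards.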
Use operators like +, -, *, /, and functions like np.add(), np.subtract(), np.multiply(), and np.divide() for element-wise operations.
Use square brackets [] with indices and slices to access or modify elements of the array. For multi-dimensional arrays, use commas to separate dimensions.
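As a small illustration of indexing and slicing (the array m and the variable names are invented for this sketch):

```python
import numpy as np

m = np.arange(12).reshape(3, 4)   # rows: [0..3], [4..7], [8..11]

elem = m[1, 2]       # single element: row 1, column 2
row = m[0]           # first row
block = m[:2, 1:3]   # rows 0-1, columns 1-2

m[2, :] = 0          # overwrite the whole last row in place
```

Basic slices such as block are views on m, so writing through them would change m as well.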
Use the reshape() method to change the shape of an array without changing its data. Ensure the new shape is compatible with the original data size.
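A short sketch of reshape(), including what happens when the sizes do not match (the variable names are illustrative):

```python
import numpy as np

a = np.arange(6)          # 6 elements
b = a.reshape(2, 3)       # 2 * 3 == 6, so this works
c = a.reshape(3, -1)      # -1 asks NumPy to infer the remaining dimension

reshape_failed = False
try:
    a.reshape(4, 2)       # 4 * 2 == 8 != 6
except ValueError:
    reshape_failed = True
```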
Use np.vstack(), np.hstack(), and np.concatenate() to stack arrays vertically, horizontally, or along a specified axis.
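The three functions can be compared on two small 1-D arrays (a and b are placeholders for this sketch):

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

v = np.vstack((a, b))    # stacked as rows
h = np.hstack((a, b))    # joined end to end
# concatenate needs the axis to exist already, so make both inputs 2-D first
c = np.concatenate((a.reshape(1, 3), b.reshape(1, 3)), axis=0)
```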
Use np.nan to represent missing values and functions like np.isnan() to identify them. You can use np.nanmean(), np.nanstd(), etc., to perform operations while ignoring NaNs.
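A minimal example with one missing value shows the difference between the plain and NaN-aware functions (the data array is made up):

```python
import numpy as np

data = np.array([1.0, np.nan, 3.0])

mask = np.isnan(data)           # True where a value is missing
plain_mean = np.mean(data)      # NaN propagates through the plain mean
clean_mean = np.nanmean(data)   # NaN entries are skipped
```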
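np.dot() computes the dot product of two arrays, while np.matmul() (also available as the @ operator) performs matrix multiplication. For 1-D and 2-D arrays they give the same result. They differ for higher-dimensional arrays: np.matmul() treats the leading axes as a batch of matrices and broadcasts over them, whereas np.dot() sums over the last axis of the first array and the second-to-last axis of the second. np.matmul() also does not accept scalar operands.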
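The difference on higher-dimensional inputs is easiest to see from the result shapes. A sketch with arbitrary all-ones arrays:

```python
import numpy as np

a = np.ones((2, 3, 4))
b = np.ones((2, 4, 5))

# matmul treats the leading axis as a batch of 2 separate matrix products
mm = np.matmul(a, b)
# dot contracts the last axis of a with the second-to-last axis of b
dd = np.dot(a, b)

print(mm.shape)  # (2, 3, 5)
print(dd.shape)  # (2, 3, 2, 5)
```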
Use np.mean(), np.median(), and np.std() to compute these statistical measures.
Use functions from np.random, such as np.random.rand(), np.random.randn(), and np.random.randint() to generate arrays with random values.
np.apply_along_axis() applies a function along a specified axis of a NumPy array.
Use np.unique() with the return_counts=True argument to get unique elements and their counts.
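For example, with a small made-up array:

```python
import numpy as np

values = np.array([3, 1, 2, 3, 1, 3])

# uniq is sorted; counts[i] is how often uniq[i] occurs in values
uniq, counts = np.unique(values, return_counts=True)
```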
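np.copy() (or the ndarray.copy() method) creates a full copy of the array, while the ndarray.view() method creates a new view of the same data without copying it. There is no np.view() function; call view() on the array itself.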
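A short sketch of how a copy and a view react when the original array changes (variable names are illustrative):

```python
import numpy as np

original = np.array([1, 2, 3])
copied = original.copy()   # independent data buffer
viewed = original.view()   # new array object, same data buffer

original[0] = 99           # only the view sees this change

shares = np.shares_memory(original, viewed)
```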
Use np.linalg.inv() for matrix inversion and np.linalg.eig() for eigenvalue decomposition.
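A minimal sketch on a diagonal matrix, chosen here only because its inverse and eigenvalues are easy to check by hand:

```python
import numpy as np

A = np.array([[2.0, 0.0],
              [0.0, 4.0]])

A_inv = np.linalg.inv(A)

# Columns of eigenvectors correspond to the entries of eigenvalues
eigenvalues, eigenvectors = np.linalg.eig(A)

# Sanity checks: A @ A_inv is the identity, and A v = lambda v for each pair
identity_ok = np.allclose(A @ A_inv, np.eye(2))
eig_ok = np.allclose(A @ eigenvectors, eigenvectors * eigenvalues)
```

np.linalg.inv() raises LinAlgError for a singular matrix, so real code may want to guard against that case.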
Use np.cumsum() for cumulative sums and np.cumprod() for cumulative products.
np.histogram() computes the histogram of a dataset, which is a way to summarize the distribution of data.
The rank of a NumPy array, in the sense of its number of dimensions, is available through the ndim attribute. This is different from the linear-algebra rank of a matrix, which np.linalg.matrix_rank() computes.
NumPy uses broadcasting to perform operations between arrays of different shapes, automatically expanding the smaller array to match the larger one.
Use np.memmap() to create memory-mapped arrays that allow you to work with large datasets by accessing data on disk as if it were in memory.
Use np.corrcoef() to compute the correlation coefficient matrix between two or more arrays.
np.meshgrid() creates coordinate matrices from coordinate vectors, useful for evaluating functions on a grid.
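As a sketch, the function f(x, y) = x + y can be evaluated on a small grid (the coordinate vectors are arbitrary):

```python
import numpy as np

x = np.array([0, 1, 2])   # 3 points along x
y = np.array([0, 10])     # 2 points along y

# With the default indexing, both outputs have shape (len(y), len(x))
X, Y = np.meshgrid(x, y)

Z = X + Y                 # f evaluated at every (x, y) pair
```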
Use the * operator or np.multiply() function to perform element-wise multiplication.
np.extract() extracts elements from an array that satisfy a given condition.
np.arange() generates arrays with a specified step size and excludes the stop value, while np.linspace() generates arrays with a specified number of equally spaced points and includes the stop value by default.
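Side by side, over the same interval (the bounds are arbitrary):

```python
import numpy as np

# Step-based: stop value 1.0 is excluded
a = np.arange(0, 1, 0.25)   # 0.0, 0.25, 0.5, 0.75

# Count-based: 5 points, stop value 1.0 is included by default
b = np.linspace(0, 1, 5)    # 0.0, 0.25, 0.5, 0.75, 1.0
```

With a non-integer step, np.arange() can produce an unexpected number of elements because of floating-point rounding, so np.linspace() is usually the safer choice when the point count matters.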
Structured arrays allow for arrays with fields of different data types. You can define the dtype with a list of tuples specifying field names and types.
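A minimal sketch of a structured array; the field names and records below are invented for illustration:

```python
import numpy as np

# Each tuple is (field name, field type)
person_dtype = np.dtype([("name", "U10"), ("age", "i4"), ("height", "f8")])

people = np.array(
    [("Ada", 36, 1.70), ("Linus", 54, 1.80)],
    dtype=person_dtype,
)

ages = people["age"]                 # access one field as a regular array
older = people[people["age"] > 40]   # boolean indexing works on records too
```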
np.save() saves an array to a binary file in .npy format, while np.load() loads an array from a .npy file.
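A round trip through a temporary directory might look like this (the file name example.npy is a placeholder):

```python
import os
import tempfile

import numpy as np

arr = np.arange(5)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.npy")
    np.save(path, arr)          # writes the array in .npy format
    restored = np.load(path)    # reads it back fully into memory
```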
Use dtype=object to create an array with mixed data types. For specific operations, ensure all elements are of compatible types or convert as needed.
np.isnan() identifies NaN values in arrays. Handle NaNs by using functions like np.nanmean(), np.nanstd(), etc., which ignore NaNs during calculations.
Use comparison operators like ==, !=, <, >, <=, >=, and functions like np.equal(), np.greater(), etc., for element-wise comparisons.
np.sort() returns a sorted copy of an array, and you can use the axis argument to sort along a specified axis.
Use the np.sum() function with the axis argument to specify the axis along which to sum the elements.
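For a 2-D array, the axis argument names the dimension that is collapsed. A small sketch:

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

col_sums = np.sum(m, axis=0)   # collapse rows: one sum per column
row_sums = np.sum(m, axis=1)   # collapse columns: one sum per row
total = np.sum(m)              # no axis: sum of all elements
```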
np.concatenate() joins arrays along an existing axis, while np.vstack() stacks arrays vertically (row-wise).
np.mean() computes the arithmetic mean, while np.average() allows for weighted averages when weights are provided.
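A quick comparison, using made-up values and weights:

```python
import numpy as np

values = np.array([1.0, 2.0, 3.0])
weights = np.array([3.0, 1.0, 1.0])

plain = np.mean(values)                          # (1 + 2 + 3) / 3
weighted = np.average(values, weights=weights)   # (3*1 + 1*2 + 1*3) / 5
```

Without the weights argument, np.average() returns the same value as np.mean().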
Use logical operators like &, |, ~, and functions like np.logical_and(), np.logical_or(), etc., for element-wise logical operations.
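A sketch of combining two conditions into one boolean mask (the array is arbitrary):

```python
import numpy as np

a = np.array([1, 5, 8, 12])

# Parentheses are required: & binds more tightly than the comparisons
mask_op = (a > 2) & (a < 10)
mask_fn = np.logical_and(a > 2, a < 10)

selected = a[mask_op]   # keep only elements where the mask is True
```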
np.tile() repeats the whole array a given number of times along each axis, while np.repeat() repeats each individual element.
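The difference shows up in the ordering of the result:

```python
import numpy as np

a = np.array([1, 2, 3])

tiled = np.tile(a, 2)       # whole array repeated: 1, 2, 3, 1, 2, 3
repeated = np.repeat(a, 2)  # each element repeated: 1, 1, 2, 2, 3, 3
```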
Use np.dot() for dot product and np.cross() for cross product calculations.
Use np.std() for standard deviation and np.var() for variance.
np.squeeze() removes single-dimensional entries from the shape of an array, while np.expand_dims() adds a new axis of length one to the array.
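Both functions only change the shape, which a short sketch makes visible:

```python
import numpy as np

a = np.zeros((1, 3, 1))
squeezed = np.squeeze(a)                 # drops both length-1 axes

b = np.zeros(3)
expanded = np.expand_dims(b, axis=0)     # inserts a length-1 axis in front

print(squeezed.shape)  # (3,)
print(expanded.shape)  # (1, 3)
```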
np.setdiff1d() returns the unique values in one array that are not in another.
Use np.linalg.norm() with the difference between the two arrays to calculate the Euclidean distance.
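For two points p and q (chosen here to form a 3-4-5 triangle):

```python
import numpy as np

p = np.array([0.0, 0.0])
q = np.array([3.0, 4.0])

# The default norm of a 1-D array is the 2-norm, i.e. Euclidean length
distance = np.linalg.norm(p - q)
print(distance)  # 5.0
```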
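np.partition() partially sorts an array: given an index k, it places the element that would be at position k in sorted order at that position, with all smaller elements before it and all larger elements after it. The k smallest elements therefore end up in the first k positions, in no particular order.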
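A small sketch of picking the three smallest values without a full sort (the data is made up):

```python
import numpy as np

a = np.array([7, 2, 9, 1, 5, 3])
k = 3

p = np.partition(a, k)

# The first k entries are the k smallest values, but their order is not guaranteed
smallest = np.sort(p[:k])
# The element at index k is the one a full sort would put there
kth_value = p[k]
```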
np.unique() returns the sorted unique elements of an array.
np.gradient() calculates the gradient of an array, which is useful for numerical differentiation.
np.flatnonzero() returns the indices of the non-zero elements in a flattened array.
np.copyto() copies values from one array to another with optional broadcasting.
np.ma.masked_where() masks elements of an array based on a condition, creating a masked array.
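As an illustration, negative readings in a made-up array can be masked so that later statistics skip them:

```python
import numpy as np

data = np.array([1.0, -2.0, 3.0, -4.0])

# Elements where the condition is True are masked (hidden), not removed
masked = np.ma.masked_where(data < 0, data)

mean_of_valid = masked.mean()   # computed over the unmasked values only
```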
np.argwhere() returns the indices of array elements that satisfy a specified condition.
Use np.cov() to compute the covariance matrix between two or more arrays.
np.tril() extracts the lower triangular part, while np.triu() extracts the upper triangular part of an array.
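On a 3x3 example, both functions keep the diagonal and zero out the other triangle:

```python
import numpy as np

m = np.arange(1, 10).reshape(3, 3)   # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

lower = np.tril(m)   # entries above the diagonal become 0
upper = np.triu(m)   # entries below the diagonal become 0
```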