One limitation of this method is that each correlation coefficient describes the relationship between only two variables at a time.
The full name is the Pearson Product Moment Correlation (PPMC).
The PPMC is not able to tell the difference between dependent variables and independent variables.
The documentation about this function can be found here.
More examples of Pearson Correlation can be found on this website.
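
The point about dependent and independent variables can be checked with a minimal sketch: swapping the two arguments of np.corrcoef gives the same coefficient. The sample data below is hypothetical, chosen only for illustration, and the manual formula (covariance divided by the product of the standard deviations) is the standard definition of Pearson's r, not something taken from this tutorial's code.

```python
import numpy as np

# Hypothetical sample data, used only for illustration.
rng = np.random.default_rng(0)
x = rng.integers(0, 100, 50)
y = 2 * x + rng.normal(0, 10, 50)

# The coefficient is the off-diagonal entry of the 2x2 matrix.
r_xy = np.corrcoef(x, y)[0, 1]
r_yx = np.corrcoef(y, x)[0, 1]

# Pearson's r from its definition: covariance divided by the
# product of the two standard deviations (both computed with 1/n).
r_manual = np.mean((x - x.mean()) * (y - y.mean())) / (x.std() * y.std())

# All three values agree up to floating-point rounding.
print(r_xy, r_yx, r_manual)
```

Because the formula treats x and y symmetrically, the result says nothing about which variable depends on the other.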
The example in this tutorial uses NumPy's random module to generate random integers and then calculates the correlation coefficients.
This is repeated five times in a for loop, and each iteration uses a different random seed (0 through 4).
On each iteration the correlation matrices are printed and then scatter plots of the random numbers are displayed.
Let's see the source code:
import numpy as np
import matplotlib.pyplot as plt

nr_integers = 100    # how many random integers to generate
size_integers = 100  # exclusive upper bound for the integers, also used as the noise standard deviation

# use seeds 0 to 4 and show the result for each
for e in range(5):
    # change the random seed
    np.random.seed(e)
    # nr_integers random integers in the range [0, size_integers)
    x = np.random.randint(0, size_integers, nr_integers)
    # positive correlation: x plus Gaussian noise
    # (mean 0, standard deviation size_integers)
    positive_y = x + np.random.normal(0, size_integers, nr_integers)
    correlation_positive = np.corrcoef(x, positive_y)
    # print the correlation matrix for the positive case
    print(correlation_positive)
    # negative correlation: size_integers - x plus new Gaussian noise
    # drawn from the same distribution
    negative_y = size_integers - x + np.random.normal(0, size_integers, nr_integers)
    correlation_negative = np.corrcoef(x, negative_y)
    # print the correlation matrix for the negative case
    print(correlation_negative)
    # draw the two scatter plots side by side with subplot
    plt.subplot(1, 2, 1)
    plt.scatter(x, positive_y)
    plt.subplot(1, 2, 2)
    plt.scatter(x, negative_y)
    # show the graph
    plt.show()
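
Each printed result above is a 2x2 matrix. As a small sketch of how to read it, the snippet below rebuilds one such matrix for seed 0 and pulls out the single coefficient. The variable names matrix and r are introduced here for illustration and are not part of the tutorial's code.

```python
import numpy as np

np.random.seed(0)
x = np.random.randint(0, 100, 100)
y = x + np.random.normal(0, 100, 100)

matrix = np.corrcoef(x, y)

# The diagonal holds the correlation of each variable with itself (always 1).
print(np.allclose(np.diag(matrix), 1.0))

# The two off-diagonal entries are equal: each is the Pearson coefficient of x and y.
r = matrix[0, 1]
print(np.isclose(r, matrix[1, 0]))
print(r)
```

In practice only matrix[0, 1] is needed when exactly two variables are compared.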