Normal Data Distribution

In the previous chapter we learned how to create a completely random array, of a given size, and between two given values.

In this chapter we will learn how to create an array where the values are concentrated around a given value.

In probability theory this kind of data distribution is known as the normal data distribution, or the Gaussian data distribution, after the mathematician Carl Friedrich Gauss, who came up with the formula for it.
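For reference, the formula mentioned above is usually written as the probability density below. The symbols are introduced here and do not come from this chapter: \mu stands for the mean (the value the data concentrates around) and \sigma for the standard deviation (how widely the data spreads).

```latex
f(x) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left( -\frac{(x - \mu)^2}{2\sigma^2} \right)
```

The density is highest at x = \mu and falls off symmetrically on both sides, which is what produces the bell shape.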

python

import numpy
import matplotlib.pyplot as plt

# 100000 values drawn with mean 5.0 and standard deviation 1.0
x = numpy.random.normal(5.0, 1.0, 100000)

# Histogram with 100 bars
plt.hist(x, 100)
plt.show()

Note: A normal distribution graph is also known as the bell curve because of its characteristic bell shape.

We use the array returned by the numpy.random.normal() function, with 100000 values, to draw a histogram with 100 bars.

We specify that the mean value is 5.0, and the standard deviation is 1.0.

This means that the values are concentrated around 5.0: about 68% of them fall within 1.0 of the mean, and about 95% fall within 2.0.

And as you can see from the histogram, most values are between 4.0 and 6.0, with a peak at approximately 5.0.
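The same observation can be checked numerically instead of by eye. The following is a minimal sketch, not part of the original example: it assumes NumPy's newer Generator API (numpy.random.default_rng) with an arbitrary seed so that the run is repeatable, and the variable names are chosen here for illustration.

```python
import numpy

# Seeded generator so the run is repeatable (the seed value is arbitrary).
rng = numpy.random.default_rng(42)

mean, std_dev = 5.0, 1.0
x = rng.normal(mean, std_dev, 100000)

# The sample statistics should land close to the requested parameters.
sample_mean = x.mean()
sample_std = x.std()

# Distance of every value from the requested mean.
distance = numpy.abs(x - mean)

# Share of values within one and two standard deviations of the mean.
within_1 = numpy.mean(distance <= 1 * std_dev)
within_2 = numpy.mean(distance <= 2 * std_dev)

print(f"sample mean: {sample_mean:.3f}")
print(f"sample standard deviation: {sample_std:.3f}")
print(f"within 1 standard deviation: {within_1:.3f}")
print(f"within 2 standard deviations: {within_2:.3f}")
```

With this many values, the two shares should come out at roughly 0.68 and 0.95, which matches what the histogram shows: most values lie between 4.0 and 6.0, and very few lie outside 3.0 to 7.0.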
