The statistics library in Python

Python has a built-in library called statistics. This module provides functions for calculating mathematical statistics of numeric (Real-valued) data.

Unless explicitly noted, these functions support int, float, Decimal and Fraction.

Some of the most commonly used functions in this module are:

  1. mean()
  2. fmean()
  3. geometric_mean()
  4. median()
  5. median_low()
  6. median_high()
  7. mode()
  8. multimode()

Let us look at these functions one by one, with an example for each.

1. mean()

The first function is mean(). It returns the sample arithmetic mean of data, which can be a sequence or iterable.

The arithmetic mean is the sum of the data divided by the number of data points. If we compute the average without this module, we have to do some additional work: count the elements in the iterable, sum them, and then divide the sum by the count.

The following code demonstrates how to find the mean without the statistics module.

[Image: finding the average in Python]
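A minimal sketch of the manual approach described above. The list `data` is an illustrative assumption, not a value taken from the article.

```python
# Hypothetical sample data, chosen only for illustration.
data = [1, 2, 3, 4, 5]

total = sum(data)        # sum of the elements
count = len(data)        # number of elements
average = total / count  # true division always yields a float

print(average)  # 3.0
```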

Let us see how to find the average using the mean() function from the statistics module. We import mean() from the module and pass the iterable to it.

[Image: finding the average using mean()]
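The same calculation with mean() might look like this. The input lists are assumed sample values; the second call is included only to show what happens when the mean is not a whole number.

```python
from statistics import mean

# Hypothetical sample data, chosen only for illustration.
data = [1, 2, 3, 4, 5]

result = mean(data)
print(result)  # 3 (an int, because the inputs are ints and the mean is whole)

fractional = mean([1, 2, 3, 4])
print(fractional)  # 2.5 (a float, because the mean is not a whole number)
```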

Note that the output is an integer here, because the inputs are integers and the mean is a whole number. If the mean is not a whole number, the output will be a float.

2. fmean()

This function converts data to floats and then computes the mean. It runs faster than the mean() function and always returns a float. The data may be a sequence or iterable. If the input dataset is empty, it raises a StatisticsError.

This function is available in Python 3.8 and later.

[Image: fmean() in Python]
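As a rough illustration, here is fmean() applied to an assumed list of integers, followed by the empty-input case. It requires Python 3.8 or later.

```python
from statistics import StatisticsError, fmean

# Hypothetical sample data, chosen only for illustration.
data = [1, 2, 3, 4, 5]

result = fmean(data)
print(result)  # 3.0 (always a float)

# An empty dataset raises StatisticsError.
try:
    fmean([])
except StatisticsError as exc:
    print("empty data:", exc)
```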

Note that the same example from above yields a float value with the fmean() function.

3. geometric_mean()

This function converts data to floats and computes the geometric mean. Like fmean(), it is available in Python 3.8 and later.

In mathematics, the geometric mean is a mean or average, which indicates the central tendency or typical value of a set of numbers by using the product of their values (as opposed to the arithmetic mean which uses their sum).

The calculation is easy to follow with simple numbers such as 2 and 8: multiply 2 and 8 to get 16, then take the square root (the ½ power, since there are only two numbers), and the answer is 4.

[Image: geometric mean in Python]
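The worked example with 2 and 8 can be checked with a short sketch. Because the result goes through floating-point logarithms, it is compared with a tolerance rather than for exact equality.

```python
import math
from statistics import geometric_mean

# The two numbers from the worked example: sqrt(2 * 8) = sqrt(16) = 4.
data = [2, 8]

result = geometric_mean(data)
print(round(result, 6))  # 4.0

# Floating-point rounding may leave a tiny error, so compare with a tolerance.
print(math.isclose(result, 4.0))  # True
```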

If data is empty, StatisticsError will be raised.

4. median()

The median() function returns the median (middle value) of numeric data, using the common “mean of the middle two” method.

When the number of values is odd, the middle value is returned.

[Image: median() with an odd number of values]
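A small sketch of the odd-length case, using an assumed, deliberately unsorted list to show that median() does not require sorted input.

```python
from statistics import median

# Hypothetical sample data with an odd number of values (unsorted on purpose).
data = [5, 1, 9, 3, 7]

result = median(data)
print(result)  # 5 (the middle value of the sorted data 1, 3, 5, 7, 9)
```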

When the number of values is even, the median is interpolated by taking the average of the two middle values.

[Image: median() with an even number of values]
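For the even-length case, an example along these lines shows the averaging of the two middle values. The list is an assumption chosen for illustration.

```python
from statistics import median

# Hypothetical sample data with an even number of values.
data = [1, 3, 5, 7]

result = median(data)
print(result)  # 4.0 (the average of the two middle values, 3 and 5)
```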

The data can be a sequence or iterable. If it is empty, StatisticsError is raised.

5. median_low()

For data with an odd number of values, median_low() behaves like median(): the middle element is returned as usual.

But if the number of values is even, instead of averaging the two middle values, it returns the smaller of the two.

[Image: median_low() in Python]
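One possible example: the list below is an assumption, picked so that its two middle values are 3 and 4.

```python
from statistics import median_low

# Hypothetical sample data; the two middle values are 3 and 4.
data = [1, 2, 3, 4, 5, 6]

result = median_low(data)
print(result)  # 3 (the smaller of the two middle values)
```

Unlike median(), the value returned here is always an actual member of the data.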

The two middle values, 3 and 4, are considered, and 3, the smaller of the two, is returned.

6. median_high()

This function is similar to median_low(). But when the number of values is even, it returns the larger of the two middle values instead of the smaller.

[Image: median_high() in Python]
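Assuming a six-element list whose two middle values are 3 and 4, median_high() would behave as follows.

```python
from statistics import median_high

# Hypothetical sample data; the two middle values are 3 and 4.
data = [1, 2, 3, 4, 5, 6]

result = median_high(data)
print(result)  # 4 (the larger of the two middle values)
```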

7. mode()

The mode() function returns the most frequent data point from discrete or nominal data. The mode (when it exists) is the most typical value and serves as a measure of central location.

If there are multiple modes with the same frequency, mode() returns the first one encountered in the data (this applies to Python 3.8 and later; earlier versions raised StatisticsError in that case). If the input data is empty, StatisticsError is raised.

[Image: mode() in the statistics module]
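A brief sketch with numeric data. The list is an assumption, built so that 3 occurs more often than any other value.

```python
from statistics import mode

# Hypothetical sample data in which 3 appears three times.
data = [1, 2, 3, 3, 3, 4, 5]

result = mode(data)
print(result)  # 3
```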

3 is the most frequent element in the above list. The mode() function also supports non-numerical data such as strings.

[Image: mode() with a list of strings]
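With strings, the call is the same. The letters below are assumed sample values in which 'a' is the most common.

```python
from statistics import mode

# Hypothetical sample data in which 'a' appears three times.
letters = ["a", "b", "a", "c", "a", "b"]

result = mode(letters)
print(result)  # a
```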

‘a’ is the most frequent element in the above list.

8. multimode()

Unlike mode(), if several values are tied for the highest frequency, multimode() returns them all as a list. This function is available in Python 3.8 and later.

[Image: multimode() in Python]
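To illustrate the tie case, here is a sketch with an assumed list in which 1 and 2 each occur twice. It requires Python 3.8 or later.

```python
from statistics import multimode

# Hypothetical sample data: 1 and 2 each appear twice, 3 appears once.
data = [1, 1, 2, 2, 3]

result = multimode(data)
print(result)  # [1, 2]
```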

The modes are listed in the order in which they are first encountered in the data. In the above example, 1 and 2 both have a frequency of 2, so both are returned in the list.

This function also works with a list of strings.

[Image: multimode() with a list of strings]
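The string case could look like this, again with assumed sample values. The second call shows that a dataset with a single most frequent value still produces a list.

```python
from statistics import multimode

# Hypothetical sample data: 'a' and 'b' each appear twice, 'c' appears once.
letters = ["a", "a", "b", "b", "c"]

result = multimode(letters)
print(result)  # ['a', 'b']

# With one clear winner, the result is a one-element list.
single = multimode(["x", "y", "x"])
print(single)  # ['x']
```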

Conclusion

I hope this article is helpful. If you have any questions, leave them in the comments below.

Happy coding!