
Histograms



Description of the “Histograms” project…

Histograms are typically used to show how the values in a dataset are distributed. The data is grouped into bins, and each bin is drawn as a bar whose height corresponds to a summary statistic for the data points in that bin. That statistic is usually the count, although other functions can be used. A histogram shows where the data is centered and whether it is skewed one way or the other.
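The binning step described above can be sketched in plain Python. This is only an illustration of the idea, not how Plotly computes its bins internally; the `bin_counts` helper, the bin width, and the sample values are all made up for the example.

```python
from collections import Counter


def bin_counts(values, start, width):
    """Count how many values fall in each bin [left, left + width)."""
    counts = Counter()
    for v in values:
        index = int((v - start) // width)      # which bin the value lands in
        counts[start + index * width] += 1     # key each bin by its left edge
    return dict(sorted(counts.items()))


# Hypothetical life-expectancy values, invented for illustration.
life_exp = [43.8, 46.9, 52.1, 58.0, 59.4, 61.2, 72.3, 74.9, 78.1, 79.4]
counts = bin_counts(life_exp, start=40, width=10)

# Draw one text "bar" per bin; the bar length is the count in that bin.
for left, n in counts.items():
    print(f"{left}-{left + 10}: {'#' * n}")
```

The longest bar marks the bin holding the most values, which is the same reading a plotted histogram gives.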

import plotly.express as px
gap = px.data.gapminder()
  • The number of bins can be changed using nbins.
  • The y-axis is given the title ‘count’ by default.
  • Categorical columns can be used to colour the bars.
px.histogram(data_frame = gap, x = 'lifeExp', color = 'continent')

A smaller DataFrame containing only two years (2002 and 2007)

gap['year'].unique()
gapx = gap[gap['year'].isin([2002,2007])]

Side by side bars using barmode='group'

px.histogram(gapx, x = 'lifeExp', color ='year', barmode='group')

Facets to display side by side charts using facet_col.

px.histogram(gapx, x = 'lifeExp', color ='year', facet_col='year')

Showing the percentage of the number of values in a bin using histnorm='percent'

  • A ticksuffix can be added to show the percent sign (%).
  • The height of each bar represents the percentage instead of the absolute count.
fig = px.histogram(gapx, x = 'lifeExp', color ='year', histnorm='percent', facet_col='year')
fig.layout.yaxis.ticksuffix = '%'
fig.layout.yaxis.title = 'Percent of total'
fig
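To make the percent normalization concrete, here is a small plain-Python sketch of the arithmetic: each bin's count is divided by the total number of values and multiplied by 100. The `to_percent` helper and the counts are hypothetical, and the sketch assumes the normalization is applied within a single group (for example, one year), which is my understanding of how each histogram trace is treated.

```python
def to_percent(counts):
    """Convert per-bin counts to percentages of the total count."""
    total = sum(counts.values())
    return {left: 100 * n / total for left, n in counts.items()}


# Hypothetical per-bin counts for one year, keyed by the bin's left edge.
counts_one_year = {40: 2, 50: 3, 60: 1, 70: 4}
percents = to_percent(counts_one_year)

# Print each bin with a '%' suffix, mirroring the ticksuffix on the y-axis.
for left, pct in percents.items():
    print(f"{left}: {pct:.0f}%")
```

Because every group is scaled to sum to 100, groups with different numbers of rows can be compared on the same axis.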

Tech used:
  • Python
  • HTML