In [1]:
#Import library
import seaborn as sns
%matplotlib inline
In [2]:
tips = sns.load_dataset('tips')
tips.head()
total_bill tip sex smoker day time size
0 16.99 1.01 Female No Sun Dinner 2
1 10.34 1.66 Male No Sun Dinner 3
2 21.01 3.50 Male No Sun Dinner 3
3 23.68 3.31 Male No Sun Dinner 2
4 24.59 3.61 Female No Sun Dinner 4

barplot and countplot

These two closely related plots aggregate data over a categorical feature. barplot is the general form: it groups the data by the categorical variable and applies an aggregation function to each group, by default the mean:
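
As a minimal sketch of the default behaviour (not necessarily the call used in the original cell), the block below draws one bar per value of sex and checks the bar heights against a hand-computed group mean. The name tips_head and the choice of columns are assumptions made for illustration; the frame rebuilds the five rows shown in the table at the top of the notebook so that nothing has to be downloaded.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line inside a notebook
import pandas as pd
import seaborn as sns

# Hypothetical small frame: the five rows shown at the top of the notebook.
tips_head = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59],
    "sex": ["Female", "Male", "Male", "Male", "Female"],
})

# One bar per category of x; the bar height is the mean of y in that category.
ax = sns.barplot(x="sex", y="total_bill", data=tips_head)

# The same aggregation done by hand, for comparison with the bar heights.
group_means = tips_head.groupby("sex")["total_bill"].mean()
print(group_means)
```

With the full dataset loaded, the same call would take data=tips instead of the small frame.
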

In [3]:
<matplotlib.axes._subplots.AxesSubplot at 0x9900eb8>
In [4]:
import numpy as np

You can set the estimator argument to any function of your own that converts a vector to a scalar:
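
A sketch of a custom estimator, under the assumption that a simple max-minus-min range is a reasonable stand-in for "your own function". The helper value_range and the frame tips_head are hypothetical names, and the data are the five rows shown at the top of the notebook. A NumPy reduction such as np.std can be passed the same way.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line inside a notebook
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical small frame: the five rows shown at the top of the notebook.
tips_head = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59],
    "sex": ["Female", "Male", "Male", "Male", "Female"],
})

def value_range(values):
    """Reduce a vector to a single number: largest value minus smallest."""
    return np.max(values) - np.min(values)

# Each bar now shows the range of total_bill within the category.
ax = sns.barplot(x="sex", y="total_bill", data=tips_head, estimator=value_range)

# The estimator applied by hand to the Female rows.
female_range = value_range(np.array([16.99, 24.59]))
```

Any callable works as long as it accepts a one-dimensional collection of numbers and returns one number.
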

In [5]:
<matplotlib.axes._subplots.AxesSubplot at 0x9a0fc50>


countplot is essentially the same as barplot, except that the estimator explicitly counts the number of occurrences in each category, which is why we pass only the x value:
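
A minimal sketch of that call, assuming sex as the counted column (the original cell's column is not recoverable). The frame tips_head is a hypothetical stand-in built from the five rows shown at the top of the notebook, and value_counts gives the numbers the bars should show.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line inside a notebook
import pandas as pd
import seaborn as sns

# Hypothetical small frame: the sex column of the five rows shown earlier.
tips_head = pd.DataFrame({
    "sex": ["Female", "Male", "Male", "Male", "Female"],
})

# Only x is given; the bar height is the number of rows in each category.
ax = sns.countplot(x="sex", data=tips_head)

# The same counts computed directly with pandas.
counts = tips_head["sex"].value_counts()
print(counts)
```
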

In [6]:
<matplotlib.axes._subplots.AxesSubplot at 0x9a6cb00>

boxplot and violinplot

boxplot and violinplot show the distribution of a quantitative variable within each category. A box plot (or box-and-whisker plot) shows the distribution of quantitative data in a way that facilitates comparisons between variables or across levels of a categorical variable. The box shows the quartiles of the dataset while the whiskers extend to show the rest of the distribution, except for points that are determined to be “outliers” using a method that is a function of the inter-quartile range.
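
To make the outlier rule concrete, here is a sketch of the usual convention, in which points further than 1.5 times the inter-quartile range beyond the box are flagged. The 1.5 factor is the common default and is an assumption here, as is the extra value 48.0, which is hypothetical and added only so that something falls outside the whiskers. The other values are the total_bill entries shown at the top of the notebook.

```python
import numpy as np

# Five total_bill values from the table above, plus one hypothetical large bill.
bills = np.array([16.99, 10.34, 21.01, 23.68, 24.59, 48.0])

# The box spans the first to third quartile.
q1, q3 = np.percentile(bills, [25, 75])
iqr = q3 - q1

# Assumed convention: fences sit 1.5 * IQR beyond each edge of the box.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Points beyond the fences are drawn individually as outliers.
outliers = bills[(bills < lower_fence) | (bills > upper_fence)]
print(q1, q3, outliers)
```
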

In [7]:
sns.boxplot(x="day", y="total_bill", data=tips,palette='rainbow')
<matplotlib.axes._subplots.AxesSubplot at 0xaaa2668>
In [8]:
# Can do entire dataframe with orient='h'
<matplotlib.axes._subplots.AxesSubplot at 0xab2f898>
In [9]:
sns.boxplot(x="day", y="total_bill", hue="smoker",data=tips, palette="coolwarm")
<matplotlib.axes._subplots.AxesSubplot at 0xaba1ef0>


A violin plot plays a similar role to a box-and-whisker plot. It shows the distribution of quantitative data across several levels of one (or more) categorical variables such that those distributions can be compared. Unlike a box plot, in which all of the plot components correspond to actual data points, the violin plot features a kernel density estimate of the underlying distribution.
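
The kernel density estimate behind the violin shape can be sketched by hand: place a small Gaussian bump on every observation and average them. This is an illustration of the idea rather than seaborn's internal code; the function name gaussian_kde_sketch and the use of Scott's rule for the bandwidth are assumptions.

```python
import numpy as np

# total_bill values from the five rows shown at the top of the notebook.
bills = np.array([16.99, 10.34, 21.01, 23.68, 24.59])

# Scott's rule of thumb for the bandwidth (an assumption; other rules exist).
bandwidth = bills.std(ddof=1) * len(bills) ** (-1 / 5)

def gaussian_kde_sketch(grid, samples, h):
    """Average of Gaussian bumps of width h centred on each sample."""
    z = (grid[:, None] - samples[None, :]) / h
    bumps = np.exp(-0.5 * z**2) / (h * np.sqrt(2 * np.pi))
    return bumps.mean(axis=1)

# Evaluate the estimate on a grid that extends well past the data.
grid = np.linspace(bills.min() - 5 * bandwidth, bills.max() + 5 * bandwidth, 2001)
density = gaussian_kde_sketch(grid, bills, bandwidth)

# A density should integrate to about 1 over the grid.
area = density.sum() * (grid[1] - grid[0])
print(round(area, 3))
```

The width of the violin at a given height corresponds to this smoothed density, which is why a violin can show shape where no actual observation lies.
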

In [10]:
sns.violinplot(x="day", y="total_bill", data=tips,palette='rainbow')
<matplotlib.axes._subplots.AxesSubplot at 0xb094e48>
In [11]:
sns.violinplot(x="day", y="total_bill", data=tips,hue='sex',palette='Set1')
<matplotlib.axes._subplots.AxesSubplot at 0xb0ed0b8>
In [12]:
sns.violinplot(x="day", y="total_bill", data=tips,hue='sex',split=True,palette='Set1')
<matplotlib.axes._subplots.AxesSubplot at 0xac36a20>

stripplot and swarmplot

The stripplot will draw a scatterplot where one variable is categorical. A strip plot can be drawn on its own, but it is also a good complement to a box or violin plot in cases where you want to show all observations along with some representation of the underlying distribution.

The swarmplot is similar to stripplot(), but the points are adjusted (only along the categorical axis) so that they don’t overlap. This gives a better representation of the distribution of values, although it does not scale as well to large numbers of observations (both in terms of the ability to show all the points and in terms of the computation needed to arrange them).

In [13]:
sns.stripplot(x="day", y="total_bill", data=tips)
<matplotlib.axes._subplots.AxesSubplot at 0xacc45c0>
In [14]:
sns.stripplot(x="day", y="total_bill", data=tips,jitter=True)
<matplotlib.axes._subplots.AxesSubplot at 0xad15f60>
In [15]:
sns.stripplot(x="day", y="total_bill", data=tips,jitter=True,hue='sex',palette='Set1')
<matplotlib.axes._subplots.AxesSubplot at 0xad6bb70>
In [16]:
sns.stripplot(x="day", y="total_bill", data=tips,jitter=True,hue='sex',palette='Set1',dodge=True)
<matplotlib.axes._subplots.AxesSubplot at 0xae7b588>
In [17]:
sns.swarmplot(x="day", y="total_bill", data=tips)
<matplotlib.axes._subplots.AxesSubplot at 0xaf1fb38>
In [18]:
sns.swarmplot(x="day", y="total_bill",hue='sex',data=tips, palette="Set1", dodge=True)
<matplotlib.axes._subplots.AxesSubplot at 0xaf7cb38>

Combining Categorical Plots

In [19]:
sns.violinplot(x="tip", y="day", data=tips,palette='rainbow')
sns.swarmplot(x="tip", y="day", data=tips,color='black',size=3)
<matplotlib.axes._subplots.AxesSubplot at 0xafdd048>


factorplot is the most general form of a categorical plot; newer seaborn releases rename it to catplot and change the default kind from 'point' to 'strip'. It takes a kind parameter to select the plot type:
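
A sketch using the newer name catplot, assuming a seaborn release that provides it. The choice of kind="bar" and of the sex and total_bill columns is illustrative only, and tips_head is a hypothetical frame built from the five rows shown at the top of the notebook.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line inside a notebook
import pandas as pd
import seaborn as sns

# Hypothetical small frame: the five rows shown at the top of the notebook.
tips_head = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59],
    "sex": ["Female", "Male", "Male", "Male", "Female"],
})

# kind selects the underlying plot, e.g. "bar", "box", "violin", "strip".
grid = sns.catplot(x="sex", y="total_bill", data=tips_head, kind="bar")

# catplot returns a FacetGrid rather than a single Axes.
first_axes = grid.axes.flat[0]
```

Because the result is a FacetGrid, the same call can later be split into panels by passing col or row.
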

In [20]:
C:\ProgramData\Anaconda3\lib\site-packages\seaborn\ UserWarning: The `factorplot` function has been renamed to `catplot`. The original name will be removed in a future release. Please update your code. Note that the default `kind` in `factorplot` (`'point'`) has changed `'strip'` in `catplot`.
<seaborn.axisgrid.FacetGrid at 0x99ef358>

Happy learning!