Make a bean plot of each dataset in the data sequence.
A bean plot combines a violin plot (a kernel density estimate of the probability density function of each dataset) with a line-scatter plot of all individual data points.
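The two ingredients described above can be sketched with NumPy alone. This is only an illustration of the idea, not the code that `beanplot` runs internally: the helper name `gaussian_kde_curve`, the Gaussian kernel, the Scott's-rule bandwidth, and the synthetic `ages` sample are all assumptions made for this example.

```python
import numpy as np


def gaussian_kde_curve(sample, grid, bandwidth=None):
    """Evaluate a Gaussian kernel density estimate of `sample` on `grid`.

    Hypothetical helper for illustration; statsmodels may choose the
    kernel and bandwidth differently.
    """
    sample = np.asarray(sample, dtype=float)
    grid = np.asarray(grid, dtype=float)
    if bandwidth is None:
        # Scott's rule of thumb for one-dimensional data (an assumption here).
        bandwidth = sample.std(ddof=1) * sample.size ** (-1.0 / 5.0)
    # One Gaussian bump per observation, averaged over the sample.
    z = (grid[:, None] - sample[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
    return kernel.sum(axis=1) / (sample.size * bandwidth)


rng = np.random.default_rng(0)
ages = rng.normal(loc=45.0, scale=15.0, size=200)  # synthetic stand-in data

# Ingredient 1: the density outline (the "bean"), scaled to a half-width.
grid = np.linspace(ages.min() - 30.0, ages.max() + 30.0, 400)
density = gaussian_kde_curve(ages, grid)
half_width = 0.4 * density / density.max()

# Ingredient 2: one short horizontal line per observation at height `ages`.
line_heights = np.sort(ages)

# Sanity check: the estimated density should integrate to roughly one.
area = density.sum() * (grid[1] - grid[0])
```

Drawing `half_width` mirrored around a position on the x-axis gives the violin outline, and drawing a short line at each value in `line_heights` gives the line-scatter part.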
Parameters:
data : sequence of ndarrays
ax : Matplotlib AxesSubplot instance, optional
labels : list of str, optional
positions : array_like, optional
side : {'both', 'left', 'right'}, optional
jitter : bool, optional
plot_opts : dict, optional
Returns:
fig : Matplotlib figure instance
See also
References
P. Kampstra, “Beanplot: A Boxplot Alternative for Visual Comparison of Distributions”, J. Stat. Soft., Vol. 28, pp. 1-9, 2008.
Examples
We use the American National Election Survey 1996 dataset, which has Party Identification of respondents as independent variable and (among other data) age as dependent variable.
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> import statsmodels.api as sm
>>> data = sm.datasets.anes96.load_pandas()
>>> party_ID = np.arange(7)
>>> labels = ["Strong Democrat", "Weak Democrat", "Independent-Democrat",
... "Independent-Independent", "Independent-Republican",
... "Weak Republican", "Strong Republican"]
Group age by party ID, and create a bean plot with it:
>>> plt.rcParams['figure.subplot.bottom'] = 0.23 # keep labels visible
>>> age = [data.exog['age'][data.endog == id] for id in party_ID]
>>> fig = plt.figure()
>>> ax = fig.add_subplot(111)
>>> sm.graphics.beanplot(age, ax=ax, labels=labels,
... plot_opts={'cutoff_val':5, 'cutoff_type':'abs',
... 'label_fontsize':'small',
... 'label_rotation':30})
>>> ax.set_xlabel("Party identification of respondent.")
>>> ax.set_ylabel("Age")
>>> plt.show()