Discrete Data Models

In [1]: import numpy as np

In [2]: import statsmodels.api as sm

Load data from Spector and Mazzeo (1980). Examples follow Greene’s Econometric Analysis Ch. 21 (5th Edition).

In [3]: spector_data = sm.datasets.spector.load()

In [4]: spector_data.exog = sm.add_constant(spector_data.exog, prepend=False)

Inspect the data:

In [5]: spector_data.exog[:5, :]
 Out[5]: 
array([[  2.66,  20.  ,   0.  ,   1.  ],
       [  2.89,  22.  ,   0.  ,   1.  ],
       [  3.28,  24.  ,   0.  ,   1.  ],
       [  2.92,  12.  ,   0.  ,   1.  ],
       [  4.  ,  21.  ,   0.  ,   1.  ]])

In [6]: spector_data.endog[:5]
 Out[6]: array([ 0.,  0.,  0.,  0.,  1.])

Linear Probability Model (OLS)

In [7]: lpm_mod = sm.OLS(spector_data.endog, spector_data.exog)

In [8]: lpm_res = lpm_mod.fit()

In [9]: print(lpm_res.params[:-1])
[ 0.4639  0.0105  0.3786]
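
The slice drops the constant, so only the slope coefficients are shown. A well-known weakness of the linear probability model is that fitted values are not constrained to lie in [0, 1]; a quick check on the fit above (a minimal sketch):

    # LPM fitted values can fall outside the unit interval.
    lpm_fitted = lpm_res.predict()
    print(lpm_fitted.min(), lpm_fitted.max())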

Logit Model

In [10]: logit_mod = sm.Logit(spector_data.endog, spector_data.exog)

In [11]: logit_res = logit_mod.fit()
Optimization terminated successfully.
         Current function value: 0.402801
         Iterations 7

In [12]: print(logit_res.params)
[  2.8261   0.0952   2.3787 -13.0213]

In [13]: logit_margeff = logit_res.get_margeff(method='dydx', at='overall')

In [14]: print(logit_margeff.summary())
        Logit Marginal Effects       
=====================================
Dep. Variable:                      y
Method:                          dydx
At:                           overall
==============================================================================
                dy/dx    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
x1             0.3626      0.109      3.313      0.001         0.148     0.577
x2             0.0122      0.018      0.686      0.493        -0.023     0.047
x3             0.3052      0.092      3.304      0.001         0.124     0.486
==============================================================================
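
Marginal effects can also be evaluated at the means of the regressors instead of being averaged over the sample; at='mean' is the other common choice (a sketch using the fit above):

    # Marginal effects at the sample means of the regressors.
    logit_margeff_mean = logit_res.get_margeff(method='dydx', at='mean')
    print(logit_margeff_mean.summary())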

L1 regularized logit

The regularization parameter alpha should be a scalar or have the same shape as results.params.

In [15]: alpha = 0.1 * len(spector_data.endog) * np.ones(spector_data.exog.shape[1])

Choose not to regularize the constant:

In [16]: alpha[-1] = 0

In [17]: logit_l1_res = logit_mod.fit_regularized(method='l1', alpha=alpha)
Optimization terminated successfully.    (Exit mode 0)
            Current function value: 0.610664637572
            Iterations: 13
            Function evaluations: 15
            Gradient evaluations: 13

In [18]: print(logit_l1_res.summary())
                           Logit Regression Results                           
==============================================================================
Dep. Variable:                      y   No. Observations:                   32
Model:                          Logit   Df Residuals:                       30
Method:                           MLE   Df Model:                            1
Date:                Fri, 16 Aug 2013   Pseudo R-squ.:                 0.07470
Time:                        07:59:41   Log-Likelihood:                -19.054
converged:                       True   LL-Null:                       -20.592
                                        LLR p-value:                   0.07944
==============================================================================
                 coef    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
x1                  0        nan        nan        nan           nan       nan
x2             0.1524      0.112      1.361      0.174        -0.067     0.372
x3                  0        nan        nan        nan           nan       nan
const         -4.0457      2.566     -1.577      0.115        -9.075     0.984
==============================================================================
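
The penalty drove x1 and x3 exactly to zero. Larger values of alpha zero out more coefficients; a small sketch tracing the number of surviving slope coefficients as the penalty grows (illustrative scales; disp=0 silences the solver output):

    # Count nonzero slope coefficients for increasing l1 penalties.
    for scale in [0.01, 0.1, 1.0]:
        a = scale * len(spector_data.endog) * np.ones(spector_data.exog.shape[1])
        a[-1] = 0  # leave the constant unpenalized
        res = logit_mod.fit_regularized(method='l1', alpha=a, disp=0)
        print(scale, (np.abs(res.params[:-1]) > 1e-8).sum())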

As with all of the discrete data models presented here, we can print a nice summary of results:

In [19]: print(logit_res.summary())
                           Logit Regression Results                           
==============================================================================
Dep. Variable:                      y   No. Observations:                   32
Model:                          Logit   Df Residuals:                       28
Method:                           MLE   Df Model:                            3
Date:                Fri, 16 Aug 2013   Pseudo R-squ.:                  0.3740
Time:                        07:59:41   Log-Likelihood:                -12.890
converged:                       True   LL-Null:                       -20.592
                                        LLR p-value:                  0.001502
==============================================================================
                 coef    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
x1             2.8261      1.263      2.238      0.025         0.351     5.301
x2             0.0952      0.142      0.672      0.501        -0.182     0.373
x3             2.3787      1.065      2.234      0.025         0.292     4.465
const        -13.0213      4.931     -2.641      0.008       -22.687    -3.356
==============================================================================
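
Logit coefficients live on the log-odds scale, so exponentiating them gives odds ratios, which are often easier to interpret:

    # exp(coef): multiplicative change in the odds for a one-unit
    # increase in each regressor (constant excluded).
    print(np.exp(logit_res.params[:-1]))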

Probit Model

In [20]: probit_mod = sm.Probit(spector_data.endog, spector_data.exog)

In [21]: probit_res = probit_mod.fit()
Optimization terminated successfully.
         Current function value: 0.400588
         Iterations 6

In [22]: print(probit_res.params)
[ 1.6258  0.0517  1.4263 -7.4523]

In [23]: probit_margeff = probit_res.get_margeff()

In [24]: print(probit_margeff.summary())
       Probit Marginal Effects       
=====================================
Dep. Variable:                      y
Method:                          dydx
At:                           overall
==============================================================================
                dy/dx    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
x1             0.3608      0.113      3.182      0.001         0.139     0.583
x2             0.0115      0.018      0.624      0.533        -0.025     0.048
x3             0.3165      0.090      3.508      0.000         0.140     0.493
==============================================================================
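
Logit and probit coefficients differ in scale (roughly by a factor of 1.6), but the implied probabilities are usually very close, as a quick comparison of the two fits shows (a minimal sketch):

    # Largest in-sample disagreement between logit and probit probabilities.
    p_logit = logit_res.predict()
    p_probit = probit_res.predict()
    print(np.max(np.abs(p_logit - p_probit)))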

Multinomial Logit

Load data from the American National Election Studies:

In [25]: anes_data = sm.datasets.anes96.load()

In [26]: anes_exog = anes_data.exog

In [27]: anes_exog = sm.add_constant(anes_exog, prepend=False)

Inspect the data:

In [28]: print(anes_data.exog[:5, :])
[[ -2.3026   7.      36.       3.       1.    ]
 [  5.2476   3.      20.       4.       1.    ]
 [  3.4372   2.      24.       6.       1.    ]
 [  4.42     3.      28.       6.       1.    ]
 [  6.4616   5.      68.       6.       1.    ]]

In [29]: print(anes_data.endog[:5])
[ 6.  1.  1.  1.  0.]
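
The dependent variable is a seven-point party identification scale, coded 0 (strong Democrat) through 6 (strong Republican). A quick check of the categories:

    # The outcome takes seven discrete values, 0 through 6.
    print(np.unique(anes_data.endog))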

Fit the MNL model:

In [30]: mlogit_mod = sm.MNLogit(anes_data.endog, anes_exog)

In [31]: mlogit_res = mlogit_mod.fit()
Optimization terminated successfully.
         Current function value: 1.548647
         Iterations 7

In [32]: print(mlogit_res.params)
[[ -0.0115  -0.0888  -0.106   -0.0916  -0.0933  -0.1409]
 [  0.2977   0.3917   0.5735   1.2788   1.347    2.0701]
 [ -0.0249  -0.0229  -0.0149  -0.0087  -0.0179  -0.0094]
 [  0.0825   0.181   -0.0072   0.1998   0.2169   0.3219]
 [  0.0052   0.0479   0.0576   0.0845   0.081    0.1089]
 [ -0.3734  -2.2509  -3.6656  -7.6138  -7.0605 -12.1058]]
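
MNLogit normalizes the coefficients of the base outcome (y=0) to zero, so params holds one column for each of the remaining J - 1 = 6 outcomes and one row per regressor (including the constant):

    # Expect (K regressors, J - 1 outcomes) = (6, 6).
    print(mlogit_res.params.shape)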

In [33]: mlogit_margeff = mlogit_res.get_margeff()

In [34]: print(mlogit_margeff.summary())
       MNLogit Marginal Effects      
=====================================
Dep. Variable:                      y
Method:                          dydx
At:                           overall
==============================================================================
       y=0      dy/dx    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
x1             0.0087      0.004      2.250      0.024         0.001     0.016
x2            -0.0978      0.008    -12.153      0.000        -0.114    -0.082
x3             0.0027      0.001      3.856      0.000         0.001     0.004
x4            -0.0199      0.008     -2.420      0.016        -0.036    -0.004
x5            -0.0060      0.002     -2.977      0.003        -0.010    -0.002
------------------------------------------------------------------------------
       y=1      dy/dx    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
x1             0.0070      0.004      1.791      0.073        -0.001     0.015
x2            -0.0502      0.007     -6.824      0.000        -0.065    -0.036
x3            -0.0021      0.001     -2.789      0.005        -0.004    -0.001
x4            -0.0054      0.008     -0.636      0.525        -0.022     0.011
x5            -0.0055      0.002     -2.707      0.007        -0.010    -0.002
------------------------------------------------------------------------------
       y=2      dy/dx    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
x1            -0.0039      0.003     -1.246      0.213        -0.010     0.002
x2            -0.0282      0.006     -4.972      0.000        -0.039    -0.017
x3            -0.0010      0.001     -1.523      0.128        -0.002     0.000
x4             0.0066      0.007      0.964      0.335        -0.007     0.020
x5             0.0010      0.002      0.530      0.596        -0.003     0.005
------------------------------------------------------------------------------
       y=3      dy/dx    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
x1            -0.0018      0.002     -0.940      0.347        -0.006     0.002
x2            -0.0057      0.003     -1.798      0.072        -0.012     0.001
x3         -4.249e-05      0.000     -0.110      0.912        -0.001     0.001
x4            -0.0055      0.004     -1.253      0.210        -0.014     0.003
x5             0.0005      0.001      0.469      0.639        -0.002     0.003
------------------------------------------------------------------------------
       y=4      dy/dx    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
x1            -0.0010      0.003     -0.330      0.741        -0.007     0.005
x2             0.0199      0.005      3.672      0.000         0.009     0.030
x3             0.0005      0.001      0.815      0.415        -0.001     0.002
x4             0.0017      0.006      0.268      0.789        -0.011     0.014
x5             0.0021      0.002      1.119      0.263        -0.002     0.006
------------------------------------------------------------------------------
       y=5      dy/dx    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
x1            -0.0015      0.004     -0.429      0.668        -0.009     0.005
x2             0.0376      0.007      5.404      0.000         0.024     0.051
x3            -0.0007      0.001     -0.949      0.343        -0.002     0.001
x4             0.0047      0.008      0.604      0.546        -0.011     0.020
x5             0.0025      0.002      1.140      0.254        -0.002     0.007
------------------------------------------------------------------------------
       y=6      dy/dx    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
x1            -0.0074      0.003     -2.205      0.027        -0.014    -0.001
x2             0.1246      0.008     14.875      0.000         0.108     0.141
x3             0.0006      0.001      0.942      0.346        -0.001     0.002
x4             0.0177      0.007      2.403      0.016         0.003     0.032
x5             0.0054      0.002      2.490      0.013         0.001     0.010
==============================================================================
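
Because the outcome probabilities sum to one, the overall marginal effects of each regressor must sum to (numerically) zero across the seven categories. A sketch of that sanity check, assuming the margins object exposes its raw effects as a margeff array with one column per outcome:

    # Each regressor's effects across all outcomes should sum to ~0.
    print(mlogit_margeff.margeff.sum(axis=1))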

L1 regularized Multinomial Logit

The regularization parameter alpha should be a scalar or have the same shape as results.params.

In [35]: alpha = 10 * np.ones((mlogit_mod.K, mlogit_mod.J - 1))

Choose not to regularize the constant:

In [36]: alpha[-1, :] = 0

In [37]: mlogit_mod2 = sm.MNLogit(anes_data.endog, anes_exog)

In [38]: mlogit_l1_res = mlogit_mod2.fit_regularized(method='l1', alpha=alpha)
Optimization terminated successfully.    (Exit mode 0)
            Current function value: 1.61249508023
            Iterations: 122
            Function evaluations: 144
            Gradient evaluations: 122

In [39]: print(mlogit_l1_res.summary())
                          MNLogit Regression Results                          
==============================================================================
Dep. Variable:                      y   No. Observations:                  944
Model:                        MNLogit   Df Residuals:                      912
Method:                           MLE   Df Model:                           26
Date:                Fri, 16 Aug 2013   Pseudo R-squ.:                  0.1558
Time:                        07:59:46   Log-Likelihood:                -1477.7
converged:                       True   LL-Null:                       -1750.3
                                        LLR p-value:                 1.464e-98
==============================================================================
       y=1       coef    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
x1             0.0010      0.033      0.030      0.976        -0.065     0.067
x2                  0        nan        nan        nan           nan       nan
x3            -0.0218      0.006     -3.494      0.000        -0.034    -0.010
x4                  0        nan        nan        nan           nan       nan
x5                  0        nan        nan        nan           nan       nan
const          0.9099      0.328      2.774      0.006         0.267     1.553
------------------------------------------------------------------------------
       y=2       coef    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
x1            -0.0586      0.038     -1.529      0.126        -0.134     0.017
x2             0.0319      0.090      0.354      0.724        -0.145     0.208
x3            -0.0199      0.008     -2.577      0.010        -0.035    -0.005
x4             0.0331      0.075      0.441      0.660        -0.114     0.180
x5             0.0445      0.020      2.197      0.028         0.005     0.084
const         -0.5008      0.686     -0.730      0.466        -1.846     0.845
------------------------------------------------------------------------------
       y=3       coef    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
x1            -0.0615      0.057     -1.087      0.277        -0.172     0.049
x2             0.1205      0.139      0.868      0.385        -0.151     0.392
x3            -0.0081      0.011     -0.741      0.459        -0.029     0.013
x4                  0        nan        nan        nan           nan       nan
x5             0.0325      0.029      1.108      0.268        -0.025     0.090
const         -2.0829      0.929     -2.242      0.025        -3.904    -0.262
------------------------------------------------------------------------------
       y=4       coef    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
x1            -0.0477      0.042     -1.144      0.252        -0.129     0.034
x2             0.8321      0.105      7.918      0.000         0.626     1.038
x3            -0.0049      0.008     -0.613      0.540        -0.020     0.011
x4             0.0236      0.082      0.288      0.774        -0.137     0.185
x5             0.0766      0.024      3.233      0.001         0.030     0.123
const         -5.2613      0.842     -6.246      0.000        -6.912    -3.610
------------------------------------------------------------------------------
       y=5       coef    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
x1            -0.0522      0.037     -1.413      0.158        -0.125     0.020
x2             0.9233      0.092     10.012      0.000         0.743     1.104
x3            -0.0140      0.007     -1.972      0.049        -0.028  -8.4e-05
x4             0.0549      0.071      0.769      0.442        -0.085     0.195
x5             0.0727      0.020      3.618      0.000         0.033     0.112
const         -4.8678      0.727     -6.693      0.000        -6.293    -3.442
------------------------------------------------------------------------------
       y=6       coef    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
x1            -0.0952      0.039     -2.442      0.015        -0.172    -0.019
x2             1.5681      0.116     13.472      0.000         1.340     1.796
x3            -0.0056      0.007     -0.752      0.452        -0.020     0.009
x4             0.1466      0.076      1.925      0.054        -0.003     0.296
x5             0.0968      0.022      4.376      0.000         0.053     0.140
const         -9.3154      0.913    -10.207      0.000       -11.104    -7.527
==============================================================================

Poisson model

Load the RAND data. Note that this example is similar to Cameron and Trivedi’s Microeconometrics Table 20.5, but it is slightly different because of minor changes in the data.

In [40]: rand_data = sm.datasets.randhie.load()

In [41]: rand_exog = rand_data.exog.view(float).reshape(len(rand_data.exog), -1)

In [42]: rand_exog = sm.add_constant(rand_exog, prepend=False)

Fit Poisson model:

In [43]: poisson_mod = sm.Poisson(rand_data.endog, rand_exog)

In [44]: poisson_res = poisson_mod.fit(method="newton")
Optimization terminated successfully.
         Current function value: 3.091609
         Iterations 12

In [45]: print(poisson_res.summary())
                          Poisson Regression Results                          
==============================================================================
Dep. Variable:                      y   No. Observations:                20190
Model:                        Poisson   Df Residuals:                    20180
Method:                           MLE   Df Model:                            9
Date:                Fri, 16 Aug 2013   Pseudo R-squ.:                 0.06343
Time:                        07:59:47   Log-Likelihood:                -62420.
converged:                       True   LL-Null:                       -66647.
                                        LLR p-value:                     0.000
==============================================================================
                 coef    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
x1            -0.0525      0.003    -18.216      0.000        -0.058    -0.047
x2            -0.2471      0.011    -23.272      0.000        -0.268    -0.226
x3             0.0353      0.002     19.302      0.000         0.032     0.039
x4            -0.0346      0.002    -21.439      0.000        -0.038    -0.031
x5             0.2717      0.012     22.200      0.000         0.248     0.296
x6             0.0339      0.001     60.098      0.000         0.033     0.035
x7            -0.0126      0.009     -1.366      0.172        -0.031     0.005
x8             0.0541      0.015      3.531      0.000         0.024     0.084
x9             0.2061      0.026      7.843      0.000         0.155     0.258
const          0.7004      0.011     62.741      0.000         0.678     0.722
==============================================================================

In [46]: poisson_margeff = poisson_res.get_margeff()

In [47]: print(poisson_margeff.summary())
       Poisson Marginal Effects      
=====================================
Dep. Variable:                      y
Method:                          dydx
At:                           overall
==============================================================================
                dy/dx    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
x1            -0.1503      0.008    -18.164      0.000        -0.166    -0.134
x2            -0.7068      0.031    -23.164      0.000        -0.767    -0.647
x3             0.1009      0.005     19.240      0.000         0.091     0.111
x4            -0.0989      0.005    -21.354      0.000        -0.108    -0.090
x5             0.7772      0.035     22.106      0.000         0.708     0.846
x6             0.0971      0.002     58.303      0.000         0.094     0.100
x7            -0.0361      0.026     -1.366      0.172        -0.088     0.016
x8             0.1546      0.044      3.530      0.000         0.069     0.240
x9             0.5896      0.075      7.839      0.000         0.442     0.737
==============================================================================
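
For the Poisson model with its log link, the overall (average) marginal effect has a closed form: it is the coefficient times the mean of the fitted values. The dy/dx column above can therefore be reproduced directly:

    # AME_k = beta_k * mean(mu_i) under the Poisson log link.
    mu_bar = poisson_res.predict().mean()
    print(poisson_res.params[:-1] * mu_bar)  # matches the dy/dx column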

L1 regularized Poisson model

In [48]: poisson_mod2 = sm.Poisson(rand_data.endog, rand_exog)

In [49]: alpha = 0.1 * len(rand_data.endog) * np.ones(rand_exog.shape[1])

In [50]: alpha[-1] = 0

In [51]: poisson_l1_res = poisson_mod2.fit_regularized(method='l1', alpha=alpha)
Optimization terminated successfully.    (Exit mode 0)
            Current function value: 3.13647450061
            Iterations: 19
            Function evaluations: 25
            Gradient evaluations: 19
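
As in the logit example, the l1 penalty can shrink some coefficients exactly to zero; inspecting the penalized estimates shows which ones survived (sketch):

    # Display the penalized estimates and count exact zeros among the slopes.
    print(poisson_l1_res.params)
    print((np.abs(poisson_l1_res.params[:-1]) < 1e-8).sum(), 'coefficients at zero')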

Negative binomial model

The negative binomial model gives slightly different results:

In [52]: mod_nbin = sm.NegativeBinomial(rand_data.endog, rand_exog)

In [53]: res_nbin = mod_nbin.fit(disp=False)

In [54]: print(res_nbin.summary())
                     NegativeBinomial Regression Results                      
==============================================================================
Dep. Variable:                      y   No. Observations:                20190
Model:               NegativeBinomial   Df Residuals:                    20180
Method:                           MLE   Df Model:                            9
Date:                Fri, 16 Aug 2013   Pseudo R-squ.:                 0.01845
Time:                        07:59:49   Log-Likelihood:                -43384.
converged:                      False   LL-Null:                       -44199.
                                        LLR p-value:                     0.000
==============================================================================
                 coef    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
x1            -0.0580      0.006     -9.517      0.000        -0.070    -0.046
x2            -0.2678      0.023    -11.802      0.000        -0.312    -0.223
x3             0.0412      0.004      9.937      0.000         0.033     0.049
x4            -0.0381      0.003    -11.219      0.000        -0.045    -0.031
x5             0.2690      0.030      8.982      0.000         0.210     0.328
x6             0.0382      0.001     26.081      0.000         0.035     0.041
x7            -0.0441      0.020     -2.200      0.028        -0.083    -0.005
x8             0.0172      0.036      0.477      0.633        -0.054     0.088
x9             0.1780      0.074      2.397      0.017         0.032     0.324
const          0.6636      0.025     26.788      0.000         0.615     0.712
alpha          1.2930      0.019     69.477      0.000         1.256     1.329
==============================================================================
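
The alpha row is the estimated dispersion parameter of the negative binomial distribution. Note also the converged: False flag in the header: the default optimizer stopped before meeting its criterion, so allowing more iterations or switching solvers is worth trying (a sketch; available options depend on the statsmodels version):

    # Retry with a quasi-Newton solver and a higher iteration cap.
    res_nbin2 = mod_nbin.fit(method='bfgs', maxiter=1000, disp=False)
    print(res_nbin2.mle_retvals['converged'])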

Alternative solvers

The default method for fitting discrete data MLE models is Newton-Raphson. You can use other solvers via the method argument:

In [55]: mlogit_res = mlogit_mod.fit(method='bfgs', maxiter=500)
Optimization terminated successfully.
         Current function value: 1.548647
         Iterations: 108
         Function evaluations: 119
         Gradient evaluations: 119
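
Other solvers from scipy.optimize are exposed through the same interface; 'nm' (Nelder-Mead), 'bfgs', 'cg', 'ncg', and 'powell' are among the accepted values (sketch, reusing the logit model from above):

    # Fit the logit model with Nelder-Mead instead of Newton-Raphson.
    logit_res_nm = logit_mod.fit(method='nm', maxiter=500)
    print(logit_res_nm.params)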