
Bayesian Models

Handbook of Market Research

Abstract

Bayesian models have become a mainstay in the tool set for marketing research in academia and industry practice. In this chapter, I discuss the advantages the Bayesian approach offers to researchers in marketing, the essential building blocks of a Bayesian model, Bayesian model comparison, and useful algorithmic approaches to fully Bayesian estimation. I show how to achieve feasible Bayesian inference to support marketing decisions under uncertainty using the Gibbs sampler and the Metropolis-Hastings algorithm, and point to more recent developments – specifically the no-U-turn implementation of Hamiltonian Monte Carlo sampling available in Stan. The emphasis is on developing an appreciation of Bayesian inference techniques, supported by references to implementations in the open-source software R, rather than on the discussion of individual models. The goal is to encourage researchers to formulate new, more complete, and useful prior structures that can be updated with data for better marketing decision support.



Acknowledgments

I would like to thank Anocha Aribarg, Albert Bemmaor, Joachim Büschken, Arash Laghaie, anonymous reviewers, the editors, and participants in my class on “Bayesian Modeling for Marketing” for helpful comments and feedback. All remaining errors are obviously mine.

Author information


Corresponding author

Correspondence to Thomas Otter.


Appendix

MCMC for Binomial Probit Without Data Augmentation

Simulate data, call MCMC routine, plot MCMC traces. This R-script sources a random-walk Metropolis-Hastings (RW-MH) sampler for the binomial probit model (see the following script), simulates probit data, and runs the sampler with different step sizes (standard deviations of ϵ).

# may need to install these packages first
library(bayesm)
library(latex2exp)

# needs to be in R's working directory
source('rbprobitRWMetropolis.r')

# function to simulate from binary probit
simbprobit = function(X, beta) {
  y = ifelse((X %*% beta + rnorm(nrow(X))) < 0, 0, 1)
  list(X = X, y = y, beta = beta)
}

nobs = 500                                  # number of simulated observations
X = cbind(rep(1, nobs), runif(nobs), runif(nobs))
beta = c(-3, 2, 4)                          # data generating parameters
nvar = ncol(X)
simout = simbprobit(X, beta)                # probit responses
y = simout$y

R = 200000                                  # length of MCMC sample
# data list to be passed to MCMC routine
Data = list(X = simout$X, y = simout$y)
Mcmc = list(R = R, keep = 1)
# prior mean set to zero, prior variances set to 100
Prior = list(betabar = double(nvar), A = diag(rep(.01, nvar)))

out_1 = rbprobitRWMetropolis(Data = Data, Mcmc = Mcmc, Prior = Prior, stepsize = .001)
out_2 = rbprobitRWMetropolis(Data = Data, Mcmc = Mcmc, Prior = Prior, stepsize = .005)
out_3 = rbprobitRWMetropolis(Data = Data, Mcmc = Mcmc, Prior = Prior, stepsize = .8)
out_4 = rbprobitRWMetropolis(Data = Data, Mcmc = Mcmc, Prior = Prior, stepsize = 3)

windows()   # opens a graphics device on Windows; use x11() or quartz() on other systems
par(mfrow = c(2, 2))
matplot(out_1$betadraw, type = 'l', xlab = '', ylab = '',
        main = TeX('$\\epsilon$-standard deviation = .001')); grid()
matplot(out_2$betadraw, type = 'l', xlab = '', ylab = '',
        main = TeX('$\\epsilon$-standard deviation = .005')); grid()
matplot(out_3$betadraw, type = 'l', xlab = '', ylab = '',
        main = TeX('$\\epsilon$-standard deviation = .8')); grid()
matplot(out_4$betadraw, type = 'l', xlab = '', ylab = '',
        main = TeX('$\\epsilon$-standard deviation = 3')); grid()
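
A quick check of the four runs – a minimal sketch that only uses objects created by the script above – compares acceptance rates across step sizes and computes posterior means from one chain after discarding a burn-in. Very small step sizes accept almost every proposal but move slowly through the posterior; very large step sizes are rarely accepted.

# acceptance rates for step sizes .001, .005, .8, and 3
sapply(list(out_1, out_2, out_3, out_4), function(o) o$rateaccept)
# posterior means from the run with step size .8, discarding the first half as burn-in
colMeans(out_3$betadraw[-(1:(R/2)), ])   # should be close to beta = c(-3, 2, 4)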

MCMC function. The following function implements a simple RW-MH-sampler for the binomial probit model coupled with a multivariate normal prior. All regression parameters are updated simultaneously in one MH-step.

rbprobitRWMetropolis <- function(Data, Prior, Mcmc, stepsize) {
  require(bayesm)   # for lndMvn, which evaluates the log-density of a
                    # multivariate normal distribution
  y = Data$y
  X = Data$X
  nvar = ncol(X)
  nobs = length(y)
  betabar = Prior$betabar
  A = Prior$A
  R = Mcmc$R
  keep = Mcmc$keep

  betadraw = matrix(double(floor(R / keep) * nvar), ncol = nvar)
  loglike = double(floor(R / keep))
  beta = c(rep(0, nvar))
  priorcov = chol2inv(chol(A))
  rootp = chol(priorcov)
  rootpi = backsolve(rootp, diag(nvar))

  # initialize log-likelihood at starting value
  oldloglike = sum(pnorm(0, -(X %*% beta)[as.logical(y)], 1, log.p = TRUE)) +
               sum(pnorm(0, (X %*% beta)[!as.logical(y)], 1, log.p = TRUE))
  # compute non-normalized log-posterior at starting value
  oldlpost = oldloglike + lndMvn(beta, betabar, rootpi)
  naccept = 0

  for (rep in 1:R) {
    betac = beta + rnorm(nvar) * stepsize   # random walk proposal
    # compute probit log-likelihood at proposed value
    cloglike = sum(pnorm(0, -(X %*% betac)[as.logical(y)], 1, log.p = TRUE)) +
               sum(pnorm(0, (X %*% betac)[!as.logical(y)], 1, log.p = TRUE))
    # compute non-normalized log-posterior at proposed value
    clpost = cloglike + lndMvn(betac, betabar, rootpi)
    # compute log-ratio of non-normalized posterior at proposed and old value
    ldiff = clpost - oldlpost
    alpha = min(1, exp(ldiff))   # acceptance probability
    if (alpha < 1) {
      unif = runif(1)
    } else {
      unif = 0
    }
    if (unif <= alpha) {
      beta = betac
      oldloglike = cloglike
      oldlpost = clpost
      naccept = naccept + 1
    }
    if (rep %% keep == 0) {
      mkeep = rep / keep
      betadraw[mkeep, ] = beta
      loglike[mkeep] = oldloglike
    }
  }
  # betadraw is the matrix containing draws from the posterior
  # rateaccept is the relative frequency of accepting proposed moves from beta to betac
  # loglike is the log-likelihood evaluated at the current MCMC state (beta)
  return(list(betadraw = betadraw, mkeep = mkeep,
              rateaccept = naccept / R, loglike = loglike))
}
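
The accept-reject step in this function implements the standard random-walk Metropolis-Hastings acceptance probability; because the normal proposal is symmetric, the proposal densities cancel and only the ratio of non-normalized posteriors remains. In the notation of the code,

\alpha(\beta, \beta^{c}) = \min\left\{1, \frac{p(y \mid \beta^{c})\, p(\beta^{c})}{p(y \mid \beta)\, p(\beta)}\right\} = \min\{1, \exp(\mathtt{clpost} - \mathtt{oldlpost})\}.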

Stan probit definition file. This file, saved as StanProbit.stan and called by the R-script further below, defines a binomial probit model with a multivariate normal prior in Stan. According to the model, the data are independently Bernoulli distributed with success probabilities implied by the probit link, the parameters, and the covariates.
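
In equation form (matching the Stan code below, which uses a prior standard deviation of 100), the model is y_n ~ Bernoulli(Φ(x_n'β)) independently for n = 1, …, N, with β_k ~ N(0, 100²) independently for k = 1, …, K, where Φ denotes the standard normal cdf and x_n is the n-th row of the design matrix X.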

data {
  int N;                          // number of observations
  int K;                          // number of covariates
  int<lower=0, upper=1> y[N];     // binary outcome information
  matrix[N, K] X;                 // design matrix
}
parameters {
  vector[K] beta;                 // beta coefficients
}
model {
  vector[N] mu;
  beta ~ normal(0, 100);
  mu = X * beta;
  for (n in 1:N)
    mu[n] = Phi(mu[n]);
  y ~ bernoulli(mu);
}

Calling Stan from R to estimate a binomial probit model. This R-script calls Stan to sample from the posterior of the binomial probit model coupled with a multivariate normal prior defined in the file above.

# may need to install the rstan package first
require(rstan)   # load the rstan package

# see scripts above for the nobs, nvar, and simout objects
prob_data = list(N = nobs, K = nvar, X = simout$X, y = as.vector(simout$y))

rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())

stanfit_probit = stan(file = "StanProbit.stan", data = prob_data,
                      pars = c("beta"), chains = 1,
                      iter = 600000, warmup = 1000)

# Make draws available for posterior analysis in R
out_StanProbit = extract(stanfit_probit)
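
As a short follow-up sketch (using only objects created above; rstan's extract() returns out_StanProbit$beta as a draws-by-K matrix), the posterior draws can be summarized and compared with a standard maximum likelihood probit fit:

# posterior means and standard deviations of the probit coefficients
colMeans(out_StanProbit$beta)
apply(out_StanProbit$beta, 2, sd)
# maximum likelihood probit fit on the same simulated data, as a sanity check
summary(glm(simout$y ~ simout$X - 1, family = binomial(link = "probit")))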

HB-Logit Example

This code generates MNL data from a hierarchical model, estimates an HB-logit model, and compares selected individual-level posteriors to the corresponding maximum likelihood estimates.
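
The script below relies on a multinomial logit log-likelihood function llMNL when computing maximum likelihood estimates via optim(); its definition is not reproduced in this appendix. The following is a minimal sketch consistent with the call optim(par = ..., fn = llMNL, y = ..., X = ..., p = p, control = list(fnscale = -1)), not necessarily the chapter's own implementation:

# multinomial logit log-likelihood (sketch)
# beta: coefficient vector; y: chosen alternatives (1..p) per choice set;
# X: (p*T) x length(beta) model matrix, rows grouped by choice set; p: alternatives per set
llMNL = function(beta, y, X, p) {
  Xbeta = matrix(X %*% beta, nrow = p)            # column t holds the p utilities of choice set t
  logdenom = log(colSums(exp(Xbeta)))             # log of the MNL denominator per choice set
  sum(Xbeta[cbind(y, seq_along(y))] - logdenom)   # sum of log choice probabilities
}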

genXy <- function(betai, p, T) {
  ## generate multinomial logit choices
  # alternative specific constants
  # ... this assumes p = 3 (two inside brands, one outside choice)
  X = kronecker(rep(1, T), matrix(c(1, 0, 0, 0, 1, 0), ncol = (length(betai) - 1)))
  # add the continuous covariate
  X = cbind(X, runif(T * p))
  index = seq(p, p * T, p)
  X[index, ] = 0   # outside good
  Xbeta = t(matrix(X %*% betai, nrow = p))
  index = cbind(1:T, max.col(Xbeta))
  maxl = Xbeta[index]
  logsumel = log(rowSums(exp(Xbeta - maxl))) + maxl
  logprob = matrix(Xbeta - logsumel, nrow = T)
  y = double(T)
  for (t in 1:T) {
    y[t] = sum(cumsum(exp(logprob[t, ])) < runif(1)) + 1   ## draw from the CDF of probs
  }
  return(list(y = y, X = X))
}

p = 3      # number of alternatives in each choice set
T = 5      # number of repeated measurements, i.e., choice sets or choices

# generate panel data for MCMC analysis
N = 2000   # number of individuals in the panel
# population mean preference
betap = c(.3, -2, -1)
# variance-covariance of preferences in the population
Vbeta = matrix(c(3, -2.99, 0, -2.99, 3, 0, 0, 0, .1), ncol = 3)
# just for demonstration, to make sure we all get the same data and results
set.seed(66)
# draw individual specific preferences from MVNormal distribution
betai = betap + t(chol(Vbeta)) %*% matrix(rnorm(N * length(betap)), ncol = N)

lgtdata <- vector("list", N)
T = 5   # number of choices per individual
betaMLE = betai
betaMLE[, ] = 0

for (i in 1:N) {
  outgen = genXy(betai[, i], p, T)
  # For Bayesian analysis using rhierMnlRwMixture you need to organize your data
  # in list format as in the command line below:
  # y :: vector of choice outcomes of length T (or T_i in case different panel
  #      units provide different numbers of choices)
  # X :: a (p*T) rows x length(beta[,i]) columns model matrix;
  #      the first (second) p rows correspond to the first (second) choice set, and so on.
  #      Each alternative is represented by one row in X.
  # The numbers in y point to which 'row' was chosen from a particular choice set
  lgtdata[[i]] = list(y = outgen[[1]], X = outgen[[2]])
  out = optim(par = betai[, i], fn = llMNL, gr = NULL,
              y = outgen[[1]], X = outgen[[2]], p = p,
              hessian = FALSE, control = list(fnscale = -1))
  betaMLE[, i] = out$par   # collect MLE estimates
}

# load the bayesm package into the workspace
# (if this gives you an error, you need to install the package first)
library(bayesm)

# run the Bayesian hierarchical model
outMCMC = rhierMnlRwMixture(Data = list(p = p, lgtdata = lgtdata),
                            Prior = list(ncomp = 1),
                            Mcmc = list(R = 100000, keep = 10))

# posterior of individual specific coefficients
betaimc = outMCMC$betadraw
index = 1001:10000

# may need to install this first
library(latex2exp)

M = c(3, 99, 2000)   # plot betai posterior for consumers in M

jpeg(filename = "ILposteriors880.jpg", quality = 100, width = 880, height = 480)
# windows()
par(mfcol = c(length(betap), length(M) * 2))
for (i in M) {
  plot(density(betaimc[i, 1, index]), xlab = TeX('$\\beta_{A}$'), ylab = "",
       main = paste("panel-unit", i))
  abline(v = betai[1, i], col = 'green', lwd = 5, lty = 1)
  abline(v = betaMLE[1, i], col = 'red', lwd = 5, lty = 2)
  plot(density(betaimc[i, 2, index]), xlab = TeX('$\\beta_{B}$'), ylab = "", main = "")
  abline(v = betai[2, i], col = 'green', lwd = 5, lty = 1)
  abline(v = betaMLE[2, i], col = 'red', lwd = 5, lty = 2)
  plot(density(betaimc[i, 3, index]), xlab = TeX('$\\beta$'), ylab = "", main = "")
  abline(v = betai[3, i], col = 'green', lwd = 5, lty = 1)
  abline(v = betaMLE[3, i], col = 'red', lwd = 5, lty = 2)
  plot(betaimc[i, 1, index], type = 'l', xlab = "", ylab = TeX('$\\beta_{A}$'),
       main = paste("MLE:", round(betaMLE[1, i])))
  abline(h = betai[1, i], col = 'green', lwd = 5, lty = 1)
  abline(h = betaMLE[1, i], col = 'red', lwd = 5, lty = 2)
  plot(betaimc[i, 2, index], type = 'l', xlab = "", ylab = TeX('$\\beta_{B}$'),
       main = paste("MLE:", round(betaMLE[2, i])))
  abline(h = betai[2, i], col = 'green', lwd = 5, lty = 1)
  abline(h = betaMLE[2, i], col = 'red', lwd = 5, lty = 2)
  plot(betaimc[i, 3, index], type = 'l', xlab = "", ylab = TeX('$\\beta$'),
       main = paste("MLE:", round(betaMLE[3, i])))
  abline(h = betai[3, i], col = 'green', lwd = 5, lty = 1)
  abline(h = betaMLE[3, i], col = 'red', lwd = 5, lty = 2)
}
dev.off()   # close the jpeg device


Copyright information

© 2019 Springer Nature Switzerland AG

About this entry


Cite this entry

Otter, T. (2019). Bayesian Models. In: Homburg, C., Klarmann, M., Vomberg, A. (eds) Handbook of Market Research. Springer, Cham. https://doi.org/10.1007/978-3-319-05542-8_24-1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-05542-8

  • Online ISBN: 978-3-319-05542-8

