Abstract
Research and practice have called for the incorporation of customer mindset metrics (CMMs) to improve the accuracy of models that predict individual customer profits. However, because CMMs are self-reported data collected through customer surveys, they are seldom available for a firm’s entire customer database and are always measured with some degree of error. Their use in models for individual-level predictions of customer profit has therefore proven challenging. We offer a solution through a new method called multiple overimputation (MO). MO treats missing data as an extreme form of measurement error and imputes CMMs both for customers whose observed values contain measurement error and for customers whose values are missing. The imputed CMMs are then included as predictors in a model of individual customer profits. Through a simulation study, an empirical application in the pharmaceutical industry, and a customer selection exercise, we demonstrate the predictive and economic value of applying MO in the context of CRM.
Notes
1. Note that the estimation sample in MO does not have to be restricted to customers with observed CMMs, because all CMMs are overimputed.
2. For confidentiality reasons, we cannot reveal any further information about the drug category or the pharmaceutical firm.
3. This practice is common in the pharmaceutical industry, although it would be ideal to survey physicians at random points in time. The firm used these surveys to inform its salesforce evaluation and training, but not to determine sales calls levels for individual customers.
4. Please note that there are no standard items to measure attitudinal CMMs in the literature. In general, related studies measure for instance customers’ product- or service-related satisfaction (e.g., Bowman and Narayandas 2004, Cooil et al. 2007) or performance perceptions (e.g., Petersen et al. 2018) which should also be appropriate in our study of pharmaceutical sales.
5. For better readability, we use “CMMs” to mean “relative CMMs” throughout.
6. Our modeling framework similarly applies to predictions of customer lifetime value (CLV) by extending the projection window to three years (Venkatesan and Kumar 2004).
7. We also evaluated a regular Poisson model, but found the ZIP model to provide better model fit and predictive accuracy.
8. Predicted sales are obtained by first predicting a customer’s retention status and then sales conditional on retention. The MAD of predicted and observed sales therefore evaluates the accuracy of both the sales and retention models.
9. Please note that these metrics do not apply to VAR models.
10. MO is therefore an effective alternative to minimize the threat of the mere measurement effect, because it does not require firms to reach out to a broad sample of customers. As such, it reduces the chances of over-estimating the effects of sales calls that are actually attributable to the mere measurement of CMMs.
11. Although we have no definite information about the firm’s actual customer selection process, it was not based on CMM information and therefore likely similar to, or even less effective than, Model 1.
12. In a simulation study, our model specification and estimation algorithm satisfactorily recovered the true parameters.
13. In the rest of the manuscript, CMMs therefore refer to customer i’s prior CMMs.
14. Although, in general, customers’ CMMs as well as their spending behavior can vary over time, due to the nature of our data and similar to Petersen et al. (2018), we treat these variables as time-invariant and compute their average value during period 2, prior to making predictions in period 3.
15. We repeated the estimation by varying the specification of the initialization time period 1. The substantive results remained unchanged.
References
Aaker, D.A., & Jacobson, R. (1994). The financial information content of perceived quality. Journal of Marketing Research, 31(2), 191–201.
Abe, M. (2009). Counting your customers one by one: a hierarchical Bayes extension to the Pareto/NBD model. Marketing Science, 28(3), 541–553.
Adigüzel, F., & Wedel, M. (2008). Split questionnaire design for massive surveys. Journal of Marketing Research, 45(5), 608–617.
Ahearne, M., Jelinek, R., Jones, E. (2007). Examining the effect of salesperson service behavior in a competitive context. Journal of the Academy of Marketing Science, 35(4), 603–616.
Aksoy, L., Cooil, B., Groening, C., Keiningham, T.L. (2008). The long-term stock market valuation of customer satisfaction. Journal of Marketing, 72(4), 105–122.
Allenby, G.M., & Ginter, J.L. (1995). Using extremes to design products and segment markets. Journal of Marketing Research, 32(4), 392–403.
Alwin, D.F., & Krosnick, J.A. (1991). The reliability of survey attitude measurement: the influence of question and respondent attributes. Sociological Methods & Research, 20(1), 139–181.
Anderson, E.W., Fornell, C., Mazvancheryl, S.K. (2004). Customer satisfaction and shareholder value. Journal of Marketing, 68(4), 172–185.
Arora, N. (2006). Estimating joint preference: a sub-sampling approach. International Journal of Research in Marketing, 23(4), 409–418.
Bijmolt, T.H., Leeflang, P.S., Block, F., Eisenbeiss, M., Hardie, B.G., Lemmens, A., Saffert, P. (2010). Analytics for customer engagement. Journal of Service Research, 13(3), 341–356.
Blackwell, M., Honaker, J., King, G. (2017). A unified approach to measurement error and missing data: overview and applications. Sociological Methods & Research, 46(3), 303–341.
Bolton, R.N. (1998). A dynamic model of the duration of the customer’s relationship with a continuous service provider: the role of satisfaction. Marketing Science, 17(1), 45–65.
Bolton, R.N., Kannan, P.K., Bramlett, M.D. (2000). Implications of loyalty program membership and service experiences for customer retention and value. Journal of the Academy of Marketing Science, 28(1), 95–108.
Bolton, R.N., & Lemon, K.N. (1999). A dynamic model of customers’ usage of services: usage as an antecedent and consequence of satisfaction. Journal of Marketing Research :171–186.
Bolton, R.N., Lemon, K.N., Verhoef, P.C. (2004). The theoretical underpinnings of customer asset management: a framework and propositions for future research. Journal of the Academy of Marketing Science, 32(3), 271–292.
Bowman, D., & Narayandas, D. (2004). Linking customer management effort to customer profitability in business markets. Journal of Marketing Research, 41(4), 433–447.
Bradlow, E.T., Hu, Y., Ho, T. -H. (2004). A learning-based model for imputing missing levels in partial conjoint profiles. Journal of Marketing Research, 41(4), 369–381.
Brown, B., Kanagasabai, K., Serpa Pinto, G. (2017). Capturing value from your customer data. Retrieved December 1, 2018 from https://www.mckinsey.com/business-functions/mckinsey-analytics/our-insights/capturing-value-from-your-customer-data/.
Cooil, B., Keiningham, T.L., Aksoy, L., Hsu, M. (2007). A longitudinal analysis of customer satisfaction and share of wallet: investigating the moderating effect of customer characteristics. Journal of Marketing, 71(1), 67–83.
De Haan, E., Verhoef, P.C., Wiesel, T. (2015). The predictive ability of different customer feedback metrics for retention. International Journal of Research in Marketing, 32(2), 195–206.
Dong, X., Janakiraman, R., Xie, Y. (2014). The effect of survey participation on consumer behavior: the moderating role of marketing communication. Marketing Science, 33(4), 567–585.
Donkers, B., Verhoef, P.C., de Jong, M.G. (2007). Modeling CLV: a test of competing models in the insurance industry. Quantitative Marketing and Economics, 5(2), 163–190.
Du, R.Y., Kamakura, W.A., Mela, C.F. (2007). Size and share of customer wallet. Journal of Marketing, 71(2), 94–113.
Ebbes, P., Papies, D., Van Heerde, H.J. (2011). The sense and non-sense of holdout sample validation in the presence of endogeneity. Marketing Science, 30(6), 1115–1122.
European Commission. (2018). Data protection in the EU. Retrieved May 1, 2018 from https://ec.europa.eu/info/law/law-topic/data-protection/data-protection-eu_en/.
Fader, P.S., Hardie, B.G., Lee, K.L. (2005a). Counting your customers the easy way: an alternative to the Pareto/NBD model. Marketing Science, 24(2), 275–284.
Fader, P.S., Hardie, B.G., Lee, K.L. (2005b). RFM and CLV: using iso-value curves for customer base analysis. Journal of Marketing Research, 42(4), 415–430.
Fischer, M., & Albers, S. (2010). Patient-or physician-oriented marketing: what drives primary demand for prescription drugs? Journal of Marketing Research, 47(1), 103–121.
Fornell, C., & Larcker, D.F. (1981). Evaluating structural equation models with unobservable variables and measurement error. Journal of Marketing Research :39–50.
Fornell, C., Mithas, S., Morgeson III, F.V., Krishnan, M.S. (2006). Customer satisfaction and stock prices: high returns, low risk. Journal of Marketing, 70(1), 3–14.
Ghosh, S.K., Mukhopadhyay, P., Lu, J.-C.J. (2006). Bayesian analysis of zero-inflated regression models. Journal of Statistical Planning and Inference, 136(4), 1360–1375.
Gibbons, R.V., Landry, F.J., Blouch, D.L., Jones, D.L., Williams, F.K., Lucey, C.R., Kroenke, K. (1998). A comparison of physicians’ and patients’ attitudes toward pharmaceutical industry gifts. Journal of General Internal Medicine, 13(3), 151–154.
Gilula, Z., & McCulloch, R. (2013). Multi level categorical data fusion using partially fused data. Quantitative Marketing and Economics, 11(3), 353–377.
Gilula, Z., McCulloch, R.E., Rossi, P.E. (2006). A direct approach to data fusion. Journal of Marketing Research, 43(1), 73–83.
Gönül, F.F., Carter, F., Petrova, E., Srinivasan, K. (2001). Promotion of prescription drugs and its impact on physicians’ choice behavior. Journal of Marketing, 65(3), 79–90.
Granville, K. (2018). Facebook and Cambridge Analytica: what you need to know as fallout widens. Retrieved May 1, 2018 from https://www.nytimes.com/2018/03/19/technology/facebook-cambridge-analytica-explained.html/.
Greene, W.H. (2017). Econometric analysis. London: Pearson.
Gruca, T.S., & Rego, L.L. (2005). Customer satisfaction, cash flow, and shareholder value. Journal of Marketing, 69(3), 115–130.
Gupta, S., & Zeithaml, V. (2006). Customer metrics and their impact on financial performance. Marketing Science, 25(6), 718–739.
Gustafsson, A., Johnson, M.D., Roos, I. (2005). The effects of customer satisfaction, relationship commitment dimensions, and triggers on customer retention. Journal of Marketing, 69(4), 210–218.
Horsky, D., Misra, S., Nelson, P. (2006). Observed and unobserved preference heterogeneity in brand-choice models. Marketing Science, 25(4), 322–335.
Ittner, C.D., & Larcker, D.F. (1998). Are nonfinancial measures leading indicators of financial performance? an analysis of customer satisfaction. Journal of Accounting Research, 36, 1–35.
Johansson, J.K., Dimofte, C.V., Mazvancheryl, S.K. (2012). The performance of global brands in the 2008 financial crisis: a test of two brand value measures. International Journal of Research in Marketing, 29(3), 235–245.
Kamakura, W.A., & Wedel, M. (1997). Statistical data fusion for cross-tabulation. Journal of Marketing Research :485–498.
Kamakura, W.A., & Wedel, M. (2000). Factor analysis and missing data. Journal of Marketing Research, 37(4), 490–498.
Kamakura, W.A., & Wedel, M. (2003). List augmentation with model based multiple imputation: a case study using a mixed-outcome factor model. Statistica Neerlandica, 57(1), 46–57.
Kamakura, W.A., Wedel, M., De Rosa, F., Mazzon, J.A. (2003). Cross-selling through database marketing: a mixed data factor analyzer for data augmentation and prediction. International Journal of Research in Marketing, 20(1), 45–65.
Koperwas, A. (2015). Are you sharing the customer journey across your org? Retrieved May 1, 2018 from https://theblog.adobe.com/sharing-customer-journey-across-your-org/.
Kumar, V., Venkatesan, R., Bohling, T., Beckmann, D. (2008). Practice prize report—the power of CLV: managing customer lifetime value at IBM. Marketing Science, 27(4), 585–599.
Lambert, D. (1992). Zero-inflated poisson regression, with an application to defects in manufacturing. Technometrics, 34(1), 1–14.
Luo, X., Homburg, C., Wieseke, J. (2010). Customer satisfaction, analyst stock recommendations, and firm value. Journal of Marketing Research, 47(6), 1041–1058.
Malthouse, E.C., & Blattberg, R.C. (2005). Can we predict customer lifetime value? Journal of Interactive Marketing, 19(1), 2–16.
Manchanda, P., Rossi, P.E., Chintagunta, P.K. (2004). Response modeling with nonrandom marketing-mix variables. Journal of Marketing Research, 41(4), 467–478.
Martin, K.D., & Murphy, P.E. (2017). The role of data privacy in marketing. Journal of the Academy of Marketing Science, 45(2), 135–155.
Maynes, J., & Rawson, A. (2016). Linking the customer experience to value. Retrieved May 1, 2018 from https://www.mckinsey.com/business-functions/marketing-and-sales/our-insights/linking-the-customer-experience-to-value/.
McKinney, W.P., Schiedermayer, M., Simpson, D.E., Rich, E.C. (1990). Pharmaceutical sales representatives. Journal of the American Medical Association, 264(13), 1693–1697.
Mittal, V., & Kamakura, W.A. (2001). Satisfaction, repurchase intent, and repurchase behavior: investigating the moderating effect of customer characteristics. Journal of Marketing Research, 38(1), 131–142.
Mizik, N., & Jacobson, R. (2004). Are physicians easy marks? Quantifying the effects of detailing and sampling on new prescriptions. Management Science, 50(12), 1704–1715.
Mizik, N., & Jacobson, R. (2009). Valuing branded businesses. Journal of Marketing, 73(6), 137–153.
Montoya, R., Netzer, O., Jedidi, K. (2010). Dynamic allocation of pharmaceutical detailing and sampling for long-term profitability. Marketing Science, 29(5), 909–924.
Musalem, A., Bradlow, E.T., Raju, J.S. (2008). Who’s got the coupon? Estimating consumer preferences and coupon usage from aggregate information. Journal of Marketing Research, 45(6), 715–730.
Narayanan, S., Manchanda, P., Chintagunta, P.K. (2005a). Temporal differences in the role of marketing communication in new product categories. Journal of Marketing Research, 42(3), 278–290.
Narayanan, S., Manchanda, P., Chintagunta, P.K. (2005b). Temporal differences in the role of marketing communication in new product categories. Journal of Marketing Research, 42(3), 278–290.
Petersen, J.A., Kumar, V., Polo, Y., Sese, F.J. (2018). Unlocking the power of marketing: understanding the links between customer mindset metrics, behavior, and profitability. Journal of the Academy of Marketing Science, 46(5), 813–836.
Phillips, L.W. (1981). Assessing measurement error in key informant reports: a methodological note on organizational analysis in marketing. Journal of Marketing Research :395–415.
Qian, Y., & Xie, H. (2011). No customer left behind: a distribution-free Bayesian approach to accounting for missing xs in marketing models. Marketing Science, 30(4), 717–736.
Qian, Y., & Xie, H. (2014). Which brand purchasers are lost to counterfeiters? An application of new data fusion approaches. Marketing Science, 33(3), 437–448.
Reinartz, W.J., & Kumar, V. (2000). On the profitability of long-life customers in a noncontractual setting: an empirical investigation and implications for marketing. Journal of Marketing, 64(4), 17–35.
Reinartz, W.J., & Kumar, V. (2003). The impact of customer relationship characteristics on profitable lifetime duration. Journal of Marketing, 67(1), 77–99.
Reinartz, W.J., & Venkatesan, R. (2008). Decision models for customer relationship management (CRM). In Handbook of marketing decision models (pp. 291–326): Springer.
Rust, R.T., Lemon, K.N., Zeithaml, V.A. (2004). Return on marketing: using customer equity to focus marketing strategy. Journal of Marketing, 68(1), 109–127.
Schmittlein, D.C., Morrison, D.G., Colombo, R. (1987). Counting your customers: who are they and what will they do next? Management Science, 33(1), 1–24.
Seiders, K., Voss, G.B., Grewal, D., Godfrey, A.L. (2005). Do satisfied customers buy more? Examining moderating influences in a retailing context. Journal of Marketing, 69(4), 26–43.
Srinivasan, S., Vanhuele, M., Pauwels, K. (2010). Mind-set metrics in market response models: an integrative approach. Journal of Marketing Research, 47(4), 672–684.
Venkatesan, R., & Kumar, V. (2004). A customer lifetime value framework for customer selection and resource allocation strategy. Journal of Marketing, 68(4), 106–125.
Verhoef, P.C. (2003). Understanding the effect of customer relationship management efforts on customer retention and customer share development. Journal of Marketing, 67(4), 30–45.
Verhoef, P.C., & Franses, P.H. (2003). Combining revealed and stated preferences to forecast customer behaviour: three case studies. International Journal of Market Research, 45(4), 1–8.
Verhoef, P.C., Franses, P.H., Hoekstra, J.C. (2001). The impact of satisfaction and payment equity on cross-buying: a dynamic model for a multi-service provider. Journal of Retailing, 77(3), 359–378.
Voss, G.B., Godfrey, A., Seiders, K. (2010). How complementarity and substitution alter the customer satisfaction–repurchase link. Journal of Marketing, 74(6), 111–127.
Wedel, M., & Kannan, P. (2016). Marketing analytics for data-rich environments. Journal of Marketing, 80(6), 97–121.
Weijters, B., Cabooter, E., Schillewaert, N. (2010). The effect of rating scale format on response styles: the number of response categories and response category labels. International Journal of Research in Marketing, 27(3), 236–247.
Zheng, Z., & Padmanabhan, B. (2006). Selectively acquiring customer information: a new data acquisition problem and an active learning-based solution. Management Science, 52(5), 697–712.
Additional information
J. Andrew Petersen served as Area Editor for this article.
Appendices
Appendix A: Details on model specification for sales and retention
Zero-inflated Poisson (ZIP) model
In each month t during period 3 (i.e., t = 11–45), we observe for each customer i (i = 1 to N) the level of sales (\(y_{it}^{p3}\)) and sales calls (\(Det_{it}^{p3}\)) directed toward that customer. We assume that sales from customer i in month t follow a ZIP model (Lambert 1992), such that at any time t customer i belongs to one of two latent states: dormant, or inactive (\(B_{it} = 1\)), versus active (\(B_{it} = 0\)). Market forces, marketing, and other influences likely affect customer i’s switching between states. We assume that customers never quit a relationship, such that there is always a positive probability (\(1 - \pi_{it}\)) that they will prescribe the firm’s drugs, in line with extant research (Kumar et al. 2008). Under the ZIP model, the probability that sales (\(y_{it}^{p3}\)) from customer i in time t equal k is:

\[
p\left(y_{it}^{p3} = k\right) =
\begin{cases}
\pi_{it} + (1 - \pi_{it})\exp(-\lambda_{it}), & k = 0, \\
(1 - \pi_{it})\,\dfrac{\exp(-\lambda_{it})\,\lambda_{it}^{k}}{k!}, & k = 1, 2, \ldots
\end{cases}
\tag{A.1a}
\]
where \(\lambda_{it} > 0\). As per Eq. A.1a, customer i is active (\(\pi_{it} = 0\)) when sales reach at least one new prescription in time t (i.e., \(y_{it}^{p3} > 0\)). When we do not observe sales in time t, customer i could belong either to the dormant state (with probability \(\pi_{it}\)) or to the active state (with probability \(1 - \pi_{it}\)) while generating zero sales. We therefore include the term \((1 - \pi_{it})\exp(-\lambda_{it})\) when modeling the probability that sales equal 0, or \(p (y_{it}^{p3} = 0)\). Both \(\lambda_{it}\) and \(\pi_{it}\) are unknown customer-specific parameters, modeled as functions of observed covariates (Ghosh et al. 2006). We rewrite Eq. A.1a as a mixture model of latent random variables \(V_{it}^{p3}\) and \(B_{it}^{p3}\):

\[
y_{it}^{p3} = V_{it}^{p3}\left(1 - B_{it}^{p3}\right), \qquad B_{it}^{p3} \sim \text{Bernoulli}(\pi_{it}), \qquad V_{it}^{p3} \sim \text{Poisson}(\lambda_{it}).
\tag{A.1b}
\]
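The ZIP probabilities and their mixture form can be checked numerically. The following is a minimal sketch in Python, assuming the standard zero-inflated Poisson parameterization of Lambert (1992); the function names and the parameter values are illustrative and are not taken from the article.

```python
import math
import random


def zip_pmf(k, lam, pi):
    """P(y = k) for a zero-inflated Poisson with dormancy probability pi."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # Zero sales arise from a dormant customer or an active one with no sales.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


def draw_poisson(lam, rng):
    """Draw from Poisson(lam) by CDF inversion; adequate for small lam."""
    u, k = rng.random(), 0
    p = math.exp(-lam)
    cdf = p
    while u > cdf and p > 0.0:
        k += 1
        p *= lam / k
        cdf += p
    return k


def draw_zip(lam, pi, rng):
    """Mixture form: y = V * (1 - B), B ~ Bernoulli(pi), V ~ Poisson(lam)."""
    b = 1 if rng.random() < pi else 0
    v = draw_poisson(lam, rng)
    return v * (1 - b)


rng = random.Random(0)
lam, pi = 2.5, 0.3  # illustrative values only
draws = [draw_zip(lam, pi, rng) for _ in range(100_000)]
share_zero = sum(d == 0 for d in draws) / len(draws)
mean_sales = sum(draws) / len(draws)
```

In this sketch, `share_zero` should be close to `zip_pmf(0, lam, pi)` and `mean_sales` close to `(1 - pi) * lam`, which is one way to see that the mixture form and the probability function describe the same distribution.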
The expected number of new prescriptions from physician i in time t, which represents the Poisson mean \(\lambda_{it}\), is modeled (Eq. A.2a) as a function of the customer-specific coefficients \(\beta ^{\lambda }_{i}\) and the corresponding covariates \(X^{\lambda p3}_{it}\), which capture customer past purchase behavior, i.e., lagged sales (\(y_{it-1}^{p3}\)), and firm actions, i.e., sales calls (\(Det_{it}^{p3}\)). The probability \(\pi_{it}\) that the Bernoulli random variable \(B_{it}^{p3}\) (Eq. A.1b) equals 1, i.e., that customer i is inactive in time t, is modeled analogously (Eq. A.2b).
Similar to Eq. A.2a, \(\beta ^{\pi }_{i}\) represents the customer-specific coefficients and \(X^{\pi p3}_{it}\) refers to the covariates that capture firm actions and customer past purchase behavior (see footnote 12).
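One conventional way to write the two regressions is sketched below. It assumes a log link for the Poisson mean, which keeps \(\lambda_{it} > 0\), and a logit link for the dormancy probability; both link functions are assumptions made here for illustration, since the text only states that the two quantities are functions of the covariates.

\[
\ln(\lambda_{it}) = X^{\lambda p3}_{it}\,\beta^{\lambda}_{i},
\qquad
\ln\!\left(\frac{\pi_{it}}{1 - \pi_{it}}\right) = X^{\pi p3}_{it}\,\beta^{\pi}_{i}.
\]

Under this sketch, a positive coefficient on sales calls in the first equation raises expected prescriptions, while a negative coefficient on sales calls in the second lowers the probability of dormancy.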
Hierarchical model
The following hierarchical model of the customer-specific coefficients, \(\beta _{i} = (\beta ^{\lambda }_{i},\beta ^{\pi }_{i})\), relates them to observed heterogeneity covariates (\(Z_{i}\)) through the parameters γ and to an unobserved heterogeneity component (\(\nu_{i}\)) (Eq. A.3). With it, we can assess the influence of CMMs and their behavioral predictors on sales and retention.
We measure observed customer heterogeneity covariates (\(Z_{i}\)) during period 2, to control for the endogeneity among sales and CMMs (or the reinforcing effect of sales on CMMs). Our model thus captures the influence of customer i’s prior CMMs (during period 2) on his or her future behavior (during period 3) (see footnote 13). The specific heterogeneity covariates include CMMs (\(\overline {CMM_{i}^{p2}}\)), specialty, i.e., whether the physician is a specialist in a certain medical field (\(SPC_{i}\)), and the logarithm of average period 2 sales, \(\ln(\overline {y^{p2}_{i}})\), as a proxy for the size of the customer wallet. By accounting for these measures, we can evaluate the effect of CMMs over and above commonly available measures of observed customer heterogeneity (see footnote 14).
Further, \(\nu_{i}\) represents the unobserved heterogeneity component that we assume to follow a multivariate normal distribution with zero mean and a variance-covariance matrix V. Similar to Allenby and Ginter (1995), in the absence of γ’s and covariates (\(Z_{i}\)), Eq. A.3 represents a standard random effects distribution for \(\beta_{i}\). Since CMMs can be considered part of the unobserved heterogeneity, \(\nu_{i}\) allows us to assess the value of including CMM information in the hierarchical model (Eq. A.3) over and above a random effects specification of unobserved heterogeneity.
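A hierarchical regression consistent with this description can be sketched as follows. The population-mean term \(\bar{\beta}\) and the exact arrangement of γ are assumptions introduced here for illustration; the article's own Eq. A.3 may organize the terms differently.

\[
\beta_{i} = \bar{\beta} + \gamma Z_{i} + \nu_{i},
\qquad
\nu_{i} \sim \text{MVN}(0, V),
\qquad
Z_{i} = \left(\overline{CMM_{i}^{p2}},\; SPC_{i},\; \ln(\overline{y^{p2}_{i}})\right)'.
\]

Setting \(\gamma = 0\) in this sketch leaves \(\beta_{i} = \bar{\beta} + \nu_{i}\), which is the random effects case described above.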
Appendix B: Model estimation and prediction of twelve-month-ahead customer profits in the holdout sample
We estimate the CMM imputation model based on behavioral predictors from period 1 using the estimation sample of 407 customers. We conduct the estimation of parameters in the ZIP and imputation models as well as the prediction of sales in the holdout sample in a fully Bayesian framework, employing MCMC algorithms to enable posterior inference. We provide the prior specifications for the model parameters, estimation, and imputation algorithms in Web Appendix B in Supplementary Material. Each MCMC iteration in our model estimation proceeds in three phases. In the first phase, we simulate draws from the posterior distribution of the MO model parameters (Eq. 5) and use them to replace CMMs in the estimation dataset. In the second phase, we simulate draws from the posterior distribution of the ZIP model parameters using the multiple overimputed data from the first phase (Eqs. A.1a, A.2a, A.2b, and A.3 in Appendix A). In the third phase, we simulate the posterior predictive distribution of sales, retention, and overimputed CMMs for customer i in month T + k (where k = 1, 2, ..., 12; T = 10) at the end of every iteration of the MCMC algorithm as follows:
1. Predict CMMs (\(\hat {CMM_{i}^{p2}}\)) with Equation 5. Use predicted CMM values for all customers to accomplish overimputation.
2. Predict customer i’s hierarchical coefficients \(\hat {\beta _{i}}\) using Equation A.3. Predicted CMMs (\(\hat {CMM_{i}^{p2}}\)) come from step 1.
3. Predict \(\hat {\pi _{iT+k}}\) and \(\hat {y_{iT+k}}\) using the predicted coefficients (\(\hat {\beta _{i}}\)), lagged sales (\(\hat {y_{iT+k-1}^{p3}}\)), and sales calls (\(Det_{iT+k}^{p3}\)), predicted in the holdout period.
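Step 3 amounts to a forward simulation in which each month's predicted sales feed into the next month's covariates. The sketch below illustrates that recursion for one customer and one MCMC draw. It assumes a log link for the Poisson mean, a logit link for the dormancy probability, and a covariate layout of (intercept, lagged sales, sales calls); the links, the layout, the function names, and all numeric values are hypothetical and not taken from the article.

```python
import math
import random


def draw_poisson(lam, rng):
    """Draw from Poisson(lam) by CDF inversion; adequate for small lam."""
    u, k = rng.random(), 0
    p = math.exp(-lam)
    cdf = p
    while u > cdf and p > 0.0:
        k += 1
        p *= lam / k
        cdf += p
    return k


def simulate_path(beta_lam, beta_pi, y_last, planned_calls, rng):
    """Simulate monthly dormancy probabilities and sales for one customer.

    beta_lam, beta_pi: coefficients for (intercept, lagged sales, sales calls).
    y_last: sales in the last observed month (month T).
    planned_calls: sales calls for months T+1, ..., T+k.
    """
    pi_path, y_path = [], []
    y_prev = y_last
    for det in planned_calls:
        x = (1.0, y_prev, det)
        eta_lam = sum(b * v for b, v in zip(beta_lam, x))
        eta_pi = sum(b * v for b, v in zip(beta_pi, x))
        lam = math.exp(eta_lam)                # assumed log link
        pi = 1.0 / (1.0 + math.exp(-eta_pi))   # assumed logit link
        # Dormant customers generate zero sales; active ones draw from Poisson.
        y = 0 if rng.random() < pi else draw_poisson(lam, rng)
        pi_path.append(pi)
        y_path.append(y)
        y_prev = y  # predicted sales become next month's lagged sales
    return pi_path, y_path


rng = random.Random(1)
beta_lam = (0.2, 0.05, 0.10)     # hypothetical draw of the sales coefficients
beta_pi = (-0.5, -0.10, -0.20)   # hypothetical draw of the dormancy coefficients
pi_path, y_path = simulate_path(beta_lam, beta_pi, y_last=2,
                                planned_calls=[1] * 12, rng=rng)
```

Repeating `simulate_path` once per retained MCMC draw, each time with that draw's coefficients, would yield the per-iteration predictions that enter the profit computation.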
For each iteration of the MCMC algorithm, the predicted values \(\hat {\pi _{i}}=(\hat {\pi _{T+1}},\hat {\pi _{T+2}}, ..., \hat {\pi _{T+12}})\) and \(\hat {y_{i}}=(\hat {y_{T+1}}, \hat {y_{T+2}}, ..., \hat {y_{T+12}})\) serve to compute the profits for customer i from Eq. 4. The posterior expected profit for customer i is the Monte Carlo average of these per-iteration profits over the np posterior iterations.
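In symbols, writing \(Profit_{i}^{(m)}\) for the profit computed from Eq. 4 at posterior iteration m (this notation is introduced here for illustration and may differ from the article's), the Monte Carlo average takes the usual form:

\[
\widehat{E}\left(Profit_{i}\right) = \frac{1}{np} \sum_{m=1}^{np} Profit_{i}^{(m)}\!\left(\hat{\pi}_{i}^{(m)}, \hat{y}_{i}^{(m)}\right).
\]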
Of the 50,000 MCMC iterations, we discard the initial 30,000 as burn-in and use the last 20,000 as the posterior sample to make inferences. To assess convergence, we also inspect trace plots and simulate the posterior distribution using five parallel chains. The multivariate potential scale reduction factor (MPSRF), computed using the posterior samples of the five chains, ranges from .9 to 1.2 (across all variables), indicating convergence in the posterior sample (see footnote 15).
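To illustrate the idea behind this diagnostic, the sketch below computes the univariate potential scale reduction factor (Gelman-Rubin statistic) for a single parameter across parallel chains. This is a simpler relative of the multivariate MPSRF reported above, not the same statistic, and the chains here are simulated placeholders rather than output from the article's model.

```python
import random
import statistics


def psrf(chains):
    """Univariate potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    # W: average within-chain variance.
    within = statistics.fmean([statistics.variance(c) for c in chains])
    # B: between-chain variance of the chain means, scaled by chain length.
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


rng = random.Random(42)
# Five chains sampling the same distribution: the factor should be near 1.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(5)]
# One chain stuck in a different region: the factor should be well above 1.
stuck = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(4)]
stuck.append([rng.gauss(5.0, 1.0) for _ in range(2000)])

psrf_mixed = psrf(mixed)
psrf_stuck = psrf(stuck)
```

Values close to 1 suggest that the chains are sampling the same distribution, whereas a value well above 1, as for `psrf_stuck`, signals that at least one chain has not mixed with the others.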
Cite this article
Venkatesan, R., Bleier, A., Reinartz, W. et al. Improving customer profit predictions with customer mindset metrics through multiple overimputation. J. of the Acad. Mark. Sci. 47, 771–794 (2019). https://doi.org/10.1007/s11747-019-00658-6