Abstract
With the growth of the internet shopping market, retailers such as supermarket chains need strategies tailored to the customers of each store. The purpose of this research is to clarify customer characteristics and purchasing behavior for each store using the ID-POS data of a supermarket chain. First, we categorize stores based on causal data concerning each store, such as sales floor area and peripheral population. Second, we analyze customers’ purchasing behavior using ID-POS data for each of the resulting classes and extract characteristic purchasing behavior. Finally, we evaluate these results together and clarify customer characteristics and purchasing behavior in relation to store causal data.
1 Introduction
Recently, EC (electronic commerce) sites, which are online stores where a variety of products can be purchased through the internet, have been growing in popularity (Fig. 1). Under these circumstances, real-world retailers such as those in the supermarket industry are required to improve sales through unique services.
In order to propose new unique services, analysts must know the characteristics of customers and their tendencies in purchasing behavior. The characteristics of each store are equally important to consider, because store characteristics such as floor layout, competitors in the area, and the population of the market area differ greatly. However, previous analyses of real-world retailing used only purchase history. Because customers differ depending on the surrounding environment, it is necessary to include store causal data in the analysis in order to improve sales.
2 Purpose of This Study
In this study, we aim to clarify customer characteristics and purchasing behavior using store causal data. Compared with analysis using only sales history, this can provide useful suggestions for sales strategies such as shelf allocation within stores and the types of products to be stocked. In addition, analysis of the relationships between product sales histories will lead to improvements in store sales and customer satisfaction.
3 Analytical Procedure
We use ID-POS (point of sales with customer ID number) data from 53 stores of a supermarket chain, covering March 2015 to February 2016 and excluding nonmember data. Figures 2 and 3 summarize the data.
Figure 4 shows the outline of the analytical procedure. First, we classify stores by cluster analysis using store causal data. Second, focusing on each resulting cluster, we aggregate customer characteristics such as age structure and household composition and classify customers. Third, we perform association analysis on purchasing behavior and identify the commodities that sell together in each cluster. Finally, we clarify customer characteristics and purchasing behavior in relation to store causal data from these analysis results.
4 Analysis Method
4.1 Hierarchical Clustering Analysis
Hierarchical cluster analysis is a method that classifies objects by repeatedly grouping objects that are similar to each other, based on the distance between objects, starting from a set in which objects with different properties are mixed.
In this study, the 53 stores were classified into nine clusters using store causal data, in order to clarify the relationship between the characteristics of a store and the characteristics and purchasing behavior of its customers.
Cosine distance was used as the distance between objects, and the Ward method [2] was adopted to merge clusters (Table 1).
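The following is a minimal sketch of this kind of clustering, not the authors' implementation. Ward's criterion is defined on squared Euclidean distances, and the source does not say how it was combined with cosine distance. The sketch assumes one possible combination: the squared Euclidean distance between unit-length vectors equals twice their cosine distance, so that quantity is used as the starting dissimilarity and clusters are merged with the Lance-Williams update for Ward's method. The store feature values and the function names `cosine_distance` and `ward_clusters` are hypothetical.

```python
import math
from itertools import combinations


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def ward_clusters(vectors, n_clusters):
    """Merge clusters with the Lance-Williams update for Ward's method.

    Assumption: the starting dissimilarity is 2 * cosine distance, i.e. the
    squared Euclidean distance between the unit-normalised vectors.
    """
    members = {i: [i] for i in range(len(vectors))}
    dist = {
        frozenset((i, j)): 2.0 * cosine_distance(vectors[i], vectors[j])
        for i, j in combinations(range(len(vectors)), 2)
    }
    next_id = len(vectors)
    while len(members) > n_clusters:
        pair = min(dist, key=dist.get)  # closest pair of active clusters
        i, j = tuple(pair)
        d_ij = dist[pair]
        n_i, n_j = len(members[i]), len(members[j])
        updated = {}
        for k in members:
            if k in pair:
                continue
            n_k = len(members[k])
            d_ki = dist[frozenset((k, i))]
            d_kj = dist[frozenset((k, j))]
            updated[k] = (
                (n_i + n_k) * d_ki + (n_j + n_k) * d_kj - n_k * d_ij
            ) / (n_i + n_j + n_k)
        # Drop every distance that involves the two merged clusters.
        dist = {p: d for p, d in dist.items() if i not in p and j not in p}
        members[next_id] = members.pop(i) + members.pop(j)
        for k, d in updated.items():
            dist[frozenset((k, next_id))] = d
        next_id += 1
    return sorted(sorted(m) for m in members.values())


# Hypothetical store causal data: [floor area, peripheral population, parking spaces]
stores = [
    [300, 9000, 10],
    [320, 9500, 12],
    [1500, 3000, 150],
    [1600, 2800, 160],
    [900, 500, 120],
    [950, 450, 130],
]
groups = ward_clusters(stores, n_clusters=3)
```

With real store causal data, the features would normally be put on comparable scales before computing distances; that step is left out here because the source does not describe it.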
4.2 Decile Analysis
Decile analysis is an analytical method that ranks all customers by purchase amount, based on purchase history data, divides them into ten equal groups, and calculates the sales composition ratio of each rank.
Generally, decile analysis is conducted to identify the customers with high purchase amounts in each cluster and to evaluate how their purchase behavior differs from that of other customers.
In this study, we performed decile analysis for each cluster obtained by hierarchical cluster analysis, and we defined customers in decile ranks 1 to 8 as “general customers” and customers in ranks 9 and 10 as “good customers”.
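As an illustration, the ranking and labeling can be sketched as below. This is a sketch under stated assumptions rather than the procedure actually used: it assumes that rank 1 holds the lowest-spending tenth of customers and rank 10 the highest, which is the reading under which ranks 9 and 10 are the “good customers”. The customer IDs, amounts, and function names are hypothetical.

```python
def decile_ranks(spend_by_customer):
    """Map customer id -> decile rank (assumed: 1 = lowest spend, 10 = highest)."""
    ordered = sorted(spend_by_customer, key=spend_by_customer.get)
    n = len(ordered)
    return {cust: (pos * 10) // n + 1 for pos, cust in enumerate(ordered)}


def sales_share_by_decile(spend_by_customer, ranks):
    """Return the share of total sales contributed by each decile rank."""
    total = sum(spend_by_customer.values())
    share = {rank: 0.0 for rank in range(1, 11)}
    for cust, amount in spend_by_customer.items():
        share[ranks[cust]] += amount / total
    return share


def label_customers(ranks):
    """Ranks 9 and 10 are 'good' customers, ranks 1 to 8 are 'general'."""
    return {cust: "good" if rank >= 9 else "general" for cust, rank in ranks.items()}


# Hypothetical yearly purchase amounts for 20 customers of one cluster
spend = {f"c{i}": i * 100 for i in range(1, 21)}
ranks = decile_ranks(spend)
share = sales_share_by_decile(spend, ranks)
labels = label_customers(ranks)
```

Running the ranking separately for each store cluster, as the text describes, means a customer is labeled relative to other customers of the same cluster rather than to the whole chain.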
4.3 Association Analysis
Association analysis, known in marketing as market basket analysis, is used to extract meaningful relationships between products from enormous amounts of log data. We conducted association analysis in order to identify products that sell together and to obtain suggestions for sales measures such as display and sales floor placement.
In this study, association analysis is performed for the “good customers” and “general customers” defined for each cluster, using the basket ID assigned to each shopping cart as the key for identifying products that sell together.
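A minimal sketch of extracting rules of length 2 from rows keyed by basket ID is given below. It is not the authors' code: the input layout (pairs of basket ID and item), the function name `pair_rules`, the sample rows, and the default threshold values are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def pair_rules(rows, min_support=0.001, min_lift=1.0):
    """Return (item_a, item_b, support, lift) for item pairs passing both thresholds.

    rows is an iterable of (basket_id, item) records; thresholds are placeholders.
    """
    baskets = defaultdict(set)
    for basket_id, item in rows:
        baskets[basket_id].add(item)

    n = len(baskets)
    item_count = defaultdict(int)
    pair_count = defaultdict(int)
    for items in baskets.values():
        for item in items:
            item_count[item] += 1
        for a, b in combinations(sorted(items), 2):
            pair_count[(a, b)] += 1

    rules = []
    for (a, b), count in pair_count.items():
        support = count / n
        # Lift compares the observed co-occurrence with the rate expected
        # if the two items were bought independently.
        lift = support / ((item_count[a] / n) * (item_count[b] / n))
        if support >= min_support and lift >= min_lift:
            rules.append((a, b, support, lift))
    return sorted(rules, key=lambda r: r[3], reverse=True)


# Hypothetical purchase rows: (basket ID, product)
rows = [
    (1, "potato"), (1, "onion"),
    (2, "potato"), (2, "onion"),
    (3, "banana"), (3, "yogurt"),
    (4, "banana"), (4, "potato"),
]
rules = pair_rules(rows, min_support=0.25, min_lift=1.0)
```

In this sketch the rows would be filtered to one customer group in one cluster before calling `pair_rules`, so that each group gets its own rule set.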
5 Results of Analysis
5.1 Store Classification
The results of the hierarchical cluster analysis are shown in Tables 2 and 3.
- Clusters 1 and 8: composed only of city center stores.
- Clusters 2, 3, 6 and 7: composed only of suburban stores.
- Clusters 4 and 5: composed only of mountainous stores.
- Cluster 9: composed of city center stores and suburban stores.
In order to extract customer features for each cluster, we compiled the ratio of males to females (Fig. 5), the age ratio, the ratio of unmarried to married customers (Fig. 6), and the ratio of the number of household members.
In the ratio of males to females (Fig. 8), the cluster with the lowest female ratio is cluster 1, and the cluster with the highest is cluster 9. In the ratio of unmarried to married customers, the cluster with the lowest marriage ratio is cluster 1, and the clusters with the highest are clusters 2 and 7. However, there was not much difference between the clusters in either case.
On the other hand, household size and age differed among the clusters. In the ratio of the number of household members, the proportion of “1 person” is high in clusters 1, 8 and 9, while the ratio of “5 or more people” is high in clusters 4 and 5.
In this way, it was found that customer characteristics differ among the clusters (Fig. 7).
5.2 Classification of Customers and Evaluation of Purchasing Behavior
We performed a decile analysis for each cluster and classified our customers as “general customers” and “good customers.”
Then, we analyzed purchasing behavior, for example by aggregating the product categories with high purchase amounts.
Table 4 shows the number of goods per purchasing opportunity for general customers and good customers in each cluster.
Tables 5, 6, 7 and 8 list the top five item categories by purchase amount. In this paper, we describe only the results for cluster 1 and cluster 4.
The number of items per purchase is higher for good customers than for general customers. It can also be seen that the number of items per purchasing opportunity differs among the clusters.
- Cluster 1: For general customers, merchandise categories that do not require cooking, such as ready-to-eat sushi and frozen boiled rice, rank higher in purchase amount, whereas for good customers, merchandise categories that require cooking, such as brand pork and Japanese beef, rank higher. Although not listed in the table, beer and third beer (a malt-free, beer-like alcoholic beverage) were among the top categories by purchase amount for both general customers and good customers.
- Cluster 4: There was almost no difference between the categories with high purchase amounts for general customers and those with high purchase amounts for good customers.
5.3 Association Analysis
Using the purchase history from 2015/03/01 to 2016/02/29, we conducted association analysis with basket ID as the key for general customers and good customers in all clusters.
We extracted association rules using the following thresholds: support of at least 0.1%, lift of at least 1%, and a rule length of 2 (Table 9).
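For reference, the two thresholded measures are commonly defined as follows for a rule of length 2 between products A and B. The notation is introduced here for illustration and does not come from the source: N is the number of baskets, and t ranges over baskets.

```latex
\[
\mathrm{support}(A \Rightarrow B)
  = \frac{\lvert \{\, t : A \in t,\ B \in t \,\} \rvert}{N},
\qquad
\mathrm{lift}(A \Rightarrow B)
  = \frac{\mathrm{support}(A \Rightarrow B)}
         {\mathrm{support}(A)\,\mathrm{support}(B)}
\]
```

Here support(A) and support(B) are the fractions of baskets containing A and B respectively. A lift of 1 corresponds to A and B appearing together exactly as often as would be expected if they were purchased independently.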
Many association rules were extracted for general customers, and also for good customers. The clusters with many extracted rules are clusters 4, 5 and 6, and the clusters with few extracted rules are clusters 1, 3 and 9.
Tables 10, 11, 12 and 13 describe characteristic rules from the extracted association rules. In this study, we described only the results of cluster 1 and cluster 4.
Although there is a difference between general customers and good customers, we did not find much difference between clusters.
6 Discussions
Customers in clusters 1 and 8 tend to live in one-person households, their unmarried rate is slightly high, and the proportion of elderly people is low. Clusters 1 and 8 are therefore considered to contain more single-person households in their 30s to 40s than other clusters. As a feature of purchasing behavior, cluster 1 includes beer and third beer among the top categories by purchase amount. This seems to be because customers from single-person households use the stores in cluster 1, which consists mainly of small stores, to buy alcoholic drinks.
For the suburban stores in clusters 2, 3, 6 and 7, households of 2 to 4 people account for 70% of the household composition, which is higher than in other clusters, while the ratio of one-person and five-person households is low. From these facts, we speculate that the customers are mainly housewives of nuclear families living in the suburbs.
As for purchasing behavior, in clusters 2, 6 and 7 the number of goods purchased per purchasing opportunity is large and the number of extracted association rules is also large. In cluster 3, the number of products purchased per purchasing opportunity is small, but this is probably because the average store area is small and the number of products is small. No typical characteristic was found in the association rules.
In clusters 4 and 5, located in mountainous areas, households of 5 or more people were common, and the ratio of those in their 60s and in their 70s or older was high in the age composition. From these facts, clusters 4 and 5 are considered, compared with other clusters, to consist largely of families with elderly couples living in mountainous areas.
Their purchasing behavior is characterized by a high purchase amount for sushi, a large number of goods purchased per purchasing opportunity, and a large number of extracted association rules.
Also, since these clusters have an average of 100 or more parking spaces, it is thought that many customers visit by car and buy many items at a time.
As a feature of purchasing behavior common to all clusters, product categories that do not require cooking rank higher for general customers, whereas product categories that require cooking tend to rank higher for good customers. In addition, the association analysis found rules such as “chocolate - snack” and “banana - yogurt” for general customers, and rules such as “potatoes - onions” and “wooden mushrooms - tofu” for good customers.
For this reason, we think that general customers and good customers use supermarkets for different purposes, and that it is necessary to propose appropriate measures for each group.
7 Conclusion and Future Works
In this study, we first classified stores by cluster analysis using store causal data. Second, focusing on each resulting cluster, we aggregated customer characteristics such as age structure and household composition and classified customers. In addition, we performed association analysis on purchasing behavior and identified the commodities that sell together in each cluster. Finally, we clarified customer characteristics and purchasing behavior in relation to store causal data from these analysis results.
Our analysis revealed that customer characteristics and purchasing behavior differ depending on the characteristics of each store, such as the sales floor area and the population within the trading area. Using this result, it is possible to propose marketing measures unique to a store according to its surrounding environment, store size, and so on, which were not taken into account in analyses using only the purchase history. In order to classify the stores more accurately, it is necessary to consider a method for characterizing the commercial area of each store that takes into account competing stores and similar factors. In addition, we think that more useful suggestions can be obtained by looking at seasonal changes in the purchasing trends of each cluster and by considering customers who use multiple stores.
References
Romesburg, C.: Cluster Analysis for Researchers, pp. 133–135. Lulu.com
Decile Analysis. http://www.totalcustomeranalytics.com/decile_analysis.htm
Nagasawa, T., Yamagishi, A., Yokoyama, S.: Analysis of consumer purchase behavior using supermarket sales data and regional information. Commun. Jpn. Ind. Manag. Assoc. 25(3), 158–163 (2015)
Namatame, T., Suyama, N.: Weather effects on consumer behavior in retailing - an analysis by using a supermarket’s POS data. Bull. Inst. Commer. 41(8), 1–29 (2010)
Sato, M., Kato, E., Matsuda, Y.: Research of FSP analysis in food supermarket. UNISYS Technol. Rev. 25(3), 330–338 (2005)
Annie, L.C., Kumar, A.: Market basket analysis for a supermarket based on frequent itemset mining. Int. J. Comput. Sci. 9(5), 257–264 (2012)
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Usami, S., Otake, K., Namatame, T. (2017). Valuation of Customer and Purchase Behavior of a Supermarket Chain Using ID-POS and Store Causal Data. In: Meiselwitz, G. (eds) Social Computing and Social Media. Human Behavior. SCSM 2017. Lecture Notes in Computer Science(), vol 10282. Springer, Cham. https://doi.org/10.1007/978-3-319-58559-8_21
DOI: https://doi.org/10.1007/978-3-319-58559-8_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-58558-1
Online ISBN: 978-3-319-58559-8
eBook Packages: Computer Science, Computer Science (R0)