1 Introduction

Recommendation systems play an essential role in this era of information explosion, since they extract and predict what users want or need from the huge amount of published information and data. They have been applied in a variety of areas, such as e-commerce, travel recommendations, online video platforms, and social tagging predictions, to name a few. Collaborative filtering (CF) is one of the most successful recommendation techniques and has been widely used because it predicts users’ interests precisely. CF methods can mainly be categorized into user-based and item-based CF according to how similarity is calculated and preferences are predicted. User-based CF and item-based CF derive the similarity between users and between items, respectively, and then predict the rating of the target item based on the derived similarities.

User-based CF first finds similar users with common patterns and then recommends items that these similar users are interested in. As shown in Fig. 1a, we can see that Tim and John are similar; hence, we recommend the sundae and donut to John since Tim likes both of them. In contrast, item-based CF uses similarity among items to determine whether a user would like them or not. For example, in Fig. 1b, since ice cream and sundaes are similar, when John chooses ice cream, we will also recommend the sundae to him.

Fig. 1 Concepts of user-based and item-based filtering

Both the user- and item-based CF methods use overall ratings to make predictions by collecting preference information from other users. However, in real applications, people’s interests usually vary with time, and traditional CF cannot properly capture such changes in users’ preferences. For example, many little girls love Barbie dolls, but most of them are no longer interested in them when they grow up.

In this paper, we propose a novel CF-based recommendation method, dynamic decay collaborative filtering (abbreviated as DDCF), which incorporates a dynamic decay function. DDCF can capture users’ preference variations and depict the evolution of their interests. Several prior studies [1, 3, 29] pointed out that interest and memory retention are very similar: people’s preferences usually decay and vary with time. Our motivation comes from the Ebbinghaus forgetting curve [29] of human memory, which we use to describe the change in a user’s interest directly, as shown in Fig. 2. We extend the idea of human brain memory to specify the level of a user’s preferences (i.e., instantaneous, short-term, or long-term) in our recommendations. For example, as shown in Fig. 2, the number of user reviews directly affects the preference decay status. In short, DDCF is able to determine the appropriate decay function to describe the evolution of users’ preferences based on their behaviors.

Fig. 2 Evolution of preference retention using the brain memory curve

The contributions of our work are as follows:

  • We point out the significance of the time factor in the recommendation system, i.e., the interest of users may vary with time. An elegant recommendation algorithm should gradually attenuate the impact of old data and accurately predict users’ future preferences. In this study, the preference decay concept is discussed and included in the CF-based recommendations. We also extend the idea of the human brain memory model to describe preference evolution.

  • With the decay function consideration, a novel algorithm, DDCF, has been proposed to effectively recommend items based on users’ preferences. To tackle the cold start and sparsity issues of recommendation systems, DDCF utilizes item clustering to group similar items together without any predefined parameters.

  • Differing from previous related studies, we propose a dynamic decay method in this study. DDCF specifies the preference level of items, i.e., instantaneous, short-term, or long-term level, and dynamically determines the decay function based on users’ rating behaviors.

  • When predicting the rating of items, DDCF combines the baseline estimation and decay item-based CF. It can also control the proportions in which the baseline and decay CF contribute, in order to produce more accurate recommendations.

  • To show the practicability of the proposed algorithms, we apply DDCF to real datasets. The experimental studies indicate that the proposed methods are both effective and scalable and outperform the state-of-the-art CF-based algorithms.

The remainder of this paper is organized as follows. Sections 2 and 3 provide the related work and some preliminaries, respectively. Section 4 describes the DDCF algorithm. Section 5 presents the experiments and performance study. Finally, we conclude in Sect. 6.

2 Related work

2.1 Collaborative filtering-based recommendation

Sarwar et al. [27] analyzed different item-based recommendation algorithms. They utilized several techniques for computing item–item similarities and proposed recommendation methods. SCF [28] combines item- and user-based collaborative filtering techniques for recommendation. The authors also observed that user-based CF is only suitable for recommending popular items, whereas item-based CF should be used for unpopular items. Zhou et al. [33] utilized a bi-clustering method to group items with an order-preserving matrix and then integrated the similarity calculation into the user-based CF recommendation system. Cai et al. [4] borrowed the idea of object typicality from cognitive psychology and proposed a typicality-based collaborative filtering recommendation system, TyCo. Instead of deriving the similarity according to neighboring users, TyCo makes more accurate predictions based on object typicality calculation.

Niemann et al. [22] proposed a collaborative filtering approach based on the items’ usage contexts. This approach increases the rating predictions for niche items with fewer usage data available and improves the aggregate diversity of the recommendations. Ma et al. [19] proposed a CF-based method combining k-means clustering and improved the result with SOM. SOM could do a rough cluster preprocessing as input, since k-means clustering needs a proper k setting to get better results. Zhang et al. [31] used a two-layer selection scheme to improve the quality of the selected neighbor for CF recommendation. Two-layer neighbor selection consists of two parts: the availability evaluation module and the trust evaluation module. These two modules are used to calculate user influence and improve recommendations. Gupta et al. [9] combined CF with demographics-based user clusters in a weighted scheme to predict the item rating. The proposed solution is scalable while successfully addressing user cold start and has higher accuracy and coverage. Melville et al. [20] presented an effective framework for combining content and collaboration. They used a content-based predictor to enhance existing user data and then provided personalized suggestions through collaborative filtering. Zhao et al. [32] used a pipeline concept to implement item-based CF on a MapReduce environment for solving the information explosion problem. With this method, CF can be easily applied to a huge dataset.

Some prior studies utilized matrix factorization for improving CF-based recommendation. Nie et al. [21] developed a third-order tensor factorization integrating CF-based technique for recommendation. They also used some latent characteristics to improve accuracy. Chen et al. [5] proposed a tri-factorization method based on orthogonal nonnegative matrix decomposition. After combining with the CF method, the proposed methods could handle the data sparsity issue effectively. Koren et al. [15] proposed a multifaceted CF model, which combines baseline estimates, the neighborhood model, and the latent factor model [14], to significantly improve the accuracy of similarity calculation and output prediction. Ba et al. [2] proposed an approach combining clustering and SVD for collaborative filtering recommendation. They decomposed the rating matrix with the SVD algorithm and calculated the similarity between users, and then found the nearest neighbors in the CF recommendation and predicted the ratings of the items. Using global preference and interest-specific latent factors, Kabbur et al. [13] proposed a nonlinear matrix factorization method to recommend the top-n items that users may be interested in. Pirasteh et al. [23] enhanced the recommendation system by exploiting matrix factorization with asymmetric user similarities. Intuitively, two users should be similar when they have common neighbors, even though they do not have any co-rated items.

Renaud-Deputter et al. [26] proposed a novel approach in the implicit feedback recommender system that combines clustering and matrix factorization to yield good results while using only implicit feedback on users’ purchasing history without requiring any parameters. There are also some studies which discuss model-based recommendations. Hofmann [10] describes a new model-based algorithm based on a generalization of probabilistic latent semantic analysis to continuous-valued response variables. Hofmann assumes that the observed user ratings can be modeled as a mixture of interest groups which could be characterized by a Gaussian distribution on the normalized ratings. Jiang et al. [12] developed an author topic model-based CF method to facilitate comprehensive points of interest (POIs) recommendations for social users. Many social attributes are adopted for making recommendations or predictions.

2.2 Decay collaborative filtering-based recommendation

As already mentioned, user preferences usually change with time. Some previous works on recommender systems have investigated how to incorporate temporal information into CF-based approaches. Ding [7] stressed the importance of time weights in CF-based recommendation methods: the influence of old data on the predictions of collaborative filtering should gradually diminish. This concept is intuitive, since users’ preferences usually vary with time. Wu et al. [30] used the power decay function, combining user- and item-based collaborative filtering, for social tagging label prediction in a digital library. Lee et al. [17] constructed a pseudo-rating CF method using implicit feedback data. They considered the user’s purchase time and the item’s rating time for weight decay to improve the recommendation accuracy. Gong et al. [8] proposed a method to evaluate the user’s interest change and combined it with the CF model. They used a fixed weight to decay all users' ratings based on item rating time. Richards et al. [25] discussed the advantages and disadvantages of each decay function used for CF. Their experimental results also report the post-processing time of each decay function combined with CF-based recommendation.

To the best of our knowledge, most prior studies utilized one decay function to evaluate and describe the user preference change. Obviously, only one decay function may not properly describe users’ complex preference variations. In this paper, due to the similarity of preference and memory, we utilize the memory principle of the human brain to build a model with multiple decay function considerations based on the number and time of the item rating.

Here, we present some related studies about the human memory principle. Memory is the ability to reproduce information stored in the brain. Usually, researchers divide memory into three phases, instantaneous memory, short-term memory and long-term memory [1, 3, 29]. Instantaneous memory storage time is very short, and information could be forgotten very fast. On the contrary, the information storage in short-term memory could stay longer in the human brain than instantaneous memory, but will still be forgotten after a while. The information stored in the long-term memory phase is able to stay for a long time and is not easily forgotten.

3 Preliminary

Suppose that there is a set of users U = {u1, …, un} and a set of items O = {o1, …, om} in a recommendation system. A rating record is a pair \(\left( {r_{ij} , t_{ij} } \right)\), where \(r_{ij}\) and \(t_{ij}\) are the rate and the time of user \(u_{i}\) rating item \(o_{j}\), respectively. The rating set \(e_{i}\) is the collection of all rating records of user \(u_{i}\). A user rating vector is defined as \(\vec{u}_{i} = \left( {\left( {r_{i1} , t_{i1} } \right), \left( {r_{i2} , t_{i2} } \right), \ldots ,\left( {r_{im} , t_{im} } \right)} \right)\), i.e., the rating records in \(e_{i}\) with respect to all items in O. Note that if user \(u_{i}\) has not rated item \(o_{j}\), both \(r_{ij}\) and \(t_{ij}\) in \(\vec{u}_{i}\) are zero. A rating matrix in a recommendation system is defined as,

$$ M = \left[ {\begin{array}{c} {\vec{u}_{1} } \\ {\vec{u}_{2} } \\ \vdots \\ {\vec{u}_{n} } \\ \end{array} } \right] = \left[ {\begin{array}{ccc} {\left( {r_{11} , t_{11} } \right)} & \cdots & {\left( {r_{1m} , t_{1m} } \right)} \\ \vdots & \ddots & \vdots \\ {\left( {r_{n1} , t_{n1} } \right)} & \cdots & {\left( {r_{nm} , t_{nm} } \right)} \\ \end{array} } \right], $$

where n and m are the number of users and items, respectively.
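As an illustration of this notation, the rating matrix can be held as two parallel n × m tables, one for the rates and one for the rating times, with zeros marking unrated items. The following is a minimal Python sketch; the function name `build_rating_matrix` and the tuple layout of the input records are assumptions made for illustration and are not part of the formulation above.

```python
def build_rating_matrix(records, n_users, n_items):
    """Build M as two n x m nested lists: rates r_ij and times t_ij.

    `records` is an iterable of (user_index, item_index, rate, time) tuples
    with 0-based indices (a hypothetical input layout).
    Unrated entries stay at zero, following the convention in the text.
    """
    rates = [[0.0] * n_items for _ in range(n_users)]
    times = [[0.0] * n_items for _ in range(n_users)]
    for i, j, r, t in records:
        rates[i][j] = r
        times[i][j] = t
    return rates, times


# Two users, three items; user 0 rated items 0 and 2, user 1 rated item 1.
records = [(0, 0, 5.0, 100.0), (0, 2, 3.0, 250.0), (1, 1, 4.0, 180.0)]
rates, times = build_rating_matrix(records, n_users=2, n_items=3)
```

Keeping rates and times in separate tables makes it straightforward to apply a decay function to the time table before combining it with the rates.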

Definition 1

(Decayed rate) Assume that the current time is t. The decayed rate of a rating record is

$$ D\left( {r_{ij} , t_{ij} } \right) = {\text{decay}}_{{\mathcal{L}}} \left( {\Delta t} \right) \times r_{ij} , $$
(1)

where \(\Delta t = t - t_{ij}\). The decay function \({\text{decay}}_{{\mathcal{L}}} \left( . \right)\) could be linear, logistic, power or exponential decay, to name a few. The concept and examples of the decay function are shown in Fig. 3.

Fig. 3 An example of linear, exponential, power, and logistic decay
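Definition 1 can be illustrated with a short Python sketch in which the decay function is passed in as a parameter. The helper names `decayed_rate` and `exponential_decay`, and the value λ = 0.1, are assumptions chosen for illustration only.

```python
import math


def decayed_rate(rate, rated_at, now, decay):
    """Eq. (1): D(r, t_r) = decay(now - t_r) * r."""
    delta_t = now - rated_at
    if delta_t < 0:
        raise ValueError("rating time lies in the future")
    return decay(delta_t) * rate


def exponential_decay(delta_t, lam=0.1):
    """Exponential decay e^(-lam * delta_t); lam = 0.1 is an illustrative value."""
    return math.exp(-lam * delta_t)


# A rate of 4 given just now keeps its full weight; the same rate given
# ten time units ago is attenuated.
fresh = decayed_rate(4.0, rated_at=10.0, now=10.0, decay=exponential_decay)
old = decayed_rate(4.0, rated_at=0.0, now=10.0, decay=exponential_decay)
```

Any of the other decay shapes can be substituted by passing a different callable as `decay`.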

4 Proposed method: DDCF

According to the aforementioned discussion, the single decay function may not properly describe the user’s complex preference variation. In this study, we propose a novel approach, dynamic decay collaborative filtering (abbreviated as DDCF), to effectively predict users’ preferences. DDCF has four steps: (1) item clustering, (2) interest level identification, (3) decay function specification, and (4) preference prediction, as shown in Fig. 4.

Fig. 4 Concept of proposed DDCF

To tackle the cold start and sparsity issues of recommendation systems, DDCF utilizes item clustering [6] to group similar items together without any predefined parameters. Then, for each user, we identify each cluster’s interest level according to the time and number of rating records in the cluster. For each level, DDCF utilizes different decay functions to describe the preference evolution. Finally, we calculate the similarities among users based on the derived decayed rates and predict the future preferences.

4.1 Item clustering

Cold start and sparsity are two fatal issues in CF-based recommendation. Cold start is related to recommendations for new users or items. Since the system does not have information about new users or items, it is very difficult to make precise recommendations. The sparsity problem is caused by an insufficient number of transactions and feedback data. It makes it difficult for the recommendation system to distinguish similar interests among users, which degrades the usability of collaborative filtering. DDCF uses a parameter-free clustering algorithm to address the cold start and sparsity issues in CF-based recommendation. As in Definition 2, we derive the relation strength between two items by the Jaccard coefficient and filter out the insignificant relations (i.e., those whose relation value is lower than a user-specified threshold α).

Definition 2

(Relation strength) Given an item \(o_{i}\), its profile \(p_{i}\) = {p1, p2, …, pk} consists of the k features of \(o_{i}\). The relation between two items \(o_{i}\) and \(o_{j}\) can be derived by

$$ R\left( {o_{i} , o_{j} } \right) = \frac{{\left| {p_{i} \cap p_{j} } \right|}}{{\sqrt {\left| {p_{i} } \right| \times \left| {p_{j} } \right|} }}. $$
(2)

With the user-specified threshold α, the relation strength is defined as,

$$ {\text{RS}}\left( {o_{i} , o_{j} } \right) = \left\{ {\begin{array}{*{20}l} {R\left( {o_{i} , o_{j} } \right),} \hfill & {{\text{if}}\;R\left( {o_{i} , o_{j} } \right) \ge \alpha } \hfill \\ {0,} \hfill & {{\text{if}}\;R\left( {o_{i} , o_{j} } \right) < \alpha } \hfill \\ \end{array} } \right.. $$
(3)

Obviously, α could control how dense the relations among items are when clustering and then this would affect the efficiency of the process.
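Equations (2) and (3) can be sketched in Python as follows, assuming that each item profile is represented as a set of feature tags. The function names, the example tags, and the threshold α = 0.5 are illustrative assumptions. Note that the relation is computed exactly as Eq. (2) is written, i.e., the overlap is normalized by the geometric mean of the two profile sizes.

```python
import math


def relation(profile_i, profile_j):
    """Eq. (2): |p_i ∩ p_j| / sqrt(|p_i| * |p_j|)."""
    if not profile_i or not profile_j:
        return 0.0  # an empty profile shares nothing with any item
    shared = len(profile_i & profile_j)
    return shared / math.sqrt(len(profile_i) * len(profile_j))


def relation_strength(profile_i, profile_j, alpha):
    """Eq. (3): keep R only when it reaches the threshold alpha."""
    r = relation(profile_i, profile_j)
    return r if r >= alpha else 0.0


# Hypothetical feature tags for three items.
ice_cream = {"dessert", "cold", "dairy"}
sundae = {"dessert", "cold", "dairy", "topping"}
donut = {"dessert", "baked"}

strong = relation_strength(ice_cream, sundae, alpha=0.5)  # kept
weak = relation_strength(ice_cream, donut, alpha=0.5)     # filtered to 0
```

Raising α removes more weak relations, which leaves a sparser relation graph for the clustering step.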

Algorithm 1

After deriving the relation strength, we use a parameter-free algorithm to cluster the items in the system. The pseudocode is given in Algorithm 1. DDCF uses a modularity-like evaluation, as shown in Definition 3, as the termination criterion of the hierarchical clustering. At each iteration, based on the clustering result from the last iteration, we merge all pairs of items with the strongest relation strength among their neighbors to form larger clusters. Suppose the clustering results in the last iteration and in the current iteration are C and C', respectively. If the strength gain from C to C' is negative, DDCF stops clustering, since the previous clustering result is good enough. Stopping early in this way reduces the number of iterations and hence the time consumed by the clustering.

Definition 3

(Strength gain) Given an item set O = {o1, …, om} in a recommendation system and the clustering result C = {c1, c2, …, cp}, the strength function is defined as,

$$ S\left( C \right) = \mathop \sum \limits_{k = 1}^{p} \left[ {\frac{{{\text{IS}}_{k} }}{{{\text{TS}}}} - \left( {\frac{{{\text{DS}}_{k} }}{{{\text{TS}}}}} \right)^{2} } \right], $$
(4)

where \({\text{IS}}_{k} = \mathop \sum \nolimits_{{o_{i} ,o_{j} \in c_{k} }} {\text{RS}}\left( {o_{i} ,o_{j} } \right)\) is the summation of the relation strengths among items inside cluster ck, \({\text{DS}}_{k} = \mathop \sum \nolimits_{{o_{i} \in c_{k} ,o_{j} \in O}} {\text{RS}}\left( {o_{i} ,o_{j} } \right)\) is the summation of relation strengths of items in cluster ck and other items not in ck, and \({\text{TS}} = \mathop \sum \nolimits_{{o_{i} ,o_{j} \in O}} {\text{RS}}\left( {o_{i} ,o_{j} } \right)\) is the summation of all relation strengths between any two items in the recommendation system. With two different clustering results C and C', the strength gain from C to C' is defined as,

$$ \Delta S_{{C \to C^{\prime}}} = S\left( {C^{\prime}} \right) - S\left( C \right). $$
(5)
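The strength function and the stopping test can be sketched as below. This is a minimal illustration under stated assumptions, not the clustering procedure itself: relation strengths are assumed to be stored in a symmetric matrix with a zero diagonal, all sums run over ordered pairs, and the second term of Eq. (4) sums the relation strengths between each cluster member and every item in O, following the formula as written. The gain is computed here as S(C') − S(C), so that a negative value signals that the latest merge made the clustering worse, in line with the stopping rule described in the text. The function names are hypothetical.

```python
def strength(rs, clusters):
    """Eq. (4): S(C) = sum over clusters of [inside/total - (degree/total)^2].

    `rs` is a symmetric matrix of relation strengths with a zero diagonal;
    `clusters` is a list of lists of item indices. Sums run over ordered
    pairs, which is an assumption about how the pairs are counted.
    """
    n = len(rs)
    total = sum(rs[i][j] for i in range(n) for j in range(n))
    if total == 0:
        return 0.0
    score = 0.0
    for members in clusters:
        inside = sum(rs[i][j] for i in members for j in members)
        degree = sum(rs[i][j] for i in members for j in range(n))
        score += inside / total - (degree / total) ** 2
    return score


def strength_gain(rs, previous, current):
    """Gain of moving from clustering `previous` (C) to `current` (C')."""
    return strength(rs, current) - strength(rs, previous)


# Items 0-1 and 2-3 are strongly related; items 1-2 are weakly related.
rs = [
    [0.0, 0.9, 0.0, 0.0],
    [0.9, 0.0, 0.1, 0.0],
    [0.0, 0.1, 0.0, 0.8],
    [0.0, 0.0, 0.8, 0.0],
]
singletons = [[0], [1], [2], [3]]
two_clusters = [[0, 1], [2, 3]]
one_cluster = [[0, 1, 2, 3]]

first_merge = strength_gain(rs, singletons, two_clusters)   # positive: keep merging
second_merge = strength_gain(rs, two_clusters, one_cluster)  # negative: stop
```

In this toy example the first merge increases the strength and the second one decreases it, so the clustering would stop at the two-cluster result.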

4.2 Interest level and decay function identification

After clustering items, for each user, DDCF identifies the interest level of each cluster based on his/her rating behavior. We borrow the concept of human brain memory [1, 3] to describe the preference variation. DDCF categorizes users’ preferences into instantaneous, short-term, and long-term interest level extending from the idea of the Ebbinghaus forgetting curve [29]. The preference at the instantaneous level is usually very short and may decay fast. On the contrary, the preference at the short-term level may stay longer in the brain than it does at the instantaneous level, but will still decay after a while. The preference at the long-term level is able to stay for a long time and is not easily forgotten.

Suppose the clustering result of item set O in a recommendation system is C = {c1, c2, …, cp}. For a user \(u_{i}\) and his/her rating set \(e_{i}\), we collect all rating records of the items clustered in ck and derive a rating sequence \(\left( {r_{i1} , t_{i1} } \right), \left( {r_{i2} , t_{i2} } \right), \ldots , \left( {r_{i\ell } , t_{i\ell } } \right)\) by sorting the rating records by \(t_{ij}\) in nondecreasing order. Given a user-specified time window size w, the significant set is defined as \(se_{ik} = \{ \left( {r_{ij} , t_{ij} } \right) \mid t_{i,j + 1} - t_{ij} \le w,\; t_{i0} = t_{i1} ,\; 0 \le j \le \ell \}\). According to the Ebbinghaus memory curve [29], we usually do not forget something easily after reviewing or mentioning it a significant number of times. We borrow this idea and extend it to describe preference variation. Hence, the interest level \({\mathcal{L}}_{ik}\) of ck for ui is defined as,

$$ {\mathcal{L}}_{ik} = \left\{ {\begin{array}{*{20}l} {{\text{instantaneous}}\_{\text{level, }}} \hfill & {{\text{if}}\; 0 < \left| {se_{ik} } \right| \le \delta } \hfill \\ {{\text{short-term}}\_{\text{level,}}} \hfill & {{\text{if}}\; \delta < \left| {se_{ik} } \right| \le \theta } \hfill \\ {{\text{long-term}}\_{\text{level, }}} \hfill & {{\text{if}}\;\theta < \left| {se_{ik} } \right| } \hfill \\ \end{array} } \right.. $$
(6)

Notice that δ and θ are two thresholds to identify the minimum number of ratings of interest level.
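The identification of the interest level can be sketched as follows. The handling of the boundary records in the significant set is an interpretation: here a record is kept when the gap to the preceding record in the sorted sequence is at most w, and the first record is always kept, which is how the condition t_i0 = t_i1 is read. The function names, the window w = 5, and the thresholds δ = 2 and θ = 5 are illustrative assumptions.

```python
def significant_set(records, w):
    """Keep records whose gap to the preceding record is at most w.

    `records` is a list of (rate, time) pairs for one user and one cluster.
    The first record is always kept (an assumed reading of t_i0 = t_i1).
    """
    ordered = sorted(records, key=lambda rec: rec[1])
    kept = []
    previous_time = None
    for rate, time in ordered:
        if previous_time is None or time - previous_time <= w:
            kept.append((rate, time))
        previous_time = time
    return kept


def interest_level(significant_count, delta, theta):
    """Eq. (6): map |se_ik| to a level; None if the set is empty."""
    if significant_count <= 0:
        return None
    if significant_count <= delta:
        return "instantaneous"
    if significant_count <= theta:
        return "short-term"
    return "long-term"


# One user's (rate, time) records for the items of a single cluster.
history = [(4.0, 1.0), (5.0, 3.0), (3.0, 20.0), (4.0, 22.0), (5.0, 23.0)]
kept = significant_set(history, w=5.0)
level = interest_level(len(kept), delta=2, theta=5)
```

In this example the record at time 20 is dropped because it follows a gap of 17 time units, so four records remain and the cluster is assigned the short-term level.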

According to the characteristics of each level, DDCF assigns different decay functions to the three interest levels: instantaneous, short-term, and long-term. As mentioned above, preference at the instantaneous level is usually very short-lived and may decay fast, so we choose the power decay to simulate its change. However, when a user rates the items in one cluster several times, it may mean that he/she is quite interested in this type of item. We therefore use the logistic and exponential decay functions to simulate the preference evolution of the short-term and long-term levels, respectively. The decay function of each level is defined as,

$$ \begin{aligned} & {\text{decay}}_{{{\text{instant}}}} \left( {\Delta t} \right) = \Delta t^{ - \lambda } \cdot \alpha , \\ & {\text{decay}}_{{{\text{short}}}} \left( {\Delta t} \right) = \frac{2}{{1 + e^{\lambda \cdot \Delta t} }} , \\ & {\text{decay}}_{{{\text{long}}}} \left( {\Delta t} \right) = e^{ - \lambda \cdot \Delta t} . \\ \end{aligned} $$
(7)

Notice that the parameters \(\lambda\) and \(\alpha\) could tune the decay degree of function and are usually derived by heuristic evaluation.
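The three decay functions of Eq. (7) translate directly into code. In the sketch below, the level names are plain strings and the default values λ = 0.05 and α = 1.0 are illustrative assumptions; the text states only that these parameters are derived heuristically. The power decay is not defined at Δt = 0, so the sketch rejects non-positive elapsed times for that level.

```python
import math


def decay_instant(delta_t, lam, alpha):
    """Power decay: delta_t^(-lam) * alpha (undefined at delta_t = 0)."""
    if delta_t <= 0:
        raise ValueError("power decay needs a positive elapsed time")
    return (delta_t ** -lam) * alpha


def decay_short(delta_t, lam):
    """Logistic decay: 2 / (1 + e^(lam * delta_t)); equals 1 at delta_t = 0."""
    return 2.0 / (1.0 + math.exp(lam * delta_t))


def decay_long(delta_t, lam):
    """Exponential decay: e^(-lam * delta_t); equals 1 at delta_t = 0."""
    return math.exp(-lam * delta_t)


def decay_for_level(level, delta_t, lam=0.05, alpha=1.0):
    """Select the decay function of Eq. (7) by interest level."""
    if level == "instantaneous":
        return decay_instant(delta_t, lam, alpha)
    if level == "short-term":
        return decay_short(delta_t, lam)
    if level == "long-term":
        return decay_long(delta_t, lam)
    raise ValueError(f"unknown interest level: {level!r}")


weight = decay_for_level("long-term", 10.0)
```

In practice each level would likely use its own λ; a single shared default is used here only to keep the sketch short.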

4.3 Preference prediction

Differing from traditional CF-based recommendation, DDCF uses baseline estimation and similarity calculation with decay consideration to predict the rate of the item. As pointed out in several discussions in prior studies [16, 24, 27], item-based CF methods usually have better accuracy than user-based CF. Hence, we extend the idea of item-based CF for recommendation. There are several methods which can be utilized for deriving the similarity between two items. In this study, DDCF adopts three methods: cosine, adjusted cosine, and Pearson coefficients, for calculating the similarity between two items. In the next section, we will discuss how each derivation method affects the final prediction results.

Definition 4

(Item similarity) Suppose that there is a set of users U = {u1, …, un} and a set of items O = {o1, …, om} in a recommendation system. Given two items \(o_{x} ,o_{y} \in O\), \(U_{{o_{x} ,o_{y} }}\) is the set of users in U who have rated both ox and oy. \(\overline{r}_{{o_{x} }}\) and \(\overline{r}_{{o_{y} }}\) are the average rates of items ox and oy in the recommendation system, respectively, and \(\overline{r}_{{u_{k} }}\) is the average rate given by user uk. Three methods are adopted for deriving the similarity between two items ox and oy.

  1. The cosine similarity is defined as,

    $$ cos\_sim\left( {o_{x} , o_{y} } \right) = \frac{{\mathop \sum \nolimits_{{u_{k} \in U_{{o_{x} ,o_{y} }} }}^{{}} r_{{u_{k} ,o_{x} }} \times r_{{u_{k} ,o_{y} }} }}{{\sqrt {\mathop \sum \nolimits_{k = 1}^{n} (r_{{u_{k} ,o_{x} }} )^{2} } \times \sqrt {\mathop \sum \nolimits_{k = 1}^{n} (r_{{u_{k} ,o_{y} }} )^{2} } }}, $$
    (8)

  2. The adjusted cosine similarity is defined as,

    $$ acos\_sim\left( {o_{x} , o_{y} } \right) = \frac{{\mathop \sum \nolimits_{{u_{k} \in U_{{o_{x} ,o_{y} }} }}^{{}} (r_{{u_{k} ,o_{x} }} - \overline{r}_{{o_{x} }} ) \times (r_{{u_{k} ,o_{y} }} - \overline{r}_{{o_{y} }} )}}{{\sqrt {\mathop \sum \nolimits_{k = 1}^{n} (r_{{u_{k} ,o_{x} }} - \overline{r}_{{o_{x} }} )^{2} } \times \sqrt {\mathop \sum \nolimits_{k = 1}^{n} (r_{{u_{k} ,o_{y} }} - \overline{r}_{{o_{y} }} )^{2} } }}, $$
    (9)

  3. The Pearson similarity is defined as,

    $$ pear\_sim\left( {o_{x} , o_{y} } \right) = \frac{{\mathop \sum \nolimits_{{u_{k} \in U_{{o_{x} ,o_{y} }} }}^{{}} (r_{{u_{k} ,o_{x} }} - \overline{r}_{{u_{k} }} ) \times (r_{{u_{k} ,o_{y} }} - \overline{r}_{{u_{k} }} )}}{{\sqrt {\mathop \sum \nolimits_{k = 1}^{n} (r_{{u_{k} ,o_{x} }} - \overline{r}_{{u_{k} }} )^{2} } \times \sqrt {\mathop \sum \nolimits_{k = 1}^{n} (r_{{u_{k} ,o_{y} }} - \overline{r}_{{u_{k} }} )^{2} } }}. $$
    (10)
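The three similarity measures of Definition 4 share one structure and differ only in the value subtracted from each rate, which the following sketch makes explicit. Ratings are assumed to be stored as a nested dictionary `ratings[user][item]`. The numerator runs over users who rated both items, and each denominator is taken over the users who rated the respective item, treating unrated entries as absent rather than as zero; this is one possible reading of the sums over k = 1, …, n. All names and example ratings are illustrative.

```python
import math


def item_similarity(ratings, x, y, center):
    """Shared form of Eqs. (8)-(10).

    `center(user, item)` returns the value subtracted from each rate.
    """
    both = [u for u in ratings if x in ratings[u] and y in ratings[u]]
    num = sum((ratings[u][x] - center(u, x)) * (ratings[u][y] - center(u, y))
              for u in both)
    norm_x = math.sqrt(sum((ratings[u][x] - center(u, x)) ** 2
                           for u in ratings if x in ratings[u]))
    norm_y = math.sqrt(sum((ratings[u][y] - center(u, y)) ** 2
                           for u in ratings if y in ratings[u]))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return num / (norm_x * norm_y)


def item_mean(ratings, item):
    values = [r[item] for r in ratings.values() if item in r]
    return sum(values) / len(values)


def user_mean(ratings, user):
    values = list(ratings[user].values())
    return sum(values) / len(values)


def cos_sim(ratings, x, y):
    """Eq. (8): no centering."""
    return item_similarity(ratings, x, y, lambda u, o: 0.0)


def acos_sim(ratings, x, y):
    """Eq. (9): rates centered on the item averages, as defined above."""
    means = {o: item_mean(ratings, o) for o in (x, y)}
    return item_similarity(ratings, x, y, lambda u, o: means[o])


def pear_sim(ratings, x, y):
    """Eq. (10): rates centered on each user's average, as defined above."""
    return item_similarity(ratings, x, y, lambda u, o: user_mean(ratings, u))


ratings = {
    "tim": {"ice_cream": 5, "sundae": 4, "donut": 1},
    "john": {"ice_cream": 4, "sundae": 5},
    "ann": {"ice_cream": 1, "sundae": 2, "donut": 5},
}
```

With these toy ratings, ice cream and sundae receive a positive adjusted cosine similarity, while ice cream and donut receive a negative one, since the users who like one tend to dislike the other.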

In DDCF, for a user ui, the rate prediction of a certain item oj could be derived as follows:

$$ P_{{u_{i} ,o_{j} }} = \left( {1 - \rho } \right) \times \left( {\mu - b_{{u_{i} }} - b_{{o_{j} }} } \right) + \rho \times \left( {\overline{{r_{{o_{j} }} }} + \frac{{\mathop \sum \nolimits_{k = 1}^{m} D(r_{{u_{i} ,o_{k} }} , t_{{u_{i} ,o_{k} }} ) \times sim\left( {o_{j} ,o_{k} } \right)}}{{\mathop \sum \nolimits_{k = 1}^{m} \left| {sim\left( {o_{j} ,o_{k} } \right)} \right|}}} \right), $$
(11)

where \(0 \le \rho \le 1\). Eq. (11) can be decomposed into two parts: baseline estimation and decay CF. The parameter \(\rho\) controls the proportions in which the baseline estimation and the decay CF contribute to the final prediction result. We utilize \(\mu - b_{{u_{i} }} - b_{{o_{j} }}\) as the baseline estimation to predict the rating value; \(\mu\) is the average rate of all items in the recommendation system, and \(b_{{u_{i} }}\) and \(b_{{o_{j} }}\) are the deviations of the rates of user ui and item oj, respectively. Then, when calculating the decay CF for prediction, we use Eq. (1) to derive the decayed rate \(D(r_{{u_{i} ,o_{k} }} ,t_{{u_{i} ,o_{k} }} )\) based on the time \(t_{{u_{i} ,o_{k} }}\) and the corresponding decay function in Eq. (7). The similarity function sim(.) could be cos_sim(.), acos_sim(.), or pear_sim(.) as defined in Definition 4. Given a dataset, different similarity calculations may lead to different prediction results. We will discuss how the similarity calculation affects prediction accuracy in more detail in the next section.
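The blending in Eq. (11) can be sketched as a single function. The sketch follows the equation as written, including the form of the baseline term, and makes one assumption about the sums: only the items the user has actually rated are included, since unrated items have a decayed rate of zero. The function name, argument layout, and numeric values are hypothetical.

```python
def predict_rating(rho, mu, b_user, b_item, item_mean, decayed_rates, sims):
    """Eq. (11): blend of the baseline estimate and decay item-based CF.

    `decayed_rates[k]` is D(r, t) for item o_k rated by the user and
    `sims[k]` is sim(o_j, o_k); both are dicts keyed by item id. Only items
    present in both dicts are summed (an assumption about the sums).
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    baseline = mu - b_user - b_item  # baseline term as written in Eq. (11)
    rated = [k for k in decayed_rates if k in sims]
    denom = sum(abs(sims[k]) for k in rated)
    cf = item_mean
    if denom > 0:
        cf += sum(decayed_rates[k] * sims[k] for k in rated) / denom
    return (1.0 - rho) * baseline + rho * cf


decayed = {"ice_cream": 2.0, "donut": 1.0}
sims = {"ice_cream": 0.5, "donut": 0.5}
baseline_only = predict_rating(0.0, 3.5, -0.2, 0.3, 3.0, decayed, sims)
cf_only = predict_rating(1.0, 3.5, -0.2, 0.3, 3.0, decayed, sims)
blended = predict_rating(0.5, 3.5, -0.2, 0.3, 3.0, decayed, sims)
```

Setting ρ = 0 returns the baseline estimate alone and ρ = 1 returns the decay CF term alone, which is a convenient way to check each part separately.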

5 Experimental results

To evaluate the performance of the proposed DDCF, we implemented five CF-based methods for comparison: (1) traditional item-based CF (IBCF), (2) fixed exponential decay CF (DCF-exp), (3) fixed power decay CF (DCF-pow), (4) fixed logistic decay CF (DCF-log), and (5) fixed linear decay CF (DCF-lin). All algorithms were coded in C++ and tested on a workstation with an Intel i7-3370 3.4 GHz CPU and 8 GB of main memory. A comprehensive performance study was conducted on two real MovieLens datasets [11] to show the applicability of DDCF. The MovieLens datasets are described in Table 1. The MovieLens-100K dataset contains 100,000 ratings (1–5 scale) from 716 users for 3,952 movies, while the MovieLens-1M dataset contains 1,000,000 ratings (1–5 scale) from 6,040 users for 3,952 movies.

Table 1 Descriptions of the MovieLens [11] datasets

5.1 Discussion of prediction accuracy

In this section, we discuss the accuracy of prediction by DDCF. To measure the statistical accuracy of prediction, we use the mean absolute error (MAE) and root-mean-square error (RMSE) as the metrics to evaluate the quality of the prediction results. The MAE and RMSE are defined as,

$$ {\text{MAE}} = \frac{{\mathop \sum \nolimits_{i = 1}^{n} \left| {\tilde{r}_{i} - r_{i} } \right|}}{n}. $$
(12)
$$ {\text{RMSE}} = \sqrt {\frac{{\mathop \sum \nolimits_{i = 1}^{n} \left( {\tilde{r}_{i} - r_{i} } \right)^{2} }}{n}} . $$
(13)

Here, n is the total number of predicted ratings, \(\tilde{r}_{i}\) is the predicted rating for the ith item, and \(r_{i}\) is the user’s true rating for the ith item. MAE is the average absolute difference between predicted and actual ratings; likewise, RMSE is the square root of the average squared difference between predicted and actual values. Both measures are frequently used to assess the quality of the values predicted by a model or an estimator.
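For concreteness, Eqs. (12) and (13) can be computed as in the following short sketch; the function names and the sample values are illustrative.

```python
import math


def mae(predicted, actual):
    """Eq. (12): mean absolute error."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def rmse(predicted, actual):
    """Eq. (13): root-mean-square error."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(predicted))


predicted = [3.5, 4.0, 2.0]
actual = [4.0, 4.0, 1.0]
error_mae = mae(predicted, actual)    # (0.5 + 0 + 1) / 3 = 0.5
error_rmse = rmse(predicted, actual)  # sqrt((0.25 + 0 + 1) / 3)
```

Because RMSE squares each error before averaging, it penalizes large individual errors more heavily than MAE does.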

In the first experiment, we compare the MAE and RMSE of DDCF (using the adjusted cosine similarity, i.e., acos_sim in Eq. (9)) with those of the other five CF-based methods on two real datasets, as shown in Tables 2 and 3. The proposed DDCF has the best prediction accuracy among the compared algorithms.

Table 2 MAE and RMSE in the MovieLens-100K dataset
Table 3 MAE and RMSE in the MovieLens-1M dataset

In the second experiment, to show the accuracy of DDCF with different training–testing partitions, we vary the training portion of the MovieLens-1M dataset from 50 to 90%. As shown in Figs. 5 and 6, DDCF has better accuracy than the other CF-based methods. Notice that DDCF also predicts more precisely than CF with a fixed decay function (i.e., power, logistic, linear, or exponential). This is partly because dynamically tuning the decay function simulates the variation in preference more accurately.

Fig. 5 MAE of the six algorithms on the MovieLens-100K dataset

Fig. 6 RMSE of the six algorithms on the MovieLens-1M dataset

5.2 The effect of the similarity function

In this section, we discuss how the similarity calculation method affects the prediction results. We compare the use of cosine, adjusted cosine, and Pearson similarity in DDCF and observe the accuracy on the MovieLens-1M dataset. With a training–testing ratio of 90–10%, as shown in Table 4, the adjusted cosine leads to better prediction results than the other two methods. This is partly because the adjusted cosine performs normalization, i.e., it subtracts the item average rating, before calculating the similarity between two items. We can also observe that the Pearson coefficient does not perform well in terms of depicting the similarity between two items. The normalization with user means in Pearson may not be suitable for describing the preference variations of items in DDCF.

Table 4 MAE and RMSE with different similarity calculations. (The MovieLens-1M dataset with 90%-10% training–testing ratio)

5.3 Recommendation quality

To show the quality of the recommendations, we use the precision rate, recall rate, and F-measure to evaluate the top-k recommendations of DDCF and the other CF-based methods. We use Table 5 to explain the concept of precision and recall rates. True Positive (TP) is the set of recommended movies that users will watch and rate; False Positive (FP) is the set of recommended movies that users will never watch. Conversely, False Negative (FN) is the set of nonrecommended movies that users will actually watch, and True Negative (TN) is the set of nonrecommended movies that users will not watch.

Table 5 Concept of precision and recall rates

The precision rate, recall rate, and f-measure are defined as follows:

$$ {\text{Precision}} = \frac{{{\text{TP}}}}{{{\text{TP}} + {\text{FP}}}}. $$
(14)
$$ {\text{Recall}} = \frac{{{\text{TP}}}}{{{\text{TP}} + {\text{FN}}}}. $$
(15)
$$ F{\text{-Measure}} = 2 \times \frac{{{\text{Precision}} \times {\text{Recall}}}}{{{\text{Precision}} + {\text{Recall}}}}. $$
(16)
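Equations (14)–(16) can be evaluated for a single user's top-k list as in the sketch below. The helper names, the predicted scores, and the set of watched movies are hypothetical values used only to make the arithmetic concrete.

```python
def top_k(scores, k):
    """Return the k items with the highest predicted score."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def precision_recall_f(recommended, relevant):
    """Eqs. (14)-(16) for one recommendation list."""
    recommended, relevant = set(recommended), set(relevant)
    tp = len(recommended & relevant)  # recommended and watched
    fp = len(recommended - relevant)  # recommended but not watched
    fn = len(relevant - recommended)  # watched but not recommended
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


scores = {"m1": 4.8, "m2": 4.1, "m3": 3.9, "m4": 2.0}
watched = {"m1", "m3", "m4"}
recommended = top_k(scores, k=2)
precision, recall, f_measure = precision_recall_f(recommended, watched)
```

Here one of the two recommended movies is watched, and one of the three watched movies is recommended, which gives a precision of 1/2, a recall of 1/3, and an F-measure of 0.4.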

In the following experiments, the k movies with the highest prediction scores are recommended to users. In Figs. 7, 8, 9, 10, 11, and 12, we use the precision rate, recall rate, and F-measure to evaluate the quality of recommendations on the MovieLens-100K and MovieLens-1M datasets. As shown in Figs. 7 and 8, DDCF achieves the best precision across all top-k settings. In addition, the precision advantage of DDCF over the other CF-based methods is most pronounced for the top-10 recommendations.

Fig. 7 Precision rate based on the top-N recommendations from the MovieLens-100K dataset

Fig. 8 Precision rate based on top-N recommendations from the MovieLens-1M dataset

Fig. 9 Recall rate based on the top-N recommendations from the MovieLens-100K dataset

Fig. 10 Recall rate based on the top-N recommendations from the MovieLens-1M dataset

Fig. 11 F-measure based on the top-N recommendations from the MovieLens-100K dataset

Fig. 12 F-measure based on the top-N recommendations from the MovieLens-1M dataset

For the recall rate, as shown in Figs. 9 and 10, DDCF still has the best results compared with the other CF-based methods on both the MovieLens-100K and MovieLens-1M datasets. The recall rate of the proposed DDCF outperforms IBCF and the fixed decay CF-based methods when using the top-30 recommendations.

Regarding the final experiment, we discuss the F-measure of all algorithms. Likewise, for the evaluation of F-measure, DDCF also performs better than the other recommendation algorithms. From Figs. 11 and 12, we can observe that DDCF has a high F-measure value with the top-10 and top-50 recommendations. Notice that when doing top-30 and top-40 recommendations, the F-measure of DDCF is almost the same as DCF-pow and DCF-log. This is partly because the recall rates of the three algorithms are very similar when N = 30 and 40.

6 Conclusion

In this paper, we propose a novel CF-based recommendation system, DDCF, which incorporates a dynamic decay function. DDCF extends the idea of human brain memory to dynamically adjust the decay functions based on users’ behaviors. To tackle the cold start and sparsity issues of recommendation systems, we utilize item clustering to group similar items together without any predefined parameters. Furthermore, DDCF combines baseline estimation and decay item-based recommendation to predict users’ ratings. We applied the proposed DDCF to two real datasets to show its practicability. The experimental results indicate that DDCF performs better than traditional collaborative filtering and than methods that use a fixed decay function.