1 Introduction

The increasing proliferation of Internet of Things (IoT) devices and systems [1,2,3] results in the collection of large amounts of highly heterogeneous data. It is critical for many of today's organizations to gain fast and actionable insight into this data by correlating newly obtained data (at the edge) with the historical or legacy data stored in the cloud. In a resource-constrained environment, getting the data needed for analysis at the right time is crucial from both the application-responsiveness and the energy-consumption perspectives. Edge access patterns are expected to be highly complex and context dependent [4]. This motivates us to study intelligent and flexible caching of the required cloud-hosted data at the edge.

Given the importance of context and the need to incorporate relevant information as new requirements arise, Bayesian reasoning provides the opportunity to add evidence (information that helps improve our belief) on the fly. This, in turn, allows us both to adapt to workload changes and to re-train the algorithm to handle new environments.

We use the notion of “belief” to leverage the context of the data. A “belief” encodes relationships across storage entities, which could be blocks, objects, files, storage chunks, etc., but are generically referred to here as “objects”. Consider two objects X and Y and a time window W. The belief of X regarding Y relative to window W can be defined as the conditional probability that object Y is requested within the time window W following a request for X. The belief is then used to determine (or suggest, in the case of a virtual cache) the objects to be evicted (low belief) or prefetched (belief higher than that of elements in the cache, but not present in the cache).

The rest of the paper is organized as follows. Section 2 discusses the similarities and differences between our approach and approaches used in the related literature to address our needs. The model that we use in the experiments is described in Sect. 3. Data and BeliefCache characteristics are presented in Sect. 4, while results are explained and discussed in Sect. 5. We conclude the paper and outline interesting areas of future study in Sect. 6.

2 Related Work

Algorithms for predicting future data accesses in a caching context have been the object of intensive research for many decades at the page, object, cache-line, file, and other levels, and an enormous body of knowledge has been accumulated in this area. We therefore cannot provide an extensive survey of cache management and prefetching techniques; instead, we give an overview of what we were looking for as a basis for comparison with BeliefCache.

“The cornerstone of read cache management is to keep recently requested data in the cache in the hope that such data will be requested again in the near future” [5]. Even though simple LRU caching and its more complex improvements [6, 7] are predictive approaches, they rely on first-order caching properties (recency and/or frequency of particular objects); we wanted to explore these in combination with second-order properties (relationships between the objects).

Many techniques exploit sequentiality, since it is important and widely present [8,9,10,11,12,13,14,15,16]. Yang et al. [17] indicate the need to prefetch based on random access patterns in addition to sequential ones, and observe that a cloud gateway equipped with adaptive caching/prefetching policies could significantly reduce tail latency. Many methods leverage access history information [13, 14, 18,19,20,21,22,23,24,25,26], and many of them use weighted edges as a predictor of which objects will be requested next. A possible disadvantage is that rare requests may not be a good indicator of what follows next. We solve this issue by using a window that allows several requests to vote on a likely successor; even if the current request is rare, the previous requests can still vote for prefetching candidates. Also, in many works [5, 8,9,10, 27,28,29,30] the prefetching degree and the prefetching trigger point are determined once and then kept as fixed constants throughout the workload. In order to be adaptive and flexible we fix neither; from request to request, we let the algorithm determine both based on belief. We show in the experimental section how fixing the prefetching degree affects the quality of the results.

We consider Tombolo [17] and its constituent algorithms AMP [5] and SARC [11] important steps towards augmenting first-order caching properties with second-order properties, but we aim to go a step further. The part of the Tombolo system that prefetches random access patterns is a history-based prefetching algorithm called GRAPH. It builds a graph that captures the relationships between requested objects and the requests that occur immediately afterwards. This is an important restriction which affects the ability of GRAPH to address more complex access patterns and adapt to changes in the workload. We gauge the relationships between objects by looking into a window of object successors rather than just the immediate successor. This is a generalization, since the immediate successor in our framework corresponds to a window size of 1. We believe that, at the edge, successor relationships are workload dependent: related objects might arrive at inconsistent intervals, which can be captured by enlarging the window of object successors that we examine.

Both SARC and AMP, and therefore Tombolo, since it uses them, relate utility to the amount of space that each part of the cache should contain and therefore partly decide which data to keep based on access history. We have a single policy that is able to address any pattern: instead of utility, we examine belief, and our decisions concern each individual object rather than the amount of space needed for the objects in each group.

We rely on the claim from [31] that efficient pattern discovery and description, coupled with the observed predictability of complex patterns within many high-performance applications, offers significant potential to enable many additional I/O optimizations. Thus, we are confident that belief-based caching and prefetching is well grounded and can be a good alternative for addressing complex access patterns. Directions on how to address complex access patterns can also be found in [32].

3 Method

In very broad terms, we rely on the idea of “belief” to guide the underlying heuristics in our framework and carry out all of the necessary functions of a cache. Belief is an estimate of the conditional probability that a particular object Y will be accessed next, after accessing X. It is calculated from how many times that object has been accessed in the history “look-ahead” windows matching the current access sequence. Note that the special case of \(Y=X\) captures access recency and, indirectly, the access frequency of object X. In the literature [27, 28, 33], the “look-ahead” period defines what it means for one file to be opened “soon” after another; we consider two files related if they are accessed within a “look-ahead” period of one another.

Thus, beliefs provide a directed model of conditional dependence across random access patterns, such that the integrated use of belief guides both prefetch and eviction decisions. When determining whether a new object should be fetched, we examine the belief we have that it will be accessed in the near future. To determine which objects should be removed from the cache, we compare the belief values of the objects currently in the cache and remove the ones with the smallest values. In addition to determining when a new object should be fetched, we also need to decide how many objects should be cycled into the cache. This can range from replacing a single cache element, on either a miss or a hit, if we believe that another object will be more popular, to replacing multiple objects if we believe that the pattern of accesses has changed drastically. This is how we address the increasing complexity of the encountered access patterns.

Fig. 1. BeliefCache modular framework

Maintaining a good hit ratio requires us to constantly monitor and update the belief values of the current cache elements and compare them against potential replacements to determine whether new objects should be cycled into the cache. While maintaining a good hit ratio is important, the first priority is to satisfy demand requests as quickly as possible. This is one context in which the notion of a virtual cache, shown in Fig. 1, becomes important. The virtual cache advises the real cache about the cache objects to be prefetched or evicted based on the belief calculations. This mechanism enables the real cache to service IO demand requests as fast as possible while the virtual cache (possibly running on a separate core) performs the belief calculations. Note that the real cache can ignore eviction advice from the virtual cache for objects that need to remain pinned, and can also ignore prefetch advice if there are too many demand requests to satisfy.

3.1 Calculating Beliefs

For each object we calculate the belief that it will be accessed after the current object, provided the current object has been accessed more than once. Belief is the conditional probability that the object in question will be accessed within a window of size h given that we access the current element. The belief for a particular object is calculated as the ratio of the normalized counts of how many times that object appeared in the current object's “look-ahead” history windows to the counts of how many times that object appeared in any “look-ahead” history window:

$$\begin{aligned} P(\{x\}|E = o) = \frac{c_{o-x}}{c_{x_p}} \times \frac{f_x}{f_o} \end{aligned}$$
(1)

where \(\{x\}\) means that x is in the window of size h, and \(\{x\}|E = o\) therefore means that x is in the window of size h conditioned on o being the current object. Furthermore, \(c_{o-x}\) is the count of x in the “look-ahead” history windows of o, and \(c_{x_p}\) is the count of x in all “look-ahead” history windows up to the current time, which can be considered the popularity of x. Finally, \(f_o\) and \(f_x\) are the current overall frequencies (which can also be considered the credibility of predictive strength) of o and x, respectively.
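
As a concrete illustration, the following sketch computes Eq. (1) from the raw counters; the function and dictionary names (window_counts, popularity, frequency) are ours, not part of the framework.

# Minimal sketch of Eq. (1); counter names are illustrative, not from the paper.
def belief(o, x, window_counts, popularity, frequency):
    # c_{o-x}: count of x in o's look-ahead history windows
    c_ox = window_counts.get(o, {}).get(x, 0)
    # c_{x_p}: count of x in all look-ahead history windows so far (popularity of x)
    c_xp = popularity.get(x, 0)
    if c_ox == 0 or c_xp == 0 or frequency.get(o, 0) == 0:
        return 0.0
    # (c_{o-x} / c_{x_p}) * (f_x / f_o)
    return (c_ox / c_xp) * (frequency.get(x, 0) / frequency[o])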

Moving through the trace we calculate the belief for all objects in the “look-ahead” history window of each object we encounter in the trace.

Fig. 2. A rolling history in an example trace

A rolling history in a toy example trace is shown on the left-hand side of Fig. 2. Objects are shown by their IDs. The objects for which we count the “look-ahead” history windows are circled, and “look-ahead” history windows of size 5 are represented by rectangles. The right-hand side shows how the counts get updated: counts are updated for a circled object only when the last object in its rectangle is accessed.

For each object we need to keep the counts of only the objects that appear in its “look-ahead” windows. In practice, we update the counts after the fact, when we access the last element of an element's “look-ahead” window. That is why, in Fig. 2, the last object to have its counts updated is the object with ID 4. Note that Fig. 2 shows an overly simplified example trace with look-ahead history windows of size 5, and it is included only to aid understanding of the algorithm.
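
A minimal sketch of this deferred counting, under the assumption that an object's window consists of the h accesses that follow it (the data-structure names are ours):

from collections import defaultdict, deque

def update_counts(trace, h):
    # Deferred look-ahead counting: an object's counts are committed only once
    # the last element of its size-h window has been accessed.
    window_counts = defaultdict(lambda: defaultdict(int))  # c_{o-x}
    popularity = defaultdict(int)                          # c_{x_p}
    frequency = defaultdict(int)                           # f_x
    open_windows = deque()                                 # (owner, successors seen so far)
    for oid in trace:
        frequency[oid] += 1
        for _, successors in open_windows:
            successors.append(oid)
        while open_windows and len(open_windows[0][1]) == h:
            owner, successors = open_windows.popleft()     # this window just closed
            for x in successors:
                window_counts[owner][x] += 1
                popularity[x] += 1
        open_windows.append((oid, []))
    return window_counts, popularity, frequency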

3.2 Virtual Caching Algorithm

Fig. 3. Cache candidate voting and determination example

Figure 4 gives the virtual caching algorithm, which stores the object IDs as well as the sorted beliefs calculated for the current object. As stated before, the virtual cache only determines the beliefs and what should be prefetched; it does not contain demand-request elements unless their IDs are brought in by high belief. We now briefly explain the algorithm with the help of Fig. 3.

Figure 3 shows the trace history of a simple example when a current object o with ID 15 is accessed. “Cache” represents the objects in the virtual cache (cache size \(k_t\)). Cache candidates (cache candidate size \(k_c\)) are objects that are considered for placement in the cache. The voters (vote size v) are the last accessed objects that vote on the cache candidates and the cache elements. All sizes in this toy example (h, v, \(k_c\), and \(k_t\)) are equal to 5.

Fig. 4. BeliefCache virtual cache pseudocode

For the currently accessed object, the “current object” o (line 2), if it has been accessed at least once before (line 3), we first calculate beliefs for all other objects (lines 5–7). Next, we choose the \(k_c\) objects not currently in the virtual cache that have the maximum belief of being accessed after the current object (lines 8–9). We call these objects “cache candidates”. We use the v last accessed elements in the trace, including the current element, to “vote” on which of the current cache elements and cache candidates should be in the cache. To “vote” means that, for each object in the virtual cache and among the cache candidates, we find the maximum belief that the voters provide (lines 10–13).

After obtaining the maximal beliefs, we sort the cache elements and cache candidates according to those maximal beliefs (line 15). Even though we keep one sorted list, we track which element of the list is a cache candidate and which one is already in the virtual cache.

On any event, our assumption is that we might need to replace more than one element, depending on the belief values. Therefore, we try to place in the prefetch suggestion list the top \(k_c\) elements from the sorted list of cache candidates and virtual cache elements (lines 15–21). We say “try” because we do so only if, for each object, the obtained maximal belief is higher than the threshold t (line 15). By threshold t we mean the minimum belief, or Bayesian probability, required for an object to be considered a cache candidate. Also, for each element, starting from the top of the sorted list, we compare its belief value with the smallest stored belief value among the objects in the virtual cache (line 15). If the object from the sorted list was already in the virtual cache, we only update its stored belief value (line 16). If the object from the sorted list was one of the cache candidates, we mark it for prefetching from its remote location (line 19) and store its obtained maximal belief (line 16). We continue this process until we reach a list element whose maximal belief is either smaller than the threshold or smaller than the smallest stored belief in the virtual cache. After this process is finished, and before moving on to the next event, we have the list of objects marked for prefetching, PREFETCH (line 22). We then update the associated belief values of all objects in the virtual cache based on the values from the new object (line 23), including the elements we do not want to evict.

Updating the beliefs of all elements according to the current element's beliefs, including the elements we do not want to evict (line 23), prevents high beliefs from building up in both the virtual and the real cache. In this way we retain the ability to effectively change the elements of the real cache.
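
The sketch below condenses this selection step into code. It is not the pseudocode of Fig. 4; it only illustrates the voting logic described above, with our own names (all_beliefs, vcache), and it omits trimming the virtual cache back to \(k_t\) elements for brevity.

def virtual_cache_step(voters, all_beliefs, vcache, k_c, threshold):
    # voters: the last v accessed objects, current object last
    # all_beliefs[v][x]: belief that x follows voter v; vcache: object id -> stored belief
    o = voters[-1]
    # cache candidates: k_c objects outside the virtual cache with the highest
    # belief of being accessed after the current object
    candidates = sorted((x for x in all_beliefs.get(o, {}) if x not in vcache),
                        key=lambda x: all_beliefs[o][x], reverse=True)[:k_c]
    # each voter votes; an object keeps the maximum belief any voter gives it
    def max_vote(x):
        return max((all_beliefs.get(v, {}).get(x, 0.0) for v in voters), default=0.0)
    ranked = sorted(set(candidates) | set(vcache), key=max_vote, reverse=True)
    floor = min(vcache.values()) if vcache else 0.0
    prefetch = []
    for x in ranked[:k_c]:
        b = max_vote(x)
        if b < threshold or b < floor:
            break                       # everything further down is not worth cycling in
        if x not in vcache:
            prefetch.append(x)          # suggest prefetching from the remote location
        vcache[x] = b                   # store / refresh the belief
    return prefetch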

Subtracting Counts. At equidistant intervals \(s_l\) we subtract the previous snapshot from the current sparse matrix of counts and then take a new snapshot of the result.

Subtracting counts contributes to at least two goals. First, our beliefs remain current and we can adapt to workload changes faster. Second, we prevent overflow of the cells in the sparse matrix of counts, which might happen if certain elements are accessed excessively. Nevertheless, we add a further safeguard against overflow by capping the maximum cell value.
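
A minimal sketch of one subtraction step, assuming the counts are kept as a nested dictionary (our representation, not necessarily the authors'):

def subtract_snapshot(counts, snapshot):
    # Subtract the previous snapshot from the live counts, drop emptied cells to
    # keep the matrix sparse, and return a new snapshot of what remains.
    for o, row in counts.items():
        prev = snapshot.get(o, {})
        for x in list(row):
            row[x] -= prev.get(x, 0)
            if row[x] <= 0:
                del row[x]
    return {o: dict(row) for o, row in counts.items() if row}

# called every s_l requests:  snapshot = subtract_snapshot(counts, snapshot)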

Virtual Cache Complexity. Let n be the number of unique objects in the trace. Note that \(k_c\) and h are fixed and much smaller than n. On the other hand, \(k_w\) (the number of distinct objects that appear in an object's “look-ahead” windows) is not fixed initially, but is expected to be much smaller than n.

For each event we have to increment h object counters and h belief counters (O(h) in the worst case). We calculate beliefs and sort the \(k_c\) best candidates, which takes \(O(k_w \cdot k_c \log k_c)\). We then sort the \(k_t\) cache elements with complexity \(O(k_t \log k_t)\) and perform at most \(2k_t\) comparisons (\(O(k_t)\)). The total complexity can be controlled by avoiding the calculation of probabilities when belief can be gauged by simple heuristics and by keeping all lists sorted; in that case each event costs only \(O(\log k_t)\).

The required storage for the belief counters is \(O(n\times k_w)\). In the worst-case scenario, which is highly unlikely, \(k_w\) could approach n. Oly and Reed [34] claim that this can happen only for the most complex, irregular patterns: all objects would have to be related to all other objects, or, in other words, every window following an object would need to contain different objects. Even then, since h is fixed and much smaller than n, for all objects to build relationships to all other objects they would need to be accessed many times and a lot of time would have to pass. However, by subtracting counts we limit the time interval during which this phenomenon could occur, and therefore \(k_w\) can never approach n.

3.3 Real Cache Module

On completion of every IO access, the virtual cache is supposed to provide prefetch and eviction advice, i.e., the lists PREFETCH and BELIEF. The real cache takes the PREFETCH list as advice for prefetching elements into itself. For the real cache elements, the associated belief information is obtained from the belief vector BELIEF, which is regularly updated by the virtual cache; we keep a copy of the list in case it is locked for processing by the virtual cache. In addition, unlike the virtual cache, the real cache also ensures that the element required by the current IO access is brought in if that access is a cache miss. For completeness, the algorithm for the real cache is shown in Fig. 5.

Fig. 5. BeliefCache real cache pseudocode

Note that, since the PREFETCH and BELIEF lists are only suggestions, the real cache could just as well use other policies to evict elements.
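
As an illustration of how a real cache might consume this advice (not the pseudocode of Fig. 5; the PREFETCH/BELIEF names follow the text, everything else is ours):

def real_cache_access(oid, cache, capacity, PREFETCH, BELIEF, fetch):
    # Serve the demand request first, then apply prefetch advice opportunistically.
    hit = oid in cache
    if not hit:
        cache[oid] = fetch(oid)                 # demand fetch always happens
    for p in PREFETCH:                          # advice only: may be skipped under load
        if p not in cache:
            cache[p] = fetch(p)
    while len(cache) > capacity and len(cache) > 1:
        # evict the advised lowest-belief object (other policies are possible)
        victim = min((x for x in cache if x != oid), key=lambda x: BELIEF.get(x, 0.0))
        del cache[victim]
    return hit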

4 Evaluation Characteristics

To show the characteristics of our framework, we built a simulator in which we implemented our algorithm along with the algorithm we compare against, Tombolo (SARC-GRAPH-AMP).

In the following sections we first introduce the evaluation measures in Sect. 4.1. Section 4.2 describes the characteristics of the traces we worked with, and Sect. 4.3 characterizes the effects of the user-settable parameters and the decisions we made.

All experiments were performed on a machine with two 6-core Intel(R) Xeon(R) E5-2630 v2 CPUs @ 2.60 GHz and 128 GB RAM. All code was written in Python 3. Hereinafter, unless stated otherwise for a particular result, all sizes are relative to the number of unique objects in the trace.

4.1 Evaluation Measures

We consider the following metrics.

(a) Hit ratio: the fraction of requests that result in a cache hit over all requested data (equivalently, the fraction of requests that do not incur on-demand fetch latency).

(b) Used ratio: the fraction of objects inserted into the cache that receive at least one request before being evicted.

(c) Insertion ratio (includes prefetch requests and demand-fetch requests): the average number of objects inserted per request.

The used ratio is used only to characterize the algorithm in Sect. 4.3. The important contributors to energy consumption are prefetching and demand fetching; that part of the energy consumption is directly proportional to the insertion ratio.
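
For concreteness, a minimal sketch of how these three metrics can be computed from simulation counters (the counter names are ours):

def evaluation_metrics(requests, hits, insertions, used_before_eviction):
    # requests: demand requests seen; hits: demand requests served from cache
    # insertions: objects inserted (prefetch + demand fetch)
    # used_before_eviction: inserted objects hit at least once before eviction
    hit_ratio = hits / requests
    used_ratio = used_before_eviction / insertions
    insertion_ratio = insertions / requests
    return hit_ratio, used_ratio, insertion_ratio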

4.2 Workloads

We focus on read requests, since writes can largely be buffered locally in persistent storage and flushed to the remote system asynchronously. We used only the FileName field of each trace to discriminate between accessed objects, and used the TimeStamp, ElapsedTime, and ByteOffset fields for latency evaluation.

Microsoft Block IO Traces. For our evaluation, we used Microsoft server block traces available on the Storage Networking Industry Association (SNIA) website (Footnote 1). The traces include Display Ad (DA), Microsoft Network (MSN), and Exchange Server (ES), all collected in the 2007–2008 time frame.

The traces are primarily disk IO (block-level) and file IO traces, and they represent the original stream of access events already filtered through a cache.

SPEC SFS ’14 Workload. The other workload we used is Video Data Acquisition (VDA), which is part of SPEC SFS (a file system benchmark). We set up a cloud gateway file system, cloudfuse, which talks the Swift API to the object store and exposes a POSIX interface to the user. As the object store we set up OpenStack on a single node.

The VDA streaming workload is the most appropriate for the edge context, since it simulates applications that store data acquired from individual devices such as surveillance cameras. Generally, the idea is to admit as many streams as possible subject to bit-rate and fidelity constraints. The workload consists of two entities, VDA1 (data stream) and VDA2 (companion applications), each about 36 Mb/s. VDA1 is 100% write access, while VDA2 is mostly random read access.

Fig. 6. DA cumulative frequency distribution

Workload Characteristics. One important characteristic of a workload is its object popularity distribution. We calculated it as the CFD (cumulative frequency distribution) of the number of accesses to unique objects arranged in decreasing order of access rate. Figure 6 shows the object popularity distribution for the DA workload; all the other popularity distributions look almost the same. The x axis represents the fraction of the number of unique objects, while the y axis represents the cumulative popularity of the objects. It can be seen that, for all traces, a relatively small percentage of the unique objects is accessed frequently.
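
A short sketch of this computation as we describe it (NumPy-based, our own reconstruction):

import numpy as np

def popularity_cfd(trace):
    # unique objects sorted by decreasing access count
    _, counts = np.unique(trace, return_counts=True)
    counts = np.sort(counts)[::-1]
    x = np.arange(1, len(counts) + 1) / len(counts)   # fraction of unique objects
    y = np.cumsum(counts) / counts.sum()              # cumulative popularity
    return x, y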

Note that all further evaluations were done for different real cache sizes, typically ranging (in terms of the number of 16 kB blocks) from 10% to 30% of the number of unique IO accesses in the workload, which is less than 3% of the total number of unique objects. The size of the virtual cache was varied between 1% and 100% of the real cache size, as appropriate.

Another important characteristic of a workload is its discretized autocorrelation function. It shows the retention time of objects and is calculated by incrementing a counter each time we encounter the same object being accessed at a given lag value; the function is then plotted after normalization. Figures 7(a), (b), and (c) show the discretized autocorrelation function for the workloads. The x axis represents the time lag, while the y axis represents the fraction of the unique objects that are repeatedly accessed after that lag. For DA, the same objects appear regularly with a short gap; the MSN workload has very similar characteristics to DA, which is why we do not show it here. For ES, a medium-term “memory” is evident, i.e., a pattern persists for some time and then does not show up again. SPEC SFS shows a truly short pattern and some long-term correlations, but they are all rather weak.
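
A sketch of this calculation under our reading of the description, where the value at each lag is the fraction of unique objects that repeat at exactly that lag:

def discretized_autocorrelation(trace, max_lag):
    n_unique = len(set(trace))
    acf = {}
    for lag in range(1, max_lag + 1):
        # unique objects re-accessed exactly `lag` requests later
        repeated = {trace[i] for i in range(len(trace) - lag) if trace[i] == trace[i + lag]}
        acf[lag] = len(repeated) / n_unique
    return acf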

Fig. 7. Discretized autocorrelation for (a) DA, (b) ES, and (c) SPEC SFS

4.3 BeliefCache Characteristics

In this section we discuss the characteristics of BeliefCache and how its internal parameters affect the behaviour of the algorithm. Note that our end goal is to have an adaptive algorithm once the parameters are learned.

General Trends. We noticed that, across the available workloads exhibiting different short-, medium-, and long-term correlations, the general trends were similar, except that the peaks and valleys appeared at slightly different points in the graphs.

BeliefCache vs. Fixed Degree Prefetching. For all of the workloads we measured, BeliefCache significantly outperforms otherwise similar cache management algorithms whose prefetching policy fixes the number of objects to prefetch.

Internal Parameters. The parameters are tuned during training as follows: for each parameter, we choose the value that maximizes the ratio of the hit ratio to the insertion ratio. In other words, we want our algorithm to provide as good a hit rate as possible while keeping data movement minimal. For the workloads we examined this is sufficient, and the algorithm delivers what it promises during validation.

Parameters are tuned individually so as to reduce the complexity of training. Tuning the parameters jointly is possible through the application of a randomized optimization algorithm, which could find parameter settings that give better results.
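
A minimal sketch of this one-parameter-at-a-time tuning; simulate is a stand-in for a run of the simulator on the training trace that returns (hit ratio, insertion ratio):

def tune_parameters(params, grids, simulate):
    # Tune each parameter individually, maximizing hit ratio / insertion ratio.
    for name, grid in grids.items():
        best_value, best_score = params[name], float("-inf")
        for value in grid:
            hit, ins = simulate({**params, name: value})
            score = hit / ins if ins > 0 else 0.0
            if score > best_score:
                best_value, best_score = value, score
        params[name] = best_value
    return params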

Note that, while our algorithm continues to update the probability values during both training and testing, we only learn the values of internal parameters during the training phase.

Training Size: In general, there is a common learning curve for every problem that involves learning: low accuracy at the beginning, an increase for some period of time, and then a saturation point. We noticed the absence of this common learning-curve dependency for the Microsoft traces. This could be explained by the fact that our algorithm needs only a small amount of data to produce good results, or by the fact that for these workloads the amount of training data is not a significant parameter. For the SPEC SFS trace there is a lot to learn, and larger training sizes are needed.

Cache Candidate Size: The cache candidate size does not affect the virtual cache computation time. However, care must be taken in choosing an appropriate value. With an extremely small cache candidate size, no significant prefetching takes place. Once a suitably large cache candidate size is reached, the hit rate peaks and then remains stagnant with further increases, since what ends up being examined as a cache candidate are elements with lower belief values.

Voter Size: Contrary to what one would expect, the hit rate peaks initially and then dips with further increases in voter size. One would expect prediction accuracy to improve as the number of voters increases. However, with a larger voter size, elements with less temporal locality to the current object (further away from it) might boost the belief of the wrong cache candidate. A large voter size is also undesirable because it increases the probability calculation overhead. Therefore, the best choice of voter size is a small one; in fact, a larger voter size is counter-productive.

Look-ahead History Window Size: As the window size increases, the hit rate is initially small until a peak is reached; beyond that, the trend is again descending. The reason is that, with a larger window, more elements appear in an object's window, so the probabilities get diluted. As a result, the probabilities do not become sufficiently large for elements to be prefetched into the cache. Since the computation time steadily increases with the look-ahead history window size, the best choice for this parameter is also a small value.

Belief Threshold: As already mentioned, the belief threshold is the minimum belief that an object must have in order to be considered for placement in the cache. For the workloads we explored, this parameter proves highly valuable in protecting the cache from pollution. We expect that overprotecting the cache might hurt performance.

The hit rate peaks initially and then dips with further increases in the belief threshold. This means we should have a certain belief formed about objects before we try to put them in the cache; it is also noticeable that, beyond a certain point, increasing the threshold hurts performance. A lower belief threshold recommends more irrelevant prefetch elements, which explains the low initial used ratio. With a higher belief threshold, the algorithm becomes more conservative about the elements to be prefetched, which leads to fewer insertions and thus directly affects the hit rate.

Tests also showed that the choice of belief threshold does not cause a variation in the average computation time per IO access in the virtual cache. Thus, the best choice of belief threshold will be workload specific.

5 BeliefCache Evaluation

The evaluation of the algorithm is done in comparison with Tombolo (SARC-GRAPH-AMP) in several stages. The most important requirement, the ability to address complex patterns, is addressed first. In Sect. 5.2 we evaluate the speed of adjustment; we expect that the framework which adjusts better to variations in the workload is better able to reduce tail latency. Throughout the experiments, both Tombolo and our framework had about the same execution times.

Fig. 8. BeliefCache vs. Tombolo hit rate for (a) DA, (b) ES, and (c) MSN

5.1 Evaluation with Complex Patterns

We compare the performance of BeliefCache against the Tombolo scheme. Figures 8(a), (b), (c), and 9 show how the hit ratio depends on the cache size. In all figures, the x axis represents the cache size as a fraction of the number of unique objects and the y axis the hit rate; different line colors represent different prefetching policies. As expected, both algorithms perform better with a larger cache on the workloads we examined. For the traces that exhibit random access patterns and short- to medium-term correlations, BeliefCache is slightly better than Tombolo, but the differences are not significant. For the trace that exhibits complex access patterns and long-term correlations, BeliefCache is significantly better than Tombolo. The differences can be explained by the decisions, made in the parts of Tombolo that address random access patterns, to keep the graph manageable and to ignore long chains of repeated accesses.

Fig. 9. BeliefCache vs. Tombolo

5.2 Adjustment to the Variations in the Workload

Finally, we evaluate the ability of BeliefCache to adjust to changes in the workload and compare its speed of adjustment with Tombolo's.

To do this, we divided each trace into two parts: one remains untouched and the other is perturbed. We train both algorithms on the untouched part and measure the hit rate on the perturbed part, which is changed in the following way: we keep the same access pattern but change the objects that exhibit it. For example, if the accesses were to objects 1 7 1 7, we change them to 2 8 2 8. Therefore, everything that our algorithm has learned from previous accesses must be relearned to adjust to the perturbation. Since the access patterns remain the same, both algorithms should at some point recover the hit rate measured on that part before the perturbation.
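
A minimal sketch of this perturbation, assuming the trace is a sequence of integer object IDs and the offset is chosen so that the remapped IDs do not collide with existing ones (the paper's example corresponds to an offset of 1):

def perturb(trace, offset):
    # Preserve the access pattern but remap every object to a fresh id,
    # e.g. 1 7 1 7 -> 2 8 2 8 with offset 1.
    mapping = {}
    out = []
    for oid in trace:
        mapping.setdefault(oid, oid + offset)
        out.append(mapping[oid])
    return out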

For all the examined workloads, BeliefCache recovers the previous hit rate twice as fast as Tombolo. As the experiment shows, BeliefCache quickly adapts to the new workload and converges to the old hit rate, requiring only a small fraction of the new test set to make the adjustment. In contrast, Tombolo's prefetching shows only a slight upward slope as it proceeds through the trace. This implies a rigidity of the structure it builds to capture the relationships between different objects in the trace.

6 Conclusions and Future Work

The BeliefCache modular framework exploits contextual and temporal relationships gauged from the access history, in the presence of complex access patterns at the edge, to improve prefetching and eviction performance. In this work, the time window, voting window size, and belief threshold are among the parameters learned from the access history and kept constant while the framework executes. In contrast, the beliefs themselves are dynamic values that are updated on each request. The elements with the highest belief values are considered for prefetching and those with the lowest values for eviction. The net result is a unified prefetching/caching algorithm that is not only competitive with state-of-the-art algorithms, but can also quickly adjust itself to changes in the workload.

Important next steps are to explore the robustness of the parameters with respect to workload changes. Clearly, if the workload changes very substantially, the performance of the algorithm may drop and require an update to the parameters; understanding the right time to retrain them might improve the performance of the framework. Further interesting avenues of research are to extend the belief by looking at temporal, behavioral, and structural factors and to explore multiple sets of beliefs.