Abstract
A novel motion estimation method based on block matching and motion refinement is proposed for frame interpolation. The method has two stages, coarse searching and fine searching. Coarse searching reduces the amount of calculation and the algorithm complexity, while fine searching refines the motion vectors to improve the final performance. In the coarse searching stage, a new motion refinement algorithm, the Angular-Distance Median Filter (ADMF), is proposed to correct wrong motion vectors; it addresses the blurring problems that result from overlapped situations, in which different blocks move towards similar positions after the initial motion estimation. Experimental results show that the proposed approach outperforms the compared approaches in both subjective and objective evaluation.
Keywords
- Frame interpolation
- Frame Rate Up-Conversion
- Motion estimation
- Motion refinement
- Angular-Distance Median Filter
1 Introduction
Frame interpolation, also called Frame Rate Up-Conversion (FRUC), generates new frames from prior information in order to increase the frame rate. For example, FRUC can convert a video from 30 frames per second to 60 frames per second or more by interpolating new frames. Frame interpolation techniques can be applied to improve the visual quality of videos on electronic equipment such as televisions, game consoles and computers.
The conventional framework of frame interpolation is composed of block matching motion estimation and Motion-Compensated Interpolation. Block matching motion estimation is our main concern, and many algorithms build on it. [8] focused on reducing the computational complexity. [1, 9, 15] made use of multivariate information, including multi-frame and multi-level information. [4, 6, 7, 12, 14] concentrated on obtaining true motion vectors through motion estimation and motion refinement.
Motivated by the efficiency and the performance of block matching motion estimation and motion refinement, a new motion estimation method with coarse-to-fine searching and the Angular-Distance Median Filter (ADMF) is proposed for frame interpolation, as illustrated in Fig. 1. The reasons for designing a coarse-to-fine framework and the motion refinement algorithm ADMF are explained as follows.
In [4, 6, 12, 14], some researchers combined Unidirectional Motion Estimation (UME) and Bidirectional Motion Estimation (BME), while others combined Forward Motion Estimation and Backward Motion Estimation. These combinations aim to obtain more accurate motion vectors. However, simply combining them is inefficient and retains much redundant calculation.
To reduce the redundant computation, Bidirectional Motion Estimation is applied in both coarse and fine searching, because the computation of BME is much lower than that of the combined motion estimation methods mentioned above. In addition, the relatively accurate motion vectors generated by BME can be used by the proposed ADMF algorithm to refine wrong motion vectors. Based on BME, the coarse-to-fine framework not only reduces the amount of calculation but also improves the final performance.
After the initial motion estimation, a variety of motion refinement methods can be applied to refine the motion vectors. These algorithms include Spatio-Temporal Motion Vector Smoothing [12], Two-Dimensional Weighted Motion Vector Smoothing (2DW-MVS) [14] and Trilateral Filtering Motion Smoothing [4]. These methods are unable to correct all the wrong motion vectors generated by the initial motion estimation, which results in blurring problems. Blurring problems result from overlapped situations in which different estimated blocks move towards similar positions.
In order to correct wrong motion vectors effectively, a new motion refinement algorithm, the Angular-Distance Median Filter, is proposed and applied after the initial motion estimation. ADMF is based on the angle and the distance of motion vectors. Wrong motion vectors are refined according to their neighbouring motion vectors, so most wrong motion vectors can be corrected. More details of ADMF are given in Sect. 2.
The contributions of this paper are summarized as follows. Firstly, a new method of motion estimation, which is a two-stage method from coarse searching to fine searching, is proposed for frame interpolation. Secondly, a novel algorithm of motion refinement called Angular-Distance Median Filter is put forward to effectively correct wrong motion vectors. Thirdly, experimental results demonstrate that the proposed approach outperforms the other compared techniques for frame interpolation in both subjective and objective evaluation.
The rest of this paper is organized as follows. Sect. 2 elaborates the proposed algorithm, including coarse-to-fine searching and the Angular-Distance Median Filter. Sect. 3 presents and analyses the experimental results. Sect. 4 concludes the paper.
2 Methodology
As illustrated in Fig. 1, the framework of frame interpolation is composed of motion estimation and Motion-Compensated Interpolation. This article proposes a new method of motion estimation based on coarse-to-fine searching and Angular-Distance Median Filter (ADMF).
Firstly, from the previous frame and the current frame in test video sequences, the initial motion vectors are estimated by Bidirectional Motion Estimation (BME) in a wide search range. Secondly, ADMF will be applied to update motion vectors until it meets terminal conditions. Thirdly, on the basis of updated motion vectors generated by ADMF, BME is employed again to refine motion vectors in a small search range. Fourthly, the algorithm of Motion-Compensated Interpolation will utilize the final motion vectors to generate interpolated frames.
2.1 Coarse Searching
In coarse searching, BME [3] is applied to estimate the initial motion vectors. BME is chosen for three reasons. Firstly, its computational complexity is much lower than that of the combination of Forward Motion Estimation and Backward Motion Estimation [12]. Secondly, the hole problems that result from Unidirectional Motion Estimation (UME) do not exist in BME; holes appear where no estimated block moves to. Thirdly, the initial motion vectors calculated by BME are sufficient for the subsequent motion refinement, which uses true motion vectors to correct wrong ones.
The schematic diagram of BME is illustrated in Fig. 2. In the left half of Fig. 2, \( F(n-1) \), \( F(n-\dfrac{1}{2}) \) and \( F(n) \) denote the previous frame, the interpolated frame and the current frame, respectively. The motion of blocks is assumed to be linear. Motion vectors \( \varvec{v} \) are estimated by comparing the similarity of different blocks. The similarity criterion is the sum of absolute differences (SAD) between the pixel values in the previous frame \( F(n-1) \) and those in the current frame \( F(n) \).
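The linear-motion assumption can be illustrated with a minimal Python sketch. It assumes a sign convention that the text does not state: a pixel at position p in the interpolated frame corresponds to p - v in the previous frame and p + v in the current frame. The sketch averages the two matched pixels; it is plain bidirectional averaging, not the OBMC used in the experiments. The names `interpolate_block` and `make_frame` are hypothetical.

```python
def interpolate_block(prev, curr, rows, cols, v):
    """Average prev at (p - v) and curr at (p + v) for every pixel p of a block.

    rows and cols are half-open (start, stop) ranges in the interpolated frame;
    v = (vx, vy) is the motion vector of the block. The sign convention is an
    assumption, and no bounds checking is done.
    """
    vx, vy = v
    block = []
    for y in range(rows[0], rows[1]):
        line = []
        for x in range(cols[0], cols[1]):
            line.append((prev[y - vy][x - vx] + curr[y + vy][x + vx]) / 2)
        block.append(line)
    return block


def make_frame(size, left, top, value=255, side=4):
    """Synthetic test frame: a bright square on a black background."""
    frame = [[0] * size for _ in range(size)]
    for y in range(top, top + side):
        for x in range(left, left + side):
            frame[y][x] = value
    return frame


# The square moves 4 pixels to the right between the two frames, so the
# vector towards the interpolated frame is half of that: v = (2, 0).
prev = make_frame(32, left=10, top=10)
curr = make_frame(32, left=14, top=10)
mid = interpolate_block(prev, curr, (8, 24), (8, 24), (2, 0))
ghost = interpolate_block(prev, curr, (8, 24), (8, 24), (0, 0))
```

With the correct vector the square appears sharply at the halfway position, whereas the zero vector blends the two positions into half-intensity ghosts.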
As shown in the right half of Fig. 2, \( B_{ij} \) represents the block in the \( i \)th row and the \( j \)th column of the interpolated frame. It is defined as:
where \( (x, y) \) denotes the position of a pixel in the interpolated frame and BS is the block size of \( B_{ij} \). To enhance the accuracy of motion estimation, the block size of \( B_{ij} \) is expanded. \( EB_{ij} \) represents the expanded block of \( B_{ij} \) with expanded size ES:
After the definitions of block and block size, SAD is used to calculate the motion vectors \( \lbrace \varvec{v}_{ij}\rbrace \). A motion vector \( \varvec{v}_{ij}=(v_x,v_y)\) denotes the displacement of the block \( EB_{ij} \) relative to the previous frame and the current frame. In order to differentiate SAD values in various stages, the SAD value in coarse searching is called SADC. The mathematical expressions of SADC and the motion vectors \( \lbrace \varvec{v}_{ij}\rbrace \) are:
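As one possible reading of the block and expanded-block definitions, the following Python sketch computes their pixel ranges. The conventions are assumptions, not taken from the text: indices are 0-based, ranges are half-open, the expanded block grows by ES pixels on every side, and it is clipped at the frame borders. The function names are hypothetical.

```python
def block_range(i, j, bs):
    """Half-open (row range, column range) of block B_ij, 0-based (assumed)."""
    return (i * bs, (i + 1) * bs), (j * bs, (j + 1) * bs)


def expanded_block_range(i, j, bs, es, height, width):
    """Range of EB_ij: B_ij grown by es pixels per side, clipped to the frame.

    Growing on every side and clipping at the borders are both assumptions.
    """
    (r0, r1), (c0, c1) = block_range(i, j, bs)
    rows = (max(r0 - es, 0), min(r1 + es, height))
    cols = (max(c0 - es, 0), min(c1 + es, width))
    return rows, cols


# Settings from the experiments: BS = 16, ES = 8, frames of 352 x 288 pixels.
inner = expanded_block_range(1, 2, 16, 8, 288, 352)
corner = expanded_block_range(0, 0, 16, 8, 288, 352)
```

Under these assumptions an interior block covers 32 x 32 pixels after expansion, while a corner block is truncated to 24 x 24.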
and
where
In the above Eq. (5), CSR represents the search range in coarse searching, while CWS means the search window size in coarse searching.
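The coarse search can be sketched in Python as an exhaustive scan over candidate vectors that minimises a bidirectional SAD. This is a minimal sketch under assumptions: the SAD compares the previous frame at p - v with the current frame at p + v, candidates that leave the frame are rejected, and a single hypothetical parameter `search_range` stands in for the search range, since the exact relation between CSR and CWS is not recoverable here.

```python
def sad_bidirectional(prev, curr, rows, cols, v):
    """SAD between prev at (p - v) and curr at (p + v) over a pixel range."""
    vx, vy = v
    h, w = len(prev), len(prev[0])
    total = 0
    for y in range(rows[0], rows[1]):
        for x in range(cols[0], cols[1]):
            yp, xp, yc, xc = y - vy, x - vx, y + vy, x + vx
            if not (0 <= yp < h and 0 <= xp < w and 0 <= yc < h and 0 <= xc < w):
                return float("inf")  # candidate leaves the frame: reject it
            total += abs(prev[yp][xp] - curr[yc][xc])
    return total


def coarse_search(prev, curr, rows, cols, search_range, step):
    """Return (best vector, its SAD) over a square grid of candidates."""
    best_v, best_sad = (0, 0), float("inf")
    for vy in range(-search_range, search_range + 1, step):
        for vx in range(-search_range, search_range + 1, step):
            s = sad_bidirectional(prev, curr, rows, cols, (vx, vy))
            if s < best_sad:
                best_v, best_sad = (vx, vy), s
    return best_v, best_sad


def make_frame(size, left, top, value=255, side=4):
    """Synthetic test frame: a bright square on a black background."""
    frame = [[0] * size for _ in range(size)]
    for y in range(top, top + side):
        for x in range(left, left + side):
            frame[y][x] = value
    return frame


# The square moves 4 pixels to the right, so the expected vector is (2, 0).
prev = make_frame(32, left=10, top=10)
curr = make_frame(32, left=14, top=10)
result = coarse_search(prev, curr, (8, 24), (8, 24), search_range=4, step=2)
```

A step size of 2 keeps the number of candidates small, which is the point of the coarse stage; the residual error of up to one pixel is left to fine searching.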
2.2 Angular-Distance Median Filter
Once the initial motion vectors \( \lbrace \varvec{v}_{ij}\rbrace \) have been generated by BME, the Angular-Distance Median Filter is applied to refine them, as illustrated in Fig. 3. Red arrows denote wrong motion vectors, while black arrows denote true motion vectors. As the blue circle in Fig. 3 shows, motion vectors of adjacent blocks point to similar positions, which results in blurring problems in the final interpolated frame. It is observed that a main direction of motion exists in most frames of the test video sequences, which means that wrong motion vectors can be improved or corrected by neighbouring motion vectors. The mathematical formulation of ADMF is explained as follows.
ADMF uses the angle and the distance (magnitude) of motion vectors. The angle A and the distance D are defined as:
and
where \( \varvec{v} \) denotes the initial motion vector generated by BME, while \( \varvec{v}_0=(1,0) \) and it is chosen as a reference direction.
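A minimal Python sketch of the two quantities follows. It assumes that the angle is the arccosine of the normalised dot product with the reference direction, that the distance is the Euclidean norm, and that the angle of a zero vector is 0; none of these details is confirmed by the text, and the function names are hypothetical.

```python
import math

V0 = (1.0, 0.0)  # reference direction v_0 from the text


def angle(v, v0=V0):
    """Angle in [0, pi] between v and v0; 0 for a zero vector (assumed)."""
    norm = math.hypot(v[0], v[1]) * math.hypot(v0[0], v0[1])
    if norm == 0:
        return 0.0
    cosine = (v[0] * v0[0] + v[1] * v0[1]) / norm
    return math.acos(max(-1.0, min(1.0, cosine)))  # clamp rounding errors


def distance(v):
    """Euclidean length of the motion vector (assumed definition of D)."""
    return math.hypot(v[0], v[1])
```

Note that with an arccosine-based angle, vectors mirrored about the reference direction, such as (0, 1) and (0, -1), receive the same angle; a signed angle would distinguish them.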
On the basis of \( A(\varvec{v},\varvec{v}_0) \) and \( D(\varvec{v}) \), the Absolute Angular Difference (AAD) and the Absolute Distance Difference (ADD) are calculated to judge the validity of the motion vectors \( \lbrace \varvec{v}_{ij}\rbrace \). \( AAD(\varvec{v}_{ij}) \) and \( ADD(\varvec{v}_{ij}) \) are defined as:
and
where \( N=8 \) and \( \lbrace \varvec{v}_k \rbrace \) denotes the 8 neighbouring motion vectors of the centre motion vector \( \varvec{v}_{ij} \).
After \( AAD(\varvec{v}_{ij}) \) and \( ADD(\varvec{v}_{ij}) \) have been calculated, a threshold is set to judge the validity \( V_{ij} \) of the motion vector \( \varvec{v}_{ij} \):
If \( V_{ij}=0 \), the motion vector \( \varvec{v}_{ij} \) is updated by a median filter:
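The two measures can be sketched in Python as follows. The sketch assumes that AAD and ADD are the mean absolute differences between the centre vector and its neighbours, and that border blocks simply use the neighbours that exist; both points are assumptions, as are the function names.

```python
import math


def angle(v):
    """Angle between v and the reference direction (1, 0); 0 for a zero vector."""
    n = math.hypot(v[0], v[1])
    return 0.0 if n == 0 else math.acos(max(-1.0, min(1.0, v[0] / n)))


def neighbours(field, i, j):
    """The up to 8 vectors around block (i, j); fewer at frame borders."""
    rows, cols = len(field), len(field[0])
    return [field[r][c]
            for r in range(i - 1, i + 2)
            for c in range(j - 1, j + 2)
            if (r, c) != (i, j) and 0 <= r < rows and 0 <= c < cols]


def aad(field, i, j):
    """Mean absolute angular difference to the neighbours (assumed form)."""
    nb = neighbours(field, i, j)
    a = angle(field[i][j])
    return sum(abs(a - angle(v)) for v in nb) / len(nb)


def add(field, i, j):
    """Mean absolute distance difference to the neighbours (assumed form)."""
    nb = neighbours(field, i, j)
    d = math.hypot(*field[i][j])
    return sum(abs(d - math.hypot(*v)) for v in nb) / len(nb)


# A 3 x 3 field moving right by 2 pixels, with a deviating centre vector.
turned = [[(2, 0)] * 3 for _ in range(3)]
turned[1][1] = (0, 2)   # same length, rotated by 90 degrees
longer = [[(2, 0)] * 3 for _ in range(3)]
longer[1][1] = (6, 0)   # same direction, three times as long
```

The rotated vector is detected only by AAD and the over-long vector only by ADD, which suggests why both measures are used together.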
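As a hedged illustration of the validity test and the median update, the Python sketch below assumes that a vector is valid when both AAD and ADD are at or below their thresholds, and that the median is taken component-wise over the neighbouring vectors. The threshold values used in the example are hypothetical; the text does not give them here. The function names are also hypothetical.

```python
from statistics import median


def is_valid(aad_value, add_value, angle_thr, dist_thr):
    """Return 1 if both differences are within their thresholds, else 0."""
    return 1 if (aad_value <= angle_thr and add_value <= dist_thr) else 0


def median_update(neighbour_vectors):
    """Component-wise median of the neighbouring vectors (assumed variant)."""
    return (median(v[0] for v in neighbour_vectors),
            median(v[1] for v in neighbour_vectors))


# Hypothetical thresholds: 0.5 rad for the angle, 2 pixels for the distance.
ok = is_valid(0.1, 0.5, angle_thr=0.5, dist_thr=2.0)
bad = is_valid(1.2, 0.5, angle_thr=0.5, dist_thr=2.0)

# Seven consistent neighbours outvote one outlier.
replacement = median_update([(2, 0)] * 7 + [(9, 9)])
```

With an even number of neighbours, `statistics.median` averages the two middle values, so the updated vector may be fractional; a vector median or `median_low` would be alternative choices.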
The terminal conditions have two parts: the number of filtering iterations and the percentage of valid motion vectors. The number of iterations is limited to \( num\le 5 \), because the interpolated frames become blurry after too many passes of median filtering. The lower limit of the percentage of valid motion vectors is set to \( 95\% \), that is, \( \frac{1}{mn}\sum_{i,j} V_{ij} \ge 95\% \),
where m denotes the number of blocks in the row direction, while n denotes the number of blocks in the column direction.
The proposed ADMF is composed of three steps, as shown in Table 1. Firstly, num and V are initialized to zero. Secondly, the validity \( V_{ij} \) and the motion vector \( \varvec{v}_{ij} \) are updated according to Eqs. (10) and (11). Thirdly, step 2 is repeated until the terminal conditions are met.
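The three steps can be sketched as a loop in Python. This is a sketch under several assumptions rather than the procedure of Table 1: validity uses mean absolute differences against thresholds, invalid vectors are replaced by the component-wise median of their neighbours, all updates in one pass read the field of the previous pass, and the valid ratio is checked before each pass. The thresholds in the example are hypothetical.

```python
import math
from statistics import median


def angle(v):
    """Angle between v and the reference direction (1, 0); 0 for a zero vector."""
    n = math.hypot(v[0], v[1])
    return 0.0 if n == 0 else math.acos(max(-1.0, min(1.0, v[0] / n)))


def neighbours(field, i, j):
    """The up to 8 vectors around block (i, j); fewer at frame borders."""
    rows, cols = len(field), len(field[0])
    return [field[r][c]
            for r in range(i - 1, i + 2)
            for c in range(j - 1, j + 2)
            if (r, c) != (i, j) and 0 <= r < rows and 0 <= c < cols]


def is_valid(field, i, j, angle_thr, dist_thr):
    """1 if the mean angular and distance differences are within thresholds."""
    nb = neighbours(field, i, j)
    a, d = angle(field[i][j]), math.hypot(*field[i][j])
    aad = sum(abs(a - angle(v)) for v in nb) / len(nb)
    add = sum(abs(d - math.hypot(*v)) for v in nb) / len(nb)
    return 1 if aad <= angle_thr and add <= dist_thr else 0


def admf(field, angle_thr, dist_thr, max_iters=5, min_valid_ratio=0.95):
    """Repeat validity check and median update until a terminal condition holds."""
    field = [list(row) for row in field]
    m, n = len(field), len(field[0])
    num = 0                                   # step 1: initialise the counter
    while num < max_iters:                    # terminal condition: num <= 5
        validity = [[is_valid(field, i, j, angle_thr, dist_thr)
                     for j in range(n)] for i in range(m)]
        if sum(map(sum, validity)) / (m * n) >= min_valid_ratio:
            break                             # terminal condition: 95% valid
        updated = [list(row) for row in field]
        for i in range(m):                    # step 2: update invalid vectors
            for j in range(n):
                if validity[i][j] == 0:
                    nb = neighbours(field, i, j)
                    updated[i][j] = (median(v[0] for v in nb),
                                     median(v[1] for v in nb))
        field = updated
        num += 1                              # step 3: repeat
    return field, num


# A 5 x 5 field moving right by 4 pixels with two reversed outliers.
demo = [[(4, 0)] * 5 for _ in range(5)]
demo[1][1] = demo[3][3] = (-4, 0)
refined, passes = admf(demo, angle_thr=1.2, dist_thr=2.0)
```

In this toy field, 23 of 25 vectors (92%) are valid at first, so one pass is run; it replaces both outliers, and the loop then stops because every vector is valid.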
2.3 Fine Searching
After ADMF, not only are wrong motion vectors corrected, but true motion vectors also receive a minor adjustment. Ensuring the accuracy of the refined motion vectors through a fine adjustment is the major concern in fine searching, so BME is used to refine the motion vectors in a small search range. The main difference between coarse searching and fine searching is the search range, which follows from their purposes: coarse searching obtains the initial motion vectors in a wide search range, while fine searching refines the motion vectors in a small search range.
The SAD value in fine searching is called SADF. \( \varvec{\hat{v}}_{ij} = (\hat{v}_x,\hat{v}_y) \) generated by ADMF denotes the refined motion vector, while \( \varvec{v}_{ij} = (\hat{v}_x+v_x,\hat{v}_y+v_y) \) denotes the final motion vector for Motion-Compensated Interpolation. The mathematical expressions of SADF and motion vectors \( \lbrace \varvec{v}_{ij}\rbrace \) are:
and
where
In the above Eq. (15), FSR represents the search range in fine searching, while FWS means the search window size in fine searching.
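Fine searching can be sketched in Python as a small scan of offsets around the refined vector, with the final vector being the refined vector plus the best offset. The sketch assumes the same bidirectional SAD convention as before (previous frame at p - v, current frame at p + v), rounds the refined vector to integers because a median update may produce fractions, and uses a single hypothetical parameter `fine_range` in place of FSR and FWS.

```python
def sad_bidirectional(prev, curr, rows, cols, v):
    """SAD between prev at (p - v) and curr at (p + v) over a pixel range."""
    vx, vy = v
    h, w = len(prev), len(prev[0])
    total = 0
    for y in range(rows[0], rows[1]):
        for x in range(cols[0], cols[1]):
            yp, xp, yc, xc = y - vy, x - vx, y + vy, x + vx
            if not (0 <= yp < h and 0 <= xp < w and 0 <= yc < h and 0 <= xc < w):
                return float("inf")  # candidate leaves the frame: reject it
            total += abs(prev[yp][xp] - curr[yc][xc])
    return total


def fine_search(prev, curr, rows, cols, refined_v, fine_range, step=1):
    """Search offsets around refined_v; return (final vector, its SAD)."""
    base_x, base_y = int(round(refined_v[0])), int(round(refined_v[1]))
    best_v, best_sad = (base_x, base_y), float("inf")
    for dy in range(-fine_range, fine_range + 1, step):
        for dx in range(-fine_range, fine_range + 1, step):
            candidate = (base_x + dx, base_y + dy)
            s = sad_bidirectional(prev, curr, rows, cols, candidate)
            if s < best_sad:
                best_v, best_sad = candidate, s
    return best_v, best_sad


def make_frame(size, left, top, value=255, side=4):
    """Synthetic test frame: a bright square on a black background."""
    frame = [[0] * size for _ in range(size)]
    for y in range(top, top + side):
        for x in range(left, left + side):
            frame[y][x] = value
    return frame


# The square moves 6 pixels to the right, so the true vector is (3, 0).
# A coarse search with step 2 cannot reach it; suppose ADMF returned (2.0, 0.0).
prev = make_frame(32, left=10, top=10)
curr = make_frame(32, left=16, top=10)
final = fine_search(prev, curr, (8, 24), (8, 24), (2.0, 0.0), fine_range=2)
```

The example shows the division of labour: the coarse grid lands next to the true vector, and the step-1 scan in a small range recovers it exactly.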
3 Experiments
In our experiments, 10 test video sequences are used to verify the validity of the proposed motion estimation for frame interpolation: Akiyo, Crew, Football, Foreman, Ice, Mobile, Paris, Silent, Soccer and Stefan. In the Football and Soccer sequences in particular, adjacent frames differ significantly because of fast-moving objects, which poses a great challenge for motion estimation. The resolution of the 10 test video sequences is \( 352\times 288 \). The experiments are conducted in Matlab R2015b.
In order to evaluate the quality of the interpolated frames, the even frames of the video sequences are skipped and regenerated from the neighbouring odd frames by various FRUC methods. For example, the predicted 2nd frame is calculated from the 1st frame and the 3rd frame. Peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) [13] are used as the evaluation criteria to measure the difference between the predicted even frames and the true even frames.
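The evaluation protocol and the PSNR criterion can be sketched in Python as follows. The sketch assumes 8-bit frames (peak value 255) stored as lists of rows; the function names are hypothetical, and SSIM is omitted for brevity.

```python
import math


def even_frame_triples(num_frames):
    """Yield 1-based (previous odd, skipped even, next odd) frame indices."""
    for even in range(2, num_frames, 2):
        yield (even - 1, even, even + 1)


def psnr(reference, predicted, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    squared_error = 0.0
    count = 0
    for row_ref, row_pred in zip(reference, predicted):
        for a, b in zip(row_ref, row_pred):
            squared_error += (a - b) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * math.log10(peak ** 2 / mse)


triples = list(even_frame_triples(7))
true_frame = [[10, 20], [30, 40]]
predicted_frame = [[11, 21], [31, 41]]  # every pixel off by one
score = psnr(true_frame, predicted_frame)
```

An even frame at the very end of a sequence has no following odd frame, so `even_frame_triples` leaves it out.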
A complete framework of frame interpolation can be divided into two modules, motion estimation and Motion-Compensated Interpolation. Motion estimation is the major focus, so the proposed motion estimation is compared with five other methods: Bidirectional Motion Estimation (BME) [3], Forward-Backward Jointing Motion Estimation (FBJME) [12], Dual Motion Estimation (DME) [6], Direction-Select Motion Estimation (DSME) [14] and Linear Quadratic Motion Estimation (LQME) [4]. For Motion-Compensated Interpolation, Overlapped Block Motion Compensation (OBMC) described in [2, 5] is applied to generate the final interpolated frames.
The experimental settings of the proposed approach are as follows. The block size is \( BS = 16 \) and the expanded size is \( ES = 8 \). In coarse searching, the search window size is \( CWS = 12 \) and the step size is 2. In fine searching, the search window size is \( FWS = 4 \) and the step size is 1. The experimental settings of the compared methods follow those in [3, 4, 6, 12, 14].
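As a small worked example of what these settings imply, the Python sketch below counts the blocks in a \( 352\times 288 \) frame for \( BS = 16 \) and the number of valid motion vectors needed to reach the \( 95\% \) criterion of Sect. 2.2. It assumes that the frame is divided into non-overlapping blocks; the function name is hypothetical.

```python
import math


def block_grid(width, height, block_size):
    """Number of block rows and block columns for non-overlapping blocks."""
    return height // block_size, width // block_size


block_rows, block_cols = block_grid(352, 288, 16)
total_blocks = block_rows * block_cols

# Smallest number of valid vectors whose share is at least 95%.
min_valid = math.ceil(0.95 * total_blocks)
```

Under this assumption a frame holds 18 x 22 = 396 blocks, so the filter may stop once at least 377 motion vectors are valid, that is, when at most 19 are still judged wrong.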
Experimental results will be analysed in both subjective evaluation and objective evaluation. In subjective evaluation, interpolated frames generated by various methods will be displayed in the form of pictures. In objective evaluation, three numerical indexes including PSNR, SSIM [13] and running time will be considered.
3.1 Subjective Evaluation
In order to compare the methods of motion estimation subjectively, the original frame and the interpolated frames generated by DME, DSME and the proposed method are shown in Fig. 4. The 78th frame of the Foreman sequence is used as the reference picture, so its visual quality can be compared with that of the interpolated frames.
As (b) and (c) of Fig. 4 show, blurring problems exist on the face of the person. This means that DME and DSME are inaccurate, especially in details such as the eyes, the nose and the mouth. Compared with DME and DSME, the interpolated frame of the proposed method, picture (d), is much clearer and more similar to the original picture (a). Furthermore, the PSNR values of the methods also indicate that the proposed motion estimation outperforms DME and DSME.
3.2 Objective Evaluation
In objective evaluation, a series of experiments is performed to test the performance of 6 methods of motion estimation: BME, FBJME, DME, DSME, LQME and the proposed method. All of them are evaluated on the 10 test video sequences. Three numerical indexes, PSNR, SSIM and running time, are considered.
As shown in Table 2, the proposed motion estimation outperforms the other compared methods in terms of average PSNR and SSIM values. In addition, the proposed approach performs particularly well on the Football and Soccer sequences. This indicates that ADMF can effectively refine wrong motion vectors in scenes in which objects move fast.
To compare the efficiency of motion estimation, FBJME, DME, DSME and the proposed method are analysed in terms of average PSNR, average SSIM and average running time. The running time is the time taken to generate each interpolated frame. As shown in Table 3, the proposed motion estimation is the most efficient of the compared methods.
4 Conclusion
This paper has proposed a novel method of motion estimation based on block matching and motion refinement for frame interpolation. Firstly, the proposed framework consists of coarse searching and fine searching using Bidirectional Motion Estimation, and it is efficient because it requires little computation. Secondly, the Angular-Distance Median Filter has been shown to correct wrong motion vectors effectively. Thirdly, the proposed motion estimation has been analysed in terms of PSNR, SSIM, running time and different scenes. Fourthly, experimental results have shown that the proposed method outperforms the other compared techniques for frame interpolation in both subjective and objective evaluation.
In the research of Frame Rate Up-Conversion, obtaining true motion vectors in motion estimation is our main focus. In addition, the generation of interpolated frames in Motion-Compensated Interpolation still needs to be studied. Furthermore, it would be interesting to implement frame interpolation in other frameworks, e.g., the phase-based method [10] and the method based on convolutional neural networks [11].
References
1. Cho, Y.H., Lee, H.Y., Park, D.S.: Temporal frame interpolation based on multiframe feature trajectory. IEEE Trans. Circuits Syst. Video Technol. 23(12), 2105–2115 (2013)
2. Choi, B.D., Han, J.W., Kim, C.S., Ko, S.J.: Motion-compensated frame interpolation using bilateral motion estimation and adaptive overlapped block motion compensation. IEEE Trans. Circuits Syst. Video Technol. 17(4), 407–416 (2007)
3. Choi, B.T., Lee, S.H., Ko, S.J.: New frame rate up-conversion using bi-directional motion estimation. IEEE Trans. Consum. Electron. 46(3), 603–609 (2002)
4. Guo, Y., Chen, L., Gao, Z., Zhang, X.: Frame rate up-conversion using linear quadratic motion estimation and trilateral filtering motion smoothing. J. Disp. Technol. 12(1), 89–98 (2016)
5. Ha, T., Lee, S., Kim, J.: Motion compensated frame interpolation by new block-based motion estimation algorithm. IEEE Trans. Consum. Electron. 50(2), 752–759 (2004)
6. Kang, S.J., Yoo, S., Kim, Y.H.: Dual motion estimation for frame rate up-conversion. IEEE Trans. Circuits Syst. Video Technol. 20(12), 1909–1914 (2011)
7. Kim, D.Y., Lim, H., Park, H.W.: Iterative true motion estimation for motion-compensated frame interpolation. IEEE Trans. Circuits Syst. Video Technol. 23(3), 445–454 (2013)
8. Kim, U.S., Sunwoo, M.H.: New frame rate up-conversion algorithms with low computational complexity. IEEE Trans. Circuits Syst. Video Technol. 24(3), 384–393 (2014)
9. Lu, Q., Xu, N., Fang, X.: Motion-compensated frame interpolation with multiframe-based occlusion handling. J. Disp. Technol. 12(1), 45–54 (2016)
10. Meyer, S., Wang, O., Zimmer, H., Grosse, M., Sorkine-Hornung, A.: Phase-based frame interpolation for video. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1410–1418 (2015)
11. Niklaus, S., Mai, L., Liu, F.: Video frame interpolation via adaptive separable convolution. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 2270–2279 (2017)
12. Vinh, T.Q., Kim, Y.C., Hong, S.H.: Frame rate up-conversion using forward-backward jointing motion estimation and spatio-temporal motion vector smoothing. In: International Conference on Computer Engineering & Systems, pp. 605–609 (2010)
13. Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.: Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13(4), 600–612 (2004)
14. Yoo, D.G., Kang, S.J., Kim, Y.H.: Direction-select motion estimation for motion-compensated frame rate up-conversion. J. Disp. Technol. 9(10), 840–850 (2013)
15. Yu, Z., Li, H., Wang, Z., Hu, Z., Chen, C.W.: Multi-level video frame interpolation: exploiting the interaction among different levels. IEEE Trans. Circuits Syst. Video Technol. 23(7), 1235–1248 (2013)
Acknowledgements
This research is partly supported by NSFC, China (No: 61572315, 6151101179) and 973 Plan, China (No. 2015CB856004).
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Cai, H., Jiang, H., Huang, X., Yang, J. (2018). New Motion Estimation with Angular-Distance Median Filter for Frame Interpolation. In: Lai, JH., et al. Pattern Recognition and Computer Vision. PRCV 2018. Lecture Notes in Computer Science, vol 11256. Springer, Cham. https://doi.org/10.1007/978-3-030-03398-9_31
DOI: https://doi.org/10.1007/978-3-030-03398-9_31
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-03397-2
Online ISBN: 978-3-030-03398-9
eBook Packages: Computer Science, Computer Science (R0)