Abstract
We present a novel framework for motion segmentation that combines the concepts of layer-based methods and feature-based motion estimation. We estimate the initial correspondences by comparing vectors of filter outputs at interest points, and from these correspondences we compute candidate scene relations via random sampling of minimal subsets. We achieve a dense, piecewise smooth assignment of pixels to motion layers using a fast approximate graph cut algorithm based on a Markov random field formulation. We demonstrate our approach on image pairs containing large inter-frame motion and partial occlusion. The approach is efficient and successfully segments scenes with inter-frame disparities previously beyond the scope of layer-based motion segmentation methods. We also present an extension for non-planar motion, in which we use our planar motion segmentation results as an initialization for a regularized thin plate spline fit. In addition, we present applications of our method to automatic object removal and to structure from motion.
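The random-sampling step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a 2D affine motion model (minimal subset of three correspondences) as a simplified stand-in for the planar scene relations, and the function names, tolerance, and iteration count are hypothetical choices made only for illustration.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2D affine map taking src (n, 2) to dst (n, 2); returns a 2x3 matrix."""
    A = np.hstack([src, np.ones((len(src), 1))])   # homogeneous coordinates, shape (n, 3)
    M, *_ = np.linalg.lstsq(A, dst, rcond=None)    # solves A @ M = dst, M has shape (3, 2)
    return M.T

def apply_affine(M, pts):
    """Apply a 2x3 affine matrix to points of shape (n, 2)."""
    return pts @ M[:, :2].T + M[:, 2]

def ransac_affine(src, dst, n_iters=200, tol=1.0, seed=0):
    """Hypothetical helper: sample minimal subsets, keep the model with the most inliers."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        idx = rng.choice(len(src), size=3, replace=False)   # minimal subset for an affine map
        M = fit_affine(src[idx], dst[idx])
        err = np.linalg.norm(apply_affine(M, src) - dst, axis=1)
        inliers = err < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers.sum() < 3:
        raise ValueError("no model with at least three inliers was found")
    # Refit on all inliers of the best candidate.
    return fit_affine(src[best_inliers], dst[best_inliers]), best_inliers

# Synthetic data (assumed for the demo): 30 exact correspondences plus 10 gross outliers.
rng = np.random.default_rng(1)
M_true = np.array([[0.9, -0.2, 5.0],
                   [0.2,  0.9, -3.0]])
src = rng.uniform(0.0, 100.0, size=(40, 2))
dst = apply_affine(M_true, src)
dst[30:] += rng.uniform(20.0, 60.0, size=(10, 2))   # corrupt the last 10 matches

M_est, inliers = ransac_affine(src, dst)
```

In a multi-layer setting, one plausible way to use such a routine is to run it repeatedly, removing the inliers of each recovered motion before searching for the next; the dense per-pixel assignment described in the abstract is a separate step that this sketch does not cover.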
Cite this article
Wills, J., Agarwal, S. & Belongie, S. A Feature-based Approach for Dense Segmentation and Estimation of Large Disparity Motion. Int J Comput Vision 68, 125–143 (2006). https://doi.org/10.1007/s11263-006-6660-3