VISIONE at Video Browser Showdown 2022

Amato, Giuseppe; Bolettieri, Paolo; Carrara, Fabio; Falchi, Fabrizio; Gennaro, Claudio; Messina, Nicola; Vadicamo, Lucia; Vairo, Claudio

doi:10.1007/978-3-030-98355-0_52

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13142))

Included in the following conference series:

International Conference on Multimedia Modeling

2030 Accesses
8 Citations

Abstract

VISIONE is a content-based retrieval system that supports various search functionalities (text search, object/color-based search, semantic and visual similarity search, temporal search). It uses a full-text search engine as a search backend. In the latest version of our system, we modified the user interface, and we made some changes to the techniques used to analyze and search for videos.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 79.99; Price excludes VAT (USA)

Softcover Book: USD 99.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

Amato, G., et al.: VISIONE at VBS2019. In: Kompatsiaris, I., Huet, B., Mezaris, V., Gurrin, C., Cheng, W.-H., Vrochidis, S. (eds.) MMM 2019. LNCS, vol. 11296, pp. 591–596. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-05716-9_51
Chapter Google Scholar
Amato, G., et al.: The VISIONE video search system: exploiting off-the-shelf text search engines for large-scale video retrieval. J. Imaging 7(5), 76 (2021)
Article Google Scholar
Amato, G., et al.: VISIONE at video browser showdown 2021. In: Lokoč, J., et al. (eds.) MMM 2021. LNCS, vol. 12573, pp. 473–478. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-67835-7_47
Chapter Google Scholar
Benavente, R., Vanrell, M., Baldrich, R.: Parametric fuzzy sets for automatic color naming. JOSA A 25(10), 2582–2593 (2008)
Article Google Scholar
Berlin, B., Kay, P.: Basic Color Terms: Their Universality and Evolution. University of California Press, Berkeley (1991)
Google Scholar
Berns, F., Rossetto, L., Schoeffmann, K., Beecks, C., Awad, G.: V3C1 dataset: an evaluation of content characteristics. In: Proceedings of the 2019 on International Conference on Multimedia Retrieval, pp. 334–338. Association for Computing Machinery (2019)
Google Scholar
Boynton, R.M., Olson, C.X.: Salience of chromatic basic color terms confirmed by three measures. Vision. Res. 30(9), 1311–1317 (1990)
Article Google Scholar
Gordo, A., Almazan, J., Revaud, J., Larlus, D.: End-to-end learning of deep visual representations for image retrieval. Int. J. Comput. Vision 124(2), 237–254 (2017)
Article MathSciNet Google Scholar
Heller, S., et al.: Towards explainable interactive multi-modal video retrieval with vitrivr. In: Lokoč, J., et al. (eds.) MMM 2021. LNCS, vol. 12573, pp. 435–440. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-67835-7_41
Chapter Google Scholar
Lokoč, J., et al.: Is the reign of interactive search eternal? Findings from the video browser showdown 2020. ACM Trans. Multimed. Comput. Commun. Appl. 17(3), 1–26 (2021)
Article Google Scholar
Messina, N., Amato, G., Esuli, A., Falchi, F., Gennaro, C., Marchand-Maillet, S.: Fine-grained visual textual alignment for cross-modal retrieval using transformer encoders. arXiv preprint arXiv:2008.05231 (2020)
Messina, N., Falchi, F., Esuli, A., Amato, G.: Transformer reasoning network for image-text matching and retrieval. In: 2020 25th International Conference on Pattern Recognition (ICPR), pp. 5222–5229. IEEE (2021)
Google Scholar
Peška, L., Kovalčík, G., Souček, T., Škrhák, V., Lokoč, J.: W2VV++ BERT model at VBS 2021. In: Lokoč, J., et al. (eds.) MMM 2021. LNCS, vol. 12573, pp. 467–472. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-67835-7_46
Chapter Google Scholar
Radford, A., et al.: Learning transferable visual models from natural language supervision. arXiv preprint arXiv:2103.00020 (2021)
Redmon, J., Farhadi, A.: YOLOv3: an incremental improvement. CoRR abs/1804.02767 (2018)
Google Scholar
Revaud, J., Almazan, J., Rezende, R., de Souza, C.: Learning with average precision: training image retrieval with a listwise loss. In: International Conference on Computer Vision, pp. 5106–5115. IEEE (2019)
Google Scholar
Rossetto, L., et al.: Interactive video retrieval in the age of deep learning - detailed evaluation of VBS 2019. IEEE Trans. Multimedia 23, 243–256 (2020)
Article Google Scholar
Rossetto, L., Schoeffmann, K., Bernstein, A.: Insights on the V3C2 dataset. arXiv preprint arXiv:2105.01475 (2021)
Sturges, J., Whitfield, T.A.: Salient features of munsell colour space as a function of monolexemic naming and response latencies. Vision. Res. 37(3), 307–313 (1997)
Article Google Scholar
Van De Weijer, J., Schmid, C., Verbeek, J., Larlus, D.: Learning color names for real-world applications. IEEE Trans. Image Process. 18(7), 1512–1523 (2009)
Article MathSciNet Google Scholar
Zhang, H., Wang, Y., Dayoub, F., Sunderhauf, N.: VarifocalNet: an IoU-aware dense object detector. In: Conference on Computer Vision and Pattern Recognition, pp. 8514–8523. IEEE, June 2021
Google Scholar

Download references

Acknowledgements

This work was partially funded by AI4Media - A European Excellence Centre for Media, Society and Democracy (EC, H2020 n. 951911); AI4EU project (EC, H2020, n. 825619); AI4ChSites, CNR4C program (Tuscany POR FSE 2014-2020 CUP B15J19001040004).

Author information

Authors and Affiliations

ISTI-CNR, Via G. Moruzzi 1, 56124, Pisa, Italy
Giuseppe Amato, Paolo Bolettieri, Fabio Carrara, Fabrizio Falchi, Claudio Gennaro, Nicola Messina, Lucia Vadicamo & Claudio Vairo

Authors

Giuseppe Amato
View author publications
You can also search for this author in PubMed Google Scholar
Paolo Bolettieri
View author publications
You can also search for this author in PubMed Google Scholar
Fabio Carrara
View author publications
You can also search for this author in PubMed Google Scholar
Fabrizio Falchi
View author publications
You can also search for this author in PubMed Google Scholar
Claudio Gennaro
View author publications
You can also search for this author in PubMed Google Scholar
Nicola Messina
View author publications
You can also search for this author in PubMed Google Scholar
Lucia Vadicamo
View author publications
You can also search for this author in PubMed Google Scholar
Claudio Vairo
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Nicola Messina .

Editor information

Editors and Affiliations

IT University of Copenhagen, Copenhagen, Denmark
Björn Þór Jónsson
Dublin City University, Dublin, Ireland
Cathal Gurrin
University of Science, VNU-HCM, Ho Chi Minh City, Vietnam
Minh-Triet Tran
University of Bergen, Bergen, Norway
Duc-Tien Dang-Nguyen
National Tsing Hua University, Hsinchu, Taiwan
Anita Min-Chun Hu
Hanoi University of Science and Technology, Hanoi, Vietnam
Binh Huynh Thi Thanh
Median Technologies, Valbonne, France
Benoit Huet

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Amato, G. et al. (2022). VISIONE at Video Browser Showdown 2022. In: Þór Jónsson, B., et al. MultiMedia Modeling. MMM 2022. Lecture Notes in Computer Science, vol 13142. Springer, Cham. https://doi.org/10.1007/978-3-030-98355-0_52

Download citation

DOI: https://doi.org/10.1007/978-3-030-98355-0_52
Published: 15 March 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-98354-3
Online ISBN: 978-3-030-98355-0
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

VISIONE at Video Browser Showdown 2022