Abstract
High energy physics experiments produce large amounts of data, at the PB or EB level, and ambitious experimental programs are planned for the coming decades. The efficiency of data-intensive research is closely related to how fast data can be accessed and how many computational resources are available. Changes in computing technology and large increases in data volume require new computing models. This paper gives an overview of scientific data management technologies and applications in high energy physics. It first examines the current data management framework and workflow, which cover data acquisition, data transfer, data storage, data processing, data sharing, and data preservation. It then introduces ongoing research and development on data organization, management, and access. Finally, it presents EventDB, an event-based management system for big scientific data. A test on more than ten billion physics events shows that query speed is greatly improved compared with a traditional file-based data management system.
Acknowledgements
This work was supported by the National Key Research Program of China “Scientific Big Data Management System” (No. 2016YFB1000605) and the National Natural Science Foundation of China (Nos. 11675201 and 11575223).
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Chen, G., Cheng, Y. (2019). Scientific Data Management and Application in High Energy Physics. In: Li, J., Meng, X., Zhang, Y., Cui, W., Du, Z. (eds) Big Scientific Data Management. BigSDM 2018. Lecture Notes in Computer Science, vol 11473. Springer, Cham. https://doi.org/10.1007/978-3-030-28061-1_11
DOI: https://doi.org/10.1007/978-3-030-28061-1_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-28060-4
Online ISBN: 978-3-030-28061-1
eBook Packages: Computer Science (R0)