Abstract
A discovery cloud is a set of automated, cloud-hosted services to which individuals may outsource their routine and not-so-routine research tasks: finding relevant data, inferring links between data, running computational experiments, inferring new knowledge claims, evaluating the credibility of knowledge claims produced by others, designing experiments, and so on. If developed successfully, a discovery cloud can accelerate and democratize both access to data and knowledge tools and the collaborative construction of new knowledge. Such systems are also fascinating to consider from a reasoning perspective because they integrate great complexity at multiple levels: the underlying cloud-based hardware and software, for which issues of reliability and responsiveness may be paramount; the knowledge bases and inference engines that sit on that cloud substrate, for which issues of correctness may be less well defined; and the human communities that form around discovery clouds, and that arguably are as much a part of the cloud as the hardware, software, and data. I raise questions here about what it might mean to reason about such systems. I do not provide any answers.
Acknowledgements
I am grateful to the organizers of Petri Nets 2016 for the opportunity to contribute this article to the proceedings. This work is supported in part by the US Department of Energy under contract DE-AC02-06CH11357.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Foster, I. (2016). Reasoning About Discovery Clouds. In: Kordon, F., Moldt, D. (eds) Application and Theory of Petri Nets and Concurrency. PETRI NETS 2016. Lecture Notes in Computer Science(), vol 9698. Springer, Cham. https://doi.org/10.1007/978-3-319-39086-4_1
DOI: https://doi.org/10.1007/978-3-319-39086-4_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-39085-7
Online ISBN: 978-3-319-39086-4
eBook Packages: Computer Science (R0)