Surfacing Data Change in Scientific Work

  • Conference paper
Information in Contemporary Society (iConference 2019)

Abstract

Data are essential products of scientific work that move among and through research infrastructures over time. Data constantly change due to evolving practices and knowledge, requiring improvisational work by scientists to determine the effects on analyses. Today, much of the information about changes, and the processes leading to them, is invisible to end users of datasets because it is embedded elsewhere in the work of a collaboration. At the same time, scientists use increasing quantities of data, making ad hoc approaches to identifying change difficult to scale. Our research investigates data change by examining how scientists make sense of change in datasets created and sustained by the collaborative infrastructures they engage with. We examine two forms of change, then examine how trust and project rhythms influence a scientist’s notion that the newest available data are the best. We explore the opportunity to design tools and practices that support user examinations of data change and surface key provenance information embedded in research infrastructures.

Notes

  1. This work is part of the Deduce project (http://deduce.lbl.gov). The goal of the Deduce project is to develop methods and tools that support data change exploration and management in the context of data analysis pipelines.

  2. http://www.sdss.org/, http://www.desi.lbl.gov/.

  3. http://watershed.lbl.gov/, http://ameriflux.lbl.gov.

Acknowledgements

The authors thank the members of the Deduce project, the study participants, and the anonymous reviewers of this work. This work is supported by the U.S. Department of Energy, Office of Science and Office of Advanced Scientific Computing Research (ASCR) under Contract No. DE-AC02-05CH11231.

Author information

Correspondence to Drew Paine.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Paine, D., Ramakrishnan, L. (2019). Surfacing Data Change in Scientific Work. In: Taylor, N., Christian-Lamb, C., Martin, M., Nardi, B. (eds) Information in Contemporary Society. iConference 2019. Lecture Notes in Computer Science, vol 11420. Springer, Cham. https://doi.org/10.1007/978-3-030-15742-5_2

  • DOI: https://doi.org/10.1007/978-3-030-15742-5_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-15741-8

  • Online ISBN: 978-3-030-15742-5

  • eBook Packages: Computer Science, Computer Science (R0)
