The challenges for creators of specific science gateways are manifold, and the expertise needed for well-designed science gateways is very diverse. The sustainability of science gateways is crucial for serving communities effectively, efficiently and reliably. One measure for achieving greater sustainability of science gateways is establishing on-campus teams. Researchers are served more efficiently because support from experienced developers reduces each project's individual investment, and a team can supply the diverse expertise required for a well-designed science gateway. This paper goes into detail about the challenges and benefits of on-campus groups and of sharing resources across a campus. We present four successful cases, describe the services of the Science Gateways Community Institute (SGCI) that support the process of building such groups, and recommend strategies for using free campus resources.
Consequently, software preservation has become a new focus of the scientific world; commercial and academic projects have sprouted to fill the void for software preservation. There are emerging standards for packaging content in projects such as Popper, Research Objects, and DataMill. Websites such as RunMyCode, MyExperiment, Research Compendia, Zenodo, Open Science Framework, and two independent projects both called Datahub have been created to host scientific code and data. There are also software frameworks that a researcher can leverage to build their own software, such as torch.ch and GenePattern. There are tools to capture and replay the dynamic execution of software, such as ReproZip, CDE, and Sumatra. Another strategy is curated collections of software such as Madagascar, and scientific workflow systems such as Taverna, Galaxy, Wings, VisTrails, and Kepler. A plethora of literate programming solutions for combining code and prose have appeared, such as knitr, SOLE, Jupyter, Zeppelin, Collage, Binder, and Beaker Notebook. Services have sprung up to run science code in the cloud, like Chameleon Cloud, NanoHub, and two commercial services: the defunct Wind River Helix Lab Cloud and the newcomer Code Ocean. Finally, there are tools that describe a software environment and generate virtual machines on the fly, such as Umbrella, Simulocean, and OCCAM. In fact, the problem has shifted somewhat from a reproducibility crisis to an issue of there being too many solutions and knowing which ones to use!
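Tools in the last category, such as Umbrella, work from a declarative description of the environment a piece of software needs. As a rough illustration (the schema below is invented for this sketch and is not Umbrella's actual format), such a description can be as simple as a JSON document naming the operating system, software versions, data dependencies, and the command to re-run:

```python
import json

# Hypothetical environment specification, loosely in the spirit of
# Umbrella-style tools. Field names and layout are illustrative only.
spec = {
    "note": "environment needed to re-run a hypothetical analysis",
    "os": {"name": "Ubuntu", "version": "20.04"},
    "software": [
        {"name": "python", "version": "3.8"},
        {"name": "numpy", "version": "1.21"},
    ],
    "data": [
        # checksum value is a placeholder, not a real digest
        {"name": "input.csv", "checksum": "sha256:placeholder"},
    ],
    "command": "python analysis.py input.csv",
}

# Serializing the spec is what lets a tool later materialize a
# matching virtual machine or container and re-execute the command.
serialized = json.dumps(spec, indent=2)
restored = json.loads(serialized)
assert restored == spec
```

The value of this style is that the specification, not a frozen machine image, becomes the preserved artifact, so the environment can be regenerated on whatever infrastructure is available later.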
There is clearly every motivation to create software archives that can keep digital artifacts for scientific research running. Yet we have no means to evaluate these archival systems. The broader scientific software community needs to put effort into defining what quality means for both artifacts and archives. We will define such criteria, examine existing archives and their drawbacks, and show through experience that these criteria are appropriate.
In this work we catalog patterns, practices and trends we have seen from our experiences deploying science gateways at the NERSC supercomputing center. We cover the following topics: Sharing Data Over the Web, Web Frameworks for Science, Web IDEs and Interactive HPC, REST APIs, Authentication and Authorization, Edge Services, Data Transfer Services, Cloud-Based Portals, and Containers. This is our attempt to share what we have learned with the community, and to identify key aspects of science gateway deployment and development.
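Several of these topics, REST APIs and sharing data over the web in particular, come down to exposing read-only endpoints that return metadata or results as JSON. A minimal sketch of that pattern, using only Python's standard library (the dataset name and URL layout here are invented for illustration, not any actual gateway's API):

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory catalog a gateway might publish.
DATASETS = {"run42": {"status": "complete", "rows": 1024}}

class DatasetHandler(BaseHTTPRequestHandler):
    """Serves GET /datasets/<name> as JSON; everything else is 404."""

    def do_GET(self):
        parts = self.path.strip("/").split("/")
        if len(parts) == 2 and parts[0] == "datasets" and parts[1] in DATASETS:
            body = json.dumps(DATASETS[parts[1]]).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):
        pass  # suppress per-request logging in this example

# Bind an ephemeral port and serve from a background thread.
server = HTTPServer(("127.0.0.1", 0), DatasetHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

url = f"http://127.0.0.1:{server.server_port}/datasets/run42"
with urllib.request.urlopen(url) as resp:
    result = json.loads(resp.read())
server.shutdown()
```

In practice a gateway would put such an endpoint behind an edge service handling authentication and authorization, which is why those topics appear alongside REST APIs in the list above.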