1 Introduction

Open Science is increasingly recognized as the avenue that brings the knowledge created by the scientific community and by companies to society, promoting their recognition and the socio-economic impact of research. The open science paradigm has been promoted in recent years, with a focus mainly on data and publications. Policies and recommendations from multiple fora (e.g. https://ec.europa.eu/research/openscience/index.cfm?pg=home, https://project-open-data.cio.gov/) have been pushing strongly towards a new way to conduct research and share knowledge [1, 2]. Funding programs, such as H2020, are promoting open science, recognizing its impact on innovation and therefore on the economy and the quality of life.

While these efforts have gradually been successful and accepted by the scientific community for publications and for access to scientific data, most highly specialized software development has remained far from the desired openness. In the field of ocean modelling, the current trend is for community-based, open-source codes (examples of popular open models originating in Europe include DELFT3D, MOHID and TELEMAC, and in the USA, ADCIRC, SCHISM and ROMS). However, most IT water platforms that integrate these models to provide services to the community are not freely available for use, or even for improvement and reuse for other purposes. These platforms are fundamental to end-users and stakeholders for their management tasks, and are also important for the activities of the general public.

The development of forecast systems is still an area of little openness, mostly due to the difficulty of operating coastal forecast systems outside the infrastructures of research institutes. Handing over these systems to water management authorities is very difficult, as expensive hardware and specialized human resources are needed to keep them in operation. Recently, the integration and availability of computing resources provided through the European Open Science Cloud (EOSC) offered a first opportunity to open coastal forecast systems to the coastal community.

OPENCoastS, a new Web platform developed under the concept of open science, is an open service integrated in the thematic services and marketplace of the project EOSC-Hub [3, 4]. It aims at generating on-demand coastal forecast systems with minimal user intervention, and is applicable worldwide. The platform guides the user through seven simple steps towards the generation of an operational forecast system in any coastal region. The user only has to provide an unstructured grid of the study area and information on the river flow (if needed). The platform includes: (i) the definition of boundary conditions, selected from several options available, (ii) the model parameterization of the simulations and (iii) the choice of the online stations for model validation, automatically identified by the platform. The platform also includes interfaces for visualizing the results and managing the forecasts.

The open science paradigm, which was at the foundation of the OPENCoastS service, facilitated by the availability of resources in the EOSC community, is reinforced herein with a new complementary service: a computational grid repository. The availability of computational grids remains the major limitation for the uptake of this service by the coastal stakeholder community and the general public, in spite of the training efforts by the OPENCoastS development team and the availability of free grid-generation codes. The creation of a public, open repository of computational grids proposed here, implemented through organized and easy-to-access technologies, shared by the expert numerical modelers across the globe, will strengthen the accessibility of OPENCoastS to all coastal actors.

This paper is organized as follows. Section 2 provides an overview of the OPENCoastS service, along with a description of its interconnection with global ocean and atmospheric forecast providers. The grid repository is described in Sect. 3. The use of this repository is illustrated in Sect. 4, for the Algarve coast circulation prediction. The paper closes with some final considerations.

2 Overview of the OPENCoastS Platform

2.1 Goals and Main Characteristics

The OPENCoastS service builds on-demand circulation forecast systems for user-selected regions of the coast and maintains them running operationally for the time frame defined by the user. This daily service generates forecasts of water levels and depth-averaged velocities over the spatial region of interest, based on numerical simulations of all relevant physical processes. This service takes advantage of two e-infrastructures for computational and storage resources: the National Distributed Computing Infrastructure – INCD (part of the Portuguese Roadmap for Infrastructures) and IFCA (Institute of Physics of Cantabria, Spain). OPENCoastS is supported by the EGI computational resources (European Grid Initiative), through the H2020 EOSC-Hub project, being one of its thematic services (https://www.eosc-hub.eu/catalogue/OPENCoastS). Linkage to EUDAT storage services is underway.

The architecture of the OPENCoastS service includes (Fig. 1):

Fig. 1. The OPENCoastS architecture and building blocks.

  • the user interface component, a web-based portal;

  • the computation component, where simulation results are generated and post-processed;

  • the archive component, responsible for preserving all relevant data.

The service consists, in general terms, of a frontend (the web platform) and a backend (which deals with all computing tasks). The frontend was developed with the Django Python Web framework, using libraries such as matplotlib, shapely, netCDF4 and numpy, and is supported by a PostGIS spatial object-relational database. The frontend viewer application also uses the ncWMS2 software [5] to serve the forecast outputs, composed of UGRID-compliant NetCDF files, as Web Map Services (WMS). The OPENCoastS service is available at https://opencoasts.ncg.ingrid.pt.
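As an illustration of how a client could request a forecast map from a WMS endpoint, the sketch below assembles a standard WMS 1.3.0 GetMap URL. The base URL, the layer name and the bounding box are hypothetical placeholders and do not describe the actual OPENCoastS configuration; only the generic WMS request parameters are used.

```python
from urllib.parse import urlencode


def build_getmap_url(base_url, layer, bbox, width=512, height=512, time=None):
    """Assemble a WMS 1.3.0 GetMap URL for a single layer.

    bbox is (min_lon, min_lat, max_lon, max_lat); CRS:84 is used so that
    the axis order is longitude, latitude.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the server's default style
        "CRS": "CRS:84",
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
        "TRANSPARENT": "true",
    }
    if time is not None:
        # Optional time dimension, e.g. one forecast instant (ISO 8601)
        params["TIME"] = time
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and layer name, for illustration only
url = build_getmap_url(
    "https://example.org/ncWMS2/wms",
    "algarve/elevation",
    (-9.0, 36.5, -7.0, 37.5),
    time="2019-03-01T12:00:00Z",
)
print(url)
```

Building the query string with `urlencode` keeps the escaping of commas, colons and slashes out of the calling code.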

The backend is responsible for generating the forecast results, handling all tasks of the simulation chain established for each forecast deployment. The simulation chain is updated and operated daily to produce 48 h predictions, starting from the state of the previous run. Each simulation involves a suite of files that remain unchanged throughout the deployment life span, such as the computational grid supplied by the user, and daily files, such as the forcing conditions to be applied at the boundaries (oceanic, riverine and atmospheric). The platform is strongly anchored on the Water Information Forecast Framework, WIFF [6], which simplifies the assembly and execution of the recurring tasks needed for every simulation. WIFF is a generic forecasting platform, adaptable to any geographical location, which integrates a set of numerical models that run periodically. Forecast systems typically entail computationally demanding tasks, which are usually offloaded to computational resources that can handle them. This approach is necessary to cope with very demanding jobs, but it also introduces additional difficulties, such as the heterogeneity of the execution environments and unprivileged access to the resources. These constraints are overcome in OPENCoastS by employing an unprivileged container technology (udocker [7]), which offers a homogeneous environment without requiring administrative permissions. Udocker allows software to be encapsulated together with all its dependencies and to be executed independently of the Linux distribution used by the host systems. This set of processes, referred to as a container, is thus placed in a fully isolated environment, with a given amount of resources, such as CPU or RAM. These features are possible through the use of control groups and namespace isolation, advanced features of modern Linux kernels.
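The daily cycle of the simulation chain can be sketched as follows. This is a minimal illustration and not the WIFF implementation: the `DailyRun` structure and the `plan_runs` helper are hypothetical, and the assumption that the very first run has no previous state to start from is ours.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

FORECAST_HOURS = 48  # forecast horizon stated in the text


@dataclass(frozen=True)
class DailyRun:
    start: datetime
    end: datetime
    # Start time of the run whose state initializes this one (None = first run)
    restart_from: Optional[datetime]


def plan_runs(first_day: datetime, n_days: int) -> List[DailyRun]:
    """Plan one 48 h run per day, each restarting from the previous run."""
    runs = []
    previous_start = None
    for day in range(n_days):
        start = first_day + timedelta(days=day)
        end = start + timedelta(hours=FORECAST_HOURS)
        runs.append(DailyRun(start, end, previous_start))
        previous_start = start
    return runs


for run in plan_runs(datetime(2019, 1, 1), 3):
    print(run.start.date(), "->", run.end.date(), "restart from:", run.restart_from)
```

With a 48 h horizon and a daily update, consecutive runs overlap by 24 h, so each day is covered by two successive forecasts.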

2.2 OPENCoastS Linkage to Large-Scale Ocean and Atmosphere Forecasts and Data Sources

In order to provide local forecasts for any coastal region in the world, this service is connected to several oceanic and atmospheric forecast providers. This linkage is necessary to provide reliable and accurate predictions at local scales forced by robust systems that are integrated with global monitoring networks through data assimilation. The main providers for forcings in OPENCoastS are summarized in Table 1.

Table 1. OPENCoastS forcing providers.

Users can select the configuration that best fits their purposes in terms of physical processes (tides, or tides + storm surges), model resolution and area of coverage. A simple comparison with in-situ elevation data is provided once the user selects data stations from EMODnet-Physics, and both the data and the model results can be downloaded.

Several applications of OPENCoastS have exploited these options at several places in the world. [4] discusses in more detail some of the impacts associated with the choice of the atmospheric and ocean forcing. Other providers of large-scale ocean and atmospheric forecasts whose results are freely available are currently under consideration, in particular those that distribute their products through web services.

3 A Grid Repository for Sharing Knowledge of the Coast

Over the past year, the OPENCoastS platform has been used by over 150 users, from all continents, and forecasts have been produced for over 120 different systems, with an average of 9 activated systems per month. A considerable effort was placed in training, through both hands-on courses at specialized conferences and wide-scope on-site training in several countries, broadcast simultaneously by web streaming and video conferencing (the course material is available at http://opencoasts.lnec.pt/index_en.php#eventos). About 90% of current users originate from the research and academic community, and only 10% originate from the coastal management community. The general public is not represented at all in the user community.

To further promote the uptake of the OPENCoastS service by the coastal management community as well as the general public, two strategies can be followed:

  1. to provide the option of open access to specific forecast deployments;

  2. to provide free access to the computational grids necessary to make a deployment.

The first strategy is one of the next steps in the OPENCoastS development. It will allow users to access model results from a list of available open deployments. This option minimizes work for end-users and the general public when these shared deployments are publicized on an individual basis, but the deployments cannot be properly organized and used as part of a global repository for grid sharing. Indexing and organizing this deployment list, to facilitate the search for a specific geographical area, is not possible because each deployment has specific metadata chosen by its owner. The conditions for the forecast setup are also selected by the owner, which prevents users who access the forecast outputs from customizing them to their needs. This strategy is, however, relevant for deployment sharing within teams.

The second strategy, introduced herein, gives users all the necessary tools to set up their own deployment based on an openly shared grid. By making a repository of computational grids available, along with the necessary metadata on the geographical reference system and vertical positioning of each grid, this solution provides full freedom while overcoming the barrier of building a finite element grid for a particular coastal area. Search mechanisms as well as geographic location are available, making it simple and fast to find the grid for a given site.

In order to publish the available grids as Open Data, a grid repository was created using GitHub: https://github.com/LNEC-GTI/OPENCoastS-Grids (Fig. 2). Among other advantages, this approach provides a versioning system for files, which makes it possible to keep track of the shared datasets. Within this repository, a hierarchical organization is provided, country by country, followed by the estuary/coastal area name. An example is provided in Fig. 3 for a shared grid of the Guadiana estuary in Portugal and Spain. Multiple grids for the same system can be stored in subsequent directories, as different versions or name identifiers. In order to preserve this repository, facilitate access to the shared grids and provide them with a DOI, this GitHub repository was linked to Zenodo (Fig. 4). Each grid corresponds to one release within the repository, each with its own identifier (Fig. 5).

Fig. 2. The GitHub grid repository for OPENCoastS.

Fig. 3. The GitHub grid repository entrance for the Guadiana grid.

Fig. 4. The Zenodo listing of the GitHub grid repository entries, supported by grid identification.

Fig. 5. Sample shared grid (for the Guadiana estuary, Portugal) available in Zenodo from the GitHub grid repository, with a unique identifier.
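The hierarchical organization of the repository (country, then estuary/coastal area, then version or name identifier) lends itself to simple indexing. The sketch below parses relative paths of that form and filters them by country or site. The four-level layout, the directory names and the file names used here are hypothetical and serve only to illustrate the idea; they are not taken from the actual repository contents.

```python
from pathlib import PurePosixPath
from typing import NamedTuple


class GridEntry(NamedTuple):
    country: str
    site: str      # estuary or coastal area name
    variant: str   # version or name identifier
    filename: str


def parse_grid_path(path):
    """Split 'country/site/variant/file' into its four components."""
    parts = PurePosixPath(path).parts
    if len(parts) != 4:
        raise ValueError(f"expected country/site/variant/file, got: {path}")
    return GridEntry(*parts)


def find_grids(paths, country=None, site=None):
    """Return the entries matching a country and/or site (case-insensitive)."""
    matches = []
    for path in paths:
        entry = parse_grid_path(path)
        if country and entry.country.lower() != country.lower():
            continue
        if site and entry.site.lower() != site.lower():
            continue
        matches.append(entry)
    return matches


# Hypothetical repository listing, for illustration only
listing = [
    "Portugal/Guadiana/v1/hgrid.gr3",
    "Portugal/Algarve/v1/hgrid.gr3",
    "Spain/Cadiz/v1/hgrid.gr3",
]
for entry in find_grids(listing, country="Portugal"):
    print(entry.site, entry.variant, entry.filename)
```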

After downloading the desired grid to their drive, the users can then set up their forecast systems using OPENCoastS at https://opencoasts.ncg.ingrid.pt. The information on the origin of the grid can be stored in OPENCoastS deployment metadata and be shared with others in the future if the user allows it. Through this solution, sharing (in a searchable, organized way) and preserving grids is accomplished in a citable way that gives credit to the authors providing the grids for public use. Moreover, this solution sets the stage for the future option within OPENCoastS to facilitate the access to predictions under a specific deployment for other people besides the deployment owner. Similarly, a bathymetry uploader can be added to the repository service to allow the update of the bathymetry over time.

4 Demonstration of the OPENCoastS Sharing Service at the Algarve Coast

4.1 Grid Generation

The main input the user has to provide to the OPENCoastS service is a triangular unstructured grid. Although many grid generators are available to create this type of grid, inexperienced users may find the generation of such grids a daunting task. Here, the generation of a grid is illustrated using the Algarve coast (southern Portugal) as an example. Two freeware codes are used herein to generate the grid: XMGREDIT, a semi-automatic grid generator designed for coastal applications [12]; and NICEGRID, a post-processor to automatically improve grid quality [13].

The first step in grid generation is the definition of the domain. The southern Portuguese coast is approximately 140 km long, from the mouth of the Guadiana estuary in the East to Cape St. Vincent in the West (Fig. 6). To the East, the domain was extended along the Spanish coast up to an area where the coastline is approximately straight; to the West, the domain was cut a few kilometers before Cape St. Vincent. These choices aimed at minimizing potential numerical problems. To the South, the boundary was defined as a circular arc with an 80 km radius. The choice of a circular open boundary also aims at minimizing numerical problems due to discontinuities.

Fig. 6. Grid and bathymetry of the Algarve coast.

Nodes were placed automatically using XMGREDIT. The grid spacing was set to 250 m between the coastline and the 50 m isobath, 500 m at the 100 m isobath and 2000 m at the open boundary, with a smooth variation in between. Once the set of nodes was defined, the nodes were triangulated (i.e., triangular elements were defined based on those nodes) to provide a preliminary grid. Although this preliminary grid has the desired grid spacing, many elements are highly skewed. The grid was then automatically improved using NICEGRID. This code reduces skewness by moving, adding and deleting nodes, targeting a smooth transition between element sizes and quasi-equilateral triangles. The final grid has about 50 thousand nodes (Fig. 6), and its generation took about 2–3 h.
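To make the notion of skewed versus quasi-equilateral elements concrete, the sketch below evaluates a commonly used triangle shape measure, q = 4√3·A / (a² + b² + c²), where A is the element area and a, b, c are the edge lengths. This measure equals 1 for an equilateral triangle and tends to 0 as the element degenerates. It is given only as an illustration: the criterion actually used by NICEGRID is not described here, and the 0.5 threshold below is an arbitrary assumption.

```python
import math


def triangle_quality(p1, p2, p3):
    """Shape measure in [0, 1]: 1 for equilateral, 0 for degenerate triangles."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    area = 0.5 * abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1))
    edges_sq = (
        (x2 - x1) ** 2 + (y2 - y1) ** 2
        + (x3 - x2) ** 2 + (y3 - y2) ** 2
        + (x1 - x3) ** 2 + (y1 - y3) ** 2
    )
    if edges_sq == 0.0:
        return 0.0  # all three nodes coincide
    return 4.0 * math.sqrt(3.0) * area / edges_sq


def skewed_elements(nodes, elements, threshold=0.5):
    """Indices of elements whose quality falls below the (assumed) threshold."""
    return [
        i
        for i, (a, b, c) in enumerate(elements)
        if triangle_quality(nodes[a], nodes[b], nodes[c]) < threshold
    ]


# Two elements sharing an edge: one equilateral, one very flat
nodes = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3.0) / 2.0), (0.5, -0.01)]
elements = [(0, 1, 2), (0, 3, 1)]
print(skewed_elements(nodes, elements))
```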

4.2 Sharing the Grid Through the GitHub/Zenodo Repository

Using the procedure described above, this grid was made available in GitHub (Fig. 7) with the corresponding metadata in the README.md file. A link was created in Zenodo, through a release in GitHub (Fig. 8), providing a DOI (https://doi.org/10.5281/zenodo.2579135). This grid can be downloaded to a local disk, and then uploaded in the OPENCoastS configuration assistant as illustrated below.

Fig. 7. Grid of the Algarve coast integrated in the repository.

Fig. 8. Release in GitHub, allowing for the linkage with Zenodo.

4.3 Using OPENCoastS to Build a New Forecast Deployment from a Shared Grid

A new forecast system was then implemented for the Algarve coast using the OPENCoastS service. To deploy this system, the following steps were performed in the Configuration Assistant: (1) the model selection; (2) the input of the horizontal grid (Fig. 9); (3) the definition of the boundary conditions (Fig. 10); (4) the definition of stations; (5) the definition of input parameters (e.g. time step); and (6) the definition of spatially-varying parameters (e.g. friction coefficient). After the completion of these steps, a summary of the deployment is presented and the user can submit the forecast. The Algarve coast forecast deployed herein is forced by tides from FES2014 at the ocean boundary and by the atmospheric forecasts from ARPEGE at the surface. Two time steps were tested: the one suggested within the OPENCoastS application (360 s) and a user-defined time step (60 s). Several virtual stations were selected along the Algarve coast (Fig. 11), providing information about the water levels and velocities.

Fig. 9. Configuration Assistant - Step 2, domain definition.

Fig. 10. Configuration Assistant - Step 3, boundary conditions.

Fig. 11. Outputs Viewer, with map and time series of the selected stations views.
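The choice between the two time steps tested for the Algarve deployment has a direct computational cost, which a short calculation makes explicit. The sketch below counts the model steps needed to cover the 48 h forecast horizon mentioned in Sect. 2.1; the helper name is hypothetical, and the run time of the model also depends on factors that are not represented here.

```python
FORECAST_HOURS = 48  # forecast horizon of the daily simulations


def steps_per_forecast(dt_seconds, forecast_hours=FORECAST_HOURS):
    """Number of time steps needed to cover the forecast horizon."""
    total_seconds = forecast_hours * 3600
    if dt_seconds <= 0 or total_seconds % dt_seconds != 0:
        raise ValueError("time step must be positive and divide the horizon")
    return total_seconds // dt_seconds


suggested = steps_per_forecast(360)    # time step suggested by the platform
user_defined = steps_per_forecast(60)  # user-defined time step
print(suggested, user_defined, user_defined // suggested)  # 480 2880 6
```

The 60 s setup therefore requires six times as many steps as the 360 s setup for the same forecast window.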

Forecast results are available from the Outputs Viewer, where the user can generate maps and time series of water levels and velocities (Fig. 11). The user can also download the model forecasts for comparison with data from other sources. A comparison between the forecasts and the data from the Lagos tidal gauge (ftp://ftp.dgterritorio.pt/Maregrafos/Lagos) is presented in Fig. 12, illustrating the good quality of the OPENCoastS predictions. Both forecasts provided plausible results, but further data (in particular velocity data) would be required to assess which setup (regarding the time step) provides more accurate results.

Fig. 12. Comparison between the water level forecasts at the Lagos station provided by OPENCoastS (time step = 60 s) and data from the Lagos tidal gauge.
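A comparison such as the one in Fig. 12 can be quantified once the downloaded forecasts and the tide gauge record are sampled at the same instants. The sketch below computes the bias and the root-mean-square error of the modelled water levels. These two metrics are our choice for illustration and are not reported in this paper, and the numbers in the example are synthetic placeholders rather than Lagos data.

```python
import math


def compare_series(model, observed):
    """Return (bias, rmse) of model values against observations.

    Both sequences must have the same length and refer to the same instants.
    """
    if len(model) != len(observed) or len(model) == 0:
        raise ValueError("series must be non-empty and of equal length")
    diffs = [m - o for m, o in zip(model, observed)]
    bias = sum(diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return bias, rmse


# Synthetic water levels (m), for illustration only
model_levels = [1.10, 0.42, -0.35, -0.98]
gauge_levels = [1.00, 0.50, -0.40, -1.00]
bias, rmse = compare_series(model_levels, gauge_levels)
print(f"bias = {bias:.3f} m, rmse = {rmse:.3f} m")
```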

5 Final Considerations

The OPENCoastS forecast service was enhanced with a new strategy to facilitate its uptake by users that are unfamiliar with numerical models but need coastal predictions, through the creation of an open repository for computational grids. This repository combines GitHub for file organization and availability, and Zenodo for indexing, preservation and digital identification through a DOI. This approach targets an enhancement of knowledge on coastal conditions among coastal managers and the general public, and the sharing of tools from the research community to the society at large. Simultaneously, since each grid will have its own digital identifier (DOI), the efforts of the research community can be broadly recognized in publications, sites and other dissemination fora. It is the expectation of the OPENCoastS team that the research community can now contribute to enrich this repository and provide computational grids for coastal systems worldwide.

This repository completes OPENCoastS v1, reaching all the requirements that were set up at design stage [3, 4]:

  1. Broadly available: this service is available through a broad access platform, powered by INCD and IFCA computational resources;

  2. Simple and usable: through several training events and the availability of the present repository, all coastal community users can now benefit from the service;

  3. Comprehensive: all forecast tasks are handled by the configuration assistant, forecasts manager and viewer. A pool of over 150 users and over 120 deployments over the past year supports this achievement;

  4. Accurate and flexible: the current computational engine, the model SCHISM [14], has proven robust and accurate, integrated with the several options for atmospheric and ocean forcings and for input customizations;

  5. Modular: OPENCoastS was built in a modular way, to provide support to grow from the current 2DH barotropic physics to other processes.

OPENCoastS is being extended to other processes. In the near future, both 3D baroclinic and coupled wave-current simulations will be offered to the users. In the long run, extensions to water quality and morphodynamic problems are also envisioned.