Uncertainty Propagation in Telemac 2D Dam Failures Modelling and Downstream Hazard Potential Assessment

this work addresses uncertainty propagation in TELEMAC 2D models with respect to two major types of risks in river hydrodynamics: flood hazard and dam failures. The studied case is a TELEMAC 2D model that extends over approximately 14.4 km with a river length of 41 km including 3 major tributaries to the main river and 3 dams. The implementation of the uncertainty propagation approach would not have been feasible and accomplished without the open source platform SALOME-HYDRO and the TELAPY module (PYTHON API) of the TELEMACMASCARET SYSTEM. The first step consisted of quantifying uncertain parameters for the acquired hydraulic model and defining adequate probability distributions based on expert judgment and previous specific studies that have been provided by EDF. A sensitivity analysis based on Morris screening method was then carried out to reduce the number of uncertain factors. Uncertainty propagation algorithms such as Monte Carlo and Polynomial Chaos expansion were used to estimate the maximum water depths and velocities, as well as their statistical moments such as the mean and variance and the Sobol indices of the considered parameters. The use of parallelism proved to be necessary to optimize the computation time. The final results are then used to assess the flood casualties and the flood damages. This second estimation is based on the FLOODRISK plugin of QGIS.


INTRODUCTION
Numerical river hydraulics is based on the discretization of partial differential equations (Saint-Venant or Navier-Stokes) which include simplifying assumptions, input data such as rating curves, bathymetry, hydrographs, and parameters including uncertainties that may influence the results. In the current configuration, most parameters are calibrated a posteriori to ensure a good accuracy and representation of the flow dynamics. However, if the calibration could not be carried out due to lack of data, then the validation of the results is based on expert judgment and is subject to high uncertainties. Thus, uncertainty quantification can prove to be a valuable decision making tool since it can determine confidence intervals and whether model outputs will comply with the regulatory requirements (e.g. design requirements) given the random variation in inputs.
In this thesis, the uncertainty propagation methodology presented in Fig.1 is followed [1]. Three main steps are identified: Step A consists in defining the model, the statistical quantity of interest and the corresponding criteria (e.g. criteria on failure probability). The model description is similar to a classical deterministic approach as it defines the inputs and outputs of the model.
Step B consists in quantifying sources of uncertainties on model input parameters which will be described by adequate probability distributions. The result of this step is a random vector of all uncertain variables which is represented by the joint probability distribution of all marginal distributions and a copula that describes the dependence between the variables.
Step C consists in propagating uncertainties on the input through the model. In most cases, a sensitivity analysis (step C´) is required to assess the influence and the importance of input parameters with respect to the randomness of the output. Figure 1 General framework for uncertainty propagation studies [2] One objective of this work is to apply the previously defined steps; adopted by EDF R&D [2], on a TELEMAC 2D hydraulic model that includes dam failures scenarios through the conception of user-friendly PYTHON scripts. The main methods tested in this work are the classical algorithm of Monte Carlo using different sampling techniques such as Quasi-Monte Carlo with low discrepancy sequences, the Morris screening method for sensitivity analysis and the Polynomial Chaos Expansion to build surrogate models.
The model used to test this method of uncertainty propagation is part of an incremental damage study in which submersion waves are simulated for a reference no dam break scenario and an adverse dam break scenario, increments are calculated by comparing the results of both scenarios and flood damages are assessed.

A. Hydraulic TELEMAC 2D model
The geographical location of the modeled river is confidential. The river is delimited by a city downstream which represent a major vulnerability to flood risk in case of dam breaks. The model extends over approximately 14.4 km 2 with a river length of 41 km and it includes 3 major tributaries to the main river. The mesh contains approximately 185 000 nodes. The DEM of the model is given in Fig.2.
The simulated discharge correspond to a return period of 5000 years and it deterministic value is estimated by the Schadex method [3]. A factor 10 is considered by assumption for the return period of the tributaries discharges; i.e. the discharges considered for the tributaries are associated with a return period of 500 years.
The dams are modeled by prescribing rating curves on the upstream boundaries and prescribing discharges on the downstream boundaries. A dam failure correspond to the event of exceeding a dam stability threshold defined based on expert judgment. The rating curves are switched once the break occurs via TELAPY PYTHON script and user FORTRAN file. The solution of Ritter is considered to represent the rating curve associated with the break [4]. The model also includes 7 bridges which are modelled as drag forces in the FORTRAN subroutine Dragfo. The bridges are considered unstable if the water level upstream of the bridge is higher than the bridge deck level.
In terms of computation time, one run of the model takes approximately 1h10min on 56 processors. This was found to be the optimal number of processors for the studied model. For this study, the variables of interest are the maximal free surface elevation and the maximal water level (for the flood damages quantification). The statistical quantities of interest are the statistical moments; mainly the mean and the variance, as well as sensitivity measures such as Sobol indices.

B. Quantification of uncertain parameters
The quantification of uncertain parameters is carried out based on the categories of the model input data: Flow discharges: they correspond to the peak flood discharges at the main river and the 3 tributaries. For this model and based on expert judgment, only the discharges of the main river and one of the tributaries are considered uncertain. Their probability distribution is the truncated normal distribution with a mean equal to the deterministic peak flood discharge estimated with the Schadex method. Variances of approximately 5% and 25% of the means were respectively taken for the main river and the tributary.
Strickler coefficients: the hydraulic model is divided into 5 areas with different Strickler coefficients. All these coefficients are considered uncertain following a uniform distribution. Since the model was not calibrated, the bounds for these distributions were estimated based on literature values [5] and are given in Table 1. Dam failure thresholds: the occurrence of a dam break is defined as the event of the hydraulic head upstream of the dam exceeding a specific stability threshold. The latter is considered uncertain following a truncated normal distribution. The parameters of the probability distributions of dam failure thresholds for the 3 dams included in the model were defined relying on expert judgment and previous EDF studies. The values of these thresholds are confidential.
Drag coefficients for bridges: based on expert judgment, one of the seven bridges is considered stable since the vulnerabilities are located upstream of the bridge. The other six bridges which are modelled as drag forces, are unstable if the water level exceeds the bridge deck level. The drag coefficients are then considered uncertain since they are empirically estimated based on the shape and material of the bridge. They follow a uniform distribution.
Dams rating curves coefficients: the equations used to assess the rating curves include several empirical coefficients that are uncertain and that follow uniform distributions.
Ritter coefficient: the dam break is described by the dam break solution of Ritter with a deterministic value of 0.209 for coefficient of Ritter. It follows a truncated normal distribution with the value 0.3 (spillway overflow coefficient used in the deterministic case to represent dam breaks) as maximal bound.
The quantification step resulted in 27 uncertain parameters that are assumed to be independent.

C. Sensitivity analysis and uncertainty propagation
Given the large number of quantified uncertain parameters, the Morris screening method is tested in order to reduce the problem dimensionality. This method was first introduced in [ The classical algorithm of Monte Carlo is implemented to propagate uncertainties. Convergence of Monte Carlo and Quasi Monte Carlo are studied [8]. The aim is the computation of statistical moments such as the mean and the variance of the maximal water depth and maximal velocity which will be used to assess the flood damages.

D. Dam break scenarios
The scenarios considered in this study are: Break scenario: total and instantaneous dam breaks triggered by the break of the 1 st upstream dam.
Reference scenario: no dam breaks E. Implementation using the APIs and the Clusters The sensitivity analysis and uncertainty propagation methods were implemented using the C++/PYTHON library OPENTURNS [9] designed for the treatment of uncertainties. It coupling with the hydraulic model (i.e. TELEMAC 2D) is facilitated by the use of the TELAPY module [10] which allows to set and run TELEMAC instances via PYTHON.
Since a large number of simulations are going to be executed, the optimization of the computation time using the available EDF clusters is deemed necessary to accomplish this study. If the TELEMAC model only is parallelized, a minimal time of approximately 48 days 14 hours is required for 1000 simulations. However, if the 1000 simulations are also parallelized according to Fig.3, then a minimal time of approximately 23h20min is sufficient for all the simulations.

A. Morris screening method results
For the Morris screening method, several numbers of trajectories were tested as shown in Table 2. The results for the Morris screening method were not coherent with the expert judgment. In fact, the dam break thresholds and the upstream discharge that were expected to have the major influence on the results based on expert judgment were found to have minimal to zero influence. This is due to the constant delta that is chosen for all parameters even though their values and their types differ significantly. A possible solution to this problem would be to perform an iso-probabilistic transformation on the set of input parameters before generating the samples. Thus, to reduce the set of uncertain parameters, we finally used expert judgment.

B. Uncertainty propagation and sensitivity analysis 1. Convergence
Given the incoherent results of the Morris method, the Monte Carlo algorithm is performed using random sampling First, the convergence of the dispersion coefficient ( ) and the mean of Monte Carlo is graphically analysed. The mean is bounded by its 95% confidence interval. The dispersion coefficient and the mean of the maximal surface elevation estimated at node 151080 (node displayed in Fig. 2) and shown in Fig. 4 and Fig.5 suggests that the convergence of Monte Carlo is obtained at approximately 7000 simulations. Quasi Monte Carlo converges more rapidly at approximately 3500 simulations. Second, a convergence study based on a criteria set on the coefficients of variation of the mean and the variance of the Monte Carlo samples is performed. The criteria were defined a priori (based on the precision deemed acceptable for the model at hand) and are given in (2) and (3). The results displayed in Fig. 6 and Fig. 7 and estimated on 2 nodes: node 151080 and node 72177 located on the downstream agricultural floodplain (Fig.2), confirm that the convergence of Monte Carlo is obtained for approximately 7000 simulations.

Polynomial Chaos Expansion based on LARS
This method is tested here since it allows the computation of Sobol indices with smaller samples than Monte Carlo or Saltelli algorithm [14]. The theoretical number of simulations required to construct a surrogate model of degree 4 is given by: This number implies a computation time of approximately 47 days on the Eole cluster.
Although the available number of Monte Carlo simulations (~7200) is inferior to the theoretical number required for a PCE (4), this method was still tested with this sample. A crossvalidation was then performed on the constructed surrogate model using a validation Monte Carlo sample of size 1000.
First, all 7200 simulations were used to construct surrogate models of different degrees in order to find the optimum precision. The reference values are the mean and variance of the 7200. The differences between the means and the variances are approximately and respectively (as shown in Fig.8 and Fig.9). These errors are acceptable and the degree 4 is thus retained for an accuracy study based on the approximation accuracy coefficient computed using (5).
Where is the Leave-One-Out error. This error is a special case of K-fold error estimate where the number of folds is chosen equal to the cardinality N of the experimental design . Let's denote the surrogate model of the real model M, the surrogate model built from the experimental design with the i-th sample being set aside, and the empirical covariance of the response sample Y. The Leave-One-Out error can be calculated using (6).  A cross validation is performed for the PCE surrogate model of degree 4 constructed with 400 Monte Carlo simulations using 1000 samples. Fig.11 indicates that the surrogate model gives a good approximation even if the sample size is less than the required theoretical number of simulations calculated in (4).

Sobol' indices using PCE surrogate models
The surrogate models that have been built using polynomial chaos expansion can also be used to perform a sensitivity analysis by computing Sobol indices. In fact, first order Sobol indices have been calculated using a PC model of degree 4 built with all 7200 Monte Carlo samples. In Fig.12, the conveyance coefficients of dam j are denoted , where is the number of conveyance coefficients of dam j. The drag coefficients of bridges are denoted , the Strickler coefficients are denoted and the surface elevation dam failure thresholds are denoted . and refer respectively to the discharges of the main river and the tributary. The results shown in Fig.12 are more coherent with the expert judgment than those of the Morris method. The Strickler coefficient of forest areas stands out as the most influential parameter. This could be explained by the location of the node used for the computation of these indices (node 151080) or by the fact that the surrogate model still needs to be refined.

Post-treatment and damages assessment
Statistical moments such as the mean and the variance of the maximal water depth of the Monte Carlo output sample have been computed. For visualization purposes, these moments are integrated in a result MED file using HERMES APIs, MEDCOUPLING and MEDLOADER [11]. An example of the addition of the mean of the maximal surface elevation from 7200 Monte Carlo simulations to the results file is given in Fig.13.
Downstream flood damages were quantified using 2 methods: Ramsbottom or Flood risk to People method [12]: used to quantify casualties based on a danger factor.
FLOODRISK plugin of QGIS [13]: used to evaluate economic damages based on depth-damage curves. The FLOODRISK plugin was used to quantify the damages downstream of the last dam using 7200 simulations. The resulting vulnerability map is given in Fig.14.
The vulnerabilities are mainly concentrated downstream of the last dam.

IV. PERSPECTIVES AND LIMITATIONS
The results presented here remain indicatory of the prospects uncertainty propagation and sensitivity analysis methods can provide. In fact, a 2 nd scenario describing total and consecutive dam breaks (any dam can be the first to fail) is considered and its results still need to be post-treated. Monte Carlo was not exploited yet to calculate Sobol indices since it requires a larger number of simulations [14]. Other nodes, for which the same methods can be applied, located along the streamline or the floodplain can be taken into account. Hence, maps of Sobol indices can be created. Another perspective is the quantification of casualties using the Ramsbottom method, as well as automating the FLOODRISK method without using the QGIS interface.
As for the limitations of the study, the convergence criteria were defined for the specific model at hand. Hence, one should modify and adapt these criteria based on expert judgment and regulatory requirements.
All the methods and the steps of this study were carried out using PYTHON scripts. However, this methodology can be partially implemented using the SALOME-HYDRO platform except for the user FORTRAN and the TELAPY scripts that handle specific aspects of the hydraulic simulation (i.e. dam breaks here). SALOME-HYDRO only lacks the launching parallelization which should be soon available.

V. CONCLUSIONS
The implementation of the uncertainty propagation approach would not have been feasible and accomplished without the open source platform SALOME-HYDRO, OPENTURNS and the Figure 13 Integration of the mean of surface elevation to the MED result file TELAPY module (PYTHON API) of the TELEMAC-MASCARET SYSTEM. The main difficulty -which is generally common to probabilistic models that treat uncertainty propagation -remains the optimization of computation time with respect to the number of simulations considered and the run-time of the TELEMAC case; whether it is in parallel or sequential mode.
The main objectives of this study; which were the application of the uncertainty propagation methods on a hydraulic model at an engineering scale and the conception of user-friendly PYTHON scripts that makes such parametric studies within the reach of any TELEMAC user, were accomplished. From an engineering point of view, the quantification of sources of uncertainties and their representation with suitable probability distributions is the step that takes the longest time to complete since it mostly relies on expert judgment.