rwgen.RainfallModel.fit

RainfallModel.fit(fitting_method='default', parameter_bounds=None, fixed_parameters=None, n_workers=1, output_filenames='default', fit_nsrp=True, fit_shuffling=False, random_seed=None, pdry_iterations=2, use_pooling=True)

Fit model parameters.

Depends on the self.reference_statistics attribute. Sets self.parameters and self.fitted_statistics. Also writes the parameters and fitted_statistics output files.

Parameters:

fitting_method (str) – Flag to indicate fitting method. Using 'default' will fit each month or season independently. Other options under development.
parameter_bounds (dict or str or pandas.DataFrame) – Dictionary containing tuples of (lower, upper) parameter bounds by parameter name. Alternatively the path to a parameter bounds file or an equivalent dataframe (see Notes).
fixed_parameters (dict or str or pandas.DataFrame) – Dictionary containing fixed parameter values by parameter name. Alternatively the path to a parameters file or an equivalent dataframe (see Notes).
n_workers (int) – Number of workers (cores/processes) to use in fitting. Default is 1.
output_filenames (str or dict) – Either key/value pairs indicating output file names, 'default' to use {'statistics': 'fitted_statistics.csv', 'parameters': 'parameters.csv'}, or None to indicate that no output files should be written.
fit_nsrp (bool) – Indicates whether to fit NSRP parameters.
fit_shuffling (bool) – Indicates whether to fit the "delta" parameter that controls the probability of selecting more/less similar storms during shuffling, as well as the parameters of the periodic monthly AR1 model.
random_seed (int or numpy.random.SeedSequence) – Seed for reproducibility in fitting (currently used for delta only).
pdry_iterations (int) – Number of iterations to use to correct bias between fitted (analytical) dry probability and simulated dry probability. Default is 2.
use_pooling (bool) – Indicates whether to use pooled statistics in NSRP fitting for a spatial model.

Notes

The parameters used by the model are:

lamda - reciprocal of the mean waiting time between adjacent storm origins [h-1]

beta - reciprocal of the mean waiting time for raincell origins after storm origin [h-1]

eta - reciprocal of the mean duration of raincells [h-1]

nu - mean number of raincells per storm (specified for point model only) [-]

theta - mean intensity of raincells [mm h-1]

gamma - reciprocal of mean radius of raincells (spatial model) [km-1]

rho - spatial density of raincell centres (spatial model) [km-2]

Note also that:

For a spatial model, the mean number of raincells overlapping a given location is related to rho and gamma, such that nu can be inferred.

If using intensity_distribution='weibull', theta is the scale parameter and an additional parameter (kappa) is introduced as the shape parameter.
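
As a rough illustration of the first note above, the sketch below infers nu from rho and gamma. It assumes circular raincells with exponentially distributed radii (mean radius 1/gamma), which gives a mean cell area of 2*pi/gamma**2. That assumption is not stated in these notes, and the helper name mean_overlapping_raincells is hypothetical rather than part of rwgen.

```python
import math


def mean_overlapping_raincells(rho, gamma):
    """Mean number of raincells overlapping a point (hypothetical helper).

    Assumption: circular raincells with exponentially distributed radii
    (mean radius 1/gamma), so the mean cell area is 2*pi/gamma**2 and the
    expected number of cells covering a point is rho times that area.
    """
    if rho <= 0 or gamma <= 0:
        raise ValueError("rho and gamma must be positive")
    return 2.0 * math.pi * rho / gamma ** 2


# Example values chosen to lie inside the default spatial bounds:
# rho in km-2, gamma in km-1.
nu = mean_overlapping_raincells(rho=0.01, gamma=0.1)
```

With these example values the ratio rho / gamma**2 is 1, so nu is approximately 2*pi.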

The parameter_bounds argument can be specified by a dictionary like dict(beta=(0.02, 1.0)). For more control a dataframe (or .csv file) can be passed. For example, if a model has two (6-month) seasons (using arbitrary example numbers):

Season	Parameter	Lower_Bound	Upper_Bound
1	Beta	0.02	0.1
2	Beta	0.1	1.0
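
A minimal sketch of how the seasonal bounds table above could be serialised to .csv text using only the standard library. The comma delimiter, the file name parameter_bounds.csv and the helper bounds_to_csv are assumptions for illustration; check the format expected by your installed version.

```python
import csv
import io

# Rows mirror the two-season example table above (arbitrary example numbers).
rows = [
    {"Season": 1, "Parameter": "Beta", "Lower_Bound": 0.02, "Upper_Bound": 0.1},
    {"Season": 2, "Parameter": "Beta", "Lower_Bound": 0.1, "Upper_Bound": 1.0},
]


def bounds_to_csv(rows):
    """Serialise bounds rows to CSV text (comma delimiter is an assumption)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["Season", "Parameter", "Lower_Bound", "Upper_Bound"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = bounds_to_csv(rows)

# Sanity check: each lower bound must sit below its upper bound.
assert all(r["Lower_Bound"] < r["Upper_Bound"] for r in rows)

# Hypothetical usage, given an already-prepared RainfallModel instance `model`:
# with open("parameter_bounds.csv", "w") as f:
#     f.write(csv_text)
# model.fit(parameter_bounds="parameter_bounds.csv")
```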

If one or more parameters should be fixed across all seasons, they can be passed to fixed_parameters as e.g. dict(beta=0.1, theta=1). Otherwise a table with one row per season can be provided, like:

Season	Lamda	Beta
1	0.015	0.05
2	0.012	0.04

Note that if a parameter is not found in the parameter_bounds argument then default bounds will be used. Similarly, if it is not found in fixed_parameters it is assumed that the parameter must be fitted.

The current default parameter bounds for a point (gauge/site) model are given below, where -1 indicates that the bounds are applied across all months/seasons. The values are largely taken from the RainSim V3.1 documentation:

Season	Parameter	Lower_Bound	Upper_Bound
-1	lamda	0.00001	0.02
-1	beta	0.02	1.0
-1	nu	0.1	30.0
-1	eta	0.1	60.0
-1	theta	0.25	100.0
-1	kappa	0.5	1.0

And for a spatial model:

Season	Parameter	Lower_Bound	Upper_Bound
-1	lamda	0.001	0.05
-1	beta	0.02	0.5
-1	rho	0.0001	0.05
-1	eta	0.1	12.0
-1	gamma	0.01	500.0
-1	theta	0.25	100.0
-1	kappa	0.5	1.0
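
The fallback behaviour described in these notes (default bounds for any parameter missing from parameter_bounds, and fitting of any parameter missing from fixed_parameters) can be sketched as follows. The helper resolve_fit_setup is hypothetical and only illustrates the logic; it is not rwgen's implementation. The defaults are copied from the point-model table above, with kappa left out on the assumption of an exponential intensity distribution.

```python
# Default (lower, upper) bounds copied from the point-model table above.
# kappa is omitted: this sketch assumes the Weibull option is not in use.
DEFAULT_POINT_BOUNDS = {
    "lamda": (0.00001, 0.02),
    "beta": (0.02, 1.0),
    "nu": (0.1, 30.0),
    "eta": (0.1, 60.0),
    "theta": (0.25, 100.0),
}


def resolve_fit_setup(parameter_bounds=None, fixed_parameters=None,
                      defaults=DEFAULT_POINT_BOUNDS):
    """Hypothetical helper: return (bounds of fitted parameters, fixed values)."""
    user_bounds = parameter_bounds or {}
    fixed = dict(fixed_parameters or {})
    bounds = {}
    for name, default in defaults.items():
        if name in fixed:
            continue  # fixed parameters are not fitted, so they need no bounds
        # Fall back to the default bounds when the user gave none.
        bounds[name] = user_bounds.get(name, default)
    return bounds, fixed


bounds, fixed = resolve_fit_setup(
    parameter_bounds=dict(beta=(0.02, 0.5)),
    fixed_parameters=dict(theta=1.0),
)
```

Here beta takes the user-supplied bounds, theta is held fixed and dropped from the bounds, and the remaining parameters keep their defaults.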

Fitting can be sped up significantly with n_workers > 1. The value of n_workers should not exceed the number of cores or logical processors available.
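
One way to respect this limit is to cap the requested worker count at the number of logical processors reported by the standard library. This is a minimal sketch; choose_n_workers is a hypothetical helper and not part of rwgen.

```python
import os


def choose_n_workers(requested):
    """Cap the requested worker count at the number of logical processors."""
    available = os.cpu_count() or 1  # os.cpu_count() can return None
    return max(1, min(requested, available))


n_workers = choose_n_workers(requested=4)

# Hypothetical usage, given an already-prepared RainfallModel instance `model`:
# model.fit(n_workers=n_workers)
```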