
Low-level function to validate input `data`. Returns a list of dataframes containing fails, warnings and replacements, together with the observed values that pass. The function detects the model (gis or physical) and the area (GB or NI) and applies validation rules as required. Summary of checks:

  1. Input 'data' exists.

  2. Input is dataframe.

  3. Required columns are present.

  4. Columns have correct class.

  5. Conditional columns, if present, have the correct class.

  6. Where multiple classes are allowed, convert columns to a standardised class, for example integer to numeric.

  7. Assess if model 'gis' (model 44) or 'physical' (model 1) based on input columns.

  8. Assess if input is 'gb' or 'ni' based on grid reference.

  9. Additional columns/variables are calculated, for example mean temperature.

  10. Failures and warnings are logged using the `validation_rules` table.

  11. Replace values if an input value is zero (or close to zero) to avoid log10(0) and related errors.

  12. Returns dataframe with passing values, notes on warnings, fails and replacements, and the model and area parameters.
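A few of the checks above can be sketched in base R. This is only an illustrative toy, not the code used by `rict_validate`: the helper `toy_validate`, the `SLOPE` column, the `TYPE` column in the log and the replacement value `0.1` are all assumptions made for the example. It shows why zero values are replaced: in R, `log10(0)` evaluates to `-Inf`, which would propagate into later calculations.

```r
# Hypothetical helper illustrating checks 1-4, 6 and 11 on a toy input.
# Column names and the replacement value are assumptions for illustration.
toy_validate <- function(data, required = c("SITE", "YEAR", "SLOPE")) {
  if (is.null(data)) stop("Input 'data' is missing")             # check 1
  if (!is.data.frame(data)) stop("Input must be a dataframe")    # check 2
  missing_cols <- setdiff(required, names(data))                 # check 3
  if (length(missing_cols) > 0) {
    stop("Missing required columns: ", paste(missing_cols, collapse = ", "))
  }
  if (!is.numeric(data$SLOPE)) stop("SLOPE must be numeric")     # check 4
  data$SLOPE <- as.numeric(data$SLOPE)                           # check 6: integer to numeric

  # check 11: replace zeros so that log10() does not return -Inf
  zero <- !is.na(data$SLOPE) & data$SLOPE == 0
  data$SLOPE[zero] <- 0.1

  # Log each replacement so it can be reported back to the user
  checks <- data.frame(
    SITE = data$SITE[zero],
    TYPE = rep("replacement", sum(zero)),
    stringsAsFactors = FALSE
  )
  list(data = data, checks = checks)
}

toy_input <- data.frame(
  SITE = c("A", "B"),
  YEAR = c(2020L, 2020L),
  SLOPE = c(0L, 5L),
  stringsAsFactors = FALSE
)
result <- toy_validate(toy_input)
```

Returning the cleaned data and the log of changes side by side mirrors the shape described in the Value section, where passing data and checks are separate list elements.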

Usage

rict_validate(
  data = NULL,
  row = FALSE,
  stop_if_all_fail = TRUE,
  area = NULL,
  crs = NULL
)

Arguments

data

dataframe of observed environmental variables

SITE

Site identifier

Waterbody

Water body identifier

...
row

Boolean - if set to `TRUE`, returns the row number from the input file (data) for each check. This makes it easier to link checks (fails, warnings etc.) to the associated row in the input data. It is most relevant when multiple samples from the same site and year are passed to `rict_validate` as separate rows. In this case, SITE and YEAR are not enough to link validation checks to specific rows in the input data.
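The ambiguity described above can be seen in a small base R sketch. The data frame `samples` and the `ROW` column name are hypothetical and only illustrate the idea; the actual column returned by `rict_validate` may be named differently.

```r
# Toy input: two samples from the same site and year, plus one other site
samples <- data.frame(
  SITE = c("S1", "S1", "S2"),
  YEAR = c(2021, 2021, 2021),
  stringsAsFactors = FALSE
)

# Flag rows whose SITE + YEAR combination occurs more than once
key <- samples[c("SITE", "YEAR")]
ambiguous <- duplicated(key) | duplicated(key, fromLast = TRUE)

# A row number gives every sample a unique identifier (hypothetical column name)
samples$ROW <- seq_len(nrow(samples))
```

Here the first two rows share the same SITE and YEAR, so a check reported against "S1, 2021" could refer to either of them, whereas the row number is unique.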

stop_if_all_fail

Boolean - if set to `FALSE`, the validation function will return an empty dataframe for valid `data` rather than stopping. This is useful if you want to run validation checks without stopping the process.

area

Area is detected by default from the NGR, but you can provide the area parameter as 'iom', 'gb' or 'ni' for testing purposes.

crs

Optionally set `crs` to `29903` for the Irish projection system.

Value

List of dataframes and other parameters:

data

Dataframe of input data that passes validation rules

checks

Dataframe listing fails, warnings and replacements

model

Returns model detected based on columns in input file

area

Returns area detected based on Grid Reference in input file
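As a rough sketch of how the returned list might be inspected, assuming the rict package is installed and that the list elements are named as listed above (`data`, `checks`, `model`, `area`). The `has_rict` guard is an addition for this example so the code is skipped when the package is not available.

```r
# Only run if the rict package is available (guard added for this sketch)
has_rict <- requireNamespace("rict", quietly = TRUE)

if (has_rict) {
  library(rict)
  validations <- rict_validate(demo_observed_values, row = TRUE)

  print(head(validations$data))    # observed values that pass validation
  print(head(validations$checks))  # fails, warnings and replacements
  print(validations$model)         # model detected from the input columns
  print(validations$area)          # area detected from the grid reference
}
```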

Examples

if (FALSE) {
validations <- rict_validate(demo_observed_values, row = TRUE)
}