Helpers in QC checks for individual reports module
- marine_qc.qc_individual_reports._do_daytime_check(year, month, day, hour, lat, lon, time_since_sun_above_horizon, mode)[source]
Determine if the sun was above the horizon a specified time before the report.
- Parameters:
year (1D np.ndarray of int) – Year(s) of observation.
month (1D np.ndarray of int) – Month(s) of observation (1-12).
day (1D np.ndarray of int) – Day(s) of observation.
hour (1D np.ndarray of float) – Hour(s) of observation (minutes as decimal).
lat (1D np.ndarray of float) – Latitude(s) of observation in degrees.
lon (1D np.ndarray of float) – Longitude(s) of observation in degrees.
time_since_sun_above_horizon (float) – Maximum time, in hours, that the sun can have been above (or below) the horizon for a report to still count as night. The original QC test set this to 1.0, i.e. it was night between one hour after sundown and one hour after sunrise.
mode ({"day", "night"}) – If “day”, check if the sun is above the horizon. If “night”, check if the sun is below the horizon.
- Return type:
- Returns:
np.ndarray of int – Returns 2 (or array/sequence/Series of 2s) if any of do_position_check, do_date_check, or do_time_check returns 2.
Returns 1 (or array/sequence/Series of 1s) if any of do_position_check, do_date_check, or do_time_check returns 1 or if it is night (sun below horizon an hour ago).
Returns 0 if it is day (sun above horizon an hour ago).
- Raises:
ValueError – If mode is not in valid list [“day”, “night”].
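The precedence of the return values described above (2 over 1 over 0) can be sketched with an element-wise maximum. This is only an illustration for the case in which night yields 1: `combine_flags` and its arguments are hypothetical names, and the day/night mask is assumed to have been computed already.

```python
import numpy as np


def combine_flags(position, date, time, sun_below_horizon):
    """Combine component QC flags with a day/night mask (hypothetical helper)."""
    # In this sketch night maps to flag 1 and day to flag 0.
    night = np.asarray(sun_below_horizon, dtype=bool).astype(int)
    # The element-wise maximum reproduces the precedence 2 > 1 > 0.
    return np.maximum.reduce([
        np.asarray(position),
        np.asarray(date),
        np.asarray(time),
        night,
    ])


flags = combine_flags(
    position=[0, 0, 2, 0],
    date=[0, 0, 0, 1],
    time=[0, 0, 0, 0],
    sun_below_horizon=[False, True, True, False],
)
print(flags.tolist())  # [0, 1, 2, 1]
```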
Helpers in multiple checks module
- marine_qc.multiple_checks._get_function(name)[source]
Return the function with the given name, or raise a NameError.
- marine_qc.multiple_checks._get_requests_from_params(params, func, data)[source]
Get requests from func or data using params.
Given a dictionary of key value pairs where the keys are parameters in the function, func, and the values are columns or variables in data, create a new dictionary in which the keys are the parameter names (as in the original dictionary) and the values are the numbers extracted from data.
- Parameters:
params (Mapping or None) – Dictionary. Keys are parameter names for the function func, and values are the names of columns or variables in data.
func (Callable) – Function for which the parameters will be checked.
data (pd.Series or pd.DataFrame) – Series or DataFrame containing the data to be extracted.
- Return type:
- Returns:
Mapping – Dictionary containing the key value pairs where the keys are as in the input dictionary and the values are extracted from the corresponding columns of data.
- Raises:
ValueError – If one of the dictionary keys from params is not a valid argument in func.
NameError – If one of the dictionary values from params is not a column or variable in data.
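A minimal sketch of the lookup described above, using `inspect.signature` to validate the keys. It is not the module's implementation: the function name without the leading underscore, the error messages, and the empty result for `params=None` are assumptions. A plain dict stands in for the DataFrame, since both support `in` and `[]`.

```python
import inspect


def get_requests_from_params(params, func, data):
    """Resolve parameter names of ``func`` to values taken from ``data``."""
    requests = {}
    if params is None:
        # Assumption: no mapping means nothing is requested.
        return requests
    valid = inspect.signature(func).parameters
    for param, column in params.items():
        if param not in valid:
            raise ValueError(f"'{param}' is not a valid argument of {func.__name__}.")
        if column not in data:
            raise NameError(f"'{column}' is not available in the input data.")
        requests[param] = data[column]
    return requests


def example_check(lat, lon):  # hypothetical QC function
    return 0


# A plain dict stands in for a DataFrame or Series here.
row = {"latitude": 52.5, "longitude": 13.4}
requests = get_requests_from_params(
    {"lat": "latitude", "lon": "longitude"}, example_check, row
)
print(requests)  # {'lat': 52.5, 'lon': 13.4}
```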
- marine_qc.multiple_checks._prepare_functions(config, data, preprocessed=None, execute=False)[source]
Prepare functions defined in a configuration dictionary.
- Parameters:
config (Mapping[str, Mapping[str, Any]]) – Dictionary describing functions, their inputs, and arguments.
data (pd.DataFrame or pd.Series) – Data used to extract requested parameters.
preprocessed (Mapping[str, Any], optional) – Previously computed preprocessed variables (used for QC functions).
execute (bool, default: False) – If True, execute the functions and return their results. If False, return function references and resolved arguments.
- Return type:
- Returns:
Mapping[str, Any] – If execute=True, returns a dict mapping names to results. If execute=False, returns a dict mapping names to dicts: {“function”: callable, “requests”: dict, “kwargs”: dict}.
- marine_qc.multiple_checks._apply_qc_to_masked_rows(qc_func, args, kwargs, data_index, mask)[source]
Apply a QC function to masked rows and return a Series aligned to data_index.
- Parameters:
qc_func (Callable) – QC function to execute.
args (Mapping[str, Any]) – Keyword arguments constructed from requests.
kwargs (Mapping[str, Any]) – Additional keyword arguments, typically from preprocessed variables.
data_index (pd.Index) – Full index of the dataset for aligning the QC result.
mask (pd.Series) – Boolean mask indicating which rows the QC function applies to.
- Return type:
- Returns:
pd.Series – A Series indexed by data_index containing QC results for masked rows and default values elsewhere.
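The masking behaviour can be illustrated as follows. This is a sketch under assumptions, not the library code: the `default` argument (set here to 3, the "untested" flag mentioned elsewhere on this page), the handling of non-Series arguments, and `range_check` are all hypothetical.

```python
import numpy as np
import pandas as pd


def apply_qc_to_masked_rows(qc_func, args, kwargs, data_index, mask, default=3):
    """Run ``qc_func`` on rows where ``mask`` is True; fill the rest with ``default``."""
    result = pd.Series(default, index=data_index, dtype=int)
    if not mask.any():
        return result
    # Restrict every Series argument to the masked rows; pass other values through.
    masked_args = {
        name: value[mask] if isinstance(value, pd.Series) else value
        for name, value in args.items()
    }
    flags = qc_func(**masked_args, **kwargs)
    # Write the flags back to the masked positions only.
    result.loc[mask] = np.asarray(flags)
    return result


def range_check(value, upper):  # hypothetical QC function: 1 = above limit
    return (value > upper).astype(int)


sst = pd.Series([10.0, 45.0, 12.0, 50.0])
mask = pd.Series([True, True, False, True])
out = apply_qc_to_masked_rows(range_check, {"value": sst}, {"upper": 40.0}, sst.index, mask)
print(out.tolist())  # [0, 1, 3, 1]
```

The third row is never passed to `range_check`, so it keeps the default value.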
- marine_qc.multiple_checks._normalize_groupby(data, groupby)[source]
Return iterable of (name, group_df) pairs, trimming invalid rows.
- Parameters:
data (pd.Series or pd.DataFrame) – Hashable input data.
groupby (DataFrameGroupBy or object) – A groupby object or column(s) to group by. If None, the full DataFrame is returned as a single group.
- Return type:
- Returns:
list[tuple[Any, pd.DataFrame]] – A list of tuples containing the group name (or None) and the corresponding DataFrame slice.
- marine_qc.multiple_checks._normalize_input(data, return_method)[source]
Validate the return method and ensure the input is a DataFrame.
Converts a Series to a single-column DataFrame and tracks if the original input was a Series.
- Parameters:
data (pd.Series or pd.DataFrame) – Hashable input data.
return_method ({'all', 'passed', 'failed'}) – Specifies which rows to return; must be one of ‘all’, ‘passed’, or ‘failed’.
- Return type:
- Returns:
tuple of (pd.DataFrame, bool) –
Normalized DataFrame version of the input.
Boolean indicating if the original input was a Series.
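The normalization step might look like the sketch below. The exception type raised for an invalid `return_method` is an assumption, since it is not stated above, and `Series.to_frame()` is used as one way to obtain a single-column DataFrame.

```python
import pandas as pd

VALID_RETURN_METHODS = ("all", "passed", "failed")


def normalize_input(data, return_method):
    """Validate ``return_method`` and return ``(DataFrame, was_series)``."""
    if return_method not in VALID_RETURN_METHODS:
        # Assumption: an invalid value raises ValueError.
        raise ValueError(
            f"return_method must be one of {VALID_RETURN_METHODS}, got {return_method!r}."
        )
    was_series = isinstance(data, pd.Series)
    if was_series:
        # A Series becomes a DataFrame with exactly one column.
        data = data.to_frame()
    return data, was_series


frame, was_series = normalize_input(pd.Series([10.0, 12.5], name="sst"), "all")
print(frame.shape, was_series)  # (2, 1) True
```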
- marine_qc.multiple_checks._prepare_all_inputs(data, qc_dict, preproc_dict)[source]
Build all inputs required for QC execution.
This includes preprocessed variables, resolved QC function arguments, an initial boolean mask, and an empty results table.
- Parameters:
- Return type:
- Returns:
tuple of (Mapping, pd.Series, pd.DataFrame) –
QC inputs dictionary: {qc_name: {function, requests, kwargs}}.
Initial boolean mask Series (all True).
Empty results DataFrame with shape (n_rows, n_qcs).
- marine_qc.multiple_checks._group_iterator(data, groupby)[source]
Yield groups of a DataFrame as (group_name, group_df) pairs.
If groupby is None, yields the entire DataFrame as a single group. Otherwise, yields each group as returned by _normalize_groupby.
- Parameters:
- Yields:
tuple of (Any, pd.DataFrame) – Tuples containing the group key (or None) and the corresponding DataFrame for that group.
- Return type:
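A simplified generator with the same yielding behaviour is sketched below. It only covers `None` and column name(s) passed to `DataFrame.groupby`; handling of pre-built GroupBy objects and the trimming done by _normalize_groupby are left out, and the column names are made up for the example.

```python
import pandas as pd


def group_iterator(data, groupby=None):
    """Yield ``(group_name, group_df)`` pairs; one pair if ``groupby`` is None."""
    if groupby is None:
        yield None, data
        return
    # Column name(s): let pandas build the groups.
    for name, group_df in data.groupby(groupby):
        yield name, group_df


reports = pd.DataFrame({"id": ["A", "B", "A"], "sst": [10.0, 11.0, 12.0]})

for name, group in group_iterator(reports, "id"):
    print(name, len(group))
# A 2
# B 1
```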
- marine_qc.multiple_checks._run_qc_engine(data, qc_inputs, groups, return_method)[source]
Execute QC checks on the provided data groups and collect the results.
Each QC function is applied to the corresponding group, respecting a shared mask that propagates pass/fail status. The results are stored in a DataFrame aligned with the original data.
- Parameters:
data (pd.Series or pd.DataFrame) – Hashable input data.
qc_inputs (Mapping) – Dictionary of QC inputs, each containing: {“function”: callable, “requests”: dict, “kwargs”: dict}.
groups (iterable) – Iterable of (group_name, group_df) pairs, as returned by _group_iterator.
return_method ({"all", "passed", "failed"}, default: "all") – If “all”, return QC dictionary containing all requested QC check flags. If “passed”, return QC dictionary containing all requested QC check flags until the first check passes; other QC checks are flagged as untested (3). If “failed”, return QC dictionary containing all requested QC check flags until the first check fails; other QC checks are flagged as untested (3).
- Return type:
- Returns:
pd.DataFrame – DataFrame of QC results with the same index as data and columns corresponding to QC names.
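The effect of return_method can be shown for a single value in plain Python. This is an illustration only: it assumes that 0 means "passed" and any other flag means "not passed", and the helper and check names are hypothetical. The real engine works on whole DataFrames with a shared mask.

```python
UNTESTED = 3
PASSED = 0  # assumption: flag 0 means the check passed


def run_sequential(checks, value, return_method="all"):
    """Run checks in order; stop early depending on ``return_method``."""
    results = {}
    stopped = False
    for name, func in checks.items():
        if stopped:
            # Checks after the stopping point are flagged as untested.
            results[name] = UNTESTED
            continue
        flag = func(value)
        results[name] = flag
        if return_method == "passed" and flag == PASSED:
            stopped = True
        elif return_method == "failed" and flag != PASSED:
            stopped = True
    return results


checks = {
    "positive": lambda v: 0 if v > 0 else 1,
    "small": lambda v: 0 if v < 10 else 1,
    "even": lambda v: 0 if v % 2 == 0 else 1,
}

print(run_sequential(checks, 12, "all"))     # {'positive': 0, 'small': 1, 'even': 0}
print(run_sequential(checks, 12, "failed"))  # {'positive': 0, 'small': 1, 'even': 3}
print(run_sequential(checks, 12, "passed"))  # {'positive': 0, 'small': 3, 'even': 3}
```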
- marine_qc.multiple_checks._do_multiple_check(data, groupby=None, qc_dict=None, preproc_dict=None, return_method='all')[source]
Internal entry point for performing QC checks on data.
Prepares inputs, constructs groups, and executes the QC engine for individual, sequential, or grouped checks.
- Parameters:
data (pd.Series or pd.DataFrame) – Hashable input data.
groupby (str, iterable of str, or pandas GroupBy, optional) – Specifies how the data should be grouped before applying QC functions. If a string or iterable of strings, data.groupby is called on those keys. If a pandas.DataFrameGroupBy object is provided, its groups are used directly. Any groups that contain indices not present in data are automatically trimmed. If None, the entire input data is treated as a single group.
qc_dict (Mapping, optional) – Nested QC dictionary. Keys represent arbitrary user-specified names for the checks. The values are dictionaries which contain the keys “func” (name of the QC function), “names” (input data names as keyword arguments, that will be retrieved from data) and, if necessary, “arguments” (the corresponding keyword arguments). For more information see Examples.
preproc_dict (Mapping, optional) – Nested pre-processing dictionary. Keys represent variable names that can be used by qc_dict. The values are dictionaries which contain the keys “func” (name of the pre-processing function), “names” (input data names as keyword arguments, that will be retrieved from data), and “inputs” (list of input-given variables). For more information see Examples.
return_method ({"all", "passed", "failed"}, default: "all") – If “all”, return QC dictionary containing all requested QC check flags. If “passed”, return QC dictionary containing all requested QC check flags until the first check passes; other QC checks are flagged as untested (3). If “failed”, return QC dictionary containing all requested QC check flags until the first check fails; other QC checks are flagged as untested (3).
- Return type:
- Returns:
pd.DataFrame or pd.Series – A DataFrame (or Series if the input was a Series) whose columns correspond to the QC names in qc_dict and whose values contain QC flags for each row. Flags depend on the QC functions used.
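The nested layout of qc_dict described above could look like the following. Everything in it is illustrative: the column names, the parameter names, and `example_range_check` are assumptions, and `missing_keys` is a hypothetical helper that only checks for the “func” and “names” keys.

```python
# All function, parameter, and column names below are illustrative assumptions.
qc_dict = {
    "position": {
        "func": "do_position_check",
        "names": {"lat": "latitude", "lon": "longitude"},
    },
    "sst_range": {
        "func": "example_range_check",  # hypothetical function name
        "names": {"value": "sst"},
        "arguments": {"lower": -5.0, "upper": 45.0},
    },
}

REQUIRED_KEYS = {"func", "names"}


def missing_keys(config):
    """Return, per check name, the required keys that are absent."""
    problems = {}
    for name, entry in config.items():
        missing = sorted(REQUIRED_KEYS - set(entry))
        if missing:
            problems[name] = missing
    return problems


print(missing_keys(qc_dict))  # {}
```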
Helpers in external climatology module
- marine_qc.external_clim._select_point(i, da_slice, lat_arr, lon_arr, lat_axis, lon_axis)[source]
Select nearest grid point value for a single lat/lon pair.
- Parameters:
i (int) – Index of the latitude/longitude pair.
da_slice (xr.DataArray) – DataArray slice to sample from.
lat_arr (SequenceNumberType) – Array of latitude values.
lon_arr (SequenceNumberType) – Array of longitude values.
lat_axis (str) – Name of the latitude dimension in da_slice.
lon_axis (str) – Name of the longitude dimension in da_slice.
- Return type:
- Returns:
tuple of (int, float) – Index i and the selected grid-point value.
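Nearest-grid-point selection can be illustrated with plain NumPy arrays standing in for the DataArray. The signature below is therefore different from the helper's and is an assumption made for the example; longitude wrap-around is not handled.

```python
import numpy as np


def select_point(i, grid, grid_lats, grid_lons, lat_arr, lon_arr):
    """Return ``(i, value)`` for the grid cell nearest to point ``i``."""
    # Index of the grid coordinate closest to the requested latitude/longitude.
    lat_idx = int(np.abs(grid_lats - lat_arr[i]).argmin())
    lon_idx = int(np.abs(grid_lons - lon_arr[i]).argmin())
    return i, float(grid[lat_idx, lon_idx])


grid_lats = np.array([-10.0, 0.0, 10.0])
grid_lons = np.array([0.0, 10.0, 20.0, 30.0])
grid = np.arange(12.0).reshape(3, 4)  # values 0..11, rows = lat, columns = lon

print(select_point(1, grid, grid_lats, grid_lons, lat_arr=[9.0, -2.0], lon_arr=[1.0, 26.0]))
# (1, 7.0)
```

Returning the index together with the value lets results be matched back to their input position even when points are processed out of order.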
Helpers in spherical geometry module
- marine_qc.spherical_geometry._geod_inv(lon1, lat1, lon2, lat2)[source]
Compute forward azimuth, back azimuth, and distance between two points using an ellipsoidal model.
- Parameters:
lon1 (SequenceNumberType) – Longitude of the first point in degrees.
lat1 (SequenceNumberType) – Latitude of the first point in degrees.
lon2 (SequenceNumberType) – Longitude of the second point in degrees.
lat2 (SequenceNumberType) – Latitude of the second point in degrees.
- Return type:
- Returns:
tuple of (np.ndarray, np.ndarray, np.ndarray) – A tuple containing:
- Forward azimuth(s) from point 1 to point 2 in degrees.
- Back azimuth(s) from point 2 to point 1 in degrees.
- Distance(s) between the points in meters.
The outputs have the same shape as the broadcasted inputs.
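For a rough cross-check of such results, the same quantities can be approximated on a sphere. Note that this is a different, simpler technique than the ellipsoidal model used by the helper: the sketch below uses the haversine formula with an assumed mean Earth radius, handles scalars only, and returns the forward azimuth normalized to [0, 360) degrees. The function name is hypothetical.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius; spherical assumption


def spherical_inverse(lon1, lat1, lon2, lat2):
    """Return (forward azimuth in degrees, distance in meters) on a sphere."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon2 - lon1)
    dphi = phi2 - phi1
    # Haversine formula for the central angle between the two points.
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    distance = 2 * EARTH_RADIUS_M * math.atan2(math.sqrt(a), math.sqrt(1 - a))
    # Initial bearing from point 1 towards point 2.
    azimuth = math.degrees(
        math.atan2(
            math.sin(dlam) * math.cos(phi2),
            math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlam),
        )
    )
    return azimuth % 360.0, distance


# A quarter of the equator: due east, one quarter of the circumference.
azimuth, distance = spherical_inverse(0.0, 0.0, 90.0, 0.0)
print(round(azimuth, 6), round(distance / 1000.0, 1))
```

Spherical and ellipsoidal distances generally differ by a fraction of a percent, so this is suitable for sanity checks only.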
Helpers in statistical functions module
Helpers in plotting module
- marine_qc.plot_qc_outcomes._get_colours_labels(qc_outcomes)[source]
Get colour labels.
- marine_qc.plot_qc_outcomes._make_plot(xvalue, yvalue, flags, xlim, ylim, xlabel, ylabel, filename)[source]
Make plot.
- Parameters:
xvalue (np.ndarray) – Array of x values.
yvalue (np.ndarray) – Array of y values.
flags (np.ndarray) – Array containing the QC outcomes, with 0 meaning pass and non-zero entries indicating failure.
xlim (list of float or None) – If not None: set xlim for plotting.
ylim (list of float or None) – If not None: set ylim for plotting.
xlabel (str) – Name of the x axis.
ylabel (str) – Name of the y axis.
filename (str or None) – Filename to save the figure to. If None, the figure is saved with a standard name.
- Return type:
Figure
- Returns:
Figure – The main figure object created by plt.subplots().
Static methods of buoy tracking QC classes
- marine_qc.buoy_tracking_qc.SSTTailChecker._parse_rep(lat, lon, ostia, ice, bgvar, dates)
Process a report.
- marine_qc.buoy_tracking_qc.SSTTailChecker._preprocess_reps(self)
Process the reps and calculate the values used in the QC check.
- marine_qc.buoy_tracking_qc.SSTTailChecker._do_long_tail_check(self, forward=True)
Perform the long tail check.
- marine_qc.buoy_tracking_qc.SSTTailChecker._do_short_tail_check(self, first_pass_ind, last_pass_ind, forward=True)
Perform the short tail check.
- marine_qc.buoy_tracking_qc.SSTBiasedNoisyChecker._parse_rep(lat, lon, ostia, ice, bgvar, dates, background_err_lim)
Extract QC-relevant variables from a marine report.
- Parameters:
lat (float) – Latitude of the observation to be parsed.
lon (float) – Longitude of the observation to be parsed.
ostia (float) – Background SST field value.
ice (float) – Ice concentration field value.
bgvar (float) – Background variance field value.
dates (datetime) – Date and time of the observation to be parsed.
background_err_lim (float) – Background error variance beyond which the SST background is deemed unreliable (degC squared or K squared).
- Return type:
- Returns:
float, float, float, bool, bool, bool – Returns the background SST value, the ice value, the background SST variance, a flag that indicates a good match, a flag that indicates whether the background variance is valid, and a flag that indicates whether the observation is valid overall.
- marine_qc.buoy_tracking_qc.SSTBiasedNoisyChecker._preprocess_reps(self)
Fill SST anomalies, background errors, and flags for invalid background values.
This method processes each observation to compute sea surface temperature (SST) anomalies, background error standard deviations, and flags any missing or invalid background values. It also checks whether the time series is sorted and sets a mask flag if any background variances are masked.
- marine_qc.buoy_tracking_qc.SSTBiasedNoisyChecker._long_record_qc(self)
Perform the long record check.
- Return type:
- marine_qc.buoy_tracking_qc.SSTBiasedNoisyChecker._short_record_qc(self)
Perform the short record check.
- Return type:
Internal data type aliases
- marine_qc.PandasNAType = <class 'pandas.api.typing.NAType'>¶
NA (“not available”) missing value indicator.
Warning
Experimental: the behaviour of NA can still change without warning.
The NA singleton is a missing value indicator defined by pandas. It is used in certain new extension dtypes (currently the “string” dtype).
See also
numpy.nan – Floating point representation of Not a Number (NaN) for numerical data.
isna – Detect missing values for an array-like object.
notna – Detect non-missing values for an array-like object.
DataFrame.fillna – Fill missing values in a DataFrame.
Series.fillna – Fill missing values in a Series.
Examples
>>> pd.NA
<NA>
>>> True | pd.NA
True
>>> True & pd.NA
<NA>
>>> pd.NA != pd.NA
<NA>
>>> pd.NA == pd.NA
<NA>
- marine_qc.PandasNaTType = <class 'pandas.api.typing.NaTType'>¶
(N)ot-(A)-(T)ime, the time equivalent of NaN.
NaT is used to denote missing or null values in datetime and timedelta objects in pandas. It functions similarly to how NaN is used for numerical data. Operations with NaT will generally propagate the NaT value, similar to NaN. NaT can be used in pandas data structures like Series and DataFrame to represent missing datetime values. It is useful in data analysis and time series analysis when working with incomplete or sparse time-based data. Pandas provides robust handling of NaT to ensure consistency and reliability in computations involving datetime objects.
See also
NA – NA (“not available”) missing value indicator.
isna – Detect missing values (NaN or NaT) in an array-like object.
notna – Detect non-missing values.
numpy.nan – Floating point representation of Not a Number (NaN) for numerical data.
Examples
>>> pd.DataFrame([pd.Timestamp("2023"), np.nan], columns=["col_1"])
       col_1
0 2023-01-01
1        NaT
- marine_qc.ScalarIntType = int | numpy.integer | pandas.api.typing.NAType | None¶
Represent a union type
E.g. for int | str
- marine_qc.ScalarFloatType = float | numpy.floating | pandas.api.typing.NAType | None¶
- marine_qc.ScalarNumberType = int | numpy.integer | pandas.api.typing.NAType | None | float | numpy.floating¶
- marine_qc.ScalarDatetimeType = datetime.datetime | numpy.datetime64 | pandas.Timestamp | pandas.api.typing.NaTType | None¶
- marine_qc.SequenceIntType = collections.abc.Sequence[int | numpy.integer | pandas.api.typing.NAType | None] | numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.integer]] | pandas.Series | numpy.ndarray¶
- marine_qc.SequenceFloatType = collections.abc.Sequence[float | numpy.floating | pandas.api.typing.NAType | None] | numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.floating]] | pandas.Series | numpy.ndarray¶
- marine_qc.SequenceNumberType = collections.abc.Sequence[int | numpy.integer | pandas.api.typing.NAType | None] | numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.integer]] | pandas.Series | numpy.ndarray | collections.abc.Sequence[float | numpy.floating | pandas.api.typing.NAType | None] | numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.floating]]¶
- marine_qc.SequenceDatetimeType = collections.abc.Sequence[datetime.datetime | numpy.datetime64 | pandas.Timestamp | pandas.api.typing.NaTType | None] | numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.datetime64]] | pandas.Series | numpy.ndarray¶
- marine_qc.ValueIntType = int | numpy.integer | pandas.api.typing.NAType | None | collections.abc.Sequence[int | numpy.integer | pandas.api.typing.NAType | None] | numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.integer]] | pandas.Series | numpy.ndarray¶
- marine_qc.ValueFloatType = float | numpy.floating | pandas.api.typing.NAType | None | collections.abc.Sequence[float | numpy.floating | pandas.api.typing.NAType | None] | numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.floating]] | pandas.Series | numpy.ndarray¶
- marine_qc.ValueNumberType = int | numpy.integer | pandas.api.typing.NAType | None | collections.abc.Sequence[int | numpy.integer | pandas.api.typing.NAType | None] | numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.integer]] | pandas.Series | numpy.ndarray | float | numpy.floating | collections.abc.Sequence[float | numpy.floating | pandas.api.typing.NAType | None] | numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.floating]]¶
- marine_qc.ValueDatetimeType = datetime.datetime | numpy.datetime64 | pandas.Timestamp | pandas.api.typing.NaTType | None | collections.abc.Sequence[datetime.datetime | numpy.datetime64 | pandas.Timestamp | pandas.api.typing.NaTType | None] | numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.datetime64]] | pandas.Series | numpy.ndarray¶
- marine_qc.ClimArgType = int | numpy.integer | pandas.api.typing.NAType | None | collections.abc.Sequence[int | numpy.integer | pandas.api.typing.NAType | None] | numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.integer]] | pandas.Series | numpy.ndarray | float | numpy.floating | collections.abc.Sequence[float | numpy.floating | pandas.api.typing.NAType | None] | numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.floating]] | marine_qc.external_clim.Climatology | str | os.PathLike[str] | xarray.core.dataarray.DataArray | xarray.core.dataset.Dataset¶
- marine_qc.ClimIntType = int | numpy.integer | pandas.api.typing.NAType | None | collections.abc.Sequence[int | numpy.integer | pandas.api.typing.NAType | None] | numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.integer]] | pandas.Series | numpy.ndarray | marine_qc.external_clim.Climatology¶
- marine_qc.ClimFloatType = float | numpy.floating | pandas.api.typing.NAType | None | collections.abc.Sequence[float | numpy.floating | pandas.api.typing.NAType | None] | numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.floating]] | pandas.Series | numpy.ndarray | marine_qc.external_clim.Climatology¶
- marine_qc.ClimNumberType = int | numpy.integer | pandas.api.typing.NAType | None | collections.abc.Sequence[int | numpy.integer | pandas.api.typing.NAType | None] | numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.integer]] | pandas.Series | numpy.ndarray | float | numpy.floating | collections.abc.Sequence[float | numpy.floating | pandas.api.typing.NAType | None] | numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.floating]] | marine_qc.external_clim.Climatology¶
- marine_qc.ClimInputType = str | os.PathLike[str] | xarray.core.dataarray.DataArray | xarray.core.dataset.Dataset¶