Dataset

KOHDataset(field_dataset, sim_dataset) dataclass

A class that holds the simulation and observation data for the Kennedy and O'Hagan (KOH) calibration framework.

Parameters:

  • field_dataset (Dataset) –

    The observation data of size \(M \times P\).

  • sim_dataset (Dataset) –

    The simulation data of size \(N \times (P+Q)\). The first \(P\) columns are the input data, and the last \(Q\) columns are the calibration parameters.
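
As a rough illustration of the layout described above, the following sketch builds small input arrays with the stated shapes and recovers \(P\) and \(Q\) from the shapes alone. It uses NumPy in place of `jax.numpy` so that it runs without JAX. The sizes and the names `X_field`, `X_sim`, `sim_inputs` and `sim_calib` are made up for the example and are not part of the library.

```python
import numpy as np

# Illustrative sizes only (not taken from the library).
M, N, P, Q = 4, 6, 2, 3

rng = np.random.default_rng(0)
X_field = rng.normal(size=(M, P))    # field inputs: M x P
X_sim = rng.normal(size=(N, P + Q))  # simulation inputs: N x (P + Q)

# The first P columns of the simulation inputs are the input data,
# and the last Q columns are the calibration parameters.
sim_inputs = X_sim[:, :P]
sim_calib = X_sim[:, P:]

# P and Q can be recovered from the two shapes alone.
num_variable_params = X_field.shape[1]                   # P
num_calib_params = X_sim.shape[1] - num_variable_params  # Q
```

The field inputs carry no calibration columns because the calibration parameters are unknown for field data. They are supplied later as `theta`.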

Xc property

Returns the input data for the simulation output.

Xf property

Returns the input data for the field observations.

d property

Returns the field observations stacked above the simulation output. Note that this ordering is the reverse of the one used in the KOH paper. The choice is arbitrary; whether it should be changed, and whether it has any numerical implications, is an open question.
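
A minimal sketch of this stacking order follows. It assumes that `z` and `y` are column vectors and that `d` is built with a vertical stack. The implementation of `d` is not shown on this page, so both are assumptions. NumPy stands in for `jax.numpy`, and the values are invented.

```python
import numpy as np

# Assumed column-vector layout; values are invented for illustration.
z = np.array([[1.0], [2.0]])            # field observations, M = 2
y = np.array([[10.0], [20.0], [30.0]])  # simulation output, N = 3

# Field rows first, simulation rows second.
d = np.vstack((z, y))

num_field = z.shape[0]
field_part = d[:num_field]  # rows belonging to the field observations
sim_part = d[num_field:]    # rows belonging to the simulation output
```

Whatever order is chosen, it has to match the row order of `X(theta)`, which stacks the field inputs above the simulation inputs.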

n property

Number of observations.

y property

Returns the simulation output.

z property

Returns the field observations.

X(theta)

Returns the input data for the field observations and simulation output.

Source code in .tox/docs/lib/python3.12/site-packages/kohgpjax/dataset.py
def X(self, theta):
    r"""Returns the input data for the field observations and simulation output."""
    return jnp.vstack((self.Xf_theta(theta), self.Xc))

Xf_theta(theta)

Returns the input data for the field observations and calibration parameters.

Source code in .tox/docs/lib/python3.12/site-packages/kohgpjax/dataset.py
def Xf_theta(self, theta):
    r"""Returns the input data for the field observations and calibration parameters."""
    _check_theta_shape(theta, self.num_calib_params)
    theta = theta.reshape(1, -1)
    theta = jnp.repeat(theta, self.num_field_obs, axis=0)
    return jnp.hstack((self.Xf, theta))
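
The reshape, repeat and stack steps above can be traced on a small example. This is a sketch using NumPy in place of `jax.numpy`. The arrays are invented, and a one-dimensional `theta` is assumed because the shapes accepted by `_check_theta_shape` are not shown on this page.

```python
import numpy as np

# Invented example data: M = 3 field inputs with P = 2 columns.
Xf = np.array([[0.1, 0.2],
               [0.3, 0.4],
               [0.5, 0.6]])
theta = np.array([7.0, 8.0])  # Q = 2 calibration parameters (assumed 1-D)

theta_row = theta.reshape(1, -1)                        # shape (1, Q)
theta_rep = np.repeat(theta_row, Xf.shape[0], axis=0)   # shape (M, Q)
Xf_theta = np.hstack((Xf, theta_rep))                   # shape (M, P + Q)
```

Every field row receives the same `theta`, so the augmented field inputs have the same number of columns as the simulation inputs and the two can be stacked.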

__post_init__()

Checks that the shapes of \(z\), \(y\), \(X_f\) and \(X_s\) are compatible

Source code in .tox/docs/lib/python3.12/site-packages/kohgpjax/dataset.py
def __post_init__(self) -> None:
    r"""Checks that the shapes of $z$, $y$, $X_f$ and $X_s$ are compatible"""
    _check_shapes(self.sim_dataset, self.field_dataset)

    self.num_sim_obs = self.sim_dataset.y.shape[0]  # N
    self.num_field_obs = self.field_dataset.y.shape[0]  # M
    self.num_variable_params = self.field_dataset.X.shape[1]  # P
    self.num_calib_params = (
        self.sim_dataset.X.shape[1] - self.num_variable_params
    )  # Q

__repr__()

Returns a string representation of the KOHDataset instance.

Source code in .tox/docs/lib/python3.12/site-packages/kohgpjax/dataset.py
def __repr__(self) -> str:
    r"""Returns a string representation of the KOHDataset instance."""
    repr = (
        f"KOHDataset(\n"
        f"  Datasets:\n"
        f"    Field data = {self.field_dataset},\n"
        f"    Simulation data = {self.sim_dataset}\n"
        f"  Attributes:\n"
        f"    No. field observations = {self.num_field_obs},\n"
        f"    No. simulation outputs = {self.num_sim_obs},\n"
        f"    No. variable params = {self.num_variable_params},\n"
        f"    No. calibration params = {self.num_calib_params},\n"
        f")"
    )
    return repr

get_dataset(theta)

Returns the dataset with the field observations and the simulator observations concatenated with the given theta.

Source code in .tox/docs/lib/python3.12/site-packages/kohgpjax/dataset.py
def get_dataset(self, theta) -> Dataset:
    r"""Returns the dataset with the field observations and the simulator observations
    concatenated with the given theta."""
    return Dataset(X=self.X(theta), y=self.d)
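
To show how the pieces fit together, here is a sketch of the assembly as plain arrays. The function `assemble` is a hypothetical stand-in, not part of the library: it returns an `(X, d)` tuple instead of a `Dataset`, uses NumPy in place of `jax.numpy`, and assumes that `d` is a vertical stack of `z` above `y`.

```python
import numpy as np


def assemble(Xf, z, Xc, y, theta):
    """Hypothetical stand-in for get_dataset that returns (X, d) as arrays."""
    # Repeat theta once per field row, then append it to the field inputs.
    theta_rep = np.repeat(np.reshape(theta, (1, -1)), Xf.shape[0], axis=0)
    Xf_theta = np.hstack((Xf, theta_rep))
    # Field rows go first in both X and d so that the rows stay aligned.
    X = np.vstack((Xf_theta, Xc))
    d = np.vstack((z, y))  # assumed implementation of the d property
    return X, d


# Invented data: M = 2, N = 3, P = 1, Q = 1.
Xf = np.zeros((2, 1))
z = np.ones((2, 1))
Xc = np.full((3, 2), 0.5)
y = np.full((3, 1), 2.0)

X, d = assemble(Xf, z, Xc, y, np.array([0.25]))
```

Only the field block of `X` depends on `theta`. The simulation block `Xc` already contains the calibration values that were used for each simulator run.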