simcats_datasets.loading.load_ground_truth

Functions for providing ground truth data to be used with the PyTorch Dataset class.

For examples of the different ground truth types, please have a look at the notebook Examples_Pytorch_SimcatsDataset.

Every function must accept an h5 file or the path to a simcats_dataset as input, provide an option to load only specific_ids, and allow disabling the progress_bar. The output type depends on the ground truth type; it could, for example, be a pixel mask or the start and end points of lines. See load_zeros_masks for a reference.
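The contract described above can be sketched as follows. This is only an illustration under stated assumptions, not the library's implementation: load_my_masks and the key "my_masks" are hypothetical, a plain dict stands in for the opened h5 file so that the block runs without h5py, and the progress output is a simple counter on stderr.

```python
import sys
from typing import List, Mapping, Optional, Sequence


def load_my_masks(
    file: Mapping,
    specific_ids: Optional[Sequence[int]] = None,
    progress_bar: bool = True,
) -> List:
    """Hypothetical loader that follows the documented loader contract."""
    stored = file["my_masks"]  # hypothetical key, not part of the real dataset layout
    # None means "load everything"; otherwise keep the caller's order of ids.
    ids = range(len(stored)) if specific_ids is None else specific_ids
    masks = []
    for count, index in enumerate(ids, start=1):
        masks.append(stored[index])
        if progress_bar:
            print(f"\rloaded {count}/{len(ids)}", end="", file=sys.stderr)
    if progress_bar:
        print(file=sys.stderr)
    return masks


# A dict standing in for an opened dataset file (assumption for this sketch).
fake_file = {"my_masks": ["m0", "m1", "m2", "m3"]}
all_masks = load_my_masks(fake_file, progress_bar=False)
some_masks = load_my_masks(fake_file, specific_ids=[2, 0], progress_bar=False)
```

Note that some_masks follows the order of the requested ids (first id 2, then id 0), which mirrors the documented behavior of specific_ids.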

@author: f.hader

Module Contents

Functions

load_zeros_masks

Load no/empty ground truth data (arrays with only zeros).

load_tct_masks

Load Total Charge Transition (TCT) masks as ground truth data.

load_tct_by_dot_masks

Load Total Charge Transition (TCT) masks with transitions labeled by affected dot as ground truth data.

load_idt_masks

Load Inter-Dot Transition (IDT) masks as ground truth data.

load_ct_masks

Load Charge Transition (CT) masks as ground truth data.

load_ct_by_dot_masks

Load Charge Transition (CT) masks with transitions labeled by affected dot as ground truth data.

load_tc_region_masks

Load Total Charge (TC) region masks as ground truth data.

load_tc_region_minus_tct_masks

Load Total Charge (TC) region minus Total Charge Transition (TCT) masks as ground truth data (TCTs are basically excluded from the regions).

load_c_region_masks

Load Charge (C) region masks as ground truth data (CTs are basically excluded from the TC regions).

Module Implementation Details

simcats_datasets.loading.load_ground_truth.load_zeros_masks(file, specific_ids=None, progress_bar=True)

Load empty ground truth data (arrays containing only zeros). Used for loading datasets without ground truth. This is helpful, for example, to load unlabeled experimental datasets with the PyTorch SimcatsDataset class, so that training results can be analyzed with the same interface as for simulated data.

Parameters:
  • file (Union[str, h5py.File]) – The file to read the data from. Can either be an object of the type h5py.File or the path to the dataset. If you want to do multiple consecutive loads from the same file (e.g. for using the PyTorch SimcatsDataset without preloading), consider initializing the file object yourself and passing it, to improve performance.

  • specific_ids (Union[range, List[int], numpy.ndarray, None]) – Determines if only specific ids should be loaded. Using this option, the returned values are sorted according to the specified ids and not necessarily ascending. If set to None, all data is loaded. Default is None.

  • progress_bar (bool) – Determines whether to display a progress bar. Default is True.

Returns:

List of arrays containing only zeros as ground truth data

Return type:

List[numpy.ndarray]
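As a rough picture of what "arrays with only zeros" means, the sketch below builds one zero array per measurement. The helper name make_zero_masks is hypothetical, and both the choice of matching each mask to the shape of its measurement and the uint8 dtype are assumptions; the actual loader may determine shape and dtype differently.

```python
import numpy as np


def make_zero_masks(measurements):
    """Return one all-zero array per measurement (hypothetical helper)."""
    # Assumption: each empty mask has the same shape as its measurement.
    return [np.zeros_like(m, dtype=np.uint8) for m in measurements]


measurements = [np.ones((2, 3)), np.ones((4, 4))]
zero_masks = make_zero_masks(measurements)
```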

simcats_datasets.loading.load_ground_truth.load_tct_masks(file, specific_ids=None, progress_bar=True)

Load Total Charge Transition (TCT) masks as ground truth data.

Parameters:
  • file (Union[str, h5py.File]) – The file to read the data from. Can either be an object of the type h5py.File or the path to the dataset. If you want to do multiple consecutive loads from the same file (e.g. for using the PyTorch SimcatsDataset without preloading), consider initializing the file object yourself and passing it, to improve performance.

  • specific_ids (Union[range, List[int], numpy.ndarray, None]) – Determines if only specific ids should be loaded. Using this option, the returned values are sorted according to the specified ids and not necessarily ascending. If set to None, all data is loaded. Default is None.

  • progress_bar (bool) – Determines whether to display a progress bar. Default is True.

Returns:

Total Charge Transition (TCT) masks

Return type:

List[numpy.ndarray]
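The advice on reusing a file object for consecutive loads could look like the sketch below. It relies only on the signatures documented on this page and on h5py.File; the wrapper name load_transition_masks is hypothetical, and it assumes the returned masks are in-memory arrays that stay valid after the file is closed.

```python
def load_transition_masks(path, ids=None):
    """Open the dataset once and run several loaders on the same handle."""
    # Imported lazily so that defining this sketch does not require the packages.
    import h5py
    from simcats_datasets.loading.load_ground_truth import (
        load_idt_masks,
        load_tct_masks,
    )

    # One open handle is shared by both loaders instead of passing the path twice.
    with h5py.File(path, "r") as h5_file:
        tct = load_tct_masks(h5_file, specific_ids=ids, progress_bar=False)
        idt = load_idt_masks(h5_file, specific_ids=ids, progress_bar=False)
    return tct, idt
```

Passing the path to each loader would also work according to the parameter description; the shared handle only avoids reopening the file for every call.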

simcats_datasets.loading.load_ground_truth.load_tct_by_dot_masks(file, specific_ids=None, progress_bar=True, lut_entries=1000)

Load Total Charge Transition (TCT) masks with transitions labeled by affected dot as ground truth data.

Parameters:
  • file (Union[str, h5py.File]) – The file to read the data from. Can either be an object of the type h5py.File or the path to the dataset. If you want to do multiple consecutive loads from the same file (e.g. for using the PyTorch SimcatsDataset without preloading), consider initializing the file object yourself and passing it, to improve performance.

  • specific_ids (Union[range, List[int], numpy.ndarray, None]) – Determines if only specific ids should be loaded. Using this option, the returned values are sorted according to the specified ids and not necessarily ascending. If set to None, all data is loaded. Default is None.

  • progress_bar (bool) – Determines whether to display a progress bar. Default is True.

  • lut_entries (int) – Number of lookup-table entries to use for tct_bezier. Default is 1000.

Returns:

Total Charge Transition (TCT) masks with transitions labeled by affected dot

Return type:

List[numpy.ndarray]

simcats_datasets.loading.load_ground_truth.load_idt_masks(file, specific_ids=None, progress_bar=True)

Load Inter-Dot Transition (IDT) masks as ground truth data. In comparison to the Total Charge Transition (TCT) masks, only inter-dot transitions are included.

Parameters:
  • file (Union[str, h5py.File]) – The file to read the data from. Can either be an object of the type h5py.File or the path to the dataset. If you want to do multiple consecutive loads from the same file (e.g. for using the PyTorch SimcatsDataset without preloading), consider initializing the file object yourself and passing it, to improve performance.

  • specific_ids (Union[range, List[int], numpy.ndarray, None]) – Determines if only specific ids should be loaded. Using this option, the returned values are sorted according to the specified ids and not necessarily ascending. If set to None, all data is loaded. Default is None.

  • progress_bar (bool) – Determines whether to display a progress bar. Default is True.

Returns:

Inter-Dot Transition (IDT) masks

Return type:

List[numpy.ndarray]

simcats_datasets.loading.load_ground_truth.load_ct_masks(file, specific_ids=None, progress_bar=True)

Load Charge Transition (CT) masks as ground truth data. In comparison to the Total Charge Transition (TCT) masks, the inter-dot transitions are also included.

Parameters:
  • file (Union[str, h5py.File]) – The file to read the data from. Can either be an object of the type h5py.File or the path to the dataset. If you want to do multiple consecutive loads from the same file (e.g. for using the PyTorch SimcatsDataset without preloading), consider initializing the file object yourself and passing it, to improve performance.

  • specific_ids (Union[range, List[int], numpy.ndarray, None]) – Determines if only specific ids should be loaded. Using this option, the returned values are sorted according to the specified ids and not necessarily ascending. If set to None, all data is loaded. Default is None.

  • progress_bar (bool) – Determines whether to display a progress bar. Default is True.

Returns:

Charge Transition (CT) masks

Return type:

List[numpy.ndarray]
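The relationship between the three transition mask types can be pictured with toy arrays. The sketch assumes binary masks in which 0 marks background and 1 marks a transition pixel; the real masks may use a different encoding, so this only illustrates the statement that CT masks contain the inter-dot transitions in addition to the total charge transitions.

```python
import numpy as np

# Toy masks (assumed binary encoding: 0 = background, 1 = transition pixel).
tct_mask = np.array([[0, 1, 0],
                     [0, 1, 0],
                     [0, 0, 0]], dtype=np.uint8)
idt_mask = np.array([[0, 0, 0],
                     [0, 0, 1],
                     [0, 0, 0]], dtype=np.uint8)

# A CT-like mask marks every pixel that belongs to either kind of transition.
ct_mask = np.logical_or(tct_mask, idt_mask).astype(np.uint8)
```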

simcats_datasets.loading.load_ground_truth.load_ct_by_dot_masks(file, specific_ids=None, progress_bar=True, lut_entries=1000, try_directly_loading_from_file=True)

Load Charge Transition (CT) masks with transitions labeled by affected dot as ground truth data. In comparison to the Total Charge Transition (TCT) masks, the inter-dot transitions are also included.

Parameters:
  • file (Union[str, h5py.File]) – The file to read the data from. Can either be an object of the type h5py.File or the path to the dataset. If you want to do multiple consecutive loads from the same file (e.g. for using the PyTorch SimcatsDataset without preloading), consider initializing the file object yourself and passing it, to improve performance.

  • specific_ids (Union[range, List[int], numpy.ndarray, None]) – Determines if only specific ids should be loaded. Using this option, the returned values are sorted according to the specified ids and not necessarily ascending. If set to None, all data is loaded. Default is None.

  • progress_bar (bool) – Determines whether to display a progress bar. Default is True.

  • lut_entries (int) – Number of lookup-table entries to use for tct_bezier. Default is 1000.

  • try_directly_loading_from_file (bool) – Specifies if the loader should try to find the masks in the h5 file before falling back to calculating them (not all datasets include these masks). Default is True.

Returns:

Charge Transition (CT) masks with transitions labeled by affected dot

Return type:

List[numpy.ndarray]
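The behavior of try_directly_loading_from_file amounts to a "use stored data if present, otherwise compute" pattern. The following sketch shows that pattern in isolation; load_or_compute, the key "ct_by_dot_masks", and the recompute callback are all hypothetical, and plain dicts stand in for the h5 file.

```python
def load_or_compute(file, key, compute, try_directly_loading_from_file=True):
    """Return file[key] if allowed and present, else fall back to compute(file)."""
    if try_directly_loading_from_file and key in file:
        return file[key]
    # Not every dataset stores the masks, so they are calculated as a fallback.
    return compute(file)


def recompute(file):
    """Stand-in for the (possibly expensive) mask calculation."""
    return ["computed"]


with_masks = {"ct_by_dot_masks": ["stored"]}  # hypothetical key
without_masks = {}

from_file = load_or_compute(with_masks, "ct_by_dot_masks", recompute)
fallback = load_or_compute(without_masks, "ct_by_dot_masks", recompute)
forced = load_or_compute(with_masks, "ct_by_dot_masks", recompute,
                         try_directly_loading_from_file=False)
```

Setting the flag to False forces the calculation even when stored masks exist, which can be useful for comparing stored and freshly calculated masks.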

simcats_datasets.loading.load_ground_truth.load_tc_region_masks(file, specific_ids=None, progress_bar=True)

Load Total Charge (TC) region masks as ground truth data.

Parameters:
  • file (Union[str, h5py.File]) – The file to read the data from. Can either be an object of the type h5py.File or the path to the dataset. If you want to do multiple consecutive loads from the same file (e.g. for using the PyTorch SimcatsDataset without preloading), consider initializing the file object yourself and passing it, to improve performance.

  • specific_ids (Union[range, List[int], numpy.ndarray, None]) – Determines if only specific ids should be loaded. Using this option, the returned values are sorted according to the specified ids and not necessarily ascending. If set to None, all data is loaded. Default is None.

  • progress_bar (bool) – Determines whether to display a progress bar. Default is True.

Returns:

Total Charge (TC) region masks

Return type:

List[numpy.ndarray]

simcats_datasets.loading.load_ground_truth.load_tc_region_minus_tct_masks(file, specific_ids=None, progress_bar=True)

Load Total Charge (TC) region minus Total Charge Transition (TCT) masks as ground truth data (TCTs are basically excluded from the regions).

Parameters:
  • file (Union[str, h5py.File]) – The file to read the data from. Can either be an object of the type h5py.File or the path to the dataset. If you want to do multiple consecutive loads from the same file (e.g. for using the PyTorch SimcatsDataset without preloading), consider initializing the file object yourself and passing it, to improve performance.

  • specific_ids (Union[range, List[int], numpy.ndarray, None]) – Determines if only specific ids should be loaded. Using this option, the returned values are sorted according to the specified ids and not necessarily ascending. If set to None, all data is loaded. Default is None.

  • progress_bar (bool) – Determines whether to display a progress bar. Default is True.

Returns:

Total Charge (TC) region minus Total Charge Transition (TCT) masks

Return type:

List[numpy.ndarray]
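Excluding transitions from region masks can be illustrated with a small NumPy example. The helper exclude_transitions is hypothetical, and so is the encoding: the sketch assumes integer region labels, a transition mask in which nonzero pixels mark transitions, and 0 as the value written to excluded pixels. The library may encode regions and excluded pixels differently.

```python
import numpy as np


def exclude_transitions(region_mask, transition_mask, excluded_value=0):
    """Overwrite region labels with excluded_value wherever a transition lies."""
    region_mask = np.asarray(region_mask)
    transition_mask = np.asarray(transition_mask)
    # Assumption: nonzero pixels in transition_mask belong to a transition.
    return np.where(transition_mask != 0, excluded_value, region_mask)


tc_regions = np.array([[1, 1, 2, 2],
                       [1, 1, 2, 2]])
tct_mask = np.array([[0, 1, 0, 0],
                     [0, 1, 0, 0]])
regions_minus_tct = exclude_transitions(tc_regions, tct_mask)
```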

simcats_datasets.loading.load_ground_truth.load_c_region_masks(file, specific_ids=None, progress_bar=True)

Load Charge (C) region masks as ground truth data (CTs are basically excluded from the TC regions).

Parameters:
  • file (Union[str, h5py.File]) – The file to read the data from. Can either be an object of the type h5py.File or the path to the dataset. If you want to do multiple consecutive loads from the same file (e.g. for using the PyTorch SimcatsDataset without preloading), consider initializing the file object yourself and passing it, to improve performance.

  • specific_ids (Union[range, List[int], numpy.ndarray, None]) – Determines if only specific ids should be loaded. Using this option, the returned values are sorted according to the specified ids and not necessarily ascending. If set to None, all data is loaded. Default is None.

  • progress_bar (bool) – Determines whether to display a progress bar. Default is True.

Returns:

Charge (C) region masks

Return type:

List[numpy.ndarray]