sorcha.readers.CombinedDataReader

The CombinedDataReader class loads all of the input data for the simulator's post-processing. It uses individual reader classes to read each input file and combines the data into a single table.

The CombinedDataReader object reads the data in blocks to limit memory usage. For each block, it uses two stages:

  1. Read a range of individual rows from the primary_reader. By default this reader is the first auxiliary data reader, but it can be set to the ephemeris reader. This reader is used to extract a list of object IDs for this block.

  2. For each of the readers (ephemeris and auxiliary data), load all the rows corresponding to the object IDs extracted in stage 1.

For example, if the ephemeris file is used as the primary reader, the algorithm will load data in blocks of the ephemeris rows and join in the auxiliary data for just the object IDs on those rows. The resulting block is not guaranteed to include all rows for the current objects.
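The two stages can be illustrated with plain pandas. The sketch below is not Sorcha's implementation: the tables, the FieldID and H columns, and the helper read_block_sketch are hypothetical stand-ins, and the ephemeris table plays the role of the primary reader.

```python
import pandas as pd

# Hypothetical in-memory stand-ins for an ephemeris file and one auxiliary file.
ephem = pd.DataFrame({
    "ObjID": ["a", "a", "b", "b", "c", "c"],
    "FieldID": [1, 2, 1, 3, 2, 4],
})
orbits = pd.DataFrame({
    "ObjID": ["a", "b", "c"],
    "H": [17.1, 18.4, 20.2],
})


def read_block_sketch(primary, others, block_start, block_size):
    """Two-stage read: slice the primary table, then join matching rows from the others."""
    # Stage 1: read a range of rows from the primary table and collect its object IDs.
    primary_rows = primary.iloc[block_start:block_start + block_size]
    obj_ids = primary_rows["ObjID"].unique()

    # Stage 2: from every other table, keep only rows for those object IDs and join them in.
    block = primary_rows
    for table in others:
        matching = table[table["ObjID"].isin(obj_ids)]
        block = block.merge(matching, on="ObjID", how="left")
    return block


block = read_block_sketch(ephem, [orbits], block_start=0, block_size=3)
print(block)
```

In this toy input, object b has a second ephemeris row (index 3) that falls outside the three-row block, so the block holds only some of the rows for that object.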

Classes

CombinedDataReader

Module Contents

class CombinedDataReader(ephem_primary=False, **kwargs)
ephem_reader = None
aux_data_readers = []
block_start = 0
ephem_primary = False
add_ephem_reader(new_reader)

Add a new reader for ephemeris data.

Parameters:

new_reader (ObjectDataReader) -- The reader for a specific input file.

add_aux_data_reader(new_reader)

Add a new object reader that corresponds to an auxiliary input data type.

Parameters:

new_reader (ObjectDataReader) -- The reader for a specific input file.

check_aux_object_ids()

Checks the ObjIDs in all of the auxiliary data readers to make sure that every file contains exactly the same ObjIDs.
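As a rough illustration of this kind of check, the sketch below compares the sets of object IDs in several pandas tables. It is an assumption-laden stand-in rather than the method's actual code: the helper same_object_ids, its boolean return value, and the example tables are all hypothetical, and this page does not say how the real method reports a mismatch.

```python
import pandas as pd


def same_object_ids(tables, id_col="ObjID"):
    """Return True if every table holds exactly the same set of object IDs."""
    id_sets = [set(table[id_col]) for table in tables]
    # Compare every set against the first one; row order and duplicates are ignored.
    return all(ids == id_sets[0] for ids in id_sets[1:])


# Hypothetical auxiliary tables keyed by ObjID.
orbits = pd.DataFrame({"ObjID": ["a", "b", "c"], "H": [17.1, 18.4, 20.2]})
colours = pd.DataFrame({"ObjID": ["c", "a", "b"], "g-r": [0.5, 0.6, 0.4]})
incomplete = pd.DataFrame({"ObjID": ["a", "b"], "g-r": [0.6, 0.4]})

matching = same_object_ids([orbits, colours])
mismatched = same_object_ids([orbits, incomplete])
print(matching, mismatched)
```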

read_block(block_size=None, verbose=False, **kwargs)

Reads in a set number of rows from the input, performs post-processing and validation, and returns a data frame.

Parameters:
  • block_size (integer, optional) -- The number of rows to read in. Use block_size=None to read in all available data. Default = None

  • verbose (boolean, optional) -- Use verbose logging. Default = False

  • **kwargs (dictionary, optional) -- Extra arguments.

Returns:

res_df -- dataframe of the combined object data.

Return type:

pandas dataframe
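The block_size semantics and the block_start attribute suggest a read-until-exhausted loop. The sketch below shows that pattern with a hypothetical stand-in class, BlockReaderSketch, over a single pandas table. It does not use Sorcha itself, and treating an empty data frame as the end-of-input signal is an assumption of this sketch, not documented behaviour.

```python
import pandas as pd


class BlockReaderSketch:
    """Hypothetical stand-in that tracks a block_start offset between reads."""

    def __init__(self, table):
        self.table = table
        self.block_start = 0

    def read_block(self, block_size=None):
        # block_size=None means "read everything that is left".
        stop = None if block_size is None else self.block_start + block_size
        block = self.table.iloc[self.block_start:stop]
        # Advance the offset so the next call continues where this one ended.
        self.block_start += len(block)
        return block


table = pd.DataFrame({
    "ObjID": ["a", "b", "c", "d", "e"],
    "H": [17.1, 18.4, 20.2, 19.0, 21.5],
})
reader = BlockReaderSketch(table)

sizes = []
while True:
    block = reader.read_block(block_size=2)
    if block.empty:  # assumption: an empty frame marks the end of the input
        break
    sizes.append(len(block))

print(sizes)  # [2, 2, 1]
```

Smaller blocks lower peak memory at the cost of more read calls, while block_size=None returns all remaining rows in one call.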

read_aux_block(block_size=None, verbose=False, **kwargs)

Reads in a set number of rows from the input, performs post-processing and validation, and returns a data frame.

This function DOES NOT include the ephemeris data in the returned data frame. It is to be used when generating the ephemeris during the execution of Sorcha.

Parameters:
  • block_size (integer, optional) -- The number of rows to read in. Use block_size=None to read in all available data. Default = None

  • verbose (boolean, optional) -- Use verbose logging. Default = False

  • **kwargs (dictionary, optional) -- Extra arguments.

Returns:

res_df -- dataframe of the combined object data, excluding any ephemeris data.

Return type:

pandas dataframe