slitflow.data module
- class Data(info_path=None)[source]
Bases: object
Basic Data super class.
All analysis classes should be subclasses of this class. In this class, run() executes process() on all split data.
- reqs
List of Data objects required to run the process() static method of this class.
- Type:
list of Data
- data
List of result data calculated by process().
- Type:
list of data such as pandas.DataFrame or numpy.ndarray
- n_worker
Number of CPUs used by process(). This number is defined by cpu_count * slitflow.CPU_RATE. This attribute is used during run_mp().
- Type:
- memory_limit
Maximum memory usage. This value is defined by slitflow.MEMORY_LIMIT. This attribute prevents memory exhaustion while loading data and during calculation.
- Type:
- MEMORY_LIMIT = 0.9
- CPU_RATE = 0.7
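The n_worker rule described above (cpu_count * CPU_RATE) can be sketched as follows. This is a minimal illustration, not slitflow's actual code: the helper name estimate_n_worker is hypothetical, and the truncation to an integer and the lower bound of one worker are assumptions that the reference text does not state.

```python
import os

CPU_RATE = 0.7  # same value as the module constant listed above


def estimate_n_worker(cpu_rate=CPU_RATE):
    """Hypothetical helper: scale the CPU count by cpu_rate.

    Assumption: the product is truncated to an int and never drops below 1.
    """
    n_cpu = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(1, int(n_cpu * cpu_rate))


if __name__ == "__main__":
    print(estimate_n_worker())
```

On an 8-core machine this sketch would use 5 workers, leaving some capacity free for the rest of the system.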
- set_split(split_depth)[source]
Split info index and data.
This method can be used to overwrite split_depth.
- set_reqs(reqs=None, param=None)[source]
Preparation of required data.
This step strongly depends on the analysis type. Frequently used processes are in slitflow.setreqs.
- set_info(param={})[source]
Convert input information to an Info object.
This method creates column and parameter information. The column information is used to handle the data structure. The parameter dictionaries are passed as param to process(). This method is called before run(). Implemented in subclasses.
- Parameters:
param (dict, optional) – Parameters for columns or params.
- set_index()[source]
Create index structure of this analysis data.
This step strongly depends on the analysis type. Frequently used processes are in slitflow.setindex.
- run_mp(reqs=None, param=None)[source]
Execute the run method using multiple CPUs.
This method uses ProcessPoolExecutor.
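The pattern described in this module, where a static process() is applied to every split either serially or through ProcessPoolExecutor, might look roughly like the sketch below. It does not import slitflow and is not its implementation: process, run_serial, run_parallel, and the "scale" parameter are hypothetical stand-ins, and the layout of the split data is an assumption made for the example.

```python
from concurrent.futures import ProcessPoolExecutor


def process(reqs, param):
    """Hypothetical stand-in for a subclass's static process() method."""
    # Assumption: reqs is a list of required data, here one list of numbers.
    return [x * param["scale"] for x in reqs[0]]


def run_serial(split_reqs, param):
    """Apply process() to each split in turn."""
    return [process(reqs, param) for reqs in split_reqs]


def run_parallel(split_reqs, param, n_worker=2):
    """Apply process() to each split in worker processes, keeping input order."""
    params = [param] * len(split_reqs)
    with ProcessPoolExecutor(max_workers=n_worker) as executor:
        return list(executor.map(process, split_reqs, params))


if __name__ == "__main__":
    splits = [[[1, 2]], [[3, 4]], [[5, 6]]]
    param = {"scale": 10}
    # Both paths should give the same result, split by split.
    assert run_parallel(splits, param) == run_serial(splits, param)
    print(run_parallel(splits, param))
```

The worker function is defined at module level so that it can be pickled and sent to the worker processes, and the `if __name__ == "__main__":` guard keeps the example safe on platforms that start workers by spawning a fresh interpreter.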