Config
Band has four configuration objects, one for each of its components.
Each configuration field is either optional or required. An optional field is guaranteed to have a default value. A required field has no default, so RuntimeConfigBuilder cannot generate a configuration unless the field is specified.
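The optional/required distinction can be pictured with a small builder. The sketch below is a hypothetical stand-in, not Band's implementation: SketchConfig, SketchBuilder, and the choice to signal a missing required field with std::nullopt are all assumptions made for illustration. Only the field names online and schedulers, and the default true for online, come from this document.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; not Band's actual types.
struct SketchConfig {
  bool online;
  std::vector<std::string> schedulers;
};

class SketchBuilder {
 public:
  SketchBuilder& AddOnline(bool online) {
    online_ = online;
    return *this;
  }
  SketchBuilder& AddSchedulers(std::vector<std::string> schedulers) {
    schedulers_ = std::move(schedulers);
    return *this;
  }
  // Returns std::nullopt when the required field was never specified.
  std::optional<SketchConfig> Build() const {
    if (!schedulers_.has_value()) return std::nullopt;
    return SketchConfig{online_, *schedulers_};
  }

 private:
  bool online_ = true;  // Optional field: a default always exists.
  std::optional<std::vector<std::string>> schedulers_;  // Required: no default.
};
```

With this shape, a builder that never receives schedulers yields no configuration, while one that receives only schedulers still gets online set to its default.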
Enumeration Types
BandSchedulerType
kFixedDeviceFixedWorker
kRoundRobin
kShortestExpectedLatency
kFixedDeviceGlobalQueue
kHeterogeneousEarliestFinishTime
kLeastSlackTimeFirst
kHeterogeneousEarliestFinishTimeReserved
CPUMaskFlags
kAll
kLittle
kBig
kPrimary
DeviceFlags
kCPU
kGPU
kDSP
kNPU
SubgraphPreparationType
kNoFallbackSubgraph
kFallbackPerWorker
kUnitSubgraph
kMergeUnitSubgraph
ProfileConfig
online [type: bool, default: true]: Profile online if true, offline if false.
num_warmups [type: int, default: 1]: The number of warmup runs before profiling.
num_runs [type: int, default: 1]: The number of profiling runs.
copy_computation_ratio [type: std::vector<int>, default: [30000, ...]]: The ratio of computation to input-output copy. Used for latency estimation. The size of the list should be the same as the number of devices.
smoothing_factor [type: float, default: 0.1]: The momentum used to reflect the current profiled data: <updated_profile> = <smoothing_factor> * <curr_profile> + (1. - <smoothing_factor>) * <prev_profile>.
profile_data_path [type: std::string, default: ""]: The input path to the file for offline profile results. If not specified, this field is ignored and no result file is generated.
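The smoothing_factor update is an exponential moving average. A minimal sketch of that formula follows; the function name UpdateProfile and the use of float for profiled values are assumptions, not part of Band's API.

```cpp
#include <cassert>
#include <cmath>

// Blends the newest measurement into the running profile.
// A smoothing_factor near 1 trusts the newest measurement;
// a value near 0 keeps the previous estimate mostly unchanged.
inline float UpdateProfile(float prev_profile, float curr_profile,
                           float smoothing_factor) {
  return smoothing_factor * curr_profile +
         (1.0f - smoothing_factor) * prev_profile;
}
```

For example, with the default factor 0.1, a previous profile of 100 and a new measurement of 200 give 0.1 * 200 + 0.9 * 100 = 110.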
PlannerConfig
schedule_window_size [type: int, default: INT_MAX]: The size of the window that the scheduler uses.
schedulers [type: std::vector<SchedulerType>, required]: The types of schedulers. If N schedulers are specified, N queues will be generated.
cpu_mask [type: CPUMaskFlags, default: kAll]: The CPU mask used to set CPU affinity.
log_path [type: std::string, default: ""]: The output path to the file for the planner’s log. If not specified, this field is ignored and no log file is generated.
WorkerConfig
workers [type: std::vector<DeviceFlags>, default: [kCPU, kGPU, ...]]: The list of target devices. By default, one worker per device is generated.
cpu_masks [type: std::vector<CPUMaskFlags>, default: [kAll, kAll, ...]]: The CPU masks used to set CPU affinity. The size of the list must be the same as the size of workers.
num_threads [type: std::vector<int>, default: [1, 1, ...]]: The number of threads. The size of the list must be the same as the size of workers.
allow_worksteal [type: bool, default: false]: Work-stealing is enabled if true, disabled if false.
availability_check_interval_ms [type: int, default: 30000]: The interval, in milliseconds, for checking the availability of devices. Used for detecting thermal throttling.
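The size constraints on cpu_masks and num_threads can be checked before a configuration is used. The following is a sketch under assumptions: WorkerConfigSketch and HasConsistentSizes are hypothetical names, and the enumerations simply mirror the values listed under Enumeration Types. How Band itself enforces the constraint is not stated here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical mirrors of the enumerations and fields listed above.
enum class DeviceFlags { kCPU, kGPU, kDSP, kNPU };
enum class CPUMaskFlags { kAll, kLittle, kBig, kPrimary };

struct WorkerConfigSketch {
  std::vector<DeviceFlags> workers;
  std::vector<CPUMaskFlags> cpu_masks;
  std::vector<int> num_threads;
};

// True when every per-worker list has exactly one entry per worker.
inline bool HasConsistentSizes(const WorkerConfigSketch& config) {
  const std::size_t n = config.workers.size();
  return config.cpu_masks.size() == n && config.num_threads.size() == n;
}
```

A configuration with two workers but a single CPU mask would fail this check, which is the situation the size rule is meant to rule out.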
RuntimeConfig
RuntimeConfig contains ProfileConfig, PlannerConfig, and WorkerConfig.
minimum_subgraph_size [type: int, default: 7]: The minimum subgraph size. If a candidate subgraph is smaller than this, the subgraph will not be created.
subgraph_preparation_type [type: SubgraphPreparationType, default: kMergeUnitSubgraph]: For fallback schedulers, determines how candidate subgraphs are generated.
cpu_mask [type: CPUMaskFlags, default: kAll]: The CPU mask for the Band Engine.
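The effect of minimum_subgraph_size can be illustrated as a simple filter over candidate sizes. This is only a sketch: FilterCandidateSizes is a hypothetical helper, and what "size" counts (for example, the number of operators in a subgraph) is an assumption that this document does not confirm.

```cpp
#include <cassert>
#include <vector>

// Keeps only the candidate sizes that meet the minimum.
// Each int stands for the size of one candidate subgraph; the unit of
// "size" is an assumption made for this illustration.
inline std::vector<int> FilterCandidateSizes(
    const std::vector<int>& candidate_sizes, int minimum_subgraph_size) {
  std::vector<int> kept;
  for (int size : candidate_sizes) {
    // A candidate smaller than the minimum is not created.
    if (size >= minimum_subgraph_size) kept.push_back(size);
  }
  return kept;
}
```

With the default minimum of 7, candidates of size 3 and 6 would be dropped, while candidates of size 7 and 12 would be kept.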
RuntimeConfigBuilder
API
RuntimeConfigBuilder delegates to all the builders that inherit from ConfigBuilder.
It is a friend class of all the other ConfigBuilder classes, so take care not to change their members from within RuntimeConfigBuilder.
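The delegation and friend relationship can be sketched as follows. All class names here (ProfileBuilderSketch, RuntimeBuilderSketch) are hypothetical, and the sketch covers a single field; it shows one way a top-level builder can forward to a sub-builder's public method instead of writing the sub-builder's private members, even though friendship would allow it.

```cpp
#include <cassert>

class RuntimeBuilderSketch;  // Forward declaration for the friend grant.

// Hypothetical sub-builder; its state is private but visible to the
// top-level builder through the friend declaration.
class ProfileBuilderSketch {
  friend class RuntimeBuilderSketch;

 public:
  ProfileBuilderSketch& AddNumRuns(int num_runs) {
    num_runs_ = num_runs;
    return *this;
  }

 private:
  int num_runs_ = 1;
};

class RuntimeBuilderSketch {
 public:
  // Delegates to the sub-builder's public method rather than assigning
  // profile_builder_.num_runs_ directly.
  RuntimeBuilderSketch& AddNumRuns(int num_runs) {
    profile_builder_.AddNumRuns(num_runs);
    return *this;
  }
  // Friendship is used here only to read the sub-builder's state.
  int NumRuns() const { return profile_builder_.num_runs_; }

 private:
  ProfileBuilderSketch profile_builder_;
};
```

Keeping writes behind the sub-builder's own methods means any validation added there later also applies to calls made through the top-level builder.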
Example Usage
RuntimeConfigBuilder b;
auto config = b.AddOnline(false)  // Default was `true`
                  .AddSmoothingFactor(0.3)  // Default was `0.1`
                  .AddSchedulers({SchedulerType::kRoundRobin, SchedulerType::kLeastSlackTimeFirst})  // Required field.
                  .Build();
Methods
All Add* methods are idempotent, i.e., calling a method multiple times behaves the same as calling it once.
AddOnline(bool online)
AddNumWarmups(int num_warmups)
AddNumRuns(int num_runs)
AddCopyComputationRatio(std::vector<int> copy_computation_ratio)
AddSmoothingFactor(float smoothing_factor)
AddProfileLogPath(std::string profile_data_path)
AddPlannerLogPath(std::string planner_log_path)
AddScheduleWindowSize(int schedule_window_size)
AddSchedulers(std::vector<SchedulerType> schedulers)
AddPlannerCPUMask(CPUMaskFlags cpu_masks)
AddWorkers(std::vector<DeviceFlags> workers)
AddWorkerCPUMasks(std::vector<CPUMaskFlags> cpu_masks)
AddWorkerNumThreads(std::vector<int> num_threads)
AddAllowWorkSteal(bool allow_worksteal)
AddAvailabilityCheckIntervalMs(int32_t availability_check_interval_ms)
AddMinimumSubgraphSize(int minimum_subgraph_size)
AddSubgraphPreparationType(SubgraphPreparationType subgraph_preparation_type)
AddCPUMask(CPUMaskFlags cpu_mask)
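One way an Add* method can be idempotent is to overwrite the stored value on every call rather than accumulate. The sketch below illustrates that idea with a hypothetical class, IdempotentBuilderSketch; whether Band's builders are implemented this way is an assumption.

```cpp
#include <cassert>

// Hypothetical builder with a single field, for illustration only.
class IdempotentBuilderSketch {
 public:
  // Overwrites the stored value instead of accumulating, so repeating
  // the same call leaves the builder in the same state.
  IdempotentBuilderSketch& AddNumWarmups(int num_warmups) {
    num_warmups_ = num_warmups;
    return *this;
  }
  int num_warmups() const { return num_warmups_; }

 private:
  int num_warmups_ = 1;
};
```

Calling AddNumWarmups(3) once or twice in a row leaves the same stored value, which is the property the Methods section describes.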