http://creativecommons.org/licenses/by/3.0/legalcode
Multiple PodManager Scheduler
This proposal describes adding a new scheduler service to valence that determines how to dispatch each compose operation to the appropriate Pod manager.
https://blueprints.launchpad.net/openstack-valence/+spec/valence-multipodm-scheduler
Problem description
Valence will support multiple Pod managers on the backend, instead of a single instance, to improve its scalability. This requires valence to provide a scheduling service that determines how to dispatch each compose operation to the appropriate Pod manager. The scheduler should filter out Pod managers that lack the requested hardware resources and rank the remaining Pod managers using different algorithms. To support different scheduling goals, it should allow the admin to plug in new algorithms.
Proposed change
The valence scheduler runs as a separate process alongside the other valence services, such as the API server. Its interface to the API server accepts the request properties of each compose operation, and it posts to the controller to indicate where the composition should be scheduled.
The scheduler is divided into two layers at a high level:
- Scheduler framework: the main() entry point that initializes the service and calls the scheduling algorithm.
- Scheduling algorithm: the algorithm that assigns a target Pod manager to each compose operation.
The scheduler tries to find a PODM for each compose operation, one at a time.
- First, it applies a set of "filter functions" to filter out inappropriate PODMs. If the compose operation specifies resource requests, the scheduler filters out every PODM that does not have at least that many resources available.
- Second, it applies a set of "priority functions" that rank the PODMs that were not filtered out in the first step. The "priority functions" may vary for different scenarios. For example, one may try to spread composed nodes across all PODMs.
- Finally, the PODM with the highest priority is chosen. If several PODMs share the highest priority, one of them is chosen at random.
For given compose operations:
+---------------------------------------------+
| Schedulable PODM:                           |
|                                             |
|   +--------+    +--------+    +--------+    |
|   | PODM 1 |    | PODM 2 |    | PODM 3 |    |
|   +--------+    +--------+    +--------+    |
|                                             |
+-------------------+-------------------------+
                    |
                    |
                    v
+-------------------+-------------------------+
| Filter functions: PODM 3 doesn't have       |
| enough resources                            |
+-------------------+-------------------------+
                    |
                    |
                    v
+-------------------+-------------------------+
| Remaining PODM:                             |
|   +--------+    +--------+                  |
|   | PODM 1 |    | PODM 2 |                  |
|   +--------+    +--------+                  |
|                                             |
+-------------------+-------------------------+
                    |
                    |
                    v
+-------------------+-------------------------+
| Priority functions: PODM 1: p=5             |
|                     PODM 2: p=3             |
+-------------------+-------------------------+
                    |
                    |
                    v
    select max{PODM priority} = PODM 1
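The filter, rank, and select flow in the diagram above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the proposed implementation: the `Podm` fields, the `enough_cpus` filter, the `spread_nodes` priority function, and the `schedule` signature are all hypothetical names chosen for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class Podm:
    """Hypothetical scheduler-side view of one Pod manager."""
    name: str
    free_cpus: int
    composed_nodes: int


def enough_cpus(podm, request):
    """Filter function: keep PODMs that can satisfy the requested CPUs."""
    return podm.free_cpus >= request.get("cpus", 0)


def spread_nodes(podm, request):
    """Priority function: a PODM with fewer composed nodes scores higher."""
    return -podm.composed_nodes


def schedule(podms, request, filters, priorities, rng=random):
    """Return the PODM chosen for one compose request, or None."""
    # Step 1: drop every PODM rejected by at least one filter function.
    candidates = [p for p in podms if all(f(p, request) for f in filters)]
    if not candidates:
        return None
    # Step 2: sum the scores of all priority functions for each candidate.
    scores = {p.name: sum(fn(p, request) for fn in priorities)
              for p in candidates}
    # Step 3: take the highest score and break ties at random.
    best = max(scores.values())
    return rng.choice([p for p in candidates if scores[p.name] == best])


podms = [
    Podm("PODM 1", free_cpus=8, composed_nodes=1),
    Podm("PODM 2", free_cpus=8, composed_nodes=3),
    Podm("PODM 3", free_cpus=2, composed_nodes=0),
]
chosen = schedule(podms, {"cpus": 4}, [enough_cpus], [spread_nodes])
print(chosen.name)  # PODM 1
```

With empty `filters` and `priorities` lists, every PODM ties at score 0 and the choice is purely random, which corresponds to the "disable all algorithms" case.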
Both the filter functions and the priority functions should be configurable so that the admin can choose the proper algorithms for different scenarios, for example disabling all algorithms and letting the scheduler choose a PODM at random.
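One possible way to make the algorithms configurable is a name-based registry: each function registers itself under a name, and the admin lists the enabled names in configuration. The sketch below assumes hypothetical option names (`enabled_filters`, `enabled_priorities`), plugin names, and PODM fields; the spec does not define any of them.

```python
# Registries mapping a configured name to a scheduler function.
FILTERS = {}
PRIORITIES = {}


def register(registry, name):
    """Decorator that records a function in a registry under a name."""
    def decorator(fn):
        registry[name] = fn
        return fn
    return decorator


@register(FILTERS, "enough_memory")
def enough_memory(podm, request):
    """Keep PODMs with at least the requested amount of free memory."""
    return podm["free_memory_gb"] >= request.get("memory_gb", 0)


@register(PRIORITIES, "most_free_memory")
def most_free_memory(podm, request):
    """Rank PODMs with more free memory higher."""
    return podm["free_memory_gb"]


def load_enabled(registry, names):
    """Resolve configured names to functions, failing loudly on typos."""
    unknown = [n for n in names if n not in registry]
    if unknown:
        raise ValueError("unknown scheduler plugin(s): %s" % unknown)
    return [registry[n] for n in names]


# Hypothetical admin configuration: one filter, no priority functions.
config = {"enabled_filters": ["enough_memory"], "enabled_priorities": []}
filters = load_enabled(FILTERS, config["enabled_filters"])
priorities = load_enabled(PRIORITIES, config["enabled_priorities"])
```

Leaving both lists empty disables all algorithms. An admin adds a new algorithm by registering one more function and naming it in the configuration.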
Alternatives
Make the scheduler a valence module instead of a standalone service. This solution would be simpler but tightly coupled with the other services, which would add overhead whenever the scheduler needs to be upgraded or restarted.
Data model impact
None
REST API impact
By default, the scheduler will determine the target POD manager for each compose operation. However, valence should also allow the user to specify the target POD manager, so a new parameter is needed for the node composition request.
/v1/nodes/: POST: add a new parameter that lets the user specify a POD manager for the compose operation.
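How the API server might honor such a parameter can be sketched as follows. The parameter name `podm_id`, the `pick_target` helper, and the scheduler callable are assumptions made for illustration only; the spec does not name the parameter.

```python
def pick_target(request_body, known_podm_ids, scheduler):
    """Return the id of the PODM that should handle a compose request.

    `podm_id` is a hypothetical parameter name. When it is absent, the
    decision is delegated to the scheduler callable.
    """
    podm_id = request_body.get("podm_id")
    if podm_id is None:
        # Default path: let the scheduler choose the target.
        return scheduler(request_body)
    if podm_id not in known_podm_ids:
        # Reject an explicit target that valence does not manage.
        raise ValueError("unknown POD manager: %r" % podm_id)
    return podm_id


def fake_scheduler(request_body):
    """Stand-in scheduler used only for this example."""
    return "podm-1"


known = {"podm-1", "podm-2"}
explicit = pick_target({"name": "node-a", "podm_id": "podm-2"}, known,
                       fake_scheduler)
scheduled = pick_target({"name": "node-b"}, known, fake_scheduler)
```

Validating the explicit target before dispatch keeps a mistyped id from reaching the controller.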
Driver API impact
None
Security impact
None
Other end user impact
The user can specify the target POD manager for a compose operation if needed.
Scalability impact
Valence scalability will be significantly improved by supporting the dispatch of compose operations to multiple POD managers.
Performance Impact
The scheduler will bring more complexity and overhead, which might add latency to valence's response to a compose operation. Because compose operations in a data center are not as frequent as launching a VM or container, the scheduler is not expected to be a performance bottleneck at the current stage.
Other deployer impact
The admin should deploy and start the scheduler process alongside the other valence services.
Developer impact
None
Valence GUI / Horizon impact
None
Implementation
Assignee(s)
- Primary assignee: Lin Yang
Work Items
- Implement the framework of scheduler service.
- Implement the default algorithms for both filter and priority steps.
- Add unit tests.
Dependencies
None
Testing
- Add unit tests for service framework and scheduling algorithms.
Documentation Impact
None
References
None