Module 2 · 04 / 09
The scheduler, decoded.
A pod is just a request until the scheduler agrees. Understanding how it decides is the difference between guessing and knowing why your workload landed where it did.
When you create a pod, it does not go straight to a node. It goes into a queue. The scheduler pulls it off, runs it through two phases — filtering and scoring — and only then binds it to a node. Everything else is detail on top of those two steps.
Filtering throws out every node that cannot run the pod: not enough CPU, a taint the pod does not tolerate, a node selector that does not match. Scoring ranks whatever survives, so the pod lands on the best fit, not merely a legal one.
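The two phases can be sketched in a few lines. This is a toy model, not the scheduler's actual code: the `Node` and `Pod` classes, the string form of taints, and the scoring rule (most CPU left over after placement) are all assumptions made for illustration. The real scheduler combines several scoring plugins.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_cpu: float
    labels: dict = field(default_factory=dict)
    taints: set = field(default_factory=set)  # simplified to strings like "dedicated=batch"


@dataclass
class Pod:
    cpu_request: float
    node_selector: dict = field(default_factory=dict)
    tolerations: set = field(default_factory=set)


def passes_filter(pod, node):
    """Phase 1: a node is feasible only if every hard requirement holds."""
    if node.free_cpu < pod.cpu_request:
        return False  # not enough CPU
    if not node.taints <= pod.tolerations:
        return False  # at least one taint the pod does not tolerate
    # every selector key must be present on the node with the same value
    return all(node.labels.get(k) == v for k, v in pod.node_selector.items())


def score(pod, node):
    """Phase 2: hypothetical rule that prefers the most CPU left after placement."""
    return node.free_cpu - pod.cpu_request


def schedule(pod, nodes):
    feasible = [n for n in nodes if passes_filter(pod, n)]
    if not feasible:
        return None  # the pod stays pending
    return max(feasible, key=lambda n: score(pod, n)).name


nodes = [
    Node("small", free_cpu=1),
    Node("reserved", free_cpu=8, taints={"dedicated=batch"}),
    Node("mid", free_cpu=4),
    Node("large", free_cpu=6, labels={"zone": "b"}),
]
placement = schedule(Pod(cpu_request=2), nodes)
```

Here "small" and "reserved" are dropped in filtering, and scoring only ever compares "mid" with "large". A node that fails filtering can never win, however well it would have scored.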
There are no dumb questions
“If two nodes score the same, which one wins?”
Ties break at random by design. If you need a pod to land somewhere specific, that is a job for affinity rules or a node selector — not for hoping the scheduler picks the same node twice.
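A random tie-break is easy to picture in code. The sketch below is an illustration only: `pick_winner` and the score table are hypothetical names and values, and the real scheduler's selection logic is more involved.

```python
import random


def pick_winner(scores, rng=random):
    """Return one of the top-scoring node names, chosen at random among ties."""
    top = max(scores.values())
    tied = [name for name, value in scores.items() if value == top]
    return rng.choice(tied)


# Hypothetical scores: node-a and node-b tie for first place.
scores = {"node-a": 87, "node-b": 87, "node-c": 40}
winner = pick_winner(scores)
```

Run it repeatedly and the winner alternates between node-a and node-b, which is why a placement that "usually works" is not a guarantee.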
Taints, tolerations, affinity
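A taint is a node saying keep out unless you have a reason to be here. A toleration is the pod's reason. Node affinity works from the pod's side and comes in two strengths: requiredDuringSchedulingIgnoredDuringExecution is a hard gate applied during filtering, while preferredDuringSchedulingIgnoredDuringExecution is a preference the scorer weighs. Reach for the required form when the rule must hold, and the preferred form when it is a nice-to-have.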
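To make the matching rule concrete, here is a minimal sketch written as plain Python dicts that mirror the manifest fields. The `tolerates` and `blocking_taints` helpers are hypothetical names written for this lesson, not scheduler code, and the `disktype` and `zone` labels are invented examples. The field names (`key`, `operator`, `value`, `effect`, `nodeSelectorTerms`, `matchExpressions`, `weight`, `preference`) follow the Kubernetes API.

```python
HARD_EFFECTS = {"NoSchedule", "NoExecute"}  # PreferNoSchedule is only a soft hint


def tolerates(toleration, taint):
    """True if this toleration matches this taint."""
    effect = toleration.get("effect")
    if effect and effect != taint["effect"]:
        return False  # an empty effect matches every effect
    key = toleration.get("key")
    operator = toleration.get("operator", "Equal")
    if not key:
        return operator == "Exists"  # empty key plus Exists matches any taint
    if key != taint["key"]:
        return False
    if operator == "Exists":
        return True
    return toleration.get("value", "") == taint.get("value", "")


def blocking_taints(tolerations, taints):
    """Taints that would reject the pod during filtering."""
    return [
        t for t in taints
        if t["effect"] in HARD_EFFECTS
        and not any(tolerates(tol, t) for tol in tolerations)
    ]


gpu_taint = {"key": "gpu", "value": "true", "effect": "NoSchedule"}
gpu_toleration = {"key": "gpu", "operator": "Equal", "value": "true",
                  "effect": "NoSchedule"}

# Shape of spec.affinity.nodeAffinity with one hard rule and one soft rule.
node_affinity = {
    "requiredDuringSchedulingIgnoredDuringExecution": {
        "nodeSelectorTerms": [
            {"matchExpressions": [
                {"key": "disktype", "operator": "In", "values": ["ssd"]}]}
        ]
    },
    "preferredDuringSchedulingIgnoredDuringExecution": [
        {"weight": 50,
         "preference": {"matchExpressions": [
             {"key": "zone", "operator": "In", "values": ["a"]}]}}
    ],
}
```

Note that a toleration only lets the pod past the taint. It does not pull the pod toward the tainted node; pairing it with affinity is what steers the pod there.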
A pod requests 2 CPU and tolerates the gpu=true taint. Node A has 1 CPU free, node B has 4 CPU and the gpu taint, node C has 3 CPU and no taint. Which node wins, and why?
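A never survives filtering: it has 1 CPU free and the pod asks for 2. B and C both pass, because the pod tolerates B's gpu taint and C has nothing to tolerate. The winner is decided in scoring, and a tolerated taint carries no penalty there, so the answer depends on the scoring plugins in play and on details the question leaves out, such as each node's total capacity. If C must win, say so with affinity rather than counting on it looking like the cleaner fit. When you internalise the two-phase model, placement stops being magic.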
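The filtering half of the exercise is plain arithmetic and can be checked mechanically. The sketch below assumes the gpu=true taint blocks scheduling unless tolerated (the exercise does not state its effect) and uses a hypothetical `filter_verdict` helper. It stops at filtering and does not model scoring.

```python
POD_CPU = 2
POD_TOLERATES = {"gpu=true"}

# Free CPU and taints exactly as given in the exercise.
NODES = {
    "A": {"free_cpu": 1, "taints": set()},
    "B": {"free_cpu": 4, "taints": {"gpu=true"}},
    "C": {"free_cpu": 3, "taints": set()},
}


def filter_verdict(node):
    """Report whether a node survives filtering, and if not, why."""
    if node["free_cpu"] < POD_CPU:
        return "rejected: insufficient CPU"
    if not node["taints"] <= POD_TOLERATES:
        return "rejected: untolerated taint"
    return "feasible"


verdicts = {name: filter_verdict(node) for name, node in NODES.items()}
```

Emptying `POD_TOLERATES` flips B to "rejected: untolerated taint", which shows how much of the outcome the toleration alone controls.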