Kubernetes autoscaler - NotTriggerScaleUp: "pod didn't trigger scale-up (it wouldn't fit if a new node is added)"
I had the wrong parameters defined on the autoscaler. I had to modify the node-group-auto-discovery and nodes parameters:
- ./cluster-autoscaler
- --cloud-provider=aws
- --namespace=default
- --scan-interval=25s
- --scale-down-unneeded-time=30s
- --nodes=1:20:terraform-eks-demo20190922161659090500000007--terraform-eks-demo20190922161700651000000008
- --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/example-job-runner
- --logtostderr=true
- --stderrthreshold=info
- --v=4
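For reference, each value passed to the nodes flag follows the pattern min:max:asg-name. A minimal sketch, using a hypothetical ASG name:

- --nodes=1:20:my-worker-asg   # min:max:asg-name; "my-worker-asg" is a hypothetical ASG name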
When installing the cluster autoscaler, it is not enough to simply apply the example config, e.g.:
kubectl apply -f https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml
As documented in the user guide, that config has a placeholder for your EKS cluster name in the value of node-group-auto-discovery, and you must either replace it before applying or update it after deploying.
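For illustration, the relevant container argument in that manifest looks roughly like this (the exact placeholder text may differ between versions):

- --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/<YOUR CLUSTER NAME>

After substituting a (hypothetical) cluster name such as my-eks-cluster, it becomes:

- --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/my-eks-cluster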
I ran into this as well. I didn't find it documented where you would expect it to be. Here is the detailed explanation from the main README.md:
AWS - Using auto-discovery of tagged instance groups
Auto-discovery finds ASGs with tags as below and automatically manages them based on the min and max size specified in the ASG. cloudProvider=aws only.
- Tag the ASGs with keys to match .Values.autoDiscovery.tags, by default: k8s.io/cluster-autoscaler/enabled and k8s.io/cluster-autoscaler/<YOUR CLUSTER NAME>
- Verify the IAM Permissions
- Set autoDiscovery.clusterName=<YOUR CLUSTER NAME>
- Set awsRegion=<YOUR AWS REGION>
- Set awsAccessKeyID=<YOUR AWS KEY ID> and awsSecretAccessKey=<YOUR AWS SECRET KEY> if you want to use AWS credentials directly instead of an instance role
$ helm install my-release autoscaler/cluster-autoscaler-chart --set autoDiscovery.clusterName=<CLUSTER NAME>
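As a sketch, the same settings can also go in a values file instead of --set flags (value names as in the README steps above; the cluster name and region here are hypothetical):

# values.yaml (sketch)
autoDiscovery:
  clusterName: my-eks-cluster   # hypothetical cluster name
awsRegion: us-east-1            # hypothetical region

$ helm install my-release autoscaler/cluster-autoscaler-chart -f values.yaml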
My issue was that I did not specify both tags; I was only specifying the k8s.io/cluster-autoscaler/enabled tag. This makes sense now that I think about it: if you have multiple k8s clusters in the same account, the cluster-autoscaler needs to know which ASGs to actually scale.
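For illustration, both tag keys need to be present on each worker ASG. A sketch of the expected tags ("my-eks-cluster" is a hypothetical cluster name; with bare tag keys in the discovery spec, as above, only the keys are matched, and these values are just the conventional ones):

# ASG tags matched by auto-discovery (sketch)
k8s.io/cluster-autoscaler/enabled: "true"
k8s.io/cluster-autoscaler/my-eks-cluster: "owned"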
I mistakenly added these as node labels: k8s.io/cluster-autoscaler/enabled and k8s.io/cluster-autoscaler/<YOUR CLUSTER NAME>. But they should actually be AWS tags on the worker groups' Auto Scaling Groups.
Specifically, if you're using the AWS EKS module in Terraform:
workers_group_defaults = {
  tags = [
    {
      # Marks the ASG as managed by the cluster-autoscaler
      key                 = "k8s.io/cluster-autoscaler/enabled"
      value               = "TRUE"
      propagate_at_launch = true
    },
    {
      # Ties the ASG to this specific cluster
      key                 = "k8s.io/cluster-autoscaler/${var.cluster_name}"
      value               = "owned"
      propagate_at_launch = true
    }
  ]
}