Trying to understand what values to use for resources and limits of multiple container deployment

memory kubernetes cpu digital-ocean autoscaling

How to determine what values to use for my cpu and memory requests and limits fields. Mainly due to variable replica count, i.e. do I need to account for maximum number of replicas each using their resources or for deployment in general, do I plan it per pod basis or for each container individually

Requests and limits are the mechanisms Kubernetes uses to control resources such as CPU and memory.

Requests are what the container is guaranteed to get. If a container requests a resource, Kubernetes will only schedule it on a node that can give it that resource.
Limits, on the other hand, make sure a container never goes above a certain value. The container is only allowed to go up to the limit, and then it is restricted.

The number of replicas will be determined by the autoscaler on the ReplicaController.

when I deploy my file my deployment is either stuck in a Pending state, or keeps restarting multiple times until it gets terminated.

pending state means that there is not resources available to schedule new pods.
restarting may be triggered by other issues, I'd suggest you to debug it after solving the scaling issues.

My horizontal pod autoscaler also reports targets as <unknown>/80%, but I believe it is due to me removing resources from my deployment, as it was not working.

You are correct, if you don't set the request limit, the % desired will remain unknown and the autoscaler won't be able to trigger scaling up or down.
Here you can see algorithm responsible for that.
Horizontal Pod Autoscaler will trigger new pods based on the request % of usage on the pod. In this case whenever the pod reachs 80% of the max request value it will trigger new pods up to the maximum specified.

For a good HPA example, check this link: Horizontal Pod Autoscale Walkthrough

But How does Horizontal Pod Autoscaler works with Cluster Autoscaler?

Horizontal Pod Autoscaler changes the deployment's or replicaset's number of replicas based on the current CPU load. If the load increases, HPA will create new replicas, for which there may or may not be enough space in the cluster.
If there are not enough resources, CA will try to bring up some nodes, so that the HPA-created pods have a place to run. If the load decreases, HPA will stop some of the replicas. As a result, some nodes may become underutilized or completely empty, and then CA will terminate such unneeded nodes.

NOTE: The key is to set the maximum replicas for HPA thinking on a cluster level according to the amount of nodes (and budget) available for your app, you can start setting a very high max number of replicas, monitor and then change it according to the usage metrics and prediction of future load.

Take a look at How to Enable the Cluster Autoscaler for a DigitalOcean Kubernetes Cluster in order to properly enable it as well.

If you have any question let me know in the comments.

CodeHunter

Trying to understand what values to use for resources and limits of multiple container deployment

Recent Posts

How can I color dots in a xy scatterplot according to column value?

How to update a claim in ASP.NET Identity?

What does {0} mean when initializing an object?

Accessing members of items in a JSONArray with Java

How to log SQL statements in Spring Boot?

Powershell Get-WebSite name parameter is ignored

How to detect scroll to bottom of html element

Java synchronized method

How to test controllers with CodeIgniter?

Detect Visual Composer

Matplotlib: Specify format of floats for tick labels

Rails join a list of strings with commas and "and" before the last