POD Definition - Deploying to DC/OS


I may not know the answer to the issues you are running into but I think I may be able to share some pointers to help debug this.

First of all, if you are unable to view logs from the DC/OS UI, you can go to <cluster_url>/mesos and find the simple_docker task under Completed Tasks. It should show up as TASK_FAILED. Click the Sandbox link on the right, then check the task's stderr and stdout files. There may be some clues there as to why it failed.

Another place to look is the agent logs. In the Mesos UI, note the IP of the agent where the task failed, SSH into that node, and run sudo journalctl -u dcos-mesos-slave to view the agent logs. Then look for the entries corresponding to the failing task.
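Agent logs can be long, so it may help to narrow them to the lines that mention the task. A minimal sketch, assuming the task name (simple_docker here) appears in the relevant log lines; filter_task_logs is a hypothetical helper, not part of DC/OS:

```shell
# Hypothetical helper: keep only the log lines that mention a given task name.
# Reads log text on stdin; -F treats the name as a literal string, not a regex.
filter_task_logs() {
  grep -F -- "$1"
}

# Example usage on the agent node (not run here):
#   sudo journalctl -u dcos-mesos-slave --no-pager --since "1 hour ago" \
#     | filter_task_logs simple_docker
```

Adjust the --since window to cover the time the task failed.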

One difference between running the application as a Pod and the App definition you shared is that your App definition uses DOCKER as the containerizer for the task, while Pods use the MESOS containerizer. I noticed that you are using a private Docker registry for your images. One possibility is that your private registry's certificate is not trusted by Mesos even though Docker is already configured to trust it. If so, you can make Mesos trust it as follows:

    <copy the certificate(s) to /var/lib/dcos/pki/tls/certs>
    cd /var/lib/dcos/pki/tls/certs
    for file in *.crt; do ln -s "$file" "$(openssl x509 -hash -noout -in "$file")".0; done

This would need to be done on each agent node.
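Since this has to be repeated on every agent, a re-runnable variant of the same loop may be convenient. This is only a sketch of the steps above: link_cert_hashes is a hypothetical function name, and it assumes the certificates have already been copied into the directory and that openssl is installed on the node.

```shell
# Hypothetical helper: create <subject-hash>.0 symlinks for every .crt file
# in a directory (defaults to the DC/OS certs directory named above).
link_cert_hashes() {
  (
    cert_dir="${1:-/var/lib/dcos/pki/tls/certs}"
    cd "$cert_dir" || exit 1
    for file in *.crt; do
      # When no .crt files exist the glob stays literal, so skip it.
      [ -e "$file" ] || continue
      # Skip files that openssl cannot parse as a certificate.
      hash="$(openssl x509 -hash -noout -in "$file")" || continue
      # -f replaces an existing link so the function can be re-run safely.
      ln -sf "$file" "$hash.0"
    done
  )
}

# Example usage on an agent node (not run here):
#   sudo sh -c '. ./link_cert_hashes.sh && link_cert_hashes'
```

Note that -f overwrites an existing link, so two certificates with the same subject hash would collide on the .0 name.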

If it's not a certificate issue, it could be a Docker registry credential issue. If the registry you are using requires authentication, you can specify the Docker credentials at install time (assuming the advanced install method), as described here: https://docs.mesosphere.com/1.11/installing/production/advanced-configuration/configuration-reference/#cluster-docker-credentials
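In a standard Docker config.json, the auth value for a registry is the Base64 encoding of username:password. Here is a small sketch for producing that value, assuming the cluster credentials setting expects the same Docker config format (please check the linked reference for your version). docker_auth_token is a hypothetical helper name.

```shell
# Hypothetical helper: print the Base64 "auth" value used in Docker config
# files for a given username and password.
docker_auth_token() {
  # printf avoids the trailing newline that echo would add to the input;
  # tr strips any line wrapping that base64 adds for long inputs.
  printf '%s:%s' "$1" "$2" | base64 | tr -d '\n'
}

# Example usage (not run here):
#   docker_auth_token myuser 'mypassword'
```

Avoid leaving real passwords in shell history when running something like this.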