Monitoring Modern Container Infrastructure
This post was written by a guest contributor, Yair Cohen, PM at Datadog.
As part of our continuing commitment to open standards and supporting a broad and varied ecosystem, we’re pleased to announce that Datadog has extended its cloud monitoring and analytics support to Oracle Cloud Infrastructure (OCI) Container Engine for Kubernetes (OKE).
OKE is a service that helps you deploy, manage, and scale Kubernetes clusters in the cloud. With OKE, organizations can build dynamic containerized applications by incorporating Kubernetes with services running on their Oracle Cloud Infrastructure.
Now, you can use the Datadog Agent to get comprehensive visibility into your Kubernetes clusters on Oracle Cloud Infrastructure. After you enable the Kubernetes integration, you can visualize your OKE container infrastructure, monitor live processes, and track key metrics from all your pods and containers in one place.
How Oracle’s Container Engine for Kubernetes Works
Oracle Cloud Infrastructure Container Engine for Kubernetes (OKE) provides a CLI and Console (browser-based interface) for creating and managing Kubernetes clusters. You can set up OKE to automatically provision and launch Kubernetes clusters based on a custom configuration or through a quick cluster option in the Console.
When OKE launches a cluster, it creates master and worker nodes in a node pool along with all of the network resources needed for that cluster, including a virtual cloud network (VCN). You can view more details about the cluster and nodes in the OKE Console, as shown in the following example.
Because OKE is a managed service, you can easily modify your cluster and download your cluster’s kubeconfig file to perform additional management tasks with kubectl, including deploying the Datadog Agent.
Monitor Your OKE Clusters with Datadog
Monitoring Kubernetes is crucial to understanding the health of your dynamic, distributed environment. After you deploy the Datadog Agent on your OKE cluster, you can track the load on your clusters, pods, and individual nodes to get better insights into how to provision and deploy your resources. In addition to monitoring your nodes, pods, and containers, the Agent can also collect and report metrics from the services running in your cluster, so that you can:
- Explore your OKE clusters with dashboards
- Monitor containers and processes in real time
- Automatically track and monitor containerized services
Deploying the Agent as a DaemonSet is the most straightforward (and recommended) method, because it ensures that the Agent runs as a pod on every node in your cluster and that each new node automatically has the Agent installed. You can also configure the Agent to collect process data, traces, and logs by adding a few extra lines to the Agent’s manifest.
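To make the DaemonSet approach more concrete, here is a minimal Python sketch that builds an Agent manifest as a dictionary and prints it as JSON, which kubectl accepts alongside YAML. It is an illustration, not Datadog's official manifest: the Secret name (datadog-secret), its key (api-key), the label, the image tag, and the helper name build_agent_daemonset are all assumptions, and a real deployment also needs RBAC rules and volume mounts that are omitted here. The optional environment variables stand in for the "few extra lines" that turn on process, log, and trace collection.

```python
import json


def build_agent_daemonset(namespace="default", collect_processes=False,
                          collect_logs=False, collect_traces=False):
    """Return a minimal DaemonSet manifest (as a dict) for the Datadog Agent.

    Hypothetical helper for illustration only; RBAC and volume mounts are omitted.
    """
    labels = {"app": "datadog-agent"}  # assumed label, used by selector and pod template
    env = [
        # Read the API key from a Secret (assumed name/key) instead of hardcoding it.
        {"name": "DD_API_KEY",
         "valueFrom": {"secretKeyRef": {"name": "datadog-secret", "key": "api-key"}}},
        # Point the Agent at the kubelet on the node where the pod is scheduled.
        {"name": "DD_KUBERNETES_KUBELET_HOST",
         "valueFrom": {"fieldRef": {"fieldPath": "status.hostIP"}}},
    ]
    # Optional collection features are switched on with extra environment variables.
    if collect_processes:
        env.append({"name": "DD_PROCESS_AGENT_ENABLED", "value": "true"})
    if collect_logs:
        env.append({"name": "DD_LOGS_ENABLED", "value": "true"})
    if collect_traces:
        env.append({"name": "DD_APM_ENABLED", "value": "true"})

    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": "datadog-agent", "namespace": namespace},
        "spec": {
            # A DaemonSet schedules one copy of this pod template on every node.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": "agent",
                        "image": "gcr.io/datadoghq/agent:7",  # assumed image tag
                        "env": env,
                    }],
                },
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_agent_daemonset(collect_logs=True), indent=2))
```

Saving the output to a file and passing it to kubectl apply -f would create the DaemonSet, assuming your kubeconfig already points at the OKE cluster and the Secret exists.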
Explore a High-Level View of Your OKE Clusters
Datadog includes several built-in Kubernetes dashboards so you can monitor your Kubernetes Controller Manager and Scheduler, and get a high-level view of your pods and kubelets. These dashboards automatically track key Kubernetes events and metrics, including:
- The number of running pods per node
- The most CPU-intensive pods
- The number of running containers
You can clone any built-in dashboard and modify it to monitor the data that’s most important to you.
Monitor Containers and Processes in Real Time
With Datadog’s Live Container view, you can get detailed insights into your containers’ resource consumption, logs, and health in real time. Regardless of the size of your Kubernetes deployments on Oracle Cloud Infrastructure, you can quickly drill down to inspect any container (or a group of containers) when you need to troubleshoot an issue.
Datadog organizes your data with tags, including metadata that is automatically extracted from Kubernetes (for example, pod_name, container_id, docker_image, kube_deployment). These tags help you search, filter, and group all of the containers in your OKE cluster in Datadog.
The following example shows all the containers running from a single deployment, grouped by host.
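The filter-and-group idea behind tags can also be sketched in a few lines of plain Python. This is only a conceptual illustration of how key-value tags support searching and grouping; it does not call any Datadog API, and the container records, tag values, and helper names below are made up for the example.

```python
from collections import defaultdict

# Hypothetical container records carrying Kubernetes-style tags.
CONTAINERS = [
    {"container_id": "c1", "tags": {"kube_deployment": "web", "pod_name": "web-a", "host": "node-1"}},
    {"container_id": "c2", "tags": {"kube_deployment": "web", "pod_name": "web-b", "host": "node-2"}},
    {"container_id": "c3", "tags": {"kube_deployment": "web", "pod_name": "web-c", "host": "node-1"}},
    {"container_id": "c4", "tags": {"kube_deployment": "cache", "pod_name": "cache-a", "host": "node-2"}},
]


def filter_by_tags(containers, **wanted):
    """Keep only the containers whose tags match every requested key/value pair."""
    return [
        c for c in containers
        if all(c["tags"].get(key) == value for key, value in wanted.items())
    ]


def group_by_tag(containers, tag):
    """Group container IDs by the value of one tag (for example, host)."""
    groups = defaultdict(list)
    for c in containers:
        groups[c["tags"].get(tag, "unknown")].append(c["container_id"])
    return dict(groups)


# All containers from the "web" deployment, grouped by the host they run on.
web_by_host = group_by_tag(filter_by_tags(CONTAINERS, kube_deployment="web"), "host")
print(web_by_host)
```

Because every record carries the same tag keys, the same two helpers can answer other questions, such as grouping by pod_name or filtering on host, without any change to the data.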
If you select a specific container, you can view its resource metrics (at two-second resolution), running processes, and logs. From there, you can pivot to a host dashboard and to other related logs and processes collected by the Agent. This provides you with more context as you monitor your systems and debug issues.
If one of your containers begins consuming too many resources, you can use Datadog’s Live Process view to inspect its processes and determine which one is causing the problem. For example, you can track all running processes for a single OKE node.
This view enables you to easily visualize resource consumption (for example, total CPU, RSS memory) for processes within a single container, using any of the tags that are automatically pulled from Kubernetes.
Autodiscover Containerized Services
As your infrastructure grows, tracking its many moving pieces becomes more difficult, because containers are created, destroyed, and moved across nodes. The Datadog Agent’s Autodiscovery feature helps you stay on top of the containerized services running in your dynamic environment. After it’s enabled, the Datadog Agent automatically collects and reports data from the services running in the containers in your OKE cluster (for example, Cassandra, Redis, PostgreSQL), even as they move across your infrastructure. If a container shuts down or is destroyed, the Agent disables the checks for that container. Because Autodiscovery monitors only the services that are currently running, you can be confident that the data you see in Datadog is current.
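One common way to tell Autodiscovery which check to run against a container is to annotate the pod with a check template. The Python sketch below builds such annotations for a Redis container. It follows the ad.datadoghq.com/<container>.check_names, .init_configs, and .instances annotation pattern as I understand it, so confirm the exact format against the current Datadog documentation for your Agent version. The container name "redis" and the helper name autodiscovery_annotations are assumptions for this example.

```python
import json


def autodiscovery_annotations(container_name, check_name, instance, init_config=None):
    """Build pod annotations describing one Autodiscovery check template.

    Hypothetical helper: each annotation value is a JSON-encoded list, with one
    entry per check that should run against the named container.
    """
    prefix = f"ad.datadoghq.com/{container_name}"
    return {
        f"{prefix}.check_names": json.dumps([check_name]),
        f"{prefix}.init_configs": json.dumps([init_config or {}]),
        f"{prefix}.instances": json.dumps([instance]),
    }


# %%host%% is a template variable that the Agent resolves to the container's
# address at runtime, so the check follows the container as it moves.
redis_annotations = autodiscovery_annotations(
    container_name="redis",   # assumed container name in the pod spec
    check_name="redisdb",     # name of the Agent's Redis check
    instance={"host": "%%host%%", "port": "6379"},
)

for key, value in redis_annotations.items():
    print(f"{key}: {value}")
```

These key-value pairs would go under metadata.annotations in the pod template of the workload that runs Redis. Keeping the template on the pod rather than in a static Agent config file is what lets the check start and stop with the container.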
Know What’s Happening with Your OKE Clusters
With Datadog, you can get deep visibility into all of the Kubernetes clusters running on Oracle Cloud Infrastructure, alongside other technologies running in your environment. If you already use Datadog, check out the Kubernetes documentation to learn more about how you can seamlessly monitor your OKE clusters alongside the rest of your infrastructure. Want to experience Datadog on OKE for yourself? Sign up for an OCI trial account, then request a free trial of Datadog.