Connect to Kubernetes Prometheus

Overview

Finout’s solution for managing container costs is completely agentless, minimizing security risks and performance overhead. It automatically identifies Kubernetes waste across any Kubernetes resource. Finout integrates with Prometheus to support both single-cluster and centralized monitoring solutions (Thanos, Mimir, and VictoriaMetrics). See the centralized cluster solutions for more information.

The Finout agentless Kubernetes integration consists of three steps:

1. Configure an S3 bucket to which the Prometheus Kubernetes data will be exported. In this step, you must also grant Finout read-only permissions to read your Prometheus files.

Note: It is highly recommended to use the same S3 bucket and permissions that are set for your AWS Cost and Usage Report (CUR) files.

2. Add a policy to the K8s worker node role that gives the CronJob write permissions to your S3 bucket.
3. Create a CronJob to export the data from Prometheus to your S3 bucket.

1. Connect an S3 Bucket

The S3 bucket stores the K8s data extracted from Prometheus. Finout has read-only permissions for this bucket.

Important: If you have already connected an S3 bucket to Finout for the AWS Cost and Usage Report (CUR), you can use the same S3 bucket with the same ARN role already configured. The role and bucket information are automatically filled out in the Finout console. If you want to configure a different S3 bucket or haven’t set one up yet, go to Grant Finout Access to an S3 Bucket.

2. Add the K8s Worker Node Role Policy

If you have more than one cluster, repeat steps 2 and 3 for all of your clusters.

Attach the following policy to the K8s worker node role. This enables the CronJob to write the K8s data from Prometheus into your S3 bucket.

{
   "Version": "2012-10-17",
   "Statement": [
       {
           "Sid": "VisualEditor0",
           "Effect": "Allow",
           "Action": "s3:ListBucket",
           "Resource": "arn:aws:s3:::<BUCKET_NAME>",
           "Condition": {
               "StringEquals": {
                   "s3:delimiter": "/"
               },
               "StringLike": {
                   "s3:prefix": "k8s/prometheus*"
               }
           }
       },
       {
           "Sid": "VisualEditor1",
           "Effect": "Allow",
           "Action": [
               "s3:PutObject",
               "s3:GetObject",
               "s3:DeleteObject"
           ],
           "Resource": "arn:aws:s3:::<BUCKET_NAME>/k8s/prometheus/*"
       }
   ]
}

Run a recent version of kube-state-metrics (most likely 2.0.0+). To fetch data from the Prometheus kube-state-metrics daemon set, include the following flag to export all labels to Prometheus (or change the pattern to your required pattern):

--metric-labels-allowlist=pods=[*],nodes=[*]

Do this by adding an arg to the kube-state-metrics container, for example:

spec:
  containers:
  - args:
    - --port=8080
    - --metric-labels-allowlist=pods=[*],nodes=[*]
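
If exporting every label is more than you need, the pattern can be narrowed. The fragment below is only a sketch: the label keys app and team are hypothetical placeholders, not labels that Finout requires, and should be replaced with the label keys you rely on for cost allocation.

```yaml
spec:
  containers:
    - args:
        - --port=8080
        # Export only the listed pod label keys instead of all pod labels.
        # "app" and "team" are hypothetical examples; use your own keys.
        - --metric-labels-allowlist=pods=[app,team],nodes=[*]
```

Any label key left out of the list is not exported to Prometheus, so it will not be available later for grouping costs.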

3. Create and Configure the CronJob

Copy the CronJob configuration below to a file (for example, cronjob.yaml). This is an example of a CronJob that schedules a Job every 30 minutes. The Job queries Prometheus with a 5-second delay between queries so as not to overload your Prometheus stack.
Run the following command in a namespace of your choice, preferably the one where the Prometheus stack is deployed:
kubectl create -f <filename>
Trigger the job (instead of waiting for it to start):
kubectl create job --from=cronjob/finout-prometheus-exporter-job finout-prometheus-exporter-job -n <namespace>
The job automatically fetches Prometheus data from 3 days ago up to the current time.

Additional YAML Configuration Fields

Query Data

QUERY_<QUERY_NAME> - Add a custom query by setting the value of this environment variable.
TIME_FRAME - Defines the time frame window for a query in seconds. Default: 3600 (1 hour).
BACKFILL_DAYS - Specifies the number of days of historical samples that the exporter retrieves. Default: 3d.

To configure these fields, add them to the configuration file under the env section, using the name/value format for each field.
For example:
```
env:
  - name: QUERY_<QUERY_NAME>
    value: "<custom_query_value>"
  - name: TIME_FRAME
    value: "3600"
  - name: BACKFILL_DAYS
    value: "3d"
```
Ensure that the field names and corresponding values are correctly specified to apply the desired configuration.

Fields Required for Centralized Clusters (e.g., Thanos, VictoriaMetrics, and Mimir)

Relevant for all tools: CLUSTER_LABEL_NAME - Define this environment variable to match the label used in your cluster.
Note:
- If your account uses centralized clusters and the cluster label is already named "cluster", no configuration changes are needed.
- If your cluster label has a different name, you must define the environment variable accordingly.
VictoriaMetrics - Define the PROMETHEUS_BEARER_AUTH_TOKEN environment variable.
Mimir - Define the PROMETHEUS_X_SCOPE_ORGID environment variable.

Note: After the CronJob is configured and deployed, contact Finout Support to complete the centralized cluster setup.
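
As a rough sketch of how these fields might look in the env section of the CronJob, assuming a cluster label called k8s_cluster and a Kubernetes Secret called finout-prometheus-auth (both names are hypothetical and only illustrate the format):

```yaml
env:
  # Relevant for all tools: the label that identifies the cluster.
  # "k8s_cluster" is a hypothetical label name.
  - name: CLUSTER_LABEL_NAME
    value: "k8s_cluster"
  # VictoriaMetrics only. Reading the token from a Secret keeps it out of
  # the manifest; the Secret name and key below are hypothetical.
  - name: PROMETHEUS_BEARER_AUTH_TOKEN
    valueFrom:
      secretKeyRef:
        name: finout-prometheus-auth
        key: token
  # Mimir only. "<TENANT_ID>" is a placeholder for your tenant (org) ID.
  - name: PROMETHEUS_X_SCOPE_ORGID
    value: "<TENANT_ID>"
```

Include only the variables that apply to your tool. A plain value field also works for the token, although a Secret reference avoids storing it in the manifest.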

CronJob YAML Configuration Example

The K8s internal address for the Prometheus server must be structured as follows: <PROMETHEUS_SERVICE>.<NAMESPACE>.svc.cluster.local


apiVersion: batch/v1
kind: CronJob
metadata:
 name: finout-prometheus-exporter-job
spec:
 successfulJobsHistoryLimit: 1
 failedJobsHistoryLimit: 1
 concurrencyPolicy: Forbid
 schedule: "*/30 * * * *"
 jobTemplate:
   spec:
     template:
       spec:
         containers:
           - name: finout-prometheus-exporter
             image: finout/finout-metrics-exporter:1.24.0
             imagePullPolicy: Always
             env:
               - name: S3_BUCKET
                 value: "<BUCKET_NAME>"
               - name: S3_PREFIX
                 value: "k8s/prometheus"
               - name: CLUSTER_NAME
                 value: "<CLUSTER_NAME>"
               - name: HOSTNAME
                 value: "<PROMETHEUS_SERVICE>.<NAMESPACE>.svc.cluster.local"
               - name: PORT
                  value: "9090"
         restartPolicy: OnFailure
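
To illustrate the address format described before the manifest, the fragment below fills in the placeholders for a Service named prometheus-server in a namespace named monitoring. Both names are hypothetical; run kubectl get services to find the actual values in your cluster.

```yaml
env:
  # Hypothetical example: a Service named "prometheus-server"
  # in the "monitoring" namespace, listening on port 9090.
  - name: HOSTNAME
    value: "prometheus-server.monitoring.svc.cluster.local"
  - name: PORT
    value: "9090"
```

Environment variable values in Kubernetes are strings, so numeric values such as the port are quoted.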

Supported InitContainer Metrics

Definitions:

Main container - The primary container in a Kubernetes pod that runs the core application or service. It typically represents the business logic or workload the pod is designed to execute.
InitContainer (Sub-Container) - A Kubernetes native sub-container that runs before or alongside the main container to handle setup or support tasks like config, logging, or proxying.

InitContainers are a type of Kubernetes sidecar container. Capturing their resource usage improves cost allocation and makes Kubernetes idle calculations more accurate. The latest CronJob image version detects InitContainers and includes their usage in cost calculations.

To export InitContainer metrics using the Finout CronJob:

Update your cronjob image version to: image: finout/finout-metrics-exporter:1.30.0
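
In the CronJob manifest shown earlier, this is a one-line change to the container's image field. The fragment below sketches the relevant part of the manifest, assuming the container name from the example configuration:

```yaml
containers:
  - name: finout-prometheus-exporter
    # Only the image tag changes; the rest of the container spec stays as is.
    image: finout/finout-metrics-exporter:1.30.0
    imagePullPolicy: Always
```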

The updated cost data will appear in Finout after 2 days.

Configuration Result

After you complete the final step, the CronJob starts exporting data from Prometheus into your S3 bucket. After the integration, the Kubernetes cost data may take up to 24 hours to become available in your Finout account.

If this process fails, follow the guidance provided in the Finout console, or review the Troubleshooting FAQs to resolve the error.

Grant Finout Access to an S3 Bucket

If you want to export the Kubernetes data into a different S3 Bucket than the one with the AWS Cost and Usage Report (CUR) files, follow the steps below:

Go to your newly created role.
Click Add permissions and select Create inline policy.

Select JSON, and paste the following (change <BUCKET_NAME> to the bucket you created or to your existing CUR bucket):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "tag:GetTagKeys"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:Get*",
                "s3:List*"
            ],
            "Resource": "arn:aws:s3:::<BUCKET_NAME>/*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:Get*",
                "s3:List*"
            ],
            "Resource": "arn:aws:s3:::<BUCKET_NAME>"
        }
    ]
}

Click Next until the review screen appears, and name the policy finout-access-policy_K8S.
Click Create policy to create your policy for the IAM role.
Fill in all the details in the Finout console.

FAQs

The following troubleshooting FAQs explain common integration issues and how to easily resolve them.

What should I do if Finout cannot assume the AWS role?

Finout could not assume the role that was created in your AWS console. Review the role created for Finout and verify the trust policy. Also, verify that the external ID is the one provided by Finout.

What should I do if Finout cannot access the provided S3 bucket?

Review the relevant policy on the bucket according to the documentation and verify the region.

What should I do if Finout says the S3 bucket doesn't exist?

Finout could not locate the provided S3 bucket. Double-check that the provided bucket exists.

What should I do if Finout cannot read from my S3 bucket?

Make sure that the relevant read permissions for Finout were created. If the account that writes the Prometheus files into the S3 bucket is not the same account that created the ARN role for Finout, generate a new ARN role for Finout from the former account (that is, the account that writes into the S3 bucket) and send it to Finout, following the instructions below.

If you already created a trust policy for Finout when setting up your AWS connection, you should use the same trust policy (Read more about how to Grant Finout access to your S3 bucket).

Use the following policy for your ARN role:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "tag:GetTagKeys"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:Get*",
                "s3:List*"
            ],
            "Resource": "arn:aws:s3:::<BUCKET_NAME>/*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:Get*",
                "s3:List*"
            ],
            "Resource": "arn:aws:s3:::<BUCKET_NAME>"
        }
    ]
}

How do I ensure Kubernetes monitoring is compatible with Finout and Prometheus?

To ensure your Kubernetes monitoring is compatible with Finout, you need a Prometheus-compatible system (such as VictoriaMetrics, Thanos, Cortex, M3) and a functioning Kubernetes cluster for deploying the Finout cronjob. Follow these integration steps:

Validate that your system exports the correct metrics, including memory, CPU, network usage, and node info.
Deploy the Finout cronjob in your Kubernetes cluster.
Configure connectivity with the necessary environment variables.
Ensure Finout has the required permissions to access your metrics export.

For detailed instructions and requirements, refer to the Finout integration documentation.

Where is the finout/finout-metrics-exporter image hosted?

The finout/finout-metrics-exporter image is hosted on Docker Hub.

What should I do if the CronJob cannot find the Prometheus host?

Make sure the host and port are correct (run kubectl get services to check them) and set them in the HOSTNAME and PORT environment variables of the CronJob.

What steps should I take if my CronJob can't write to the S3 bucket?

Apply the correct policy to the worker node role or the pod. See the instructions in Add the K8s Worker Node Role Policy.

What should I do if memory metrics no longer appear in Finout dashboards?

After updating or adjusting Prometheus configurations, memory metrics may no longer appear in Finout dashboards.

Before making changes, review the best practices for configuring and reloading Prometheus in the official Prometheus documentation. Then follow these steps:

Verify the recording rule: Check that the "node_namespace_pod_container:container_memory_working_set_bytes" recording rule is correctly set up in your Prometheus environment.
Add or update the rule: If the rule is absent or improperly configured, add or amend it within your Prometheus configuration settings.
Reload the configuration: Implement the new or modified rule by reloading Prometheus’s configuration.
1. Access the configuration: Open the Prometheus configuration, either directly through the file or via the UI.
2. Edit the rules: In the recording rules section, verify the presence of "node_namespace_pod_container:container_memory_working_set_bytes". If it’s not there, insert or correct it.
3. Apply the changes: Reload or restart Prometheus to apply and activate the rule.

This approach ensures you directly address the issue with missing memory metrics in Finout, supported by a foundational understanding of Prometheus configuration and management principles.
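
For orientation, a recording rule with this name could look like the sketch below. The expression is an assumption modeled on the rule of the same name in the kubernetes-mixin project, not a definition published by Finout. The group name is hypothetical, and the job and metrics_path label values depend on how your Prometheus scrapes the kubelet.

```yaml
groups:
  - name: container-memory-rules  # hypothetical group name
    rules:
      - record: node_namespace_pod_container:container_memory_working_set_bytes
        # Assumed expression, modeled on the kubernetes-mixin rule of the same
        # name. The job and metrics_path values depend on your scrape config.
        expr: |
          container_memory_working_set_bytes{job="kubelet", metrics_path="/metrics/cadvisor", image!=""}
          * on (namespace, pod) group_left(node)
          topk by (namespace, pod) (1,
            max by (namespace, pod, node) (kube_pod_info{node!=""})
          )
```

If your Prometheus stack already ships this rule (for example, through a bundled rule set), compare it with your running configuration rather than adding a second copy.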

Why does the Finout exporter time out with large Kubernetes clusters, and what should I do when it happens?

When all clusters share the same Prometheus disk space settings, larger clusters can quickly use up disk space due to their higher metrics output, limiting data retention to a few hours. This can lead to the Finout exporter timing out, particularly when fetching the previous day's metrics. To address this, adjust the metric retention settings according to each cluster's size and output, ensuring sufficient disk space for longer data retention.

Can I use different monitoring tools for different Kubernetes clusters within a single AWS environment when integrating with Finout?

When integrating with Finout, you can effectively monitor different Kubernetes clusters using specific monitoring tools, such as Datadog and Prometheus, all within the same AWS environment. This method supports the development of customized monitoring strategies that cater to each cluster's particular needs and budgetary considerations. For example, a cluster dedicated to customer-facing applications might benefit from Datadog's sophisticated capabilities, while a cluster utilized for internal operations might find Prometheus to be a more budget-friendly option.

To ensure successful integration with Finout in such a setup:

Verify that each cluster operates independently without shared node resources.
Configure each cluster's monitoring tool properly to ensure seamless data capture and integration.

Why are some deployments missing from the monitoring dashboard?

Deployments may not appear in the monitoring dashboard if specific metrics are missing due to retention policies or configurations not capturing them. It's crucial to ensure all essential metrics are retained. Reviewing and adjusting the configuration to capture and retain these metrics can improve visibility and ensure comprehensive monitoring.

What should I do if the metrics exporter ran out of memory?

The metrics exporter might face out-of-memory (OOM) errors with insufficient resources, especially when dealing with a lot of data. This limitation can prevent the exporter from functioning properly. Allocating more resources based on its needs can solve this. By fine-tuning memory allocation to match the data workload, the exporter can run smoothly and without breaks.
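
One way to allocate more memory is to set resource requests and limits on the exporter container in the CronJob. The values below are hypothetical starting points, not Finout recommendations, and should be tuned to the amount of data your cluster produces.

```yaml
containers:
  - name: finout-prometheus-exporter
    image: finout/finout-metrics-exporter:1.24.0
    resources:
      # Hypothetical starting values; tune them to your data volume.
      requests:
        memory: "512Mi"
      limits:
        memory: "1Gi"
```

If the Job is still terminated for exceeding its memory limit, raise the limit gradually.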

What should I do if the exporter is slow or failing due to high load?

To address performance issues with the exporter, reduce the values for TIME_FRAME and/or BACKFILL_DAYS. Lowering these values decreases the data volume processed, improving speed and reliability. See the CronJob YAML configuration example.
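
A minimal sketch of reduced values in the env section of the CronJob is shown below. The specific numbers are illustrative assumptions, and the 1d value assumes that BACKFILL_DAYS accepts the same format as its documented default of 3d.

```yaml
env:
  # Query a 30-minute window instead of the default 3600 seconds.
  - name: TIME_FRAME
    value: "1800"
  # Backfill one day instead of the default 3d.
  - name: BACKFILL_DAYS
    value: "1d"
```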

How can I collect additional metrics that Finout doesn’t include by default?

Use QUERY_<QUERY_NAME> to add a custom query. Set the desired metric name as the value of the environment variable to start collecting it. See the CronJob YAML configuration example.
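
For instance, a custom query for the kube-state-metrics metric kube_pod_status_phase might be declared as follows. The variable suffix POD_STATUS_PHASE is a hypothetical name chosen for this illustration; any naming rules for the suffix are not covered here.

```yaml
env:
  # Hypothetical custom query; the suffix after QUERY_ is a name you choose.
  - name: QUERY_POD_STATUS_PHASE
    value: "kube_pod_status_phase"
```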

How do I integrate Prometheus with Finout using Google GKE or Azure AKS?

Finout currently supports ingesting metrics from AWS S3 only. You have two options:

Use your own AWS S3 bucket – Push your Kubernetes monitoring metrics to an AWS S3 bucket hosted in your infrastructure and follow the standard integration steps.
Use Finout's AWS S3 bucket – Push your metrics to an AWS S3 bucket hosted by Finout. To set this up, contact Finout Support.

Are centralized clusters supported for the Prometheus integration (e.g., Thanos, VictoriaMetrics, and Mimir)?

Yes, to support centralized clusters, you need to configure the CronJob with the appropriate cluster label:

You must define the CLUSTER_LABEL_NAME environment variable to match your cluster’s label. See Finout's centralized cluster solutions.
If your cluster is already labeled "cluster", no changes are needed.

Once the CronJob is configured and deployed, please contact Finout Support to complete the centralized cluster setup.

What should I do if I'm using both single cluster and centralized cluster (e.g., Thanos, Mimir, VictoriaMetrics) in my account?

Finout supports both standalone Prometheus clusters and centralized or aggregated setups like Thanos.

To support both configurations:

Create two Kubernetes Prometheus cost centers and deploy two cronjobs, one for each type, per the instructions above.
Configure exporters to write to separate S3 paths using:
- S3_BUCKET
- S3_PREFIX
Update security policies to allow access to all relevant prefixes or buckets (if using a single bucket).
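
The separate S3 paths could be sketched as two env fragments, one per CronJob. The prefix names below are hypothetical. They were chosen to sit under k8s/prometheus/ so that the worker node policy from step 2 would still cover them; any other prefix requires a matching policy update.

```yaml
# CronJob for the standalone Prometheus cluster (env fragment)
env:
  - name: S3_BUCKET
    value: "<BUCKET_NAME>"
  - name: S3_PREFIX
    value: "k8s/prometheus/standalone"    # hypothetical prefix
---
# CronJob for the centralized setup (env fragment)
env:
  - name: S3_BUCKET
    value: "<BUCKET_NAME>"
  - name: S3_PREFIX
    value: "k8s/prometheus/centralized"   # hypothetical prefix
```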

