Connect to Kubernetes Prometheus
Finout’s solution for managing container costs is completely agentless. It reduces security risks and performance overhead by automatically identifying Kubernetes waste across any Kubernetes resource.
The Finout agentless Kubernetes integration consists of three steps:
1. Configure an S3 bucket to which the Prometheus Kubernetes data will be exported. In this step, you must also grant Finout read-only permissions so that it can read your Prometheus files.
   Note: It is highly recommended to use the same S3 bucket and permissions that are set for your AWS Cost and Usage Report (CUR) files.
2. Add a policy to the K8s worker node role that gives the CronJob write permissions to your S3 bucket.
3. Create a CronJob to export the data from Prometheus to your S3 bucket.
The S3 bucket stores the K8s data extracted from Prometheus. Finout has read-only permissions on this bucket.
Important: If you have already connected an S3 bucket to Finout for the AWS Cost and Usage Report (CUR), you can use the same S3 bucket with the same ARN role already configured. The role and bucket information are automatically filled out in the Finout console. If you want to configure a different S3 bucket or haven’t set up one yet, go to Grant Finout Access to an S3 bucket.
If you have more than one cluster, repeat steps 2 and 3 for all of your clusters.
Attach the following policy to the K8s worker node role. This enables the CronJob to write the K8s data from Prometheus into your S3 bucket.
Run a version of kube-state-metrics that supports the flag below (most likely 2.0.0+). To fetch data from the Prometheus-kube-state-metrics daemon set, include the following flag to export all labels to Prometheus (or change the pattern to your required pattern):
--metric-labels-allowlist=pods=[*],nodes=[*]
Do this by adding an arg to the kube-state-metrics container, for example:
spec:
  containers:
    - args:
        - --port=8080
        - --metric-labels-allowlist=pods=[*],nodes=[*]
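If you change the pattern, the flag value is a comma-separated list of `resource=[label,...]` entries. The sketch below is a small illustrative helper (the function name and the "app" and "team" labels are hypothetical, not part of Finout or kube-state-metrics) that renders the flag from a mapping, which can help avoid bracket or comma mistakes when editing the arg by hand.

```python
def build_metric_labels_allowlist(labels_by_resource):
    """Render the kube-state-metrics --metric-labels-allowlist flag.

    labels_by_resource maps a resource name (for example "pods") to the
    list of label keys to expose; ["*"] exposes every label.
    """
    parts = []
    for resource, labels in labels_by_resource.items():
        if not labels:
            raise ValueError(f"no labels given for resource {resource!r}")
        parts.append(f"{resource}=[{','.join(labels)}]")
    return "--metric-labels-allowlist=" + ",".join(parts)


# The pattern from this guide: export all labels for pods and nodes.
export_all = build_metric_labels_allowlist({"pods": ["*"], "nodes": ["*"]})

# A narrower, hypothetical pattern: only the "app" and "team" pod labels.
narrow = build_metric_labels_allowlist({"pods": ["app", "team"], "nodes": ["*"]})

print(export_all)
print(narrow)
```

The first printed line matches the flag shown above; the second is only an example of a restricted pattern.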
Copy the CronJob configuration below to a file (for example, cronjob.yaml). This is an example of a CronJob that schedules a Job every 30 minutes. The job queries Prometheus with a 5-second delay between queries so as not to overload your Prometheus stack.
Run the command in a namespace of your choice, preferably the one where the Prometheus stack is deployed: kubectl create -f <filename>
Trigger the job (instead of waiting for it to start):
kubectl create job --from=cronjob/finout-prometheus-exporter-job finout-prometheus-exporter-job -n <namespace>
The job automatically fetches Prometheus data from 3 days ago up to the current time.
Additional fields that can be configured in the YAML file:
QUERY_<QUERY_NAME>: Add a custom query by setting the value of this environment variable.
TIME_FRAME: Defines the time frame window for a query, in seconds. Default: 3600 (1 hour).
BACKFILL_DAYS: Specifies the number of days of historical samples that the exporter retrieves. Default: 3d.
To configure these fields, add them to the configuration file under the env section, using the name/value format for each field.
For example:
Ensure that the field names and corresponding values are correctly specified to apply the desired configuration.
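As a rough illustration of the name/value format, the sketch below renders an `env` section from a list of pairs. It is a minimal sketch, not Finout's configuration: the TIME_FRAME and BACKFILL_DAYS values are the defaults listed above, while `QUERY_CUSTOM_MEMORY` and its value are a hypothetical custom query added only to show the `QUERY_<QUERY_NAME>` pattern. The indentation of the output is relative; align it with the container definition in your own CronJob file.

```python
import json


def render_env(entries):
    """Render (name, value) pairs as a Kubernetes container `env` section."""
    lines = ["env:"]
    for name, value in entries:
        lines.append(f"  - name: {name}")
        # Env values must be strings; a JSON string is also a valid
        # double-quoted YAML string, so json.dumps handles the quoting.
        lines.append(f"    value: {json.dumps(str(value))}")
    return "\n".join(lines)


env_yaml = render_env([
    ("TIME_FRAME", "3600"),        # default from this guide (1 hour)
    ("BACKFILL_DAYS", "3d"),       # default from this guide
    # Hypothetical custom query name and value, for illustration only.
    ("QUERY_CUSTOM_MEMORY", "container_memory_working_set_bytes"),
])
print(env_yaml)
```

Quoting every value keeps numbers such as 3600 from being parsed as integers, which Kubernetes rejects in `env` entries.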
The K8s internal address for the Prometheus server must be structured as follows: <PROMETHEUS_SERVICE>.<NAMESPACE>.svc.cluster.local
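The address structure above can be sketched as a small helper that assembles the hostname and rejects obviously malformed parts. The service name `prometheus-server` and namespace `monitoring` are assumptions for illustration; substitute the values from your own cluster. The validation uses the RFC 1123 label rules as a simplification, so treat it as a sanity check rather than a full Kubernetes name validator.

```python
import re

# Lowercase alphanumerics and "-", starting and ending with an alphanumeric.
_DNS_LABEL = re.compile(r"^[a-z0-9]([-a-z0-9]*[a-z0-9])?$")


def prometheus_internal_address(service, namespace):
    """Build <PROMETHEUS_SERVICE>.<NAMESPACE>.svc.cluster.local."""
    for label in (service, namespace):
        if len(label) > 63 or not _DNS_LABEL.match(label):
            raise ValueError(f"not a valid DNS label: {label!r}")
    return f"{service}.{namespace}.svc.cluster.local"


# Hypothetical service and namespace names.
address = prometheus_internal_address("prometheus-server", "monitoring")
print(address)
```

The helper returns only the hostname; any scheme or port your setup needs is outside what this guide specifies.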
After configuring the final step, the CronJob starts exporting data from Prometheus into your S3 bucket. After the integration, the Kubernetes cost data may take up to 24 hours to become available in your Finout account.
If this process fails, the Finout console provides guidance to help resolve the problem. You can also review the Troubleshooting FAQs to fix common errors.
If you want to export the Kubernetes data into a different S3 Bucket than the one with the AWS Cost and Usage Report (CUR) files, follow the steps below:
Go to your newly created role.
Click Add permissions and select Create inline policy.
Select JSON, and paste the following (change the <BUCKET_NAME> to the bucket you created or to your existing CUR bucket):
Click Next until the review screen appears, and name the policy finout-access-policy_K8S.
Click Create policy to create your policy for the IAM role.
Fill in all the details in the Finout console.
The following troubleshooting FAQs explain common integration issues and how to easily resolve them.
Make sure that the relevant read permissions for Finout were created. If the account that writes the Prometheus files into the S3 bucket is not the same account that created the ARN role for Finout, generate a new ARN role for Finout from the former account (that is, the account that writes into the S3 bucket) and send it to Finout, following the instructions below.
If you already created a trust policy for Finout when setting up your AWS connection, you should use the same trust policy (Read more about how to Grant Finout access to your S3 bucket).
Use the following policy for your ARN role:
To ensure your Kubernetes monitoring is compatible with Finout, you need a Prometheus-compatible system (such as VictoriaMetrics, Thanos, Cortex, or M3) and a functioning Kubernetes cluster in which to deploy the Finout CronJob. Follow these integration steps:
Validate that your system exports the correct metrics, including memory, CPU, network usage, and node info.
Deploy the Finout CronJob in your Kubernetes cluster.
Configure connectivity with the necessary environment variables.
Ensure Finout has the required permissions to access your metrics export.
For detailed instructions and requirements, refer to the Finout integration documentation.
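The metric validation step above can be sketched as a presence check against the Prometheus HTTP API, which most Prometheus-compatible systems also expose. This is a minimal sketch, not Finout's own validation: the names in `REQUIRED_METRICS` are assumptions picked to illustrate memory, CPU, network, and node info metrics, and the base URL passed to `fetch_metric_names` is whatever address your system serves the API on.

```python
import json
import urllib.request

# Assumed example metric names; check the Finout documentation for the
# metrics that are actually required.
REQUIRED_METRICS = [
    "container_memory_working_set_bytes",
    "container_cpu_usage_seconds_total",
    "container_network_receive_bytes_total",
    "kube_node_info",
]


def fetch_metric_names(base_url, timeout=10):
    """Return all metric names via the label-values endpoint for __name__."""
    url = base_url.rstrip("/") + "/api/v1/label/__name__/values"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        payload = json.load(response)
    if payload.get("status") != "success":
        raise RuntimeError(f"Prometheus request failed: {payload}")
    return payload["data"]


def missing_metrics(available, required=REQUIRED_METRICS):
    """Return the required metric names that are absent, in listed order."""
    available_set = set(available)
    return [name for name in required if name not in available_set]


# Offline example with a hand-written list instead of a live request.
sample_names = ["kube_node_info", "container_memory_working_set_bytes", "up"]
print(missing_metrics(sample_names))
```

Against a live system, `missing_metrics(fetch_metric_names(base_url))` returning an empty list suggests the assumed metrics are being exported.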
Apply the correct policy to the worker node or the pod. See instructions here.
After updating or adjusting Prometheus configurations, memory metrics may no longer appear in Finout dashboards.
Once you have identified this issue, consult the official Prometheus documentation for best practices on configuring and reloading Prometheus.
Verify the recording rule: Check that the "node_namespace_pod_container:container_memory_working_set_bytes" recording rule is correctly set up in your Prometheus environment.
Add or update the rule: If the rule is absent or improperly configured, add or amend it within your Prometheus configuration settings.
Reload the configuration: Implement the new or modified rule by reloading Prometheus’s configuration.
Access the configuration: Open the Prometheus configuration, either directly through the file or via the UI.
Edit the rules: In the recording rules section, verify the presence of "node_namespace_pod_container:container_memory_working_set_bytes". If it’s not there, insert or correct it.
Apply the changes: Reload or restart Prometheus to apply and activate the rule.
These steps directly address missing memory metrics in Finout.
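The verification step can also be scripted. The sketch below looks for the recording rule in the response of the Prometheus `/api/v1/rules` endpoint. It is a minimal sketch under assumptions: the base URL is whatever address your Prometheus serves, and the sample payload is hand-written to mirror the shape of that endpoint's response rather than captured from a real server.

```python
import json
import urllib.request

RULE_NAME = "node_namespace_pod_container:container_memory_working_set_bytes"


def fetch_rules(base_url, timeout=10):
    """Fetch the loaded rule groups from the Prometheus HTTP API."""
    url = base_url.rstrip("/") + "/api/v1/rules"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def has_recording_rule(rules_payload, rule_name=RULE_NAME):
    """Return True if a recording rule with this name is loaded."""
    groups = rules_payload.get("data", {}).get("groups", [])
    for group in groups:
        for rule in group.get("rules", []):
            if rule.get("type") == "recording" and rule.get("name") == rule_name:
                return True
    return False


# Hand-written sample payloads (group names are hypothetical).
sample_with_rule = {
    "status": "success",
    "data": {"groups": [{
        "name": "example.rules",
        "rules": [{"type": "recording", "name": RULE_NAME}],
    }]},
}
sample_without_rule = {"status": "success", "data": {"groups": []}}

print(has_recording_rule(sample_with_rule))
print(has_recording_rule(sample_without_rule))
```

Running `has_recording_rule(fetch_rules(base_url))` after a reload is one way to confirm that the rule was actually picked up.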
When all clusters share the same Prometheus disk space settings, larger clusters can quickly use up disk space due to their higher metrics output, limiting data retention to a few hours. This can lead to the Finout exporter timing out, particularly when fetching the previous day's metrics. To address this, adjust the metric retention settings according to each cluster's size and output, ensuring sufficient disk space for longer data retention.
When integrating with Finout, you can effectively monitor different Kubernetes clusters using specific monitoring tools, such as Datadog and Prometheus, all within the same AWS environment. This method supports the development of customized monitoring strategies that cater to each cluster's particular needs and budgetary considerations. For example, a cluster dedicated to customer-facing applications might benefit from Datadog's sophisticated capabilities, while a cluster utilized for internal operations might find Prometheus to be a more budget-friendly option.
To ensure successful integration with Finout in such a setup:
Verify that each cluster operates independently without shared node resources.
Configure each cluster's monitoring tool properly to ensure seamless data capture and integration.
Deployments may not appear in the monitoring dashboard if specific metrics are missing due to retention policies or configurations not capturing them. It's crucial to ensure all essential metrics are retained. Reviewing and adjusting the configuration to capture and retain these metrics can improve visibility and ensure comprehensive monitoring.
The metrics exporter might face out-of-memory (OOM) errors with insufficient resources, especially when dealing with a lot of data. This limitation can prevent the exporter from functioning properly. Allocating more resources based on its needs can solve this. By fine-tuning memory allocation to match the data workload, the exporter can run smoothly and without breaks.
To address performance issues with the exporter, reduce the values for TIME_FRAME and/or BACKFILL_DAYS. Lowering these values decreases the data volume processed, improving speed and reliability. See the CronJob YAML configuration example.
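To get a feel for how these two settings interact, the sketch below estimates how many requests a backfill needs. It rests on an assumption about the exporter's internals that this guide does not confirm: that the backfill range is covered by consecutive TIME_FRAME windows with one request per window. The 5-second delay between queries comes from the CronJob description in this guide; the function names are illustrative.

```python
import math

QUERY_DELAY_SECONDS = 5  # delay between queries stated in this guide


def parse_backfill_days(value):
    """Parse a value such as "3d" (the documented default) into whole days."""
    text = str(value).strip().lower()
    if text.endswith("d"):
        text = text[:-1]
    days = int(text)
    if days <= 0:
        raise ValueError("BACKFILL_DAYS must be positive")
    return days


def windows_per_query(backfill_days, time_frame_seconds):
    """Number of TIME_FRAME windows needed to cover the backfill range."""
    total_seconds = parse_backfill_days(backfill_days) * 86400
    return math.ceil(total_seconds / time_frame_seconds)


def minimum_delay_seconds(window_count, query_count=1):
    """Time spent only on the delays between consecutive requests."""
    requests = window_count * query_count
    return max(requests - 1, 0) * QUERY_DELAY_SECONDS


default_windows = windows_per_query("3d", 3600)  # documented defaults
reduced_windows = windows_per_query("1d", 3600)  # shorter backfill

print(default_windows, minimum_delay_seconds(default_windows))
print(reduced_windows, minimum_delay_seconds(reduced_windows))
```

Under this assumption, lowering BACKFILL_DAYS cuts the number of requests, while lowering TIME_FRAME shrinks the time span each request has to return, at the cost of more requests.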
Use QUERY_<QUERY_NAME> to add a custom query. Enter the desired metric name as the value to start collecting it. See the CronJob YAML configuration example.
Still need help? Please feel free to reach out to our team at support@finout.io.