Troubleshooting
This section helps you identify and resolve common issues that may occur when setting up or maintaining Finout’s Prometheus integrations, in both per-cluster and Centralized Prometheus Monitoring tools setups. It provides guidance on diagnosing data gaps, configuration errors, and connectivity problems so that your Kubernetes cost data is exported and processed correctly. If you need additional assistance, please contact Finout at [email protected].
What should I do if Finout cannot assume the AWS role?
Review the IAM role created for Finout. Verify that its trust policy is correct and that the external ID matches the one provided by Finout.
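The trust policy check can also be scripted. The following is a minimal sketch, not Finout tooling: it assumes you have exported the role's trust policy as JSON, and the principal ARN and external ID shown are placeholder values to replace with the ones Finout gave you.

```python
def _as_list(value):
    """IAM policy fields may hold a single string or a list of strings."""
    return [value] if isinstance(value, str) else list(value)


def trust_policy_allows(policy: dict, principal_arn: str, external_id: str) -> bool:
    """Return True if an Allow statement lets principal_arn call sts:AssumeRole
    under a StringEquals condition on sts:ExternalId matching external_id."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        if "sts:AssumeRole" not in _as_list(stmt.get("Action", [])):
            continue
        principal = stmt.get("Principal", {})
        if not isinstance(principal, dict):
            continue
        if principal_arn not in _as_list(principal.get("AWS", [])):
            continue
        condition = stmt.get("Condition", {}).get("StringEquals", {})
        if external_id in _as_list(condition.get("sts:ExternalId", [])):
            return True
    return False


# Placeholder values for illustration only; use the values provided by Finout.
EXAMPLE_PRINCIPAL = "arn:aws:iam::111122223333:root"
EXAMPLE_EXTERNAL_ID = "example-external-id"

EXAMPLE_TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": EXAMPLE_PRINCIPAL},
            "Action": "sts:AssumeRole",
            "Condition": {"StringEquals": {"sts:ExternalId": EXAMPLE_EXTERNAL_ID}},
        }
    ],
}
```

Calling trust_policy_allows(EXAMPLE_TRUST_POLICY, EXAMPLE_PRINCIPAL, "some-other-id") returns False, which corresponds to the mismatch case described above.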
What should I do if Finout cannot access the provided S3 bucket?
Review the bucket policy against the documentation, and verify the region configured for the Cost Center.
What should I do if Finout cannot read from my S3 bucket?
Ensure that the necessary read permissions for Finout are in place.
If the account that writes the Prometheus files to the S3 bucket differs from the account that created the Finout IAM role, create a new IAM role in the account that owns the S3 bucket and shares the Prometheus files. Then send the new role's ARN to Finout, following the instructions below.
If you already created a trust policy for Finout when setting up your AWS connection, you should use the same trust policy (Read more about how to grant Finout access to your S3 bucket).
Use the following policy for the role:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "tag:GetTagKeys"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:Get*",
        "s3:List*"
      ],
      "Resource": "arn:aws:s3:::<BUCKET_NAME>/*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:Get*",
        "s3:List*"
      ],
      "Resource": "arn:aws:s3:::<BUCKET_NAME>"
    }
  ]
}
What should I do if the CronJob cannot find the Prometheus host?
Verify that the Prometheus host and port are correct (for example, with kubectl get services), then set them as the HOSTNAME and PORT environment variables in the CronJob.
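To confirm that the values are reachable from inside the cluster, a quick TCP check can help. The sketch below is an illustration rather than part of the Finout exporter: it reads the same HOSTNAME and PORT variables named above, and the fallback values ("localhost" and Prometheus's default port 9090) are assumptions.

```python
import os
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers DNS failures, refused connections, and timeouts
        return False


def check_from_env(env=os.environ):
    """Read HOSTNAME and PORT from env and report (host, port, reachable).

    The defaults are assumptions for illustration; set both variables
    explicitly in the CronJob.
    """
    host = env.get("HOSTNAME", "localhost")
    port = int(env.get("PORT", "9090"))
    return host, port, can_connect(host, port)
```

Running check_from_env() in a temporary pod in the same namespace as the CronJob helps separate a DNS or port problem from an exporter problem: a False result means the service name or port needs correcting before the exporter is investigated.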
What steps should I take if my CronJob can't write to the S3 bucket?
Apply the correct policy to the worker node or the pod.
What should I do if memory metrics no longer appear in Finout dashboards?
After updating or adjusting Prometheus configurations, memory metrics may no longer appear in Finout dashboards.
Once you have identified this issue, review the best practices for configuring and reloading Prometheus in the official Prometheus documentation.
Verify the recording rule: Check that the "node_namespace_pod_container:container_memory_working_set_bytes" recording rule is correctly set up in your Prometheus environment.
Add or update the rule: If the rule is absent or improperly configured, add or amend it within your Prometheus configuration settings.
Reload the configuration: Implement the new or modified rule by reloading Prometheus’s configuration.
Access the configuration: Open the Prometheus configuration, either directly through the file or via the UI.
Edit the rules: In the recording rules section, verify the presence of "node_namespace_pod_container:container_memory_working_set_bytes". If it’s not there, insert or correct it.
Apply the changes: Reload or restart Prometheus to apply and activate the rule.
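The verification step can be sketched against the Prometheus HTTP API, which lists loaded rules at /api/v1/rules. This is a minimal sketch, not Finout tooling; the base URL passed to fetch_rules is an assumption that depends on how your Prometheus is exposed.

```python
import json
import urllib.request

RULE_NAME = "node_namespace_pod_container:container_memory_working_set_bytes"


def has_recording_rule(rules_response: dict, rule_name: str = RULE_NAME) -> bool:
    """Return True if the /api/v1/rules response contains a recording rule
    with the given name in any rule group."""
    groups = rules_response.get("data", {}).get("groups", [])
    for group in groups:
        for rule in group.get("rules", []):
            if rule.get("type") == "recording" and rule.get("name") == rule_name:
                return True
    return False


def fetch_rules(base_url: str, timeout: float = 10.0) -> dict:
    """Fetch the loaded rules from a Prometheus server.

    base_url is a placeholder such as "http://localhost:9090".
    """
    with urllib.request.urlopen(f"{base_url}/api/v1/rules", timeout=timeout) as resp:
        return json.load(resp)
```

If has_recording_rule(fetch_rules(base_url)) returns False after you edit the rules, the configuration has probably not been reloaded yet. Prometheus reloads on SIGHUP, or on an HTTP POST to /-/reload when it was started with --web.enable-lifecycle.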
What causes the Finout exporter to time out on large Kubernetes clusters, and how can I resolve it?
When all clusters share the same Prometheus disk space settings, larger clusters can quickly consume disk space due to their higher metric output, limiting data retention to a few hours. This can lead to the Finout exporter timing out, particularly when fetching the previous day's metrics. To address this, adjust the metric retention settings according to each cluster's size and output, ensuring sufficient disk space for longer data retention.
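As a rough sizing aid, the Prometheus storage documentation gives the estimate needed_disk_space = retention_time_seconds * ingested_samples_per_second * bytes_per_sample, with an average of roughly 1 to 2 bytes per sample. The sketch below applies that estimate; the ingestion rate and the 2-byte figure in the example are illustrative assumptions, not measurements from any real cluster.

```python
def estimated_disk_bytes(retention_seconds: float,
                         samples_per_second: float,
                         bytes_per_sample: float = 2.0) -> float:
    """Estimate the disk space needed to keep retention_seconds of data."""
    return retention_seconds * samples_per_second * bytes_per_sample


def max_retention_seconds(disk_bytes: float,
                          samples_per_second: float,
                          bytes_per_sample: float = 2.0) -> float:
    """Estimate how long data can be retained on a disk of disk_bytes."""
    return disk_bytes / (samples_per_second * bytes_per_sample)


# Illustrative numbers: a cluster ingesting 100,000 samples per second
# that should keep two days of data (2 * 86,400 seconds).
two_days = 2 * 86_400
needed = estimated_disk_bytes(two_days, 100_000)  # 34,560,000,000 bytes (about 34.6 GB)
```

Under these assumptions, a cluster with ten times the ingestion rate keeps only a tenth of the history on the same disk, which is why a shared disk setting can leave large clusters with only a few hours of data. In Prometheus, retention is controlled with the --storage.tsdb.retention.time and --storage.tsdb.retention.size flags.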
What should I do if the metrics exporter runs out of memory?
The metrics exporter may encounter out-of-memory (OOM) errors due to insufficient resources, particularly when handling large amounts of data. This can prevent the exporter from functioning properly. Allocating additional resources based on its workload helps resolve the issue. By fine-tuning memory allocation to match the data volume, the exporter can run smoothly without interruptions.
What should I do if the exporter is slow or failing due to high load?
To address performance issues with the exporter, reduce the values for TIME_FRAME and/or BACKFILL_DAYS. Lowering these values decreases the data volume processed, improving speed and reliability.
How can I collect additional metrics that Finout doesn’t include by default?
Use QUERY_<QUERY_NAME> to add a custom query. Enter the desired metric name as the value to start collecting it.
For more information, please see the FAQ section.