Cost Archiver
Overview
Finout offers a straightforward solution for managing long-term cost data storage by allowing you to archive data in an AWS S3 bucket you own. This setup consolidates your company’s cloud spending data from various cost centers into a structured and easily accessible format.
The Archiver feature enhances data management by introducing a partitioned data structure in S3, enabling faster access with reduced processing overhead. The data format is in Parquet, which improves processing efficiency. Additionally, a manifest file is included with each archiver run, providing visibility into data export status. This design supports a scalable system capable of handling large data volumes with minimal delays while maintaining alignment with Finout’s data delivery SLAs.
Cost Archiver Workflow:
Archiver files structure
Cost data is updated once a day and written in the following structure:
Archiver/<archiver_name>/v2/year=<year>/month=<month>/day=<day>/*.snappy.parquet
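As a rough illustration of the layout above, the following Python sketch builds the key prefix for a single day's partition. The function name `partition_prefix` and the `zero_pad` switch are hypothetical; whether month and day are zero-padded is an assumption (the manifest example uses `<mm>` and `<dd>`, which suggests two digits), so check it against the keys in your own bucket.

```python
from datetime import date


def partition_prefix(archiver_name: str, day: date, zero_pad: bool = True) -> str:
    """Build the S3 key prefix for one day's partition.

    zero_pad=True assumes two-digit month/day values (an assumption,
    not confirmed by the documentation).
    """
    if zero_pad:
        month, day_of_month = f"{day.month:02d}", f"{day.day:02d}"
    else:
        month, day_of_month = str(day.month), str(day.day)
    return (
        f"Archiver/{archiver_name}/v2/"
        f"year={day.year}/month={month}/day={day_of_month}/"
    )


# "my-archiver" is a placeholder archiver name.
print(partition_prefix("my-archiver", date(2024, 3, 7)))
```

A prefix built this way can be passed to any S3 listing tool to find the `*.snappy.parquet` files for that day.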
Cost data structure
| Field | Type | Description |
| --- | --- | --- |
| time | Date (YYYY-MM-DDT00:00:00.000Z) | |
| service | String | The type of service the cost is related to, for example, “EC2” (in AWS) or “Compute Engine” (in GCP). |
| region | String | The region the cost is associated with. |
| cost_center_type | String | AWS / GCP / K8s / Snowflake |
| costs | dict<string, double> | All the following costs: amortized_cost, net_amortized_cost, blended_cost, unblended_cost, net_unblended_cost, uncovered_cost, reservation_cost, savings_plan_cost, spot_cost |
| metadata | dict<string, string> | All metadata relevant to this cost line. |
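To make the record shape concrete, here is a minimal Python sketch that sums one cost type per service. The rows are hypothetical, with made-up values, and `total_by_service` is an illustrative helper, not part of Finout. It assumes a row may omit some cost types, and that some Parquet readers return map columns as key/value pairs rather than a dict, which `dict()` normalizes.

```python
from collections import defaultdict

# Hypothetical rows shaped like the fields above; all values are made up.
rows = [
    {
        "time": "2024-03-07T00:00:00.000Z",
        "service": "EC2",
        "region": "us-east-1",
        "cost_center_type": "AWS",
        "costs": {"unblended_cost": 10.0, "amortized_cost": 8.0},
        "metadata": {"team": "platform"},
    },
    {
        "time": "2024-03-07T00:00:00.000Z",
        "service": "EC2",
        "region": "eu-west-1",
        "cost_center_type": "AWS",
        # The same kind of map, delivered as key/value pairs.
        "costs": [("unblended_cost", 2.5)],
        "metadata": {},
    },
    {
        "time": "2024-03-07T00:00:00.000Z",
        "service": "Compute Engine",
        "region": "us-central1",
        "cost_center_type": "GCP",
        "costs": {"amortized_cost": 4.0},
        "metadata": {},
    },
]


def total_by_service(rows, cost_key="unblended_cost"):
    """Sum one entry of the costs map per service."""
    totals = defaultdict(float)
    for row in rows:
        costs = dict(row["costs"])  # accepts a dict or a list of pairs
        # Assumption: a missing or null cost type counts as 0.0.
        totals[row["service"]] += costs.get(cost_key) or 0.0
    return dict(totals)


print(total_by_service(rows))
print(total_by_service(rows, "amortized_cost"))
```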
Manifest file structure
The manifest file provides transparency and lets clients monitor data export consistency. It is maintained under destination-archiver/version=2/manifest.json and documents each archiver run: the date and scope of the exported data, the run ID, and the export timestamp.
{
  "latest_export_time": "<latest_export_time>",
  "exports": [
    {
      "archiver_id": "<archiver_id>",
      "run_id": "<external_exporter_run_id>",
      "export_time": "<Current time in UTC in ISO format>",
      "expored_partitions": ["year=<yyyy>/month=<mm>/day=<dd>"]
    },
    {
      "archiver_id": "<archiver_id>",
      "run_id": "<external_exporter_run_id>",
      "export_time": "<Current time in UTC in ISO format>",
      "expored_partitions": ["year=<yyyy>/month=<mm>/day=<dd>"]
    },
    ...
  ]
}
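One way a consumer might use the manifest is to find which partitions were exported after its last sync, so that only those are re-read. The Python sketch below is a minimal example under several assumptions: the sample manifest values are made up, `partitions_exported_since` is a hypothetical helper, and the partitions key is read as spelled in the example above (`expored_partitions`), with `exported_partitions` accepted as a fallback in case the real key is spelled that way.

```python
import json
from datetime import datetime, timezone


def parse_time(value: str) -> datetime:
    """Parse an ISO timestamp, treating a missing offset as UTC."""
    # Older Python versions reject a trailing "Z", so normalize it first.
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def partitions_exported_since(manifest: dict, since: datetime) -> list[str]:
    """Return the sorted, de-duplicated partitions exported after `since`."""
    changed = set()
    for export in manifest.get("exports", []):
        if parse_time(export["export_time"]) > since:
            # Key spelling is an assumption; try both variants.
            partitions = (
                export.get("expored_partitions")
                or export.get("exported_partitions")
                or []
            )
            changed.update(partitions)
    return sorted(changed)


# Hypothetical manifest content; latest_export_time is omitted because
# its exact format is not shown in the documentation.
manifest = json.loads("""
{
  "exports": [
    {"archiver_id": "a1", "run_id": "r1",
     "export_time": "2024-03-08T02:00:00+00:00",
     "expored_partitions": ["year=2024/month=03/day=06",
                            "year=2024/month=03/day=07"]},
    {"archiver_id": "a1", "run_id": "r2",
     "export_time": "2024-03-09T02:00:00+00:00",
     "expored_partitions": ["year=2024/month=03/day=07",
                            "year=2024/month=03/day=08"]}
  ]
}
""")

last_sync = datetime(2024, 3, 8, 12, 0, tzinfo=timezone.utc)
print(partitions_exported_since(manifest, last_sync))
```

Because past partitions can be re-exported, comparing on export_time rather than on the partition date is what picks up late adjustments.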
Setting Up Cost Archiver
Set up Cost Archiver in Finout to efficiently store and manage your organization’s cost data in an AWS S3 bucket, enabling structured, scalable, and easily accessible long-term storage.
1. Create a New Bucket
To output cost data, Finout requires write and delete permissions on an S3 bucket. Create a new bucket, or a new folder within an existing S3 bucket, to use for your Archiver.
2. Obtain an External ID from Finout
Get the External ID that you will use to grant Finout access to your Archiver bucket.
In Finout, navigate to Settings > Cost Centers > Add cost center. The Connect Accounts page appears.
Click Connect Now.
Copy the External ID and save it. You will need it to grant Finout access to your Archiver bucket (step 3).
3. Grant Finout Access to your Archiver Bucket
Once the Archiver bucket is created, grant Finout access to the bucket by creating an IAM role. This can be done manually or by using CloudFormation.
Prerequisite: Obtain an External ID from Finout. (Step 2)
To grant access using CloudFormation:
Enter the required information, including the “external-id” (obtained in step 2) and the name of the bucket created for your Archiver (step 1).
Click Next and then Submit. You are brought to the Stack details page.
Click Outputs and copy the IAM role ARN value to add in Finout (step 4).
To grant access manually:
Navigate to the IAM page for creating a new cross-account role, sign in to your AWS account, and then create a role for another AWS account.
In the account ID, enter: 277411487094.
Paste the “external-id” (obtained in step 2) into the External ID field. Click Next. The Review step appears.
Add a Role name: FinoutArchiverRole, and then configure the role. A new role is created.
Go to your new role in Summary.
Copy and save the Role ARN.
Click Add permissions and choose Create inline policy. Choose the JSON format and paste the following JSON, replacing finout-data-archiver with the name of your Archiver bucket:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:PutObjectAcl",
        "s3:ListMultipartUploadParts",
        "s3:ListBucket",
        "s3:DeleteObject"
      ],
      "Resource": [
        "arn:aws:s3:::finout-data-archiver/*",
        "arn:aws:s3:::finout-data-archiver"
      ]
    }
  ]
}
Click Next until the Review step, and name the policy finout-access-policy.
Click Create policy. Your IAM role is finalized and created.
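The manual steps above produce two JSON documents: a trust policy on the role and the inline permissions policy. The Python sketch below generates both for your own values. The permissions policy mirrors the JSON shown in the steps, with the bucket name as a parameter. The trust policy follows the standard shape that AWS uses for a cross-account role with an external ID; it is an assumption that the role created through the console matches it exactly. The function names and the sample values are hypothetical.

```python
import json

# Account ID given in the manual steps above.
FINOUT_ACCOUNT_ID = "277411487094"


def trust_policy(external_id: str) -> dict:
    """Trust policy letting the Finout account assume the role,
    but only when it presents the expected external ID."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{FINOUT_ACCOUNT_ID}:root"},
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


def archiver_permissions_policy(bucket: str) -> dict:
    """Inline policy from the steps above, scoped to the given bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:PutObjectAcl",
                    "s3:ListMultipartUploadParts",
                    "s3:ListBucket",
                    "s3:DeleteObject",
                ],
                "Resource": [
                    f"arn:aws:s3:::{bucket}/*",  # objects in the bucket
                    f"arn:aws:s3:::{bucket}",    # the bucket itself
                ],
            }
        ],
    }


# Placeholder values; substitute your own external ID and bucket name.
print(json.dumps(trust_policy("example-external-id"), indent=2))
print(json.dumps(archiver_permissions_policy("my-archiver-bucket"), indent=2))
```

Generating the documents from parameters avoids leaving the example bucket name in the Resource list by mistake.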
4. Add AWS Bucket Details to Finout
After creating your bucket in AWS and granting Finout access, send the following bucket details to Finout support.
Role ARN - The Amazon Resource Name (ARN) of the IAM role created in step 3.
Bucket Name - The name of the bucket to which Finout will write archiver data.
S3 Path Prefix - The folder in S3 in which the archiver files are written.
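Before sending these details, a quick local sanity check can catch copy-paste mistakes. The Python sketch below is illustrative only: `check_archiver_details` is a hypothetical helper, the patterns reflect the general AWS formats for IAM role ARNs in the standard `aws` partition and for bucket names, and the rule against a leading slash in the prefix is a convention assumed here, not a Finout requirement.

```python
import re

# IAM role ARN in the standard "aws" partition: 12-digit account ID,
# then "role/" followed by an optional path and the role name.
ROLE_ARN_RE = re.compile(r"^arn:aws:iam::\d{12}:role/[\w+=,.@/-]+$", re.ASCII)

# General-purpose bucket names: 3-63 characters, lowercase letters,
# digits, dots and hyphens, starting and ending with a letter or digit.
BUCKET_RE = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")


def check_archiver_details(role_arn: str, bucket_name: str, path_prefix: str) -> list[str]:
    """Return a list of human-readable problems; empty means all checks passed."""
    problems = []
    if not ROLE_ARN_RE.match(role_arn):
        problems.append("Role ARN does not look like an IAM role ARN")
    if not BUCKET_RE.match(bucket_name):
        problems.append("Bucket Name is not a valid S3 bucket name")
    # Assumed convention: prefixes are written without a leading slash.
    if path_prefix.startswith("/"):
        problems.append("S3 Path Prefix should not start with '/'")
    return problems


# Placeholder values for illustration.
print(check_archiver_details(
    "arn:aws:iam::123456789012:role/FinoutArchiverRole",
    "my-archiver-bucket",
    "cost-archive/",
))
```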
FAQs
How does Finout ensure data accuracy and reliability with the manifest file?
Finout’s automated systems monitor exports and reprocess any that fail, so no manual intervention is needed. The manifest file adds visibility into the export process, but clients are not required to check it manually; Finout’s validation mechanisms keep the exported data consistent and accurate.
How does Cost Archiver handle data updates over time?
Cost Archiver is designed to handle retrospective data updates, which are common with sources like AWS CUR that may reflect changes for several months. It ensures that past data is updated to include late adjustments, with the manifest file clearly indicating which dates have been modified. Finout’s automation ensures data completeness and reliability throughout this process.
What mechanisms are in place for maintaining data consistency with partitioned files?
Cost Archiver performs automatic checks to identify and update partitioned files whenever source data changes. Finout updates relevant partitions and logs these updates in the manifest file. This approach guarantees consistency and integrity across the partitioned data structure.