Relational Virtual Tags
Last updated
A Relational Virtual Tag breaks down shared infrastructure costs across multiple dimensions from a single telemetry while preserving the relationships between those dimensions.
Imagine your company manages its cloud costs through Finout’s MegaBill. The company uses shared infrastructure such as Airflow, which runs on shared cloud instances and executes tasks that serve multiple teams, applications, and workflows. Because resources like EC2 instances are not tied to a single team or workflow, allocating their costs accurately is challenging.
With Relational Virtual Tags, you can establish connections between multiple attributes—such as Airflow's DAG ID, Team, and Environment—ensuring that costs are properly allocated while maintaining the relationships between these attributes. This advanced tagging method provides deeper insights into shared infrastructure usage and cost allocation.
Meet the Miller family. They track their household expenses and bills meticulously, and electricity usage is no exception. Mike relies on Finout’s MegaBill to monitor costs and consumption, which provides him with the family’s total electricity bill and usage.
One day, he noticed that their electricity costs were steadily increasing, even though their consumption habits hadn’t noticeably changed. Curious to uncover the cause, Mike began digging deeper into the data.
Electricity Usage per Room
Mike set up a telemetry and used Virtual Tag Reallocation to break down electricity usage by room. The results were revealing: the kitchen was the biggest consumer of electricity.
Electricity Usage per Device in the Kitchen
This information alone was not enough to pinpoint the root cause of the high costs, so Mike analyzed the electricity consumption of each kitchen appliance. His findings showed that the dishwasher was the biggest consumer of electricity.
Electricity Usage for the Dishwasher per Household Member
This was still not enough to pinpoint what caused the spike in the dishwasher's electricity cost. Mike installed a motion sensor to monitor the household members’ interactions with the dishwasher and integrated this new data into his analysis. The results were surprising: Olivia, his daughter, was responsible for the highest dishwasher usage.
This family's story reflects the challenges companies face when trying to break down shared cloud infrastructure costs across multiple dimensions. It highlights how deeper data analysis and segmentation can reveal hidden cost drivers.
What is the Challenge with Virtual Tag Reallocation?
Complex and Manual Process: Breaking down costs across multiple dimensions is possible today, but it requires a lot of manual effort. Users must create multiple virtual tags, starting from the lowest level (Household Member → Device → Room).
Virtual Tag Reallocation cannot preserve relationships between dimensions:
You can determine electricity usage by device.
You can determine electricity usage by household member.
But you can’t determine the electricity usage of each device by each household member. For example, assume that the dishwasher's electricity cost is $40. Breaking it down by household member would assign arbitrary costs, because regular virtual tags cannot preserve the relationship between the device (the dishwasher) and each household member's usage of it. This makes it difficult to identify the precise source of a cost increase.
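The limitation above can be illustrated with a minimal Python sketch. It is not Finout's implementation: the usage records, the helper `shares`, and the choice to model a standalone tag as "each member's overall share of usage" are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical usage records: (device, household member, kWh).
# These numbers are invented for illustration only.
usage = [
    ("dishwasher", "Olivia", 30.0),
    ("dishwasher", "Mike", 10.0),
    ("oven", "Olivia", 5.0),
    ("oven", "Mike", 55.0),
]

def shares(records, index):
    """Share of total usage per value of one dimension (0 = device, 1 = member)."""
    totals = defaultdict(float)
    for record in records:
        totals[record[index]] += record[2]
    grand_total = sum(totals.values())
    return {key: value / grand_total for key, value in totals.items()}

dishwasher_cost = 40.0

# Assumed model of standalone tags: the per-member tag only knows each
# member's overall share, so the dishwasher's $40 is split by that share.
member_share = shares(usage, 1)
independent = {m: dishwasher_cost * s for m, s in member_share.items()}

# Relationship preserved: split by each member's usage of the dishwasher only.
dishwasher_rows = [r for r in usage if r[0] == "dishwasher"]
relational = {m: dishwasher_cost * s for m, s in shares(dishwasher_rows, 1).items()}

print(independent)
print(relational)
```

With these invented numbers, the standalone split assigns Olivia about $14 of the dishwasher's $40, while the relationship-aware split assigns her $30, which matches her actual dishwasher usage.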
Imagine a company that runs on AWS and manages its cloud costs through Finout’s MegaBill. This company relies on shared infrastructure, such as Airflow, which operates on cloud instances (like EC2 in AWS) to manage tasks, run DAGs, and store metadata.
Airflow, like Mike’s electricity bill, is a shared resource that requires careful allocation across multiple workflows (DAGs), teams, and developers. The challenge lies in accurately breaking down and reallocating these shared infrastructure costs.
One day, the company noticed an increase in Airflow costs and needed to pinpoint the source, just as Mike used data to understand the drivers of his electricity usage.
This is where the Relational Virtual Tag solution comes in!
Relational Virtual Tags Enable:
Streamlined Setup and Efficiency: Relational Virtual Tags replace the manual creation of multiple virtual tags with a single, unified configuration that uses one telemetry to build everything needed to break down a shared infrastructure cost.
Dynamic Cost Allocation and Relationship Preservation: Relational Virtual Tags maintain relationships between dimensions (e.g., team, workflow, developer) and enable dynamic, proportional cost reallocation. This prevents inaccurate cost assignments, improves financial transparency, and provides deeper insights for more informed decision-making and better cost management.
For example, the company has two teams: "Data" and "App".
The company wants to determine the cost of each DAG for the App team, so it breaks down the App team's total cost by DAG ID for better visibility and allocation.
This is the company's telemetry:
The company created a relational virtual tag, and now they filter by team “App” and group by “DAG_ID.”
Let’s assume that the App team cost is $100. The breakdown would be as follows:
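The filter and group-by described above can be sketched as a proportional split. This is a minimal sketch, not Finout's implementation: the telemetry rows, the field names (`team`, `dag_id`, `value`), and the helper `allocate` are hypothetical, and the resulting figures are not the ones from the original example.

```python
from collections import defaultdict

# Hypothetical telemetry rows; "value" stands for whatever usage metric
# the telemetry reports (for example, task run time). Invented data.
telemetry = [
    {"team": "App", "dag_id": "dag_1", "value": 50.0},
    {"team": "App", "dag_id": "dag_2", "value": 30.0},
    {"team": "App", "dag_id": "dag_3", "value": 20.0},
    {"team": "Data", "dag_id": "dag_1", "value": 80.0},
    {"team": "Data", "dag_id": "dag_4", "value": 20.0},
]

def allocate(rows, cost, filters, group_by):
    """Split `cost` across `group_by` values, using only rows that match `filters`."""
    matching = [r for r in rows if all(r[k] == v for k, v in filters.items())]
    totals = defaultdict(float)
    for row in matching:
        totals[row[group_by]] += row["value"]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}  # nothing to allocate against
    return {key: cost * value / grand_total for key, value in totals.items()}

# Filter by team "App", group by DAG ID, and distribute the team's $100.
app_by_dag = allocate(telemetry, 100.0, {"team": "App"}, "dag_id")
print(app_by_dag)
```

Because the split uses only the App team's rows, the Data team's usage of `dag_1` does not affect the App team's share for that DAG.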
By leveraging Relational Virtual Tags, the company transformed a seemingly unmanageable rise in shared infrastructure costs into clear, actionable data. This deeper level of visibility empowered the company to identify inefficiencies, distribute costs fairly across teams, and optimize their use of shared resources—just as Mike did with his electricity bill.
Spark costs are often lumped together in shared infrastructure, making it difficult to attribute expenses to specific queries or projects. Without proper tracking, teams struggle to pinpoint cost drivers. Relational Virtual Tags solve this by linking queries to projects, enabling precise cost reallocation.
RDS instances are shared across multiple databases, environments, and teams, making it hard to distribute costs accurately. Without visibility, production, staging, and development usage may be misallocated. Relational Virtual Tags create structured relationships, ensuring fair cost distribution across Database Names, Environments, and Teams.
ClickHouse workloads are distributed across multiple query types, clusters, and applications, making it challenging to track and allocate costs accurately. Without a structured breakdown, teams struggle to optimize resource usage and manage expenses efficiently. Relational Virtual Tags preserve the relationships between these dimensions, ensuring accurate cost allocation across Query Type, Cluster, and Application.
Prerequisite: A telemetry must be set up before Relational Virtual Tag configuration.
To create a relational virtual tag:
In Finout, navigate to Virtual Tags.
Click Create New and then Relational Virtual Tag. The Create Relational Virtual Tag step appears.
Add a name.
Under Cost Filter Selection, select from the following:
Key type:
Select your Telemetry-Based Reallocation.
Pick the Allocation Keys: Choose the Allocation Keys from your selected telemetry. The telemetry data is segmented by these keys, and the allocation ratio is calculated from each combination of keys and values.
Select Telemetry filters:
Preview the Relational Virtual Tag. You can simulate the graph preview by choosing a group-by and filters.
Click Save. The relational virtual tag is created.
Result: You can view your Relational Virtual Tag keys and values in the MegaBill filters component, just like any other keys and values.
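The ratio calculation mentioned in the Allocation Keys step can be pictured as one ratio per combination of key values. The sketch below is only an illustration under stated assumptions: the function `allocation_ratios`, the row layout, and the field name `value` are hypothetical and do not describe Finout's internals. The limit of 10 keys is the one stated in the FAQ on this page.

```python
from collections import defaultdict

MAX_ALLOCATION_KEYS = 10  # limit stated in this page's FAQ

def allocation_ratios(rows, allocation_keys, value_field="value"):
    """Return one ratio per combination of allocation-key values."""
    if not 1 <= len(allocation_keys) <= MAX_ALLOCATION_KEYS:
        raise ValueError(f"expected 1 to {MAX_ALLOCATION_KEYS} allocation keys")
    totals = defaultdict(float)
    for row in rows:
        # Keeping the values together in one tuple is what preserves
        # the relationship between the dimensions.
        combo = tuple(row[key] for key in allocation_keys)
        totals[combo] += row[value_field]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {combo: value / grand_total for combo, value in totals.items()}

# Hypothetical telemetry rows, invented for illustration.
rows = [
    {"team": "App", "dag_id": "dag_1", "environment": "prod", "value": 6.0},
    {"team": "App", "dag_id": "dag_2", "environment": "dev", "value": 2.0},
    {"team": "Data", "dag_id": "dag_1", "environment": "prod", "value": 2.0},
]
ratios = allocation_ratios(rows, ["team", "dag_id", "environment"])
print(ratios)
```

Any filter or group-by on one of the keys can then be answered by summing the ratios of the matching combinations, which is why the keys stay consistent with each other.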
When using Relational Virtual Tag keys or values in MegaBill or any Finout object, you can only filter or group by keys and values from the same Relational Virtual Tag or a connected Virtual Tag.
A Relational Virtual Tag can handle up to 50,000 unique values, with any additional values automatically grouped under “Others.”
You can create an unlimited number of relational virtual tags.
Each relational virtual tag can break down the cost of a single infrastructure (single cost rule).
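The 50,000-value cap with an “Others” bucket can be sketched as follows. This page does not say which values are kept when the cap is exceeded, so the rule used here (keep the highest-cost values) is an assumption, and `cap_unique_values` is a hypothetical helper rather than a Finout function.

```python
def cap_unique_values(cost_by_value, limit=50_000, overflow_label="Others"):
    """Keep at most `limit` values; fold the remainder into one overflow bucket."""
    if len(cost_by_value) <= limit:
        return dict(cost_by_value)
    # Assumption: the highest-cost values are the ones that are kept.
    ranked = sorted(cost_by_value.items(), key=lambda item: item[1], reverse=True)
    capped = dict(ranked[:limit])
    capped[overflow_label] = sum(cost for _, cost in ranked[limit:])
    return capped

# Small limit used only to keep the demonstration readable.
demo = cap_unique_values({"a": 5.0, "b": 3.0, "c": 1.0, "d": 1.0}, limit=2)
print(demo)
```

Whatever rule decides which values are kept, the total cost is unchanged, because the overflow bucket carries the sum of everything that was folded into it.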
How is a Relational Virtual Tag different from a Virtual Tag with telemetry-based reallocation?
Unlike Virtual Tag Reallocation, which allocates costs based on a single key from a single telemetry, Relational Virtual Tags support multiple key allocations (up to 10) while preserving the relationships between them.
What happens if I try to group by a Relational Virtual Tag and an unrelated tag?
This is not supported. You cannot group or filter by a Relational Virtual Tag together with an unrelated tag (e.g., Airflow DAG ID with an AWS Region).
Can I create a Relational Virtual Tag via API?
No, Relational Virtual Tags cannot be created through the Finout API.
Is there a limit on the number of unique values a Relational Virtual Tag can handle?
Yes. The system supports up to 50,000 unique values, with any additional values grouped under “Others.”
Can I change a Virtual Tag and convert it to a Relational Virtual Tag?
No, you need to create a new Relational Virtual Tag.
Cloud service:
Operator:
Values:
Note: Click to add another filter.
Select a Telemetry:
Let's go back to the . Here's how the Relational Virtual Tag configuration would be set up: