Data Explorer

Data Explorer Overview

Data Explorer is a data analysis tool for building multi-dimensional reports. Reports aggregate data across the fields you choose, so you can tailor each report to the exact cost measurements and dimensions you need.

Data Explorer aggregates complex data by the cost measurements and dimensions you select, at a daily, weekly, or monthly interval. This granularity gives you a more accurate understanding of your costs and supports better-informed decisions.

Use cases

Typical use cases include:

- Assess resource IDs in depth, including aspects such as usage types and product families, and filter the results by specific services.
- Track and analyze the costs of sub-services, such as S3 storage and NAT gateways, with usage measured in hours and gigabytes.
- Identify untagged data to guide the tagging process, which improves data management and categorization within an organization.

Measurements vs. Dimensions

Measurements: Numerical values that quantify a particular aspect of the analyzed data, such as cost types and usage.

Dimensions: Categories or classifications that organize and group related measurements, such as the service type, region, or any tags. Dimensions give context to the measurements and determine the level of detail in the view.
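
The relationship between the two can be sketched in a few lines of Python. This is only an illustration: the sample rows, the field names (service, region, cost), and the aggregate helper are hypothetical and do not reflect Finout's data model or API. The sketch sums one measurement for each unique combination of dimension values, which is why adding a dimension increases the level of detail.

```python
from collections import defaultdict

# Hypothetical sample rows; the field names are illustrative only.
rows = [
    {"service": "S3", "region": "us-east-1", "cost": 12.5},
    {"service": "S3", "region": "eu-west-1", "cost": 7.5},
    {"service": "EC2", "region": "us-east-1", "cost": 40.0},
]


def aggregate(rows, dimensions, measurement):
    """Sum one measurement for each unique combination of dimension values."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measurement]
    return dict(totals)


# One dimension: one output row per service.
by_service = aggregate(rows, ["service"], "cost")

# Two dimensions: one output row per (service, region) pair, so more detail.
by_service_region = aggregate(rows, ["service", "region"], "cost")
```

With one dimension the three sample rows collapse into two groups; with two dimensions each row stays in its own group.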

Creating a New Data Explorer

Navigate to Data Explorer in the navigation bar on the left side of the console.
Click New Data Explorer. The Create new explorer side window appears.
Name: Provide a name for your report.
Description (Optional): Include a description for your report.
Time Frame and Interval: Define the relevant dates for your report data and the interval (day, week, or month).
Select filters (Optional): Filter the new Data Explorer using the MegaBill components, such as services, cost centers, or Virtual Tags.
Dimensions (Optional): Add dimensions to a new Data Explorer by using the MegaBill components, such as services, cost centers, or Virtual Tags.
Select measurements (Optional): Define measurements:
- Cost/Usage Data: Add the cost and usage type, and choose an aggregation: Sum, Average, Minimum, or Maximum. To add another measurement, click the add icon.
- Dimensional Analysis (Optional): Count the number of unique values within a specific dimension, or count the total number of entries or occurrences within a dimension, including duplicates. For a full explanation, see the Dimension Analysis section. To add another dimension, click the add icon.
- Finout predefined queries: Prebuilt, standardized queries that address common data and reporting needs, so you can reach specific data sets or insights without writing complex queries from scratch. For a full explanation, see the Finout Predefined Queries section.
- Additional Billing Metrics (Optional): Select additional billing metrics from the CUR that provide more information on Savings Plans and reservations. See the AWS documentation for more details on Savings Plans.
Columns management (Optional): Reorder or rename selected fields as desired. To move a measurement up or down, drag the dots beside it.
Order by: Set the order of the report based on selected measurements.
Click Save to generate the report. Once saved, the report will be displayed with all the selected data.

Working with Data Explorer

Data Explorer List

Navigate to Data Explorer.
Edit, duplicate, or delete a Data Explorer analysis.

Data Explorer Analysis

Navigate to Data Explorer.
Choose a Data Explorer analysis. The Data Explorer appears.
Optionally, manage the Data Explorer with any of the following actions:
- Download CSV: Select one of the following two options and click Export.
  - Top Results Export: Instantly export a CSV file with the top 10,000 rows.
  - Full Data Export (Coming Soon): Generate and send a detailed report to your endpoint with up to 4 full months of data.
    Note: It may take a full minute to generate a report.
    Choose an Endpoint. See Endpoints for more information.
    Time Frame: The default time frame is based on your report, but you can adjust it to any time range you prefer.
- Delete Data Explorer: Delete the data explorer.
- Duplicate Data Explorer: Duplicate the data explorer.
- Edit Data Explorer: Make the necessary edits and click Save. For more information, see Creating a New Data Explorer.

Dimension Analysis

In Dimension Analysis, two key concepts are used to gain insights into your data:

Count Distinct Dimensions: Counts the number of unique values within a specific dimension.
- Use case 1, analyzing user tags: Counting distinct dimensions shows how many unique tags exist across all resources, which helps you understand the diversity and categorization within that dimension.
- Use case 2, determining the distinct number of resources for each service per day: Identify the various values for each user tag, and check the number of instance types for EC2/RDS or the number of regions per account.
Count Dimensions: Counts the total number of entries or occurrences within a dimension, including duplicates.
Note: This feature is supported for every dimension in Data Explorer and is available for all accounts.
To add dimensional analysis:

Select one of the following:
- Count Distinct Dimensions: Count only the values that are unique from one another, based on what was selected above.
- Count Dimensions: Count all the values that have been selected, including duplicates.
  Note: If you select the same dimension that is already used in the report, you receive only one result per value.
Click Select Dimension. The Dimension window appears.
Mark the dimensions that you want to use for the analysis and click Apply Group By.
Click the add icon to add another dimensional analysis.
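
The difference between the two counts can be illustrated with a short Python sketch. The rows, field names, and helper functions below are hypothetical and are not part of Finout; they only mirror the definitions given above (total occurrences versus unique values), grouped by a report dimension.

```python
# Hypothetical rows; field names are illustrative only.
rows = [
    {"service": "EC2", "instance_type": "m5.large"},
    {"service": "EC2", "instance_type": "m5.large"},
    {"service": "EC2", "instance_type": "c5.xlarge"},
    {"service": "RDS", "instance_type": "db.t3.medium"},
]


def count_dimension(rows, group_by, dimension):
    """Count every occurrence of the dimension per group, duplicates included."""
    counts = {}
    for row in rows:
        if row.get(dimension) is not None:
            counts[row[group_by]] = counts.get(row[group_by], 0) + 1
    return counts


def count_distinct_dimension(rows, group_by, dimension):
    """Count only the unique values of the dimension per group."""
    seen = {}
    for row in rows:
        if row.get(dimension) is not None:
            seen.setdefault(row[group_by], set()).add(row[dimension])
    return {group: len(values) for group, values in seen.items()}


total_counts = count_dimension(rows, "service", "instance_type")
distinct_counts = count_distinct_dimension(rows, "service", "instance_type")
```

In this sample, EC2 has three entries but only two distinct instance types, so the two counts differ for EC2 and agree for RDS.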

Finout Predefined Queries

Predefined queries are prebuilt, standardized queries designed to streamline data retrieval and analysis. Created to address common data needs or reporting requirements, they allow for quick access to specific data sets or insights without the need to craft complex queries from scratch.

Resource Normalized Runtime

The Resource Normalized Runtime metric measures the total operational time of resources within a specified timeframe, normalized against the total hours in that period. It is calculated by dividing the total running hours of all resources by the total number of hours in the timeframe.

Note: When grouping by date, each timeframe is set to 24 hours.

Formula:

Resource Normalized Runtime = Total Running Hours / Total Hours in Timeframe
Numerator (Total Running Hours): The cumulative sum of hours all queried resources were active during the specified timeframe.
Denominator (Total Hours in Timeframe): The total number of hours within the specified timeframe. When grouping by date, this is set to 24 hours.
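
As a minimal sketch, the formula can be written as a small Python function. The function name and parameters are assumptions made for illustration and are not a Finout API. The inputs reuse figures from the scenarios in this section: 79 cumulative running hours in a single day, and 120 running hours over a 10-day timeframe.

```python
def resource_normalized_runtime(running_hours, timeframe_hours=24.0):
    """Total running hours of all resources divided by the hours in the timeframe.

    running_hours: iterable of running hours, one value per resource
                   (or a single cumulative value).
    timeframe_hours: defaults to 24, the value used when grouping by date.
    """
    if timeframe_hours <= 0:
        raise ValueError("timeframe_hours must be positive")
    return sum(running_hours) / timeframe_hours


# Grouped by date: the denominator is 24 hours.
daily = resource_normalized_runtime([79])

# Aggregated over 10 days: the denominator is 10 x 24 = 240 hours.
multi_day = resource_normalized_runtime([120], timeframe_hours=10 * 24)
```

Rounded to two decimals, daily is 3.29 and multi_day is 0.50, matching the worked calculations for those scenarios.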

Example Calculations:

Scenario 1: 5 Resources Over 3 Days

Date       | Total Running Hours | Total Hours in Timeframe | Resource Normalized Runtime
2024-07-28 | 79                  | 24                       | 3.29
2024-07-29 |                     | 24                       | 3.54
2024-07-30 |                     | 24                       | 3.08

2024-07-28:

Total Running Hours: 79 hours (cumulative for all 5 resources)

Total Hours in Timeframe: 24 hours

Resource Normalized Runtime: 79 / 24 = 3.29

Scenario 2: 3 Resources Over 2 Days with Resource IDs

Date       | Resource ID | Total Running Hours | Total Hours in Timeframe | Resource Normalized Runtime
2024-07-28 | i-abc123    | 30                  | 24                       | 1.25
2024-07-28 | i-def456    |                     | 24                       | 1.67
2024-07-29 | i-ghi789    |                     | 24                       | 1.46

i-abc123:

Total Running Hours: 30 hours

Total Hours in Timeframe: 24 hours

Resource Normalized Runtime: 30 / 24 = 1.25

Scenario 3: 4 Resources Over 1 Day

Date       | Total Running Hours | Total Hours in Timeframe | Resource Normalized Runtime
2024-08-01 | 24                  | 24                       | 1.00

2024-08-01:

Total Running Hours: 24 hours (cumulative for all resources running the entire day)

Total Hours in Timeframe: 24 hours

Resource Normalized Runtime: 24 / 24 = 1.00

Scenario 4: Aggregated Resources Over Multiple Days Without Date Column

Resource ID | Total Running Hours | Total Hours in Timeframe | Resource Normalized Runtime
i-abc123    | 120                 | 240                      | 0.50
i-def456    | 200                 | 240                      | 0.83
i-ghi789    | 160                 | 240                      | 0.67
i-jkl012    |                     | 240                      | 0.33

i-abc123:

Total Running Hours: 120 hours

Total Hours in Timeframe: 240 hours (10 days x 24 hours/day)

Resource Normalized Runtime: 120 / 240 = 0.50

Scenario 5: Aggregated Total Running Hours Without Date or Resource ID

Total Running Hours | Total Hours in Timeframe | Resource Normalized Runtime
2100                | 600                      | 3.50
2400                | 600                      | 4.00
3000                | 600                      | 5.00
1800                | 600                      | 3.00

Entry 1:

Total Running Hours: 2100 hours

Total Hours in Timeframe: 600 hours

Resource Normalized Runtime: 2100 / 600 = 3.50

Resource Normalized Runtime vCPU

The Resource Normalized Runtime vCPU metric calculates the total normalized vCPU runtime used by resources over a specified timeframe. It is determined by multiplying the running hours of each resource by its vCPU count, summing the results, and dividing by the total number of hours in the timeframe.

Note: When grouping by date, each time frame is set to 24 hours.

Formula:

Resource Normalized Runtime vCPU = Σ (Running Hours x vCPU) / Total Hours in Timeframe
Numerator: The cumulative sum of the product of running hours and vCPU count for each resource.
Denominator (Total Hours in Timeframe): The total number of hours within the specified timeframe. When grouping by date, this is set to 24 hours.
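
The same formula as a minimal Python sketch. The function name and the (running hours, vCPU count) pair layout are assumptions chosen for illustration, not a Finout API. The inputs reuse the figures from the worked calculations in this section.

```python
def normalized_runtime_vcpu(resources, timeframe_hours=24.0):
    """Sum of (running hours x vCPU count) divided by the hours in the timeframe.

    resources: iterable of (running_hours, vcpu_count) pairs, one per resource.
    timeframe_hours: defaults to 24, the value used when grouping by date.
    """
    if timeframe_hours <= 0:
        raise ValueError("timeframe_hours must be positive")
    weighted_hours = sum(hours * vcpu for hours, vcpu in resources)
    return weighted_hours / timeframe_hours


# Grouped by date (24-hour denominator): (5 x 24) / 24 and (8 x 18) / 24.
full_day = normalized_runtime_vcpu([(24, 5)])
partial_day = normalized_runtime_vcpu([(18, 8)])

# No date grouping, 600-hour timeframe: (8 x 200) / 600.
aggregated = normalized_runtime_vcpu([(200, 8)], timeframe_hours=600)
```

Passing several pairs at once sums their weighted hours before dividing, which corresponds to grouping multiple resources into one row.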

Example Calculations:

Scenario 1: Multiple Resources with vCPUs Grouped by Day

Date       | Resource ID | vCPU | Running Hours | Normalized Runtime vCPU
2024-07-28 | i-12345678  | 5    | 24            | 5.00
2024-07-28 | i-12345677  |      |               | 3.00
2024-07-28 | i-12345676  |      |               | 2.00
2024-07-28 | i-12345675  |      |               | 1.42
2024-07-28 | i-12345674  |      |               | 13.33

i-12345678:

Calculation: (5 x 24) / 24 = 5.00

The resource consumed 5.00 vCPU days.

Scenario 2: Mixed vCPU Resources Over 2 Days

Date       | Resource ID | vCPU | Running Hours | Normalized Runtime vCPU
2024-07-28 | i-98765432  | 8    | 18            | 6.00
2024-07-28 | i-87654321  |      |               | 1.67
2024-07-29 | i-76543210  |      |               | 0.42

i-98765432:

Calculation: (8 x 18) / 24 = 6.00

The resource consumed 6.00 vCPU days.

Scenario 3: Aggregated vCPU Resources Over Multiple Days

Date       | Resource ID | vCPU | Running Hours | Normalized Runtime vCPU
2024-07-30 | i-55555555  | 10   | 40            | 16.67
2024-07-30 | i-44444444  |      |               | 5.00
2024-07-31 | i-33333333  |      |               | 1.88

i-55555555:

Calculation: (10 x 40) / 24 = 16.67

The resource consumed 16.67 vCPU days.

Scenario 4: vCPU Resources Without Date or Resource ID

vCPU | Running Hours | Total Hours in Timeframe | Normalized Runtime vCPU
8    | 200           | 600                      | 2.67
     | 150           | 600                      | 1.00
     | 240           | 600                      | 2.40

Entry 1:

Calculation: (8 x 200) / 600 = 2.67

The resources consumed 2.67 vCPU days in the timeframe.

EBS Running Hours

This metric calculates the total running hours of EBS resources within a specified timeframe and tracks the cumulative hours that they have been active.

Formula:

EBS Running Hours = Sum of Running Hours for EBS Resources in the selected timeframe
Total Running Hours: The cumulative sum of hours that individual EBS resources were active within the selected timeframe.
- Per resource: The total hours for each resource are summed based on the timeframe and aggregation type.
- For grouped resources: When multiple resources are grouped, their running hours are summed to give the total running hours.
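
The summation can be sketched in Python as follows. The function, the record layout, and the resource ID vol-example are hypothetical and used only for illustration; the hour values reuse the single-resource, two-day example from this section (22 hours and 18 hours).

```python
from collections import defaultdict


def ebs_running_hours(records, group_by_day=False):
    """Sum running hours for EBS resources.

    records: iterable of (date, resource_id, running_hours) tuples.
    Returns one total, or a per-day dict of totals when group_by_day is True.
    """
    if not group_by_day:
        return sum(hours for _, _, hours in records)
    per_day = defaultdict(float)
    for date, _, hours in records:
        per_day[date] += hours
    return dict(per_day)


# "vol-example" is a placeholder resource ID.
records = [
    ("2024-08-01", "vol-example", 22),
    ("2024-08-02", "vol-example", 18),
]

total_hours = ebs_running_hours(records)
hours_per_day = ebs_running_hours(records, group_by_day=True)
```

When records for several resources share a date, their hours land in the same per-day total, which matches the grouped-resources rule above.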

Example Calculations:

Scenario 1: Single Resource Over 2 Days, Group by Day

Date       | Total Running Hours
2024-08-01 | 22
2024-08-02 | 18

For 2 days:

Total Running Hours for August 1: 22 hours
Total Running Hours for August 2: 18 hours

The total running hours over these two days for a single resource is 40 hours.

Scenario 2: 3 Resources Over 4 Days, Group by Day

Date       | Total Running Hours
2024-09-05 | 65
2024-09-06 | 72
2024-09-07 | 80
2024-09-08 | 78

For 4 days with 3 resources:

Total Running Hours for September 5 for all 3 resources: 65 hours
Total Running Hours for September 6 for all 3 resources: 72 hours
Total Running Hours for September 7 for all 3 resources: 80 hours
Total Running Hours for September 8 for all 3 resources: 78 hours

The total running hours over these four days for all 3 resources is 295 hours.

Scenario 3: 5 Resources Over 1 Month (August)

Date Range               | Total Running Hours
2024-08-01 to 2024-08-31 | 3,000 hours

The total running hours for August for all 5 resources is 3000 hours.

S3 Number of Objects

The S3 Number of Objects metric calculates the average number of objects stored in your S3 buckets from the last 24 hours of a selected timeframe. This metric is derived from CloudWatch data and helps monitor object count trends in your S3 buckets, providing insights into storage usage.

Note: The resource ID will be automatically added to your Data Explorer report when selecting this predefined query.
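
One way to picture the "last 24 hours of the selected timeframe" rule is the Python sketch below. It is an illustration under stated assumptions, not Finout's implementation: the function name, the (timestamp, object count) datapoint layout, the exclusive end of the timeframe, and the April 29th sample value are all hypothetical.

```python
from datetime import datetime, timedelta


def s3_object_count(datapoints, timeframe_end):
    """Average the object counts that fall in the last 24 hours of the timeframe.

    datapoints: iterable of (timestamp, object_count) pairs for one bucket.
    timeframe_end: exclusive end of the selected timeframe.
    Returns None when no datapoint falls inside the window.
    """
    window_start = timeframe_end - timedelta(hours=24)
    values = [
        count for ts, count in datapoints
        if window_start <= ts < timeframe_end
    ]
    if not values:
        return None
    return sum(values) / len(values)


# Hypothetical datapoints for one bucket, with an April timeframe.
datapoints = [
    (datetime(2024, 4, 29), 1_400_000),  # outside the last 24 hours
    (datetime(2024, 4, 30), 1_500_000),  # inside the last 24 hours
]
april_count = s3_object_count(datapoints, datetime(2024, 5, 1))
```

Only the April 30th datapoint falls inside the window, so earlier days in the month do not affect the reported value.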

Scenario 1: Number of Objects in S3 Buckets, Broken Down by Virtual Tag Teams

You want to create a report of the number of S3 objects in a bucket over April.

Note: Because the metric averages the last 24 hours of the selected timeframe, a report for April reflects April 30th. For example, in the table below, Bucket 001, Team A, had an average of 1,500,000 objects on April 30th.

Resource ID | Team   | Number of S3 Objects
Bucket 001  | Team A | 1,500,000
Bucket 001  | Team B | 800,000
Bucket 002  | Team A | 1,100,000
Bucket 003  | Team C | 500,000
Bucket 003  | Team B | 900,000

Scenario 2: Number of Objects in S3 Buckets in Q1 2024, Broken Down by Month

You want to create a report for Q1 with object counts in monthly intervals to monitor the growth of data in your S3 buckets over time.

Resource ID | Month    | Number of S3 Objects
Bucket 001  | January  | 1,200,000
Bucket 002  | January  | 900,000
Bucket 001  | February | 1,500,000
Bucket 003  | February | 600,000
Bucket 002  | March    | 1,000,000
Bucket 003  | March    | 800,000

Scenario 3: Number of Objects in S3 Buckets, Broken Down by Virtual Tag Teams and by Week

You want to create a report for the previous month with weekly intervals, showing the number of objects in each S3 bucket, broken down by virtual tag teams.

Resource ID | Team   | Week   | Number of S3 Objects
Bucket 10   | Team A | Week 1 | 11,000
Bucket 10   | Team A | Week 2 | 8,800
Bucket 10   | Team B | Week 1 | 3,000
Bucket 20   | Team A | Week 3 | 21,800
Bucket 20   | Team B | Week 2 | 16,700

Daily Actual Storage

Daily Actual Storage converts AWS's billing-focused GB-Month reporting into day-by-day storage figures. AWS Cost & Usage Reports (CUR) work well for monthly invoicing, but they leave a blind spot for engineering teams who need to see what is happening on their storage infrastructure each day.

Daily Actual Storage shows how many GB were actually in your storage on a particular day, rather than the normalized daily GB (1 GB / day per month) from the CUR. Finout converts these normalized daily values back into daily GB measurements, revealing the actual footprint of your storage services (S3, EBS Snapshots, EFS, and so on). Because months differ in length, the CUR can show artificial daily spikes or drops at the start and end of a month. Actual daily storage values keep the data consistent and reflect the GB actually stored.

Note: The Normalized Storage column only returns values when GB-Mo is selected as the unit. If any other unit is used, the column will display 0.

Daily Usage Example

If you store 1 GB of data for just 1 day, that is about 1/30 of a month:

- CUR daily storage data: 1 GB × (1 ÷ 30) = 0.033 GB-Mo. Because you used only a fraction of the month, you are charged only a fraction of the monthly cost. This is a pro-rated charge: you pay only for what you use.
- Finout Daily Actual Storage: 1 GB for that day, which is the amount that would equal 1 GB-Mo if stored for a whole month.
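
The arithmetic in this example can be sketched in Python. This is one plausible reading, not Finout's documented conversion: it assumes the normalized daily value is the actual GB divided by the number of days in the month, so multiplying by the days in the month recovers the actual GB. Both function names are hypothetical.

```python
import calendar


def cur_daily_gb_mo(actual_gb, year, month):
    """Normalized daily GB-Mo for storing actual_gb for one day (assumed model)."""
    days_in_month = calendar.monthrange(year, month)[1]
    return actual_gb / days_in_month


def daily_actual_gb(normalized_gb_mo, year, month):
    """Convert a normalized daily GB-Mo value back into actual GB for that day."""
    days_in_month = calendar.monthrange(year, month)[1]
    return normalized_gb_mo * days_in_month


# 1 GB stored for one day in a 30-day month (April 2024).
normalized = cur_daily_gb_mo(1.0, 2024, 4)
actual = daily_actual_gb(normalized, 2024, 4)

# The same 1 GB yields different normalized values in months of different
# lengths, which is the source of the artificial month-boundary jumps.
february = cur_daily_gb_mo(1.0, 2024, 2)  # 29 days in February 2024
march = cur_daily_gb_mo(1.0, 2024, 3)     # 31 days
```

Under this assumption the normalized value for April rounds to 0.033 GB-Mo, and converting it back gives 1 GB, while the February and March values differ even though the stored amount is unchanged.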

FAQs

Question: What is the difference between the object count from CloudWatch and the object count from the AWS CUR?

Answer: The CUR alone does not provide object-level or detailed bucket insights without integrating other AWS tools such as S3 Storage Lens (with advanced metrics) or S3 Inventory. Without these, the CUR will only show aggregated S3 storage usage and costs.

Question: How can I get object count data per S3 bucket in the AWS CUR?

Answer: To obtain object count data per S3 bucket in the CUR, you need additional services:

- S3 Storage Lens with advanced metrics: This is required for detailed usage insights.

- S3 Inventory: Provides detailed reports at the object level but does not integrate directly into the CUR.
