
BI Project Scenario

Scenario

You are a BI analyst for a grocery store chain monitoring dietary trends that influence in‑store purchases. Your company wants you to analyze which types of Hass avocados are purchased most often. Avocados are categorized into four sizes (small, medium, large, and extra large), and each sale record includes the average price, total volume, and date.

Using this dataset, you will create a historical table to demonstrate how partitions and clusters work in BigQuery. Your goal is to answer the question:

What is the distribution of avocado sales from 2015 to 2021?

Create a Baseline Table (No Partition, No Cluster)

Start by creating a new table without partitions or clustering. This baseline will help you compare performance later. Name the table avocados.

BigQuery interface showing SQL editor
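The SQL from the screenshot is not reproduced in the text. A minimal sketch of what the baseline statement might look like, assuming the raw avocado data has already been loaded into a source table. The names `my_project`, `my_dataset`, and `avocado_base` are placeholders, not taken from the original exercise:

```sql
-- Baseline copy of the data: no PARTITION BY and no CLUSTER BY clause.
-- `my_project.my_dataset.avocado_base` is a hypothetical source table.
CREATE TABLE `my_project.my_dataset.avocados` AS
SELECT *
FROM `my_project.my_dataset.avocado_base`;
```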

After running the SQL, your table should look like this:

BigQuery table preview for baseline avocado table

Create a Partitioned Table

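Next, create a table partitioned by an integer range over the years 2015–2022. In BigQuery integer-range partitioning the end of the range is exclusive, so this range covers the sale years 2015 through 2021. Name this table avocados_partitioned.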

Return to the SQL editor, delete the previous query, paste the new SQL, and click Run.

BigQuery partition creation example
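As a rough sketch of the partitioning step, the statement below uses BigQuery's `RANGE_BUCKET` with `GENERATE_ARRAY` to define one bucket per year. It assumes the data has an INT64 column named `year`; that column name and the `my_project.my_dataset.avocado_base` source table are assumptions for illustration:

```sql
-- Integer-range partitioning on an assumed INT64 column named `year`.
-- GENERATE_ARRAY(2015, 2022, 1) defines one-year buckets; for partitioning,
-- the upper bound is exclusive, so the buckets cover 2015 through 2021.
CREATE TABLE `my_project.my_dataset.avocados_partitioned`
PARTITION BY RANGE_BUCKET(year, GENERATE_ARRAY(2015, 2022, 1))
AS
SELECT *
FROM `my_project.my_dataset.avocado_base`;  -- hypothetical source table
```

Rows whose `year` falls outside the range are not rejected; BigQuery places them in a special `__UNPARTITIONED__` partition.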

Your partitioned table should now appear like this:

BigQuery partitioned table preview
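Besides the preview in the console, the partitions can be inspected with a query. The sketch below reads the `INFORMATION_SCHEMA.PARTITIONS` view for the dataset; `my_project` and `my_dataset` are placeholder names:

```sql
-- List the partitions of the partitioned table with their row counts.
-- `my_project.my_dataset` is a placeholder for your own project and dataset.
SELECT
  partition_id,
  total_rows,
  total_logical_bytes
FROM `my_project.my_dataset.INFORMATION_SCHEMA.PARTITIONS`
WHERE table_name = 'avocados_partitioned'
ORDER BY partition_id;
```

If the partitioning worked as intended, the result should contain roughly one row per year bucket.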

Create a Partitioned and Clustered Table

Now create a table that is both partitioned by year and clustered by type. Name this table avocados_clustered.

BigQuery clustered table creation
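A sketch of the combined statement, under the same assumptions as before: an INT64 `year` column, a `type` column, and a hypothetical source table `my_project.my_dataset.avocado_base`. In a `CREATE TABLE ... AS SELECT` statement, `CLUSTER BY` follows `PARTITION BY`:

```sql
-- Partition by year bucket, then cluster rows by `type` within each partition.
-- Column names and the source table are assumptions, not from the original.
CREATE TABLE `my_project.my_dataset.avocados_clustered`
PARTITION BY RANGE_BUCKET(year, GENERATE_ARRAY(2015, 2022, 1))
CLUSTER BY type
AS
SELECT *
FROM `my_project.my_dataset.avocado_base`;
```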

The resulting clustered table should look like this:

BigQuery clustered table preview

Query the Tables and Compare Performance

Query the Non‑Partitioned Table

Querying non-partitioned table in BigQuery
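The query in the screenshot is not available as text. One plausible version, assuming columns named `year`, `total_volume`, and `average_price` (these names are guesses based on the scenario description), aggregates the sales per year:

```sql
-- Yearly distribution of avocado sales from the baseline table.
-- Column names are assumed; adjust them to match your schema.
SELECT
  year,
  COUNT(*) AS number_of_sales,
  SUM(total_volume) AS sum_total_volume,
  AVG(average_price) AS avg_price
FROM `my_project.my_dataset.avocados`
WHERE year BETWEEN 2015 AND 2021
GROUP BY year
ORDER BY year;
```

Because this table has no partitions, BigQuery has to read the referenced columns for every row, regardless of the filter on `year`.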

Query the Partitioned Table

Querying partitioned table in BigQuery
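Partitioning pays off when the query filters on the partitioning column, because BigQuery can then skip whole partitions. A sketch with a narrower year filter, again assuming columns named `year`, `total_volume`, and `average_price`:

```sql
-- The filter on `year` (the assumed partitioning column) lets BigQuery
-- prune every partition outside 2019 through 2021 before scanning.
SELECT
  year,
  COUNT(*) AS number_of_sales,
  SUM(total_volume) AS sum_total_volume,
  AVG(average_price) AS avg_price
FROM `my_project.my_dataset.avocados_partitioned`
WHERE year BETWEEN 2019 AND 2021
GROUP BY year
ORDER BY year;
```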

Query the Partitioned and Clustered Table

Querying partitioned and clustered table in BigQuery
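Clustering helps most when the query also filters on the clustering column. The sketch below combines a `year` filter with a `type` filter; the value `'organic'` and the column names are assumptions made for illustration:

```sql
-- Partition pruning on `year` plus block pruning on the clustering column.
-- 'organic' is an assumed example value for the `type` column.
SELECT
  year,
  COUNT(*) AS number_of_sales,
  SUM(total_volume) AS sum_total_volume,
  AVG(average_price) AS avg_price
FROM `my_project.my_dataset.avocados_clustered`
WHERE year BETWEEN 2019 AND 2021
  AND type = 'organic'
GROUP BY year
ORDER BY year;
```

For clustered tables, the bytes estimate shown before the query runs is an upper bound; the savings from clustering only show up in the bytes processed after execution.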

Visualize the Results

Number of Avocados Sold per Year

Chart showing number of avocados sold per year

Total Volume per Year

Chart showing total avocado volume per year

Average Price per Year

Chart showing average avocado price per year

Conclusion

By creating baseline, partitioned, and partitioned‑clustered tables, you can observe how BigQuery improves performance by pruning data it does not need to read. Partitioning reduces the data scanned when a query filters on the partitioning column, while clustering stores rows with the same clustering values together, so filters on that column read fewer blocks within each partition.
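To put numbers on the comparison, the bytes processed by each query can be read from the job history. This sketch queries the `INFORMATION_SCHEMA.JOBS_BY_USER` view; the `region-us` qualifier is an assumption and should match the region where your dataset lives:

```sql
-- Recent queries against the avocado tables, with bytes processed and slot time.
-- `region-us` is an assumed region; replace it with your own.
SELECT
  job_id,
  creation_time,
  total_bytes_processed,
  total_slot_ms,
  query
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_USER
WHERE creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR)
  AND job_type = 'QUERY'
  AND query LIKE '%avocados%'
ORDER BY creation_time DESC;
```

On a small dataset the differences between the three tables may be modest; the gap usually widens as the table grows.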

This workflow is essential for BI professionals working with large datasets.

With partitions and clusters, you can deliver faster insights, reduce costs, and scale your analytical workflows efficiently.
