Skip to main content

Part III: Combining Data Across Tables in SQL

 


In real-world databases, data is rarely confined to a single table. To unlock deeper insights, you need to learn how to merge and compare datasets stored in different tables. In this third installment of our SQL tutorial series on Data Analyst BI, we’ll explore:

  • JOIN Operations: Linking rows from multiple tables based on related columns

  • Set Operations: Merging result sets via UNION, INTERSECT, and EXCEPT

Mastering these techniques will enable you to build richer reports, enforce data integrity, and answer complex business questions.

JOIN Operations

JOINs are the cornerstone of combining data across tables. By matching rows on key columns, you can assemble comprehensive views of your data.

1. INNER JOIN

Returns only rows that have matching keys in both tables.

sql
SELECT o.order_id,
       c.customer_name,
       o.order_date
FROM orders o
INNER JOIN customers c
  ON o.customer_id = c.customer_id;

Use INNER JOIN when you need records that exist in both tables—e.g., orders with valid customers.

2. LEFT (OUTER) JOIN

Returns all rows from the left table and matching rows from the right. Non‐matching right‐table values become NULL.

sql
SELECT c.customer_name,
       o.order_id
FROM customers c
LEFT JOIN orders o
  ON c.customer_id = o.customer_id;

Ideal for finding all customers—even those without orders.

3. RIGHT (OUTER) JOIN

Symmetric to LEFT JOIN: all rows from the right table plus any matching left-table rows.

sql
SELECT p.product_id,
       s.supplier_name
FROM products p
RIGHT JOIN suppliers s
  ON p.supplier_id = s.supplier_id;

Use RIGHT JOIN when you want every supplier listed, regardless of whether they supply a product.

4. FULL (OUTER) JOIN

Combines LEFT and RIGHT JOIN: returns every row from both tables, with NULLs where no match exists.

sql
SELECT a.id AS id_a,
       b.id AS id_b
FROM table_a a
FULL OUTER JOIN table_b b
  ON a.id = b.id;

Great for reconciling two datasets and spotting unmatched records on either side.

5. CROSS JOIN

Produces a Cartesian product: every row in table A paired with every row in table B.

sql
SELECT c.customer_name,
       p.product_name
FROM customers c
CROSS JOIN products p;

Use sparingly—only when you need all possible combinations (e.g., creating a calendar of dates × products).

6. SELF JOIN

Joins a table to itself, usually to compare rows within the same table.

sql
SELECT e1.employee_name AS manager,
       e2.employee_name AS subordinate
FROM employees e1
JOIN employees e2
  ON e1.employee_id = e2.manager_id;

Ideal for hierarchical data like organizational charts.

Set Operations

While JOINs combine columns, set operations merge entire result sets. They’re useful when you want to unify, intersect, or exclude data across similar queries.

1. UNION vs. UNION ALL

  • UNION removes duplicate rows

  • UNION ALL retains duplicates and is faster

sql
SELECT email FROM newsletter_subscribers
UNION
SELECT email FROM customers;

Merge two email lists without repeats. Use UNION ALL if you need to count duplicates.

2. INTERSECT

Returns only rows common to both queries.

sql
SELECT customer_id FROM high_value_customers
INTERSECT
SELECT customer_id FROM churn_risk_list;

Identify customers who are both high‐value and at risk of churn.

3. EXCEPT (or MINUS)

Subtracts the second result set from the first. In some databases, use MINUS instead of EXCEPT.

sql
SELECT employee_id FROM all_employees
EXCEPT
SELECT employee_id FROM terminated_employees;

List active employees by removing terminated ones.

Best Practices and Performance Tips

  • Always specify column lists instead of SELECT * in JOINs to reduce I/O.

  • Index the columns used in ON clauses and set operations for faster joins and deduplication.

  • For large datasets, prefer UNION ALL over UNION when deduplication isn’t needed.

  • When multiple joins are required, join smaller tables first to minimize intermediate result sizes.

Combining tables with JOIN operations and set operations transforms isolated datasets into strategic assets. In the next module, we’ll cover data modification—INSERT, UPDATE, and DELETE—so you can not only read but also write and maintain your databases. Stay tuned!

Comments

Popular posts from this blog

Alfred Marshall – The Father of Modern Microeconomics

  Welcome back to the blog! Today we explore the life and legacy of Alfred Marshall (1842–1924) , the British economist who laid the foundations of modern microeconomics . His landmark book, Principles of Economics (1890), introduced core concepts like supply and demand , elasticity , and market equilibrium — ideas that continue to shape how we understand economics today. Who Was Alfred Marshall? Alfred Marshall was a professor at the University of Cambridge and a key figure in the development of neoclassical economics . He believed economics should be rigorous, mathematical, and practical , focusing on real-world issues like prices, wages, and consumer behavior. Marshall also emphasized that economics is ultimately about improving human well-being. Key Contributions 1. Supply and Demand Analysis Marshall was the first to clearly present supply and demand as intersecting curves on a graph. He showed how prices are determined by both what consumers are willing to pay (dem...

Unlocking South America's Data Potential: Trends, Challenges, and Strategic Opportunities for 2025

  Introduction South America is entering a pivotal phase in its digital and economic transformation. With countries like Brazil, Mexico, and Argentina investing heavily in data infrastructure, analytics, and digital governance, the region presents both challenges and opportunities for professionals working in Business Intelligence (BI), Data Analysis, and IT Project Management. This post explores the key data trends shaping South America in 2025, backed by insights from the World Bank, OECD, and Statista. It’s designed for analysts, project managers, and decision-makers who want to understand the region’s evolving landscape and how to position themselves for impact. 1. Economic Outlook: A Region in Transition According to the World Bank’s Global Economic Prospects 2025 , Latin America is expected to experience slower growth compared to global averages, with GDP expansion constrained by trade tensions and policy uncertainty. Brazil and Mexico remain the largest economies, with proj...

Kickstart Your SQL Journey with Our Step-by-Step Tutorial Series

  Welcome to Data Analyst BI! If you’ve ever felt overwhelmed by rows, columns, and cryptic error messages when trying to write your first SQL query, you’re in the right place. Today we’re launching a comprehensive SQL tutorial series crafted specifically for beginners. Whether you’re just starting your data career, pivoting from another field, or simply curious about how analysts slice and dice data, these lessons will guide you from day zero to confident query builder. In each installment, you’ll find clear explanations, annotated examples, and hands-on exercises. By the end of this series, you’ll be able to: Write efficient SQL queries to retrieve and transform data Combine multiple tables to uncover relationships Insert, update, and delete records safely Design robust database schemas with keys and indexes Optimize performance for large datasets Ready to master SQL in a structured, step-by-step way? Let’s explore the full roadmap ahead. Wh...