Joins illustration

New to this blog? Start the Intro to Power Query series here

This is the 2nd of 2 posts on joins and is a complement to the Merge post.  Link to 1st post.


In this exercise, we move beyond the conceptual and work through an exercise using 2 data tables to demonstrate all 6 types of joins.   

Data table 1, Account, contains a dollar value of $50.00.
Data table 2, EEID, can be matched to Data table 1 based on Company.  However, there are multiple rows where the Company is Dallas.  
After loading each of the data tables in to Power Query, Merge Queries as New was applied, matching on Company.  Account was the top table (left when viewed on a Venn diagram);  EEID was the bottom table (right on a Venn diagram).


Download the workbook below to view the data tables and each of the joins.  

                                                      ↑

Left Outer join

The left outer join takes the Account table (left) and brings back each row from the EEID table (right) that has the same company.  Notice how the amount of $50.00 is in the new table twice.  It's duplicated, because there are 2 rows in the EEID table where the Company is Dallas.
A VLOOKUP in native Excel returns the first match; a Merge in Power Query returns all matches.  This critical difference can dramatically skew your analysis if not handled appropriately.  Future blog posts, particularly on Power Pivot, will provide alternate solutions for how to handle multiple data sets with duplicate values in the matching column.    


Right Outer join

The right outer join takes the EEID table (right) and brings back each row from the Account table (left) that has the same company.  Notice how the $50.00 amount is again duplicated, since it matches 2 rows from the EEID table.  In addition, the Houston company has been added to the table, since the join starts with all company items in the right table.  







Left Anti join

The left anti join starts with the Account table (left) and returns a value if there is no match in the EEID table.  In this instance, the Company Dallas was in the EEID table, and so the resulting table is blank.





Right Anti join

The right anti join starts with the EEID table (right) and returns a value if there is no match in the Account table.  In this instance, the Company Houston was in the EEID table but not the Account table.




Anti joins are excellent audit tools to monitor exceptions or open items.



Inner join

The inner join returns values only when Company matches in both tables.  Notice the $50.00 amount is duplicated. 






Full Outer join

The full outer join returns every Company in each table.  Notice the $50.00 amount is duplicated. 









Append

Append is another option for combining tables using Power Query.  Append is designed to take an existing table and expand it by adding on another table that uses the same column names.  If a column name doesn't exist in the original table, new columns are added.  Notice that append picked up the 1 row from the Account table and the 3 rows from the EEID table exactly as they appeared originally.  There is no Company.1 column and the $50.00 amount is NOT duplicated.









💥BEST PRACTICE TIP
As a general rule, Append is intended to make a table longer, while Merge makes a table wider. 💥

Merge and Append allow accountants and analyst to perform a wide array of new analytics.  However, extra diligence is required to ensure the steps applied are yielding the analysis intended.

In this lesson, you learned how to:
  • join 2 data tables using Merge;
  • watch out for duplicates which could skew results; and,
  • Append multiple data tables together instead of Merge.   

Prior post: Joins conceptually
Next post: 2 files 1 pivot table: Power Pivot

No comments:

Post a Comment