Excel has a lot of ways to group items that are the same, but these rely on values being an exact match. Data might be considered the same even though they are not an exact match. Grouping these similar items can be a difficult and often times manual task. Excel does actually have a feature that can group these close but not quite the same values together! It’s just a bit hidden. Power Query has a fuzzy grouping feature for just such instances. This will allow you to group items based on a similarity threshold as well as set up a translation table for different terms that mean the same thing. This post will show you how to use Power Query to group all your related values. Get your copy of the example workbook used in this post to follow along!

Add Data to a Table

This example shows a list of questions. You can see that while all the items in the list are distinct, some of them are similar and need to be grouped together. The first thing you will need to do is put the data you want to group into a table. This will make it easy to import the data into power query and use the fuzzy grouping capabilities. This will open the Create Table menu with your data range selected. The data is inside a table object and will be ready to import into power query. Make sure you rename the table to something short and descriptive such as Questions.

Create a Translation Table

Before importing the data into power query and grouping similar item, it’s worth setting up a translation table. This is a table you can use to build a custom mapping to help the fuzzy matching process. This can be used to tell Excel certain text means the same thing. For example, in the context of your data, the words email and message might be the same thing and should be grouped together. The translation table needs to have a From and To column headings. Values in the From column will then be considered the same as the value in the To column within a row. This should also be in an Excel table.

Import Tables to Power Query

Now that you have your data to group and translation map in tables, these can be imported to power query. Here’s how to import a table to power query. Import both tables to power query and you will be able to group the data.

Group Data from the Transform Tab

The grouping of similar items will eventually use the Table.FuzzyGroup() function in the M code. This isn’t availalbe from the query editor user interface. You can set up a regular grouping from the UI which will automatically build you a Table.Group() formula. This will have the same argument signature as the Table.FuzzyGroup() formula except for the last argument which deals with the fuzzy grouping options. It will be easier to build the regular grouping formula from the UI and then edit it to the required fuzzy grouping formula. This will open the Group By menu with the Basic option and the column to be grouped already selected. This will result in the above M code formula. This will perform an exact match grouping. Depending on your dataset, this might not change anything other than adding a column of where each row has a Table.

Edit the Group Step to a Fuzzy Group

Click into the formula bar and edit the resutling formula by adding the above parts seen in red. In this case, Translation is the name of the query with the translation table containing your From and To columns. This will change the function used to the Table.FuzzyGroup() function. The first three arguments are the same and a fourth argument is added at the end to specifcy the fuzzy groupinng options. This last option argument is a record with these fields.

IgnoreCase: This can be either true or false. Setting this as true will allow for a case-insensitive grouping. For example, Email and email are grouped together.IgnoreSpace: This can be either true or false. Setting this as true will ignore any space characters when grouping. For example, email and e mail are grouped together.Threshold: This is a number between 0 and 1 that specifies the similarity score at which two values will be grouped. A threshold of 0 would group everything together while a threshold of 1 would require an exact match for grouping.TransformationTable: This is a table that maps From values to an equivalent To value in the grouped results.

For more details on the Table.FuzzyGroup function you can view the Microsoft documentation. Press the Enter key or click on the Check to the left of the formula bar to accept the updated formula. This will perform the fuzzy grouping and group similar items while taking into account the mapping in your translation table! Again, you will see a new column with a Table in each row. Each of these tables contains the grouped data and you can see a preview when you click on the empty part of the cell. Now you can expand these tables out. You will see all the data now with values grouped based on the Item column. 💡 Tip: Review the fuzzy grouping results and then adjust the IgnoreCase, IgnoreSpace, and Threshold options to improve the grouping.

Load Data to Excel

This grouped data can now be loaded to Excel. Now your similar items are grouped in a new Excel table!

Conclusions

Grouping items that are similar but not exactly the same is a difficult task that most users will approach manually. Power Query does offer a fuzzy grouping function, but it’s not accessible from the query editor user interface. Performing a regular grouping and then adjusting the resulting M code to a fuzzy grouping is a fairly easy way to implement the function due to their similar argument signature. Did you know about the fuzzy grouping function available in power query for Excel? Let me know in the comments!