Quick Tips for Cleaning & De-Duping Your Data


I’ve been spending a lot of my time working in Power BI lately.

Power BI allows me to quickly connect data that would have taken hours of stacking vlookups in spreadsheets. I can find new opportunities looking at the same data through multiple lenses. Power BI is a great data visualization tool to spot things I might not have seen in my spreadsheet or a line chart.

All this sounds cool, right? It is, but when you’re first starting out and learning about creating relationships in Power BI, running into duplicates means that you might not be able to properly connect your data.

Below I’ve outlined my quick tips for finding duplicates, cleaning, and de-duping your data to prepare for creating a connection or building a bridge. I automatically apply these transformations onto any column that I’m prepping to clean and de-dupe,


How to find duplicates in your data

We’ve all been there. The dreaded “you’ve got dupes” popup when trying to create a relationship.

Womp womp.

 

Luckily, it’s easy to quickly identify duplicates within a column and there are some easy steps to prevent duplication when creating relationships or building a bridge between tables.

Create a quick table with “Value” and “Count of Value” to easily identify duplicates

TRANSFORM > LOWERCASE

=LOWER transforms all of your values into lowercase.

  • Right click your subject column and select transform > lowercase from the dropdown.

TRANSFORM > CLEAN

=CLEAN returns the original text value with non-printable characters removed.

  • Right click your subject column and select transform > clean from the dropdown.

TRANSFORM > TRIM

=TRIM removes whitespace values from the beginning and end of the text value.

  • Right click your subject column and select transform > trim from the dropdown.

REMOVE ROWS > REMOVE BLANK ROWS

  • With your subject column selected, click on the remove rows icon, then select remove blank rows.

REMOVE ROWS > REMOVE ERRORS

  • With your subject column selected, click on the remove rows icon, then select remove errors.

REMOVE ROWS > REMOVE DUPLICATES

  • With your subject column selected, click on the remove rows icon, then select remove duplicates.

If you remember to lowercase, clean, trim, remove blanks, remove errors, and remove duplicates you’ll be on your way to creating new relationships between all types of data!

READ ALSO  How to Create a Power BI Template

Watch our Power BI tutorials & more by subscribing to @SeerInteractive on YouTube.



Source link

?
WP Twitter Auto Publish Powered By : XYZScripts.com