How to pre-join excel datasets before power bi to simplify data model relationships
- Step 1Identify the two datasets and the shared key — Decide which column links the two exports (e.g.
ProductID) and which dataset is the fact grain you want to keep. - Step 2Open the Sheet Joiner — Load the Multi-Sheet Joiner before touching Power BI. Two drop zones appear for the two sources.
- Step 3Drop the fact table as File A — File A should be the table at the grain you want the flat fact to have (every File-A row is kept). First sheet only.
- Step 4Drop the dimension/lookup as File B — File B's columns (minus the key) are appended to each fact row. Put its table on the first sheet.
- Step 5Set keys and join type — Enter the key header for each file in
leftKey/rightKey. Useleftto keep every fact row, orinnerto keep only matched rows. - Step 6Import the flat table into Power BI — Download
joined-data.xlsxand load it as a single table. Build visuals directly — no relationship to model.
Pre-join vs Power BI relationship
When a flat pre-joined table beats an in-model relationship — and when it does not.
| Factor | Pre-join in Excel (this tool) | Power BI relationship |
|---|---|---|
| Many-to-many handling | Eliminated — one flat table | Needs bidirectional filter or a bridge table |
| Refresh on source change | Manual — re-run the join | Automatic on dataset refresh |
| Model complexity | Single table, no DAX relationship | Multiple tables + relationship + cardinality |
| Number vs text key | Matched (string coercion) | Relationship errors on mismatched types |
| Best for | Static snapshots, one-off reports | Live, frequently changing data |
Grain after the join
The output grain depends on cardinality. Pre-join to the fact grain to keep measures additive.
| File A : File B cardinality | Output grain | Modelling note |
|---|---|---|
| Fact : Dimension (many-to-one) | Stays fact grain | Ideal — dimension columns denormalised onto each fact row |
| One-to-many | Fans out to File-B grain | Output rows multiply; re-aggregate before measuring |
| Many-to-many on the key | Cross-product per key | Avoid — verify dedup first or output explodes |
| One-to-one | Stays one row per key | Clean flat merge |
Tier limits
Pre-joining needs two files, so Free (1 file) cannot run it. Size the tier to your larger source.
| Tier | Max size / file | Max rows / file | Files |
|---|---|---|---|
| Free | 5 MB | 10,000 | 1 (cannot join) |
| Pro | 50 MB | 100,000 | 5 |
| Pro-media | 200 MB | 500,000 | 20 |
| Developer | 500 MB | Unlimited | Unlimited |
Cookbook
Pre-join fragments aimed at a flatter Power BI model. Watch the grain: pre-join to the fact level so measures stay additive.
Denormalise a dimension onto the fact (many-to-one)
Sales (fact) joined to Products (dimension) on ProductID. Product attributes land on every sale row, so Power BI needs no Products table or relationship.
File A (sales fact): File B (products dim): ProductID,qty,date ProductID,name,category P1,3,2026-01-02 P1,Widget,Hardware P2,1,2026-01-03 P2,Gadget,Electronics Left join (ProductID <-> ProductID): ProductID,qty,date,name,category <- flat fact, no relationship needed
Mismatched key types that Power BI rejects
One export stores ProductID as number, the other as text — Power BI cannot relate them without a type change. The joiner matches them as strings.
File A ProductID: 7 (number) File B ProductID: "7" (text) Power BI: cannot create relationship (incompatible types) Sheet Joiner: 7 joins to "7" -> one flat row
Inner join to drop unmodelled facts
Use inner to keep only fact rows that have a matching dimension entry — equivalent to filtering out facts that would land on a blank dimension row in Power BI.
joinType: inner Sale with ProductID P9 (not in products) is dropped, so no blank-dimension row appears in Power BI.
Beware one-to-many fan-out before measuring
If File B has multiple rows per key, the fact row multiplies — which would double-count a SUM in Power BI. Verify cardinality, or re-aggregate after the join.
File A (1 sale): File B (2 promo rows for P1):
ProductID,revenue ProductID,promo
P1,100 P1,Spring
P1,Email
Left join (2 rows): revenue 100 now appears twice
-> SUM(revenue) would read 200. Dedupe File B first.Keep the model live? use a relationship instead
Pre-join is for static snapshots. If the data refreshes often and you need live updates, model the relationship in Power BI rather than pre-joining.
Static monthly snapshot -> pre-join here, import one table Live / frequently refreshed -> import both, model relationship Needs row/col cleanup first -> /excel-tools/excel-date-standardizer, then join
Edge cases and what actually happens
Many-to-many key explodes the table
Cross-productIf both files have duplicate keys, the join produces a cross-product per key (every File-A match times every File-B match) — the exact blow-up you were trying to avoid in Power BI. Deduplicate at least one side with the deduplicator before joining.
Fan-out double-counts a measure
By designA one-to-many join repeats File-A values across rows, so a naive SUM in Power BI over the flat table double-counts. Pre-join to the fact grain (dimension as File B), or re-aggregate with the pivot generator after joining.
Fact key has no dimension match
Kept (left)On a left join, facts with no dimension row are kept with blank dimension columns — analogous to Power BI's blank-row behaviour. Switch to inner to drop them, or fill the gaps before import.
Number vs text key
PreservedPower BI cannot relate a numeric key to a text key, but the joiner coerces both to strings and matches them. Standardise the type in Power BI later, or accept the merged column as text in the flat table.
Date keys formatted differently
No matchDates are read as formatted strings, so two files must display the date identically for the keys to match. Standardise both with the date standardizer before joining a date dimension.
Dimension table on sheet 2
Sheet 1 onlyOnly the first sheet of each file is read. A workbook with the dimension on a later tab will be misread — move it to tab 1 or export it standalone first.
Right-only dimension rows expected
DroppedDimension rows with no matching fact never appear (this is left/inner, not full outer). That is usually fine for a star schema flattened to a fact, but if you need unused dimension members keep them in Power BI.
Free tier
Run rejectedPre-joining needs two files, above Free's 1-file limit, and the tool is Pro-minimum. The run is blocked with 'Sheet Joiner requires Pro tier.' until you upgrade.
Composite key relationship
Not supportedPower BI can build a relationship on a concatenated key; this tool matches one column per side. Concatenate the key parts into one helper column in each file first, then join on it.
Output too large to model comfortably
ExpectedFlattening a many-to-many denormalises and can balloon row counts. The metrics panel reports output rows; if the flat table is huge, an in-model relationship may actually be the leaner choice.
Frequently asked questions
When should I pre-join vs use Power BI relationships?
Pre-join for static snapshots where one flat table is simpler than a relationship — especially to avoid many-to-many modelling. Use Power BI relationships when the data refreshes frequently and you need live updates, or when you need to keep unused dimension members.
Does pre-joining help Power BI performance?
Often, yes. A single flat table avoids cross-table filter propagation and bidirectional filtering at query time. The trade-off is a larger, denormalised table and a manual refresh (re-run the join when sources change).
How do I avoid double-counting measures after a join?
Pre-join to the fact grain (the fact table as File A, the dimension as File B) so each fact row stays unique. If a one-to-many fan-out occurs, re-aggregate with the pivot generator before building SUM measures.
Can I undo the join later if I need separate tables?
Keep your two original source files. The merged .xlsx is a derived artifact you can always regenerate, and you can model the originals as separate tables in Power BI whenever you need to.
It solves my number-vs-text relationship error — how?
Power BI refuses to relate a numeric key to a text key. The joiner coerces both keys to strings, so 7 matches "7", and the merged column arrives as one consistent column in the flat table.
What if a fact has no matching dimension row?
On a left join it is kept with blank dimension columns (like Power BI's blank row). Use an inner join to drop such facts, or clean the missing keys before importing.
Can I pre-join on a composite key?
Not directly — the tool matches one column per side. Concatenate the key parts into a single helper column in each file (e.g. region|product) and join on that helper column.
Does the output keep any live connection to Power BI?
No. It is a plain values-only .xlsx (joined-data.xlsx, sheet Joined). It does not refresh; re-run the join whenever the source files change, then re-import.
What happens with a true many-to-many key?
You get a cross-product per key, which can explode the row count — the same complexity you were avoiding in the model. Deduplicate at least one side with the deduplicator first, or keep the relationship in Power BI.
Are my datasets uploaded anywhere?
No. Parsing and joining run in your browser via SheetJS. The data you are about to model never leaves the tab — useful for confidential snapshots.
What tier and size limits apply?
Pro minimum (two files exceed Free's 1-file cap). Pro covers 50 MB / 100,000 rows per file; Pro-media 200 MB / 500,000; Developer 500 MB / unlimited — size to your larger source.
Should I clean the data before joining?
Yes, for a clean model. Standardise date keys with the date standardizer and remove duplicate keys with the deduplicator so the join produces predictable grain before import.
Privacy first
Every JAD Excel tool runs entirely in your browser using SheetJS and ExcelJS. Your spreadsheets, formulas, and data never leave your device — verified by zero outbound network requests during processing.