How to melt an excel budget wide format to long for pandas time-series analysis
- Step 1Export the budget as .xlsx — Save the wide budget on the first sheet of the workbook — the tool reads sheet index 0 only. A 'Budget' summary tab must be the first tab, or copy it there.
- Step 2Identify ID columns and year columns — ID columns are the dimensions to keep per row —
Department,Cost Centre. Value columns are the wide year headers —Year_2021, Year_2022, Year_2023. Note the exact header text. - Step 3Fill the option fields — Type the dimensions into ID columns (comma-sep) and the years into Value columns (comma-sep). Set Variable col name to
Yearand Value col name toBudget_USDto match your pandas workflow. - Step 4Run and download unpivoted.xlsx — Click run; the reshape happens in the browser and offers
unpivoted.xlsx(sheetUnpivoted). Confirm the row count equals input rows × number of year columns. - Step 5Load into pandas —
import pandas as pd; df = pd.read_excel('unpivoted.xlsx'). You now have Department, Year, Budget_USD columns ready for analysis. - Step 6Cast types, then analyse — Values arrive as displayed text, so cast:
df['Budget_USD'] = pd.to_numeric(df['Budget_USD'], errors='coerce')anddf['Year'] = df['Year'].str.extract(r'(\d{4})').astype(int)if headers wereYear_2021. Thendf.groupby('Year')['Budget_USD'].sum()orsns.lineplot(...).
Field-to-melt() mapping
How each Un-Pivot field corresponds to a pandas df.melt() argument.
| Un-Pivot field | df.melt() argument | Budget example |
|---|---|---|
| ID columns (comma-sep) | id_vars | Department, Cost Centre |
| Value columns (comma-sep) | value_vars | Year_2021, Year_2022, Year_2023 |
| Variable col name | var_name | Year |
| Value col name | value_name | Budget_USD |
pandas type and shape notes
What to expect when read_excel opens the unpivoted budget, and the cast you may need.
| Aspect | Behaviour | pandas handling |
|---|---|---|
| Year column contents | Original header verbatim (e.g. Year_2021) | str.extract(r'(\d{4})').astype(int) for a clean year |
| Budget values | Displayed cell text (currency format may persist) | pd.to_numeric(..., errors='coerce') |
| Empty budget cells | Emit a row with empty value | Become NaN; dropna() if you want them gone |
| Unlisted columns | Dropped | Add to id_vars equivalent (ID columns) to keep |
| Row count | input rows × year columns | Assert before heavy aggregation |
Tier limits (Excel family)
Un-Pivot needs Pro or higher. Real numbers; the row cap is on input rows.
| Tier | Max file size | Input row limit | Files per run |
|---|---|---|---|
| Free | 5 MB | 10,000 | 1 (Un-Pivot is Pro-only) |
| Pro | 50 MB | 100,000 | 5 |
| Pro-media | 200 MB | 500,000 | 20 |
| Developer | 500 MB | Unlimited | Unlimited |
Cookbook
Budget reshapes paired with the pandas you run on the result. Each shows the wide input, the four fields, and the analysis code.
Multi-year budget melted for groupby
Three year columns become a Year column; pandas then sums by year across departments. Equivalent to df.melt(id_vars=['Department'], value_vars=['Year_2021','Year_2022','Year_2023'], var_name='Year', value_name='Budget_USD').
Input (wide budget):
Department,Year_2021,Year_2022,Year_2023
Eng,500000,540000,580000
Sales,300000,330000,360000
Fields:
ID columns: Department
Value columns: Year_2021, Year_2022, Year_2023
Variable col: Year
Value col: Budget_USD
In pandas:
df = pd.read_excel('unpivoted.xlsx')
df['Budget_USD'] = pd.to_numeric(df['Budget_USD'])
df.groupby('Year')['Budget_USD'].sum()Clean the Year column to integers
The Year column holds the header text Year_2021, not 2021. Extract the four-digit year in pandas (or rename headers to bare years before unpivoting). Then it sorts and plots correctly.
df = pd.read_excel('unpivoted.xlsx')
df['Year'] = df['Year'].str.extract(r'(\d{4})').astype(int)
# now 2021, 2022, 2023 as ints
df = df.sort_values('Year')seaborn line plot of budget over time
Long format is exactly what seaborn wants. hue splits by department; x is the cleaned Year, y is the numeric budget.
import seaborn as sns
df = pd.read_excel('unpivoted.xlsx')
df['Year'] = df['Year'].str.extract(r'(\d{4})').astype(int)
df['Budget_USD'] = pd.to_numeric(df['Budget_USD'])
sns.lineplot(data=df, x='Year', y='Budget_USD', hue='Department')Two dimensions kept for a pivot_table back-check
List Department and Cost Centre as ID columns so both survive. After analysis you can pivot back to wide with pandas pivot_table to compare against the original.
Fields:
ID columns: Department, Cost Centre
Value columns: Year_2022, Year_2023
Variable col: Year
Value col: Budget_USD
In pandas (round-trip check):
df = pd.read_excel('unpivoted.xlsx')
df['Budget_USD'] = pd.to_numeric(df['Budget_USD'])
df.pivot_table(index=['Department','Cost Centre'],
columns='Year', values='Budget_USD', aggfunc='sum')Handle missing budget cells as NaN
An empty year cell becomes a row with an empty value, which read_excel surfaces as NaN. Decide whether to keep them (zero-fill) or drop them, matching how you would treat melt output with NaNs.
Input (Sales missing Year_2023):
Department,Year_2022,Year_2023
Sales,330000,
After unpivot -> row: Sales, Year_2023, (empty)
In pandas:
df = pd.read_excel('unpivoted.xlsx')
df['Budget_USD'] = pd.to_numeric(df['Budget_USD'], errors='coerce')
# option A: drop
df = df.dropna(subset=['Budget_USD'])
# option B: zero-fill
# df['Budget_USD'] = df['Budget_USD'].fillna(0)Edge cases and what actually happens
Year column holds header text, not a year
ExpectedThe variable column contains the original header verbatim (Year_2021), not 2021. For a numeric/sortable year, either extract it in pandas (str.extract(r'(\d{4})')) or rename the wide headers to bare years before unpivoting.
Budget values come in as strings
Type coercionCells are read with formatting applied, so a value shown as $540,000 may arrive as that text. read_excel often infers numeric, but currency/format strings stay text. Cast with pd.to_numeric(df['Budget_USD'], errors='coerce') before summing.
Empty cells become NaN rows
PreservedThe melt emits a row per value column even for blank cells; pandas reads these as NaN. Decide explicitly: dropna(subset=['Budget_USD']) to remove, or fillna(0) to treat missing budget as zero. The tool never drops them for you.
A dropped column you needed for groupby
Dropped by designOnly ID and value columns survive. If you later groupby('Cost Centre') but did not list it as an ID column, it is gone. Add every grouping dimension to the ID columns before reshaping.
A Total column melted into the budget
Double-count riskIf a Total or grand-total column is listed among value columns, it becomes rows and any sum() double-counts. Omit it from both boxes so it is dropped; recompute totals in pandas instead.
Value column name does not match the header
Empty value blockMatching is exact text after trimming. Year2021 vs Year_2021 yields empty values for that year across all rows — an all-NaN block in pandas. Copy header text precisely from the source.
Budget on a non-first sheet
Not readOnly the first sheet is read. A budget on a 'Detail' or second tab is ignored. Move it to the first tab before uploading, or read_excel on the result sees nothing useful.
Row count over the tier cap
RejectedThe input-row cap applies before reshaping (Pro 100,000 input rows). A large department-by-cost-centre budget can exceed it. Split the source, upgrade (Developer unlimited), or melt in pandas for the largest files.
Free tier blocked
Requires ProUn-Pivot throws 'Un-Pivot requires Pro tier' for free accounts. If you are already in pandas, df.melt(...) is free; the tool's value is producing a shareable .xlsx for non-Python colleagues and skipping the in-Python reshape.
Frequently asked questions
Why not just use pandas.melt() on the original wide file?
If you are loading the file into pandas anyway, df.melt(...) is perfectly good. The tool helps when you want the long data as an .xlsx for non-Python tools (Power BI, Tableau) or colleagues, or when you would rather hand pandas an already-tidy file than maintain the melt call.
Does the Year column contain 'Year_2021' or '2021'?
The original column name verbatim — so Year_2021. To get 2021, either rename the headers to bare years before unpivoting with the header rename tool or extract in pandas: df['Year'] = df['Year'].str.extract(r'(\d{4})').astype(int).
Why are my budget values strings in pandas?
Cells are read with their display formatting, so currency- or comma-formatted numbers can arrive as text. read_excel infers numeric where it can, but format strings persist. Cast with pd.to_numeric(df['Budget_USD'], errors='coerce') before arithmetic.
Is the output identical to df.melt()?
For a single value group, yes — the long shape matches df.melt(id_vars=ID, value_vars=VALUE, var_name=..., value_name=...). Both keep rows for missing values (melt does not drop NaN by default either), and both drop columns you do not name. To reverse it (long back to wide, like df.pivot_table), use the pivot generator.
How do I handle missing budget cells?
They become rows with an empty value, read into pandas as NaN. Choose explicitly: dropna(subset=['Budget_USD']) to remove, or fillna(0) to treat as zero budget. The tool preserves them so you decide.
Can I use the output directly with seaborn or matplotlib?
Yes, after casting types: sns.lineplot(data=df, x='Year', y='Budget_USD', hue='Department') works on the long output once Year is an int and Budget_USD is numeric. Long format is exactly what seaborn expects.
Will a Total column cause double-counting?
Yes if you melt it. A Total column listed as a value column becomes data rows and any sum() double-counts. Leave Total out of both boxes so it is dropped, then compute totals in pandas.
How many output rows will I get?
Input rows multiplied by the number of year columns. A 1,000-row budget with five years gives 5,000 rows. Confirm this stays under the Pro 100,000-input-row cap before running.
Which input formats and sheet does it use?
It accepts .xlsx, .xls, .ods, and .csv, reads the first sheet only, and outputs .xlsx (sheet Unpivoted, file unpivoted.xlsx) that pd.read_excel reads natively.
Is my budget data uploaded anywhere?
No. The reshape runs in your browser via SheetJS. Confidential budget figures never leave the tab — only an anonymous processed-file counter is recorded for dashboard stats, which you can opt out of.
Can I keep more than one dimension column?
Yes — list every dimension in the ID columns box, comma-separated (e.g. Department, Cost Centre, Region). They all repeat on every melted row, so any of them is available for groupby or pivot_table.
Can I wire this into an automated analysis pipeline?
Yes. GET /api/v1/tools/excel-unpivot returns the schema so a script or MCP client can build the payload; execution goes through a paired @jadapps/runner on your machine, so budget content never reaches JAD. A common flow: budget export -> runner unpivot -> pandas notebook reads the result.
Privacy first
Every JAD Excel tool runs entirely in your browser using SheetJS and ExcelJS. Your spreadsheets, formulas, and data never leave your device — verified by zero outbound network requests during processing.