Extract a PDF Schedule or Timetable to JSON

How to extract a PDF schedule or timetable table to JSON

Step 1
Open the tool and drop the schedule PDF — Load the programme or timetable into the PDF Table to JSON tool. It extracts immediately in the browser — no options to set.
Step 2
Confirm the event columns in the preview — Check the first 20 objects: keys should be the schedule's real columns (time, session, room, speaker). If a day banner or title became the keys, see the edge cases.
Step 3
Download the JSON array — Save <name>.json — a flat array of event objects across every page of the schedule.
Step 4
Merge wrapped session rows and drop day banners — A long session title that wrapped becomes an extra row with an empty time cell; merge it onto the row above. Day headers ('Tuesday') come through as near-empty rows — capture them as a date field, then filter them out.
Step 5
Normalise dates and times to ISO 8601 — Combine the captured date with each event's printed time and convert to ISO 8601 (2026-06-02T14:30:00). A library like day.js or Luxon makes the timezone and AM/PM handling explicit and testable.
Step 6
Build .ics or POST to a calendar API — Calendar apps want iCal (.ics), not raw JSON — generate it from the normalised events (e.g. with ical-generator), or POST the events to a booking/calendar API.
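For the POST route in Step 6, the request could look roughly like the sketch below. The endpoint URL and the payload field names (title, location, start, end) are placeholders invented for illustration, not any particular calendar API; map them to whatever your API documents. It assumes events already carry ISO 8601 start and end strings and that a global fetch is available (modern browsers, Node 18+).

```javascript
// Hypothetical endpoint: replace with your calendar or booking API.
const CALENDAR_API_URL = "https://example.com/api/events";

// Build the fetch() options for one normalised event.
// The payload field names are assumptions, not a real API contract.
function toRequest(event) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      title: event.Session,
      location: event.Room,
      start: event.start,
      end: event.end,
    }),
  };
}

// Send events one at a time and stop on the first failure.
async function postEvents(events) {
  for (const e of events) {
    const res = await fetch(CALENDAR_API_URL, toRequest(e));
    if (!res.ok) throw new Error(`POST failed with status ${res.status}`);
  }
}
```

Keeping the payload builder separate from the network call makes the mapping easy to unit-test without hitting the API.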

Schedule fields: how each extracts and what to finish

The generic row/column extraction applied to a typical programme, with the schedule-specific cleanup your code does.

Schedule element	What you get	Your normalise step
Time / time range	String as printed (`"9:00–10:30"`, `"14:30"`)	Parse to start/end ISO 8601 datetimes
Date / day banner	Often a near-empty row (`"Tuesday"` then blanks)	Capture as the current `date`, then filter the row
Session / title	String; long titles may wrap to a second row	Merge continuation rows (empty time = belongs above)
Room / track / speaker	Own keys when columnar	Map to your event fields
Grid timetable cell	Flattened by visual row, not one-per-cell	Reshape rows×columns into individual events
Reprinted page header	Becomes a data row from page 2 onward	Filter `r.Time !== 'Time'`

Tier limits for schedule PDFs

Most programmes and timetables are small; multi-day conference books can run long.

Tier	Max file size	Max pages
Free	2 MB	50
Pro	50 MB	500
Pro + Media	500 MB	2,000
Developer	2 GB	10,000

Cookbook

A real conference programme extraction and the steps that turn printed times into calendar-ready events.

A day's programme with a day banner

Note that 'Tuesday 2 June' becomes a near-empty row, and the long keynote title wraps to a second row with an empty time.

PDF:
Time         Session                         Room
Tuesday 2 June
09:00–10:00  Opening keynote: the road
             ahead for the platform          Main Hall
10:15–11:00  Workshop A                       Room 2

Downloaded JSON:
[
  { "Time": "Tuesday 2 June", "Session": "",                              "Room": "" },
  { "Time": "09:00–10:00",   "Session": "Opening keynote: the road",     "Room": "" },
  { "Time": "",              "Session": "ahead for the platform",        "Room": "Main Hall" },
  { "Time": "10:15–11:00",   "Session": "Workshop A",                    "Room": "Room 2" }
]

Carry the date down and merge wrapped titles

Walk the rows once: remember the current day banner, attach it to following events, and fold continuation rows (empty time) into the event above.

const rows = JSON.parse(json);          // json: the text of the downloaded file
const events = [];
let date = null;
for (const r of rows) {
  // Day banner: text such as "Tuesday 2 June" in the first column, nothing else
  if (/^[A-Za-z]+ \d/.test(r.Time) && !r.Session) { date = r.Time; continue; }
  if (!r.Time && events.length) {                       // wrapped continuation
    const prev = events[events.length - 1];
    prev.Session = `${prev.Session} ${r.Session}`.trim();
    if (r.Room) prev.Room = r.Room;                     // room may sit on the wrapped line
    continue;
  }
  events.push({ date, ...r });                          // attach the carried date
}

Normalise a printed time range to ISO 8601

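Combine the carried date with the printed time range and produce real start/end datetimes. The example treats the printed times as UTC so the output does not depend on the machine's local timezone; to anchor them to the venue's zone instead, use the day.js timezone plugin.

import dayjs from "dayjs";
import customParseFormat from "dayjs/plugin/customParseFormat.js";
import utc from "dayjs/plugin/utc.js";

dayjs.extend(customParseFormat);       // needed to parse with a format string
dayjs.extend(utc);

// The banner does not print a year, so pass it in explicitly.
function toISO(date, time, year = 2026) { // date "Tuesday 2 June", time "09:00–10:00"
  const [start, end] = time.split(/[–-]/).map(s => s.trim());
  const day = date.replace(/^[A-Za-z]+ /, "");          // "2 June"
  const mk = t => dayjs.utc(`${day} ${year} ${t}`, "D MMMM YYYY HH:mm").toISOString();
  return { start: mk(start), end: mk(end) };
}
toISO("Tuesday 2 June", "09:00–10:00");
// { start: "2026-06-02T09:00:00.000Z", end: "2026-06-02T10:00:00.000Z" }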

Generate an .ics calendar file

Calendars import iCal, not JSON. Build .ics from the normalised events so attendees can subscribe.

import fs from "node:fs";
import ical from "ical-generator";

// normalisedEvents: the merged events with start/end added by toISO()
const cal = ical({ name: "Conference 2026" });
for (const e of normalisedEvents) {
  cal.createEvent({
    start: new Date(e.start),
    end:   new Date(e.end),
    summary: e.Session,
    location: e.Room,
  });
}
fs.writeFileSync("programme.ics", cal.toString());

Reshape a grid timetable into per-cell events

A class timetable with days across the top flattens by row. Pivot it into one event per (time-slot, day) cell.

// rows keyed by { Time, Mon, Tue, Wed, Thu, Fri }
const days = ["Mon","Tue","Wed","Thu","Fri"];
const slots = JSON.parse(json).filter(r => r.Time && r.Time !== "Time");
const events = slots.flatMap(r =>
  days.filter(d => r[d]).map(d => ({ day: d, time: r.Time, subject: r[d] }))
);

Edge cases and what actually happens

Times and dates come out as text, not Date objects

Expected

Every value is a string, including "9:00 AM" and "2 June". The tool never parses temporal values. Convert to ISO 8601 in your code with a date library so timezone and AM/PM handling is explicit and testable.
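As a rough illustration of making the AM/PM handling explicit, the sketch below converts a printed 12-hour time to a 24-hour "HH:mm" string before it is combined with a date. The function name to24Hour is made up for this example, and it assumes times are printed as "9:00 AM", "9.00am" or plain "14:30"; it does not validate hour or minute ranges.

```javascript
// Convert a printed time to 24-hour "HH:mm".
// Returns null when the string does not look like a time at all.
function to24Hour(printed) {
  const text = printed.trim();
  const twelve = /^(\d{1,2})[:.](\d{2})\s*([AaPp])\.?[Mm]\.?$/.exec(text);
  if (twelve) {
    let hour = Number(twelve[1]) % 12;           // 12 AM and 12 PM both start from 0
    if (twelve[3].toLowerCase() === "p") hour += 12;
    return `${String(hour).padStart(2, "0")}:${twelve[2]}`;
  }
  const plain = /^(\d{1,2})[:.](\d{2})$/.exec(text);
  return plain ? `${plain[1].padStart(2, "0")}:${plain[2]}` : null;
}

to24Hour("9:00 AM");   // "09:00"
to24Hour("12:15 pm");  // "12:15"
to24Hour("12:05 AM");  // "00:05"
to24Hour("14:30");     // "14:30"
```

Because the function is pure, the 12 AM and 12 PM corner cases can be pinned down in unit tests.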

Day banner becomes a near-empty row

Capture then filter

A 'Tuesday 2 June' header that spans the row width arrives as a row with mostly empty cells. Capture it as the current date and attach it to following events, then drop the banner row (see the cookbook walk-through).

Session title wrapped onto two lines

Split row

Long titles wrap to a second visual line, which becomes a separate row with an empty time cell. Merge continuation rows into the event above — an empty time column is the tell-tale that a row is a continuation.

Grid timetable flattens by row

Reshape needed

A timetable with days across the top and time slots down the side extracts one object per time-slot row (with a key per day), not one event per cell. Pivot it in your code to get individual events (the cookbook shows the flatMap).

Repeated header on a multi-page programme

By design

From page 2 onward the reprinted column header is emitted as a data row. Filter r.Time !== 'Time' (or your first column) before building events.
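If you would rather not hard-code the first column's name, one possible variant drops any row in which every value equals its own key. This is a stricter test than the single-column filter and is only a sketch; dropRepeatedHeaders is a name chosen for illustration, and it assumes the reprinted header matches the keys exactly.

```javascript
// Remove rows that are a reprinted column header,
// i.e. rows where each cell repeats the name of its column.
function dropRepeatedHeaders(rows) {
  return rows.filter(
    r => !Object.entries(r).every(([key, value]) => value === key)
  );
}

const cleaned = dropRepeatedHeaders([
  { Time: "09:00–10:00", Session: "Opening keynote", Room: "Main Hall" },
  { Time: "Time", Session: "Session", Room: "Room" },   // header reprinted on page 2
  { Time: "10:15–11:00", Session: "Workshop A", Room: "Room 2" },
]);
// cleaned contains the two real events
```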

Scanned printed programme

Empty array

A scanned booklet has no text layer, so extraction returns nothing. Run PDF OCR first to add a text layer, then extract — and double-check times, since OCR can confuse 0/O and 1/l.
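One way to double-check times after OCR is to flag every Time value that does not match a strict pattern, so a letter O or l in place of a digit stands out. The sketch below assumes 24-hour times printed as "HH:mm" or "HH:mm–HH:mm" and should run after banner and header rows are removed, since those would be flagged too. The name suspiciousTimes is illustrative.

```javascript
// A single 24-hour time, and an optional range joined by an en dash or hyphen.
const TIME = "(?:[01]?\\d|2[0-3]):[0-5]\\d";
const TIME_OR_RANGE = new RegExp(`^${TIME}(?:\\s*[–-]\\s*${TIME})?$`);

// Return the rows whose non-empty Time cell fails the strict pattern.
function suspiciousTimes(rows) {
  return rows.filter(r => r.Time && !TIME_OR_RANGE.test(r.Time.trim()));
}

const flagged = suspiciousTimes([
  { Time: "09:00–10:00", Session: "Opening keynote" },
  { Time: "1O:15–11:00", Session: "Workshop A" },   // letter O instead of zero
  { Time: "", Session: "continuation row" },         // empty time is ignored here
]);
// flagged holds only the "1O:15–11:00" row
```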

Time range merges with the session text

Misaligned

If the time and the session sit very close horizontally, position-based grouping can merge them into one cell. Inspect the preview; split a "09:00 Opening keynote" cell with a leading-time regex in post-processing.
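A leading-time regex for that split might look like the sketch below. It accepts a single time or a range at the start of the cell and leaves cells without a leading time untouched. The function name and the returned keys (Time, Session) are assumptions chosen to match the column names used elsewhere on this page.

```javascript
// A time or time range at the start of the cell, followed by the title text.
const LEADING_TIME =
  /^(\d{1,2}[:.]\d{2}(?:\s*[–-]\s*\d{1,2}[:.]\d{2})?)\s+(.+)$/;

// Split a merged cell into its time and session parts.
function splitLeadingTime(cell) {
  const text = cell.trim();
  const m = LEADING_TIME.exec(text);
  return m ? { Time: m[1], Session: m[2] } : { Time: "", Session: text };
}

splitLeadingTime("09:00 Opening keynote");
// { Time: "09:00", Session: "Opening keynote" }
splitLeadingTime("09:00–10:00 Opening keynote");
// { Time: "09:00–10:00", Session: "Opening keynote" }
```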

Programme exceeds the free page limit

Blocked

A thick multi-day conference book can exceed 50 pages. Upgrade to Pro (500 pages) or extract just the schedule pages with PDF Extract Pages before running the tool.

Calendar app rejects the raw JSON

Wrong format

Calendars import iCal (.ics), not JSON. The tool's job is to get the events into structured data; convert that to .ics (e.g. with ical-generator) before importing into Google Calendar, Outlook, or Apple Calendar.
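To show roughly what the .ics output contains, here is a dependency-free sketch that writes a minimal iCalendar document by hand. It is not a substitute for a library: it does not fold long lines, the PRODID and UID values are placeholders, and it assumes each event has ISO 8601 start and end strings plus the Session and Room keys used on this page.

```javascript
// Format an ISO 8601 string as an iCalendar UTC date-time (YYYYMMDDTHHMMSSZ).
const icsDate = iso =>
  new Date(iso).toISOString().replace(/[-:]/g, "").replace(/\.\d{3}/, "");

// Escape backslash, semicolon, comma and newline in text values.
const icsText = s =>
  String(s ?? "")
    .replace(/\\/g, "\\\\")
    .replace(/([;,])/g, "\\$1")
    .replace(/\r?\n/g, "\\n");

// Build a VCALENDAR with one VEVENT per event; lines end with CRLF.
function toICS(events, now = new Date()) {
  const lines = ["BEGIN:VCALENDAR", "VERSION:2.0", "PRODID:-//example//schedule//EN"];
  events.forEach((e, i) => {
    lines.push(
      "BEGIN:VEVENT",
      `UID:${i}-${icsDate(e.start)}@example.invalid`,   // placeholder UID scheme
      `DTSTAMP:${icsDate(now.toISOString())}`,
      `DTSTART:${icsDate(e.start)}`,
      `DTEND:${icsDate(e.end)}`,
      `SUMMARY:${icsText(e.Session)}`,
      `LOCATION:${icsText(e.Room)}`,
      "END:VEVENT"
    );
  });
  lines.push("END:VCALENDAR");
  return lines.join("\r\n") + "\r\n";
}
```

For anything beyond a quick export, a maintained library is the safer choice because it handles folding, timezones and recurring events.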

Frequently asked questions

Do dates and times come out as text or as typed values?

As text — every value is a string, including "9:00 AM" and "2 June". The tool never parses temporal values, so the timezone and format decisions stay in your code. Convert to ISO 8601 with a date library (day.js, Luxon) after extracting; the cookbook shows a worked example.

What happens with a schedule that spans multiple pages?

All pages' rows are combined into one flat array, so a multi-page programme comes through as one continuous list of events — convenient. The only thing to handle is the reprinted column header on page 2+, which appears as a data row; filter it out (r.Time !== 'Time').

Can I import the JSON straight into Google Calendar?

Not directly — Google Calendar (and Outlook, Apple Calendar) import iCal (.ics), not JSON. Normalise the events to ISO 8601, then generate .ics with a library like ical-generator. The cookbook includes both steps.

How do I handle the day headers like 'Tuesday'?

They come through as near-empty rows. Walk the array once, remember the most recent day banner, and attach it as a date field to the events that follow, then drop the banner rows. This gives each event a full date to combine with its printed time.

My timetable is a grid (days across the top) — what do I get?

One object per time-slot row, with a key for each day column. To get individual events you pivot that into one event per (slot, day) cell — a short flatMap, shown in the cookbook. The extraction itself can't know a grid is a grid; it reads visual rows.

Why are long session titles split across two rows?

Rows are defined by vertical position, so a title that wraps to a second line becomes its own row with an empty time cell. Merge continuation rows (empty time means it belongs to the event above) when building your events.

Is the schedule uploaded anywhere?

No. Extraction runs entirely in your browser via PDF.js. Unpublished programmes and internal rosters never leave your device; only anonymous usage counters are recorded when you're signed in.

My programme is a scanned booklet — will it work?

Not as-is: a scan has no text layer, so you'll get an empty array. Run PDF OCR first, then extract, and verify the times because OCR can misread digits like 0/O and 1/l.

Can I keep room and speaker columns separate?

Yes, when they're distinct columns in the PDF, each becomes its own key (Room, Speaker). If they're crammed into one cell, you'll split them in post-processing. Check the preview to see how the columns landed before writing your import.
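As a sketch of that post-processing split, the function below separates a merged cell on a separator string. The " / " separator, the sample cell and the name splitRoomSpeaker are all assumptions for illustration; check the preview to see how your PDF actually joins the two values.

```javascript
// Split "Room 2 / A. Speaker" into room and speaker.
// Assumption: the room comes first and the separator appears once.
function splitRoomSpeaker(cell, separator = " / ") {
  const i = cell.indexOf(separator);
  if (i === -1) return { Room: cell.trim(), Speaker: "" };
  return {
    Room: cell.slice(0, i).trim(),
    Speaker: cell.slice(i + separator.length).trim(),
  };
}

splitRoomSpeaker("Room 2 / A. Speaker");
// { Room: "Room 2", Speaker: "A. Speaker" }
splitRoomSpeaker("Main Hall");
// { Room: "Main Hall", Speaker: "" }
```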

What are the size and page limits?

Free: 2 MB and 50 pages. Pro: 50 MB / 500 pages. Pro + Media: 500 MB / 2,000 pages. Developer: 2 GB / 10,000 pages. For a long conference book, upgrade or extract just the schedule pages first with PDF Extract Pages.

Can I extract a fillable booking form instead of a printed table?

If the data is in AcroForm fields rather than a printed grid, use the PDF Form Field Extractor, which reads field names and values directly. Use this table tool for printed timetables and programmes.

Why does the preview only show 20 events?

The on-page preview caps at the first 20 objects to stay responsive and reports the total count. The downloaded .json contains every event — the cap is display-only.

Privacy first

All PDF processing runs locally in your browser using PDF-lib and pdf.js. No file is ever uploaded — only metadata counters are saved for signed-in dashboard stats.

