How to set up: vendor invoices
- Step 1: Pair a runner and add the Google credential. Both `http-request` and `google-sheets` are connectors marked `runnerOnly`: they execute only on a paired `@jadapps/runner`, not in the browser. In the runner, store an OAuth2 credential with the Sheets scope under a name like `google-prod`; the workflow references it by that name via `credentialRef`.
- Step 2: Fork the blueprint. From the workflow page, fork the blueprint. The orchestrator copies the three-node chain (`http-request` → `pdf-to-text` → `google-sheets`) into a new private draft owned by you, wires consecutive ports, and snapshots it as version 1. A `forked` audit event records the source blueprint slug.
- Step 3: Configure the source URL. Open the `http-request` node and set `url` to the PDF you want (method `GET`, default `timeoutMs` 60000). The response streams to disk on the runner and passes to the next node as a file.
- Step 4: Set the Sheets target. Open the `google-sheets` node: action `appendRows`, set `spreadsheetId`, set `range` (e.g. `Sheet1!A:Z`), keep `valueInputOption` `USER_ENTERED` so dates and numbers parse, and set `credentialRef` to your stored credential name.
- Step 5: Run and watch the trace. Trigger the run. The orchestrator walks the topologically sorted nodes once, firing a status callback per node. Each node shows pending → running → done/error with its duration and a one-line summary; the per-node trace is persisted on the run record.
- Step 6 (optional): Schedule it. The blueprint ships with `scheduleCron: null` (manual/webhook). To run it on a schedule, set a `schedule_cron` on your forked workflow; the Cloudflare cron tick scans scheduled workflows, decides which are due since `last_fired_at`, and enqueues a run.
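Steps 3 and 4 come down to filling in a handful of fields on two nodes. The sketch below gathers them in one place and checks the required ones before a run. The interface names, the `checkChain` helper, the placeholder URL, and the placeholder spreadsheet ID are illustrative assumptions, not the product's actual schema; only the field names and defaults come from the steps above.

```typescript
// Hypothetical shapes: only the field names and defaults are taken from the steps above.
interface HttpRequestConfig {
  url: string;
  method: "GET";
  timeoutMs: number;
}

interface GoogleSheetsConfig {
  action: "appendRows";
  spreadsheetId: string;
  range: string;
  valueInputOption: "USER_ENTERED" | "RAW";
  credentialRef: string;
}

const httpRequestNode: HttpRequestConfig = {
  url: "https://example.com/invoices/latest.pdf", // placeholder URL
  method: "GET",
  timeoutMs: 60000, // documented default
};

const sheetsNode: GoogleSheetsConfig = {
  action: "appendRows",
  spreadsheetId: "YOUR_SPREADSHEET_ID", // placeholder
  range: "Sheet1!A:Z",
  valueInputOption: "USER_ENTERED",
  credentialRef: "google-prod", // name of the credential stored on the runner
};

// Returns a list of human-readable problems; an empty list means nothing obvious is missing.
function checkChain(http: HttpRequestConfig, sheets: GoogleSheetsConfig): string[] {
  const problems: string[] = [];
  if (!/^https?:\/\//.test(http.url)) problems.push("url must be an http(s) URL");
  if (http.timeoutMs <= 0) problems.push("timeoutMs must be positive");
  if (sheets.spreadsheetId.trim() === "") problems.push("spreadsheetId is required");
  if (sheets.range.trim() === "") problems.push("range is required");
  if (sheets.credentialRef.trim() === "") problems.push("credentialRef is required");
  return problems;
}

console.log(checkChain(httpRequestNode, sheetsNode));
```

A check like this only catches empty or malformed fields; it cannot tell whether the credential name actually exists on the runner.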
Frequently asked questions
What exactly does this workflow chain?
Three nodes in order: `http-request` (GET a PDF by URL), [pdf-to-text](/pdf-tools/pdf-to-text) (extract the text on the runner), then `google-sheets` with `action: appendRows` (append the result to a tab). That is the literal blueprint chain.
Does it ingest a whole folder of PDFs at once?
Not as shipped. The blueprint chain fetches a single PDF per run via `http-request`. To process many PDFs you trigger the workflow per file (e.g. by webhook), or add a for-each loop node around the chain yourself after forking.
Are my PDFs uploaded to JAD's servers?
No. PDF text extraction runs on your paired runner, and the connectors are `runnerOnly`. The only data sent off-machine is the extracted output going to Google Sheets over OAuth.
Where are my Google credentials stored?
On the runner. The `google-sheets` node references a credential by name (`credentialRef`, e.g. `google-prod`); the runner resolves the OAuth2 token at run time. The token is not stored on the JAD/Cloudflare server.
Do I need a paid plan to run this?
Yes, in practice: both `http-request` and `google-sheets` are Pro connectors (`isPro: true`) and require a paired runner. Note this is separate from the media tier-precheck: `precheckWorkflowTier` only blocks nodes whose `minTier` exceeds your tier, and these connectors carry no `minTier`, so they do not trip the pro_media gate.
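To make the distinction concrete, here is a minimal sketch of a tier precheck that behaves as described: a node is blocked only when it declares a `minTier` above the user's tier. The tier names other than `pro_media`, their ordering, the `NodeMeta` shape, the `blockedNodes` helper, and the `video-transcode` node are assumptions for illustration; the real `precheckWorkflowTier` may be implemented differently.

```typescript
// Hypothetical tier ladder; only "pro_media" is named in the text above.
type Tier = "free" | "pro" | "pro_media";
const TIER_RANK: Record<Tier, number> = { free: 0, pro: 1, pro_media: 2 };

interface NodeMeta {
  id: string;
  minTier?: Tier; // absent on http-request and google-sheets, per the answer above
}

// Returns the ids of nodes whose minTier is higher than the user's tier.
function blockedNodes(nodes: NodeMeta[], userTier: Tier): string[] {
  return nodes
    .filter((n) => n.minTier !== undefined && TIER_RANK[n.minTier] > TIER_RANK[userTier])
    .map((n) => n.id);
}

const invoiceChain: NodeMeta[] = [
  { id: "http-request" },
  { id: "pdf-to-text" },
  { id: "google-sheets" },
];

// No node in this chain declares a minTier, so nothing is blocked by this gate.
console.log(blockedNodes(invoiceChain, "free"));

// A hypothetical media node with minTier "pro_media" would be blocked for a "pro" user.
console.log(blockedNodes([...invoiceChain, { id: "video-transcode", minTier: "pro_media" }], "pro"));
```

Passing this gate says nothing about the separate `isPro` and paired-runner requirements, which are checked elsewhere.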
Is this workflow cron-scheduled out of the box?
No — the blueprint sets `scheduleCron: null`, so a fork runs manually or by webhook. Scheduling is opt-in: set a `schedule_cron` on your forked workflow and the Cloudflare cron tick will enqueue runs when they are due.
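The "due since `last_fired_at`" decision can be pictured as a scan over the minutes between the last firing and now. The sketch below is an assumption about how such a check might look, not the actual cron tick: it handles only a small cron subset (`*`, `*/n`, integers, comma lists), evaluates everything in UTC, and the function names are hypothetical.

```typescript
// Minimal cron subset for illustration: "*", "*/n", integers, and comma lists.
// Ranges ("1-5"), names ("MON"), and the day-of-month/day-of-week OR rule are not handled.
const FIELD_MINIMUMS = [0, 0, 1, 1, 0]; // minute, hour, day of month, month, day of week

function fieldMatches(field: string, value: number, minimum: number): boolean {
  return field.split(",").some((part) => {
    if (part === "*") return true;
    if (part.startsWith("*/")) {
      const step = Number(part.slice(2));
      return Number.isInteger(step) && step > 0 && (value - minimum) % step === 0;
    }
    return part !== "" && Number(part) === value;
  });
}

function cronMatches(expr: string, at: Date): boolean {
  const fields = expr.trim().split(/\s+/);
  if (fields.length !== 5) throw new Error("expected 5 cron fields");
  const values = [
    at.getUTCMinutes(),
    at.getUTCHours(),
    at.getUTCDate(),
    at.getUTCMonth() + 1,
    at.getUTCDay(),
  ];
  return fields.every((field, i) => fieldMatches(field, values[i] ?? -1, FIELD_MINIMUMS[i] ?? 0));
}

// A workflow is due if any whole minute after lastFiredAt, up to and including now, matches.
function isDue(scheduleCron: string, lastFiredAt: Date, now: Date): boolean {
  const minuteMs = 60000;
  let cursor = Math.floor(lastFiredAt.getTime() / minuteMs) * minuteMs + minuteMs;
  for (; cursor <= now.getTime(); cursor += minuteMs) {
    if (cronMatches(scheduleCron, new Date(cursor))) return true;
  }
  return false;
}

const lastFired = new Date("2024-01-01T10:01:00Z");
console.log(isDue("*/15 * * * *", lastFired, new Date("2024-01-01T10:10:00Z"))); // false
console.log(isDue("*/15 * * * *", lastFired, new Date("2024-01-01T10:16:00Z"))); // true
```

Scanning from the last firing rather than checking only the current minute means a tick that runs slightly late still picks up a slot it would otherwise miss.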
How does forking work?
Forking calls the from-blueprint route, which copies the three-node chain into a new private draft owned by you, wires consecutive ports, snapshots it as version 1, and writes a `forked` audit event tagged with the blueprint slug.
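As a rough picture of what the fork produces, the sketch below builds a private draft from an ordered list of node types, wires each node to the next, stamps it as version 1, and emits a `forked` audit record. The `forkBlueprint` function, the data shapes, the node ids, and the `example-blueprint` slug are all hypothetical; the real from-blueprint route is not shown in this document.

```typescript
// Hypothetical shapes for illustration; the real draft and audit schemas are not documented here.
interface WorkflowNode { id: string; type: string }
interface Edge { from: string; to: string }
interface Draft {
  ownerId: string;
  visibility: "private";
  version: number;
  nodes: WorkflowNode[];
  edges: Edge[];
}
interface AuditEvent { kind: "forked"; blueprintSlug: string }

function forkBlueprint(
  blueprintSlug: string,
  nodeTypes: string[],
  ownerId: string,
): { draft: Draft; audit: AuditEvent } {
  const nodes: WorkflowNode[] = nodeTypes.map((type, i) => ({ id: `n${i + 1}`, type }));

  // Wire consecutive ports: each node's output feeds the next node's input.
  const edges: Edge[] = [];
  let previous: WorkflowNode | undefined;
  for (const node of nodes) {
    if (previous) edges.push({ from: previous.id, to: node.id });
    previous = node;
  }

  return {
    draft: { ownerId, visibility: "private", version: 1, nodes, edges },
    audit: { kind: "forked", blueprintSlug },
  };
}

const forked = forkBlueprint(
  "example-blueprint", // placeholder slug
  ["http-request", "pdf-to-text", "google-sheets"],
  "user-123", // placeholder owner id
);
console.log(JSON.stringify(forked.draft.edges));
```

For a three-node chain this yields two edges, which is all "wires consecutive ports" needs to mean for a linear blueprint.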
Can I see a run history / trace?
Yes. Each run records a `WorkflowRunTrace[]` (per-node status, duration, output size, and a one-line summary) plus an `output_summary` with step/success counts and total duration. The run panel renders each node pending → running → done/error as the orchestrator fires its `onStep` callbacks.
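The sketch below shows one way a topological walk with per-node callbacks and a trace could fit together. It is a simplified, synchronous stand-in written for this explanation: the field names inside `WorkflowRunTrace` and `RunSummary`, the string-passing `Executor` type, and the stop-on-first-error behaviour are assumptions, not the orchestrator's real types.

```typescript
type StepStatus = "pending" | "running" | "done" | "error";
interface WorkflowRunTrace {
  nodeId: string;
  status: StepStatus;
  durationMs: number;
  outputSize: number;
  summary: string;
}
interface RunSummary { steps: number; succeeded: number; totalDurationMs: number }
type Executor = (input: string) => string; // simplified: each node maps text to text

// Kahn's algorithm: repeatedly emit nodes that have no unprocessed predecessors.
function topoSort(nodeIds: string[], edges: Array<[string, string]>): string[] {
  const indegree = new Map<string, number>();
  for (const id of nodeIds) indegree.set(id, 0);
  for (const [, to] of edges) indegree.set(to, (indegree.get(to) ?? 0) + 1);
  const ready = nodeIds.filter((id) => indegree.get(id) === 0);
  const order: string[] = [];
  for (let next = ready.shift(); next !== undefined; next = ready.shift()) {
    order.push(next);
    for (const [from, to] of edges) {
      if (from !== next) continue;
      const remaining = (indegree.get(to) ?? 0) - 1;
      indegree.set(to, remaining);
      if (remaining === 0) ready.push(to);
    }
  }
  if (order.length !== nodeIds.length) throw new Error("workflow graph has a cycle");
  return order;
}

function runWorkflow(
  order: string[],
  executors: Map<string, Executor>,
  onStep: (nodeId: string, status: StepStatus) => void,
): { trace: WorkflowRunTrace[]; summary: RunSummary } {
  const trace: WorkflowRunTrace[] = [];
  let carried = ""; // output of the previous node (assumes a linear chain)
  for (const id of order) onStep(id, "pending");
  for (const id of order) {
    onStep(id, "running");
    const started = Date.now();
    try {
      const exec = executors.get(id);
      if (exec === undefined) throw new Error(`no executor for ${id}`);
      carried = exec(carried);
      trace.push({ nodeId: id, status: "done", durationMs: Date.now() - started,
        outputSize: carried.length, summary: `${id} produced ${carried.length} chars` });
      onStep(id, "done");
    } catch (err) {
      trace.push({ nodeId: id, status: "error", durationMs: Date.now() - started,
        outputSize: 0, summary: err instanceof Error ? err.message : String(err) });
      onStep(id, "error");
      break; // assumption: stop at the first failing node
    }
  }
  const summary: RunSummary = {
    steps: trace.length,
    succeeded: trace.filter((t) => t.status === "done").length,
    totalDurationMs: trace.reduce((sum, t) => sum + t.durationMs, 0),
  };
  return { trace, summary };
}
```

A run panel would subscribe through `onStep` for live status and read the returned trace and summary once the walk finishes.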
What happens if a PDF has no tables or no text?
`pdf-to-text` returns an empty or near-empty string (it does not OCR scanned images). The chain still completes; `appendRows` may add a blank row. For scanned PDFs, run an OCR step before this chain.
Why did numbers like account IDs change in my sheet?
`valueInputOption` defaults to `USER_ENTERED`, which parses values as if a person had typed them into the cell: leading zeros are dropped and long numbers are displayed in scientific notation. Switch the node to `RAW` to keep cells as literal strings.
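For reference, the public Google Sheets API exposes this choice as the `valueInputOption` query parameter of `spreadsheets.values.append`. The sketch below only builds the request URL and body so the difference is visible; whether the `google-sheets` node issues the call in exactly this form is an assumption, and `buildAppendRequest` and the placeholder spreadsheet ID are hypothetical.

```typescript
type ValueInputOption = "USER_ENTERED" | "RAW";

interface AppendRequest {
  url: string;
  body: { values: string[][] };
}

// Builds the URL and JSON body for a spreadsheets.values.append call (no network I/O here).
function buildAppendRequest(
  spreadsheetId: string,
  range: string,
  rows: string[][],
  valueInputOption: ValueInputOption,
): AppendRequest {
  const url =
    `https://sheets.googleapis.com/v4/spreadsheets/${encodeURIComponent(spreadsheetId)}` +
    `/values/${encodeURIComponent(range)}:append?valueInputOption=${valueInputOption}`;
  return { url, body: { values: rows } };
}

// With RAW, "00123456" is stored as the literal string rather than parsed as a number.
const rawAppend = buildAppendRequest(
  "YOUR_SPREADSHEET_ID", // placeholder
  "Sheet1!A:Z",
  [["00123456", "ACME Ltd", "2024-01-31"]],
  "RAW",
);
console.log(rawAppend.url);
```

If only some columns need protecting, a common alternative is to keep `USER_ENTERED` and format those columns as plain text in the sheet itself.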
What if the source URL is slow or flaky?
Raise `timeoutMs` on the `http-request` node (default 60000, max 600000) and set its error policy to retry. The runner retries a failing step up to 3 times with linear backoff (200ms, 400ms) before surfacing the error.
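A linear backoff of this kind can be sketched in a few lines. The version below reads the numbers above as three attempts in total with waits of 200ms and then 400ms between them; that reading, along with the `withRetry` and `backoffDelays` names and the injectable `sleep` parameter, is an assumption rather than the runner's actual code.

```typescript
// Delays between attempts under linear backoff: base, 2 * base, 3 * base, ...
function backoffDelays(maxAttempts: number, baseDelayMs: number): number[] {
  const delays: number[] = [];
  for (let attempt = 1; attempt < maxAttempts; attempt++) delays.push(baseDelayMs * attempt);
  return delays;
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Runs `step` until it succeeds or maxAttempts is reached, waiting longer after each failure.
async function withRetry<T>(
  step: () => Promise<T>,
  sleep: (ms: number) => Promise<void> = defaultSleep,
  maxAttempts = 3,
  baseDelayMs = 200,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await step();
    } catch (err) {
      lastError = err;
      // No wait after the final attempt; the error is surfaced instead.
      if (attempt < maxAttempts) await sleep(baseDelayMs * attempt);
    }
  }
  throw lastError;
}

console.log(backoffDelays(3, 200)); // [ 200, 400 ]
```

Passing `sleep` in as a parameter keeps the helper easy to test without real waiting; production code would simply use the default.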
What other workflows are like this one?
See [csv-to-slack-summary](/workflows/csv-to-slack-summary) for a data-clean-then-connector pattern, [rss-to-notion-digest](/workflows/rss-to-notion-digest) for a fetch-then-create-page chain, and [video-transcode-to-r2](/workflows/video-transcode-to-r2) for a local-process-then-upload chain.