How do I convert a bank statement PDF that includes multiple accounts into separate CSV files?

Jan 22, 2026

If your bank sends “combined” statements—one PDF with several accounts—you’ve felt the pain. Hours of copy/paste just to reconcile or import. No thanks.

Here’s the fix: split that combined bank statement by account number and export clean, separate CSV files in minutes. No manual slicing. No guesswork.

Below, you’ll see why multi-account PDFs are tricky, the fastest way to handle them with BankXLSX, and a simple workflow you can reuse every month. I’ll also show a DIY Excel Power Query option for clean digital PDFs, what to do with scanned statements (OCR), and how to batch large piles of files. You’ll get a validation checklist, common pitfalls, and tips for naming, deduping, and handling currency.

Whether you’re closing the books, feeding a BI report, or prepping for audit, this will take you from one messy PDF to accurate per-account CSVs—fast and consistent.

Who this guide is for and what you’ll achieve

Controllers, accountants, finance managers, bookkeepers—anyone stuck with combined statements. If you can’t easily export a bank statement to CSV/Excel with multiple accounts, this is for you.

  • A repeatable way to split one PDF into separate files—one for each account.
  • Clean columns: Date, Description, Amount, Balance, Reference. Signs and formats done right.
  • File names people understand and auditors won’t question.

Example: a mid-market SaaS team runs four operating accounts and one payroll account. They get one 18-page PDF each month. Instead of copy/paste, they auto-split and export five CSVs—{account_number}_{yyyy-mm}.csv—and import them straight into the subledger.

Most “bad data” isn’t bad. It’s tiny format mismatches. Decide once how to treat parentheses, decimals, wrapped descriptions, and you’ll stop fixing the same issue every month.

Why multi-account PDF statements are tricky to convert

Humans see account sections. Extractors don’t. Account names and numbers often live in headers or section titles, not in the rows. Once the table is pulled out, the row loses its account context unless you add it back.

Layouts also vary inside the same PDF. One section shows Debit/Credit columns. Another shows a single Amount with a running balance. ACH and card settlements love long, wrapped descriptions, so you have to merge multi-line transaction descriptions into one row or the import goes sideways.

  • Negative numbers in parentheses. Miss this, and totals won’t match.
  • Locale mix. 1.234,56 vs 1,234.56—same document, different sections.
  • Scanned vs digital. Scans need OCR; low DPI turns 0 into O and 1 into I.
  • Balance or subtotal lines that masquerade as transactions.

Real example: merchant settlement pages often stuff subtotals and footnotes between batches. If you don’t filter them out, you’ll import fake “transactions.”
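
Here is a minimal Python sketch of the amount cleanup described above, assuming you already know which decimal separator a given section uses. The name parse_amount and its decimal_sep parameter are invented for illustration and are not part of any tool named in this guide.

```python
from decimal import Decimal


def parse_amount(text: str, decimal_sep: str = ".") -> Decimal:
    """Parse a statement amount such as '(1,234.56)' or '1.234,56'."""
    s = text.strip()
    negative = False
    # Parentheses, a leading minus, or a trailing minus all mean "negative".
    if s.startswith("(") and s.endswith(")"):
        negative, s = True, s[1:-1]
    if s.endswith("-"):
        negative, s = True, s[:-1]
    if s.startswith("-"):
        negative, s = True, s[1:]
    # Drop currency symbols and spaces; keep digits and separators only.
    s = "".join(ch for ch in s if ch.isdigit() or ch in ".,")
    thousands_sep = "," if decimal_sep == "." else "."
    s = s.replace(thousands_sep, "").replace(decimal_sep, ".")
    value = Decimal(s)
    return -value if negative else value
```

Pass decimal_sep="," for sections printed as 1.234,56. Decimal is used instead of float so cents stay exact when you total a section.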

Quick answer: the fastest path to clean, separate CSVs

Use a bank-statement tool that auto-detects account sections and exports per-account CSVs in one go. With BankXLSX, you set a profile for your bank’s layout—how accounts are found, how columns map, how dates and parentheses work—then reuse it each month.

  • Upload the combined PDF.
  • BankXLSX finds account sections like “Account ending in 1234” or “Checking 4567” and tags rows correctly.
  • It outputs one CSV per account, consistently named and structured.

Most teams go from hours of manual work to a few minutes of review. The real win is predictability: once your rules are in, you won’t chase oddball fees or refunds again. Audits get easier too—same inputs, same outputs, plus a manifest.

Prerequisites and file preparation

Do a tiny bit of setup so the export behaves.

  • Gather your PDFs. Note digital vs scanned. Have passwords handy if they’re locked.
  • Decide outputs. CSV, XLSX, or both? Pick a per-account file naming plan: {account_number}_{yyyy-mm}.csv. Prefer names over numbers? Map them.
  • Choose formats. Use YYYY-MM-DD for dates. Confirm sign rules. Normalize dates and currency formats to match your ERP.
  • Pick columns. Usually: Date, Description, Amount (signed), optional Balance, Reference/Check No., Currency. If you need Debit and Credit separately, plan that mapping now.
  • Plan checks. Opening/closing balances, transaction counts, totals—per account.
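
If you script any part of this, the naming plan is easy to pin down in code. A small sketch, assuming the statement period is available as a date; output_name and the aliases mapping are hypothetical names used only for this example.

```python
from datetime import date
from typing import Dict, Optional


def output_name(account_number: str, period: date,
                aliases: Optional[Dict[str, str]] = None) -> str:
    """Build a name like 1234_2026-01.csv, or use a friendly alias if mapped."""
    label = (aliases or {}).get(account_number, account_number)
    # Replace anything that is awkward in a file name with an underscore.
    safe = "".join(ch if ch.isalnum() or ch in "-_" else "_" for ch in label)
    return f"{safe}_{period:%Y-%m}.csv"
```

The alias only changes the file name. The account number should still travel inside the data.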

Example: global subsidiary? Use English OCR for scans, keep the original currency symbol, and add a second column for the transaction currency. FX can happen later in the ledger with better controls.

Step-by-step: Auto-split and convert with BankXLSX

  1. Create a conversion profile. Select PDF as the source. If you have scans, enable OCR and choose the language.

  2. Detect account sections. Pick what matches your statement:

    • Header pattern: text like “Account ending in 1234.”
    • Section delimiters: “Checking Account,” “Account Summary,” etc.
    • In-table column: if the table already includes an Account column.

  3. Map fields. Date, Description, Amount, Balance, Reference, Currency. If you have separate Debit and Credit, map them to a single signed Amount in CSV.

  4. Normalize and clean. Treat parentheses as negatives, set thousand/decimal separators, merge wrapped lines, and strip repeated headers/footers that sneak into tables.

  5. Preview. Upload a sample. Check that rows land under the right account. Spot-check totals, fees, and checks against the PDF.

  6. Configure outputs. One CSV per account. Set naming patterns. Optionally consolidate across multiple statements and turn on deduplication.

  7. Run and download. Process the backlog and send files to storage or your next system.

Tip: if your bank uses different layouts for cards vs operating accounts, create separate profiles. Simpler rules, fewer surprises.
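
To make step 3 concrete, here is one way the Debit/Credit mapping could work, sketched in Python. It assumes money in is positive and money out is negative; flip the sign if your ledger uses the opposite convention. The function name signed_amount is illustrative only.

```python
from decimal import Decimal
from typing import Optional


def signed_amount(debit: Optional[Decimal], credit: Optional[Decimal]) -> Decimal:
    """Collapse separate Debit/Credit cells into one signed Amount.

    Assumed convention: credits are positive, debits are negative.
    """
    debit = debit or Decimal("0")
    credit = credit or Decimal("0")
    if debit and credit:
        # A real transaction row should fill only one of the two columns.
        raise ValueError("row has both a debit and a credit; check the column mapping")
    return credit - debit
```

Raising on rows that fill both columns is a deliberate choice: it usually means a subtotal line or a column that drifted during extraction.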

Handling scanned PDFs and complex layouts

Working with scans? Use OCR, but set it up right so the numbers come out clean.

  • Scan at 300 DPI+ in grayscale or black-and-white.
  • Deskew and crop. Skewed pages cause column drift after OCR.
  • Pick the correct language pack. Avoid extra character sets you don’t need.

For tricky layouts, add guardrails:

  • Use multiple detection patterns. “Account ending in ####” and “Account Number: ####” can both appear.
  • Handle wraps explicitly. Merge rows with missing date/amount or clear indentation so long ACH memos stay together.
  • Segment by page ranges when needed. Skip check images and summary pages; keep just the transaction tables.

Example: a treasury team handling scans from different branches split one profile into two (legacy vs new templates). Manual fixes dropped to almost zero, and outputs stayed consistent.
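
For teams scripting their own detection, the "multiple patterns" guardrail can be sketched with Python's re module. The two patterns below are assumptions based on the phrasings quoted above, and keying on the last four digits is a simplification; your bank's wording may differ.

```python
import re
from typing import Optional

# Hypothetical patterns; extend the list for each header style you see.
ACCOUNT_PATTERNS = [
    re.compile(r"Account ending in\s+(\d{4})", re.IGNORECASE),
    re.compile(r"Account Number:\s*([\dXx*\- ]*\d{4})", re.IGNORECASE),
]


def find_account(line: str) -> Optional[str]:
    """Return the last four digits named in a header line, or None."""
    for pattern in ACCOUNT_PATTERNS:
        match = pattern.search(line)
        if match:
            digits = re.sub(r"\D", "", match.group(1))
            return digits[-4:]
    return None
```

Lines that return None are ordinary rows. They inherit the most recent account seen above them.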

Ensuring accuracy: validation and reconciliation checklist

Check a few things and you’ll avoid messy reimports later.

  • Balances. Opening and closing per account match the PDF. If balance rows aren’t exported, recompute the closing balance (opening plus the sum of signed amounts) to verify.
  • Signs. Debits and credits align with your system’s convention.
  • Dates. Watch for day/month swaps. Confirm min/max dates sit inside the statement period.
  • Counts. Transactions per account match the PDF or the page totals.
  • Narratives. Long descriptions survived the merge intact.
  • Filters. Balance and subtotal lines removed so they don’t import as transactions.
  • Duplicates. If you’re merging months, use a composite key (Date + Amount + Reference + a hash of the trimmed Description) to de-duplicate.
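
Two of these checks, balances and dates, are easy to automate. A minimal Python sketch, assuming each row is a (date, signed amount) pair and that you have copied the opening balance, closing balance, and statement period from the PDF. The function name validate_account is made up for this example.

```python
from datetime import date
from decimal import Decimal


def validate_account(rows, opening: Decimal, closing: Decimal,
                     start: date, end: date) -> list:
    """Return a list of problem messages; an empty list means the checks passed."""
    problems = []
    rows = list(rows)
    # Opening balance plus all signed amounts should equal the closing balance.
    computed = opening + sum((amount for _, amount in rows), Decimal("0"))
    if computed != closing:
        problems.append(f"balance mismatch: computed {computed}, statement {closing}")
    # Dates outside the period often point to a day/month swap.
    outside = [d for d, _ in rows if not (start <= d <= end)]
    if outside:
        problems.append(f"{len(outside)} row(s) dated outside {start}..{end}")
    return problems
```

Run it once per account file. A one-cent mismatch is still a mismatch, so compare exact Decimal values rather than rounded floats.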

Quick story: a controller kept seeing $0.01 interest differences. One savings section used European separators. Normalizing separators fixed every month after that.

Advanced configuration for scale and control

Small tweaks help when volume grows.

  • Account aliases. Show friendly names to humans; keep the original number in the data.
  • Duplicate logic. Composite keys plus a posting-date tolerance for weekends/holidays. Round amounts to two decimals before hashing so minor OCR or formatting quirks (10.5 vs 10.50) don’t change the key.
  • Multi-currency. Keep account currency and transaction currency as separate fields. Do FX later in the ERP for audit clarity.
  • Corporate cards. Split by card number if available, or include a Card column for department chargebacks.
  • Batch runs. Batch convert bank statement PDFs to per-account CSVs using a simple manifest and consistent naming.
  • Output manifests and logs. Capture filename, account ID, row count, period, and profile version for tracing.
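
The composite-key idea from the list above might look like this in Python. It is a sketch under stated assumptions: rows are (date, amount, reference, description) tuples, and the posting-date tolerance is left out to keep the example short. dedupe_key and dedupe are illustrative names.

```python
import hashlib
from datetime import date
from decimal import Decimal


def dedupe_key(posted: date, amount: Decimal, reference: str, description: str) -> str:
    """Hash the fields that identify a transaction across statements."""
    normalized_desc = " ".join(description.split()).upper()
    parts = [
        posted.isoformat(),
        str(amount.quantize(Decimal("0.01"))),  # 10.5 and 10.50 hash the same
        reference.strip(),
        normalized_desc,
    ]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


def dedupe(rows):
    """Keep the first occurrence of each key, preserving order."""
    seen, unique = set(), []
    for row in rows:
        key = dedupe_key(*row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique
```

One caution: two genuinely separate transactions with the same date, amount, reference, and description would collapse into one. If your bank produces those, add a sequence number or the running balance to the key.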

Treat your conversion profile like code. Version it, test on a sample, then run it on the real stuff. Reliable outputs, every time.

Alternative DIY workflow: Excel Power Query (best for simple digital PDFs)

Have clean digital PDFs and only a couple accounts? Excel Power Query can do the job.

  • Get Data from PDF. Pick the tables that hold transactions for each account.
  • Clean columns. Promote headers, set data types, normalize dates. If you have Debit/Credit, create a signed Amount column.
  • Add account context. If each table belongs to one account, add that number or name. If the number sits in nearby text, import it as a tiny table and merge.
  • Combine, then split. Append all tables into one dataset, then split by Account into separate queries.
  • Export. Load each query to its own worksheet, then save each sheet as a CSV with your naming pattern.

Example: a small agency with two accounts built one Excel file with four queries. They labeled each account, appended the tables, then exported per account. Next month they just hit refresh and re-export.

Heads-up: this falls apart with scans, mixed layouts, images, or when you need reliable auto-splitting by account number every time.

Light scripting approach for technical teams

Comfortable with Python or similar? A small script can be enough.

  • Extract. Use a PDF table extractor. If scanned, OCR first, then parse. Export raw rows per page.
  • Detect account context. Parse account headers or section titles and carry that account value down the rows until the next section.
  • Normalize. Merge multi-line descriptions by spotting rows missing dates/amounts. Convert parentheses to negatives. Standardize separators and dates.
  • Group and export. Group by account, de-duplicate, and write one CSV per account.
  • Test and monitor. Add checks for row counts and totals on known PDFs.

Pseudocode sketch:

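from itertools import groupby

rows = []
account = None
for page in pdf_pages:
    header = parse_header(page)            # account number/name, or None on continuation pages
    if header is not None:
        account = header.account           # carry the last account forward until the next section
    for row in table_rows(page):
        row.account = account
        rows.append(row)

rows = clean_and_normalize(rows)           # signs, dates, merged wraps
rows.sort(key=lambda r: r.account)         # groupby only groups adjacent rows, so sort first
for acct, grp in groupby(rows, key=lambda r: r.account):
    export_csv(acct, list(grp))            # one CSV per account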
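
The "merged wraps" part of clean_and_normalize deserves its own sketch. Here is one possible version in Python, assuming each row is a dict with date, description, and amount keys, and that a continuation line has neither a date nor an amount. The name merge_wrapped is hypothetical.

```python
def merge_wrapped(rows):
    """Fold continuation lines into the description of the row above them."""
    merged = []
    for row in rows:
        is_continuation = not row.get("date") and not row.get("amount")
        if is_continuation and merged:
            extra = (row.get("description") or "").strip()
            previous = merged[-1]
            previous["description"] = f"{previous['description']} {extra}".strip()
        else:
            merged.append(dict(row))  # copy so the input rows are not modified
    return merged
```

Run this before filtering or deduplication so a long ACH memo never becomes two "transactions."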

Build a small “layout fingerprint” for each bank template—fonts, column count, key phrases. If the bank redesigns the statement, the fingerprint fails early and you catch it before data moves downstream.
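
A fingerprint check can be very small. The sketch below is one possible shape in Python, covering only column count and key phrases; font checks are omitted because they depend on which PDF library you use. LayoutFingerprint and its fields are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayoutFingerprint:
    name: str
    column_count: int
    key_phrases: tuple

    def mismatches(self, page_text: str, table_rows: list) -> list:
        """Return a list of differences between a page and the expected layout."""
        problems = []
        lowered = page_text.lower()
        for phrase in self.key_phrases:
            if phrase.lower() not in lowered:
                problems.append(f"missing phrase: {phrase!r}")
        bad = [row for row in table_rows if len(row) != self.column_count]
        if bad:
            problems.append(f"{len(bad)} row(s) do not have {self.column_count} columns")
        return problems
```

If mismatches returns anything, stop the run and look at the PDF before exporting. Failing loudly is the whole point.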

Common pitfalls and how to avoid them

  • Losing account context. Always assign an Account column or use automatic section splitting. Don’t rely on page numbers.
  • Parentheses and locales. Handle negative amounts in parentheses and set the right separators to avoid 1.234,56 vs 1,234.56 confusion.
  • Wrapped narratives. Merge rows without dates/amounts so long descriptions don’t split into multiple transactions.
  • Balance/subtotal lines. Filter them out so only real transactions make it into your import.
  • Deduplication. Use composite keys and posting-date fallbacks. Some banks repeat end-of-month items on the next statement’s first page.
  • OCR issues. Require at least 300 DPI. Validate a sample before running a batch.
  • Password handling. Only process what you’re authorized to access. Store passwords securely, never hardcoded.

Keep a small “known issues” note per bank layout. Example: “interest lines don’t include Reference.” Saves future-you a lot of head-scratching.

Security, privacy, and compliance considerations

Treat statements like the sensitive data they are.

  • Access control. Limit who can upload, view, or download outputs. Use roles and SSO if possible.
  • Encryption. Protect data in transit and at rest. Private buckets, tight credentials.
  • Password handling. Use a secrets manager, not a spreadsheet or script variables.
  • Retention. Keep files long enough for audit, not longer than policy allows.
  • Auditability. Keep run logs/manifests with filename, account ID, rows, period, and profile version.
  • Change control. Version your profile. Test changes on a sample before you run the month’s batch.

Example: a portfolio company logged a SHA-256 hash per output and the profile version. When auditors asked to reproduce May, they reran the same inputs and matched the hash. Discussion over.
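
A manifest entry with a SHA-256 hash takes only a few lines with Python's standard library. This is a sketch, not a prescribed format: the field names follow the list above, and manifest_entry is a hypothetical helper that assumes the CSV has a header row and is UTF-8 encoded.

```python
import csv
import hashlib
import io


def manifest_entry(csv_bytes: bytes, filename: str, account_id: str,
                   period: str, profile_version: str) -> dict:
    """Describe one output file so a later rerun can be compared against it."""
    reader = csv.reader(io.StringIO(csv_bytes.decode("utf-8"), newline=""))
    data_rows = max(sum(1 for _ in reader) - 1, 0)  # exclude the header row
    return {
        "filename": filename,
        "account_id": account_id,
        "row_count": data_rows,
        "period": period,
        "profile_version": profile_version,
        "sha256": hashlib.sha256(csv_bytes).hexdigest(),
    }
```

Hash the exact bytes you write to disk. The same inputs and the same profile version should then reproduce the same hash.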

FAQs

Can I split one PDF into multiple CSVs automatically?
Yes. Set up account detection and export one CSV per account. That’s the whole point here.

What if the statement is scanned or low quality?
Turn on OCR, scan at 300 DPI or higher, and validate a sample so numbers and dates look right.

Can I consolidate multiple months into one per-account CSV without duplicates?
Yes. Use a composite key (Date + Amount + Reference + a hash of the trimmed Description) and allow for posting date shifts.

Can I export to XLSX instead of CSV?
Sure. Many teams use CSV for ingestion and XLSX for reviewers who want filters and pivots.

How do I name files by account name instead of number?
Create an alias map so {account_name}_{yyyy-mm}.csv is friendly, and keep the number inside the data.

Will this work with password-protected statements?
Yes, if you’re authorized. Provide the password securely during processing.

What about mixed currencies?
Keep both: account currency and transaction currency. Do FX later, where controls are better.

Next steps

  • Grab a representative combined PDF. Note if it’s digital or scanned and which accounts are inside.
  • Define outputs. Decide on CSV and/or XLSX, columns, signs, and your {account_number}_{yyyy-mm}.csv naming.
  • Set up BankXLSX. Enable auto-split by account, map fields, and set rules for parentheses, locales, and wrapped lines.
  • Run a pilot. Export per-account, compare totals and counts to the PDF, tweak once.
  • Scale up. Batch the backlog, drop outputs into your shared drive or data lake, connect to ERP or BI ingestion.
  • Operationalize. Version the profile, keep manifests, schedule the monthly run.

You’ll go from a single messy PDF to accurate, per-account CSVs in minutes. Clean data your team trusts. Auditors, too.

Key Points

  • Fastest route: use BankXLSX to auto-split account sections from a combined PDF and export one CSV (or XLSX) per account. Build a reusable profile, turn on OCR for scans, and batch large folders fast.
  • Trust the data: normalize dates and currency, map Debit/Credit to a single signed Amount, handle parentheses negatives, and merge multi-line descriptions. Always validate balances, counts, and signs.
  • Ready for scale: use clear file names (e.g., {account_number}_{yyyy-mm}.csv), dedupe across months, keep currencies, and store manifests/audit logs.
  • DIY options: Excel Power Query works for clean digital PDFs; light scripting helps for custom setups. More effort than a bank-focused tool.

Conclusion

Turning a multi-account statement PDF into clean, per-account CSVs doesn’t need to be tedious. Create a profile once, auto-split accounts, normalize dates and amounts (including parentheses), merge long descriptions, and verify balances before import. For scans, use OCR. At volume, stick to consistent names, deduping, and manifests for audit.

Excel Power Query is fine for simple digital PDFs. If you want fewer headaches month after month, set up BankXLSX, upload a sample combined statement, and export per-account CSV or XLSX files in minutes.