More and more receipts never touch paper. Your Uber fare, your Airbnb stay, your AWS bill, and your GitHub subscription all arrive as PDFs, sitting in your email or in another app. A receipt scanner that only works with the camera misses a large share of modern spending. This is a look at why PDF receipts are genuinely hard to handle on Android, and how Enceipt's ingestion pipeline deals with them.
Why PDF scanning is different from camera scanning
When you photograph a paper receipt, the app runs optical character recognition (OCR) on an image: it has to find text in pixels, deal with lighting, skew, and crumples. A PDF seems easier — surely the text is just there? Sometimes. But "PDF" covers two very different things, and the difference matters a lot.
- Text-based PDFs contain real, selectable text. Invoices from AWS, GitHub, Adobe, or Notion are usually like this. The text can be extracted directly, with no OCR needed — fast and accurate.
- Image-based PDFs are really just a photo wrapped in a PDF container. A scanned paper invoice, or a receipt someone photographed and "saved as PDF", usually has no extractable text at all. To read it you must run OCR on the embedded image, exactly as you would with a camera capture.
A good PDF scanner has to detect which kind it is dealing with and route accordingly. Treat an image PDF as text and you get nothing; treat a text PDF as an image and you throw away accuracy.
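As a rough illustration of that routing decision, here is a minimal Python sketch. It is not Enceipt's actual code: the names `has_usable_text`, `extract_pages`, and `ocr_page`, and the 20-character threshold, are all assumptions made for the example. It takes whatever text a PDF library returned for each page and falls back to OCR only for pages whose text layer is effectively empty.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical threshold: a page whose text layer has fewer letters and
# digits than this is treated as image-only and sent to OCR instead.
MIN_TEXT_CHARS = 20


@dataclass
class PageResult:
    index: int
    source: str  # "text-layer" or "ocr"
    text: str


def has_usable_text(layer_text: str, min_chars: int = MIN_TEXT_CHARS) -> bool:
    """Return True when the extracted text layer looks like real content."""
    meaningful = [c for c in layer_text if c.isalnum()]
    return len(meaningful) >= min_chars


def extract_pages(
    layer_texts: List[str],
    ocr_page: Callable[[int], str],
) -> List[PageResult]:
    """Route each page: keep the text layer if usable, otherwise run OCR.

    layer_texts holds the text a PDF library returned per page (possibly "").
    ocr_page is a stand-in for "render page i and run OCR on the image".
    """
    results = []
    for i, layer in enumerate(layer_texts):
        if has_usable_text(layer):
            results.append(PageResult(i, "text-layer", layer))
        else:
            results.append(PageResult(i, "ocr", ocr_page(i)))
    return results
```

Deciding per page rather than per document is a deliberate choice in this sketch: a mostly text-based invoice can still carry one scanned page, and that page should not be dropped.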
The common problems
Beyond the text-versus-image split, PDF receipts bring their own headaches:
- Multi-page invoices. A hotel folio or a cloud bill can run several pages, with the total on the last page and line items spread across the rest. Grabbing only the first page loses the number that matters.
- Amount versus address confusion. PDFs are full of numbers that are not the total: postal codes, phone numbers, invoice IDs, order numbers, dates rendered as digits. Picking the "biggest number" is a recipe for errors.
- Inconsistent layouts. Every vendor formats differently. "Total", "Amount due", "Grand total", "Charged to card" — the label that marks the real figure varies, and sometimes there are several candidate totals (subtotal, tax, total).
Enceipt's approach
Enceipt treats a shared document as the start of a pipeline rather than a single guess.
First, it detects the document type. If the PDF has an extractable text layer, Enceipt reads it directly. If it is an image-only PDF, Enceipt renders each page and runs the same on-device OCR it uses for the camera. Either way, the resulting text feeds the same parser.
Then it applies zone-aware parsing. Rather than scanning for the largest number, the parser understands receipt anatomy — where merchant identity, line items, and totals tend to sit, and which labels mark the amount actually paid. It discounts numbers that look like ZIP codes, phone numbers, or order IDs, and across multi-page documents it looks for the final total rather than an early subtotal.
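To make the idea concrete, here is a heavily simplified Python sketch of label-driven total selection. It is an illustration under stated assumptions, not Enceipt's parser: the label list, the scores, and the names `LABELS`, `AMOUNT_RE`, and `find_total` are invented for the example, and real zone-aware parsing would also use layout position. The sketch only accepts amount-shaped tokens (two decimals) on lines that carry a "paid" label, so ZIP codes, phone numbers, and order IDs never become candidates, and on ties it prefers the latest page and line.

```python
import re
from decimal import Decimal
from typing import List, Optional

# Checked in order, so "subtotal" is tested before the generic "total".
# Scores are illustrative assumptions; 0 means "never the amount paid".
LABELS = [
    ("subtotal", 0),
    ("charged to card", 4),
    ("amount due", 3),
    ("grand total", 3),
    ("total", 2),
    ("tax", 0),
]

# Amount-shaped token: digits (optionally comma-grouped) plus two decimals.
# The lookarounds reject pieces of longer dotted runs such as 2024.01.15.
AMOUNT_RE = re.compile(r"(?<![\d.])(?:\d{1,3}(?:,\d{3})+|\d+)\.\d{2}(?!\.?\d)")


def score_line(line: str) -> int:
    """Score a line by the first label it contains; 0 if none applies."""
    lowered = line.lower()
    for label, score in LABELS:
        if label in lowered:
            return score
    return 0


def find_total(pages: List[str]) -> Optional[Decimal]:
    """Pick the best-labelled amount, preferring later pages and lines."""
    best_key = None
    best_amount = None
    for p, page in enumerate(pages):
        for n, line in enumerate(page.splitlines()):
            score = score_line(line)
            if score == 0:
                continue  # unlabelled, subtotal, or tax line
            amounts = AMOUNT_RE.findall(line)
            if not amounts:
                continue
            key = (score, p, n)
            if best_key is None or key > best_key:
                best_key = key
                best_amount = Decimal(amounts[-1].replace(",", ""))
    return best_amount
```

For a two-page invoice with "Subtotal 1,200.00" on the first page and "Total 1,296.00" on the second, the subtotal line scores zero and is skipped, so the function returns the second-page total. A ZIP code such as 90210 is never considered because it lacks the two-decimal amount shape.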
For genuinely awkward documents, Pro users can bring their own AI provider. With a key for OpenAI, Anthropic Claude, Google Gemini, or a self-hosted Ollama endpoint, Enceipt can send the extracted text (never the image, never card numbers) to that provider to untangle a difficult layout — and it falls back to the on-device parser if the call times out.
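The fallback behaviour can be sketched in a few lines of Python. This is an assumption-laden illustration rather than Enceipt's implementation: `remote_parse` and `local_parse` are hypothetical callables standing in for the configured AI provider and the on-device parser, the 5-second default is arbitrary, and the card-number pattern is a simple guess at what redaction might look like.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Callable, Dict

# Runs of 13 to 19 digits, optionally separated by single spaces or hyphens,
# cover common payment card number lengths. This may over-redact other long
# digit runs, which is the safer direction to err in.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def redact_cards(text: str) -> str:
    """Replace anything that looks like a card number before sending text out."""
    return CARD_RE.sub("[REDACTED]", text)


def parse_with_fallback(
    text: str,
    remote_parse: Callable[[str], Dict[str, str]],
    local_parse: Callable[[str], Dict[str, str]],
    timeout_s: float = 5.0,
) -> Dict[str, str]:
    """Try the remote provider on redacted text; on timeout, parse locally."""
    safe_text = redact_cards(text)
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(remote_parse, safe_text)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        # The remote call took too long: use the on-device parser instead.
        # The original text never leaves the device on this path.
        return local_parse(text)
    finally:
        # Do not block on a slow remote call while cleaning up.
        executor.shutdown(wait=False)
```

Only the timeout case is handled here, matching the behaviour described above. A production version would probably also decide what to do when the provider returns an error or malformed output.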
Share from any app — no download dance
The best part is how little friction there is. You do not have to download the PDF, find it in a files app, and import it. On Android you simply use the share sheet:
- In Uber, Airbnb, your email client, or any app holding the receipt, tap Share.
- Choose Enceipt.
- Enceipt ingests the document on your device and opens the review screen with the merchant, total, and date filled in.
Everything happens locally. The PDF is read on your phone; nothing is uploaded unless you have explicitly configured a bring-your-own-key (BYOK) AI provider and chosen to use it.
Supported sources
Because the pipeline is built around general PDF handling rather than per-vendor hacks, it works with a wide range of senders, including:
- Uber and Lyft
- Airbnb and Booking.com
- AWS, GitHub, Adobe, Notion, Zoom, and Slack
- Amazon order summaries and countless standard invoices
If an app can produce or share a PDF, you can usually get it into Enceipt.
Why it stays private
It is worth repeating the through-line of everything Enceipt does: by default, the document, the extracted text, and the resulting expense all stay on your device. The one exception is extracted text that you explicitly choose to send to your own AI provider. There is no account and no server-side storage. A PDF receipt that lands in Enceipt becomes an encrypted local record and, when you are ready, part of a clean PDF or CSV report for your accountant.
Try it
If your receipts arrive as PDFs as often as on paper, you need a scanner that handles both. Enceipt does, on-device, with a share-sheet flow that takes seconds.