Google Drive Document Intake


Document intake from a Google Drive folder.

Google Drive is the default document source for most SMB and mid-market back offices. A Drive folder called “Inbox” or “To Process” accumulates attachments faster than any human can clear. A scheduled intake process watches the folder, extracts the key fields from each new document, queues draft records for review, and moves the source file to a dated archive.

Works on any Workspace tier

The process requires OAuth access to the Drive API. It does not depend on a specific Workspace edition such as Business, Business Plus, or Enterprise. Free personal Gmail accounts also work, but Drive API rate limits cap meaningful volume; plan on a Workspace tier above Starter if you process more than a few hundred documents per week.

Scheduled sequence

  1. Every N minutes (configurable; typically every 15 minutes during business hours and hourly off-hours), the process polls the designated Drive folder for new documents.
  2. Each new file is classified by type (invoice, PO, signed form, packing slip, etc.) using a combination of filename pattern and first-page OCR.
  3. Fields are extracted per document type. Invoice: vendor, date, amount, PO ref, line items. Signed form: form type, filer name, date, key fields. Each extraction produces a confidence score.
  4. High-confidence extractions (default: ≥95%) are queued as draft records in the destination system. Low-confidence extractions are flagged for human review with the raw document and the candidate extractions displayed side-by-side.
  5. After the reviewer approves (or corrects) the draft, the record is committed to the destination system. The source file is moved to a dated archive subfolder (e.g. archive/2026-04/).
  6. Daily summary: N documents processed, N auto-approved, N sent to review, N failed to parse (with reasons).
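
Steps 1, 4, 5, and 6 above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the actual implementation: the names Extraction, poll_interval_minutes, route, archive_subfolder, and daily_summary are hypothetical, business hours are assumed to be Monday to Friday, 09:00 to 17:00 local time, and drafts that clear the threshold are counted as auto-approved in the summary. No Drive API calls are made here.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date, datetime, time

AUTO_THRESHOLD = 0.95  # default confidence threshold from step 4


@dataclass
class Extraction:
    """Hypothetical result of classifying and extracting one document."""
    file_name: str
    doc_type: str        # e.g. "invoice", "po", "signed_form"
    confidence: float    # 0.0 to 1.0
    parsed: bool = True  # False when extraction failed outright


def poll_interval_minutes(now: datetime) -> int:
    """Step 1: 15 minutes in business hours, 60 otherwise.

    Business hours (Mon-Fri, 09:00-17:00) are an assumption for illustration.
    """
    in_hours = now.weekday() < 5 and time(9) <= now.time() < time(17)
    return 15 if in_hours else 60


def route(x: Extraction, threshold: float = AUTO_THRESHOLD) -> str:
    """Step 4: pick a queue for one extraction."""
    if not x.parsed:
        return "failed"
    return "draft" if x.confidence >= threshold else "review"


def archive_subfolder(processed_on: date) -> str:
    """Step 5: dated archive path, e.g. archive/2026-04/."""
    return f"archive/{processed_on:%Y-%m}/"


def daily_summary(batch: list[Extraction]) -> dict[str, int]:
    """Step 6: counts for the daily report."""
    counts = Counter(route(x) for x in batch)
    return {
        "processed": len(batch),
        "auto_approved": counts["draft"],   # assumption: drafts above threshold
        "sent_to_review": counts["review"],
        "failed_to_parse": counts["failed"],
    }
```

Keeping the routing decision in one small function means the threshold is applied in exactly one place, which makes it easier to confirm that nothing below it skips review.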

What it does not do

  • Does not edit or rewrite source files. Source is read-only; the archive is immutable.
  • Does not auto-approve extractions below the confidence threshold. A reviewer sees every ambiguous document.
  • Does not send email to the person who uploaded the document. This is back-office intake, not a support channel.
  • Does not try to learn from rejected extractions without explicit configuration. Silent ML drift corrupts the process faster than manual handling fixes it.

Typical month after go-live

For a folder that was receiving 1,000–2,000 manually keyed documents per month, expect 65–80% auto-approved, 15–25% reviewed and approved, and 3–10% flagged as unparseable and escalated. The specialist’s time shifts from keying to reviewing: roughly one-third of prior keying time goes away entirely, and the rest becomes faster review work. The 3–10% escalation queue is where the interesting Exception Log entries come from.
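
To turn these percentage ranges into queue sizes for planning, the arithmetic can be sketched as follows. The function name expected_ranges and the SHARES table are hypothetical; the percentages are the ones quoted above, and results are rounded down to whole documents.

```python
# Percentage ranges (low, high) quoted in the paragraph above.
SHARES = {
    "auto_approved": (65, 80),
    "reviewed_and_approved": (15, 25),
    "escalated": (3, 10),
}


def expected_ranges(monthly_docs: int) -> dict[str, tuple[int, int]]:
    """Return (low, high) document counts per outcome, rounded down."""
    return {
        outcome: (monthly_docs * low // 100, monthly_docs * high // 100)
        for outcome, (low, high) in SHARES.items()
    }


print(expected_ranges(1500))
```

For a folder receiving 1,500 documents per month, this gives 975 to 1,200 auto-approved, 225 to 375 reviewed and approved, and 45 to 150 escalated.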

See also: data-entry specialist role, validation inside Airtable.