Spreadsheet & Data Wrangling Master
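Provides a complete data-processing workflow covering cleaning, transformation, analysis, and automated report generation.
Extracts structured data from receipts, invoices, bank statements, and other documents through an API.
openclaw skills install @dbirulia/documents-ai
Real-time OCR and data extraction API: extracts structured data from receipts, invoices, bank statements, W-9 forms, purchase orders, and other documents, with support for document classification, fraud detection, and raw OCR text output.
Get an API key: https://app.veryfi.com/api/settings/keys/
Learn more: https://veryfi.com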
curl -X POST "https://api.veryfi.com/api/v8/partner/documents/" \
-H "Content-Type: multipart/form-data" \
-H "Client-Id: $VERYFI_CLIENT_ID" \
-H "Authorization: apikey $VERYFI_USERNAME:$VERYFI_API_KEY" \
-F "file=@invoice.pdf"

Response example:
{
"id": 62047612,
"created_date": "2026-02-19",
"currency_code": "USD",
"date": "2026-02-18 14:22:00",
"document_type": "receipt",
"category": "Meals & Entertainment",
"is_duplicate": false,
"vendor": {
"name": "Starbucks",
"address": "123 Main St, San Francisco, CA 94105"
},
"line_items": [
{
"id": 1,
"order": 0,
"description": "Caffe Latte Grande",
"quantity": 1,
"price": 5.95,
"total": 5.95,
"type": "food"
}
],
"subtotal": 5.95,
"tax": 0.52,
"total": 6.47,
"payment": {
"type": "visa",
"card_number": "1234"
},
"ocr_text": "STARBUCKS\n123 Main St...",
"img_url": "https://scdn.veryfi.com/documents/...",
"pdf_url": "https://scdn.veryfi.com/documents/..."
}

curl -X POST "https://api.veryfi.com/api/v8/partner/bank-statements/" \
-H "Content-Type: multipart/form-data" \
-H "Client-Id: $VERYFI_CLIENT_ID" \
-H "Authorization: apikey $VERYFI_USERNAME:$VERYFI_API_KEY" \
-F "file=@bank-statement.pdf"

Response example:
{
"id": 4820193,
"created_date": "2026-02-19T12:45:00.000000Z",
"bank_name": "Chase",
"bank_address": "270 Park Avenue, New York, NY 10017",
"account_holder_name": "Jane Doe",
"account_holder_address": "456 Oak Ave, San Francisco, CA 94110",
"account_number": "****7890",
"account_type": "Checking",
"routing_number": "021000021",
"currency_code": "USD",
"statement_date": "2026-01-31",
"period_start_date": "2026-01-01",
"period_end_date": "2026-01-31",
"beginning_balance": 12500.00,
"ending_balance": 11835.47,
"accounts": [
{
"number": "****7890",
"beginning_balance": 12500.00,
"ending_balance": 11835.47,
"summaries": [
{ "name": "Total Deposits", "total": 3200.00 },
{ "name": "Total Withdrawals", "total": 3864.53 }
],
"transactions": [
{
"order": 0,
"date": "2026-01-05",
"description": "Direct Deposit - ACME Corp Payroll",
"credit_amount": 3200.00,
"debit_amount": null,
"balance": 15700.00,
"category": "Income"
},
{
"order": 1,
"date": "2026-01-12",
"description": "Rent Payment - 456 Oak Ave",
"credit_amount": null,
"debit_amount": 2800.00,
"balance": 12900.00,
"category": "Housing"
},
{
"order": 2,
"date": "2026-01-20",
"description": "PG&E Electric Bill",
"credit_amount": null,
"debit_amount": 1064.53,
"balance": 11835.47,
"category": "Utilities"
}
]
}
],
"pdf_url": "https://scdn.veryfi.com/bank-statements/...",
"img_url": "https://scdn.veryfi.com/bank-statements/..."
}

# Visit the API credentials page
https://app.veryfi.com/api/settings/keys/

Save your API keys:
export VERYFI_CLIENT_ID="your_client_id_here"
export VERYFI_USERNAME="your_username_here"
export VERYFI_API_KEY="your_api_key_here"

Recommended: use environment variables (most secure):
{
skills: {
entries: {
"veryfi-documents-ai": {
enabled: true,
// Keys are loaded from environment variables:
// VERYFI_CLIENT_ID, VERYFI_USERNAME, VERYFI_API_KEY
},
},
},
}

Alternative: store the keys in the configuration file (use with caution):
{
skills: {
entries: {
"veryfi-documents-ai": {
enabled: true,
env: {
VERYFI_CLIENT_ID: "your_client_id_here",
VERYFI_USERNAME: "your_username_here",
VERYFI_API_KEY: "your_api_key_here",
},
},
},
},
}

Security note: if you store API keys in ~/.openclaw/openclaw.json, restrict the file's permissions:
chmod 600 ~/.openclaw/openclaw.json

curl -X POST "https://api.veryfi.com/api/v8/partner/documents/" \
-H "Content-Type: application/json" \
-H "Client-Id: $VERYFI_CLIENT_ID" \
-H "Authorization: apikey $VERYFI_USERNAME:$VERYFI_API_KEY" \
-d '{
"file_url": "https://example.com/invoice.pdf"
}'

curl -X POST "https://api.veryfi.com/api/v8/partner/any-documents/" \
-H "Content-Type: multipart/form-data" \
-H "Client-Id: $VERYFI_CLIENT_ID" \
-H "Authorization: apikey $VERYFI_USERNAME:$VERYFI_API_KEY" \
-F "file=@passport.jpg" \
-F "blueprint_name=passport"

curl -X POST "https://api.veryfi.com/api/v8/partner/checks/" \
-H "Content-Type: multipart/form-data" \
-H "Client-Id: $VERYFI_CLIENT_ID" \
-H "Authorization: apikey $VERYFI_USERNAME:$VERYFI_API_KEY" \
-F "file=@check.jpg"

curl -X POST "https://api.veryfi.com/api/v8/partner/w9s/" \
-H "Content-Type: multipart/form-data" \
-H "Client-Id: $VERYFI_CLIENT_ID" \
-H "Authorization: apikey $VERYFI_USERNAME:$VERYFI_API_KEY" \
-F "file=@w9.pdf"

There are no dedicated endpoints for W-2 and W-8 forms. Use the any-documents endpoint with the matching blueprint name:
# W-2
curl -X POST "https://api.veryfi.com/api/v8/partner/any-documents/" \
-H "Content-Type: multipart/form-data" \
-H "Client-Id: $VERYFI_CLIENT_ID" \
-H "Authorization: apikey $VERYFI_USERNAME:$VERYFI_API_KEY" \
-F "file=@w2.pdf" \
-F "blueprint_name=w2"
# W-8
curl -X POST "https://api.veryfi.com/api/v8/partner/any-documents/" \
-H "Content-Type: multipart/form-data" \
-H "Client-Id: $VERYFI_CLIENT_ID" \
-H "Authorization: apikey $VERYFI_USERNAME:$VERYFI_API_KEY" \
-F "file=@w8.pdf" \
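-F "blueprint_name=w8"

Note: W-2 and W-8 appear as classification types in the /classify/ endpoint, but their data is extracted through the Any Document endpoint. Do not send requests to /api/v8/partner/w2s/ or /api/v8/partner/w8s/; these endpoints do not exist.

The response from every extraction endpoint includes an ocr_text field containing the document's raw plain text. This is useful when you want to process the text yourself or pass it to a large language model (LLM).
# Extract and use jq to get ocr_text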
curl -X POST "https://api.veryfi.com/api/v8/partner/documents/" \
-H "Content-Type: multipart/form-data" \
-H "Client-Id: $VERYFI_CLIENT_ID" \
-H "Authorization: apikey $VERYFI_USERNAME:$VERYFI_API_KEY" \
-F "file=@document.pdf" \
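| jq '.ocr_text'

Note: ocr_text is plain text, not Markdown. If you need Markdown output, pass ocr_text to an LLM for reformatting after extraction.

Identify a document's type without running full data extraction. Useful for document routing, pre-upload filtering, or bulk classification.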
curl -X POST "https://api.veryfi.com/api/v8/partner/classify/" \
-H "Content-Type: multipart/form-data" \
-H "Client-Id: $VERYFI_CLIENT_ID" \
-H "Authorization: apikey $VERYFI_USERNAME:$VERYFI_API_KEY" \
-F "file=@document.pdf"

Note: by default, the API classifies against 15 built-in types. You can also pass a document_types array to define custom classes (see the example below).
Response example:
{
"id": 81023456,
"document_type": {
"score": 0.97,
"value": "invoice"
}
}

Default document types: receipt, invoice, purchase_order, bank_statement, check, w2, w8, w9, statement, contract, credit_note, remittance_advice, business_card, packing_slip, other.
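The score in the classification response can be used to decide whether to trust the predicted type. A minimal sketch, assuming a hypothetical helper named accept_classification and an illustrative 0.80 cutoff (neither comes from Veryfi); it falls back to the built-in other type when the score is too low:

```shell
# Hypothetical helper: accept a /classify/ result only above a score threshold.
# Usage: accept_classification TYPE SCORE [MIN_SCORE]
# The 0.80 default is an assumption for illustration, not a Veryfi recommendation.
accept_classification() {
    doc_type=$1
    score=$2
    min_score=${3:-0.80}
    # awk performs the floating-point comparison that sh arithmetic cannot.
    if awk -v s="$score" -v m="$min_score" 'BEGIN { exit !(s + 0 >= m + 0) }'; then
        printf '%s\n' "$doc_type"
    else
        # Below the threshold, fall back to the built-in catch-all type.
        printf '%s\n' "other"
    fi
}
```

With the sample response above, accept_classification invoice 0.97 prints invoice. The two arguments could be read from a saved response with jq -r '.document_type.value' and jq -r '.document_type.score'.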
To classify against custom types, pass a document_types array:
curl -X POST "https://api.veryfi.com/api/v8/partner/classify/" \
-H "Content-Type: multipart/form-data" \
-H "Client-Id: $VERYFI_CLIENT_ID" \
-H "Authorization: apikey $VERYFI_USERNAME:$VERYFI_API_KEY" \
-F "file=@document.pdf" \
-F 'document_types=["lease_agreement", "utility_bill", "pay_stub"]'

For layout analysis with element coordinates, add:
-F "bounding_boxes=true"
-F "confidence_details=true"

| Document type | Endpoint | Notes |
|---|---|---|
| Receipts and invoices | /api/v8/partner/documents/ | For receipts, invoices, and purchase orders |
| Bank statements | /api/v8/partner/bank-statements/ | For bank statements |
| Checks | /api/v8/partner/checks/ | For bank checks (spelled "cheque" in Canada) |
| W-9 forms | /api/v8/partner/w9s/ | For W-9 forms |
| W-2 / W-8 forms | /api/v8/partner/any-documents/ | Use blueprint_name=w2 or blueprint_name=w8 |
| Any document | /api/v8/partner/any-documents/ | Extracts data from any document; supported blueprints are listed below |
| Classification | /api/v8/partner/classify/ | Identifies the document type without full extraction |
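The table above can be turned into a small routing helper, for example to pick an extraction endpoint from a /classify/ result. This is only a sketch: endpoint_for_type is a hypothetical name, and the assumption that the classification type names map directly onto these endpoints is an illustration rather than documented behavior.

```shell
# Hypothetical helper: map a document type to an extraction endpoint.
# Prints "<path> <blueprint_name>"; the blueprint is "-" when none is needed.
endpoint_for_type() {
    case "$1" in
        receipt|invoice|purchase_order)
            echo "/api/v8/partner/documents/ -" ;;
        bank_statement)
            echo "/api/v8/partner/bank-statements/ -" ;;
        check)
            echo "/api/v8/partner/checks/ -" ;;
        w9)
            echo "/api/v8/partner/w9s/ -" ;;
        w2|w8)
            # No dedicated endpoint: use any-documents with a blueprint name.
            echo "/api/v8/partner/any-documents/ $1" ;;
        *)
            # Anything else needs a blueprint chosen by the caller.
            echo "unsupported document type: $1" >&2
            return 1 ;;
    esac
}
```

Here endpoint_for_type w2 prints /api/v8/partner/any-documents/ w2, where the second word would be sent as -F "blueprint_name=w2". Types without a row in the table return a non-zero status so the caller can decide how to handle them.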
Skill: Veryfi Documents AI
Version: 1.0.1
Available blueprints:
| blueprint_name | Document type |
|---|---|
| passport | US or international passport |
| incorporation_document | Certificate of incorporation |
| us_driver_license | US driver's license |
| uk_drivers_license | UK driver's license |
| us_health_insurance_card | US health insurance card |
| prescription_medication_label | Prescription medication label |
| medication_instructions | Medication instructions |
| vision_prescription | Vision prescription |
| auto_insurance_card | Auto insurance card |
| restaurant_menu | Restaurant menu |
| drinks_menu | Drinks menu |
| product_nutrition_facts | Product nutrition facts label |
| goods_received_note | Goods received note |
| vendor_statement | Vendor statement |
| flight_itinerary | Flight itinerary |
| bill_of_lading | Bill of lading |
| air_waybill | Air waybill |
| freight_invoice | Freight invoice |
| shipping_label | Shipping label |
| vehicle_registration | Vehicle registration |
| work_order | Work order |
| settlement_letter | Settlement letter |
| construction_estimate | Construction estimate |
| diploma | Diploma or degree certificate |
| price_sheet | Price sheet |
| mortgage_application_form | Mortgage application form |
| lab_test_request_form | Lab test request form |
| construction_snapshot | Construction site snapshot |
| medical_prescription_list | Medical prescription list |
| v5c | UK vehicle registration certificate (V5C) |
| bank_account_verification_letter | Bank account verification letter |
| annual_mortgage_statement | Annual mortgage statement |
| investment_account_statement | Investment account statement |
| certificate_of_good_standing | Certificate of good standing |
| w2 | IRS W-2 Wage and Tax Statement |
| w8 | IRS W-8 certificate of foreign status |
Missing a document type?
If the document type (blueprint) you need is not in the list, create it here:
https://app.veryfi.com/inboxes/anydocs?tab=blueprints

Bounding boxes and confidence:
- Add -F "bounding_boxes=true" to get element coordinates
- Add -F "confidence_details=true" to get field-level confidence scores

Supported input methods:
- file: multipart file upload
- file_url: a publicly accessible URL
- file_data: base64-encoded content (send as a JSON body with file_name and file_data fields)

Response example (/api/v8/partner/documents/):
{
"id": 62047612,
"created_date": "2026-02-19T00:00:00.000000Z",
"updated_date": "2026-02-19T00:00:05.000000Z",
"currency_code": "USD",
"date": "2026-02-18 14:22:00",
"due_date": "2026-03-18",
"document_type": "receipt",
"category": "Meals & Entertainment",
"is_duplicate": false,
"is_document": true,
"invoice_number": "INV-2026-001",
"account_number": "ACCT-12345",
"order_date": "2026-02-18",
"delivery_date": null,
"vendor": {
"name": "Starbucks",
"address": "123 Main St, San Francisco, CA 94105",
"phone_number": "+1 415-555-0100",
"email": null,
"vat_number": null,
"reg_number": null
},
"bill_to": {
"name": "Jane Doe",
"address": "456 Oak Ave, San Francisco, CA 94110"
},
"ship_to": {
"name": null,
"address": null
},
"line_items": [
{
"id": 1,
"order": 0,
"description": "Caffe Latte Grande",
"quantity": 1,
"price": 5.95,
"total": 5.95,
"tax": 0.52,
"tax_rate": 8.75,
"discount": null,
"type": "food",
"sku": null,
"upc": null,
"category": "Meals & Entertainment",
"section": null,
"date": null,
"start_date": null,
"end_date": null
}
],
"tax_lines": [
{
"order": 0,
"name": "Sales Tax",
"rate": 8.75,
"total": 0.52,
"base": 5.95
}
],
"subtotal": 5.95,
"tax": 0.52,
"tip": 0.00,
"discount": 0.00,
"total": 6.47,
"payment": {
"type": "visa",
"card_number": "1234"
},
"reference_number": null,
"notes": null,
"img_url": "https://scdn.veryfi.com/documents/...",
"pdf_url": "https://scdn.veryfi.com/documents/...",
"ocr_text": "STARBUCKS\n123 Main St...",
"meta": {
"total_pages": 1,
"processed_pages": 1,
"fraud": {
"score": 0.01,
"color": "green",
"decision": "Not Fraud",
"types": []
}
}
}

Response example (/api/v8/partner/checks/):
{
"id": 9301847,
"created_date": "2026-02-19T00:00:00.000000Z",
"updated_date": "2026-02-19T00:00:03.000000Z",
"amount": 1500.00,
"amount_text": "One thousand five hundred and 00/100",
"check_number": "4021",
"date": "2026-02-15",
"currency_code": "USD",
"check_type": "personal_check",
"payer_name": "John Smith",
"payer_address": "789 Oak St, Austin, TX 78701",
"receiver_name": "Acme Plumbing LLC",
"receiver_address": null,
"bank_name": "Wells Fargo",
"bank_address": "420 Montgomery St, San Francisco, CA 94104",
"memo": "Invoice #2026-038",
"is_signed": true,
"micr": {
"routing_number": "121000248",
"account_number": "****5678",
"serial_number": "4021",
"raw": "⑆121000248⑆ ****5678⑈ 4021",
"branch": null,
"institution": null
},
"fractional_routing_number": "12-1/1200",
"routing_from_fractional": "121000248",
"endorsement": {
"is_endorsed": true,
"is_signed": true,
"mobile_or_remote_deposit": {
"checkbox": false,
"instructions": false
}
},
"handwritten_fields": ["amount", "amount_text", "date", "receiver_name", "memo"],
"fraud": {
"score": 0.02,
"color": "green",
"types": [],
"pages": [
{
"is_lcd": { "score": 0.98, "value": false },
"ai_generated": { "score": 0.99, "value": false },
"four_corners_detected": true
}
]
},
"img_thumbnail_url": "https://scdn.veryfi.com/checks/...",
"pdf_url": "https://scdn.veryfi.com/checks/..."
}

Response example (/api/v8/partner/bank-statements/):
id: 4820193
created_date: "2026-02-19T12:45:00.000000Z"
updated_date: "2026-02-19T12:45:10.000000Z"
bank_name: Chase
bank_address: 270 Park Avenue, New York, NY 10017
account_holder_name: Jane Doe
account_holder_address: 456 Oak Ave, San Francisco, CA 94110
account_number: "****7890"
account_type: Checking
routing_number: "021000021"
currency_code: USD
statement_date: 2026-01-31
period_start_date: 2026-01-01
period_end_date: 2026-01-31
beginning_balance: 12500.00
ending_balance: 11835.47
minimum_due: null
due_date: null
accounts:
  - number: "****7890"
    beginning_balance: 12500.00
    ending_balance: 11835.47
    summaries:
      - name: Total Deposits
        total: 3200.00
      - name: Total Withdrawals
        total: 3864.53
    transactions:
      - order: 0
        date: 2026-01-05
        posted_date: 2026-01-05
        description: Direct Deposit - ACME Corp Payroll
        credit_amount: 3200.00
        debit_amount: null
        balance: 15700.00
        category: Income
        vendor: ACME Corp
      - order: 1
        date: 2026-01-12
        posted_date: 2026-01-12
        description: Rent Payment - 456 Oak Ave
        credit_amount: null
        debit_amount: 2800.00
        balance: 12900.00
        category: Housing
        vendor: null
fraud:
  score: 0.01
  color: green
  types: []
pdf_url: https://scdn.veryfi.com/bank-statements/...
img_thumbnail_url: https://scdn.veryfi.com/bank-statements/...
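An extracted statement can be sanity-checked before further processing. The sketch below tests whether the beginning balance plus deposits minus withdrawals matches the ending balance. statement_reconciles is a hypothetical helper, the half-cent tolerance is an assumption, and taking deposits and withdrawals from the summaries totals assumes they are labeled as in the sample response.

```shell
# Hypothetical helper: succeed when beginning + deposits - withdrawals == ending.
# Usage: statement_reconciles BEGINNING DEPOSITS WITHDRAWALS ENDING
# The 0.005 tolerance is an assumption to absorb floating-point rounding.
statement_reconciles() {
    awk -v b="$1" -v d="$2" -v w="$3" -v e="$4" 'BEGIN {
        diff = (b + d - w) - e
        if (diff < 0) diff = -diff
        # Exit status 0 (success) only when the difference is negligible.
        exit !(diff < 0.005)
    }'
}
```

For the sample values, statement_reconciles 12500.00 3200.00 3864.53 11835.47 succeeds, since 12500.00 + 3200.00 - 3864.53 = 11835.47.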
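Important: documents uploaded to Veryfi are sent to https://api.veryfi.com and processed on AWS servers.
Before uploading sensitive documents, note the following:
Best-practice recommendations:
Veryfi enforces rate limits per account; the exact limits depend on your plan tier.
General recommendations: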
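Rate-limited requests (HTTP 429, see the error list that follows) are commonly handled by retrying with exponential backoff. The sketch below is a generic shell pattern, not something the Veryfi documentation prescribes; retry_with_backoff, the attempt count, and the delays are all illustrative assumptions.

```shell
# Hypothetical helper: run a command until it succeeds or attempts run out,
# doubling the wait between tries.
# Usage: retry_with_backoff MAX_ATTEMPTS INITIAL_DELAY_SECONDS command [args...]
retry_with_backoff() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        # Stop as soon as the wrapped command succeeds.
        "$@" && return 0
        if [ "$attempt" -ge "$max_attempts" ]; then
            return 1
        fi
        sleep "$delay"
        delay=$((delay * 2))
        attempt=$((attempt + 1))
    done
}
```

For example, retry_with_backoff 5 1 followed by a curl --fail command would wait 1, 2, 4, and 8 seconds between up to five attempts. Note that curl --fail exits non-zero for any HTTP status of 400 or above, so this simple version would also retry 400 and 401 responses, which a retry cannot fix.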
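400 Bad Request:
- Provide one of file, file_url, or file_data (base64)
- When using file_data, send it as a JSON body (not multipart) that includes the file_name and file_data fields
- Check the message field in the response for the specific error
401 Unauthorized:
- The Client-Id, VERYFI_USERNAME, or VERYFI_API_KEY may be wrong or expired
- The Authorization header must have the form apikey USERNAME:API_KEY (no extra spaces)
413 Payload Too Large:
429 Too Many Requests:
500 / 5xx server errors:
Missing confidence scores:
- Add confidence_details=true to include the score and ocr_score fields in the response
- Add bounding_boxes=true to also get bounding_box and bounding_region coordinates
W-2 / W-8 endpoint returns 404:
- There are no /w2s/ or /w8s/ endpoints; use /any-documents/ with blueprint_name=w2 or blueprint_name=w8

Tips:
- Store VERYFI_CLIENT_ID, VERYFI_USERNAME, and VERYFI_API_KEY in environment variables rather than hardcoding them
- Pass confidence_details=true and bounding_boxes=true when you need field-level confidence scores and element coordinates
- Classify with the /classify/ endpoint first, then route the document to the matching extraction endpoint
- The ocr_text field provides the raw extracted text; pass it to an LLM to convert it to Markdown or process it further
- When sending file_data, include file_name so that Veryfi can correctly infer the file type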
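The file_data input method expects a JSON body carrying both file_name and file_data. A minimal sketch of building that body in shell, assuming a hypothetical helper named build_base64_payload and a file name that needs no JSON escaping:

```shell
# Hypothetical helper: print a JSON body for a base64 upload.
# Usage: build_base64_payload PATH_TO_FILE
# Assumes the file name contains no quotes or backslashes that need escaping.
build_base64_payload() {
    file_path=$1
    file_name=$(basename "$file_path")
    # Strip newlines because some base64 implementations wrap their output.
    file_data=$(base64 < "$file_path" | tr -d '\n')
    printf '{"file_name":"%s","file_data":"%s"}\n' "$file_name" "$file_data"
}
```

The output can be piped to curl with -H "Content-Type: application/json" -d @-, alongside the same Client-Id and Authorization headers used in the other examples.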