NIST Structured Forms Reference Set of Binary Images (SFRS) Image Data

2019-12-18

The NIST Structured Forms Database consists of 5,590 pages of binary, black-and-white images of synthesized documents.

The documents in this database are 12 different tax forms from the IRS 1040 Package X for the year 1988. These include Forms 1040, 2106, 2441, 4562, and 6251 together with Schedules A, B, C, D, E, F, and SE.

Eight of these forms have two pages (form faces) and the other four have one, so the database represents 20 different form faces.

The document images in this database appear to be real forms prepared by individuals, but the images have been automatically derived and synthesized using a computer.

The database represents 900 simulated tax submissions, averaging 6.2 form faces per submission.
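The counts quoted in the description can be cross-checked with a few lines of arithmetic. This is only a sanity check of the stated figures; the variable names are illustrative and not part of the database.

```python
# Figures taken from the description of the database.
forms = ["1040", "2106", "2441", "4562", "6251"]
schedules = ["A", "B", "C", "D", "E", "F", "SE"]

total_forms = len(forms) + len(schedules)   # distinct tax forms
two_page_forms = 8                          # forms with a second face
form_faces = total_forms + two_page_forms   # each two-page form adds one face

submissions = 900
images = 5590
avg_faces = images / submissions            # form faces per submission

print(total_forms, form_faces, round(avg_faces, 1))  # prints: 12 20 6.2
```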

The database has the following features:

  • 900 simulated tax submissions

  • 5,590 images of completed structured form faces

  • 5,590 text files containing entry field answers

  • 20 tables of entry field types and contexts

Suitable for both document processing and automated data capture research, development, and evaluation, the data set can be used for:

  • forms identification

  • field isolation: locating the entry fields on the form

  • character segmentation: separating entry field values into characters

  • character recognition: identifying specific machine-printed characters
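As an illustration of the character segmentation task, the sketch below splits a binary field image into character candidates using a vertical projection profile, a common baseline technique. It is not the method used to build or evaluate the database. The function name `segment_columns`, the list-of-lists image layout, and the convention that 1 means ink are all assumptions made for this example.

```python
from typing import List, Tuple

def segment_columns(image: List[List[int]]) -> List[Tuple[int, int]]:
    """Return (start, end) column spans that contain ink; end is exclusive.

    `image` is a row-major binary image. Assumption: 1 = black (ink), 0 = white.
    """
    if not image:
        return []
    width = len(image[0])
    # Vertical projection: number of ink pixels in each column.
    profile = [sum(row[x] for row in image) for x in range(width)]

    spans = []
    start = None
    for x, ink in enumerate(profile):
        if ink > 0 and start is None:
            start = x                    # a run of inked columns begins
        elif ink == 0 and start is not None:
            spans.append((start, x))     # an empty column ends the run
            start = None
    if start is not None:
        spans.append((start, width))     # run touches the right edge
    return spans

# Tiny made-up field image with three separated glyphs.
field = [
    [0, 1, 1, 0, 0, 1, 0, 1, 1],
    [0, 1, 0, 0, 0, 1, 0, 0, 1],
    [0, 1, 1, 0, 0, 1, 0, 1, 1],
]
spans = segment_columns(field)
print(spans)  # prints: [(1, 3), (5, 6), (7, 9)]
```

A projection profile only separates characters that have at least one empty column between them; touching or overlapping characters would need a more elaborate approach.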

This database is a valuable tool for measurement of system performance and system comparison on complex forms.
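One simple way to compare systems is to score each system's recognized field values against the ground-truth answers. The sketch below computes exact-match field accuracy. The dictionary layout, the field names, and the function `field_accuracy` are hypothetical; the format of the database's answer text files is not described here, so parsing them is left out.

```python
from typing import Dict

def field_accuracy(truth: Dict[str, str], predicted: Dict[str, str]) -> float:
    """Fraction of ground-truth fields whose predicted value matches exactly."""
    if not truth:
        return 0.0
    correct = sum(
        1 for name, value in truth.items() if predicted.get(name) == value
    )
    return correct / len(truth)

# Made-up field names and values for a single form face.
truth = {"name": "JOHN DOE", "wages": "31250", "filing_status": "1"}
system_a = {"name": "JOHN DOE", "wages": "31250", "filing_status": "7"}
system_b = {"name": "JOHN D0E", "wages": "31258", "filing_status": "1"}

score_a = field_accuracy(truth, system_a)  # two of three fields match
score_b = field_accuracy(truth, system_b)  # one of three fields matches
print(f"A: {score_a:.2f}  B: {score_b:.2f}")
```

Exact match is a strict criterion: a single misread character fails the whole field. A character-level measure such as edit distance would give partial credit instead.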

