Making Money using Digitize India - Frequently Asked Questions
Frequently Asked Questions on Digitize India
- What type of documents can I digitize using DIP?
- What type of data can I extract from the documents?
- How does DIP ensure data quality and accuracy?
- How does DIP ensure data privacy and security?
- How do I get started on DIP?
- What is the desired outcome of the DIP?
What type of documents can I digitize using DIP?
You can digitize any document image that is human readable and has a
defined structure, such as a printed form or a register with defined rows and
columns. However, we suggest digitizing only documents that are generated in
high volume, share a similar structure, and need frequent access.
What type of data can I extract from the documents?
DIP can process and extract multilingual text, numeric data, and
alphanumeric data from document images.
How does DIP ensure data quality and accuracy?
DIP uses multiple levels of quality checks to verify and validate the
data. An image validation technique ensures that only similar types of
documents are processed in a batch. Pre-defined field-level validations ensure
that the crowd workforce enters the correct data type, and multi-level data
value comparisons through a maker-checker process check data accuracy and
quality. Human validation is used for data fields that fail the automated
quality checks. In the future, DIP will use pre-defined data dictionaries and
machine learning algorithms for higher levels of data accuracy and quality.
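The combination of field-level validation and maker-checker comparison described above can be illustrated with a minimal Python sketch. It is not DIP's actual implementation: the validator rules, the function names (`is_valid`, `reconcile`) and the `min_agreement` threshold are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical field-level validators; DIP's real rules are not published here.
VALIDATORS = {
    "numeric": str.isdigit,
    "alphanumeric": str.isalnum,
    "text": lambda value: bool(value.strip()),
}


def is_valid(value, field_type):
    """Return True if the entry matches the expected data type for the field."""
    return bool(VALIDATORS[field_type](value))


def reconcile(entries, field_type, min_agreement=2):
    """Compare independent entries for one field, maker-checker style.

    Returns (value, needs_human_review). A value is accepted only when at
    least `min_agreement` valid entries match exactly; otherwise the field
    is flagged for human validation.
    """
    cleaned = [entry.strip() for entry in entries]
    valid = [entry for entry in cleaned if is_valid(entry, field_type)]
    if not valid:
        return None, True
    value, count = Counter(valid).most_common(1)[0]
    if count >= min_agreement:
        return value, False
    return None, True
```

For example, `reconcile(["560001", "560001", "56000l"], "numeric")` discards the third entry (it contains a letter) and accepts `"560001"` because two valid entries agree, while `reconcile(["AB12", "AB13"], "alphanumeric")` finds no agreement and flags the field for human review.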
How does DIP ensure data privacy and security?
DIP is hosted on NIC's secure cloud infrastructure, "MeghRaj", which
restricts access to authorized personnel only. Data transmission from the
cloud to the crowd is secured through industry-standard encryption algorithms
and protocols such as SSL and HTTPS.
The data from the documents is distributed to the crowd in fragments
through a randomization algorithm. The algorithm ensures that no individual
receives more than a fixed number of randomly assigned fields, which makes it
difficult to identify the type of data or the document.
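One way to picture capped random distribution of fields is the Python sketch below. It is an illustration only, not DIP's algorithm: the function name `distribute_fields`, its parameters and the slot-sampling approach are assumptions, and a production system would likely use a cryptographically secure random source rather than a seedable one.

```python
import random


def distribute_fields(field_ids, worker_ids, max_per_worker, seed=None):
    """Randomly assign each field to one worker, at most max_per_worker each.

    Hypothetical helper: returns a dict mapping worker id -> list of field ids.
    """
    if len(field_ids) > len(worker_ids) * max_per_worker:
        raise ValueError("not enough workers for the given cap")
    rng = random.Random(seed)  # seedable so the sketch is reproducible
    # One slot per unit of worker capacity; sampling slots without
    # replacement guarantees that no worker exceeds the cap.
    slots = [w for w in worker_ids for _ in range(max_per_worker)]
    chosen = rng.sample(slots, len(field_ids))
    assignment = {w: [] for w in worker_ids}
    for field_id, worker in zip(field_ids, chosen):
        assignment[worker].append(field_id)
    return assignment
```

With ten fields, five workers and a cap of three, every field is assigned exactly once and no worker sees more than three fields, so no single worker can reconstruct the whole form.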
The data extracts generated for an organization can be accessed only by
that organization's authorized personnel, using system-assigned IDs and
passwords.
Crowd agents are identified and authenticated through their Aadhaar
number using the UIDAI database, and every crowd agent is assigned a unique
user ID and password.
The system maintains an audit log of all transactions, including login
details, locations and machine IDs, and will soon have a fraud engine to
monitor suspicious transactions.
How do I get started on DIP?
- Identify the documents you need to digitize.
- Verify that their formats are similar.
- Estimate the volume of documents you need to digitize.
- Verify that the image quality is good enough for the documents to be human
  readable.
- Identify the data fields you need to extract from each document.
- Register as a department on the Digitize India portal or mail us the
  information @
What is the desired outcome of the DIP?
We intend to use DIP to lead all organizations towards a paperless
office, make data available to citizens on demand, free up storage space used
for archived documents, and enhance digital public service delivery.