IDS Case Study: Using OCR to Tally Survey Responses

By Keiter Technologies

How can machine learning help simplify business tasks?

Data science can help you achieve your business goals with advice tailored to your organization. In the case study below, learn how Keiter Data Solutions solved this kind of problem for one of our clients.

Challenge and Opportunity

A client conducted a survey over a few weeks and received several thousand responses, primarily by mail. The responses were all scanned, which made them digitally available but left no direct way to extract the text from the PDFs. Without some sort of machine learning solution, the client would have had to manually count every response and link it to the correct user in their database.

Normally, a problem like this might merit training a custom model to extract all relevant text from each PDF. However, because the survey ran for only a few weeks, we would not have had enough data to train such a model until after it was no longer useful.

Our Approach

We started by using a pretrained Optical Character Recognition (OCR) model to read as much of the text from each PDF as it could. Because the user’s ID was typed, the model was able to read it with almost perfect accuracy. The extracted IDs were then compared to an Excel export of user data to check for discrepancies.
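The ID check described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used on the project: it assumes the OCR step has already produced a list of ID strings and that the IDs from the Excel export have been loaded into a second list. The function name `compare_ids` and the sample IDs are hypothetical.

```python
from collections import Counter


def compare_ids(ocr_ids, known_ids):
    """Compare OCR-extracted IDs against the IDs from the user export.

    Both arguments are iterables of strings. IDs are normalized by
    stripping whitespace and upper-casing before comparison.
    """
    known = {i.strip().upper() for i in known_ids}
    seen = [i.strip().upper() for i in ocr_ids]
    seen_set = set(seen)
    return {
        # IDs read from a scan that also exist in the export
        "matched": sorted(seen_set & known),
        # IDs read from a scan with no match: possible OCR misreads
        "unknown": sorted(seen_set - known),
        # Users in the export with no scanned response
        "missing": sorted(known - seen_set),
        # IDs that appear on more than one scan
        "duplicates": sorted(i for i, n in Counter(seen).items() if n > 1),
    }


# Hypothetical sample data: "C3O0" has a letter O where a zero belongs.
report = compare_ids([" a100", "B200", "B200", "C3O0"], ["A100", "B200", "C300"])
print(report)
```

Anything that lands in the "unknown" or "duplicates" lists is a candidate for a quick manual look at the original scan.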

The survey responses, however, consisted of check-marked True or False boxes. Since the OCR model was not trained to read hand-checked boxes, it did its best to interpret them as alphanumeric characters. After a quick analysis of these interpretations, we confirmed there was enough of a pattern to identify whether a box was checked with high but imperfect accuracy. We used this pattern to make the initial response predictions and export them to the spreadsheet alongside the user IDs.
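To illustrate the idea of reading checkboxes from the characters an OCR model produces for them, here is a rough sketch. The character sets below are invented for the example; the case study does not say which characters the model actually produced, so in practice they would come from inspecting real OCR output. The function names are also hypothetical.

```python
# Hypothetical character sets, chosen only for illustration.
CHECKED_CHARS = set("XxVvYyKk")   # shapes a check mark might be misread as
EMPTY_CHARS = set("OoDd0[]")      # shapes an empty box might be misread as


def box_is_checked(token):
    """Guess whether a box was checked from the text OCR produced for it.

    Returns True or False, or None when the token gives no clear signal.
    """
    checked = sum(ch in CHECKED_CHARS for ch in token)
    empty = sum(ch in EMPTY_CHARS for ch in token)
    if checked == empty:
        return None
    return checked > empty


def predict_response(true_token, false_token):
    """Combine the guesses for the True box and the False box.

    Returns "True", "False", or "Unclear" when the boxes disagree
    or neither looks checked.
    """
    t = box_is_checked(true_token)
    f = box_is_checked(false_token)
    if t is True and f is not True:
        return "True"
    if f is True and t is not True:
        return "False"
    return "Unclear"


print(predict_response("X", "O"))   # True box looks checked
print(predict_response("[]", "V"))  # False box looks checked
print(predict_response("X", "X"))   # both look checked, so flag for review
```

Keeping an explicit "Unclear" outcome means ambiguous scans are flagged for review rather than forced into a guess.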

The remaining issue was that, because the survey was being used to vote on an internal decision, the final count needed to be perfectly accurate. Instead of spending extra time trying to increase the accuracy of our method, which would always have remained imperfect, we sorted the PDFs into directories based on our predictions. This made it quick and easy for us to manually confirm the predictions and make the small number of corrections needed.
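Sorting scans into folders by predicted answer might look something like the sketch below. It assumes the predictions are held in a dictionary mapping each PDF filename to a label; the function name `sort_pdfs` and the directory layout are hypothetical. Files are copied rather than moved so that the original scans stay untouched.

```python
import shutil
from pathlib import Path


def sort_pdfs(predictions, source_dir, output_dir):
    """Copy each PDF into a subdirectory named after its predicted label.

    predictions maps a filename (e.g. "scan_0001.pdf") to a label
    (e.g. "True", "False", "Unclear"). Returns a count per label.
    """
    source_dir, output_dir = Path(source_dir), Path(output_dir)
    counts = {}
    for filename, label in predictions.items():
        target_dir = output_dir / label
        target_dir.mkdir(parents=True, exist_ok=True)
        # copy2 preserves file metadata such as modification times
        shutil.copy2(source_dir / filename, target_dir / filename)
        counts[label] = counts.get(label, 0) + 1
    return counts
```

A reviewer can then open one folder at a time and look only for scans that do not belong, moving any misfiled PDF to the correct folder. The folder contents then serve as the final tally, which also supports later recounts.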

Results

Overall, we were able to save many hours of manual data entry, a task that often leads to small errors from fatigue. Our sorting of the PDFs also makes it much easier to do any recounts that may be requested or needed. We see this as a great example of using the right tool for the job. Often, problems do not require a pure machine learning solution when many hours can be saved by allowing a little bit of well-placed human intervention.

The Keiter Technologies team has access to in-house resources beyond data analytics that provide insights and expertise in areas such as corporate tax, compliance, and business advisory services. Our proficiency across multiple disciplines enables us to offer clients comprehensive data solution services. Our processes save valuable time and resources and help maximize benefits under programs like the ERC. In working with the Keiter Technologies team, your business will work with a team of professionals who will partner with you to help you achieve your business goals and add value.

About the Author


Keiter Technologies

Keiter Technologies focuses on serving businesses with their strategic technology needs through data science, cybersecurity, and IT audit and consulting.

The information contained within this article is provided for informational purposes only and is current as of the date published. Online readers are advised not to act upon this information without seeking the service of a professional accountant, as this article is not a substitute for obtaining accounting, tax, or financial advice from a professional accountant.
