Patent-Pending Status

By: Brad Porter | September 16, 2019

I am excited to announce that KnowledgeLake has obtained patent-pending status for recently developed intellectual property covering intelligent content classification, information extraction, and automatic training via supervised learning. Our new approach draws on several recent innovations in machine learning and computer vision that let us solve these problems more efficiently and robustly.

The first invention listed in the patent is our unique approach to document clustering. KnowledgeLake starts by treating documents as pictures rather than as text on paper. We take technology similar to that used in facial recognition and apply it to the document classification problem. This lets us identify and categorize similar content rapidly (in milliseconds), so downstream processes can act on categorized content intelligently and efficiently without waiting for full-page OCR, which can take over 5 seconds per page.
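To make the "documents as pictures" idea concrete, here is a minimal sketch of image-based grouping. It is not KnowledgeLake's patented method: it swaps in a simple average hash (a perceptual fingerprint) for the learned models that facial-recognition-style systems typically use, and the names `average_hash`, `hamming`, `cluster`, and the `max_distance` threshold are all hypothetical choices made for illustration.

```python
from typing import List, Sequence

# Assumption: a page is a grayscale bitmap, rows of 0-255 pixel values.
Image = Sequence[Sequence[int]]


def average_hash(image: Image, grid: int = 8) -> int:
    """Fingerprint a page: one bit per grid cell, set when the cell is
    brighter than the page average. Page must be at least grid x grid."""
    height, width = len(image), len(image[0])
    means = []
    for gy in range(grid):
        y0, y1 = gy * height // grid, (gy + 1) * height // grid
        for gx in range(grid):
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            block = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            means.append(sum(block) / len(block))
    overall = sum(means) / len(means)
    bits = 0
    for mean in means:
        bits = (bits << 1) | (1 if mean > overall else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def cluster(images: List[Image], max_distance: int = 10) -> List[List[int]]:
    """Greedy grouping: a page joins the first cluster whose representative
    fingerprint is within max_distance bits, otherwise it starts a new one."""
    representatives: List[int] = []
    clusters: List[List[int]] = []
    for index, image in enumerate(images):
        fingerprint = average_hash(image)
        for c, rep in enumerate(representatives):
            if hamming(fingerprint, rep) <= max_distance:
                clusters[c].append(index)
                break
        else:
            representatives.append(fingerprint)
            clusters.append([index])
    return clusters


# Two synthetic layouts: a dark header band versus a dark left margin.
header_page = [[0] * 16 for _ in range(4)] + [[255] * 16 for _ in range(12)]
margin_page = [[0] * 4 + [255] * 12 for _ in range(16)]
noisy_header = [row[:] for row in header_page]
noisy_header[10][10] = 200  # a small scanning blemish

groups = cluster([header_page, margin_page, noisy_header])
```

Here `groups` places the two header-band pages together and the margin page on its own, with no text recognition involved. A production system would need a far more robust representation, but the shape of the idea is the same: compare layouts visually first, and read the text only where needed.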

The second invention in our patent is our unique approach to document classification and extraction. Most vendors in the capture industry classify documents and extract information from them using a common approach. Its first drawback is that an expert must author complex and cumbersome regular expressions that search the text of a document. Its second is a core reliance on OCR, which is prone to accuracy issues, especially when the document is of mediocre quality.

Our approach differs in that we do not require engineering expertise to train the system. End users can teach the system to classify and extract information from documents with ease. Rolling out new workloads takes only a few hours, versus weeks or months with other vendors. Our use of image-based identification lets us match patterns on documents quickly and accurately, without the burden of full-page OCR and with higher accuracy, especially on low-quality documents.

While we have already created an experience that lets users quickly and easily teach the system, we also wanted our platform to be able to learn on its own. The goal of supervised learning is to recognize patterns and create heuristics that lead to improved business efficiencies. KnowledgeLake’s supervised learning runs in the background, observing users during their workday.

As the system identifies patterns in the content and in users’ behavior, it automatically adds this knowledge to the training set, which removes the need for human input on future versions of the document. This ensures that every customer’s environment evolves with changing content and increases ROI for all our customers.
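As a rough illustration of learning from observed user behavior, the sketch below promotes a document fingerprint into a training set once users have labeled it consistently. It is only an assumption about how such a loop could be structured, not a description of KnowledgeLake's implementation; the class `ObservedTrainingSet` and the `min_observations` and `min_agreement` thresholds are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable, Optional


class ObservedTrainingSet:
    """Hypothetical helper: watches user labels and learns the stable ones."""

    def __init__(self, min_observations: int = 3, min_agreement: float = 0.8):
        self.min_observations = min_observations
        self.min_agreement = min_agreement
        self._observed: Dict[Hashable, Counter] = defaultdict(Counter)
        self.learned: Dict[Hashable, str] = {}

    def observe(self, fingerprint: Hashable, user_label: str) -> None:
        """Record one user action, then promote or demote the fingerprint."""
        counts = self._observed[fingerprint]
        counts[user_label] += 1
        label, hits = counts.most_common(1)[0]
        total = sum(counts.values())
        if total >= self.min_observations and hits / total >= self.min_agreement:
            self.learned[fingerprint] = label  # consistent enough to automate
        else:
            self.learned.pop(fingerprint, None)  # keep asking a human

    def predict(self, fingerprint: Hashable) -> Optional[str]:
        """Return the learned label, or None when a human is still needed."""
        return self.learned.get(fingerprint)


training = ObservedTrainingSet()
training.observe("layout-A", "invoice")
training.observe("layout-A", "invoice")
before = training.predict("layout-A")   # too few observations so far
training.observe("layout-A", "invoice")
after = training.predict("layout-A")    # three consistent labels
training.observe("layout-A", "receipt")
demoted = training.predict("layout-A")  # agreement fell to 3 of 4
```

In this sketch the system stops asking for input once users agree, and starts asking again if later behavior contradicts what it learned, which is one simple way an environment could keep up with changing content.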

Stay tuned, as we’ll have additional announcements coming soon!

Tag(s): IDP, Content Management
