Classification

Classification is one of the most widely used machine learning tasks across business problems. The goal of classification is to assign text documents or images to one or more predefined categories. Some examples of classification are:

  1. Classifying service email requests into different service categories
  2. Detecting spam or ham emails or messages
  3. Understanding the sentiment of product buyers in online shopping
  4. Classifying documents into different categories

There are three different types of classification:

Text Classification: Also known as text tagging or text categorization, this is the process of categorizing text into organized groups. Text classifiers with NLP have proven to be an effective way to structure textual data quickly, cost-effectively, and at scale. Use text classification when the number of formats for the given document categories is not fixed and the documents vary widely.
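
As an illustration of how a text classifier learns categories from labeled examples, the following is a minimal sketch of a multinomial Naive Bayes classifier in plain Python. Naive Bayes is used here only as a simple stand-in; this document does not state which algorithm the product uses, and the class name, the tokenize helper, and the sample emails are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesTextClassifier:
    """Illustrative multinomial Naive Bayes with add-one smoothing."""

    def fit(self, texts, labels):
        # Count documents per class and word occurrences per class.
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocabulary = {
            word for counts in self.word_counts.values() for word in counts
        }
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        tokens = [t for t in tokenize(text) if t in self.vocabulary]
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior plus the smoothed log likelihood of each known word.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data for the spam/ham example.
train_texts = [
    "win a free prize now",
    "claim your free reward",
    "meeting moved to monday",
    "please review the attached invoice",
]
train_labels = ["spam", "spam", "ham", "ham"]

classifier = NaiveBayesTextClassifier().fit(train_texts, train_labels)
print(classifier.predict("free prize inside"))
print(classifier.predict("review the invoice before the meeting"))
```

Because the classifier learns from word statistics rather than from page layout, it does not depend on the documents sharing a fixed format.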

Image Classification: This is a supervised method that defines a set of target classes used to label images. In contrast to text classification, which learns from the textual context of the document, image classification depends on the visual structure of the document. Use image classification when the number of formats for the given document categories is fixed and the documents differ from each other in visual structure. For example, you can use image classification to classify KYC documents such as PAN, Aadhaar, and Passport.

Search-Based Classification: This lets you define your own keywords for customized classes. You can add multiple classes and keywords to define a customized search-based model. Use search-based classification when the number of formats is fixed and each document category can be identified by one or more keywords. The advantage of this model is that it is pretrained, so no training is required; you only need to update the keyword configuration for the given set of document classes. Building a search-based classifier is therefore comparatively quick and easy.
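
The keyword-driven approach described above can be sketched roughly as follows. This is only an illustration of the idea, not the product's actual configuration format or matching logic: the KEYWORD_CONFIG dictionary, its keywords, and the classify_by_keywords function are hypothetical.

```python
import re

# Hypothetical keyword configuration: class name -> identifying keywords.
KEYWORD_CONFIG = {
    "PAN": ["permanent account number", "income tax department"],
    "Aadhaar": ["aadhaar", "unique identification authority"],
    "Passport": ["passport", "place of issue"],
}


def classify_by_keywords(text, config=KEYWORD_CONFIG, default="Unknown"):
    """Return the class with the most matching keywords, or the default."""
    # Normalize case and whitespace so multi-word keywords match across lines.
    normalized = re.sub(r"\s+", " ", text.lower())
    scores = {
        label: sum(1 for keyword in keywords if keyword in normalized)
        for label, keywords in config.items()
    }
    best_label = max(scores, key=scores.get)
    return best_label if scores[best_label] > 0 else default


print(classify_by_keywords("INCOME TAX DEPARTMENT\nPermanent Account Number Card"))
print(classify_by_keywords("Quarterly sales report"))
```

Adding a new document class in this sketch only requires a new entry in the keyword configuration, which mirrors why no training step is needed.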

The models defined with any of these classification types can then be used to predict the category of new text content or images.