Text Classification in iOS using tensorflowlite [A How-To Guide]

by Khurram Shehzad, March 8th, 2020

Text classification is the task of categorising text according to its content. It is a fundamental problem in the field of Natural Language Processing (NLP). Common applications of text classification include email spam detection, sentiment analysis and topic labelling.

For this post we restrict text classification to two classes, also known as binary classification, e.g. positive or negative text. To integrate text classification in iOS we will use a model pre-trained on the IMDB dataset, with tensorflowlite as our underlying machine learning inference engine.

In its simplest form, our app takes a sentence as input from the user and feeds it into a classification client, which classifies the text as either positive or negative depending on its content.

Let's name our classification client TextClassificationClient. Here is an overview of the client class

final class TextClassificationClient {
  // other stuff
  init?(modelFileInfo: FileInfo, labelsFileInfo: FileInfo, vocabFileInfo: FileInfo) {
    // we will initialise the tensorflowlite here
  }
}

The initialiser takes three parameters of type FileInfo, a tuple holding a file name and an extension. The three values are created as below

let modelFileInfo = FileInfo(name: "text_classification", extension: "tflite")
let labelsFileInfo = FileInfo(name: "labels", extension: "txt")
let vocabFileInfo = FileInfo(name: "vocab", extension: "txt")
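The post does not show the FileInfo definition itself. A minimal sketch, assuming it is a labelled tuple as described above, could look like the following. The resourcePath(for:in:) helper is hypothetical and only illustrates how such a tuple might be resolved to a file in the app bundle.

```swift
import Foundation

// Assumption: FileInfo is a labelled tuple of file name and extension.
typealias FileInfo = (name: String, extension: String)

/// Hypothetical helper: returns the path of a bundled resource,
/// or nil when the file is not part of the bundle.
func resourcePath(for fileInfo: FileInfo, in bundle: Bundle = .main) -> String? {
  return bundle.path(forResource: fileInfo.name, ofType: fileInfo.extension)
}

// A file that is not bundled resolves to nil instead of crashing.
let missingFileInfo: FileInfo = (name: "no_such_file", extension: "tflite")
let missingPath = resourcePath(for: missingFileInfo)
```

Returning an optional path fits well with the failable init? shown earlier: if any of the three files is missing, the initialiser can simply return nil.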

The model file is the trained model that tensorflowlite uses to perform inference. The labels file is a plain text file listing the classes, e.g. Positive and Negative. The vocab file is also a plain text file; it contains the words used during model training, each with its corresponding numeric ID, and the same mapping is used again during inference.
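To make the two text files more concrete, here is a small sketch of how their contents might be parsed. It assumes the vocab file has one "word number" pair per line separated by a space, and the labels file has one class name per line; both layouts and the function names parseVocabulary and parseLabels are assumptions for illustration, so check them against the actual files.

```swift
import Foundation

/// Parses vocabulary text where each line is "<word> <number>" (assumed layout).
/// Lines that do not match this shape are skipped.
func parseVocabulary(_ contents: String) -> [String: Int32] {
  var vocabulary = [String: Int32]()
  for line in contents.split(whereSeparator: { $0.isNewline }) {
    let parts = line.split(separator: " ")
    guard parts.count == 2, let id = Int32(parts[1]) else { continue }
    vocabulary[String(parts[0])] = id
  }
  return vocabulary
}

/// Parses labels text with one class name per line.
func parseLabels(_ contents: String) -> [String] {
  return contents.split(whereSeparator: { $0.isNewline }).map { String($0) }
}
```

In the app, the input strings would come from reading the bundled files, for example with String(contentsOfFile:encoding:).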

Model training is outside the scope of this post; as mentioned at the start, we will use a pre-trained IMDB model from here. For a detailed overview of model training, please see here.

To use tensorflowlite in an iOS app we first have to integrate it. For this purpose we will use CocoaPods, the most commonly used tool for managing third-party dependencies in iOS apps.

Add pod 'TensorFlowLiteSwift' to your project's Podfile and run pod install to install and integrate tensorflowlite into the iOS app.

Let's get back to our TextClassificationClient class. In this class we have a method, classify, that actually classifies the text. Below is the implementation of this method

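func classify(text: String) -> [Result] {
  // Convert the sentence into the tokens the model expects.
  let input = tokenizeInputText(text: text)
  let data = Data(copyingBufferOf: input[0])
  do {
    // Feed the tokens to the interpreter and run inference.
    try interpreter.copy(data, toInputAt: 0)
    try interpreter.invoke()
    let outputTensor = try interpreter.output(at: 0)
    if outputTensor.dataType == .float32 {
      let outputArray = [Float](unsafeData: outputTensor.data) ?? []
      var output = [Result]()
      // Pair each label with its score; skip labels that have no score
      // so an empty or short output cannot cause an out-of-range access.
      for (index, label) in labels.enumerated() where index < outputArray.count {
        output.append(Result(id: "", title: label, confidence: outputArray[index]))
      }
      // Highest confidence first. Result is not Comparable, so compare explicitly.
      output.sort { $0.confidence > $1.confidence }
      return output
    }
  } catch {
    print(error)
  }
  return []
}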

It takes a string as a parameter and returns an array of Result values. It first converts the string into tokens so that it can be fed to the inference engine (the interpreter instance variable in this case). We then run the interpreter on the tokenised text, read its output and return it, sorted by confidence. The Result type is defined as below

struct Result {
  let id: String
  let title: String
  let confidence: Float
}
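The classify method relies on three pieces that the post does not show: the Data(copyingBufferOf:) and [Float](unsafeData:) initialisers, which are not part of Foundation, and the tokenizeInputText method. The sketch below shows one plausible shape for each. The two extensions follow a pattern commonly seen in TensorFlow Lite sample code, but whether the app defines them exactly this way is an assumption. The tokenize function is hypothetical and simplified: its name, its parameters and the default padID and unknownID values are illustrative and must match whatever the model was trained with.

```swift
import Foundation

// Assumed helper: copies an array's raw bytes into Data for the input tensor.
extension Data {
  init<T>(copyingBufferOf array: [T]) {
    self = array.withUnsafeBufferPointer { Data(buffer: $0) }
  }
}

// Assumed helper: reinterprets a tensor's raw bytes as an array of Element.
// Only meaningful for plain value types such as Float or Int32.
extension Array {
  init?(unsafeData: Data) {
    // The byte count must be a whole number of elements.
    guard unsafeData.count % MemoryLayout<Element>.stride == 0 else { return nil }
    self = unsafeData.withUnsafeBytes { Array($0.bindMemory(to: Element.self)) }
  }
}

// Hypothetical tokenizer: lowercases the text, splits it on anything that is
// not a letter or digit, looks each word up in the vocabulary, truncates to
// `length` and pads the remainder with `padID`.
func tokenize(_ text: String, vocabulary: [String: Int32], length: Int,
              padID: Int32 = 0, unknownID: Int32 = 2) -> [Int32] {
  let words = text.lowercased()
    .split(whereSeparator: { !($0.isLetter || $0.isNumber) })
  var ids = words.prefix(length).map { vocabulary[String($0)] ?? unknownID }
  ids += [Int32](repeating: padID, count: length - ids.count)
  return ids
}
```

Padding to a fixed length matters because the model's input tensor has a fixed size, so every sentence has to be encoded into the same number of values regardless of how many words it contains.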

[Screenshot of an inference]

The complete working code of the app can be found in the GitHub repository here.