For this post we restrict text classification to two classes, also known as binary classification, e.g. labelling text as positive or negative. To integrate text classification into iOS we will use a model pre-trained on the IMDB dataset. We will use TensorFlow Lite as our underlying machine learning inference engine.
In its simplest form, our app takes a sentence as input from the user and feeds it into a classification client, which classifies the text as either positive or negative depending on its content.
Let's name our classification client TextClassificationClient. Here is an overview of the client class:
final class TextClassificationClient {
    // other stuff
    init?(modelFileInfo: FileInfo, labelsFileInfo: FileInfo, vocabFileInfo: FileInfo) {
        // we will initialise TensorFlow Lite here
    }
}
The initialiser takes three parameters of type FileInfo, a tuple that holds a file's name and extension. The three values are created as shown below:
let modelFileInfo = FileInfo(name: "text_classification", extension: "tflite")
let labelsFileInfo = FileInfo(name: "labels", extension: "txt")
let vocabFileInfo = FileInfo(name: "vocab", extension: "txt")
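The FileInfo type itself is not shown above. Here is a minimal sketch, assuming it is a labelled tuple of name and extension. The resourcePath helper and the modelInfo constant are hypothetical names used only for illustration and may differ from the actual project.

```swift
import Foundation

// Assumed definition: a labelled tuple holding a resource's base name and file extension.
typealias FileInfo = (name: String, extension: String)

// Hypothetical helper: resolves a FileInfo to a path inside a bundle,
// or returns nil when the bundle does not contain such a file.
func resourcePath(for fileInfo: FileInfo, in bundle: Bundle = .main) -> String? {
    return bundle.path(forResource: fileInfo.name, ofType: fileInfo.extension)
}

// A tuple literal is enough to create a value of the assumed type.
let modelInfo: FileInfo = (name: "text_classification", extension: "tflite")
let modelFileName = "\(modelInfo.name).\(modelInfo.extension)"
```

A failable initialiser such as init? can return nil when any of the three paths cannot be resolved, which is one plausible reason for the initialiser being failable.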
The model file is the trained model that TensorFlow Lite loads to perform inference. The labels file is a plain text file listing the classes, e.g. Positive and Negative. The vocab file is also a plain text file; it contains a collection of words and their corresponding embedding. These words were used during model training and are used again during inference.
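To make the two text files more concrete, here is a rough sketch of how they could be parsed. It assumes the labels file has one class name per line and that each vocab line holds a word followed by a single integer value separated by a space. Both the file layouts and the function names parseLabels and parseVocab are assumptions, so check them against the actual files shipped with the model.

```swift
import Foundation

// Splits the labels file contents into one label per non-empty line.
func parseLabels(_ contents: String) -> [String] {
    return contents
        .components(separatedBy: .newlines)
        .map { $0.trimmingCharacters(in: .whitespaces) }
        .filter { !$0.isEmpty }
}

// Parses vocab lines of the assumed form "<word> <integer>" into a lookup table.
// Lines that do not match this form are skipped.
func parseVocab(_ contents: String) -> [String: Int32] {
    var vocab = [String: Int32]()
    for line in contents.components(separatedBy: .newlines) {
        let parts = line.split(separator: " ")
        guard parts.count == 2, let value = Int32(parts[1]) else { continue }
        vocab[String(parts[0])] = value
    }
    return vocab
}

// Sample contents, made up for illustration only.
let sampleLabels = parseLabels("Negative\nPositive\n")
let sampleVocab = parseVocab("<PAD> 0\n<START> 1\n<UNKNOWN> 2\nthe 4\n")
```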
Model training is outside the scope of this post; as mentioned at the start, we will use a pre-trained IMDB model from here. For a detailed overview of model training, please see here.
To use TensorFlow Lite in an iOS app we have to integrate it, and for this we will use CocoaPods. CocoaPods is one of the most commonly used dependency management tools for third-party dependencies in iOS apps.
Add pod 'TensorFlowLiteSwift' to your project's Podfile and run pod install to install and integrate TensorFlow Lite into the iOS app.
Let's get back to our TextClassificationClient class. In this class we have a method, classify, that actually classifies the text. Below is the implementation of this method:
func classify(text: String) -> [Result] {
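    // Convert the text into the token sequence the model expects
    let input = tokenizeInputText(text: text)
    let data = Data(copyingBufferOf: input[0])
    do {
        // Feed the tokens to the interpreter and run inference
        try interpreter.copy(data, toInputAt: 0)
        try interpreter.invoke()
        let outputTensor = try interpreter.output(at: 0)
        if outputTensor.dataType == .float32 {
            let outputArray = [Float](unsafeData: outputTensor.data) ?? []
            // Pair each label with its score; zip stops at the shorter sequence,
            // so a short or empty output cannot cause an out-of-range access
            var output = [Result]()
            for (label, confidence) in zip(labels, outputArray) {
                output.append(Result(id: "", title: label, confidence: confidence))
            }
            // Highest confidence first
            output.sort { $0.confidence > $1.confidence }
            return output
        }
    } catch {
        print(error)
    }
    return []
}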
It takes a string as a parameter and returns an array of Result values. The first line converts the string into tokens so it can be fed to the inference engine (the interpreter instance variable in this case). We run the interpreter on the tokenised text, read its output and return it. The Result type is defined as below:
struct Result {
    let id: String
    let title: String
    let confidence: Float
}
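The classify method relies on a few helpers that are not shown in this post: tokenizeInputText, Data(copyingBufferOf:) and [Float](unsafeData:). Below is a rough sketch of what such helpers might look like. Everything here is an assumption rather than the project's actual code: the tokenize function name, the special token ids (0, 1, 2), the Int32 element type and the fixed sequence length would all need to match the model you use.

```swift
import Foundation

// Hypothetical tokenizer: lowercases the text, splits on non-alphanumeric characters,
// maps each word to its vocab value, then pads or truncates to a fixed length.
// The start, unknown and pad ids are assumed values.
func tokenize(_ text: String, vocab: [String: Int32], length: Int,
              startId: Int32 = 1, unknownId: Int32 = 2, padId: Int32 = 0) -> [Int32] {
    let words = text.lowercased()
        .components(separatedBy: CharacterSet.alphanumerics.inverted)
        .filter { !$0.isEmpty }
    var ids: [Int32] = [startId] + words.map { vocab[$0] ?? unknownId }
    if ids.count > length {
        ids = Array(ids.prefix(length))
    } else {
        ids += Array(repeating: padId, count: length - ids.count)
    }
    return ids
}

extension Data {
    // Copies the raw bytes of an array of fixed-size values into a Data buffer.
    init<T>(copyingBufferOf array: [T]) {
        self = array.withUnsafeBufferPointer { Data(buffer: $0) }
    }
}

extension Array {
    // Copies raw bytes into an array of Element.
    // Returns nil when the byte count is not a multiple of the element size.
    init?(unsafeData: Data) {
        let stride = MemoryLayout<Element>.stride
        guard unsafeData.count % stride == 0 else { return nil }
        let count = unsafeData.count / stride
        self = Array(unsafeUninitializedCapacity: count) { buffer, initializedCount in
            _ = unsafeData.copyBytes(to: buffer)
            initializedCount = count
        }
    }
}
```

The byte-copying initialisers are only safe for plain value types such as Float or Int32, and the element type passed to Data(copyingBufferOf:) has to match the data type of the model's input tensor.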
Below is the screenshot of an inference
The complete working code of the app can be found in the GitHub repository here.