In this article, we will walk through an image recognition project in Go. We will also create a Telegram bot through which we can send images for recognition.

The first thing we need is an already trained model; we will not train our own model in this article. For this exercise, let's take a ready-made model (the Inception archive inception5h.zip) and use the Docker image ctava/tfcgo as the base.

To launch our project, we will need four terminals at the same time. In the first, we will launch the image recognition server. In the second, we will launch the bot. In the third, we will launch a public tunnel that exposes our bot to the outside. In the fourth, we will execute the command that registers our bot.

To start the recognition server, create a Dockerfile:

FROM ctava/tfcgo

RUN mkdir -p /model && \
  curl -o /model/inception5h.zip -s "http://download.tensorflow.org/models/inception5h.zip" && \
  unzip /model/inception5h.zip -d /model

WORKDIR /go/src/imgrecognize
COPY src/ .
RUN go build
ENTRYPOINT [ "/go/src/imgrecognize/imgrecognize" ]
EXPOSE 8080

This way we will run our server inside the image. The image contains our server, src/imgrecognize, and the model unpacked into the /model directory.

For the server, the first thing we need is to set an environment variable:

	os.Setenv("TF_CPP_MIN_LOG_LEVEL", "2")

This is necessary so as not to get the following output:

	I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
	unable to make a tensor from image: Expected image (JPEG, PNG, or GIF), got empty file

Here, we will not optimize our server; we simply run it through ListenAndServe on port 8080. Before starting the server, we load our model (loadModel) and get our graph (modelGraph) and labels (labels). The graph is stored in protobuf format in the file "/model/tensorflow_inception_graph.pb".
func loadModel() (*tensorflow.Graph, []string, error) {
	// Load inception model
	model, err := ioutil.ReadFile(graphFile)
	if err != nil {
		return nil, nil, err
	}
	graph := tensorflow.NewGraph()
	if err := graph.Import(model, ""); err != nil {
		return nil, nil, err
	}

	// Load labels
	labelsFile, err := os.Open(labelsFile)
	if err != nil {
		return nil, nil, err
	}
	defer labelsFile.Close()
	scanner := bufio.NewScanner(labelsFile)
	var labels []string
	for scanner.Scan() {
		labels = append(labels, scanner.Text())
	}

	return graph, labels, scanner.Err()
}

modelGraph holds the structure of our model and the key tools for working with it, and labels contains the "dictionary" used to interpret the model's output.

Inside our HTTP handler, we have to normalize the received image (normalizeImage) before passing it to the recognition input. To normalize, we first convert our image from a Go value to a Tensor:

	tensor, err := tensorflow.NewTensor(buf.String())

After that, we get three values:

	graph, input, output, err := getNormalizedGraph()

We need graph to decode, resize, and normalize the image. input, together with the tensor, is the entry point for communication between our application and TensorFlow. output is the point from which we read the result. Through graph, we also open a session that runs the normalization:

	session, err := tensorflow.NewSession(graph, nil)
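Normalization is the first step in the handler that can fail on bad input, for example on the empty upload named in the error message quoted earlier. The full server listing reacts to such failures with log.Fatalf, which terminates the whole server. Below is a minimal sketch of a handler that answers with an HTTP error and keeps serving. The `recognize` type, `newHandler`, and `fakeRecognize` are hypothetical stand-ins for the TensorFlow calls so that the example runs without the model, and `io.ReadAll` assumes Go 1.16 or newer.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// recognize is a hypothetical stand-in for normalizeImage plus inference.
type recognize func(img io.Reader) (string, error)

// newHandler wraps a recognize function in an HTTP handler that reports
// failures to the caller instead of stopping the process.
func newHandler(rec recognize) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		msg, err := rec(r.Body)
		if err != nil {
			http.Error(w, "unable to recognize image: "+err.Error(), http.StatusBadRequest)
			return
		}
		fmt.Fprint(w, msg)
	}
}

// fakeRecognize rejects empty bodies and returns a fixed answer otherwise.
func fakeRecognize(img io.Reader) (string, error) {
	data, err := io.ReadAll(img)
	if err != nil {
		return "", err
	}
	if len(data) == 0 {
		return "", errors.New("empty file")
	}
	return "This is: cat (65.00%)", nil
}

// post sends body to the handler in-process and returns status and text.
func post(h http.Handler, body string) (int, string) {
	req := httptest.NewRequest(http.MethodPost, "/", strings.NewReader(body))
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code, strings.TrimSpace(rec.Body.String())
}

func main() {
	h := newHandler(fakeRecognize)
	fmt.Println(post(h, "fake image bytes"))
	fmt.Println(post(h, ""))
}
```

With this shape, a single broken upload produces a 400 response for that client only, and the next request is still served.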
Normalization code:

func normalizeImage(imgBody io.ReadCloser) (*tensorflow.Tensor, error) {
	var buf bytes.Buffer
	_, err := io.Copy(&buf, imgBody)
	if err != nil {
		return nil, err
	}

	tensor, err := tensorflow.NewTensor(buf.String())
	if err != nil {
		return nil, err
	}

	graph, input, output, err := getNormalizedGraph()
	if err != nil {
		return nil, err
	}

	session, err := tensorflow.NewSession(graph, nil)
	if err != nil {
		return nil, err
	}

	normalized, err := session.Run(
		map[tensorflow.Output]*tensorflow.Tensor{
			input: tensor,
		},
		[]tensorflow.Output{
			output,
		},
		nil)
	if err != nil {
		return nil, err
	}

	return normalized[0], nil
}

After normalizing the image, we create a session for inference over modelGraph:

	session, err := tensorflow.NewSession(modelGraph, nil)

With the help of this session, we start the recognition itself. The input is our normalized image:

	modelGraph.Operation("input").Output(0): normalizedImg,

The result of the calculation (recognition) is saved in the outputRecognize variable. From the received data we take the top 3 results:

	res := getTopFiveLabels(labels, outputRecognize[0].Value().([][]float32)[0])
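The expression `outputRecognize[0].Value().([][]float32)[0]` relies on the output tensor holding a batch of probability vectors, one row per image. Value() returns an empty interface, so an unchecked type assertion panics when the value has a different type. A minimal sketch of the same extraction with a checked assertion follows; it uses a plain `interface{}` in place of a real tensor value, and `firstBatch` is a hypothetical helper, not part of the original server.

```go
package main

import (
	"errors"
	"fmt"
)

// firstBatch returns the probabilities of the first image in the batch,
// or an error if the value does not have the expected shape.
func firstBatch(value interface{}) ([]float32, error) {
	batches, ok := value.([][]float32)
	if !ok {
		return nil, fmt.Errorf("unexpected output type %T", value)
	}
	if len(batches) == 0 {
		return nil, errors.New("empty output batch")
	}
	return batches[0], nil
}

func main() {
	// Stand-in for outputRecognize[0].Value(): a batch holding one image.
	var value interface{} = [][]float32{{0.1, 0.7, 0.2}}
	probs, err := firstBatch(value)
	fmt.Println(probs, err)

	// A value of the wrong type yields an error instead of a panic.
	_, err = firstBatch("not a tensor value")
	fmt.Println(err)
}
```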
Despite its name, getTopFiveLabels returns the ResultCount (3) most probable labels:

func getTopFiveLabels(labels []string, probabilities []float32) []Label {
	var resultLabels []Label
	for i, p := range probabilities {
		if i >= len(labels) {
			break
		}
		resultLabels = append(resultLabels, Label{Label: labels[i], Probability: p})
	}

	sort.Sort(Labels(resultLabels))
	return resultLabels[:ResultCount]
}

For the HTTP response, we return only the single most likely result:

	msg := fmt.Sprintf("This is: %s (%.2f%%)", res[0].Label, res[0].Probability*100)
	_, err = w.Write([]byte(msg))
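One detail of getTopFiveLabels is worth noting: `resultLabels[:ResultCount]` panics if fewer than ResultCount labels are available. A minimal sketch of the same ranking with sort.Slice and a guard on the slice bound is shown here. `topLabels` and its parameter `n` are hypothetical names introduced for illustration, and the labels and probabilities in `main` are made-up sample data.

```go
package main

import (
	"fmt"
	"sort"
)

// Label pairs a class name with its probability, as in the server code.
type Label struct {
	Label       string
	Probability float32
}

// topLabels returns at most n labels, ordered by descending probability.
func topLabels(labels []string, probabilities []float32, n int) []Label {
	var result []Label
	for i, p := range probabilities {
		if i >= len(labels) {
			break
		}
		result = append(result, Label{Label: labels[i], Probability: p})
	}
	sort.Slice(result, func(a, b int) bool {
		return result[a].Probability > result[b].Probability
	})
	// Clamp n so that short inputs do not cause a slice-bounds panic.
	if n < 0 {
		n = 0
	}
	if n > len(result) {
		n = len(result)
	}
	return result[:n]
}

func main() {
	labels := []string{"cat", "dog", "fox", "owl"}
	probs := []float32{0.10, 0.65, 0.05, 0.20}
	for _, l := range topLabels(labels, probs, 3) {
		fmt.Printf("label: %s, probability: %.2f%%\n", l.Label, l.Probability*100)
	}
}
```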
All the code of our recognition server:

package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"io/ioutil"
	"log"
	"net/http"
	"os"
	"sort"

	tensorflow "github.com/tensorflow/tensorflow/tensorflow/go"
	"github.com/tensorflow/tensorflow/tensorflow/go/op"
)

const (
	ResultCount = 3
)

var (
	graphFile  = "/model/tensorflow_inception_graph.pb"
	labelsFile = "/model/imagenet_comp_graph_label_strings.txt"
)

type Label struct {
	Label       string
	Probability float32
}

type Labels []Label

func (l Labels) Len() int {
	return len(l)
}

func (l Labels) Swap(i, j int) {
	l[i], l[j] = l[j], l[i]
}

func (l Labels) Less(i, j int) bool {
	return l[i].Probability > l[j].Probability
}

var (
	modelGraph *tensorflow.Graph
	labels     []string
)

func main() {
	// I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
	// unable to make a tensor from image: Expected image (JPEG, PNG, or GIF), got empty file
	err := os.Setenv("TF_CPP_MIN_LOG_LEVEL", "2")
	if err != nil {
		log.Fatalln(err)
	}

	modelGraph, labels, err = loadModel()
	if err != nil {
		log.Fatalf("unable to load model: %v", err)
	}

	log.Println("Run RECOGNITION server ....")
	http.HandleFunc("/", mainHandler)
	err = http.ListenAndServe(":8080", nil)
	if err != nil {
		log.Fatalln(err)
	}
}

func mainHandler(w http.ResponseWriter, r *http.Request) {
	normalizedImg, err := normalizeImage(r.Body)
	if err != nil {
		log.Fatalf("unable to make a normalizedImg from image: %v", err)
	}

	// Create a session for inference over modelGraph
	session, err := tensorflow.NewSession(modelGraph, nil)
	if err != nil {
		log.Fatalf("could not init session: %v", err)
	}

	outputRecognize, err := session.Run(
		map[tensorflow.Output]*tensorflow.Tensor{
			modelGraph.Operation("input").Output(0): normalizedImg,
		},
		[]tensorflow.Output{
			modelGraph.Operation("output").Output(0),
		},
		nil,
	)
	if err != nil {
		log.Fatalf("could not run inference: %v", err)
	}

	res := getTopFiveLabels(labels, outputRecognize[0].Value().([][]float32)[0])
	log.Println("--- recognition result:")
	for _, l := range res {
		fmt.Printf("label: %s, probability: %.2f%%\n", l.Label, l.Probability*100)
	}
	log.Println("---")

	msg := fmt.Sprintf("This is: %s (%.2f%%)", res[0].Label, res[0].Probability*100)
	_, err = w.Write([]byte(msg))
	if err != nil {
		log.Fatalf("could not write server response: %v", err)
	}
}

func loadModel() (*tensorflow.Graph, []string, error) {
	// Load inception model
	model, err := ioutil.ReadFile(graphFile)
	if err != nil {
		return nil, nil, err
	}
	graph := tensorflow.NewGraph()
	if err := graph.Import(model, ""); err != nil {
		return nil, nil, err
	}

	// Load labels
	labelsFile, err := os.Open(labelsFile)
	if err != nil {
		return nil, nil, err
	}
	defer labelsFile.Close()
	scanner := bufio.NewScanner(labelsFile)
	var labels []string
	for scanner.Scan() {
		labels = append(labels, scanner.Text())
	}

	return graph, labels, scanner.Err()
}

func getTopFiveLabels(labels []string, probabilities []float32) []Label {
	var resultLabels []Label
	for i, p := range probabilities {
		if i >= len(labels) {
			break
		}
		resultLabels = append(resultLabels, Label{Label: labels[i], Probability: p})
	}

	sort.Sort(Labels(resultLabels))
	return resultLabels[:ResultCount]
}

func normalizeImage(imgBody io.ReadCloser) (*tensorflow.Tensor, error) {
	var buf bytes.Buffer
	_, err := io.Copy(&buf, imgBody)
	if err != nil {
		return nil, err
	}

	tensor, err := tensorflow.NewTensor(buf.String())
	if err != nil {
		return nil, err
	}

	graph, input, output, err := getNormalizedGraph()
	if err != nil {
		return nil, err
	}

	session, err := tensorflow.NewSession(graph, nil)
	if err != nil {
		return nil, err
	}

	normalized, err := session.Run(
		map[tensorflow.Output]*tensorflow.Tensor{
			input: tensor,
		},
		[]tensorflow.Output{
			output,
		},
		nil)
	if err != nil {
		return nil, err
	}

	return normalized[0], nil
}

// Creates a graph to decode, resize and normalize an image
func getNormalizedGraph() (graph *tensorflow.Graph, input, output tensorflow.Output, err error) {
	s := op.NewScope()
	input = op.Placeholder(s, tensorflow.String)
	decode := op.DecodeJpeg(s, input, op.DecodeJpegChannels(3)) // 3 RGB

	output = op.Sub(s,
		op.ResizeBilinear(s,
			op.ExpandDims(s,
				op.Cast(s, decode, tensorflow.Float),
				op.Const(s.SubScope("make_batch"), int32(0))),
			op.Const(s.SubScope("size"), []int32{224, 224})),
		op.Const(s.SubScope("mean"), float32(117)))
	graph, err = s.Finalize()

	return graph, input, output, err
}

Now, we need to build this image. Of course, we can build the image and run it from the console using the appropriate commands, but it is more convenient to keep these commands in a Makefile. So, let's create this handy file:
recognition_build:
	docker build -t imgrecognition .

recognition_run:
	docker run -it -p 8080:8080 imgrecognition

After that, open a terminal and run the command:

	make recognition_build && make recognition_run

Now, in the first terminal, we have a local HTTP server that accepts images. In response, it sends a text message describing what was recognized in the image. This is, so to say, the "core" of our project.

Creating a Telegram Bot

Next, we need to create a Telegram bot. To build the bot, we need to write a second HTTP server. The first server recognizes our images and uses port 8080. The second one is the bot's server and uses port 3000.

First, we need to create the bot from your Telegram account via BotFather. On registration, you receive the bot's name and its token. Don't tell anyone this token.

Let's put the token in the BotToken constant. You should get something like this:

	const BotToken = "1695571234:AAEbodyrfOjto2xNE5yjpQpW2Gyq0Ob5X24D5"

Our bot's handler decodes the JSON request body:

	json.NewDecoder(r.Body).Decode(webhookBody)

We are interested in the photo in the sent message, webhookBody.Message.Photo. Using the unique image ID, photoSize.FileID, we build a link to the file description:

	fmt.Sprintf(GetFileUrl, BotToken, photoSize.FileID)

The description contains the file path, from which we build the download link and fetch the image itself:

	downloadResponse, err = http.Get(downloadFileUrl)

We send the image bytes to the handler of our first server:

	msg := recognitionClient.Recognize(downloadResponse)

In response, we get a message, a text string. After that, we simply send this string to the user, as is, through the Telegram bot.
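Telegram delivers the photo field as an array of PhotoSize entries that describe the same picture at several resolutions. A minimal sketch of decoding an update and picking the largest entry by pixel area follows. The trimmed-down structs, `decodeUpdate`, and `largestPhoto` are hypothetical names introduced here, and the sample JSON is made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// photoSize keeps only the fields needed to choose a resolution.
type photoSize struct {
	FileID string `json:"file_id"`
	Width  int    `json:"width"`
	Height int    `json:"height"`
}

// update is a reduced view of the webhook body: chat ID and photo sizes.
type update struct {
	Message struct {
		Chat struct {
			ID int64 `json:"id"`
		} `json:"chat"`
		Photo []photoSize `json:"photo"`
	} `json:"message"`
}

// decodeUpdate parses a webhook body given as a JSON string.
func decodeUpdate(raw string) (update, error) {
	var u update
	err := json.NewDecoder(strings.NewReader(raw)).Decode(&u)
	return u, err
}

// largestPhoto returns the entry with the biggest pixel area.
// The boolean is false when the message carries no photo.
func largestPhoto(photos []photoSize) (photoSize, bool) {
	if len(photos) == 0 {
		return photoSize{}, false
	}
	best := photos[0]
	for _, p := range photos[1:] {
		if p.Width*p.Height > best.Width*best.Height {
			best = p
		}
	}
	return best, true
}

// sampleUpdate is invented test data, not a captured Telegram payload.
const sampleUpdate = `{"message":{"chat":{"id":42},"photo":[
	{"file_id":"small","width":90,"height":60},
	{"file_id":"large","width":1280,"height":853},
	{"file_id":"medium","width":320,"height":213}]}}`

func main() {
	u, err := decodeUpdate(sampleUpdate)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	if p, ok := largestPhoto(u.Message.Photo); ok {
		fmt.Println(u.Message.Chat.ID, p.FileID)
	}
}
```

Choosing one entry up front means a single getFile request and a single download per message, rather than one per available size.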
The entire bot code:

package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io/ioutil"
	"log"
	"net/http"

	"github.com/romanitalian/recognition/src/bot/recognition"
)

// Register Bot: curl -F "url=https://9068b6869da7.ngrok.io" https://api.telegram.org/bot1695571234:AAEbodyrfOjto2xNE5yjpQpW2Gyq0Ob5X24D5/setWebhook
const (
	BotToken         = "1695571234:AAEbodyrfOjto2xNE5yjpQpW2Gyq0Ob5X24D5"
	GetFileUrl       = "https://api.telegram.org/bot%s/getFile?file_id=%s"
	DownloadFileUrl  = "https://api.telegram.org/file/bot%s/%s"
	SendMsgToUserUrl = "https://api.telegram.org/bot%s/sendMessage"
)

type webhookReqBody struct {
	Message Msg
}

type Msg struct {
	MessageId int    `json:"message_id"`
	Text      string `json:"text"`
	From      struct {
		ID        int64  `json:"id"`
		FirstName string `json:"first_name"`
		Username  string `json:"username"`
	} `json:"from"`
	Photo *[]PhotoSize `json:"photo"`
	Chat  struct {
		ID        int64  `json:"id"`
		FirstName string `json:"first_name"`
		Username  string `json:"username"`
	} `json:"chat"`
	Date  int `json:"date"`
	Voice struct {
		Duration int64  `json:"duration"`
		MimeType string `json:"mime_type"`
		FileId   string `json:"file_id"`
		FileSize int64  `json:"file_size"`
	} `json:"voice"`
}

type PhotoSize struct {
	FileID   string `json:"file_id"`
	Width    int    `json:"width"`
	Height   int64  `json:"height"`
	FileSize int64  `json:"file_size"`
}

type ImgFileInfo struct {
	Ok     bool `json:"ok"`
	Result struct {
		FileId       string `json:"file_id"`
		FileUniqueId string `json:"file_unique_id"`
		FileSize     int    `json:"file_size"`
		FilePath     string `json:"file_path"`
	} `json:"result"`
}

func main() {
	log.Println("Run BOT server ....")
	err := http.ListenAndServe(":3000", http.HandlerFunc(Handler))
	if err != nil {
		log.Fatalln(err)
	}
}

// This handler is called every time Telegram sends us a webhook event
func Handler(w http.ResponseWriter, r *http.Request) {
	// First, decode the JSON request body
	webhookBody := &webhookReqBody{}
	err := json.NewDecoder(r.Body).Decode(webhookBody)
	if err != nil {
		log.Println("could not decode request body", err)
		return
	}

	// ------------------------- Download last img

	var downloadResponse *http.Response
	if webhookBody.Message.Photo == nil {
		log.Println("no photo in webhook body. webhookBody: ", webhookBody)
		return
	}
	for _, photoSize := range *webhookBody.Message.Photo {
		// GET JSON ABOUT OUR IMG (ORDER TO GET FILE_PATH)
		imgFileInfoUrl := fmt.Sprintf(GetFileUrl, BotToken, photoSize.FileID)
		rr, err := http.Get(imgFileInfoUrl)
		if err != nil {
			log.Println("unable retrieve img by FileID", err)
			return
		}
		defer rr.Body.Close()
		// READ JSON
		fileInfoJson, err := ioutil.ReadAll(rr.Body)
		if err != nil {
			log.Println("unable read img by FileID", err)
			return
		}
		// UNMARSHAL JSON
		imgInfo := &ImgFileInfo{}
		err = json.Unmarshal(fileInfoJson, imgInfo)
		if err != nil {
			log.Println("unable unmarshal file description from api.telegram by url: "+imgFileInfoUrl, err)
		}
		// GET FILE_PATH
		downloadFileUrl := fmt.Sprintf(DownloadFileUrl, BotToken, imgInfo.Result.FilePath)
		downloadResponse, err = http.Get(downloadFileUrl)
		if err != nil {
			log.Println("unable download file by file_path: "+downloadFileUrl, err)
			return
		}
		defer downloadResponse.Body.Close()
	}

	// --------------------------- Send img to server recognition.
	recognitionClient := recognition.New()
	msg := recognitionClient.Recognize(downloadResponse)

	if err := sendResponseToUser(webhookBody.Message.Chat.ID, msg); err != nil {
		log.Println("error in sending reply: ", err)
		return
	}
}

// The below code deals with the process of sending a response message
// to the user

// Create a struct to conform to the JSON body
// of the send message request
// https://core.telegram.org/bots/api#sendmessage
type sendMessageReqBody struct {
	ChatID int64  `json:"chat_id"`
	Text   string `json:"text"`
}

// sendResponseToUser notifies the user about what was found in the image.
func sendResponseToUser(chatID int64, msg string) error {
	// Create the request body struct
	msgBody := &sendMessageReqBody{
		ChatID: chatID,
		Text:   msg,
	}

	// Create the JSON body from the struct
	msgBytes, err := json.Marshal(msgBody)
	if err != nil {
		return err
	}

	// Send a post request with your token
	res, err := http.Post(fmt.Sprintf(SendMsgToUserUrl, BotToken), "application/json", bytes.NewBuffer(msgBytes))
	if err != nil {
		return err
	}
	if res.StatusCode != http.StatusOK {
		buf := new(bytes.Buffer)
		_, err := buf.ReadFrom(res.Body)
		if err != nil {
			return err
		}
		return errors.New("unexpected status: " + res.Status)
	}

	return nil
}
The client code that sends the image from the bot to the recognition server:

package recognition

import (
	"io/ioutil"
	"log"
	"net/http"
)

const imgRecognitionAddress = "http://localhost:8080/"

type Client struct {
	httpClient *http.Client
}

func New() *Client {
	return &Client{
		httpClient: &http.Client{},
	}
}

func (c *Client) Recognize(downloadResponse *http.Response) string {
	var msg string
	method := "POST"

	req, err := http.NewRequest(method, imgRecognitionAddress, downloadResponse.Body)
	if err != nil {
		log.Println("error from server recognition", err)
		return msg
	}
	req.Header.Add("Content-Type", "image/png")

	// do request to server recognition.
	recognitionResponse, err := c.httpClient.Do(req)
	if err != nil {
		log.Println(err)
		return msg
	}
	defer func() {
		er := recognitionResponse.Body.Close()
		if er != nil {
			log.Println(er)
		}
	}()

	recognitionResponseBody, err := ioutil.ReadAll(recognitionResponse.Body)
	if err != nil {
		log.Println("error on read response from server recognition", err)
		return msg
	}
	msg = string(recognitionResponseBody)

	return msg
}

For our bot to work correctly, Telegram must be able to reach our handler. To open a public tunnel to the bot's server, run:

	ngrok http 3000

Immediately after executing this command, you will see a list of public addresses. The last one will be an HTTPS address; that is the one we need. For example, it can be https://9068b6869da7.ngrok.io.

Then register our bot, that is, tell Telegram where to send webhooks:

	curl -F "url=https://9068b6869da7.ngrok.io" https://api.telegram.org/bot1695571234:AAEbodyrfOjto2xNE5yjpQpW2Gyq0Ob5X24D5/setWebhook

Now you can send a photo to your bot and get information about what is depicted in it. Thanks for your attention.

Creating an Image Recognizer on Golang Telegram Bot
