Go-CoNLLU Introduction: OSS Tool For Machine Learning Support in Go

by Lane Wagner, August 14th, 2020
Too Long; Didn't Read

Go-conllu is an open-source tool for machine learning support in Go. It is a simple and reliable way to import CoNLL-U data into your application as Go structs, and the GoDoc has the specifics. The Universal Dependencies project hosts many annotated text corpora in this format, and using them requires a parser that makes the data easy for developers to work with. The quick-start example parses a CoNLL-U file with go-conllu and prints its sentences and tokens to the console.

Python is commonly seen as the AI/ML language, but is often a dull blade due to unsafe typing and being slow, like really slow. Many popular natural language processing toolkits only have Python APIs, and we want to see that change. At Nuvi, we use Go for the majority of our data processing tasks because we can write simple and fast code. Today we are open-sourcing a tool that has helped make our ML lives easier in Go. Say hello to go-conllu.

What is CoNLL-U?

The Conference on Natural Language Learning (CoNLL) has created multiple file formats for storing natural language annotations. CoNLL-U is one such format and is used by the Universal Dependencies project, which hosts many annotated text corpora. To use these corpora, we need a parser that makes the data easy for developers to work with.
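To make the format concrete, here is a minimal sketch of what CoNLL-U text looks like and how it breaks down. Each token sits on its own line with ten tab-separated fields (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC), lines starting with # are comments, and a blank line ends a sentence. The sample text, the token type, and the parseSentences helper are all made up for illustration and use only the standard library; this is not how go-conllu is implemented.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// token holds the ten CoNLL-U columns as raw strings.
// Illustrative only: this is NOT the type exported by go-conllu.
type token struct {
	ID, Form, Lemma, UPOS, XPOS, Feats, Head, Deprel, Deps, Misc string
}

// sample is a tiny hand-written fragment, not taken from a real treebank.
const sample = "# text = Dogs bark.\n" +
	"1\tDogs\tdog\tNOUN\tNNS\tNumber=Plur\t2\tnsubj\t_\t_\n" +
	"2\tbark\tbark\tVERB\tVBP\tMood=Ind|Tense=Pres|VerbForm=Fin\t0\troot\t_\tSpaceAfter=No\n" +
	"3\t.\t.\tPUNCT\t.\t_\t2\tpunct\t_\t_\n" +
	"\n"

// parseSentences (hypothetical helper) skips comment lines, treats blank
// lines as sentence boundaries, and splits every other line on tabs.
func parseSentences(input string) ([][]token, error) {
	var sentences [][]token
	var current []token

	scanner := bufio.NewScanner(strings.NewReader(input))
	for scanner.Scan() {
		line := scanner.Text()
		switch {
		case strings.HasPrefix(line, "#"):
			// Comment or sentence-level metadata: ignored in this sketch.
			continue
		case strings.TrimSpace(line) == "":
			// A blank line closes the current sentence.
			if len(current) > 0 {
				sentences = append(sentences, current)
				current = nil
			}
		default:
			f := strings.Split(line, "\t")
			if len(f) != 10 {
				return nil, fmt.Errorf("expected 10 fields, got %d: %q", len(f), line)
			}
			current = append(current, token{f[0], f[1], f[2], f[3], f[4], f[5], f[6], f[7], f[8], f[9]})
		}
	}
	if err := scanner.Err(); err != nil {
		return nil, err
	}
	// Keep a final sentence that is not followed by a blank line.
	if len(current) > 0 {
		sentences = append(sentences, current)
	}
	return sentences, nil
}

func main() {
	sentences, err := parseSentences(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, s := range sentences {
		for _, t := range s {
			fmt.Printf("%s\t%s\t%s\t%s\n", t.ID, t.Form, t.UPOS, t.Deprel)
		}
	}
}
```

The sketch keeps every column as a string and ignores details such as multiword token ranges and the structure inside FEATS and DEPS, which is exactly the kind of work a dedicated parser takes off your hands.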

How Does Go-Conllu Help?

Go-conllu parses CoNLL-U data. It is a simple and reliable way to import CoNLL-U data into your application as Go structs.

The package's GoDoc has the specifics.

Let's take a look at the example quick-start code from the Readme. First, download the package.

go get github.com/nuvi/go-conllu

Then in a new project:

package main

import (
	"fmt"
	"log"

	conllu "github.com/nuvi/go-conllu"
)

func main() {
	// Parse the whole file into a slice of sentences.
	sentences, err := conllu.ParseFile("path/to/model.conllu")
	if err != nil {
		log.Fatal(err)
	}

	// Print every token, with a blank line between sentences.
	for _, sentence := range sentences {
		for _, token := range sentence.Tokens {
			fmt.Println(token)
		}
		fmt.Println()
	}
}

All the sentences and tokens in the corpus will be printed to the console.

If you need a .conllu corpus file, you can download the Universal Dependencies English training corpus: en_ewt-ud-train.conllu
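Printing tokens is only a starting point. As a rough sketch of what you might do next, the example below tallies how often each universal part-of-speech tag appears. The Sentence and Token types here are simplified stand-ins defined locally so the snippet runs on its own. The Tokens field mirrors the quick-start code above, but the Form and UPOS field names and the countUPOS helper are assumptions, so check the GoDoc for the package's actual field names before adapting it.

```go
package main

import (
	"fmt"
	"sort"
)

// Token and Sentence are local stand-ins for the parsed data.
// Field names other than Tokens are assumptions, not the go-conllu API.
type Token struct {
	Form string
	UPOS string
}

type Sentence struct {
	Tokens []Token
}

// countUPOS (hypothetical helper) counts tokens per part-of-speech tag.
func countUPOS(sentences []Sentence) map[string]int {
	counts := make(map[string]int)
	for _, s := range sentences {
		for _, t := range s.Tokens {
			counts[t.UPOS]++
		}
	}
	return counts
}

func main() {
	// Hand-written sample data standing in for a parsed corpus.
	sentences := []Sentence{
		{Tokens: []Token{{"Dogs", "NOUN"}, {"bark", "VERB"}, {".", "PUNCT"}}},
		{Tokens: []Token{{"Cats", "NOUN"}, {"sleep", "VERB"}, {".", "PUNCT"}}},
	}

	counts := countUPOS(sentences)

	// Sort the tags so the output order is stable.
	tags := make([]string, 0, len(counts))
	for tag := range counts {
		tags = append(tags, tag)
	}
	sort.Strings(tags)

	for _, tag := range tags {
		fmt.Printf("%s\t%d\n", tag, counts[tag])
	}
}
```

With the real package, the same loop shape should apply to the slice returned by conllu.ParseFile, since the quick-start already ranges over sentence.Tokens.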

Thanks For Reading

  • Follow us on Twitter @q_vault if you have any questions or comments
  • Take game-like coding courses on Qvault Classroom
  • Subscribe to our Newsletter for more educational articles

Previously published at https://qvault.io/2020/06/08/go-conllu-some-much-needed-machine-learning-support-in-go/