From CSV to Buxfer: an unexpected journey — Goxfer

Written by wilk | Published 2017/11/01
Tech Story Tags: golang | docker | docker-compose | api | mongodb


Part 4 — Goxfer: a story about pushing a transactions dataset online

Preamble

In Collector — Part 3, I coded the collector program, putting the cleaned data inside a MongoDB instance. Now, it's time to use Buxfer's APIs to put the whole dataset online!

Goxfer is a program that reads structured data from MongoDB and then pushes it online via Buxfer’s APIs.

Journey

In this article, I cover the fourth part of this journey:

  1. Part 1: Introduction
  2. Part 2: Cleaner
  3. Part 3: Collector
  4. Part 4 (this part): Goxfer
  5. Part 5: Conclusions

Setup

This new program will be coded with GoLang: this means I need to build a custom docker image and install the dependencies via glide, one of the best package managers for Go.

Ok, let's start with docker! First, I need to update the setup-golang service to build the container image:

Then, build the image and get inside the container:

$ docker-compose build setup-golang
$ docker-compose run --rm setup-golang bash

Goxfer needs some dependencies, like mgo (the MongoDB driver for GoLang) and gorequest (a handy HTTP client): I have to get them with glide because it will track them using lock files, which is very useful when it comes to restoring the project on another system.

# from inside the setup-golang container
$ glide create
$ glide get github.com/parnurzeal/gorequest
$ glide get gopkg.in/mgo.v2

With glide create, two new files are added to the project:

  1. glide.yaml: the glide manifest, where all the dependencies are listed and pinned to a version following the semantic versioning specs
  2. glide.lock: a lock file that ensures a deterministic restore procedure

With glide get the dependencies are downloaded into the new local vendor folder and added to the glide.yaml and glide.lock files.

Now, with docker-compose it is possible to restore/initialise the project on any system:

$ docker-compose run --rm setup-golang

Goxfer: hands on!

So, what does Goxfer have to do? Let's define the program procedure:

  1. connect to MongoDB
  2. initialise Buxfer’s session
  3. retrieve Buxfer’s accounts list
  4. fetch transactions from the DB and build transactions’ bulks
  5. push transactions’ bulks online via Buxfer’s APIs

I have to forge it step by step.
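As a skeleton, the main function will look more or less like this (the helper names are placeholders that I'll flesh out step by step below):

package main

func main() {
    // 1. connect to MongoDB (mongoHost, dbName and collectionName
    //    are read from the environment, as shown later)
    session, transactions := connectDB(mongoHost, dbName, collectionName)
    defer session.Close()

    // 2. initialise Buxfer's session
    token, err := login()
    if err != nil {
        panic(err)
    }

    // 3. retrieve Buxfer's accounts list
    expenseID, incomeID := fetchAccounts(token)

    // 4. fetch the transactions from the DB and build the bulks
    bulks := buildBulks(transactions)

    // 5. push the bulks online via Buxfer's APIs
    pushBulks(bulks, token, expenseID, incomeID)
}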

Database

The very first thing I need to code is the database connection. mgo is a MongoDB driver and it just needs the database host to establish the connection, plus the database name and the name of the collection it will query for the transactions. I need to define that information inside the docker-compose.yml as environment variables for the goxfer service, already drafted in Part 1:

Great! Environment variables are always the best way to configure things from the outside, as the Twelve-Factor App teaches.

Now, I have to use the mgo driver to connect to MongoDB. But first, I've got to get those env vars… os.Getenv will do it for me:
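Something along these lines, where the env var names (MONGO_HOST, DB_NAME, COLLECTION_NAME) are placeholders for the ones actually defined in docker-compose.yml:

package main

import "os"

var (
    // Placeholder names: adapt them to the env vars defined
    // for the goxfer service in docker-compose.yml.
    mongoHost      = os.Getenv("MONGO_HOST")
    dbName         = os.Getenv("DB_NAME")
    collectionName = os.Getenv("COLLECTION_NAME")
)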

It’s time to create the MongoDB connection and to get the transactions collection:
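A minimal sketch of this step, using mgo.Dial and then DB().C() to grab the collection:

package main

import (
    mgo "gopkg.in/mgo.v2"
)

// connectDB opens the MongoDB session and returns the transactions
// collection: the connection is mandatory for every following step,
// so the program panics if it cannot be established.
func connectDB(host, dbName, collectionName string) (*mgo.Session, *mgo.Collection) {
    session, err := mgo.Dial(host)
    if err != nil {
        panic(err)
    }

    collection := session.DB(dbName).C(collectionName)
    return session, collection
}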

The database connection is required by the next steps, so I have to stop the program if this step fails: panic will be fine.

Buxfer’s session

Buxfer requires an authentication token for each request performed; so, before proceeding, I have to establish a new session. I'll use gorequest, an HTTP client with a practical and easy-to-use interface. The login API requires a username and a password, so I have to put new environment variables inside the docker-compose.yml file:

Then, I have to perform the HTTP request inside Goxfer:
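Here's a sketch of it: the endpoint and the parameter names (userid, password) are my reading of the Buxfer docs and may need adjusting, and BUXFER_USERNAME/BUXFER_PASSWORD are placeholder env var names:

package main

import (
    "encoding/json"
    "errors"
    "net/url"
    "os"

    "github.com/parnurzeal/gorequest"
)

// LoginResponse stores the login response.
// The JSON shape is an assumption based on the Buxfer docs.
type LoginResponse struct {
    Response struct {
        Status string `json:"status"`
        Token  string `json:"token"`
    } `json:"response"`
}

// login authenticates against Buxfer and returns the session token.
func login() (string, error) {
    // Placeholder env var names for the credentials.
    params := url.Values{}
    params.Set("userid", os.Getenv("BUXFER_USERNAME"))
    params.Set("password", os.Getenv("BUXFER_PASSWORD"))

    resp, body, errs := gorequest.New().
        Post("https://www.buxfer.com/api/login").
        Query(params.Encode()).
        End()

    // Checking for library errors first, then for HTTP errors.
    if len(errs) > 0 {
        return "", errs[0]
    }
    if resp.StatusCode != 200 {
        return "", errors.New("login failed: " + body)
    }

    var loginRes LoginResponse
    if err := json.Unmarshal([]byte(body), &loginRes); err != nil {
        return "", err
    }

    return loginRes.Response.Token, nil
}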

I've also defined the LoginResponse struct to store the login response, of course. Then, I've checked for library and HTTP errors: if everything's fine, I can use the token taken from the response.

Accounts list

Before proceeding with the transactions, I need to get the accounts list, because later I will need the id of each account. So, for now, I just have to map the expense and the income accounts to their ids:
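Sketched, it could look like this (the accounts endpoint and the JSON field names are assumptions based on the Buxfer docs):

package main

import (
    "encoding/json"
    "errors"
    "os"

    "github.com/parnurzeal/gorequest"
)

// Account is a single Buxfer account: field names and the numeric id
// are assumptions based on the Buxfer docs.
type Account struct {
    ID   int    `json:"id"`
    Name string `json:"name"`
}

// AccountsListResponse mirrors the JSON returned by the accounts API.
type AccountsListResponse struct {
    Response struct {
        Status   string    `json:"status"`
        Accounts []Account `json:"accounts"`
    } `json:"response"`
}

// fetchAccounts retrieves the accounts list and maps the expense and
// income accounts (identified by name through the EXPENSE_ACCOUNT_BUXFER
// and INCOME_ACCOUNT_BUXFER env vars) to their ids.
func fetchAccounts(token string) (expenseID, incomeID int) {
    resp, body, errs := gorequest.New().
        Get("https://www.buxfer.com/api/accounts").
        Query("token=" + token).
        End()

    // This step is required by the next ones: panic on any error.
    if len(errs) > 0 {
        panic(errs[0])
    }
    if resp.StatusCode != 200 {
        panic(errors.New("accounts list failed: " + body))
    }

    var accountsRes AccountsListResponse
    if err := json.Unmarshal([]byte(body), &accountsRes); err != nil {
        panic(err)
    }

    // Only two accounts, so a simple name check is enough.
    for _, account := range accountsRes.Response.Accounts {
        if account.Name == os.Getenv("EXPENSE_ACCOUNT_BUXFER") {
            expenseID = account.ID
        } else if account.Name == os.Getenv("INCOME_ACCOUNT_BUXFER") {
            incomeID = account.ID
        }
    }

    return expenseID, incomeID
}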

Also here, I've used gorequest to perform the HTTP GET call. Buxfer's API authentication is done through the session token, so I've attached it to the querystring (.Query("token=" + token)). The response is then stored inside the custom struct AccountsListResponse: this step is also required by the next ones, so I have to panic if at least one error has occurred. If everything went fine, I can check the accounts list. I've got only two accounts, so the check can be a simple either/or.

Now, I need to update the docker-compose.yml file with the EXPENSE_ACCOUNT_BUXFER and INCOME_ACCOUNT_BUXFER environment variables, so the code above won't break:

Great! Let's move on!

Transactions’ bulks

Now, what I want to do is get all the transactions from MongoDB and then push them to Buxfer. But there's a problem: Buxfer's APIs are limited. In fact, it is possible to push just one transaction at a time (https://www.buxfer.com/help/api#add_transaction). Well, I don't want to wait for the end of each request to perform a new one, nor to perform 1000 calls in parallel. So, I think I'll go for pushing a bulk of 20 transactions at a time. To do that, I need two things:

  1. pack the transactions inside a matrix (an array of bulks, i.e. an array of arrays)
  2. (ab)use GoLang’s goroutines to perform HTTP calls in parallel

Each bulk has a fixed size, defined by the environment variable BULK_LENGTH, which I'm going to put inside the docker-compose.yml file:

And now, let’s define the procedure to get the transactions from the database and to populate the bulks matrix:
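A sketch of it, where the Transaction fields are an assumption about the schema saved by Collector:

package main

import (
    "os"
    "strconv"
    "time"

    mgo "gopkg.in/mgo.v2"
)

// Transaction is the model stored in MongoDB by Collector.
// The exact fields are an assumption: adapt them to the real schema.
type Transaction struct {
    Description string    `bson:"description"`
    Amount      float64   `bson:"amount"`
    Tags        []string  `bson:"tags"`
    Date        time.Time `bson:"date"`
    Type        string    `bson:"type"` // "expense" or "income"
}

// buildBulks fetches every transaction from the collection and packs
// them into a matrix of bulks of BULK_LENGTH elements each.
func buildBulks(collection *mgo.Collection) [][]Transaction {
    bulkLen, err := strconv.Atoi(os.Getenv("BULK_LENGTH"))
    if err != nil {
        panic(err)
    }

    // Fetching all the transactions from MongoDB.
    var results []Transaction
    if err := collection.Find(nil).All(&results); err != nil {
        panic(err)
    }

    // Number of full bulks.
    iterations := len(results) / bulkLen

    bulks := make([][]Transaction, 0, iterations+1)
    for i := 0; i < iterations; i++ {
        bulks = append(bulks, results[i*bulkLen:(i+1)*bulkLen])
    }

    // Extra step when the total is not a multiple of bulkLen:
    // the last (shorter) bulk collects the remaining transactions.
    if len(results)%bulkLen > 0 {
        bulks = append(bulks, results[iterations*bulkLen:])
    }

    return bulks
}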

At the beginning, the transactions are stored inside the results variable, which is an array (a slice, in GoLang) of Transaction (the transaction model). Then, I calculate the number of iterations needed to populate the transactions matrix by dividing the length of the transactions list by BULK_LEN. A second, extra step collects the remaining transactions into a last, shorter bulk if their number is not a multiple of BULK_LEN.

Pushing to Buxfer

Ok, this is the tough part. As I said, I want to push bulks of 20 transactions at a time. This means I have to launch 20 goroutines in parallel and then wait until they have finished. All of them. Then, I can start with a new bulk, and so on.

sync will help me handle concurrent goroutine execution: inside this package there's WaitGroup, which basically provides three methods:

  1. Add: increments the workers counter
  2. Wait: blocks the main goroutine until the workers counter drops to zero
  3. Done: decrements the workers counter

So, what I have to do is:

  • loop through the bulks matrix
  • for each row (aka bulk), increment the WaitGroup counter by the length of the bulk
  • loop through the current bulk and, for each transaction, spawn a new goroutine to push the data to Buxfer
  • in each goroutine that finishes, decrement the WaitGroup counter via the Done method
  • finally, wait until the entire bulk has finished, using the Wait method of the WaitGroup, so the loop can continue with the next row of the matrix

During this procedure, I want to store how many transactions have been added and how many haven’t. Later I’ll log this information.

So, here’s the code, except for the actual request to Buxfer:
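A sketch of this loop; here I'm counting successes and failures with sync/atomic, which is one way to keep those counters safe across goroutines:

package main

import (
    "fmt"
    "sync"
    "sync/atomic"
)

// pushBulks pushes every bulk to Buxfer: one goroutine per transaction,
// waiting for the whole bulk to finish before starting the next one.
func pushBulks(bulks [][]Transaction, token string, expenseID, incomeID int) {
    var added, failed uint64
    var wg sync.WaitGroup

    for _, bulk := range bulks {
        // Incrementing the workers counter by the length of the bulk.
        wg.Add(len(bulk))

        for _, transaction := range bulk {
            // One goroutine per transaction: the transaction is passed
            // as an argument to avoid capturing the loop variable.
            go func(t Transaction) {
                defer wg.Done()

                if err := addTransaction(t, token, expenseID, incomeID); err != nil {
                    atomic.AddUint64(&failed, 1)
                    return
                }
                atomic.AddUint64(&added, 1)
            }(transaction)
        }

        // Waiting for the whole bulk before moving to the next one.
        wg.Wait()
    }

    fmt.Printf("transactions added: %d, failed: %d\n", added, failed)
}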

Now, the last thing to code is the addTransaction function. This function may return an error if something goes wrong while pushing the transaction online, following the way GoLang treats errors as errors and not as exceptions:
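A sketch of it, reusing the Transaction model from before; the add_transaction endpoint and the response shape are assumptions based on the Buxfer docs:

package main

import (
    "encoding/json"
    "errors"
    "net/url"
    "strconv"
    "strings"

    "github.com/parnurzeal/gorequest"
)

// AddResponseBody stores the add-transaction response.
// The JSON shape is an assumption based on the Buxfer docs.
type AddResponseBody struct {
    Response struct {
        Status string `json:"status"`
    } `json:"response"`
}

// addTransaction pushes a single transaction to Buxfer and returns an
// error if something goes wrong.
func addTransaction(t Transaction, token string, expenseID, incomeID int) error {
    // Picking the account id depending on the transaction type.
    accountID := expenseID
    if t.Type == "income" {
        accountID = incomeID
    }

    // Composing the payload: parameter names as listed below.
    payload := url.Values{}
    payload.Set("description", t.Description)
    payload.Set("amount", strconv.FormatFloat(t.Amount, 'f', 2, 64))
    payload.Set("accountId", strconv.Itoa(accountID))
    payload.Set("tags", strings.Join(t.Tags, ","))
    payload.Set("date", t.Date.Format("2006-01-02")) // YYYY-MM-DD
    payload.Set("token", token)
    payload.Set("type", t.Type)

    resp, body, errs := gorequest.New().
        Post("https://www.buxfer.com/api/add_transaction").
        Type("form").
        Send(payload.Encode()).
        End()

    if len(errs) > 0 {
        return errs[0]
    }
    if resp.StatusCode != 200 {
        return errors.New("add transaction failed: " + body)
    }

    // Storing the response: if it cannot be decoded, report the error.
    var addRes AddResponseBody
    if err := json.Unmarshal([]byte(body), &addRes); err != nil {
        return err
    }

    return nil
}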

Briefly: the HTTP request payload is composed of the following fields:

  • description: the transaction's description
  • amount: the transaction's amount
  • accountId: calculated by matching against the Buxfer accounts list fetched previously
  • tags: the transaction's tags, converted into a comma-separated string
  • date: the transaction's date, in YYYY-MM-DD format
  • token: the session token
  • type: the transaction's type, calculated the same way as accountId

Then the request is made and the response is stored inside the new AddResponseBody struct: if no error occurred, the function returns nil.

Putting it all together (on GitHub, because it is too long to display here): https://github.com/wilk/from-csv-to-buxfer/blob/master/go/src/goxfer.go

Run it, dude!

The final step: launch it 🚀!

$ docker-compose run --rm goxfer

Yippee 🎉

End of part 4

Wow, this last part was tough 😵 It required a lot of work and a lot of testing. In fact, the API for adding transactions on Buxfer was initially different (and buggy), and it got changed (and fixed) at the end of October. I had to change the source code, performing new tests with Postman and curl before running the program safely.

However, even though I tested everything beforehand, I got this:

Nothing to say. I was aware of this flaw (in fact, I didn't prepare any log to trace unpushed transactions), but I wanted to try anyway, putting all of my trust in the manual testing I did earlier. That required me to manually search through thousands of log lines: it was quite painful, but I deserved it, so it's ok 😬

But now, all of my 2016 transactions are online on Buxfer 🎉 I would like to spend more words on this part, but I'll do that in the next part: Conclusions!

If you enjoyed this article, don't forget to share it! See you in Part 5: Conclusions!

Sources

Source code is available here: https://github.com/wilk/from-csv-to-buxfer

Update

I've improved the unpushed-transactions logging and I found that one of the raw transactions was corrupted: its date was 29/11/1898, which is why Buxfer refused to accept it. Anyway, Cleaner, Collector and Goxfer did the job, and they did it really well!

