Over the past decade, I’ve worked on and off as a journalist, sometimes full-time and sometimes as an occasional freelancer. It became clear to me early on that having some data skills might help me find interesting stories. But I tried and failed to learn these skills several times over the years before finally making a breakthrough that allowed me to research and pitch data stories successfully.
Here are some of the big mistakes I made along the way (and how you can avoid them to make your own data learning journey easier):
Doing data journalism requires some programming skills, but a mistake that I made from pretty much my first Google search was trying to “learn programming” rather than trying to learn the specific programming skills I needed for working with data.
This led me to Python, which is actually an excellent language for data work. However, it also led me to generic “learn Python” resources that weren’t specific to working with data. Over my repeated attempts — I’m ashamed to say I made this mistake several times — I built text adventure games, wrote scripts to open pop-up windows, and got deeply confused by object-oriented programming.
What I didn’t do was actually learn how to work with data.
That’s not the fault of any of the resources I used, which are probably excellent for building Python programming skills that can be applied in a variety of disciplines. But I wasn’t trying to work in a variety of disciplines. I wanted to use data to find interesting stories. Struggling through projects that felt completely unrelated to my reason for learning Python was deeply demotivating. It gave me an easy excuse to quit, so when I hit particularly frustrating challenges, I did quit. Repeatedly.
What finally made the difference? A learning platform that taught Python specifically for data analysis and data science. (Although I have no interest in being a data scientist, the rudimentary skills required for the data journalism I wanted to do are essentially the same skills required of a data analyst.)
Finding a data-analysis-specific program meant that I was no longer wasting time on projects that felt irrelevant to my goals. I was using Python to work with data sets almost from the start, and the skills I was learning applied directly to any data set I could find and download.
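To give a rough sense of what working with a data set right away can look like, here is a minimal sketch that uses only Python’s standard library. Everything in it is hypothetical: the sample data, the column names, and the `total_by` helper are made up for illustration and don’t come from any real data set or course.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data standing in for a CSV file you downloaded.
SAMPLE_CSV = """city,year,incidents
Springfield,2019,12
Springfield,2020,15
Shelbyville,2019,7
Shelbyville,2020,9
"""


def total_by(rows, key, value):
    """Sum the numeric `value` column for each distinct entry in `key`."""
    totals = Counter()
    for row in rows:
        totals[row[key]] += int(row[value])
    return dict(totals)


# csv.DictReader turns each line into a dict keyed by the header row.
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
totals_by_city = total_by(rows, "city", "incidents")
print(totals_by_city)
```

With a real file, you would swap `io.StringIO(SAMPLE_CSV)` for `open("your_file.csv", newline="")`. Even a small summary like this is often enough to suggest a first question to ask of the data.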
Another mistake I made in my early attempts to learn was not trying to work on my own data projects. I would simply find a Python book and work through the chapters and exercises it prescribed. I didn’t try to build anything on my own.
Part of the problem was that, as mentioned above, I wasn’t learning data-specific skills, so I didn’t really know where to start. But part of the problem was also that I didn’t understand how valuable not knowing how to do a project can be to the learning process.
Early on in my most recent and most successful attempt to learn Python, I got an idea for a project. Having done a lot of research on the topic of child trafficking in China, I was aware of a massive web forum cataloguing parent-reported missing child cases. Could I somehow collect, clean, and analyze that data? What kind of stories might it contain?
Although I was learning data-specific skills, I was still a beginner, and this project required me to do a whole bunch of things I’d never done before. It was a struggle. I spent hours on Google and Stack Overflow, frustrated by roadblocks that would have inspired me to quit if I had been working on a textbook exercise. But because it was a topic I was personally invested in, I stuck with it.
Unfortunately, although I finished my child trafficking data project, it was never published. I got scooped, and it was 100% my fault.
When I got into analyzing the data, I quickly realized there were several interesting trends that would make for a good story. I pitched the idea to an editor at the best-paying publication I’d worked with, and the editor accepted. I was thrilled!
Then I discovered that I’d made a mistake in the data — a small number of cases were being counted twice.
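A double count like this is the kind of problem a few lines of Python can surface. The sketch below is not the code from my project. The records, the `case_id` field, and both helper functions are hypothetical, and it assumes every case carries a unique identifier, which a real scrape may not provide.

```python
from collections import Counter

# Hypothetical scraped records; "A102" appears twice on purpose.
cases = [
    {"case_id": "A101", "region": "North", "year": 2009},
    {"case_id": "A102", "region": "South", "year": 2011},
    {"case_id": "A102", "region": "South", "year": 2011},
    {"case_id": "A103", "region": "East", "year": 2012},
]


def find_duplicate_ids(records, key="case_id"):
    """Return the sorted identifiers that appear more than once."""
    counts = Counter(record[key] for record in records)
    return sorted(value for value, n in counts.items() if n > 1)


def drop_duplicates(records, key="case_id"):
    """Keep only the first record seen for each identifier."""
    seen = set()
    unique = []
    for record in records:
        if record[key] not in seen:
            seen.add(record[key])
            unique.append(record)
    return unique


duplicate_ids = find_duplicate_ids(cases)
unique_cases = drop_duplicates(cases)
print(duplicate_ids, len(cases), len(unique_cases))
```

Running a check like this before and after cleaning shows how many records were affected, which makes it easier to confirm whether the trends in the analysis still hold.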
The fix in my code was relatively easy, and it didn’t affect any of the trends I’d found in the data, but the mistake shook my confidence. I was no data journalist. I was just some hack who’d learned a little Python. When this article was published, I feared some other mistake I hadn’t caught would be revealed. I would be revealed as a fraud.
And so I delayed. I fiddled with the code. I re-scraped the forum to capture the latest cases. I played around with wording in my draft. There was no hard deadline, and I wasted a couple of months fiddling needlessly with the piece.
While I was tinkering, a team of scholars published an article of their own. It was exactly what I had been working on, but with more depth and academic rigor. What had made my pitch unique and interesting (the data set I had scraped and analyzed) now made it completely redundant. I was so demoralized that I spiked the article myself.
My failure here was that I kept thinking I needed to learn more to be qualified. I needed to reach some kind of endpoint where I could say I had “learned Python.”
I now realize that no such endpoint exists. Programming languages are like spoken languages in that way. You could spend a decade learning and reach a point of high fluency, but you’ll never know everything.
And just as there’s no reason to wait until you’re fluent in Spanish to ask “¿Dónde está el baño?” there’s no reason to wait until you’re “fluent” in Python to put the skills you do have to practical use.
Learning the data skills that I could actually use for journalism proved to be easier than I thought, once I took the right approach. If you’re interested in learning data skills for journalism (or any other reason), you can avoid my mistakes by doing the following:
Find a data-specific learning platform or resource so that you’re learning only what’s relevant to your goals.
Build your own data projects based on your personal passions and interests as soon as possible. Don’t wait for some arbitrary point where you know enough; learning things you don’t know is an important part of the process.
Don’t be afraid to start using your skills. If you make an honest mistake, it’s not the end of the world, and the alternative — developing new skills but not using them — is deeply unsatisfying.