Code Search is a superpower

Before starting a new coding task, I find it’s usually worthwhile to spend at least a little time trying to find a related working example, especially when the task touches on unfamiliar libraries or concepts.

If I can find a good example, skimming it shifts my brain in the right direction and it feels like I’m absorbing some of its lessons.

My project is more likely to go smoothly, even if I don’t use a single line of code from the example.

Finding that good example, however, isn’t always so easy and the search itself might waste a lot of time. This article offers tips on how to quickly find nice code samples.

The Treasure Hunt Takes Too Long

Usually my hope is to find a repository that uses the library or concept I’m researching, one that’s generally well written and isn’t too huge or overwhelming.

I’ve found that search engines, even ones dedicated to code search, are mostly frustrating and unproductive.

Simple searches using reasonable keywords rarely work, and persisting leads to madness.

Search Engine Filters

For our sideways search, we’ll use advanced search operators, mostly Google’s inurl: and site: filters. Some other ones can be handy too; here’s a nice reference: search filter cheat sheet.

Bing offers similar ones but not a URL filter. If you prefer Bing, its intitle: filter should usually be an OK substitute for inurl:.

We’ll use these advanced filters to search non-code files, ones that are properly indexed by the search engine even when the code next to them is not.

With inurl, we can filter to specific text files like Dockerfiles or READMEs or, powerfully, dependency manifest files like Node’s package.json, Python’s requirements.txt, Ruby’s Gemfile, etc.

Some sites explicitly offer dependency graph search as a feature (including package managers and GitHub), but I find that Google’s PageRank voodoo beats these.
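The manifest-file queries described above can be assembled mechanically. Here is a minimal Python sketch; the helper names (manifest_query, google_url) and the MANIFESTS table are my own illustration, not part of any search tool, and the URL format assumes Google's usual q= parameter.

```python
from urllib.parse import quote_plus

# Manifest file names mentioned above, keyed by ecosystem (illustrative table).
MANIFESTS = {
    "node": "package.json",
    "python": "requirements.txt",
    "ruby": "Gemfile",
}


def manifest_query(ecosystem, libraries, site=None):
    """Build a query that targets dependency manifests naming every library."""
    parts = [f"inurl:{MANIFESTS[ecosystem]}"]
    parts += [f'"{name}"' for name in libraries]
    if site:
        parts.append(f"site:{site}")
    return " ".join(parts)


def google_url(query):
    """Turn a query string into a Google search URL (assumes the q= parameter)."""
    return "https://www.google.com/search?q=" + quote_plus(query)


query = manifest_query("node", ["electron", "mocha"])
print(query)
print(google_url(query))
```

Quoting each library name asks the engine for an exact match, which helps keep loosely related pages out of the results.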

Example Of a Successful Search

Let’s say my current task is roughing out a testing framework for an Electron project.

I might start my hunt with:

inurl:package.json "electron" "mocha"

(If you’re not familiar with Node, mocha is a popular testing framework.) The inurl filter restricts results to pages with package.json in the URL.

I’ll start opening tabs that look promising, swapping search terms now and then, maybe replacing “mocha” with “jest” (another test framework), then trying again with a site filter to limit results to GitHub (with site:github.com), then GitLab.

Want your example in TypeScript? Add a "@types" or "tsc" search term.
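Swapping terms and site filters by hand gets tedious, so the variations can be listed up front. This is only a sketch: the function name variations and the choice of "@types" as the TypeScript hint are assumptions for illustration, and gitlab.com is my guess at the site filter used for GitLab.

```python
from itertools import product

frameworks = ["mocha", "jest"]               # test frameworks to try in turn
sites = [None, "github.com", "gitlab.com"]   # None means no site filter


def variations(library, frameworks, sites, typescript=False):
    """List one manifest query per (framework, site) combination."""
    queries = []
    for framework, site in product(frameworks, sites):
        parts = ["inurl:package.json", f'"{library}"', f'"{framework}"']
        if typescript:
            parts.append('"@types"')  # assumed TypeScript hint
        if site:
            parts.append(f"site:{site}")
        queries.append(" ".join(parts))
    return queries


for q in variations("electron", frameworks, sites):
    print(q)
```

Each printed line can be pasted into the search box as-is.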

In a couple of minutes, we find Facebook Flipper and Rocket.Chat, repositories that are popular, up-to-date, and well-maintained. They’re also huge and overwhelming. I’m sure there are great lessons to learn within, but we’re in a hurry. Our search also turns up some small, wholesome repositories that are easier to skim, like sindresorhus/electron-serve and checksum-validator.

I won’t say we struck gold, but evaluating these working examples offers a big productivity boost and we found them in just a few minutes.

One of the smaller projects shows us how to use Spectron for headless testing; the other helps us consider using smoke tests. (Personally, the exercise also helped me decide to move all direct calls to Electron out of the user interface via @wranggle/rpc, making the exercise extra-worthwhile.)

Local Collections

Let’s say we’re new to Django and want to see some real world examples of it making GROUP BY database queries.

Some potential search keywords pop out when skimming the official documentation on database aggregation. We might try GitHub’s code search on some of them (login required; submit a search, then click the “Code” tab), but I’ve had mostly bad luck with it.

We might try a Google search on

"from django.db.models import Q" filetype:py

and that would turn up some examples, but there's a better way.

Instead of searching directly for uses of our aggregation query, let’s build up a locally-saved collection of nice, well-written example projects and search that.

You may have seen GitHub Topics, but did you know you can search for topics too? Submit a search, then click on the “Topic” tab. These are self-tagged by the repository owner but it’s still a great way to find sample repositories.

There’s likely a nice CRM, CMS, blog, and e-commerce app in most any language. (Plus maybe even a ToDo list!)

GitHub’s Topic and Repository tabs help us find polished, popular, and well-maintained projects that we can add to our local collection.

I suggest saving a local copy of all that look good. You can then search them using your own IDE far more effectively than any online tool I’ve ever tried.

You’ll have far better results than using GitHub’s in-repository search and will be able to revisit these same examples in the future.

Sure enough, this approach reveals plenty of usage examples for that aggregation query, and, better yet, we can evaluate how they’re tested.
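If you’d rather script the search than use an IDE, a few lines of Python can scan the saved collection. This is a rough sketch under assumptions: the folder name examples is hypothetical, and the pattern \.annotate\( is just one plausible way to spot Django aggregation calls, since GROUP BY queries in Django are usually built with annotate().

```python
import re
from pathlib import Path


def search_collection(root, pattern, suffix=".py"):
    """Yield (file, line number, line) for each line matching pattern."""
    regex = re.compile(pattern)
    for path in sorted(Path(root).rglob(f"*{suffix}")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # skip unreadable files rather than abort the search
        for number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                yield path, number, line.strip()


if __name__ == "__main__":
    # "examples" is a hypothetical folder holding the cloned repositories.
    for path, number, line in search_collection("examples", r"\.annotate\("):
        print(f"{path}:{number}: {line}")
```

Passing suffix="test.py" or a similar value narrows the scan to test files, which is a quick way to see how the example projects test their queries.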

Human And Awesome Human Content

When investigating some coding problem, regular searches turn up articles/tutorials, answers on StackOverflow/Gitter, discussions on HackerNews/Reddit, etc.

From the perspective of finding a pre-task primer, you know best if one of these would be more helpful than sample code.

Regardless, high quality resources of either type can often be found in a curated list, and the easiest to find are the awesome lists.

When I yet again forget the CSS syntax for flexbox items, a plain search always takes me to css-tricks, and it’s nice enough. But if I take a bit more time to find an awesome list from awesome.re or by searching on “awesome flexbox”, I’ll find a great resource.

Code Search Sites

If you want to search for a specific code blurb, searchcode.com might be worth a try. Of the direct code search tools, it seems to be the best when you don’t already have a list of known repositories.

For that aggregation query, a search on

"from django.db.models import Q"

seems to return decent results.

The same search using GitHub’s code-search feature produces far worse results. Hopefully GitHub/Microsoft will continue to invest in code search. (I would love for them to add an “exact match” feature, and a feature that does a deep code search in each repository I follow or have starred.)

Sourcegraph, a code search product, might be worth a look for teams that want to share and search their own curated list of repositories. You need to know ahead of time which repositories you want to index, though. (I’ve never tried it but that feature sounds useful.)

Noticing When it’s a Waste Of Time

Let’s take a look at a rougher search process. Say our next task is to add Webpack code splitting to a different project. We haven’t yet started and we’re not (yet?) stuck, we’re just following the good general practice of doing our initial research.

A search on

"splitChunks" inurl:webpack.config.js

fails, returning very few results. Changing inurl to

intitle:webpack.config

returns slightly more but a lot of it is noise. We find some examples but spend a lot of time sifting.

We can’t do a dependency search as we did for our Electron task because splitChunks is part of Webpack core; it doesn’t add a dependency. Scooting further sideways with a dependency search on something else in the general neighborhood (mini-css-extract-plugin) produces decent results, but getting to that keyword required unnecessary research.
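To see why the manifest search has nothing to match, it helps to look at what a package.json actually declares. The manifest below is a hypothetical, trimmed-down example (the project name and version ranges are made up); the point is only that a configuration option like splitChunks never appears as a package name, while a plugin does.

```python
import json

# A hypothetical, trimmed-down package.json for a Webpack project.
MANIFEST = """
{
  "name": "example-app",
  "devDependencies": {
    "webpack": "^5.0.0",
    "mini-css-extract-plugin": "^2.0.0"
  }
}
"""


def declared_packages(manifest_text):
    """Collect package names from dependencies and devDependencies."""
    data = json.loads(manifest_text)
    names = set()
    for section in ("dependencies", "devDependencies"):
        names.update(data.get(section, {}))
    return names


packages = declared_packages(MANIFEST)
print("splitChunks" in packages)              # False: a config option, not a package
print("mini-css-extract-plugin" in packages)  # True: a real dependency entry
```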

Searching for a Gist ("splitChunks" site:gist.github.com) also leads us to good example uses, but not to a full example repository. (I don’t know why Gist is indexed but the master repository is not. Robots.txt permits both.)

Overall, in this case, using human-curated resources would have been a better use of time. Or I should have stopped the search early, to possibly pick it up again after attempting the task myself.

When To Stop?

Searching for sample code can be an open-ended exercise. It’s really easy to spend unproductive hours searching for and reading samples.

Keep in mind that finding or understanding a given sample isn’t the goal; you have your own coding project to finish.

That stubborn drive to finish what you’ve started is usually a positive force, but if finding a sample isn’t vital and you’re not turning up anything useful, it’s better to move on.

Otherwise, maybe set a time range for the exercise. If you’re unfamiliar with the current topic, give it a relatively higher upper bound. If you hit the limit but the exercise still feels productive, adjust it upwards.

In contrast, move on early if the samples aren’t useful or if you just feel ready to start your own task. The best sign that the exercise was a success is if you’re getting ideas on how to do things better than the sample.

On the lower time bound, consider spending some minimum time looking at sample code before starting a new task. Reading code is way less fun than writing it but there’s a lot to learn in other people’s repositories, even if you already know how to accomplish your task.

If you have any other ideas or techniques for finding sample code, please pass them along!
