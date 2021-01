Ultimate Guide to Tracking Hyperparameters of ML Models

Machine learning algorithms are tunable by multiple gauges called hyperparameters. Recent deep learning models are tunable by tens of hyperparameters which, together with data augmentation parameters and training procedure parameters, create quite a complex space. In the reinforcement learning domain, you should also count environment parameters.

Data scientists should control the hyperparameter space well in order to make progress.

Here, we will show you recent practices, tips & tricks, and tools to track hyperparameters efficiently and with minimal overhead. You will find yourself in control of even the most complex deep learning experiments!

Why should I track my hyperparameters? a.k.a. Why is that important?

Almost every deep learning experimentation guideline, like this deep learning book, advises you on how to tune hyperparameters to make models work as expected. In the experiment-analyze-learn loop, data scientists must control what changes are being made, so that the "learn" part of the loop is working.

Oh, forgot to say that the random seed is a hyperparameter as well (especially in the RL domain: check this Reddit thread for example).
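To make the point about the seed concrete, here is a minimal sketch that stores the seed next to the other hyperparameters. The `seed` key and the `sample_noise` helper are hypothetical names chosen for illustration; a real training script would typically also seed its ML framework.

```python
import random

# Hypothetical parameter set; the 'seed' key is our own naming choice.
PARAMS = {"learning_rate": 0.001, "seed": 1234}


def sample_noise(params, n=3):
    """Draw n pseudo-random numbers using the seed stored in params."""
    rng = random.Random(params["seed"])  # local generator, no global state
    return [rng.random() for _ in range(n)]


# The same seed reproduces the same draws; a different seed is a different run.
first = sample_noise(PARAMS)
second = sample_noise(PARAMS)
other = sample_noise({**PARAMS, "seed": 4321})
```

Because the seed travels with the rest of the parameters, whatever mechanism tracks them also tracks the seed.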

What is the current practice in hyperparameter tracking?

Let's review common practices for managing hyperparameters one by one. We focus on how to build, keep and pass hyperparameters to your ML scripts.

Python dictionary

Very basic, very useful. Simply collect your hyperparameters in a Python dictionary, like in this simple example:

PARAMS = {
    'batch_size': 64,
    'n_epochs': 1000,
    'shuffle': True,
    'activation': 'elu',
    'dense_units': 128,
    'dropout': 0.2,
    'learning_rate': 0.001,
    'early_stopping': 20,
    'optimizer': 'Adam',
}

Thanks to this approach you keep all hyperparameters in a single Python object and you can easily use it across your training scripts. To make sure that you track those parameters in the machine learning project, it's recommended to version control the file where this dictionary is created.

You can check the entire example here.

Pros:

Simple and straightforward because you already know the tool.

Easy to make a hierarchical structure with nested dictionaries.

Almost no overhead in the code.

Easy to merge multiple configuration files into a single dictionary.

Cons:

Hyperparameters are part of the codebase, while they should be separate: remember to distinguish between the logic and its parametrization.

Saving params to disk is not obvious.

You may not notice that you overwrite some values. Then, it's difficult to learn how a particular setup is performing, because you may overwrite some magic numbers.
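Saving a dictionary of parameters to disk can be fairly short with the standard library. The sketch below assumes all values are JSON-serializable; the `save_params` and `load_params` helpers and the `params.json` file name are hypothetical.

```python
import json
import tempfile
from pathlib import Path

PARAMS = {"batch_size": 64, "n_epochs": 1000, "learning_rate": 0.001, "optimizer": "Adam"}


def save_params(params, path):
    """Write params to a JSON file with a stable key order."""
    Path(path).write_text(json.dumps(params, indent=2, sort_keys=True))


def load_params(path):
    """Read params back from a JSON file."""
    return json.loads(Path(path).read_text())


# A temporary directory stands in for a per-experiment output folder.
with tempfile.TemporaryDirectory() as run_dir:
    params_file = Path(run_dir) / "params.json"
    save_params(PARAMS, params_file)
    restored = load_params(params_file)
```

Writing one such file per run, next to the model artifacts, keeps the exact values used even if the dictionary in the code changes later.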

Note: Did you know about AttrDict?

It is a Python library that allows you to access dictionary elements both as keys and as attributes. It's really convenient to use attribute syntax.

Here is an example of the nested dicts:

config = {
    'neptune': {
        'project': 'kamil/analysis',
        'tags': ['xgb-tune'],
    },
    'booster': {
        'max_depth': 10,
        'eta': 0.01,
        'gamma': 0.001,
        'silent': 1,
        'subsample': 1,
        'lambda': 1,
        'alpha': 0.05,
        'objective': 'reg:linear',
        'verbosity': 0,
        'eval_metric': 'rmse',
    },
    'num_round': 20,
}

You can access eta like this:

config['booster']['eta']

With attrdict, you can do it in a more elegant way:

from attrdict import AttrDict

cfg = AttrDict(config)
cfg.booster.eta
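If adding a dependency is not an option, similar attribute access can be sketched with the standard library alone. This is not how AttrDict works internally; the `to_namespace` helper is a hypothetical name and the config is a shortened version of the one above.

```python
from types import SimpleNamespace


def to_namespace(obj):
    """Recursively convert dicts (and dicts inside lists) to SimpleNamespace."""
    if isinstance(obj, dict):
        return SimpleNamespace(**{k: to_namespace(v) for k, v in obj.items()})
    if isinstance(obj, list):
        return [to_namespace(v) for v in obj]
    return obj


config = {"booster": {"max_depth": 10, "eta": 0.01}, "num_round": 20}
cfg = to_namespace(config)
eta = cfg.booster.eta  # attribute syntax instead of config['booster']['eta']
```

One caveat that applies to any attribute-style access: keys that are Python keywords, such as 'lambda' in the booster config, cannot be read with dot syntax and need `getattr` or plain key access.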

Configuration file

Configuration files are regular text files with some predefined structure and standard libraries to parse them, like the JSON encoder and decoder, or PyYAML. Common formats are json, yaml or cfg files.

Below is an example yaml file that presents multiple hyperparameters for a random forest, along with more general info like project and experiment name.

project: ORGANIZATION/home-credit
name: home-credit-default-risk
parameters:
  # Data preparation
  n_cv_splits: 5
  validation_size: 0.2
  stratified_cv: True
  shuffle: 1
  # Random forest
  rf__n_estimators: 2000
  rf__criterion: gini
  rf__max_features: 0.2
  rf__max_depth: 40
  rf__min_samples_split: 50
  rf__min_samples_leaf: 20
  rf__max_leaf_nodes: 60
  rf__class_weight: balanced
  # Post Processing
  aggregation_method: rank_mean

Similarly to the dictionary-based style, you just need to version control this file to keep track of hyperparameters.

You can read a yaml file and access its elements by simply using yaml.load() like this:

import yaml

with open(config_path) as f:
    # BaseLoader keeps every scalar as a string; use yaml.SafeLoader for typed values
    config = yaml.load(f, Loader=yaml.BaseLoader)  # config is a dict

print(config['parameters']['n_cv_splits'])  # 5

As AttrDict was just introduced, let's modify this snippet and access the n_cv_splits value in a more elegant way:

import yaml
from attrdict import AttrDict

with open(config_path) as f:
    config = yaml.load(f, Loader=yaml.BaseLoader)  # config is a dict

cfg = AttrDict(config)
print(cfg.parameters.n_cv_splits)  # 5

Here is an example of a large yaml file used for storing feature selection, model parameters and much more. The entire project is also publicly available.

Pros

Everything is located in a single place.

Easy to re-use saved configuration files.

Nice separation of script logic and its parametrization.

Enhanced readability of the code.

Cons

It requires some programming discipline to put hyperparameters in the config file.

If the codebase changes rapidly (new features, new models and at the same time dropping older versions of the code), maintaining proper config files is an additional overhead.

For large codebases, you may end up with several config files, which can make things more complex and tedious to maintain.
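When a project does end up with several config files, one way to keep a single source of truth at runtime is to merge the parsed files into one dictionary. The sketch below is one possible approach, not a standard API; `deep_merge` and the two example dictionaries are hypothetical and stand in for files that were already parsed.

```python
import copy


def deep_merge(base, override):
    """Return a new dict where override wins; nested dicts merge key by key."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical contents of two parsed config files.
data_cfg = {"parameters": {"n_cv_splits": 5, "validation_size": 0.2}}
model_cfg = {"parameters": {"rf__n_estimators": 2000, "n_cv_splits": 10}}

# Later files take precedence over earlier ones.
config = deep_merge(data_cfg, model_cfg)
```

Saving the merged result for each run records the values that were actually used, regardless of which file they came from.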

Argparse

When experimenting, you usually go through multiple trials (or experiments) in order to understand the relationships between hyperparameters and score, and to obtain the best performing model (we leave the discussion of what it means for a model to perform well for another post).

In such a situation it comes in handy to start new experiments from the command line and specify values of parameters directly in the CLI. Argparse is a Python module that makes it easy to write user-friendly command-line interfaces.

I think that an easy way to understand argparse is to simply analyze an example. Below is a simple Python program that takes three optional arguments and prints them.

import argparse

parser = argparse.ArgumentParser(description='Process hyper-parameters')
parser.add_argument('--lr', type=float, default=0.001, help='learning rate')
parser.add_argument('--dropout', type=float, default=0.0, help='dropout ratio')
parser.add_argument('--data_dir', type=str, default='/neptune/is/the/best/data/', help='data directory for training')
args = parser.parse_args()

# Here is how to access passed values
print(args.lr)
print(args.dropout)
print(args.data_dir)

If you run this program without any arguments, the defaults will be used:

python main.py

Output is:

0.001
0.0
/neptune/is/the/best/data/

If you specify parameters, then they are parsed, so that you can use them in your training script:

python main.py --lr 0.005 --dropout 0.5

Output is:

0.005
0.5
/neptune/is/the/best/data/

One important note about tracking: be advised that argparse does not save or log parameters passed in the command line. Users have to save values of parameters themselves.
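Since argparse leaves saving to the user, a small amount of code can turn the parsed namespace into something that is easy to log. The sketch below relies on the built-in `vars()` to convert the namespace to a dictionary; the `build_parser` and `params_as_json` helpers are hypothetical names.

```python
import argparse
import json


def build_parser():
    """Create a parser with two example hyperparameters."""
    parser = argparse.ArgumentParser(description="Process hyper-parameters")
    parser.add_argument("--lr", type=float, default=0.001, help="learning rate")
    parser.add_argument("--dropout", type=float, default=0.0, help="dropout ratio")
    return parser


def params_as_json(argv):
    """Parse argv and return the resolved values (defaults included) as JSON."""
    args = build_parser().parse_args(argv)
    return json.dumps(vars(args), sort_keys=True)


# Passing an explicit list keeps the sketch independent of sys.argv.
logged = params_as_json(["--lr", "0.005"])
```

The resulting string contains both the values passed on the command line and the defaults that were not overridden, so it can be written to a file or sent to an experiment tracker as is.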

Pros

Conveniently start new experiments.

Decide on the hyperparameters' values on the fly.

Easy to add new arguments to argparse.

Cons

Requires extra effort (not large though) to keep track of hyperparameters' values across long experimentation-based projects. Argparse does not save values anywhere.

Similarly to configuration files, if your project grows rapidly you may find it difficult to maintain CLI parameters.

If you pass parameters in a few places in the code, it becomes less obvious how to use argparse efficiently. The same is true if you build/merge parameters from multiple places.
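One way to handle parameters that come from more than one place is to let a config file supply the defaults and the command line supply the overrides. The sketch below uses `ArgumentParser.set_defaults` for that; `file_config` is a hypothetical dictionary standing in for an already parsed config file.

```python
import argparse

# Hypothetical values read from a config file.
file_config = {"lr": 0.01, "dropout": 0.3}

parser = argparse.ArgumentParser()
parser.add_argument("--lr", type=float, default=0.001)
parser.add_argument("--dropout", type=float, default=0.0)

# Parser-level defaults replace the per-argument defaults declared above.
parser.set_defaults(**file_config)

# Only dropout is given on the command line, so lr comes from the file.
args = parser.parse_args(["--dropout", "0.5"])
```

The resulting precedence is: built-in default, then config file, then command line, which keeps the merging logic in a single place.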

Note: Did you know about Click?

As mentioned in this post, there are a few alternatives to argparse. One notable system is Click.

It is a Python package for creating CLIs in a composable way with minimum additional coding. With Click you just decorate some functions, like in this example, where the hello function is decorated:

import click

@click.command()
@click.option("--count", default=1, help="Number of greetings.")
@click.option("--name", prompt="Your name", help="The person to greet.")
def hello(count, name):
    """Simple program that greets NAME for a total of COUNT times."""
    for _ in range(count):
        click.echo("Hello, %s!" % name)

if __name__ == '__main__':
    hello()

Then run like any other CLI command:

python hello.py --count=3

Here is an example image segmentation project that uses click extensively. Take a look at the main.py and check it in detail.

Hydra

Hydra is a new project from Facebook AI that simplifies the configuration of more complex machine learning experiments.

The key ideas behind it are:

Dynamically create a hierarchical configuration by composition,

Override it when needed through the command line,

Pass new parameters (not present in the config) via the CLI; they will be handled for you.

Hydra gives you the ability to prepare and override complex configuration setups (including config groups and hierarchies), while keeping track of any overridden values.

Similarly to argparse, the best way to understand it (and how simple it is to work with hydra) is to analyze an example.

Let's consider a simplified config yaml file from the section about configuration files:

project: ORGANIZATION/home-credit
name: home-credit-default-risk
parameters:
  # Data preparation
  n_cv_splits: 5
  validation_size: 0.2
  stratified_cv: True
  shuffle: 1
  # Random forest
  rf__n_estimators: 2000
  rf__criterion: gini
  rf__max_depth: 40
  rf__class_weight: balanced

Here is a minimalist-style hydra example:

import hydra
from omegaconf import DictConfig

# Hydra 0.11-style API: config_path points at the yaml file itself
@hydra.main(config_path='hydra-config.yaml')
def train(cfg: DictConfig):
    print(cfg.pretty())  # this prints the config in a reader-friendly way
    print(cfg.parameters.rf__n_estimators)  # this is how to access a single value from the config

if __name__ == "__main__":
    train()

When you run it, you should see this:

name: home-credit-default-risk
parameters:
  n_cv_splits: 5
  rf__class_weight: balanced
  rf__criterion: gini
  rf__max_depth: 40
  rf__n_estimators: 2000
  shuffle: 1
  stratified_cv: true
  validation_size: 0.2
project: ORGANIZATION/home-credit
2000

What is convenient in hydra is that you can override any value in the config from the CLI like this:

python hydra-main.py parameters.n_cv_splits=12 \
    parameters.stratified_cv=False name=entirely-new-name

As a result you have new values in the config:

name: entirely-new-name
parameters:
  n_cv_splits: 12
  rf__class_weight: balanced
  rf__criterion: gini
  rf__max_depth: 40
  rf__n_estimators: 2000
  shuffle: 1
  stratified_cv: false
  validation_size: 0.2
project: ORGANIZATION/home-credit
2000
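To build intuition for what a dotted override such as parameters.n_cv_splits=12 does to a nested config, here is a toy sketch in plain Python. It is not Hydra's implementation and skips everything Hydra adds on top (composition, validation, logging); `parse_value` and `apply_overrides` are hypothetical helpers.

```python
import ast
import copy


def parse_value(raw):
    """Interpret '12' as int, 'False' as bool, and leave other text as str."""
    try:
        return ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        return raw


def apply_overrides(config, overrides):
    """Apply 'a.b=value' strings to a nested dict and return a new dict."""
    result = copy.deepcopy(config)
    for item in overrides:
        dotted_key, raw_value = item.split("=", 1)
        *parents, leaf = dotted_key.split(".")
        node = result
        for name in parents:
            node = node.setdefault(name, {})  # create missing levels on the way
        node[leaf] = parse_value(raw_value)
    return result


base = {"name": "home-credit-default-risk",
        "parameters": {"n_cv_splits": 5, "stratified_cv": True}}
updated = apply_overrides(base, ["parameters.n_cv_splits=12",
                                 "parameters.stratified_cv=False",
                                 "name=entirely-new-name"])
```

The original dictionary is left untouched, so the defaults and the overridden values can both be recorded for the run.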

Another feature that provides nice flexibility is an option to pass new, previously unseen parameters right from the command line.

To enable this feature simply turn off strict mode in hydra.

@hydra.main(config_path='hydra-config.yaml', strict=False)

In the command below I'm adding rf__max_features to the config and at the same time changing rf__n_estimators to 1500. Note that the config is the same as in previous examples. In the code we only turned off strict mode:

python hydra-main.py parameters.rf__n_estimators=1500 \
    parameters.rf__max_features=0.2

Output changed accordingly:

name: home-credit-default-risk
parameters:
  n_cv_splits: 5
  rf__class_weight: balanced
  rf__criterion: gini
  rf__max_depth: 40
  rf__max_features: 0.2
  rf__n_estimators: 1500
  shuffle: 1
  stratified_cv: true
  validation_size: 0.2
project: ORGANIZATION/home-credit
1500

The hydra project is being actively developed, so make sure to check their tutorials from time to time to see new features.

Pros

Composable configurations.

Ability to override values very easily and still keep track of them.

Config groups that bring organization to larger experiments.

Cons

Hydra shines in larger experiments, measured by the number of hyperparameters and their hierarchy. For smaller ones, other methods will do just fine.

You need to be careful to avoid accidentally overriding the values of important parameters.

In order to track hyperparameters across experiments you need to save the config object (cfg in the examples above) manually.

How to use experiment tracking tools like Neptune to further increase efficiency and control?

One step further in managing hyperparameters is to use them in the broader context of experiment management. Here is an example of how Neptune handles parametrization of ML experiments:

import neptune

# define parameters
PARAMS = {
    'batch_size': 64,
    'n_epochs': 100,
    'shuffle': True,
    'activation': 'elu',
    'dense_units': 128,
    'dropout': 0.2,
    'learning_rate': 0.001,
    'early_stopping': 10,
    'optimizer': 'Adam',
}

# create experiment (assumes the project was already selected with neptune.init)
neptune.create_experiment(params=PARAMS)

# run training/validation code

In this way, each experiment has its own params setup saved to Neptune for further analysis and comparison across experiments. The main advantage of this approach is that you associate parameters with other experiment-related data/metadata like evaluation metrics or resulting models.

Parameter values are displayed for each experiment, allowing you to visually inspect and analyze multiple runs.

Experiment tracking tools like Neptune display params in multiple different places, so that you can:

Compare selected experiments in greater detail while having all different params highlighted (example).

Search for experiments with particular values of the parameter at hand (example where we display only experiments with positive "timeseries_factor").

How to visualize hyperparameters?

If you are a heavy experimenter you have probably come across the need to efficiently compare hundreds of runs and visualize relationships between hyperparameters and score. One way to do it is to prepare a parallel coordinates plot, like the one below:

Parallel coordinates plot built with HiPlot.

Each vertical axis is one parameter, and the score is the right-most (vertical) axis. Such a visualization gives an immediate insight into the ranges of parameters that yield the best score. In principle, it should be interactive to allow users to explore the data freely and perform their own reasoning and interpretation.
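A parallel coordinates plot needs the runs in tabular form: one row per run, one column per parameter, and the score as the last column. The sketch below prepares such a table as CSV text with the standard library; the `runs` values are made up purely for illustration and `runs_to_csv` is a hypothetical helper.

```python
import csv
import io

# Made-up results of three runs, for illustration only.
runs = [
    {"learning_rate": 0.001, "dropout": 0.2, "score": 0.81},
    {"learning_rate": 0.01, "dropout": 0.5, "score": 0.77},
    {"learning_rate": 0.005, "dropout": 0.0, "score": 0.79},
]


def runs_to_csv(runs, score_key="score"):
    """Serialize runs to CSV text, one column per parameter, score last."""
    param_names = sorted(k for k in runs[0] if k != score_key)
    fieldnames = param_names + [score_key]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(runs)
    return buffer.getvalue()


table = runs_to_csv(runs)
```

Keeping the score in the last column mirrors the layout described above, where the score is the right-most axis.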

Note:

Neptune is integrated with one great tool for building parallel coordinates plots: HiPlot, developed at Facebook AI Research (FAIR). Take a closer look here.

Such a large number of runs (as depicted above) usually comes from hyperparameter optimization jobs, or hpo for short. Python's open source landscape has a lot to offer in that matter. Here is one comparison of two popular hpo libs: optuna and hyperopt.

scikit-optimize

Another approach to inspecting and understanding hpo results is proposed by the creators of scikit-optimize. Each hpo job produces diagnostic charts that visualize relationships between hyperparameters and score.

Here is an example:

Skopt visualization from an example optimization job.

optuna

Optuna is a hyperparameter optimization framework to automate hyperparameter search. It offers its own suite of visualizations of the hyperparameters optimized for a given job.

Let's study one example of the optuna hpo job:

Optuna diagnostics chart

Similarly to the previous example, a major goal of visualization is to help understand how hyperparameters relate to the score that is being optimized.

If it sounds relevant to you, take a closer look at the neptune-optuna integration, here in the docs.

Final thoughts

Hyperparameters are the central piece of the larger picture, which is experiment management. In this post, we showed the current state of practice in hyperparameter tracking. With Neptune, you can bring it one level up, by making them easily accessible, comparable, and shareable in the team instantaneously.

