Visualizations of weather data are as important as the dataset itself, because a dataset is only useful when it can be read and interpreted. Visualizations that help stakeholders understand weather data are therefore vital to data service providers and decision-makers alike.
In practice, obtaining accurate, hyperlocal weather data and presenting it in an understandable form is often perceived as complex and technical. We wrote this blog to show how straightforward the process can be.
The agro-weather data used here comes from a verified source, and the visualizations are relevant to the region it covers. By understanding the process in detail, policymakers and other stakeholders can avoid naive approaches to visualizing datasets, along with the misinterpretation and mistrust that follow.
This blog walks through a few plots and charts and the methods used to build them from data. It aims to help readers understand the visualizations and make informed decisions from them.
I prefer Python for creating visualizations because it offers a wide array of plotting libraries. The code in this blog was written and run in a Jupyter notebook. All you need to do is install Jupyter Notebook on your system and install the Python libraries.
!pip install seaborn
!pip install matplotlib
Another environment I use to run visualization scripts is Google Colab. It provides an online notebook environment linked to your Google account and runs on Google's servers, so it does not use your local system's resources.
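Whichever environment you use, the data has to be loaded into a pandas DataFrame before any plot can be drawn. The sketch below is one minimal way to do that and is not necessarily the setup used for this blog. It assumes a CSV with the columns N, P, K, temperature, humidity, ph, rainfall, and label (P is a guess, the others appear in the plotting code in this post). The three rows are made-up placeholders so that the snippet runs on its own.

```python
import io

import pandas as pd

# Made-up placeholder rows; only the column names mirror the ones used in this blog.
CSV_TEXT = """N,P,K,temperature,humidity,ph,rainfall,label
90,42,43,21.0,82.0,6.5,203.0,rice
20,68,20,19.0,21.0,5.7,105.0,kidneybeans
100,18,50,28.5,92.0,6.4,25.0,muskmelon
"""

# For a real dataset, replace io.StringIO(CSV_TEXT) with the path to the CSV file.
df = pd.read_csv(io.StringIO(CSV_TEXT))

print(df.dtypes)  # numeric columns plus the text column 'label'
print(df.head())
```

With a real file, the only change is the argument passed to pd.read_csv.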
A bar plot represents the value of a quantity by the height of each rectangle. It is a good choice for comparing a quantitative variable across the categories of another variable.
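import seaborn as sns
import matplotlib.pyplot as plt
# df is assumed to be a pandas DataFrame with a "label" (crop) and a "rainfall" column
# Each bar is the mean rainfall per crop; ci=None hides the error bars
# (newer seaborn releases use errorbar=None instead)
sns.barplot(x="label", y="rainfall", data=df, ci=None)
plt.title("Crops Relation with rainfall", fontsize=24)
plt.xticks(rotation=90, fontsize=14)
plt.yticks(rotation=0, fontsize=14)
plt.savefig("Crops Relation with rainfall.png")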
The plot shows clearly that rice requires an exceptionally high amount of rainfall, nearly 280 mm.
A bar chart is a visual representation of one attribute with respect to another.
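By default, sns.barplot draws each bar at the mean of the y values that share a category (its estimator parameter defaults to the mean). The sketch below reproduces that aggregation by hand with the standard library, so it is clear what a bar height stands for. The readings and the helper name mean_by_category are made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Made-up (crop, rainfall in mm) readings, for illustration only.
readings = [
    ("rice", 250.0), ("rice", 300.0), ("rice", 290.0),
    ("maize", 80.0), ("maize", 100.0),
    ("lentil", 40.0), ("lentil", 50.0),
]


def mean_by_category(rows):
    """Group values by label and return the mean of each group."""
    groups = defaultdict(list)
    for label, value in rows:
        groups[label].append(value)
    return {label: mean(values) for label, values in groups.items()}


bar_heights = mean_by_category(readings)
for crop, height in bar_heights.items():
    print(f"{crop}: {height:.1f} mm")
```

A bar therefore summarizes many rows in one number, which is why the spread of the underlying values is not visible in this kind of plot.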
As the name suggests, a box-and-whisker plot consists of a box connected to two whiskers. The end of the lower whisker marks the ‘minimum,’ and the end of the upper whisker marks the ‘maximum.’ In a horizontal plot, the left side of the box is the ‘lower quartile,’ and the right side of the box is the ‘upper quartile.’ The line inside the box is the ‘median.’
# Horizontal box plot: one box of soil pH values per crop label
sns.boxplot(y='label', x='ph', data=df)
plt.title('Suitable pH value of the soil')
plt.savefig('Ph.png')
This kind of visualization shows at a glance where the majority of the data values lie. In simple words, it denotes the range within which the values of a feature (column) lie. Coffee, jute, watermelon, muskmelon, and mungbean grow in soil with pH values between 6 and 7. The visual also shows that moth beans can grow at almost any soil pH level.
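The numbers behind a single box can be computed directly. The sketch below uses the standard library to derive the five values described above for one crop. The pH readings and the helper name five_number_summary are made up for illustration, and quartile conventions differ slightly between libraries, so the figures may not match a plotted box exactly.

```python
from statistics import quantiles

# Made-up soil pH readings for a single crop, for illustration only.
ph_values = [5.5, 6.0, 6.2, 6.5, 6.8, 7.0, 7.5]


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    ordered = sorted(values)
    q1, median, q3 = quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


minimum, q1, median, q3, maximum = five_number_summary(ph_values)
iqr = q3 - q1  # interquartile range: the width of the box

print(f"min={minimum}, Q1={q1:.2f}, median={median:.2f}, Q3={q3:.2f}, max={maximum}")
print(f"IQR={iqr:.2f}")
```

Note that seaborn, by default, extends each whisker only to the furthest point within 1.5 times the IQR of the box and draws anything beyond that as a separate outlier marker, so the whisker ends equal the true minimum and maximum only when there are no outliers.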
A heat map is one of the plots data scientists rely on most. It uses color-coded cells to denote the correlation of each feature with every other numeric column in a dataset.
Correlation denotes how linearly related one feature is to another. In layman's terms, it shows how one feature varies as another feature varies.
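# Correlate only the numeric columns; the text column 'label' cannot be correlated
sns.heatmap(df.select_dtypes('number').corr(), annot=True)
plt.title('Correlation Matrix')
plt.savefig('Correlation.png')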
A value closer to 1 or -1 denotes a stronger correlation between the features, and a value near 0 denotes little linear relationship. Setting the ‘annot’ parameter to True displays the correlation values in the plot.
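Each cell of the matrix is a Pearson correlation coefficient, which is the default method of the pandas corr call. The sketch below computes one such coefficient by hand to show where the numbers between -1 and 1 come from. The helper name pearson and the sample values are made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up values, for illustration only.
humidity = [20.0, 40.0, 60.0, 80.0]
rainfall = [50.0, 100.0, 150.0, 200.0]   # rises in step with humidity
temperature = [35.0, 30.0, 25.0, 20.0]   # falls as humidity rises

print(pearson(humidity, rainfall))      # close to +1
print(pearson(humidity, temperature))   # close to -1
```

A coefficient near -1 is as strong a relationship as one near +1; only the direction differs.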
We can mix and match filter conditions on the data until we obtain valuable inferences.
Line plots are simple, intuitive visuals that display one numerical feature against another, with categorical values (groups identified by labels) drawn as separate lines. This plot is a good option for showing the relationship between features in a dataset.
# Keep only rows with humidity below 65, then draw one rainfall-vs-K (potassium) line per crop
sns.lineplot(data=df[df['humidity'] < 65], x="K", y="rainfall", hue="label")
plt.title('Humidity less than 65 and Potassium levels from 15-25 are good conditions for 6 crops')
plt.savefig('Crop Conditions.png')
When we check for crops that grow well under conditions such as humidity below 65% and soil potassium (K) levels of around 15-25, we find that these conditions suit several crops, including lentils, moth beans, kidney beans, pigeon peas, and coffee.
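The same conditions can be checked in a table before plotting. The sketch below combines two boolean masks in pandas: one on humidity and one on the K column used in the plot call above. The rows are made up for illustration, and the 15-25 bounds are taken from the plot title.

```python
import pandas as pd

# Made-up rows for illustration; only the column names follow this blog.
df = pd.DataFrame({
    "label": ["lentil", "lentil", "rice", "coffee", "mothbeans"],
    "humidity": [63.0, 66.0, 82.0, 58.0, 50.0],
    "K": [19, 20, 40, 30, 21],
    "rainfall": [45.0, 48.0, 230.0, 160.0, 50.0],
})

# Each comparison yields a boolean Series with one True/False per row.
dry_air = df["humidity"] < 65
k_in_range = df["K"].between(15, 25)  # inclusive on both ends

# & keeps only the rows where both conditions hold.
subset = df[dry_air & k_in_range]
crops = sorted(subset["label"].unique())
print(crops)
```

Swapping a mask, or combining masks with | instead of &, is a quick way to try different conditions before committing to a plot.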
These plots are suitable for representing subsets (hue) of a feature and their statistical relationship with other features in a dataset. By directly comparing the relationship between the crops and core nutrients such as nitrogen, potassium, and phosphorus, we can suggest which crops to grow in soils rich in those nutrients.
# Scatter of nitrogen (N) against temperature, with one color and marker style per crop
sns.relplot(x="temperature", y="N", hue="label", style="label", data=df)
An insight that may seem obvious is that a specific type of crop requires similar conditions to grow. This makes it easy to classify a new crop based on its conditions.
For example, muskmelon grows at temperatures between 25 and 30 degrees Celsius and grows well in soil with fairly high nitrogen content (80-120).
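Ranges like these can also be read from the data instead of from the plot. The sketch below groups rows by crop and takes the minimum and maximum of each feature with pandas. The rows are made up for illustration and are not the values behind the figures quoted above.

```python
import pandas as pd

# Made-up rows for illustration; only the column names follow this blog.
df = pd.DataFrame({
    "label": ["muskmelon", "muskmelon", "muskmelon", "lentil", "lentil"],
    "temperature": [27.0, 29.5, 28.2, 20.0, 26.0],
    "N": [85, 118, 100, 10, 30],
})

# One row per crop, with the smallest and largest observed value of each feature.
ranges = df.groupby("label")[["temperature", "N"]].agg(["min", "max"])
print(ranges)

# The columns form a two-level index: (feature, statistic).
low = ranges.loc["muskmelon", ("temperature", "min")]
high = ranges.loc["muskmelon", ("temperature", "max")]
print(f"muskmelon temperature range: {low} to {high}")
```

A table of per-crop ranges is a simple cross-check on what the scatter plot suggests visually.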
When visualizing weather and soil data, it is essential to distinguish approaches that look at the data objectively from those that convey specific details about weather and its variability. Data visualization and analytics often involve some tension over whether the data has been translated impartially and whether the result is helpful to users seeking climate-change or soil-fertility insights from it.
In this blog, I have described how I prefer to build visualizations and interpret the data. It is one of several ways to visualize and interpret a dataset. The data used in this article combines soil data from ICAR - Indian Agricultural Research Institute with weather data (precipitation, humidity, and rainfall) retrieved via
But in the end, data visualization is indispensable and forms the basis of research and business operations. As Edward Tufte puts it, "above all else, show the data"(