Data visualization is the final step in analytics, but also one of the most neglected. As two folks deeply embedded in the business intelligence and AI space, we’ve watched dashboards turn into forests of bar and pie charts, often not because those charts are the best fit, but because they’re familiar.
That got us thinking:
Can an AI recommend the best chart for any dataset automatically?
We teamed up to explore this idea. One of us focused on rule-based heuristics and data profiling logic; the other (shoutout to my co-author!) led the charge on LLM-driven enhancements. Together, we built a working prototype that marries structure with semantics.
This article is our build-in-public walkthrough: architecture, code, logic, and all.
The Problem with Traditional BI Charting
Let’s be honest: most BI tools (Tableau, Power BI, Looker, etc.) offer chart variety, but not guidance. They assume the user:
- Knows what chart suits their data
- Understands data types
- Can infer the story the chart should tell
In reality:
- Users default to safe/familiar charts
- Visuals often misrepresent the data
- Stakeholders misread them
We needed a tool that doesn’t just draw but thinks.
What If AI Picked the Chart?
The idea hit during a dashboard review: Why use a line chart for this categorical data?
Boom!! Opportunity spotted.
What if a system could:
- Read your CSV
- Profile columns
- Understand what you want to analyze
- Recommend the most effective chart
Our vision: Upload your data → Get a tailored chart suggestion (with explanation)
This would empower non-technical users and speed up analysts.
How We Designed It
We split the engine into three layers:
Data Profiling Layer
This step scans the dataset to identify column types, cardinality, nulls, and more.
import pandas as pd

def profile_data(df):
    """Return one row per column: dtype, distinct-value count, null count."""
    profile = []
    for col in df.columns:
        dtype = df[col].dtype
        unique_vals = df[col].nunique()  # nulls are not counted as a value
        null_count = int(df[col].isnull().sum())
        profile.append({
            "column": col,
            "dtype": str(dtype),
            "unique_values": unique_vals,
            "nulls": null_count
        })
    return pd.DataFrame(profile)
Output:
column     | dtype          | unique_values | nulls
Region     | object         | 5             | 0
Sales      | float64        | 1000          | 3
Created_At | datetime64[ns] | 12            | 0
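The dtype strings above still have to be mapped to analytic roles before any rule can fire. Here is a minimal sketch of that mapping, assuming pandas' type-checking helpers in `pandas.api.types`. The `classify_column` function and its `max_categories` cutoff are hypothetical names chosen for illustration, not part of our prototype.

```python
import pandas as pd
from pandas.api import types as ptypes


def classify_column(series: pd.Series, max_categories: int = 20) -> str:
    """Map a column to a coarse role: temporal, numeric, categorical, or text.

    `max_categories` is a hypothetical cutoff chosen for illustration.
    """
    if ptypes.is_datetime64_any_dtype(series):
        return "temporal"
    # Check booleans before numerics: pandas treats bool as a numeric dtype.
    if ptypes.is_bool_dtype(series):
        return "categorical"
    if ptypes.is_numeric_dtype(series):
        return "numeric"
    # Strings and other objects: low cardinality suggests a category,
    # high cardinality suggests free text or an identifier.
    if series.nunique() <= max_categories:
        return "categorical"
    return "text"


df = pd.DataFrame({
    "Region": ["East", "West", "East", "North"],
    "Sales": [120.5, 98.0, None, 143.2],
    "Created_At": pd.to_datetime(
        ["2024-01-01", "2024-02-01", "2024-03-01", "2024-04-01"]
    ),
})
roles = {col: classify_column(df[col]) for col in df.columns}
print(roles)
```

Working with roles rather than raw dtype strings would let the rules stay the same if a column arrives as `int32` instead of `int64`, or as a boolean flag.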
Rule-Based & Heuristic Engine
We hardcoded rules based on common patterns. This covered 70–80% of cases.
def suggest_chart(df):
    """Pick a chart type from simple column-type counts; the first match wins."""
    profile = profile_data(df)
    cat_cols = profile[profile['dtype'] == 'object']['column'].tolist()
    num_cols = profile[profile['dtype'].str.contains('float|int')]['column'].tolist()

    # One category + one measure: compare values across categories.
    if len(cat_cols) == 1 and len(num_cols) == 1:
        return f"Bar chart recommended for {cat_cols[0]} vs {num_cols[0]}"
    # Two measures: look for a relationship between them.
    elif len(num_cols) == 2:
        return f"Scatter plot recommended for {num_cols[0]} vs {num_cols[1]}"
    # Any datetime column: show the trend over time.
    elif any('date' in str(dtype).lower() for dtype in df.dtypes):
        return "Line chart recommended for time series visualization"
    else:
        return "Default to table or manual selection"
These logic rules acted as guardrails: rigid, but fast and reliable.
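One way to keep rules like these easy to audit is to store them as an ordered list of (condition, recommendation) pairs, so the first match wins and adding a rule means appending a row. The sketch below is an illustration rather than our prototype's code: the `RULES` table and `first_match` helper are hypothetical names, and the column-role counts are passed in as a plain dict.

```python
from typing import Callable, Dict, List, Tuple

# Each rule pairs a predicate over column-role counts with a recommendation.
# Order matters: the first predicate that returns True wins.
Counts = Dict[str, int]
RULES: List[Tuple[Callable[[Counts], bool], str]] = [
    (lambda c: c["categorical"] == 1 and c["numeric"] == 1, "bar chart"),
    (lambda c: c["numeric"] == 2, "scatter plot"),
    (lambda c: c["temporal"] >= 1, "line chart"),
]


def first_match(counts: Counts, default: str = "table or manual selection") -> str:
    """Return the recommendation of the first rule whose predicate holds."""
    for predicate, recommendation in RULES:
        if predicate(counts):
            return recommendation
    return default


print(first_match({"categorical": 1, "numeric": 1, "temporal": 0}))
print(first_match({"categorical": 0, "numeric": 2, "temporal": 0}))
print(first_match({"categorical": 0, "numeric": 1, "temporal": 1}))
print(first_match({"categorical": 3, "numeric": 0, "temporal": 0}))
```

The rule order here mirrors the if/elif chain, so reordering the list is all it takes to experiment with, say, letting a datetime column take priority over the bar chart rule.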
LLM-Enhanced Semantic Layer (Led by Co-Author!)
We added an LLM layer that interprets column names and user goals using natural language.
from openai import OpenAI

# Replace the placeholder with your own key; never commit a real key.
client = OpenAI(api_key="API_KEY_HERE")

def get_llm_chart_suggestion(columns):
    """Ask the model for a one-sentence chart recommendation."""
    user_prompt = f"""
You are given a dataset or a description of data. Your task is to recommend the single most suitable type of chart or visualization to effectively represent the data. Your recommendation should:
- Be limited to one concise sentence.
- Focus on clarity and effectiveness of communication, based on the data structure and use case.
- Take into account:
  - The type of data (categorical, numerical, time series, geographical, etc.)
  - The number of variables (univariate, bivariate, multivariate)
  - The intended analytical goal (e.g., comparison, distribution, trend over time, composition, correlation, ranking, or anomaly detection)
  - The audience if mentioned (e.g., general public, business analysts, data scientists)
  - The medium if known (e.g., slide, dashboard, report, mobile screen)
Avoid generating the chart or describing how to build it. Just recommend the name of the chart type (e.g., bar chart, line chart, pie chart, histogram, box plot, scatter plot, bubble chart, heatmap, treemap, choropleth map, etc.) that best fits the scenario. If more than one chart could be appropriate, choose the most effective and commonly accepted option.
Data is: {columns}
"""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # or "gpt-3.5-turbo"
        messages=[
            {"role": "system", "content": "You are a data visualization expert."},
            {"role": "user", "content": user_prompt}
        ]
    )
    # Return the text so callers can reuse it; print it at the call site if needed.
    return response.choices[0].message.content
Sample result:
“Use a stacked bar chart to compare revenue by product line across regions.”
This LLM-backed logic helped in ambiguous cases where rule-based logic struggled.
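The two layers can be combined so the LLM is consulted only when the rules give up. Here is a rough sketch, assuming the rule engine signals "no match" by returning `None` and the LLM call is passed in as a plain callable. Both are assumptions for illustration, and `recommend`, `toy_rules`, and `fake_llm` are hypothetical names. A stub stands in for the real API call so the example runs offline.

```python
from typing import Callable, List, Optional


def recommend(
    columns: List[str],
    rule_engine: Callable[[List[str]], Optional[str]],
    llm_fallback: Callable[[List[str]], str],
) -> dict:
    """Try the fast rule engine first; fall back to the LLM on no match."""
    suggestion = rule_engine(columns)
    if suggestion is not None:
        return {"chart": suggestion, "source": "rules"}
    return {"chart": llm_fallback(columns), "source": "llm"}


# Stubs for illustration only: a toy rule and a canned LLM answer.
def toy_rules(columns: List[str]) -> Optional[str]:
    return "line chart" if any("date" in c.lower() for c in columns) else None


def fake_llm(columns: List[str]) -> str:
    return "stacked bar chart"


print(recommend(["Order_Date", "Revenue"], toy_rules, fake_llm))
print(recommend(["Product", "Region", "Revenue"], toy_rules, fake_llm))
```

Recording the `source` alongside the chart makes it possible to see later how often the fallback is actually needed.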
Optional: Auto-Render Charts
We even added a quick render option:
import matplotlib.pyplot as plt

def plot_bar(df, category_col, value_col):
    """Sum value_col per category and draw the totals as a bar chart."""
    grouped = df.groupby(category_col)[value_col].sum()
    grouped.plot(kind='bar')
    plt.title(f'{value_col} by {category_col}')
    plt.show()
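The same idea extends to other chart types by dispatching on the recommended chart name. Here is a minimal sketch, assuming matplotlib's non-interactive Agg backend and a hypothetical `render_chart` helper that saves the figure to a file instead of opening a window. The three supported chart names are an arbitrary subset picked for illustration.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless use
import matplotlib.pyplot as plt
import pandas as pd


def render_chart(df: pd.DataFrame, chart_type: str, x: str, y: str, path: str) -> str:
    """Draw one of a few supported chart types and save it to `path`."""
    fig, ax = plt.subplots()
    if chart_type == "bar":
        # Aggregate first so each category appears once.
        df.groupby(x)[y].sum().plot(kind="bar", ax=ax)
    elif chart_type == "line":
        df.sort_values(x).plot(x=x, y=y, kind="line", ax=ax)
    elif chart_type == "scatter":
        df.plot(x=x, y=y, kind="scatter", ax=ax)
    else:
        plt.close(fig)
        raise ValueError(f"Unsupported chart type: {chart_type}")
    ax.set_title(f"{y} by {x}")
    fig.savefig(path)
    plt.close(fig)  # free the figure so repeated calls do not leak memory
    return path


df = pd.DataFrame({"Region": ["East", "West", "East"], "Sales": [10.0, 20.0, 5.0]})
out = os.path.join(tempfile.gettempdir(), "sales_by_region.png")
render_chart(df, "bar", "Region", "Sales", out)
```

Raising on an unknown chart name, rather than silently drawing a default, keeps a free-text LLM suggestion from producing a misleading plot.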
What This Engine Does
- Reads your dataset
- Profiles each column (type, cardinality, nulls)
- Suggests a chart using logic + LLM
- Optionally plots a quick chart
No more guessing. No more mismatched visuals. Just context-aware charting.
Challenges We Faced
Here’s where we struggled:
- Ambiguous Column Names: val1, x2, abc123. LLMs helped, but only so much.
- Overlapping Chart Options: Bar vs. Stacked Bar vs. Line? Context is everything.
- Visualization Best Practices: Chart selection ≠ chart quality. Avoiding “chart junk” is another layer.
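For cryptic column names like val1 or x2, one plausible mitigation is to give the LLM more than the names: the dtype, the cardinality, and a few sample values per column. Here is a minimal sketch, assuming a hypothetical `describe_columns` helper whose output would be passed as the `columns` argument of the prompt. We have not measured how much this helps.

```python
import pandas as pd


def describe_columns(df: pd.DataFrame, n_samples: int = 3) -> str:
    """Summarize each column as one line of text for use in an LLM prompt."""
    lines = []
    for col in df.columns:
        series = df[col]
        # A few distinct non-null values hint at meaning when the name does not.
        samples = series.dropna().drop_duplicates().head(n_samples).tolist()
        lines.append(
            f"{col}: dtype={series.dtype}, unique={series.nunique()}, samples={samples}"
        )
    return "\n".join(lines)


df = pd.DataFrame({"val1": [10.5, 12.0, 9.8], "x2": ["A", "B", "A"]})
summary = describe_columns(df)
print(summary)
```

Sample values do leave your environment with the prompt, so a real deployment would need to decide which columns are safe to share.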
What’s Next?
Here’s how we plan to level this up:
- Fine-tune LLMs with real-world datasets
- Package this as a Power BI or Tableau extension
- Add a feedback loop: "Was this chart helpful?"
- Build a drag-and-drop UI (upload → chart preview)
Eventually, we want this to become your AI visualization assistant.
Key Takeaways
- Most BI tools assume users know which chart to pick. Often, users don’t.
- Rules + heuristics handle most cases. LLMs handle the rest.
- The right chart = better decisions, better stories, better outcomes.
- Automation can make visualization accessible to all.
Bonus: Try It Yourself
We’re open-sourcing the prototype soon on GitHub.
Want to collaborate, test it out, or use it in your BI workflows?
Reach out. Fork. Contribute. Or just tell us what you’d improve.