Data visualization is the final step in analytics, but also one of the most neglected. As two folks deeply embedded in the business intelligence and AI space, we’ve watched dashboards turn into forests of bar charts and pie charts, often not because they're the best choice, but because they're familiar.

That got us thinking: Can an AI recommend the best chart for any dataset automatically?

We teamed up to explore this idea. One of us focused on rule-based heuristics and data profiling logic; the other (shoutout to my co-author!) led the charge on LLM-driven enhancements. Together, we built a working prototype that marries structure with semantics.

This article is our build-in-public walkthrough: architecture, code, logic, and all.

The Problem with Traditional BI Charting

Let’s be honest: most BI tools (Tableau, Power BI, Looker, etc.) offer chart variety, but not guidance. They assume the user:

- Knows what chart suits their data
- Understands data types
- Can infer the story the chart should tell

In reality:

- Users default to safe/familiar charts
- Visuals often misrepresent the data
- Stakeholders misread them

We needed a tool that doesn’t just draw but thinks.

What If AI Picked the Chart?

The idea hit during a dashboard review: Why use a line chart for this categorical data?

Boom!! Opportunity spotted.
What if a system could:

- Read your CSV
- Profile columns
- Understand what you want to analyze
- Recommend the most effective chart

Our vision: Upload your data → Get a tailored chart suggestion (with explanation)

This would empower non-technical users and speed up analysts.

How We Designed It

We split the engine into three layers:

Data Profiling Layer

This step scans the dataset to identify column types, cardinality, nulls, and more.

```python
import pandas as pd

def profile_data(df):
    """Summarize each column: dtype, distinct-value count, and null count."""
    profile = []
    for col in df.columns:
        dtype = df[col].dtype
        unique_vals = df[col].nunique()      # cardinality (NaN is not counted)
        null_count = df[col].isnull().sum()  # number of missing values
        profile.append({
            "column": col,
            "dtype": str(dtype),
            "unique_values": unique_vals,
            "nulls": null_count
        })
    return pd.DataFrame(profile)
```

Output:

| column     | dtype          | unique_values | nulls |
|------------|----------------|---------------|-------|
| Region     | object         | 5             | 0     |
| Sales      | float64        | 1000          | 3     |
| Created_At | datetime64[ns] | 12            | 0     |

Rule-Based & Heuristic Engine

We hardcoded rules based on common patterns. This covered 70–80% of cases.
```python
def suggest_chart(df):
    """Apply fixed rules to the column profile and return a recommendation."""
    profile = profile_data(df)
    # Text columns are treated as categorical; int/float columns as numeric
    cat_cols = profile[profile['dtype'] == 'object']['column'].tolist()
    num_cols = profile[profile['dtype'].str.contains('float|int')]['column'].tolist()

    # Rules are checked in order; the first match wins
    if len(cat_cols) == 1 and len(num_cols) == 1:
        return f"Bar chart recommended for {cat_cols[0]} vs {num_cols[0]}"
    elif len(num_cols) == 2:
        return f"Scatter plot recommended for {num_cols[0]} vs {num_cols[1]}"
    elif any('date' in str(dtype).lower() for dtype in df.dtypes):
        return "Line chart recommended for time series visualization"
    else:
        return "Default to table or manual selection"
```

These logic rules acted as guardrails: rigid, but fast and reliable.

LLM-Enhanced Semantic Layer (Led by Co-Author!)

We added an LLM layer that interprets column names and user goals using natural language.

```python
from openai import OpenAI

# With no api_key argument, the client reads the key from the
# OPENAI_API_KEY environment variable, so it stays out of the source code.
client = OpenAI()

def get_llm_chart_suggestion(columns):
    """Ask the model for a one-sentence chart recommendation and return it."""
    user_prompt = f"""
    You are given a dataset or a description of data. Your task is to recommend the single most suitable type of chart or visualization to effectively represent the data.

    Your recommendation should:
    - Be limited to one concise sentence.
    - Focus on clarity and effectiveness of communication, based on the data structure and use case.
    - Take into account:
      - The type of data (categorical, numerical, time series, geographical, etc.)
      - The number of variables (univariate, bivariate, multivariate)
      - The intended analytical goal (e.g., comparison, distribution, trend over time, composition, correlation, ranking, or anomaly detection)
      - The audience if mentioned (e.g., general public, business analysts, data scientists)
      - The medium if known (e.g., slide, dashboard, report, mobile screen)

    Avoid generating the chart or describing how to build it. Just recommend the name of the chart type (e.g., bar chart, line chart, pie chart, histogram, box plot, scatter plot, bubble chart, heatmap, treemap, choropleth map, etc.) that best fits the scenario.

    If more than one chart could be appropriate, choose the most effective and commonly accepted option.

    Data is: {columns}
    """
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # or "gpt-3.5-turbo"
        messages=[
            {"role": "system", "content": "You are a data visualization expert."},
            {"role": "user", "content": user_prompt},
        ],
    )
    return response.choices[0].message.content
```

Sample result:

“Use a stacked bar chart to compare revenue by product line across regions.”

This LLM-backed logic helped in ambiguous cases where rule-based logic struggled.
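The division of labor described above (rules first, the LLM only for what the rules cannot settle) can be sketched as a small router. This is a minimal illustration under stated assumptions, not the prototype's actual wiring: `rule_based_suggestion`, `recommend_chart`, and `stub_llm` are hypothetical names, the profile is passed as plain dicts shaped like the rows `profile_data` produces, and the stub stands in for a real LLM call so the example runs offline.

```python
from typing import Callable, Dict, List, Optional


def rule_based_suggestion(profile: List[Dict[str, str]]) -> Optional[str]:
    """Return a suggestion if one of the fixed rules matches, else None."""
    cat_cols = [p["column"] for p in profile if p["dtype"] == "object"]
    num_cols = [p["column"] for p in profile
                if "float" in p["dtype"] or "int" in p["dtype"]]
    has_date = any("date" in p["dtype"].lower() for p in profile)

    # Same rule order as the article's suggest_chart; first match wins
    if len(cat_cols) == 1 and len(num_cols) == 1:
        return f"Bar chart recommended for {cat_cols[0]} vs {num_cols[0]}"
    if len(num_cols) == 2:
        return f"Scatter plot recommended for {num_cols[0]} vs {num_cols[1]}"
    if has_date:
        return "Line chart recommended for time series visualization"
    return None  # no rule matched: treat the dataset as ambiguous


def recommend_chart(profile: List[Dict[str, str]],
                    llm_fallback: Callable[[str], str]) -> Dict[str, str]:
    """Try the rules first; call the (injected) LLM only when they give up."""
    suggestion = rule_based_suggestion(profile)
    if suggestion is not None:
        return {"source": "rules", "suggestion": suggestion}
    description = ", ".join(f"{p['column']} ({p['dtype']})" for p in profile)
    return {"source": "llm", "suggestion": llm_fallback(description)}


def stub_llm(description: str) -> str:
    """Hypothetical stand-in for a real LLM call (assumption, offline only)."""
    return f"(LLM would be asked about: {description})"


clear_case = [
    {"column": "Region", "dtype": "object"},
    {"column": "Sales", "dtype": "float64"},
]
ambiguous_case = [
    {"column": "val1", "dtype": "object"},
    {"column": "x2", "dtype": "object"},
    {"column": "abc123", "dtype": "object"},
]

print(recommend_chart(clear_case, stub_llm)["source"])      # prints: rules
print(recommend_chart(ambiguous_case, stub_llm)["source"])  # prints: llm
```

Passing the fallback in as a callable keeps the rule path free of network calls and makes the router easy to test; in a real setup the stub would be replaced by a function that wraps the LLM request.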
Optional: Auto-Render Charts

We even added a quick render option:

```python
import matplotlib.pyplot as plt

def plot_bar(df, category_col, value_col):
    # Sum the value column per category, then draw one bar per category
    grouped = df.groupby(category_col)[value_col].sum()
    grouped.plot(kind='bar')
    plt.title(f'{value_col} by {category_col}')
    plt.show()
```

What This Engine Does

- Reads your dataset
- Profiles it using rules
- Suggests a chart using logic + LLM
- Optionally plots a quick chart

No more guessing. No more mismatched visuals. Just context-aware charting.

Challenges We Faced

Here’s where we struggled:

- Ambiguous Column Names: val1, x2, abc123. LLMs helped, but only so much.
- Overlapping Chart Options: Bar vs. Stacked Bar vs. Line? Context is everything.
- Visualization Best Practices: Chart selection ≠ chart quality. Avoiding “chart junk” is another layer.

What’s Next?

Here’s how we plan to level this up:

- Fine-tune LLMs with real-world datasets
- Package this as a Power BI or Tableau extension
- Add a feedback loop: "Was this chart helpful?"
- Build a drag-and-drop UI (upload → chart preview)

Eventually, we want this to become your AI visualization assistant.

Key Takeaways

- Most BI tools assume users know what chart to pick. They don’t.
- Rules + heuristics handle most cases. LLMs handle the rest.
- The right chart = better decisions, better stories, better outcomes.
- Automation can make visualization accessible to all.

Bonus: Try It Yourself

We’re open-sourcing the prototype soon on GitHub. Want to collaborate, test it out, or use it in your BI workflows?

Reach out. Fork. Contribute. Or just tell us what you’d improve.