Pantry Pilot Proves Usefulness by Automating Restaurant Food Costing with AI

Welcome to the Proof of Usefulness Hackathon report series, curated by HackerNoon’s editors to spotlight standout solutions to real-world problems. Whether you’re a solopreneur, part of an early-stage startup, or a developer building something that truly matters, the Proof of Usefulness Hackathon is your chance to test your product’s utility, get featured on HackerNoon, and compete for $150k+ in prizes. Submit your project to get started!

In this interview, we talk to the minds behind Pantry Pilot—an Agentic ERP system designed for commercial kitchens.

What does Pantry Pilot do? And why is now the time for it to exist?

Pantry Pilot is an Agentic ERP (Enterprise Resource Planning) system designed specifically for commercial kitchens. It automates the most hated task in the hospitality industry: manual data entry. By using multimodal AI agents to read unstructured supplier invoices and handwritten recipes, and integrating directly with Point of Sale systems (like Clover), Pantry Pilot tracks real-time inventory depletion and calculates the exact profitability of every dish served. It turns a lagging financial statement into a real-time dashboard.

What is your traction to date? How many people does Pantry Pilot reach?

We are currently executing a high-fidelity private pilot with 5 commercial restaurant locations across the Texas Triangle (Bryan, College Station, and Navasota).

Who does Pantry Pilot serve? What’s exciting about your users and customers?

Pantry Pilot serves independent restaurant operators, executive chefs, and caterers. What excites us most about these users is that they are craftsmen who are currently forced to act as accountants. They entered this industry to create food and hospitality, not to manage spreadsheets or type data from crumpled invoices. It is exciting to give a local independent chef the same data intelligence as a global franchise.

What technologies were used in the making of Pantry Pilot? And why did you choose the ones most essential to your tech stack?

We built Pantry Pilot on a modern Python stack designed for heavy data lifting and real-time interaction.

FastAPI & Celery: We chose FastAPI for its high-performance async capabilities, which are crucial when handling long-running AI inferences. We pair this with Celery and Redis to offload the heavy thinking tasks (OCR and recursive costing) to background workers, ensuring the UI remains snappy even while the system is digesting a 50-page PDF invoice.
Google Gemini: This is the engine of our Agentic workflow. We chose Gemini specifically for its massive context window and superior multimodal capabilities. It allows our agents to look at complex, messy invoice images and understand the visual layout of tabular data better than traditional text-only models.
HTMX: Instead of the complexity of a React/Vue SPA, we used HTMX to deliver a Correction Deck UI. This allows us to maintain state on the server (where the database is) while still giving the user a fluid, app-like experience when verifying AI data.
Digital Ocean & PostgreSQL: We host on Digital Ocean for simplicity and reliability, using PostgreSQL’s JSONB features to create a hybrid relational/document schema that can adapt to the unpredictable structure of supplier data.

What is traction to date for Pantry Pilot? Around the web, who’s been noticing?

The project has established a functional V1 MVP which is currently processing real financial data for our pilot locations. Development is actively progressing toward a V2 deployment that integrates POS APIs for real-time tracking, with portion size optimization features also underway.

Pantry Pilot scored a 56 proof of usefulness score (proofofusefulness.com/pantry-pilot-report) - how do you feel about that? Does it need to be reassessed, or is it just right?

The algorithm nailed the most important metric: Real World Utility (+22 points). This validates that we are solving an actual, painful problem (restaurant profitability) rather than building a solution in search of a problem. The overall score sits at 56 primarily because we were penalized for Evidence of Traction, which is fair. We are in a closed, high-fidelity pilot with five locations rather than a public launch with thousands of signups. We have maximized the utility of the code; now we simply need to scale the access.

What excites you about Pantry Pilot's potential usefulness?

Most independent restaurants fail because they lack the data infrastructure of major chains; they know what they sold, but not what it cost. This project bridges that gap. By using AI to turn messy, unstructured physical invoices into structured data that talks to the POS, we turn lagging indicators into leading indicators. It’s an economic survival tool!

Walk us through your most concrete evidence of usefulness. Not vanity metrics or projections - what's the one data point that proves people genuinely need what you've built?

When I onboard a restaurant, I ask them what it costs to make their most popular item. Invariably they think for a moment and either say "I don't know" or "I think it is X." After a week of using the system, I ask them the same question, and they can tell me the cost down to the penny. That clarity changes how they run their business.

How do you measure genuine user adoption versus tourists who sign up but never return? What's your retention story?

We have carefully selected our first set of users based on sales volume and their willingness to provide reliability feedback. Our retention story is built on sticky data. Once a restaurant sees their live margins, going back to blind guessing feels like flying a plane without instruments. They don't just return; they rely on it daily.

If we re-score your project in 12 months, which criterion will show the biggest improvement, and what are you doing right now to make that happen?

Right now, our traction score is constrained because we are operating in pilot mode where onboarding requires manual API key exchanges. To change this, we are currently rewriting our entire onboarding architecture (specifically the onboarding.py module) to support Self-Serve SaaS. We are training our agents to handle the initial pantry setup automatically, removing the need for human intervention during account creation. In 12 months, we expect to move from 5 hand-held pilot locations to hundreds of self-onboarded restaurants, transforming our traction metric from number of pilots to millions of dollars in digitized inventory.

How did you hear about HackerNoon? Tell us about your experience with HackerNoon.

I received an email about the Hackathon. I regularly read and contribute to HackerNoon!

You mentioned having five active test locations in Texas. What specific feedback from these early pilot users has most significantly shaped the features you are building for the V2 deployment?

The most critical feedback has been the need for Sub-Recipe logic: translating raw inventory items into batch recipes (e.g., turning onions and tomatoes into a 5-gallon batch of Salsa, which is then used in a Taco). The second is the need for end-to-end integration: connecting the Point of Sale to inventory depletion and syncing the final data to QuickBooks.
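The Sub-Recipe idea can be illustrated with a short recursive costing sketch. This is not Pantry Pilot's code: the `Ingredient` and `Recipe` classes, the `unit_cost` helper, and all prices and quantities are hypothetical, chosen only to show how a batch recipe's cost can roll up into a menu item.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class Ingredient:
    """A raw inventory item (hypothetical model, not Pantry Pilot's schema)."""
    name: str
    unit_cost: Decimal  # cost per purchase unit, e.g. per pound


@dataclass
class Recipe:
    """A recipe whose components are raw ingredients or other recipes."""
    name: str
    yield_qty: Decimal  # number of units one batch produces
    components: list = field(default_factory=list)  # (item, quantity) pairs

    def batch_cost(self) -> Decimal:
        # Sum the cost of every component, recursing into sub-recipes.
        total = Decimal("0")
        for item, qty in self.components:
            total += unit_cost(item) * qty
        return total


def unit_cost(item) -> Decimal:
    """Cost of one unit of an ingredient, or one yielded unit of a recipe."""
    if isinstance(item, Ingredient):
        return item.unit_cost
    return item.batch_cost() / item.yield_qty


# Illustrative prices only.
onions = Ingredient("Onions", Decimal("0.50"))      # per lb
tomatoes = Ingredient("Tomatoes", Decimal("1.00"))  # per lb
tortilla = Ingredient("Tortilla", Decimal("0.10"))  # each

# 10 lb onions + 20 lb tomatoes yield a 5-gallon batch (640 US fl oz) of salsa.
salsa = Recipe("Salsa", Decimal("640"),
               [(onions, Decimal("10")), (tomatoes, Decimal("20"))])

# One taco uses 1 tortilla and 2 fl oz of salsa.
taco = Recipe("Taco", Decimal("1"),
              [(tortilla, Decimal("1")), (salsa, Decimal("2"))])

print(unit_cost(taco))
```

Using `Decimal` rather than floats keeps the arithmetic exact, which matters when costs are reported to the penny. A production version would also need unit conversion and protection against circular recipe references, both omitted here.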

With the upcoming integration of POS APIs, how do you plan to handle the wide variety of legacy and modern POS systems currently used by independent restaurants to ensure seamless growth?

We are tackling the fragmentation of the POS landscape with a two-tiered strategy: Standardized Adapters for the modern web, and Agentic Fallbacks for legacy systems.

For modern cloud-based systems (like Clover, Toast, and Square), we don't build custom business logic for every integration. Instead, we’ve designed a Universal Sale Data Model within our system. We build lightweight, interchangeable adapters that simply normalize the external API data into our internal format. This allows us to scale to new providers by writing a translation layer rather than rewriting the core engine.
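The adapter approach described above might look roughly like the following sketch. Everything here is an assumption made for illustration: the `SaleLine` model, the `ADAPTERS` registry, and the two payload shapes are invented and do not reflect the real Clover, Toast, or Square API formats.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Callable


@dataclass(frozen=True)
class SaleLine:
    """Provider-neutral sale record (hypothetical field names)."""
    provider: str
    item_name: str
    quantity: int
    unit_price: Decimal


# NOTE: both payload shapes below are invented for this example.
def from_provider_a(payload: dict) -> SaleLine:
    # Hypothetical provider A reports prices as integer cents.
    return SaleLine(
        provider="provider_a",
        item_name=payload["name"],
        quantity=payload["qty"],
        unit_price=Decimal(payload["price_cents"]) / 100,
    )


def from_provider_b(payload: dict) -> SaleLine:
    # Hypothetical provider B nests the item and sends values as strings.
    return SaleLine(
        provider="provider_b",
        item_name=payload["item"]["title"],
        quantity=int(payload["count"]),
        unit_price=Decimal(payload["amount"]),
    )


# Supporting a new provider means registering one more translation function.
ADAPTERS: dict[str, Callable[[dict], SaleLine]] = {
    "provider_a": from_provider_a,
    "provider_b": from_provider_b,
}


def normalize(provider: str, payload: dict) -> SaleLine:
    """Translate an external payload into the internal sale model."""
    try:
        adapter = ADAPTERS[provider]
    except KeyError:
        raise ValueError(f"No adapter registered for {provider!r}") from None
    return adapter(payload)


sale_a = normalize("provider_a", {"name": "Taco", "qty": 2, "price_cents": 350})
sale_b = normalize("provider_b", {"item": {"title": "Taco"}, "count": "2", "amount": "3.50"})
print(sale_a.unit_price == sale_b.unit_price)
```

Because the core engine only ever sees `SaleLine`, the costing and depletion logic does not need to change when a new provider is added.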

The biggest barrier to growth in this industry is legacy on-premise hardware (e.g., old Micros or Aloha systems) that lacks accessible APIs. Rather than ignoring these restaurants, we utilize our existing AI infrastructure. If a legacy system can print a Product Mix (PMIX) report to PDF or paper, our Vision Agents can ingest, read, and digitize that sales report exactly the same way they process a supplier invoice. This makes Pantry Pilot compatible with virtually any POS system from day one, regardless of its age.

Your system uses AI to digitize unstructured physical invoices. How does Pantry Pilot handle edge cases, such as handwriting or damaged paper, to ensure the financial accuracy required for utility?

We handle AI uncertainty by treating digitization as a workflow, not a magic button. We acknowledge that while Large Language Models are probabilistic, accounting must be deterministic.

Unlike traditional OCR which reads character-by-character, our Vision Agents (powered by Gemini) read contextually. If a coffee stain obscures the quantity of Onions, the agent looks at the Pack Size (50LB Bag) and the Total Price to mathematically deduce the missing number using a strict Hierarchy of Evidence defined in our system prompts.
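The kind of arithmetic deduction described above can be sketched in plain Python. In Pantry Pilot the Hierarchy of Evidence is said to live in the system prompts, so the ordering below, the `infer_quantity` function, and its parameters are assumptions made purely to illustrate the idea.

```python
from decimal import Decimal
from typing import Optional


def infer_quantity(
    read_quantity: Optional[Decimal],
    pack_price: Optional[Decimal],
    line_total: Optional[Decimal],
) -> tuple[Optional[Decimal], str]:
    """Return (quantity, note) using an assumed evidence ordering:
    1. a quantity read directly from the invoice,
    2. line total divided by the price per pack,
    3. otherwise unresolved, for a human to review.
    """
    if read_quantity is not None:
        return read_quantity, "Quantity read directly from the Quantity column."

    # A zero or missing pack price cannot be used for division.
    if pack_price and line_total is not None:
        quantity = line_total / pack_price
        # Only accept the deduction if it yields a whole number of packs.
        if quantity == quantity.to_integral_value():
            return quantity, (
                f"Quantity unreadable; deduced {quantity} from "
                f"total {line_total} / pack price {pack_price}."
            )
        return None, "Total is not a whole multiple of the pack price; needs review."

    return None, "Not enough evidence to deduce the quantity; needs review."


# A stain hides the quantity of 50LB onion bags, but the prices are legible.
qty, note = infer_quantity(None, Decimal("30.50"), Decimal("91.50"))
print(qty, note)
```

Returning a note alongside the value mirrors the idea that the extraction should explain itself, so a reviewer can see which evidence was used.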

Furthermore, AI data is never pushed directly to the financial ledger. Instead, low-confidence extractions are staged in a Correction Queue. This UI presents the user with the original image side-by-side with the extracted data, highlighting fields that need verification. Our agents don't just extract numbers; they write notes explaining why they extracted a value (e.g., "Extracted '2' from the Description column because the Quantity column was empty"). This allows the human operator to trust the result or spot the error immediately, ensuring 100% financial accuracy before the data ever touches the P&L.
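A staging step of this kind could be modeled as follows. This is a minimal sketch, not the actual Correction Queue: the class names, the `CONFIDENCE_THRESHOLD` value, and the rule that every invoice needs explicit human approval before posting are all assumptions.

```python
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.90  # hypothetical cutoff for highlighting a field


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # model-reported confidence between 0 and 1 (assumed)
    note: str = ""     # the agent's explanation of how it got the value


@dataclass
class StagedInvoice:
    image_path: str    # original scan, shown side-by-side in the review UI
    fields: list
    approved: bool = False

    def needs_review(self) -> list:
        """Fields the reviewer should look at first."""
        return [f for f in self.fields if f.confidence < CONFIDENCE_THRESHOLD]


def post_to_ledger(invoice: StagedInvoice, ledger: list) -> None:
    """Refuse to write anything that a human has not approved."""
    if not invoice.approved:
        raise PermissionError("Invoice must be approved before posting.")
    ledger.append({f.name: f.value for f in invoice.fields})


invoice = StagedInvoice(
    image_path="invoice_scan.png",
    fields=[
        ExtractedField("item", "Onions 50LB Bag", 0.98),
        ExtractedField(
            "quantity", "2", 0.55,
            note="Extracted '2' from the Description column because "
                 "the Quantity column was empty",
        ),
    ],
)

for flagged in invoice.needs_review():
    print(flagged.name, "->", flagged.note)
```

Keeping the approval check inside `post_to_ledger` means the ledger stays deterministic even though the extraction step is probabilistic.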
