👋 Why We Built This

Social platforms make it easy to react — but not to respond. You double-tap to like. You comment if you have time. But what if you could just ask the post a question, instantly?

When I was selected for Meta’s invite-only LlamaCon Hackathon, I teamed up with fellow engineer Rashmi to test an idea we couldn’t stop thinking about: what if posts could talk back?

We built Installama, an AI overlay that lets users triple-tap any post — image, caption, meme — and open an instant conversational window powered by Meta’s LLaMA API.

🔄 How Installama Works

🎥 Watch the full demo

The UX is natural and invisible:

- Double tap → Like (as usual)
- Triple tap → AI wakes up and talks to the post

The AI response is:

- Context-aware (reads caption + tags)
- Conversational (answers or reacts)
- Fast and expressive

You no longer need to comment or DM. You just tap — and the post responds like a person.

🧱 The Tech Stack

We collaborated on design and UX, but I led the engineering side:

- Meta’s LLaMA API – for generating fast, context-aware responses
- Next.js + Tailwind – for clean frontend and mobile-style tapping logic
- Supabase – for storing taps, metadata, and AI session states
- Dynamic prompt chaining – pulling metadata (caption, category, tags) into each prompt to give LLaMA richer context

It works for both logged-in and guest users. No typing required. Just tap.

🏆 What Happened at Meta’s LlamaCon

This was Meta’s first official hackathon showcasing its powerful LLaMA API. Our project, Installama, was:

✅ Accepted into the hackathon after a selective application process
✅ Publicly showcased on the official Cerebral Valley LlamaCon page
✅ Reviewed by Meta engineers and Cerebral Valley organizers
✅ Awarded API power-user status based on technical implementation

This recognition validated that our project wasn’t just cool — it was forward-thinking.

💡 Why Installama Matters

Most LLM products treat AI like a chatbot-in-a-box. Installama flips that by embedding AI into natural user behavior — tapping. It’s:

- Seamless
- Frictionless
- Intent-driven

And most importantly, it opens the door for gesture-based AI interfaces — a powerful new way to make LLMs feel human.
Imagine:

- Talking to a meme
- Triple-tapping a headline to ask it questions
- Commenting without commenting

It’s not chat — it’s interaction.

🎯 What’s Next

We’re evolving Installama into a full-featured AI UX framework. My current roadmap includes:

- 🎙️ Voice-triggered taps – for accessibility and gesture-free interaction
- 🧠 Reaction history – letting users see what others asked or how the post responded
- ✍️ Creator-mode replies – where influencers can pre-train the AI to respond in their voice

🔗 Under the Hood 👇

Key Features

- Triple-tap gesture triggers AI overlay
- Meta’s LLaMA API generates human-like responses
- Prompt chaining incorporates caption, tags, and image analysis
- Works for guest and logged-in users
- Streaming responses via Server-Sent Events (SSE)
- Frontend fails over to client-side analysis if the backend is unavailable

🧱 Architecture Overview

```
[User Triple Taps Post]
        ↓
[Frontend: Tap Handler]
        ↓
[Supabase Logs Tap + Session]
        ↓
[Backend: Gemini Vision → Prompt Chain → LLaMA API]
        ↓
[Frontend: Render AI Response Overlay]
```
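To make that flow concrete, here is a minimal sketch of how the backend route could tie the stages together. The helper names (`analyzeImageWithGemini`, `askLlama`) and their module paths are placeholders standing in for the Gemini and LLaMA code shown in the sections below, not our exact implementation.

```javascript
// server.js: illustrative orchestration of the tap → vision → LLaMA pipeline
import express from 'express';
// Hypothetical local modules wrapping the Gemini and LLaMA calls sketched below
import { analyzeImageWithGemini } from './services/gemini.js';
import { askLlama } from './services/llama.js';

const app = express();
app.use(express.json());

app.post('/api/analyze-image', async (req, res) => {
  const { imageUrl, caption, hashtags, question } = req.body;
  try {
    // 1. Gemini Vision describes the image
    const imageAnalysis = await analyzeImageWithGemini(imageUrl);
    // 2. Prompt chain folds caption + tags + analysis into the LLaMA request
    const reply = await askLlama({ imageAnalysis, caption, hashtags, question });
    res.json({ success: true, response: reply });
  } catch (err) {
    console.error('Post analysis failed:', err);
    res.status(500).json({ success: false, error: 'analysis_failed' });
  }
});

app.listen(3001);
```

The frontend fallback shown later posts to this same `/api/analyze-image` path, so the `success`/`response` shape here mirrors what the client-side fallback returns.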
⚙️ Tap Detection Logic

Triple taps are detected with a 300 ms window between taps, using a React Native hook:

```javascript
// useTripleTap.js
import { useState, useRef, useCallback, useEffect } from 'react';
import { State } from 'react-native-gesture-handler';
import * as Haptics from 'expo-haptics';

export const useTripleTap = (onTripleTapCallback) => {
  const [tapCount, setTapCount] = useState(0);
  const lastTapTimeRef = useRef(0);
  const MAX_DELAY = 300; // ms allowed between consecutive taps

  const onHandlerStateChange = useCallback((event) => {
    if (event.nativeEvent.state === State.ACTIVE) {
      const now = Date.now();
      // Reset the count if the previous tap was too long ago
      if (now - lastTapTimeRef.current > MAX_DELAY) setTapCount(1);
      else setTapCount((prev) => prev + 1);
      lastTapTimeRef.current = now;

      // tapCount still holds the value from before this tap,
      // so 2 means this is the third tap inside the window
      if (tapCount === 2) {
        setTapCount(0);
        Haptics.impactAsync(Haptics.ImpactFeedbackStyle.Medium);
        onTripleTapCallback();
      }
    }
  }, [tapCount, onTripleTapCallback]);

  // Clear the count if no further tap arrives within the window
  useEffect(() => {
    if (tapCount > 0) {
      const timer = setTimeout(() => setTapCount(0), MAX_DELAY);
      return () => clearTimeout(timer);
    }
  }, [tapCount]);

  return onHandlerStateChange;
};
```

🔐 Supabase Session Logging

Each triple tap is stored along with its session metadata:

```javascript
// Log the tap so the AI session can be reconstructed later
await supabase.from('tap_events').insert([
  {
    post_id: postId,
    tap_type: 'triple',
    session_id: sessionId,
    user_id: user?.id, // undefined for guest sessions
    timestamp: new Date().toISOString(),
  },
]);
```

🧠 Prompt Chaining with LLaMA

The chain runs System → Vision → Prompt → LLaMA: the prompt includes the caption, tags, and Gemini's image analysis.

```javascript
const messages = [
  {
    role: 'system',
    content: `You are Installama, an AI that replies like a social media post. You have access to Gemini image analysis:\n${imageAnalysis}\nCaption: ${caption}\nTags: ${hashtags}`,
  },
  {
    role: 'user',
    content: question || "What's interesting about this post?",
  },
];

const response = await axios.post(
  LLAMA_API_URL,
  {
    model: 'Llama-4-Maverick-17B-128E-Instruct-FP8',
    messages,
  },
  {
    headers: {
      Authorization: `Bearer ${LLAMA_API_KEY}`,
      'Content-Type': 'application/json',
    },
  }
);
```
🧬 Image Analysis (Gemini Vision API)

The image is converted to base64, and Gemini returns structured text describing the subject, mood, and tone.

```javascript
// genAI is an initialized GoogleGenerativeAI client
const model = genAI.getGenerativeModel({ model: 'gemini-pro-vision' });

const result = await model.generateContent({
  contents: [
    {
      role: 'user',
      parts: [
        { text: 'Describe this image in detail...' },
        { inlineData: { data: base64Image, mimeType } },
      ],
    },
  ],
});

const imageAnalysis = result.response.text().trim();
```

📡 Streaming Responses (SSE)

AI responses are streamed to the client word by word.

```javascript
// Express endpoint
app.post('/api/analyze-image-stream', async (req, res) => {
  res.setHeader('Content-Type', 'text/event-stream');

  const response = await streamLlamaResponse(...);
  const words = response.split(' ');

  // Emit three words at a time to produce a typing effect
  for (let i = 0; i < words.length; i += 3) {
    const chunk = words.slice(i, i + 3).join(' ');
    res.write(`data: ${JSON.stringify({ type: 'chunk', text: chunk })}\n\n`);
    await new Promise((r) => setTimeout(r, 100));
  }

  res.end(); // close the stream once all chunks have been sent
});
```

🧾 Frontend Fallbacks

If the backend fails, Gemini runs client-side instead.

```javascript
try {
  const response = await axios.post('/api/analyze-image', { imageUrl, caption });
  return response.data;
} catch {
  // Backend unavailable: fall back to running Gemini in the client
  const imageAnalysis = await GeminiVisionService.analyzeImageFromUrl(imageUrl);
  return {
    success: true,
    response: `Image analysis: ${imageAnalysis}`,
    source: 'client-side',
  };
}
```

🔒 Safety & Moderation

- Guest sessions are anonymous
- Moderation via OpenAI’s API (planned)
- Rate limits to prevent abuse

Installama isn’t just a chatbot — it’s a gesture-first AI interaction layer. No input box. No commands. Just tap.

🧾 Tags

#llama #meta #llm #socialai #ux #frontend #tripleTap #promptengineering #humanai #llamacon