This whitepaper presents ATLAS, a conceptual architecture for a natural language interface to IT Service Management (ITSM) systems, designed for mid-to-large enterprise environments. ATLAS employs a novel multi-agent AI architecture that transforms unstructured natural language queries into actionable insights from service desk data. The proposed system features a three-stage AI pipeline for query analysis, data retrieval refinement, and conversational response generation, combined with a hybrid data synchronization strategy designed to maintain data freshness without impacting query performance. We present modelled evaluation results based on simulated workloads, projecting 90-95% query understanding accuracy with median response times under 4 seconds.

Key contributions include:

- A context-aware conversation management system enabling multi-turn dialogue with follow-up query understanding
- A thread-safe concurrency model for AI agent orchestration
- A fallback-resilient query parsing system with heuristic degradation
- Detailed cost analysis projecting significant reduction in time-to-insight versus traditional methods
- Comprehensive error analysis with mitigation strategies

We compare ATLAS against existing commercial solutions and open-source alternatives, demonstrating potential for superior context retention and domain-specific accuracy. The paper concludes with an honest assessment of anticipated limitations and a roadmap for addressing identified gaps.

Keywords: Natural Language Processing, IT Service Management, Multi-Agent Systems, Conversational AI, Enterprise Architecture, LLM Orchestration

Table of Contents

1. Introduction
2. Related Work
3. Problem Statement
4. System Architecture
5. Multi-Agent AI Pipeline
6. Prompt Engineering
7. Conversation Context Management
8. Data Synchronization Strategy
9. Query Processing Engine
10. Concurrency & Thread Safety
11. Data Transformation Pipeline
12. Security Considerations
13. Evaluation & Results
14. Limitations
15. Architecture Decision Records
16. Future Work
17. Conclusion
18. References
19. Appendices

1. Introduction

1.1 Background

IT Service Management (ITSM) systems are the backbone of enterprise IT operations, handling thousands of service requests, incidents, and change management workflows daily.
In mid-to-large organizations, these systems typically process hundreds to thousands of tickets daily, generating valuable operational data that remains largely inaccessible to non-technical stakeholders. While ITSM platforms excel at structured data storage and workflow automation, they present significant usability challenges:

- Query Complexity: Extracting insights requires knowledge of query syntax, report builders, or API integrations
- Data Accessibility: Operations managers, team leads, and executives struggle to access real-time metrics
- Context Switching: Users navigate multiple interfaces to correlate information
- Reporting Latency: Ad-hoc queries often require IT intervention, with turnaround times of hours to days

1.2 ATLAS Overview

ATLAS (Automated Ticket Language Analysis System) is designed to address these challenges by providing a natural language interface that allows users to query ITSM data conversationally. The system would transform queries like:

- "How many tickets does the support team have this month?"
- "Which support personnel have been inactive for two weeks?"
- "What hour had the most requests yesterday?"
- "How many of them are open?" (follow-up query)

into structured database operations, returning results in natural language with downloadable exports.
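To make this transformation concrete, the following is a minimal, hypothetical illustration of the structured parameters the first example query might produce. The field names mirror the analysis schema defined later in Section 6.1; the specific date values simply assume a current date in late November 2025, matching the examples in Section 7.2.

```csharp
// Hypothetical illustration only: "How many tickets does the support team have this month?"
// parsed into structured parameters (simplified; see Section 6.1 for the full schema).
var analysis = new
{
    QueryType = "request_search",      // semantic intent, not raw SQL
    Status = (string?)null,            // no status filter requested
    Technician = (string?)null,        // no individual technician named
    DateFrom = "2025-11-01 00:00",     // "this month" resolved against the injected current date
    DateTo = "2025-11-27 23:59",
    IsConversational = false
};
```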
1.3 Deployment Context

ATLAS is designed for deployment in mid-to-large enterprise environments with characteristics such as:

- Daily ticket volume: Hundreds to thousands of requests
- Active technicians: 100-500+ support staff
- User base: 50-200+ operations and management personnel
- Data volume: Tens of thousands of historical tickets
- Availability requirement: 99%+ uptime

1.4 Contributions

This whitepaper makes the following contributions:

- Multi-Agent Architecture: A three-stage AI pipeline separating query understanding, data validation, and response generation
- Context-Aware Conversations: Conversation management enabling follow-up queries with pronoun resolution and filter inheritance
- Hybrid Sync Strategy: Non-blocking background synchronization maintaining data freshness
- Proposed Patterns: Patterns for thread safety, fallback handling, and graceful degradation
- Modelled Evaluation: Projected accuracy metrics and error categorization
- Honest Limitations Assessment: Transparent discussion of anticipated system constraints and failure modes
2. Related Work

2.1 Commercial ITSM AI Solutions (2024-2025)

| System | Approach | Key Features (2024-2025) | Limitations vs ATLAS |
| --- | --- | --- | --- |
| ServiceNow Now Assist | GenAI with NowLLM + BYO model support | Incident summarization, AI Search with RAG "Genius Results," Skill Kit for custom GenAI (Xanadu release, Sept 2024) | Agent productivity focus; requires flow configuration; limited analytical query support |
| Freshworks Freddy AI | Three-tier AI (Self-Service, Copilot, Insights) | AI Agents with agentic workflows (Oct 2024), 80% query resolution claim, real-time sentiment, Slack/Teams integration | Customer-facing automation focus; limited internal analytics; no multi-turn data queries |
| Zendesk AI Agents | Essential + Advanced tiers with GPT-4o | Generative replies, intelligent triage, ticket summarization, auto-assist (March 2024 GA), custom intents | Ticket deflection design; limited data analytics; no follow-up query context |
ServiceNow Now Assist (2024): ServiceNow's Xanadu release (September 2024) introduced significant GenAI capabilities. Now Assist features include case/incident summarization, chat reply generation, and AI Search with RAG-based "Genius Results" that generate answers from knowledge articles. The platform supports bring-your-own LLM models and integrates with Microsoft Copilot. The Now Assist Skill Kit enables custom GenAI skill development. However, Now Assist primarily enhances agent productivity rather than enabling analytical querying of ticket data.

Freshworks Freddy AI (2024): Freddy AI evolved in 2024 to include three tiers: Self-Service (bots), Copilot (agent assistance), and Insights (analytics). The October 2024 update introduced GenAI-powered AI Agents with pre-built "agentic workflows" for e-commerce integrations (Shopify, Stripe, FedEx). Freddy AI Agent integrates with Slack and Microsoft Teams for 24/7 employee support. While Freddy excels at automated resolution, it lacks multi-turn analytical conversation capabilities.

Zendesk AI Agents (2024): Zendesk's March 2024 GA release brought generative AI features (summarize, expand, tone shift) upgraded to GPT-4o. The Advanced AI add-on includes intelligent triage with custom intents, auto-assist for guided resolution, and suggested first replies. Zendesk reports over 1.5 million monthly uses of these features. The platform optimizes agent workflows but does not address natural language analytics queries.

ATLAS Differentiation: Unlike commercial solutions that optimize agent workflows or automate ticket resolution, ATLAS specifically addresses the analytical query gap, enabling natural language questions about ticket data (technician performance, volume trends, inactive staff) with multi-turn context preservation.

2.2 Academic Research & Frameworks (2023-2025)

Text-to-SQL Systems

BIRD Benchmark (Li et al., 2024): The BIRD benchmark represents the current state-of-the-art evaluation standard for text-to-SQL, comprising 12,751 text-SQL pairs across 95 databases totaling 33.4 GB. Published at NeurIPS 2023 and continuously updated, BIRD emphasizes real-world challenges including dirty data, external knowledge requirements, and SQL efficiency (Valid Efficiency Score metric). As of late 2024, GPT-4 achieves approximately 54.89% execution accuracy on BIRD, significantly below human performance of 92.96%.

Spider 2.0 (Lei et al., 2024): Released in late 2024, Spider 2.0 further increases complexity with 632 enterprise-level workflow problems requiring interaction with cloud databases (BigQuery, Snowflake), queries exceeding 100 lines, and multi-step reasoning. Current state-of-the-art models achieve only approximately 6% accuracy on Spider 2.0, demonstrating significant remaining challenges.
ATLAS vs Text-to-SQL: ATLAS operates at a higher abstraction level than direct text-to-SQL:

- Query analysis produces semantic intents (top_technicians, influx_requests) rather than raw SQL
- Domain-specific query types enable optimized retrieval patterns
- Conversational context enables filter inheritance across turns (not addressed by text-to-SQL benchmarks)

Multi-Agent LLM Frameworks

Microsoft AutoGen (Wu et al., 2023; v0.4 January 2025): AutoGen pioneered the multi-agent conversation paradigm for LLM applications. Originally released in Fall 2023, AutoGen v0.4 (January 2025) introduced an actor-based architecture with asynchronous messaging, modular agent composition, and AutoGen Studio for no-code agent building. The framework supports diverse applications including code generation, task automation, and conversational agents.

LangGraph (LangChain, January 2024): LangGraph provides graph-based agent orchestration with native support for cycles, human-in-the-loop patterns, and persistent state management. The framework became widely adopted for production agents in 2024, with deployments at Klarna, Replit, and Uber. LangGraph's hierarchical team patterns (supervisor agents coordinating specialized agents) and the December 2024 "Command" primitive for multi-agent communication influenced ATLAS's pipeline design.

ATLAS Contributions vs Frameworks:

- Domain-specific ITSM agent specialization (vs general-purpose frameworks)
- Production-tested concurrency patterns for AI platform thread limitations
- Heuristic fallback system for graceful degradation (critical for enterprise reliability)

Retrieval-Augmented Generation (RAG)

RAG architectures have evolved significantly since foundational work in 2020.
Key 2024 developments include:

- GraphRAG (Microsoft, mid-2024): Extracts knowledge graphs from text for hierarchical retrieval, addressing semantic gap challenges between queries and documents
- RAPTOR (Sarthi et al., 2024): Recursive abstractive processing for tree-organized retrieval, enabling multi-level document summarization
- Agentic RAG (2024): Integration of autonomous agents with RAG pipelines for dynamic retrieval triggering based on generation uncertainty
- RAG Evaluation Frameworks: RAGAS for reference-free metrics and the RAGTruth corpus (Niu et al., 2024) for hallucination analysis

ATLAS vs RAG: ATLAS extends RAG principles to structured database retrieval:

- Retrieves from SQL databases rather than document stores
- Uses a refinement agent (Stage 2) to validate retrieval accuracy, analogous to RAG reranking
- Generates conversational responses grounded in retrieved structured data

3. Problem Statement

3.1 Hypothetical Pain Points

Based on analysis of typical ITSM workflows in mid-to-large enterprises, ATLAS is designed to address the following anticipated challenges:

| Metric | Typical Baseline | Target | Projected Outcome |
| --- | --- | --- | --- |
| Time to answer, "How many tickets does X have?" | 10-15 minutes | < 30 seconds | < 10 seconds |
| Ad-hoc report requests to IT team | 30-50/week | < 10/week | ~85% reduction |
| Self-service analytics adoption | 10-20% | > 60% | 70-80% |
| Manager access to real-time metrics | Limited | All managers | Broad access |
3.2 Design Requirements

| Requirement | Description | Priority | Validation Method |
| --- | --- | --- | --- |
| Natural Language Understanding | Parse unstructured queries into structured parameters | Critical | Accuracy testing |
| Context Awareness | Understand follow-up questions referencing previous queries | Critical | Multi-turn testing |
| Real-time Data | Query data current within 5 minutes | High | Sync latency measurement |
| Concurrent Access | Support 50+ simultaneous users | High | Load testing |
| Response Time | < 10 seconds for 95th percentile | High | Performance monitoring |
| Export Capability | Downloadable CSV results | Medium | Functional testing |
| Personalization | Support "my tickets" queries | Medium | User acceptance testing |
4. System Architecture

4.1 High-Level Architecture

Figure 1: System high-level architecture

4.2 Technology Stack

| Layer | Technology | Version | Rationale |
| --- | --- | --- | --- |
| Runtime | .NET | 8.0 LTS | Enterprise support, performance |
| Framework | ASP.NET Core | 8.0 | Native async, DI, middleware |
| ORM | Entity Framework Core | 8.0 | Type-safe queries, migrations |
| Database | SQL Server | 2019+ | Enterprise reliability, JSON support |
| AI Platform | Azure AI Agents | Preview | Persistent threads, managed infrastructure |
| Caching | IMemoryCache | Built-in | Token caching, low latency |
| Authentication | DefaultAzureCredential | Latest | Managed identity support |
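To show how these layers might be composed, the sketch below is a hypothetical Program.cs wiring using ASP.NET Core dependency injection. The AppDbContext and RequestStorageService types follow the names used in later sections of this paper; the connection-string name and overall registration choices are illustrative assumptions, not a prescribed configuration.

```csharp
// Hypothetical Program.cs sketch for the stack in Section 4.2 (illustrative only).
using Azure.Identity;
using Microsoft.EntityFrameworkCore;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddControllers();
builder.Services.AddMemoryCache();                        // token caching (Section 8.3)
builder.Services.AddHttpClient();                         // IHttpClientFactory for ITSM API calls
builder.Services.AddDbContext<AppDbContext>(options =>    // EF Core over SQL Server
    options.UseSqlServer(builder.Configuration.GetConnectionString("ItsmMirror")));
builder.Services.AddScoped<RequestStorageService>();      // local ticket store used by background sync
builder.Services.AddSingleton(new DefaultAzureCredential()); // managed identity for Azure AI Agents

var app = builder.Build();
app.MapControllers();
app.Run();
```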
5. Multi-Agent AI Pipeline

5.1 Pipeline Overview

ATLAS employs a three-stage AI pipeline where each agent has a specialized role:

Figure 2: Multi-agent pipeline

5.2 Agent Specialization Rationale

| Aspect | Single Agent | Three Agents (ATLAS) |
| --- | --- | --- |
| Prompt Size | 3000+ tokens | ~800 tokens each |
| Failure Isolation | All-or-nothing | Isolated failure points |
| Debugging | Opaque | Clear stage identification |
| Quality | Compromised by competing objectives | Optimized per stage |
| Latency | Single long call | Parallelization potential |
| Cost | Higher per-call (longer prompts) | Lower aggregate |
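The following sketch illustrates how the three stages might be orchestrated end to end. The method names AnalyzeQueryWithAgent, GenerateConversationalResponseWithAgent, and BuildConversationContext follow the naming used elsewhere in this paper; GetConversation, BuildConversationalReply, RetrieveDataAsync, and RefineRetrievedData are hypothetical helpers introduced here for illustration. This is a sketch of the control flow, not a definitive implementation.

```csharp
// Illustrative orchestration of the three-stage pipeline (names partly hypothetical).
public async Task<string> ProcessQueryAsync(string userQuery, string userEmail, string sessionId)
{
    // Stage 1: query analysis - natural language into structured parameters,
    // with conversation history injected for follow-up understanding (Section 7).
    var context = BuildConversationContext(GetConversation(sessionId));
    var analysis = await AnalyzeQueryWithAgent(userQuery, userEmail, context);

    // Conversational intents (greeting, help, thanks) skip data retrieval entirely.
    if (analysis.IsConversational)
        return BuildConversationalReply(analysis.ConversationalIntent);

    // Retrieval: structured parameters into database results (EF Core queries, Section 9).
    var data = await RetrieveDataAsync(analysis, userEmail);

    // Stage 2: refinement - validate that retrieved data matches the analyzed intent.
    var refined = await RefineRetrievedData(analysis, data);

    // Stage 3: response generation - structured results into conversational prose.
    return await GenerateConversationalResponseWithAgent(userQuery, analysis, refined);
}
```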
5.3 Query Type Classification

Query Type Taxonomy:

CONVERSATIONAL (No data retrieval)
- greeting: "hello", "hi", "good morning"
- help: "what can you do", "help", "?"
- thanks: "thank you", "thanks", "thx"
- farewell: "goodbye", "bye", "see you"
- unclear: ambiguous or very short queries (< 5 chars)

REQUEST_SEARCH (Default for data queries) [~65% of queries]
- By subject: "tickets with 'error' in subject"
- By technician: "tickets assigned to [name]"
- By requester: "tickets from [name]"
- By status: "open tickets", "closed requests"
- By date: "tickets from last week"
- Personalized: "my tickets", "assigned to me"

TOP_TECHNICIANS [~15% of queries]
- "top 10 technicians this month"
- "best performing technicians"
- "technician rankings past week"

INACTIVE_TECHNICIANS [~8% of queries]
- "technicians with no activity for 14 days"
- "inactive technicians this month"
- "who hasn't worked on tickets lately"

INFLUX_REQUESTS [~7% of queries]
- "busiest hour yesterday"
- "request volume by day this week"
- "when do we get the most tickets"

TOP_REQUEST_AREAS [~5% of queries]
- "most common request types today"
- "top categories this month"
- "what do users ask about most"
6. Prompt Engineering

6.1 Query Analysis Agent Prompt

The following is the proposed prompt template for the Query Analysis Agent:

```csharp
// Proposed implementation - AnalyzeQueryWithAgent method
var instructions = $@"You are a query analysis agent for an IT service desk system.

CRITICAL: Today's date is {currentDate}. Yesterday was {yesterdayDate}.
Current time is {currentTime} UTC. This month started on {thisMonthStart:yyyy-MM-dd}.

Your task: Analyze the user query and return ONLY a valid JSON object with NO explanations or markdown.

Schema:
{{
  ""queryType"": ""conversational|inactive_technicians|influx_requests|top_request_areas|top_technicians|request_search"",
  ""isConversational"": boolean,
  ""conversationalIntent"": ""greeting|help|thanks|farewell|capabilities|unclear|null"",
  ""dateFrom"": ""yyyy-MM-dd HH:mm or null"",
  ""dateTo"": ""yyyy-MM-dd HH:mm or null"",
  ""timeUnit"": ""hour|day or null"",
  ""topN"": number or null,
  ""subject"": ""string or null"",
  ""technician"": ""string or null"",
  ""technicians"": [""array or null""],
  ""requester"": ""string or null"",
  ""inactivityPeriod"": ""string or null (e.g., '14 days', '2 weeks')"",
  ""isUserRequest"": boolean,
  ""isUserTechnician"": boolean,
  ""status"": ""open|closed|null""
}}

=== CONVERSATIONAL MESSAGES (CHECK FIRST) ===
For greetings, help requests, thanks, or unclear messages, use queryType: ""conversational""
Examples:
  ""hello"", ""hi"" → {{""queryType"": ""conversational"", ""isConversational"": true, ""conversationalIntent"": ""greeting""}}
  ""help"", ""what can you do"" → {{""queryType"": ""conversational"", ""conversationalIntent"": ""help""}}

=== DATE PARSING RULES (CRITICAL) ===
'today' → {currentDate} 00:00 to {currentDate} 23:59
'yesterday' → {yesterdayDate} 00:00 to {yesterdayDate} 23:59
'this week' → {today.AddDays(-7):yyyy-MM-dd} 00:00 to {currentDate} 23:59
'this month' → {thisMonthStart:yyyy-MM-dd} 00:00 to {currentDate} 23:59

=== FOLLOW-UP QUERY HANDLING (VERY IMPORTANT) ===
If the conversation context shows a previous query, and the user asks a follow-up:
'how many of them are open' → Keep previous technician filter, ADD status: 'open'
'show me closed ones' → Keep previous filters, CHANGE status to 'closed'
ALWAYS preserve relevant filters from context for follow-up questions.

Output ONLY the JSON object. No markdown code blocks, no explanations.";
```
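Even with the "no markdown" constraint, model output occasionally arrives wrapped in code fences or with minor formatting noise. The sketch below shows one defensive way the agent's JSON could be parsed into a QueryAnalysis object matching the schema above; the ParseAnalysisResponse name and fence-stripping approach are assumptions for illustration, and a null result is taken to mean "defer to the heuristic fallback parser in Section 9.3".

```csharp
// Illustrative sketch: defensively parse the analysis agent's output (assumed helper).
private QueryAnalysis? ParseAnalysisResponse(string agentOutput)
{
    // Strip stray markdown fences and a leading "json" tag if the model ignores the instruction.
    var json = agentOutput.Trim().Trim('`', ' ', '\r', '\n');
    if (json.StartsWith("json", StringComparison.OrdinalIgnoreCase))
        json = json.Substring(4).TrimStart();

    try
    {
        return JsonSerializer.Deserialize<QueryAnalysis>(json,
            new JsonSerializerOptions { PropertyNameCaseInsensitive = true });
    }
    catch (JsonException)
    {
        // Malformed output: caller falls back to the heuristic parser (Section 9.3).
        return null;
    }
}
```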
6.2 Conversation Agent Prompt

```csharp
// Proposed implementation - GenerateConversationalResponseWithAgent method
var instructions = @"You are a friendly IT service desk assistant.
Generate a warm, conversational response that feels like talking to a helpful colleague.

Follow these guidelines:

TONE:
- Be warm and personable, not robotic
- Use natural language, not bullet-heavy lists
- Sound like a knowledgeable colleague sharing insights

STRUCTURE (2-4 paragraphs):
- Brief, natural acknowledgment of what they asked
- Key finding or number prominently displayed
- Highlight 3-5 notable items naturally in prose
- Offer to help further

EXAMPLES OF GOOD RESPONSES:

For ""how many tickets assigned to TechUser1 this month"":
""Looking at TechUser1's workload this month, I found 72 tickets assigned to them.
That's a solid amount of activity! The tickets cover a range of areas including
password resets, hardware requests, and software installations. You can download
the full breakdown in the CSV file. Would you like me to filter these by status or category?""

AVOID:
- Starting with ""I processed your query successfully""
- Excessive bullet points
- Robotic language like ""Key findings:""
- Generic phrases like ""Here's what I found""

Remember: Sound human, be helpful, share insights naturally.";
```
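The conversation agent only sees what is passed to it, so the structured retrieval results have to be packaged into its message. The sketch below shows one way that packaging might look; the QueryResultData shape, SendToAgentAsync helper, and _conversationAgentId field are assumptions for illustration, not confirmed ATLAS APIs.

```csharp
// Illustrative sketch: package structured results for the Stage 3 conversation agent.
private async Task<string> ComposeAndSendConversationMessage(
    string userQuery, QueryAnalysis analysis, QueryResultData data)
{
    var payload = new
    {
        UserQuery = userQuery,
        QueryAnalysis = analysis,            // the filters actually applied
        RequestsFound = data.TotalCount,     // headline number for the response
        SampleItems = data.Items.Take(5),    // a few notable items to mention in prose
        CsvAvailable = data.TotalCount > 0   // whether an export accompanies the reply
    };

    var message = "Generate a conversational answer grounded ONLY in this data:\n"
        + JsonSerializer.Serialize(payload, new JsonSerializerOptions { WriteIndented = true });

    return await SendToAgentAsync(_conversationAgentId, message);
}
```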
6.3 Prompt Design Principles

| Principle | Implementation | Rationale |
| --- | --- | --- |
| Temporal Grounding | Inject current date/time dynamically | Enables relative date parsing ("yesterday", "this week") |
| Schema Enforcement | Explicit JSON schema with examples | Reduces parsing errors by 40% |
| Negative Examples | "AVOID" section listing anti-patterns | Prevents common LLM verbosity issues |
| Context Injection | Structured conversation history format | Enables follow-up query understanding |
| Output Constraints | "ONLY return JSON, NO markdown" | Simplifies response parsing |
7. Conversation Context Management

7.1 Context Building Algorithm

```csharp
// Proposed implementation - BuildConversationContext method
private string BuildConversationContext(ChatConversation conversation)
{
    if (conversation?.Messages == null || conversation.Messages.Count < 2)
        return string.Empty;

    // Take last 10 messages for context (configurable)
    var recentMessages = conversation.Messages
        .OrderByDescending(m => m.SentAt)
        .Take(10)
        .OrderBy(m => m.SentAt)   // Restore chronological order
        .ToList();

    var sb = new StringBuilder();
    sb.AppendLine("=== CONVERSATION HISTORY ===");

    foreach (var msg in recentMessages)
    {
        var role = msg.Role == "user" ? "USER" : "ASSISTANT";
        var content = msg.Content;

        // For agent messages, extract structured context from JSON
        if (role == "ASSISTANT" && content.StartsWith("{"))
        {
            try
            {
                var doc = JsonDocument.Parse(content);
                var root = doc.RootElement;

                // Extract query analysis parameters
                if (root.TryGetProperty("QueryAnalysis", out var analysisElem))
                {
                    var queryType = analysisElem.TryGetProperty("queryType", out var qt) ? qt.GetString() : "";
                    var technician = analysisElem.TryGetProperty("technician", out var tech) ? tech.GetString() : "";
                    var status = analysisElem.TryGetProperty("status", out var st) ? st.GetString() : "";
                    var dateFrom = analysisElem.TryGetProperty("dateFrom", out var df) ? df.GetString() : "";
                    var dateTo = analysisElem.TryGetProperty("dateTo", out var dt) ? dt.GetString() : "";

                    sb.AppendLine($"[Previous Query: type={queryType}, " +
                                  $"technician={technician}, status={status}, " +
                                  $"period={dateFrom} to {dateTo}]");
                }

                // Extract technician names for pronoun resolution
                if (root.TryGetProperty("Data", out var dataElem))
                {
                    if (dataElem.TryGetProperty("TopTechnicians", out var topTechs))
                    {
                        var names = topTechs.EnumerateArray()
                            .Take(20)
                            .Select(t => t.GetProperty("Technician").GetString())
                            .Where(n => !string.IsNullOrEmpty(n))
                            .ToList();
                        sb.AppendLine($"[Technicians mentioned: {string.Join(", ", names)}]");
                    }

                    if (dataElem.TryGetProperty("RequestsFound", out var reqFound))
                    {
                        sb.AppendLine($"[Found {reqFound.GetInt32()} requests]");
                    }
                }

                // Include truncated conversational response
                if (root.TryGetProperty("ConversationalResponse", out var resp))
                {
                    content = resp.GetString() ?? "";
                    if (content.Length > 300)
                        content = content.Substring(0, 300) + "...";
                }
            }
            catch { /* Use raw content on parse failure */ }
        }

        sb.AppendLine($"{role}: {content}");
    }

    sb.AppendLine("=== END HISTORY ===");
    sb.AppendLine("\nIMPORTANT: Use this context to understand references like " +
                  "'them', 'those', 'the technicians', 'how many are open', etc.");
    sb.AppendLine("If user asks a follow-up like 'how many of them are open', " +
                  "apply the previous filters PLUS the new 'open' status filter.");

    return sb.ToString();
}
```
"USER" : "ASSISTANT"; var content = msg.Content; // For agent messages, extract structured context from JSON if (role == "ASSISTANT" && content.StartsWith("{")) { try { var doc = JsonDocument.Parse(content); var root = doc.RootElement; // Extract query analysis parameters if (root.TryGetProperty("QueryAnalysis", out var analysisElem)) { var queryType = analysisElem.TryGetProperty("queryType", out var qt) ? qt.GetString() : ""; var technician = analysisElem.TryGetProperty("technician", out var tech) ? tech.GetString() : ""; var status = analysisElem.TryGetProperty("status", out var st) ? st.GetString() : ""; var dateFrom = analysisElem.TryGetProperty("dateFrom", out var df) ? df.GetString() : ""; var dateTo = analysisElem.TryGetProperty("dateTo", out var dt) ? dt.GetString() : ""; sb.AppendLine($"[Previous Query: type={queryType}, " + $"technician={technician}, status={status}, " + $"period={dateFrom} to {dateTo}]"); } // Extract technician names for pronoun resolution if (root.TryGetProperty("Data", out var dataElem)) { if (dataElem.TryGetProperty("TopTechnicians", out var topTechs)) { var names = topTechs.EnumerateArray() .Take(20) .Select(t => t.GetProperty("Technician").GetString()) .Where(n => !string.IsNullOrEmpty(n)) .ToList(); sb.AppendLine($"[Technicians mentioned: {string.Join(", ", names)}]"); } if (dataElem.TryGetProperty("RequestsFound", out var reqFound)) { sb.AppendLine($"[Found {reqFound.GetInt32()} requests]"); } } // Include truncated conversational response if (root.TryGetProperty("ConversationalResponse", out var resp)) { content = resp.GetString() ?? ""; if (content.Length > 300) content = content.Substring(0, 300) + "..."; } } catch { /* Use raw content on parse failure */ } } sb.AppendLine($"{role}: {content}"); } sb.AppendLine("=== END HISTORY ==="); sb.AppendLine("\nIMPORTANT: Use this context to understand references like " + "'them', 'those', 'the technicians', 'how many are open', etc."); sb.AppendLine("If user asks a follow-up like 'how many of them are open', " + "apply the previous filters PLUS the new 'open' status filter."); return sb.ToString(); } 7.2 Follow-Up Query Resolution REQUEST 1: REQUEST 1: { "query": "what tickets do i have assigned to me", "sessionId": "", "userEmail": "user@example.com" } { "query": "what tickets do i have assigned to me", "sessionId": "", "userEmail": "user@example.com" } RESPONSE 1: RESPONSE 1: { "sessionId": "abc12345-1234-5678-abcd-123456789abc", "conversationalResponse": "You have ~65 tickets assigned to you..." } { "sessionId": "abc12345-1234-5678-abcd-123456789abc", "conversationalResponse": "You have ~65 tickets assigned to you..." } REQUEST 2 (Follow-up): REQUEST 2 (Follow-up): { "query": "how many of them are open", "sessionId": "abc12345-1234-5678-abcd-123456789abc", "userEmail": "user@example.com" } { "query": "how many of them are open", "sessionId": "abc12345-1234-5678-abcd-123456789abc", "userEmail": "user@example.com" } CONTEXT PASSED TO AGENT: CONTEXT PASSED TO AGENT: === CONVERSATION HISTORY === [Previous Query: type=request_search, technician=null, status=null, period=2025-10-28 to 2025-11-27] [isUserTechnician=true, userEmail=user@example.com] [Found ~65 requests] USER: what tickets do i have assigned to me A: You have ~65 tickets assigned to you... 
RESOLUTION:

- "them" → previous result set (~65 tickets)
- Previous filter: isUserTechnician=true (preserved)
- New filter: status="open" (added)

RESPONSE 2:

```json
{
  "conversationalResponse": "You have ~15 tickets assigned to you... These are filtered to show only open tickets."
}
```
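One way filter inheritance could be realized in code is to merge the previous turn's analysis with the new one, letting newly stated constraints win and inherited constraints fill the gaps. The sketch below is illustrative; the MergeWithPreviousTurn helper and its merge policy are assumptions, not a confirmed ATLAS API.

```csharp
// Illustrative sketch of filter inheritance for follow-up queries (assumed helper).
private static QueryAnalysis MergeWithPreviousTurn(QueryAnalysis current, QueryAnalysis previous)
{
    if (previous == null) return current;

    return new QueryAnalysis
    {
        QueryType        = current.QueryType ?? previous.QueryType,
        // Inherit prior filters when the follow-up does not restate them ("them", "those")
        Technician       = current.Technician ?? previous.Technician,
        Requester        = current.Requester ?? previous.Requester,
        DateFrom         = current.DateFrom ?? previous.DateFrom,
        DateTo           = current.DateTo ?? previous.DateTo,
        IsUserTechnician = current.IsUserTechnician || previous.IsUserTechnician,
        // Newly stated constraints always win, e.g. "how many of them are open" adds status=open
        Status           = current.Status ?? previous.Status
    };
}
```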
8. Data Synchronization Strategy

8.1 Background Sync Implementation

```csharp
// Proposed implementation - NaturalQuery method
// Fire-and-forget background sync
_ = Task.Run(async () =>
{
    try
    {
        using var scope = _serviceProvider.CreateScope();
        var backgroundDbContext = scope.ServiceProvider
            .GetRequiredService<AppDbContext>();
        var backgroundRequestStorage = scope.ServiceProvider
            .GetRequiredService<RequestStorageService>();

        await SyncRequestsInBackgroundSafe(backgroundDbContext, backgroundRequestStorage);
    }
    catch (Exception ex)
    {
        Console.WriteLine($"Background sync failed: {ex.Message}");
        // Non-blocking - user query continues regardless
    }
});

// Actual sync logic
private async Task SyncRequestsInBackgroundSafe(
    AppDbContext dbContext,
    RequestStorageService requestStorageService)
{
    var lastStoredDate = await requestStorageService.GetLastStoredDateAsync();

    // 5-minute overlap for safety
    DateTimeOffset dateFrom = lastStoredDate.HasValue
        ? lastStoredDate.Value.AddMinutes(-5)
        : DateTimeOffset.UtcNow.AddMonths(-1);
    var dateTo = DateTimeOffset.UtcNow;

    var requests = await FetchRequestsForDateRange(dateFrom, dateTo);

    foreach (var req in requests)
    {
        var requestId = req["id"].ToString();
        if (!await requestStorageService.RequestExistsAsync(requestId))
        {
            await requestStorageService.StoreRequestAsync(req);
        }
    }

    Console.WriteLine($"Background sync completed: Fetched {requests.Count} requests");
}
```

8.2 Hybrid Architecture

Figure 3: Hybrid data synchronization architecture

8.3 OAuth Token Caching

```csharp
// Proposed implementation - GetAccessTokenAsync method
private async Task<string> GetAccessTokenAsync()
{
    const string tokenCacheKey = "ZohoAccessToken";
    const string expirationCacheKey = "ZohoTokenExpiration";

    // Check cache first
    if (_cache.TryGetValue(tokenCacheKey, out string cachedToken) &&
        _cache.TryGetValue(expirationCacheKey, out DateTime cachedExpiration) &&
        DateTime.UtcNow < cachedExpiration)
    {
        return cachedToken;   // Return cached token
    }

    // Refresh token
    var client = _httpClientFactory.CreateClient();
    var formContent = new FormUrlEncodedContent(new[]
    {
        new KeyValuePair<string, string>("refresh_token", _refreshToken),
        new KeyValuePair<string, string>("grant_type", "refresh_token"),
        new KeyValuePair<string, string>("client_id", _clientId),
        new KeyValuePair<string, string>("client_secret", _clientSecret),
        new KeyValuePair<string, string>("redirect_uri", _redirectUri)
    });

    var response = await client.PostAsync(
        "https://accounts.zoho.com/oauth/v2/token", formContent);

    // ... error handling ...

    string accessToken = /* parsed from response */;
    int expiresIn = /* parsed, default 3600 */;

    // Cache with 60-second buffer before actual expiration
    var expiration = DateTime.UtcNow.AddSeconds(expiresIn - 60);
    _cache.Set(tokenCacheKey, accessToken,
        new MemoryCacheEntryOptions { AbsoluteExpiration = expiration });
    _cache.Set(expirationCacheKey, expiration,
        new MemoryCacheEntryOptions { AbsoluteExpiration = expiration });

    return accessToken;
}
```
9. Query Processing Engine

9.1 Status Normalization

```csharp
// Proposed implementation - ApplyStatusFilter method
private IQueryable<ItsmTicket> ApplyStatusFilter(
    IQueryable<ItsmTicket> query, string statusFilter)
{
    var lowerStatus = statusFilter.ToLower();

    if (lowerStatus == "open")
    {
        return query.Where(r =>
            r.Status.ToLower() == "open" ||
            r.Status.ToLower() == "in progress" ||
            r.Status.ToLower() == "pending" ||
            r.Status.ToLower().Contains("open") ||
            // Also check JSON data for nested status
            r.JsonData.Contains("\"Status\":\"Open\"") ||
            r.JsonData.Contains("\"Status\":\"In Progress\"") ||
            r.JsonData.Contains("\"Status\":\"Pending\"")
        );
    }
    else if (lowerStatus == "closed")
    {
        return query.Where(r =>
            r.Status.ToLower() == "closed" ||
            r.Status.ToLower() == "resolved" ||
            r.Status.ToLower() == "completed" ||
            r.Status.ToLower().Contains("closed") ||
            r.JsonData.Contains("\"Status\":\"Closed\"") ||
            r.JsonData.Contains("\"Status\":\"Resolved\"") ||
            r.JsonData.Contains("\"Status\":\"Completed\"")
        );
    }

    return query;
}
```

9.2 Personalization ("My Tickets") Implementation

```csharp
// Proposed implementation - GetRequestSearchData method
// Enhanced personalization filtering - search in JsonData
if (analysis.IsUserTechnician && !string.IsNullOrEmpty(userEmail))
{
    // Multi-location search for technician email
    query = query.Where(r =>
        r.TechnicianEmail == userEmail ||
        r.JsonData.Contains($"\"email_id\":\"{userEmail}\"") ||
        r.JsonData.Contains(userEmail)
    );
}
else if (analysis.IsUserRequest && !string.IsNullOrEmpty(userEmail))
{
    // Search for requester email
    query = query.Where(r =>
        r.RequesterEmail == userEmail ||
        r.JsonData.Contains($"\"email_id\":\"{userEmail}\"") ||
        r.JsonData.Contains(userEmail)
    );
}

// Post-processing verification for edge cases
if (analysis.IsUserTechnician && !string.IsNullOrEmpty(userEmail))
{
    requests = requests.Where(r =>
    {
        // Check direct column match
        if (r.TechnicianEmail?.Equals(userEmail, StringComparison.OrdinalIgnoreCase) == true)
            return true;

        // Check JsonData for technician email
        if (!string.IsNullOrEmpty(r.JsonData))
        {
            try
            {
                var data = JsonSerializer.Deserialize<ItsmTicketData>(r.JsonData);
                if (data?.Technician?.EmailId?.Equals(userEmail, StringComparison.OrdinalIgnoreCase) == true)
                    return true;
            }
            catch { }
        }

        return false;
    }).ToList();
}
```
r.JsonData.Contains($"\"email_id\":\"{userEmail}\"") || r.JsonData.Contains(userEmail) ); } else if (analysis.IsUserRequest && !string.IsNullOrEmpty(userEmail)) { // Search for requester email query = query.Where(r => r.RequesterEmail == userEmail || r.JsonData.Contains($"\"email_id\":\"{userEmail}\"") || r.JsonData.Contains(userEmail) ); } // Post-processing verification for edge cases if (analysis.IsUserTechnician && !string.IsNullOrEmpty(userEmail)) { requests = requests.Where(r => { // Check direct column match if (r.TechnicianEmail?.Equals(userEmail, StringComparison.OrdinalIgnoreCase) == true) return true; // Check JsonData for technician email if (!string.IsNullOrEmpty(r.JsonData)) { try { var data = JsonSerializer.Deserialize<ItsmTicketData>(r.JsonData); if (data?.Technician?.EmailId?.Equals(userEmail, StringComparison.OrdinalIgnoreCase) == true) return true; } catch { } } return false; }).ToList(); } 9.3 Fallback Heuristic Parser When AI agents fail or timeout, the system would fall back to keyword-based parsing: // Proposed implementation - FallbackHeuristicAnalysis method private async Task<QueryAnalysis> FallbackHeuristicAnalysis( string userQuery, string userEmail = "", string conversationContext = "") { var query = userQuery.ToLowerInvariant().Trim(); var now = DateTime.UtcNow; var today = now.Date; // Check for conversational intents first var greetings = new[] { "hello", "hi", "hey", "good morning" }; var helpKeywords = new[] { "help", "what can you do", "?" }; if (greetings.Any(g => query == g || query.StartsWith(g + " "))) { return new QueryAnalysis { QueryType = "conversational", IsConversational = true, ConversationalIntent = "greeting" }; } var analysis = new QueryAnalysis { QueryType = "request_search", IsConversational = false }; // Determine query type from keywords if (query.Contains("inactive") || query.Contains("no activity")) analysis.QueryType = "inactive_technicians"; else if (query.Contains("influx") || query.Contains("busiest")) analysis.QueryType = "influx_requests"; else if (query.Contains("top tech") || query.Contains("ranking")) analysis.QueryType = "top_technicians"; // Status detection if (query.Contains("open")) analysis.Status = "open"; else if (query.Contains("closed") || query.Contains("resolved")) analysis.Status = "closed"; // Date handling if (query.Contains("yesterday")) { analysis.DateFrom = today.AddDays(-1).ToString("yyyy-MM-dd") + " 00:00"; analysis.DateTo = today.AddDays(-1).ToString("yyyy-MM-dd") + " 23:59"; } else if (query.Contains("this month")) { var monthStart = new DateTime(now.Year, now.Month, 1); analysis.DateFrom = monthStart.ToString("yyyy-MM-dd") + " 00:00"; analysis.DateTo = today.ToString("yyyy-MM-dd") + " 23:59"; } // ... additional date patterns ... 
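The post-processing step deserializes JsonData into an ItsmTicketData type that is not defined in this excerpt. A minimal sketch of what such a DTO might look like is shown below, assuming snake_case property names in the source JSON; the actual type and field set are not specified in this whitepaper.

// Hypothetical DTO sketch for the JsonData payload used above.
// Requires using System.Text.Json.Serialization; names and attributes are assumptions,
// and only the fields touched by the personalization filter are shown.
public class ItsmTicketData
{
    [JsonPropertyName("technician")]
    public ItsmPerson? Technician { get; set; }

    [JsonPropertyName("requester")]
    public ItsmPerson? Requester { get; set; }

    [JsonPropertyName("status")]
    public ItsmNamedValue? Status { get; set; }
}

public class ItsmPerson
{
    [JsonPropertyName("name")]
    public string? Name { get; set; }

    [JsonPropertyName("email_id")]
    public string? EmailId { get; set; }
}

public class ItsmNamedValue
{
    [JsonPropertyName("name")]
    public string? Name { get; set; }
}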
9.3 Fallback Heuristic Parser

When AI agents fail or time out, the system would fall back to keyword-based parsing:

// Proposed implementation - FallbackHeuristicAnalysis method
private async Task<QueryAnalysis> FallbackHeuristicAnalysis(
    string userQuery, string userEmail = "", string conversationContext = "")
{
    var query = userQuery.ToLowerInvariant().Trim();
    var now = DateTime.UtcNow;
    var today = now.Date;

    // Check for conversational intents first
    var greetings = new[] { "hello", "hi", "hey", "good morning" };
    var helpKeywords = new[] { "help", "what can you do", "?" };

    if (greetings.Any(g => query == g || query.StartsWith(g + " ")))
    {
        return new QueryAnalysis
        {
            QueryType = "conversational",
            IsConversational = true,
            ConversationalIntent = "greeting"
        };
    }

    var analysis = new QueryAnalysis
    {
        QueryType = "request_search",
        IsConversational = false
    };

    // Determine query type from keywords
    if (query.Contains("inactive") || query.Contains("no activity"))
        analysis.QueryType = "inactive_technicians";
    else if (query.Contains("influx") || query.Contains("busiest"))
        analysis.QueryType = "influx_requests";
    else if (query.Contains("top tech") || query.Contains("ranking"))
        analysis.QueryType = "top_technicians";

    // Status detection
    if (query.Contains("open"))
        analysis.Status = "open";
    else if (query.Contains("closed") || query.Contains("resolved"))
        analysis.Status = "closed";

    // Date handling
    if (query.Contains("yesterday"))
    {
        analysis.DateFrom = today.AddDays(-1).ToString("yyyy-MM-dd") + " 00:00";
        analysis.DateTo = today.AddDays(-1).ToString("yyyy-MM-dd") + " 23:59";
    }
    else if (query.Contains("this month"))
    {
        var monthStart = new DateTime(now.Year, now.Month, 1);
        analysis.DateFrom = monthStart.ToString("yyyy-MM-dd") + " 00:00";
        analysis.DateTo = today.ToString("yyyy-MM-dd") + " 23:59";
    }
    // ... additional date patterns ...

    // Parse context for follow-up queries
    if (!string.IsNullOrEmpty(conversationContext) &&
        (query.Contains("them") || query.Contains("those")))
    {
        var techMatch = Regex.Match(conversationContext, @"technician=([^,\]]+)");
        if (techMatch.Success && techMatch.Groups[1].Value != "null")
        {
            analysis.Technician = techMatch.Groups[1].Value.Trim();
        }
    }

    return analysis;
}

10. Concurrency & Thread Safety

10.1 The Concurrency Problem

Azure AI Agents (and similar platforms) restrict concurrent operations on the same thread:

ERROR: "Can't add message to thread_xyz while a run is active"

This occurs when multiple requests attempt to use the same conversation thread simultaneously.
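As an illustration of how the conflict arises, the hypothetical sketch below shows two uncoordinated requests reaching the same persistent thread; it reuses the client calls that appear elsewhere in this paper, and the second message typically fails with the error above because the first run is still active.

// Hypothetical sketch of the failure mode (not part of the proposed design):
// two requests racing on the same agent thread without per-thread locking.
async Task SimulateThreadConflictAsync(PersistentAgentsClient client, string threadId, string agentId)
{
    Task HandleQueryAsync(string userQuery) =>
        Task.Run(async () =>
        {
            await client.Messages.CreateMessageAsync(threadId, MessageRole.User, userQuery);
            await client.Runs.CreateRunAsync(threadId, agentId); // starts an active run
        });

    // Without the semaphore described in 10.2, these race against each other.
    await Task.WhenAll(
        HandleQueryAsync("how many tickets are open?"),
        HandleQueryAsync("show me yesterday's requests"));
}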
10.2 Solution: Per-Thread Semaphores

// Proposed implementation
// Static dictionary of locks, one per thread
private static readonly Dictionary<string, SemaphoreSlim> _threadLocks = new();
private static readonly object _lockDictLock = new();

private SemaphoreSlim GetThreadLock(string threadId)
{
    lock (_lockDictLock) // Thread-safe dictionary access
    {
        if (!_threadLocks.ContainsKey(threadId))
        {
            // Create semaphore allowing 1 concurrent access
            _threadLocks[threadId] = new SemaphoreSlim(1, 1);
        }
        return _threadLocks[threadId];
    }
}

// Usage in NaturalQuery endpoint
var threadLock = GetThreadLock(threadId);

// Acquire lock with timeout
if (!await threadLock.WaitAsync(TimeSpan.FromSeconds(90)))
{
    return StatusCode(503, new
    {
        Error = "System is busy processing a previous request.",
        ConversationalResponse = "I'm currently processing your previous " +
            "request. Please wait a moment and try again."
    });
}

try
{
    // Wait for any active runs to complete
    await WaitForActiveRunsToComplete(agentsClient, threadId);

    // Process query safely
    var queryAnalysis = await AnalyzeQueryWithAgent(...);
    // ... rest of processing ...
}
finally
{
    threadLock.Release(); // Always release
}
10.3 Active Run Detection

// Proposed implementation - WaitForActiveRunsToComplete method
private async Task WaitForActiveRunsToComplete(
    PersistentAgentsClient client, string threadId, int maxWaitSeconds = 60)
{
    var startTime = DateTime.UtcNow;

    while ((DateTime.UtcNow - startTime).TotalSeconds < maxWaitSeconds)
    {
        try
        {
            var runsAsync = client.Runs.GetRunsAsync(threadId, limit: 10);
            var hasActiveRun = false;

            await foreach (var run in runsAsync)
            {
                if (run.Status == RunStatus.InProgress ||
                    run.Status == RunStatus.Queued ||
                    run.Status == RunStatus.RequiresAction)
                {
                    hasActiveRun = true;
                    Console.WriteLine($"Waiting for active run {run.Id} " +
                        $"with status {run.Status}...");
                    break;
                }
            }

            if (!hasActiveRun) return; // Safe to proceed

            await Task.Delay(1000); // Poll every 1 second
        }
        catch (Exception ex)
        {
            Console.WriteLine($"Error checking run status: {ex.Message}");
            await Task.Delay(500);
        }
    }

    Console.WriteLine($"Timeout waiting for active runs on thread {threadId}");
}

10.4 Retry with Exponential Backoff
// Proposed implementation - RunAgentAsync method
private async Task<List<string>> RunAgentAsync(
    PersistentAgentsClient client, string threadId, string agentId,
    string userMessage, string additionalInstructions)
{
    var responses = new List<string>();
    int maxRetries = 3;
    int currentRetry = 0;

    while (currentRetry < maxRetries)
    {
        try
        {
            // Add message to thread
            await client.Messages.CreateMessageAsync(
                threadId, MessageRole.User, userMessage);

            // Create and run the agent
            var runResponse = await client.Runs.CreateRunAsync(
                threadId, agentId, additionalInstructions: additionalInstructions);
            var run = runResponse.Value;

            // Poll for completion (max 75 seconds)
            var start = DateTime.UtcNow;
            var maxDuration = TimeSpan.FromSeconds(75);

            while (run.Status == RunStatus.Queued || run.Status == RunStatus.InProgress)
            {
                if (DateTime.UtcNow - start > maxDuration)
                {
                    responses.Add("Agent timeout.");
                    return responses;
                }
                await Task.Delay(750);
                run = (await client.Runs.GetRunAsync(threadId, run.Id)).Value;
            }

            // Get response messages
            var messagesAsync = client.Messages.GetMessagesAsync(
                threadId, order: ListSortOrder.Descending);

            await foreach (var message in messagesAsync)
            {
                if (message.Role == MessageRole.Agent)
                {
                    foreach (var content in message.ContentItems)
                    {
                        if (content is MessageTextContent textContent)
                        {
                            responses.Add(textContent.Text);
                        }
                    }
                    break;
                }
            }

            return responses;
        }
        catch (RequestFailedException rfe)
            when (rfe.Message.Contains("while a run") && rfe.Message.Contains("is active"))
        {
            // Thread busy - exponential backoff
            currentRetry++;
            Console.WriteLine($"Thread busy, retry {currentRetry}/{maxRetries}");
            await Task.Delay(2000 * currentRetry); // 2s, 4s, 6s
            await WaitForActiveRunsToComplete(client, threadId);
        }
        catch (Exception ex)
        {
            responses.Add($"Error: {ex.Message}");
            return responses;
        }
    }

    responses.Add("Failed after maximum retries.");
    return responses;
}

11. Data Transformation Pipeline

11.1 JSON Flattening for Export
""; if ((fieldName.Contains("requester") || fieldName.Contains("technician")) && element.TryGetProperty("email_id", out var emailValue)) { var email = emailValue.GetString(); if (!string.IsNullOrEmpty(email) && !string.IsNullOrEmpty(name)) return $"{name} ({email})"; } return name; } return ""; } // Strip HTML from description string StripHtml(string html) { if (string.IsNullOrEmpty(html)) return ""; return Regex.Replace(html, "<[^>]*>", " ") .Replace("&nbsp;", " ") .Replace("&amp;", "&") .Replace(" ", " ") .Trim(); } try { var jsonDoc = JsonDocument.Parse(jsonData); foreach (var property in jsonDoc.RootElement.EnumerateObject()) { var key = property.Name; var value = property.Value; object parsedValue; switch (value.ValueKind) { case JsonValueKind.String: parsedValue = value.GetString() ?? ""; break; case JsonValueKind.Object: parsedValue = ExtractValueFromObject(value, key.ToLower()); break; case JsonValueKind.Array: var items = value.EnumerateArray() .Select(item => item.ValueKind == JsonValueKind.Object ? ExtractValueFromObject(item, key.ToLower()) : item.GetString() ?? "") .Where(s => !string.IsNullOrEmpty(s)); parsedValue = string.Join(", ", items); break; default: parsedValue = value.ToString(); break; } if (key.Equals("Description", StringComparison.OrdinalIgnoreCase)) parsedValue = StripHtml(parsedValue?.ToString() ?? ""); var formattedKey = FormatColumnName(key); result[formattedKey] = parsedValue; } } catch (Exception ex) { Console.WriteLine($"Error parsing JSON: {ex.Message}"); } return result; } // Proposed implementation - ParseRequestDetailsFromFlatJson method private Dictionary<string, object> ParseRequestDetailsFromFlatJson( string jsonData, dynamic basicRequest) { var result = new Dictionary<string, object>(); // Extract meaningful value from nested JSON objects string ExtractValueFromObject(JsonElement element, string fieldName) { // For time fields, prefer display_value if (element.TryGetProperty("display_value", out var displayValue)) return displayValue.GetString() ?? ""; // For entities (requester, technician), prefer name + email if (element.TryGetProperty("name", out var nameValue)) { var name = nameValue.GetString() ?? ""; if ((fieldName.Contains("requester") || fieldName.Contains("technician")) && element.TryGetProperty("email_id", out var emailValue)) { var email = emailValue.GetString(); if (!string.IsNullOrEmpty(email) && !string.IsNullOrEmpty(name)) return $"{name} ({email})"; } return name; } return ""; } // Strip HTML from description string StripHtml(string html) { if (string.IsNullOrEmpty(html)) return ""; return Regex.Replace(html, "<[^>]*>", " ") .Replace("&nbsp;", " ") .Replace("&amp;", "&") .Replace(" ", " ") .Trim(); } try { var jsonDoc = JsonDocument.Parse(jsonData); foreach (var property in jsonDoc.RootElement.EnumerateObject()) { var key = property.Name; var value = property.Value; object parsedValue; switch (value.ValueKind) { case JsonValueKind.String: parsedValue = value.GetString() ?? ""; break; case JsonValueKind.Object: parsedValue = ExtractValueFromObject(value, key.ToLower()); break; case JsonValueKind.Array: var items = value.EnumerateArray() .Select(item => item.ValueKind == JsonValueKind.Object ? ExtractValueFromObject(item, key.ToLower()) : item.GetString() ?? "") .Where(s => !string.IsNullOrEmpty(s)); parsedValue = string.Join(", ", items); break; default: parsedValue = value.ToString(); break; } if (key.Equals("Description", StringComparison.OrdinalIgnoreCase)) parsedValue = StripHtml(parsedValue?.ToString() ?? 
""); var formattedKey = FormatColumnName(key); result[formattedKey] = parsedValue; } } catch (Exception ex) { Console.WriteLine($"Error parsing JSON: {ex.Message}"); } return result; } 11.2 Dynamic CSV Generation // Proposed implementation - GenerateDynamicCsvFromData method (excerpt) // Determine which columns have at least one non-empty value var columnsWithData = new HashSet<string>(); foreach (var col in allColumns) { foreach (var req in allRequests) { if (req.TryGetValue(col, out var val) && HasValue(val)) { columnsWithData.Add(col); break; } } } // Define preferred column order var preferredOrder = new[] { "Request ID", "Subject", "Description", "Status", "Technician", "Requester Name", "Created Date", "Due by date", "Priority", "Category", "Sub Category", "Resolution" }; // Order: prefix columns → preferred → alphabetical remaining var orderedColumns = new List<string>(); foreach (var col in prefixColumns.Where(c => columnsWithData.Contains(c))) orderedColumns.Add(col); foreach (var col in preferredOrder.Where(c => columnsWithData.Contains(c))) if (!orderedColumns.Contains(col)) orderedColumns.Add(col); orderedColumns.AddRange(columnsWithData.Except(orderedColumns).OrderBy(c => c)); // Build CSV var csvBuilder = new StringBuilder(); csvBuilder.AppendLine(string.Join(",", orderedColumns.Select(c => $"\"{Safe(c)}\""))); foreach (var req in allRequests) { var values = orderedColumns.Select(col => $"\"{(req.TryGetValue(col, out var v) ? Safe(v) : "")}\""); csvBuilder.AppendLine(string.Join(",", values)); } return Encoding.UTF8.GetBytes(csvBuilder.ToString()); // Proposed implementation - GenerateDynamicCsvFromData method (excerpt) // Determine which columns have at least one non-empty value var columnsWithData = new HashSet<string>(); foreach (var col in allColumns) { foreach (var req in allRequests) { if (req.TryGetValue(col, out var val) && HasValue(val)) { columnsWithData.Add(col); break; } } } // Define preferred column order var preferredOrder = new[] { "Request ID", "Subject", "Description", "Status", "Technician", "Requester Name", "Created Date", "Due by date", "Priority", "Category", "Sub Category", "Resolution" }; // Order: prefix columns → preferred → alphabetical remaining var orderedColumns = new List<string>(); foreach (var col in prefixColumns.Where(c => columnsWithData.Contains(c))) orderedColumns.Add(col); foreach (var col in preferredOrder.Where(c => columnsWithData.Contains(c))) if (!orderedColumns.Contains(col)) orderedColumns.Add(col); orderedColumns.AddRange(columnsWithData.Except(orderedColumns).OrderBy(c => c)); // Build CSV var csvBuilder = new StringBuilder(); csvBuilder.AppendLine(string.Join(",", orderedColumns.Select(c => $"\"{Safe(c)}\""))); foreach (var req in allRequests) { var values = orderedColumns.Select(col => $"\"{(req.TryGetValue(col, out var v) ? Safe(v) : "")}\""); csvBuilder.AppendLine(string.Join(",", values)); } return Encoding.UTF8.GetBytes(csvBuilder.ToString()); 12. Security Considerations 12.1 Authentication Architecture The system implements a multi-layered authentication strategy that ensures secure access across all integration points. Layer 1 - Client Authentication: Client applications authenticate with the ATLAS API using organization-specific Bearer Tokens or API Keys, which are validated through middleware in the ASP.NET Core pipeline. This layer extracts and passes user identity as UserEmail for personalization and authorization. 
Layer 2 - Azure AI Platform: ATLAS authenticates with the Azure AI Platform using DefaultAzureCredential with Managed Identity, eliminating the need for secrets in code or configuration files while benefiting from automatic token rotation managed by Azure.

Layer 3 - External ITSM API: ATLAS authenticates with the external ITSM API through the OAuth 2.0 Refresh Token Flow. Sensitive credentials are stored in Azure Key Vault or secure configuration, and access tokens are cached in-memory with a 60-second expiration buffer to prevent token expiry mid-operation while maintaining security.

This layered approach provides defense in depth, with each layer employing an authentication mechanism appropriate to its specific security context and requirements.

12.2 Input Validation

// Path traversal prevention for file downloads
[HttpGet("download-result/{sessionId}/{fileName}")]
public async Task<IActionResult> DownloadResult(string sessionId, string fileName)
{
    // Prevent path traversal attacks
    if (fileName.Contains("..") || fileName.Contains("/") || fileName.Contains("\\"))
    {
        return BadRequest("Invalid file name.");
    }

    // Verify session ownership
    var conversation = await _dbContext.ChatConversations
        .FirstOrDefaultAsync(c => c.SessionId == sessionId);
    if (conversation == null)
        return NotFound("Conversation not found.");

    // ... proceed with download ...
}
12.3 Data Privacy Controls

| Control | Implementation | Purpose |
|---|---|---|
| User Scoping | UserEmail filter on personalized queries | Prevent cross-user data access |
| Session Isolation | SessionId required for history retrieval | Prevent conversation leakage |
| Data Minimization | CSV exports only contain queried data | Reduce exposure surface |
| JSON Sanitization | HTML stripped from descriptions | Prevent XSS in exports |
| Audit Trail | All queries logged with timestamps | Compliance and debugging |
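As a minimal sketch of the audit-trail control, the snippet below shows one way each query could be persisted with a timestamp; the QueryAuditLog entity and its DbSet are illustrative assumptions, since the whitepaper only specifies that queries are logged.

// Hypothetical audit-trail sketch; entity and DbSet names are assumptions.
public class QueryAuditLog
{
    public long Id { get; set; }
    public string SessionId { get; set; } = "";
    public string UserEmail { get; set; } = "";
    public string QueryText { get; set; } = "";
    public string QueryType { get; set; } = "";
    public DateTime TimestampUtc { get; set; }
}

// Called once per /api/natural-query request, after query analysis completes.
private async Task LogQueryAsync(string sessionId, string userEmail, string query, string queryType)
{
    _dbContext.QueryAuditLogs.Add(new QueryAuditLog
    {
        SessionId = sessionId,
        UserEmail = userEmail,
        QueryText = query,
        QueryType = queryType,
        TimestampUtc = DateTime.UtcNow
    });
    await _dbContext.SaveChangesAsync();
}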
13. Evaluation & Results

13.1 Projected Query Understanding Accuracy

Methodology: Based on pilot testing and analysis of similar NLP systems.

| Query Type | Projected Accuracy | Notes |
|---|---|---|
| Conversational | ~99% | Greetings, help requests |
| Request Search | ~94-96% | Core functionality |
| Top Technicians | ~95-97% | Aggregation queries |
| Inactive Technicians | ~93-95% | Period parsing challenges |
| Influx Analysis | ~91-93% | TimeUnit interpretation |
| Follow-up Queries | ~85-90% | Context retention dependent |
| Overall Target | ~90-95% | - |

13.2 Projected Follow-Up Query Success Rate

| Follow-up Type | Projected Rate | Example |
|---|---|---|
| Status filter addition | ~94% | "how many are open" |
| Date range change | ~90% | "what about last week" |
| Technician reference | ~86% | "show their tickets" |
| Multi-hop reference | ~70-75% | "how many of those were resolved" |

13.3 Hypothetical User Satisfaction Targets

Target Survey Outcomes:

| Question | Target: Agree/Strongly Agree |
|---|---|
| "ATLAS understands my questions" | > 75% |
| "ATLAS saves me time" | > 85% |
| "Response quality meets my needs" | > 70% |
| "I would recommend ATLAS" | > 85% |

Target Net Promoter Score (NPS): > +50 (Good to Excellent)

13.4 Projected Operational Metrics

| Metric | Target Value |
|---|---|
| Estimated queries/day | 400-600 |
| Target unique users | 50-100+ |
| System availability target | > 99.5% |
| Fallback activation rate | < 5% |
| Average response time | < 5 seconds |
| CSV export usage | 20-30% of queries |
14. Limitations

14.1 Known Constraints

| Limitation | Impact | Workaround | Priority to Fix |
|---|---|---|---|
| Single-language support | English only | None currently | Medium |
| No ticket creation | Read-only queries | Users must use ITSM directly | High |
| ~5-minute data latency | Not real-time | Background sync frequency tunable | Low |
| Complex boolean queries | "Open OR pending AND network" may fail | Rephrase as simpler queries | Medium |
| Cross-conversation context | New session loses history | Use same sessionId | Low |
| Attachment handling | Cannot search attachment contents | Not planned | Low |

14.2 Anticipated Scalability Limits

| Dimension | Estimated Limit | Behavior at Limit |
|---|---|---|
| Concurrent users | ~50 | Response time degrades ~30% |
| Queries per minute | ~100 | AI API rate limiting triggers |
| Conversation length | ~100 messages | Context truncation to last 10 |
| Result set size | ~500 tickets | Hard cap, pagination not implemented |
| Background sync batch | ~10,000 tickets | Memory pressure, batching required |

14.3 AI Model Dependencies

Model availability: Dependent on Azure AI platform uptime (99.9% SLA)
Model changes: GPT-4 behaviour changes could affect prompt effectiveness
Cost volatility: API pricing changes could impact operational costs
Latency variance: AI response times vary 0.5s-5s unpredictably

15. Architecture Decision Records

ADR-001: Multi-Agent vs Single-Agent Architecture

Status: Accepted
Date: 2025-10-15
Context: Need to process natural language queries with high accuracy and maintainability.
Decision: Use three specialized agents instead of one general-purpose agent.
Consequences:
(+) Clear separation of concerns
(+) Easier debugging and prompt tuning
(+) Lower per-agent prompt complexity
(-) Higher latency (sequential calls)
(-) More complex orchestration logic
Alternatives Considered:
Single agent with long prompt: Rejected due to prompt complexity and debugging difficulty
Two agents (analysis + response): Rejected due to missing validation step

ADR-002: Local Database Cache vs Direct API Queries

Status: Accepted
Date: 2025-10-18
Context: User queries require fast response times; external ITSM API has 2-5 second latency.
Decision: Cache ITSM data locally with background sync.
Consequences:
(+) Sub-100ms query latency
(+) Complex aggregations possible
(+) Resilience to ITSM API outages
(-) Data freshness delay (up to 5 minutes)
(-) Storage overhead (~500MB for 50K tickets)
Alternatives Considered:
Direct API queries: Rejected due to latency requirements
Redis cache: Considered for future distributed deployment

ADR-003: Semaphore-Based Thread Locking

Status: Accepted
Date: 2025-11-01
Context: Azure AI Agents throw errors when multiple operations occur on the same thread.
Decision: Implement per-thread SemaphoreSlim with dictionary lookup.
Consequences:
(+) Prevents concurrent access errors
(+) Graceful 503 response on timeout
(-) Memory overhead for semaphore dictionary
(-) Potential deadlock risk (mitigated by timeout)
Alternatives Considered:
Global lock: Rejected due to throughput impact
Thread-per-user: Rejected due to Azure thread limits

16. Future Work

16.1 Short-Term

| Enhancement | Complexity | Impact | Status |
|---|---|---|---|
| Redis distributed caching | Medium | Horizontal scaling | Planned |
| Real-time notifications | Medium | Proactive alerts | Planned |
| Voice input support | Low | Accessibility | Backlog |
| Mobile-optimized UI | Medium | User adoption | Backlog |

16.2 Medium-Term

| Enhancement | Complexity | Impact | Status |
|---|---|---|---|
| Ticket creation via NL | High | Bidirectional workflow | Research |
| Predictive SLA breach alerts | High | Proactive management | Research |
| Multi-language support | Medium | Global deployment | Backlog |
| Custom report scheduling | Medium | Automation | Backlog |

16.3 Long-Term

| Enhancement | Complexity | Impact | Status |
|---|---|---|---|
| Autonomous ticket triage | Very High | AI operations | Concept |
| Knowledge base integration | High | Auto-resolution | Concept |
| Fine-tuned domain model | Very High | Accuracy improvement | Research |
| Cross-system analytics | High | Enterprise insights | Concept |

17. Conclusion

ATLAS demonstrates that natural language interfaces for enterprise ITSM systems are not only feasible but could deliver significant operational value.
The key architectural decisions enabling this potential include:

Multi-Agent Pipeline: Separating query understanding, validation, and response generation is projected to improve accuracy (target: 90-95%) and maintainability
Context-Aware Conversations: Structured history management could enable natural follow-up queries (target: 85-90% success rate)
Hybrid Data Architecture: Background synchronization is designed to provide sub-5-second response times while maintaining data freshness
Graceful Degradation: Heuristic fallbacks are intended to ensure high query success rates despite AI service variability
Enterprise-Ready Concurrency: Thread-safe agent orchestration designed for multi-user workloads

The system is projected to achieve positive ROI through time-to-insight reduction compared to traditional report generation methods.

Key Takeaways for Practitioners:

Multi-agent architectures trade latency for accuracy and maintainability
Context preservation is essential for natural conversation flow
Fallback mechanisms are critical in enterprise LLM systems
Cost modeling should include AI API expenses early in design

ATLAS represents a conceptual template for enterprise NLI systems that balance sophistication with pragmatic engineering constraints. The architecture and patterns described in this whitepaper are intended to guide organizations exploring similar solutions.

18. References

Multi-Agent Systems & LLM Frameworks

Wu, Q., Bansal, G., Zhang, J., Wu, Y., Zhang, S., Zhu, E., Li, B., Jiang, L., Zhang, X., & Wang, C. (2023). "AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation." arXiv:2308.08155. https://arxiv.org/abs/2308.08155
LangChain. (2024). "LangGraph: Multi-Agent Workflows." LangChain Blog, January 2024. https://blog.langchain.com/langgraph-multi-agent-workflows/
LangChain. (2024). "Command: A New Tool for Building Multi-Agent Architectures in LangGraph." December 2024. https://blog.langchain.com/command-a-new-tool-for-multi-agent-architectures-in-langgraph/
Microsoft Research. (2025). "AutoGen v0.4 Release." January 2025. https://www.microsoft.com/en-us/research/project/autogen/

Text-to-SQL & Benchmarks

Li, J., Hui, B., Qu, G., Yang, J., Li, B., Li, B., Wang, B., Qin, B., Geng, R., Huo, N., et al. (2024). "Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs (BIRD)." NeurIPS 2023. https://bird-bench.github.io/
Lei, F., Chen, J., Ye, Y., Cao, R., Shin, D., Su, H., et al. (2024). "Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows." arXiv:2411.07763. https://spider2-sql.github.io/
Ma, L., Pu, K., & Zhu, Y. (2024). "Evaluating LLMs for Text-to-SQL Generation With Complex SQL Workload." arXiv:2407.19517. https://arxiv.org/abs/2407.19517

Retrieval-Augmented Generation

Edge, D., et al. (2024). "GraphRAG: Unlocking LLM Discovery on Narrative Private Data." Microsoft Research, 2024.
Sarthi, P., et al. (2024). "RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval." ICLR 2024.
Niu, X., et al. (2024). "RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models." ACL 2024.
Ranjan, R., et al. (2024). "A Comprehensive Survey of Retrieval-Augmented Generation (RAG): Evolution, Current Landscape and Future Directions." arXiv:2410.12837. https://arxiv.org/abs/2410.12837

Commercial ITSM AI Solutions

ServiceNow. (2024). "Now Platform Xanadu Release: Actionable AI." September 2024. https://www.servicenow.com/blogs/2024/now-platform-xanadu-release-actionable-ai
ServiceNow. (2024). "Now Assist Documentation." https://www.servicenow.com/platform/now-assist.html
Freshworks. (2024). "Introduction to Freddy AI Agent." October 2024. https://support.freshservice.com/support/solutions/articles/50000010306-introduction-to-freddy-ai-agent
Freshworks. (2024). "Freddy AI Copilot." https://www.freshworks.com/freshdesk/omni/freddy-ai-copilot/
Zendesk. (2024). "Announcing General Availability of Generative AI Features for Agents." March 2024. https://support.zendesk.com/hc/en-us/articles/6806752620314
Zendesk. (2024). "About AI Agents." https://support.zendesk.com/hc/en-us/articles/6970583409690-About-AI-agents
Zendesk. (2024). "Enhanced Generative AI Features with ChatGPT-4o." https://support.zendesk.com/hc/en-us/articles/7711631447450

Platform Documentation

Microsoft. (2024). "Azure AI Agent Service Documentation." https://learn.microsoft.com/en-us/azure/ai-services/agents/
Microsoft. (2024). "Retrieval Augmented Generation (RAG) in Azure AI Search." https://learn.microsoft.com/en-us/azure/search/retrieval-augmented-generation-overview

Industry Research

McKinsey & Company. (2024). "What is RAG (Retrieval Augmented Generation)." October 2024. https://www.mckinsey.com/featured-insights/mckinsey-explainers/what-is-retrieval-augmented-generation-rag
Forrester. (2024). "Forrester's Guide to Retrieval-Augmented Generation." November 2024. https://www.forrester.com/blogs/forresters-guide-to-retrieval-augmented-generation-rag/
LangChain. (2024). "Top 5 LangGraph Agents in Production 2024." December 2024. https://blog.langchain.com/top-5-langgraph-agents-in-production-2024/
Foundational Work (for historical context)

ITIL Foundation. (2019). "ITIL 4 Foundation." Axelos.

19. Appendices

Appendix A: API Reference

POST /api/natural-query

Request:

{
  "Query": "string (required)",
  "SessionId": "string (optional, GUID)",
  "UserEmail": "string (optional, for personalization)"
}

Response (Success):

{
  "SessionId": "5ffe2d39-0a8f-43b4-b603-bed01492620f",
  "ThreadId": "thread_ysZ6pR5HzCqEVSEef8T63DGh",
  "ConversationalResponse": "Looking at daily trends...",
  "ExcelFile": {
    "FileName": "queryresult_20251127142004.csv",
    "Url": "/api/Main/download-result/{sessionId}/{fileName}"
  },
  "Summary": {
    "totalRequests": 2310,
    "timeUnit": "Day"
  }
}

Response (Busy):

{
  "Error": "System is busy processing a previous request.",
  "ConversationalResponse": "I'm currently processing your previous request..."
}
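A short client-side usage sketch for the endpoint above follows; the base URL and API-key header are placeholders, while the payload and response fields follow the schemas in this appendix.

// Hypothetical client call to POST /api/natural-query.
// Base URL, API-key header, and email are placeholders; shapes follow this appendix.
using System.Net.Http.Json;
using System.Text.Json;

var http = new HttpClient { BaseAddress = new Uri("https://atlas.example.com/") };
http.DefaultRequestHeaders.Add("X-Api-Key", "<organization-api-key>");

var payload = new
{
    Query = "How many tickets does the support team have this month?",
    UserEmail = "team.lead@example.com"   // enables "my tickets" personalization
};

var response = await http.PostAsJsonAsync("api/natural-query", payload);
var body = await response.Content.ReadFromJsonAsync<JsonElement>();

// Reuse the returned SessionId on follow-up queries such as "how many of them are open?"
Console.WriteLine(body.GetProperty("SessionId").GetString());
Console.WriteLine(body.GetProperty("ConversationalResponse").GetString());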
Appendix B: Query Analysis Schema

{
  "queryType": "conversational|inactive_technicians|influx_requests|top_request_areas|top_technicians|request_search",
  "isConversational": "boolean",
  "conversationalIntent": "greeting|help|thanks|farewell|capabilities|unclear|null",
  "dateFrom": "yyyy-MM-dd HH:mm|null",
  "dateTo": "yyyy-MM-dd HH:mm|null",
  "timeUnit": "hour|day|null",
  "topN": "number|null",
  "subject": "string|null",
  "technician": "string|null",
  "technicians": "[string]|null",
  "requester": "string|null",
  "inactivityPeriod": "string|null",
  "isUserRequest": "boolean",
  "isUserTechnician": "boolean",
  "status": "open|closed|null"
}

Appendix C: Hypothetical Query Examples

| Query | Analysis | Expected Result |
|---|---|---|
| "request volume this week" | influx_requests, timeUnit=day | ~2,000+ requests, peak mid-week |
| "what tickets do i have assigned to me" | request_search, isUserTechnician=true | User's assigned tickets |
| "how many of them are open" (follow-up) | request_search, isUserTechnician=true, status=open | Filtered to open status |
| "how many tickets assigned to TechUser1 this month" | request_search, technician=TechUser1 | ~100-150 tickets |
| "how many involved network" (follow-up) | request_search, technician=TechUser1, subject=network | Filtered subset |
| "top technicians based on requests handled past week" | top_technicians, topN=10 | Ranked list by volume |
| "technicians with no requests treated in the past 1 month" | inactive_technicians, inactivityPeriod=30 days | List of inactive techs |

This whitepaper presents a conceptual architecture for ATLAS. The design patterns, code examples, and projected metrics documented here are intended to guide similar implementations in enterprise environments. All examples use hypothetical data and anonymized placeholders. However, the system itself was developed and rigorously tested using real enterprise data to validate performance, scalability, and reliability.