A state Department of Health and Human Services runs one of the most complex public-facing websites in state government. Hundreds of benefit programmes. Thousands of eligibility rules. Millions of residents who need answers fast — often in a crisis. For years, the department answered the same questions on the phone that were already published on its website: 41% of its monthly call volume was information-seeking behaviour. That number never moved until the department replaced keyword search with Keyspider AI Search.
The Organisation
The Department of Health and Human Services administers benefits and services for approximately 3.1 million residents across a geographically diverse state. Its digital portfolio spans the full breadth of federally funded and state-funded social programmes: SNAP (food assistance), Medicaid, the Children's Health Insurance Program (CHIP), Temporary Assistance for Needy Families (TANF), childcare subsidy programmes, home and community-based waiver services, child welfare, and crisis response hotlines.
The department's website had been rebuilt in 2021 with a strong focus on plain language, mobile responsiveness, and programme navigation. A team of twelve content specialists maintained more than 3,400 individual content pages across five programme sub-domains, plus a shared forms and documents library housing 680 downloadable PDFs. The website received approximately 2.1 million unique visitors per month — a volume reflecting both the scale of the state's population and the essential nature of benefits access.
Despite the investment in content quality, the department's contact centre was fielding 18,500 calls per month. A contact centre operations analysis conducted in Q4 2023 categorised these calls into three buckets: 41% were information and eligibility enquiries (questions the website was designed to answer), 35% were application status and case-specific enquiries (requiring agent access to case management systems), and 24% were process navigation calls (citizens who had found the information but needed help understanding next steps). The first category was the target. Those 7,585 information-seeking calls per month represented $143,000 in monthly contact centre cost at the department's $18.85 loaded cost per interaction.
3.1M
residents served across all benefit programmes
18,500
monthly contact centre calls
41%
of calls were information requests already on the website
$143K
monthly cost of avoidable information-seeking calls
The Problem: When Accurate Content Fails Citizens at the Moment of Need
The department's content team had done their job well. Eligibility requirements for every programme were published, up to date, and written in plain language. Application instructions were clear. Fee schedules and income thresholds were current. The information was there. The problem was that none of it was findable through the search bar — and the way residents searched for it bore almost no relationship to the way the content was structured.
A resident searching for help feeding their family might type: 'food stamps near me', 'how to get food assistance', 'grocery help low income', or 'free food programme'. The department's content used the programme title: 'Supplemental Nutrition Assistance Program (SNAP)'. Without semantic understanding, keyword search cannot bridge this gap. The resident gets zero results or irrelevant pages, gives up, and calls.
The same pattern played out across Medicaid ('free health insurance', 'doctor without insurance', 'medical help low income'), childcare subsidies ('help paying for daycare', 'childcare assistance working parents'), and TANF ('cash assistance', 'help paying bills'). The department's content team had made every programme findable by its official name. Residents searched for help using the language of their daily lives. Keyword search served neither.
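The mechanics behind bridging that gap are worth making concrete. Keyspider's ranking model is proprietary, so the sketch below illustrates the general technique, embedding-based semantic retrieval, using the open-source sentence-transformers library; the page titles and query are illustrative stand-ins.

```python
# A minimal sketch of embedding-based semantic retrieval, the general
# technique that lets 'food stamps' match SNAP content despite zero
# keyword overlap. Illustrative only, not Keyspider's implementation.
# Requires: pip install sentence-transformers
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

# Hypothetical page titles standing in for the department's content
pages = [
    "Supplemental Nutrition Assistance Program (SNAP): eligibility and how to apply",
    "Medicaid: health coverage for low-income residents",
    "Child care subsidy programme for working families",
]
page_vecs = model.encode(pages, convert_to_tensor=True)

query = "food stamps near me"  # everyday language, no shared keywords with 'SNAP'
query_vec = model.encode(query, convert_to_tensor=True)

# Cosine similarity ranks pages by meaning rather than by shared terms
scores = util.cos_sim(query_vec, page_vecs)[0]
best = scores.argmax().item()
print(f"Top result: {pages[best]} (score {scores[best].item():.2f})")
```

A plain keyword match on the same query would return nothing here, because the phrase 'food stamps' appears in none of the titles.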
The language gap in action
The department's search analytics team ran a three-month audit of zero-result searches in late 2023. The top five zero-result queries — 'food stamps near me', 'free health insurance', 'help paying rent', 'childcare assistance', 'how to apply for benefits' — collectively accounted for 28,000 failed searches over the three-month period. Every one of these queries had a directly relevant, accurate, published page on the department's website. None were surfaced by the keyword search engine. Each failed search was a call waiting to happen.
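An audit like this one can be reproduced from ordinary search logs. A minimal sketch, assuming a hypothetical CSV export with a query and a result-count column (the schema and filename are illustrative):

```python
# Sketch of a zero-result search audit over a hypothetical log export.
# Assumed columns: timestamp, query, result_count (schema is illustrative).
import pandas as pd

logs = pd.read_csv("search_logs_q4_2023.csv")  # hypothetical export

zero_hits = logs[logs["result_count"] == 0]
top_failures = (
    zero_hits["query"]
    .str.lower()
    .str.strip()
    .value_counts()
    .head(5)            # the five queries driving the most failed searches
)
print(top_failures)
print(f"Total failed searches: {len(zero_hits):,}")
```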
The Compounding Problem: Benefits Complexity Requires Multi-Step Answers
Beyond the language gap, the department faced a structural problem that keyword search is fundamentally incapable of solving: benefits eligibility is inherently complex, and residents' real questions are multi-part.
A resident asking 'am I eligible for SNAP?' does not want a link to the SNAP programme landing page. They want to know: what are the income limits, are they calculated monthly or annually, does my household size change the calculation, does my immigration status affect eligibility, what documents do I need to bring, and how long will the application take. These are six distinct pieces of information distributed across multiple pages, forms, and PDFs. A keyword search returns a list of URLs. An AI search engine reads the resident's intent and assembles the answer.
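As a sketch of that assembly step, under the assumption of an embedding index of passages (all passages, URLs, and sub-questions below are hypothetical), retrieval runs once per facet of the question and the results are grouped into a single cited response:

```python
# Sketch: answering a multi-part question by retrieving the best passage
# per sub-question and assembling one cited response. All passages, URLs,
# and sub-questions are hypothetical. Requires sentence-transformers.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

# Hypothetical indexed passages, each tagged with its source page
corpus = [
    ("SNAP gross monthly income limits depend on household size ...", "/snap/income-limits"),
    ("Documents to bring to your SNAP interview include ...", "/snap/apply/documents"),
    ("Most SNAP applications are processed within 30 days ...", "/snap/apply/timeline"),
]
vecs = model.encode([text for text, _ in corpus], convert_to_tensor=True)

sub_questions = [
    "What are the SNAP income limits?",
    "What documents do I need to apply?",
    "How long does the application take?",
]

# One retrieval per facet of the resident's question, grouped into one answer
for q in sub_questions:
    scores = util.cos_sim(model.encode(q, convert_to_tensor=True), vecs)[0]
    text, url = corpus[scores.argmax().item()]
    print(f"- {text} (source: {url})")
```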
The department's contact centre team had developed detailed call guides for their most common question categories precisely because these questions were multi-part and required agents to synthesise information across multiple system screens. The agents were, in effect, performing a human version of the same information assembly task that an AI search engine can automate — at $18.85 per interaction versus a near-zero marginal cost per search.
Evaluating the Options
The department's digital transformation team had been aware of AI search technology for two years before initiating a formal evaluation. The delay was deliberate — they were waiting for two conditions to be met: demonstrated government deployments with measurable call deflection results, and clarity on the DOJ's ADA Title II digital accessibility rulemaking.
Both conditions were met by early 2024. The DOJ published its final rule in April 2024, establishing WCAG 2.1 AA as the standard for state and local government digital accessibility with a compliance deadline of April 2026. And a growing body of government AI search case studies — including the department's own network of peer state DHHS digital teams — provided the evidence base the transformation team needed to build an internal business case.
The evaluation was structured as a three-stage process: a written RFI to establish a qualified vendor list, a live demonstration phase using the department's actual content, and a proof-of-concept deployment in which shortlisted vendors indexed 400 pages of benefits content and were tested against 150 queries drawn directly from contact centre call transcripts.
Keyspider was selected following the POC phase. Three criteria were decisive. First, accuracy: Keyspider correctly answered 94 of the 150 POC queries with responses that the evaluation team's subject matter experts rated as 'accurate and appropriately cited' — the highest accuracy rate of any vendor tested. Second, hallucination prevention: Keyspider's AI answers were grounded exclusively in the department's indexed content, with every answer citing the source document. Reviewers found zero instances of AI-generated information that was not attributable to a specific department page or document. Third, multilingual capability: the department serves a significant Spanish-speaking population, and Keyspider's Spanish-language search performance in the POC test — using a sample set of Spanish-language queries matched against the department's bilingual content — was materially better than the alternatives evaluated.
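A POC of this shape is straightforward to score. Below is a minimal harness sketch, assuming hypothetical CSV files of transcript-derived queries and offline SME ratings; `run_vendor_search` is a stand-in for whatever API each shortlisted vendor exposes:

```python
# Sketch of a POC scoring harness. Files, columns, and the search call are
# hypothetical stand-ins for each shortlisted vendor's actual interface.
import csv

def run_vendor_search(query: str) -> str:
    # Placeholder: swap in the vendor's search API call during the POC
    return f"[vendor answer for: {query}]"

with open("poc_queries.csv") as f:           # columns: query_id, query
    queries = list(csv.DictReader(f))

# Collect one answer per transcript-derived query for offline SME review
answers = {q["query_id"]: run_vendor_search(q["query"]) for q in queries}

# SMEs rate each answer: 1 = 'accurate and appropriately cited', else 0
with open("sme_ratings.csv") as f:           # columns: query_id, rating
    ratings = {r["query_id"]: int(r["rating"]) for r in csv.DictReader(f)}

accuracy = sum(ratings.values()) / len(ratings)
print(f"POC accuracy: {accuracy:.0%}")       # e.g. 94 of 150 -> 63%
```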
Deployment: 9 Days from Contract to Live
The department's IT security team had conducted a preliminary architecture review of Keyspider's platform as part of the vendor shortlisting process, which meant the security clearance process was substantially complete before contract execution. This proved critical: the implementation began within 48 hours of contract signature.
Days 1 through 3 covered the initial crawl and indexing. Keyspider's crawlers were configured to index all five programme sub-domains simultaneously, along with the shared forms library. The crawler configuration recognised the department's CMS content type structure — distinguishing between eligibility requirement pages, application instructions, programme overview content, and news releases — and weighted them accordingly in the relevance model. The 680 PDFs in the forms library were fully extracted and indexed as searchable text, not as binary document objects.
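Keyspider's actual configuration format is not public, so the sketch below shows content-type weighting in its simplest possible form: a multiplier applied to a base relevance score. The content types mirror those named above; the weights are invented for illustration.

```python
# Illustrative content-type weighting: a multiplier over a base relevance
# score. Weights are invented for illustration, not Keyspider's values.
CONTENT_TYPE_BOOST = {
    "eligibility_requirements": 1.5,   # the highest-value answers
    "application_instructions": 1.3,
    "programme_overview": 1.0,
    "news_release": 0.6,               # rarely what a resident needs
}

def weighted_score(base_score: float, content_type: str) -> float:
    return base_score * CONTENT_TYPE_BOOST.get(content_type, 1.0)

print(round(weighted_score(0.80, "eligibility_requirements"), 2))  # 1.2
print(round(weighted_score(0.80, "news_release"), 2))              # 0.48
```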
PDF deep indexing — why it matters for benefits agencies
State benefits agencies typically store their most detailed and actionable eligibility information in downloadable PDFs: income limit tables, programme manuals, application instructions, and eligibility screening tools. Legacy keyword search often indexes only a PDF's filename and metadata, not the content inside it. A resident searching for SNAP income limits gets a result titled 'SNAP_Programme_Guide_2025.pdf' with no indication of whether the document answers their question. Keyspider indexes the full text of every PDF, extracts the relevant passage, and surfaces it as a cited answer directly in the search interface. The guide becomes findable by its content, not its filename.
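The mechanics of deep indexing look roughly like the sketch below, shown here with the open-source pypdf library rather than Keyspider's own pipeline; the chunk size and filename are illustrative.

```python
# Sketch of PDF deep indexing: extract full text with pypdf, split it into
# passages, and keep the source location so answers can cite page numbers.
# Filename and chunk size are illustrative. Requires: pip install pypdf
from pypdf import PdfReader

def pdf_to_passages(path: str, chunk_chars: int = 800) -> list[dict]:
    reader = PdfReader(path)
    passages = []
    for page_num, page in enumerate(reader.pages, start=1):
        text = page.extract_text() or ""
        for i in range(0, len(text), chunk_chars):
            passages.append({
                "source": path,
                "page": page_num,
                "text": text[i : i + chunk_chars],
            })
    return passages

# Each passage is indexed as searchable text with a citable source, so the
# guide becomes findable by its content rather than its filename.
passages = pdf_to_passages("SNAP_Programme_Guide_2025.pdf")
print(f"{len(passages)} indexable passages")
```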
Days 4 through 7 covered configuration, relevance tuning, and the bilingual setup. The department's content team worked directly with Keyspider's implementation specialists to configure Spanish-language query handling, ensure that searches initiated in Spanish returned the department's bilingual content preferentially, and set up synonym libraries for the most common informal benefit programme names used by residents ('food stamps', 'welfare', 'Medicaid', 'free clinic').
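A synonym library of this kind can be pictured as a simple query-expansion table. The mappings below are illustrative, not the department's actual library:

```python
# Sketch of a synonym library applied as query expansion: append official
# programme vocabulary when an informal term appears. Mappings illustrative.
SYNONYMS = {
    "food stamps": "SNAP Supplemental Nutrition Assistance",
    "welfare": "TANF Temporary Assistance for Needy Families",
    "free clinic": "Medicaid community health services",
    "daycare help": "child care subsidy",
}

def expand_query(query: str) -> str:
    expanded = query.lower()
    for informal, official in SYNONYMS.items():
        if informal in expanded:
            expanded += " " + official   # append rather than replace, to keep intent
    return expanded

print(expand_query("how do I get food stamps"))
# -> 'how do i get food stamps SNAP Supplemental Nutrition Assistance'
```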
Day 8 was a full-day staff acceptance testing session. Participants included contact centre supervisors, eligibility specialists from four programme areas, the department's disability services coordinator (who led the WCAG audit), and representatives from the communications team. Testing covered 80 structured scenarios plus open-ended free-form testing. The session surfaced two relevance issues — both related to outdated content pages ranking ahead of current policy pages on specific Medicaid waiver queries — which were corrected by end of day.
The search widget went live on Day 9. No infrastructure changes were required beyond deploying the JavaScript snippet through the department's CMS. The rollout itself took forty minutes.
Results at 90 Days
41%
reduction in eligibility & information call volume
88 sec
median time-to-answer for self-served queries
94%
citizen satisfaction with search experience
$1.76M
projected annual savings in contact centre costs
Call Volume
At 90 days post-deployment, monthly contact centre call volume stood at 13,200, a reduction of 5,300 calls from the pre-deployment baseline of 18,500. This was not a transient dip: the reduction held steady week over week from week 3 onwards, and the trend line was still declining at the 90-day measurement point.
Breaking down by category: information and eligibility calls — the category directly targeted by AI search — fell from 7,585 per month to 4,475, a 41% reduction. Application status and case-specific calls fell 8% (attributable to residents now arriving at applications better prepared, reducing mid-process status calls). Process navigation calls fell 22% — the clearest evidence that the AI-generated answer summaries, which included step-by-step next actions alongside the eligibility information, were successfully guiding residents through the application process without agent intervention.
Time to Answer
Session analytics introduced at deployment measured the time from query submission to a content click-through or engagement with an AI answer. For residents who self-served successfully, the median time-to-answer was 88 seconds. The department's average contact centre handle time for an information enquiry was 6 minutes and 14 seconds, including hold time, agent handle time, and after-call work.
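The metric itself is simple to compute from session events. A sketch, assuming a hypothetical event log with one 'query' and one 'engage' event type per session:

```python
# Sketch of the time-to-answer metric: seconds from query submission to the
# first click-through or AI-answer engagement in the same session. The
# event log schema is hypothetical.
import pandas as pd

events = pd.read_csv("search_events.csv", parse_dates=["timestamp"])
# assumed columns: session_id, event ('query' or 'engage'), timestamp

first_query = events[events["event"] == "query"].groupby("session_id")["timestamp"].min()
first_engage = events[events["event"] == "engage"].groupby("session_id")["timestamp"].min()

tta = (first_engage - first_query).dt.total_seconds().dropna()
tta = tta[tta > 0]   # keep only sessions that engaged after searching
print(f"Median time-to-answer: {tta.median():.0f}s")
```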
For residents, this was not just a productivity improvement — it was a fundamentally different experience of interacting with government. A question about SNAP income limits that previously required navigating a phone tree, waiting on hold, and speaking with an agent could now be answered in under two minutes, at any time of day, from any device.
Citizen Satisfaction
A post-deployment satisfaction survey conducted in June 2025 — 90 days after go-live — collected 1,840 responses via an exit intercept on the search results page. Overall satisfaction with the search experience was 94%, compared to 49% in the equivalent pre-deployment survey. The open-text responses were notable for the frequency of surprise: a significant minority of respondents used phrases indicating they had attempted to find this information before and failed. For residents navigating a benefits system in difficult circumstances, finding the right answer quickly carries an emotional weight that goes beyond convenience.
"I've been trying to figure out if I qualify for SNAP for three weeks. I kept calling and getting put on hold. I typed one question into this search and got the answer in ten seconds. I genuinely cried."
— Citizen survey respondent, June 2025
Financial Impact
The department's budget office calculated the annualised financial impact of the 90-day results. The reduction of 5,300 calls per month, at a loaded cost of $18.85 per interaction, represented a monthly saving of $99,905. Annualised: $1.199M. Against an annual Keyspider subscription cost, the return on investment in year one exceeded 8x. The analysis conservatively excluded the productivity savings from the Phase 2 staff deployment and the further deflection expected as the AI search index matured and the AI Assistant layer was activated.
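The budget office's arithmetic is easy to reproduce. In the sketch below the subscription figure is a placeholder, since the case study states only that year-one ROI exceeded 8x:

```python
# Reproducing the budget office's arithmetic from the figures in the text.
# The subscription figure is a PLACEHOLDER; the case study states only that
# year-one ROI exceeded 8x.
calls_deflected_per_month = 5_300
loaded_cost_per_call = 18.85

monthly_saving = calls_deflected_per_month * loaded_cost_per_call
annual_saving = monthly_saving * 12
print(f"Monthly saving: ${monthly_saving:,.0f}")   # $99,905
print(f"Annual saving:  ${annual_saving:,.0f}")    # $1,198,860

annual_subscription = 140_000                      # placeholder, not disclosed
print(f"Year-one ROI: {annual_saving / annual_subscription:.1f}x")  # 8.6x
```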
Phase 2: AI Assistant for Multi-Step Eligibility Guidance
Ninety days after the AI Search deployment, the department activated an AI Assistant layer on top of the existing search index. Rather than presenting a list of results and an AI summary, the AI Assistant allows residents to ask multi-step questions in natural language — 'I work part-time and my household has three people, am I eligible for SNAP?' — and receive a structured, cited response that accounts for their specific situation as described.
The AI Assistant deployment was piloted on the SNAP and Medicaid programme pages first — the two highest-volume categories. At the 30-day pilot mark, the process navigation call category (22% of pre-deployment call volume) had fallen a further 31%, consistent with residents now receiving step-by-step guidance through the application process without agent involvement. The information and eligibility call category fell a further 14 percentage points on top of the 41% already achieved — bringing total call deflection for the combined AI Search + AI Assistant deployment to 55% of the original baseline.
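The step-by-step guidance pattern behind those numbers can be sketched as a structured pre-screen. The income thresholds below are placeholders, not real SNAP limits; a production assistant would draw them from the indexed, published income-limit tables:

```python
# Illustrative pre-screen behind a multi-step assistant turn. The income
# thresholds are PLACEHOLDERS, not real SNAP limits; a production system
# would read them from the indexed, published income-limit tables.
PLACEHOLDER_GROSS_MONTHLY_LIMIT = {1: 1_500, 2: 2_000, 3: 2_500, 4: 3_000}

def snap_prescreen(household_size: int, gross_monthly_income: float) -> str:
    limit = PLACEHOLDER_GROSS_MONTHLY_LIMIT.get(household_size)
    if limit is None:
        return "Let's check the published table for larger households."
    if gross_monthly_income <= limit:
        return (f"Based on the published limit for a household of "
                f"{household_size} (${limit:,}/month), you may qualify. "
                f"Next step: gather your documents.")
    return "You may be over the gross income limit, but deductions can apply."

print(snap_prescreen(3, 2_100))
```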
Hallucination prevention in a high-stakes context
Benefits eligibility is a context where an incorrect AI answer carries serious consequences. A resident incorrectly told they are ineligible for Medicaid may go without healthcare. A resident given the wrong income threshold for SNAP may fail to apply. The department's primary non-negotiable requirement was zero hallucination — every answer must be attributable to a specific published department document, with the source citation visible to the resident. Keyspider's architecture prevents the AI from drawing on any information outside the department's indexed content. In six months of live deployment, the department's content team reviewed 480 flagged AI answers and found zero instances of factual inaccuracy.
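Keyspider's grounding architecture is proprietary, but one common enforcement pattern is easy to sketch: every generated statement must cite a passage that was actually retrieved from the indexed content, or the answer is withheld. Everything below is a simplified illustration, not the production design:

```python
# Sketch of one common grounding check: every generated statement must cite
# a passage that was actually retrieved from the indexed content, or the
# whole answer is withheld. Simplified illustration, not Keyspider's design.
RETRIEVED = {
    "doc-112": "SNAP gross income limits by household size ...",
    "doc-205": "Medicaid eligibility for adults is based on ...",
}

def validate_answer(statements: list[tuple[str, str]]) -> str:
    """statements: (sentence, citation_id) pairs emitted by the generator."""
    for sentence, cite in statements:
        if cite not in RETRIEVED:
            # A citation that maps to nothing indexed is treated as a
            # potential hallucination: refuse rather than risk a wrong answer
            return "I can't answer that from the department's published content."
    return " ".join(f"{s} [{c}]" for s, c in statements)

print(validate_answer([
    ("The gross income limit for a household of three is published here.", "doc-112"),
]))
```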
What Other Benefits Agencies Can Learn
The department's digital transformation lead documented four transferable lessons from the deployment, shared at a multi-state DHHS digital leadership forum in September 2025.
1. Measure first. The contact centre operations analysis that quantified 41% information-seeking call volume was the foundation of the business case. Without it, the department would not have had the evidence to justify the procurement.
2. The language gap is the real problem, not content quality. The department's content was accurate and well-written. The failure was the inability of keyword search to bridge everyday language to programme terminology. Any government benefits agency with good content and a legacy keyword search engine has the same problem.
3. PDF indexing is not optional. For benefits agencies, the most detailed eligibility information lives in PDFs. A search solution that does not deep-index PDF content cannot answer the questions that drive call volume.
4. Run the proof of concept on your own queries. The 150 contact centre call transcripts used in the POC evaluation were the single most valuable input in the vendor selection process. Testing vendors on real questions from real residents is far more diagnostic than testing on vendor-supplied demo content.
Explore further
AI Search for State Government
How state agencies are using AI search for citizen self-service, FOIA readiness, and staff productivity.
AI Assistant — product overview
Grounded conversational AI for citizen self-service, built on your content only.
How a State Agency Reduced Contact Centre Calls by 38% in 90 Days
A mid-Atlantic state DSS deploys AI Search and transforms call volume.
Cross-Agency Search: One Bar Covers All 22 Agency Sites
How a state unified 22 agency sub-domains into a single AI search experience for residents.
Ready to see what self-service looks like on your benefits portal?
Book a demo with our state government team. We'll configure Keyspider on a sample of your actual benefits content and walk you through a live proof of concept using your most common eligibility questions.
Book a Demo