Srota AI: A Voice-Controlled Intelligent Web Automation Platform Using Large Language Models and Multi-Agent Orchestration

Authors

  • Mr. Shuvendu Samal
  • Mr. Biswajit Swain
  • Prof. Biswajit Sahoo

DOI:

https://doi.org/10.64751/vyfrez55

Abstract

The rapid proliferation of complex, multi-step web-based applications across government, healthcare, education, and enterprise domains has introduced significant usability barriers for diverse user populations. Traditional web automation tools rely on brittle, static rule-based scripting that is inaccessible to non-technical users and fails to adapt to dynamic web interfaces. This paper presents Srota AI, a voice-controlled intelligent web automation platform that leverages large language models (LLMs), multi-agent orchestration via LangGraph, and real-time voice transcription through the Deepgram API to enable users to complete complex digital workflows using natural language or spoken commands. The system architecture integrates a Planner Agent for goal decomposition, a Navigator Agent for semantic DOM interaction, a Voice Assistant Module for hands-free operation, and a Semantic DOM Processing Module for context-aware UI navigation. The backend, developed in Python with FastAPI, supports asynchronous concurrent session management and integrates with Google Gemini for AI reasoning. A plug-and-play React SDK enables seamless embedding into third-party web applications. Evaluation results demonstrate significant reductions in task completion time and error rates across real-world automation scenarios, including government form submission, healthcare appointment scheduling, and enterprise onboarding. The system establishes a replicable, extensible framework for applying agentic AI and voice recognition to accessible, intelligent web automation.
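As a rough illustration of the planner/navigator split described in the abstract, the following minimal Python sketch decomposes a goal into steps and then resolves each step against a simplified DOM. It is not the authors' implementation: the names `Step`, `Element`, `plan`, `locate`, and `run` are hypothetical, the planner is hard-coded where the paper uses an LLM, and word overlap between labels stands in for whatever semantic matching the Semantic DOM Processing Module performs. LangGraph, Deepgram, and Gemini are deliberately left out so the sketch stays self-contained.

```python
from dataclasses import dataclass


@dataclass
class Step:
    action: str       # "fill" or "click"
    target: str       # natural-language description of the UI element
    value: str = ""   # text to enter, used only by "fill" steps


@dataclass
class Element:
    selector: str     # CSS selector of a (simplified) DOM node
    label: str        # visible label text of that node


def plan(goal: dict) -> list[Step]:
    """Hypothetical planner: one fill step per field, then one submit click.

    A real planner agent would ask an LLM to decompose a free-form goal;
    here the goal is already structured, so the decomposition is trivial.
    """
    steps = [Step("fill", name, value) for name, value in goal["fields"].items()]
    steps.append(Step("click", goal["submit"]))
    return steps


def _tokens(text: str) -> set[str]:
    # Lower-cased whitespace tokens; a crude stand-in for semantic features.
    return set(text.lower().split())


def locate(step: Step, dom: list[Element]) -> Element:
    """Hypothetical navigator: pick the element whose label shares the most
    words with the step's target description."""
    return max(dom, key=lambda el: len(_tokens(el.label) & _tokens(step.target)))


def run(goal: dict, dom: list[Element]) -> list[str]:
    """Plan the goal, resolve each step to an element, and return an action log."""
    log = []
    for step in plan(goal):
        element = locate(step, dom)
        if step.action == "fill":
            log.append(f"fill {element.selector} = {step.value}")
        else:
            log.append(f"click {element.selector}")
    return log


# Assumed example data: an appointment-booking form with three controls.
goal = {
    "fields": {"full name": "Asha Rao", "appointment date": "2026-07-01"},
    "submit": "book appointment",
}
dom = [
    Element("#name", "Full Name"),
    Element("#date", "Preferred appointment date"),
    Element("#submit", "Book appointment"),
]

for line in run(goal, dom):
    print(line)
```

Keeping planning separate from element lookup means a change in page layout only affects `locate`, which is one plausible reading of why the paper assigns the two tasks to different agents.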

Published

2026-06-06

How to Cite

Mr. Shuvendu Samal, Mr. Biswajit Swain, & Prof. Biswajit Sahoo. (2026). Srota AI: A Voice-Controlled Intelligent Web Automation Platform Using Large Language Models and Multi-Agent Orchestration. International Journal of Economic Social Science and Management LAW, 7(2(1)), 83-88. https://doi.org/10.64751/vyfrez55