International Journal of Science and Research Archive
International, Peer reviewed, Open access Journal ISSN Approved Journal No. 2582-8185


Optimizing Large Language Model Deployment in Edge Computing Environments


Raghavan Krishnasamy Lakshmana Perumal *

Member of IEEE Computer Society, Tampa, Florida, USA.

Research Article

International Journal of Science and Research Archive, 2025, 14(03), 1658-1669

Article DOI: 10.30574/ijsra.2025.14.3.0912

DOI url: https://doi.org/10.30574/ijsra.2025.14.3.0912

Received on 21 February 2025; revised on 29 March 2025; accepted on 31 March 2025

Deploying large language models (LLMs) in edge computing environments is an emerging challenge at the intersection of AI and distributed systems. Running LLMs directly on edge devices can greatly reduce latency and improve privacy, enabling real-time intelligent applications without constant cloud connectivity. However, modern LLMs often consist of billions of parameters and require tens of gigabytes of memory and massive compute power, far exceeding what typical edge hardware can provide. In this paper, we present a comprehensive approach to optimizing LLM deployment in edge computing environments by combining four existing classes of optimization techniques — model compression, quantization, distributed inference, and federated learning — in a unified framework. Our insight is that a holistic combination of these techniques is necessary to deploy LLMs successfully in practical edge settings. We also provide new algorithmic solutions and empirical data to advance the state of the art.
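As a rough illustration of one technique the abstract names — quantization — the sketch below shows symmetric per-tensor int8 post-training quantization of a weight vector. This is not taken from the paper; the function names and the single-scale scheme are illustrative assumptions about how such a step is commonly done.

```python
# Illustrative sketch (not the paper's method): symmetric per-tensor int8
# quantization, one way to shrink an LLM's memory footprint for edge devices.

def quantize_int8(weights):
    """Map float weights to int8 codes with a single per-tensor scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 codes."""
    return [v * scale for v in q]

weights = [0.42, -1.30, 0.07, 0.98]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
# int8 storage is 4x smaller than float32; each reconstructed weight
# differs from the original by at most half a quantization step (scale / 2).
```

In practice a per-channel or group-wise scale (as in schemes like GPTQ or AWQ) recovers more accuracy than the single per-tensor scale used here, at the cost of storing one scale per channel or group.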

Keywords: Large Language Models (LLMs); Edge Computing; Model Compression; Distributed Inference; Federated Learning

https://ijsra.net/sites/default/files/fulltext_pdf/IJSRA-2025-0912.pdf


Raghavan Krishnasamy Lakshmana Perumal. Optimizing Large Language Model Deployment in Edge Computing Environments. International Journal of Science and Research Archive, 2025, 14(03), 1658-1669. Article DOI: https://doi.org/10.30574/ijsra.2025.14.3.0912.

Copyright © Author(s). All rights reserved. This article is published under the terms of the Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits use, sharing, adaptation, distribution, and reproduction in any medium or format, as long as appropriate credit is given to the original author(s) and source, a link to the license is provided, and any changes made are indicated.


All statements, opinions, and data contained in this publication are solely those of the individual author(s) and contributor(s). The journal, editors, reviewers, and publisher disclaim any responsibility or liability for the content, including accuracy, completeness, or any consequences arising from its use.

