International Journal of Science and Research Archive
International, Peer reviewed, Open access Journal ISSN Approved Journal No. 2582-8185

ISSN Approved Journal || eISSN: 2582-8185 || CODEN: IJSRO2 || Impact Factor 8.2 || Google Scholar and CrossRef Indexed

Scalable knowledge distillation for large language models on multi-GPU systems

Wary Hossain Rabby 1, *, A.S.S.M.Q-E-Elahy 1, Gias Uddin 2, Emran Sikder 3, Rafiqul Islam 1 and Hasibul Islam 1

1 Jahangirnagar University, Dhaka, Bangladesh.

2 Uttara University, Dhaka, Bangladesh.

3 Daffodil International University, Dhaka, Bangladesh.

Research Article

International Journal of Science and Research Archive, 2025, 16(03), 314–321

Article DOI: 10.30574/ijsra.2025.16.3.2569

DOI url: https://doi.org/10.30574/ijsra.2025.16.3.2569

Received on 29 July 2025; revised on 06 September 2025; accepted on 08 September 2025

Knowledge distillation (KD) is a widely used method for compressing large language models (LLMs) into smaller, faster, more efficient versions without sacrificing performance. However, as LLMs grow to hundreds of billions of parameters, running distillation on a single device is no longer feasible because it is too computationally demanding. In this paper, we investigate how to leverage multi-GPU configurations to make KD scale. To overcome communication bottlenecks and accelerate training, our method combines adaptive gradient compression with tensor, pipeline, and data parallelism. Tested on transformer-based LLMs, our framework maintains strong accuracy on both understanding and generation tasks, reduces communication overhead by 27%, and trains up to 3.4× faster than single-GPU baselines.
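For readers unfamiliar with KD, the following is a minimal sketch of the standard temperature-scaled distillation objective for a single example. The abstract does not specify which loss the authors use, so this is an illustration of the general technique rather than their method; the function names `log_softmax` and `kd_loss`, and the default values of `temperature` and `alpha`, are assumptions chosen for the example.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Log of the temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_sum_exp for x in scaled]


def kd_loss(student_logits, teacher_logits, label, temperature=2.0, alpha=0.5):
    """Weighted sum of a soft-target term and a hard-label term.

    Hypothetical helper: `temperature` softens both distributions and
    `alpha` balances the two terms; neither value comes from the paper.
    """
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)

    # KL(teacher || student) on the softened distributions.
    kl = sum(
        math.exp(lt) * (lt - ls)
        for lt, ls in zip(log_p_teacher, log_p_student)
    )

    # Ordinary cross-entropy against the ground-truth label at temperature 1.
    ce = -log_softmax(student_logits)[label]

    # The temperature**2 factor keeps the soft-target gradients on a scale
    # comparable to the hard-label gradients.
    return alpha * temperature ** 2 * kl + (1.0 - alpha) * ce


if __name__ == "__main__":
    teacher = [2.0, 1.0, 0.1]
    student = [1.5, 1.2, 0.3]
    print(kd_loss(student, teacher, label=0))
```

In a multi-GPU setting of the kind the abstract describes, a loss like this would be computed on each device's shard of the batch, with the resulting gradients then exchanged across devices.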

Knowledge Distillation (KD); Large Language Models (LLMs); Model Compression; Multi-GPU Systems; Distributed Training; Hybrid Parallelism (Data, Tensor, Pipeline); Gradient Compression

https://ijsra.net/sites/default/files/fulltext_pdf/IJSRA-2025-2569.pdf

Wary Hossain Rabby, A.S.S.M.Q-E-Elahy, Gias Uddin, Emran Sikder, Rafiqul Islam and Hasibul Islam. Scalable knowledge distillation for large language models on multi-GPU systems. International Journal of Science and Research Archive, 2025, 16(03), 314–321. Article DOI: https://doi.org/10.30574/ijsra.2025.16.3.2569.

Copyright © Author(s). All rights reserved. This article is published under the terms of the Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits use, sharing, adaptation, distribution, and reproduction in any medium or format, as long as appropriate credit is given to the original author(s) and source, a link to the license is provided, and any changes made are indicated.


All statements, opinions, and data contained in this publication are solely those of the individual author(s) and contributor(s). The journal, editors, reviewers, and publisher disclaim any responsibility or liability for the content, including accuracy, completeness, or any consequences arising from its use.
