University of New Haven, Connecticut.
International Journal of Science and Research Archive, 2026, 18(03), 1066-1079
Article DOI: 10.30574/ijsra.2026.18.3.0487
Received on 03 February 2026; revised on 18 March 2026; accepted on 20 March 2026
The increasing adoption of artificial intelligence and real-time analytics in healthcare has exposed fundamental limitations in traditional clinical data engineering approaches, which rely heavily on batch-oriented pipelines, rigid schemas, and manual data governance. These limitations introduce latency, interoperability challenges, and silent data-quality failures that directly affect clinical decision-making and model reliability. This paper presents a solution-oriented clinical data engineering paradigm based on real-time lakehouse architectures, interoperability-first design, and autonomous data quality management. The proposed approach unifies streaming and historical clinical data across heterogeneous sources, enabling low-latency analytics while preserving semantic consistency and auditability. Interoperability is addressed through canonical data modeling and real-time semantic normalization, allowing seamless integration of electronic health records, medical devices, and imaging systems. Autonomous data quality mechanisms continuously detect anomalies, drift, and inconsistencies, preventing corrupted data from propagating into downstream clinical applications. Real-world clinical scenarios demonstrate how this architecture improves operational readiness, enhances AI reliability, and supports trustworthy, real-time clinical decision support.
Clinical Data Engineering; Real-Time Lakehouse; Healthcare Interoperability; Autonomous Data Quality; Clinical Analytics
Prathyusha Beemanaboina. Next-generation clinical data engineering: Real-time lakehouses, interoperability, and autonomous data quality. International Journal of Science and Research Archive, 2026, 18(03), 1066-1079. Article DOI: https://doi.org/10.30574/ijsra.2026.18.3.0487.






