Local mean decomposition (LMD) is an effective method for analyzing nonlinear and nonstationary signals and has been applied successfully in a wide range of fields. However, achieving real-time LMD computation in software is difficult. In this paper, a flexible, low-cost, and high-performance hardware architecture for LMD is proposed that satisfies the real-time requirements of various LMD applications. All proposed circuits were developed in Verilog and synthesized using the Synopsys Design Compiler with the Taiwan Semiconductor Manufacturing Company (TSMC) 0.18-μm cell library. Because the design is parameterized, the proposed LMD circuit can easily be adapted to various applications and hardware architectures.