In this paper, we improve the state-of-the-art ECAPA-TDNN model for speaker verification with a CNN stem, self-calibration (SC) blocks, and deep layer aggregation. The proposed architecture, Emphasized Channel Attention Propagation and Deep Layer Aggregation with Pre-activated CNN Stem in Time Delay Neural Network, is abbreviated as ECAPDLA CNNv2-TDNN. First, we add a pre-activated stem convolution layer in front of the main ECAPA-TDNN architecture, which ensures that the input to the main model is a stable feature representation. Next, we replace the multi-layer feature aggregation of ECAPA-TDNN with deep layer aggregation and substitute SC blocks for the SE-Res2Blocks of ECAPA-TDNN. Together, these changes enhance feature extraction across multiple time scales and spectral channels and improve overall training efficacy. On the VoxCeleb1-O test set, the proposed model achieves an equal error rate (EER) of 0.95%, significantly better than the 1.23% EER of the ECAPA-TDNN baseline.
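The pre-activation ordering mentioned above can be illustrated with a minimal, dependency-free sketch. This is not the paper's implementation: all function names, the kernel, and the toy input are illustrative assumptions. The only point it demonstrates is the pre-activated ordering, in which normalization and activation are applied before the stem convolution so that the downstream network receives a stabilized representation.

```python
# Hedged sketch (illustrative only, not the paper's code) of a pre-activated
# 1-D convolutional stem: norm -> activation -> convolution, rather than the
# post-activated conv -> norm -> activation ordering.

def normalize(x):
    """Zero-mean, unit-variance normalization of a 1-D feature sequence."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / (var ** 0.5 + 1e-5) for v in x]

def relu(x):
    """Element-wise rectified linear activation."""
    return [max(0.0, v) for v in x]

def conv1d(x, kernel):
    """Valid-mode 1-D convolution (cross-correlation) with a small kernel."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k))
            for i in range(len(x) - k + 1)]

def preact_stem(x, kernel):
    # Pre-activated order: the input is normalized and activated first,
    # and only then convolved by the stem layer.
    return conv1d(relu(normalize(x)), kernel)

# Toy single-channel feature sequence (assumed values for illustration).
features = [0.2, 1.5, -0.3, 0.8, 2.1, -1.0, 0.4]
out = preact_stem(features, kernel=[0.25, 0.5, 0.25])
print(len(out))  # valid convolution shortens the sequence by k - 1
```

In a real front end this would be a strided 2-D convolution over a spectrogram with learned batch-normalization statistics; the sketch keeps only the ordering of operations that distinguishes a pre-activated stem.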