Considerable research has focused on dimensional sentiment analysis, which predicts a real-valued sentiment score along multiple dimensions for a given sentiment expression. Although state-of-the-art methods achieve decent results on high-quality, large-scale corpora, their performance declines significantly under data scarcity. To address this problem, this study proposes a domain-distilled method that learns domain-invariant features instead of the domain-specific features commonly used by traditional methods, since learning domain-specific features from scarce data may restrict coverage of the domain feature space. The proposed distillation process uses a domain discriminator that attempts to identify the domain of each learned feature. The discriminator is trained adversarially by maximizing its prediction loss, which makes it difficult for the discriminator to distinguish among domains and thereby improves the model's ability to learn domain-invariant features. To evaluate the proposed method, we implement the domain-distilled method in Bidirectional Encoder Representations from Transformers (BERT), given its promising results in many natural language processing (NLP) tasks. Experiments on EmoBank, a three-dimensional (valence-arousal-dominance) sentiment corpus, show that the proposed domain-distilled BERT outperforms the original BERT and other deep learning models in dimensional sentiment score prediction.
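One standard way to realize this kind of adversarial objective is a gradient-reversal layer in the style of domain-adversarial training (DANN): the discriminator head minimizes its domain-classification loss, while the reversed gradient pushes the encoder to maximize it, yielding features the discriminator cannot tell apart. The PyTorch sketch below illustrates the idea only; it is not the paper's released implementation, and the class name DomainDistilledModel, the use of a Hugging Face-style BERT encoder exposing pooler_output, and the lambd weighting are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F  # used in the illustrative training step below


class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; flips (and scales) gradients in the backward pass."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # The reversed gradient trains the encoder to *maximize* the discriminator loss.
        return -ctx.lambd * grad_output, None


class DomainDistilledModel(nn.Module):
    """Hypothetical sketch: sentiment regressor plus adversarial domain discriminator."""

    def __init__(self, encoder, hidden_size, num_domains, num_dims=3, lambd=1.0):
        super().__init__()
        self.encoder = encoder  # e.g., a pretrained BERT encoder
        self.lambd = lambd
        # One real-valued score per sentiment dimension (valence, arousal, dominance).
        self.regressor = nn.Linear(hidden_size, num_dims)
        # The discriminator tries to identify which domain a feature vector came from.
        self.discriminator = nn.Sequential(
            nn.Linear(hidden_size, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, num_domains),
        )

    def forward(self, input_ids, attention_mask):
        feats = self.encoder(input_ids, attention_mask=attention_mask).pooler_output
        scores = self.regressor(feats)
        # The discriminator minimizes its loss; the reversed gradient drives the
        # encoder toward domain-invariant features.
        domain_logits = self.discriminator(GradReverse.apply(feats, self.lambd))
        return scores, domain_logits


# Illustrative training step (num_domains is corpus-dependent):
# scores, domain_logits = model(input_ids, attention_mask)
# loss = F.mse_loss(scores, vad_targets) + F.cross_entropy(domain_logits, domain_labels)
# loss.backward()
```

Under this formulation the encoder and discriminator are optimized jointly: a single backward pass through the reversal layer implements the minimax objective without alternating updates between the two components.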