This paper describes the design of adjustable convolution hardware for image processing. Because an n×n convolution traditionally requires n² processing elements (PEs), both the number of PEs and the cost grow with the mask size. To achieve a reasonable performance/cost ratio, we design a new architecture that uses n or fewer PEs to perform an n×n convolution. The architecture, together with a few delay-line buffers, is organized in a pipelined and parallel fashion. A 3×3 convolution prototype consisting of only two PEs has been constructed on a single board and operates at approximately real-time rates.
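As a rough software illustration of how delay-line buffers let a raster-scanned pixel stream feed an n×n mask, the sketch below keeps n-1 row-length delay lines and an n×n window register, then splits the multiply-accumulate work across a configurable number of PEs. It is a minimal model under stated assumptions, not the architecture described here: the function name `stream_convolve`, the `num_pes` parameter, and the round-robin assignment of mask rows to PEs are all hypothetical, and the mask is applied without flipping (which coincides with convolution for symmetric masks).

```python
from collections import deque


def stream_convolve(image, mask, num_pes=None):
    """Sliding n x n weighted sum over a raster-scanned image.

    Hypothetical model: n-1 delay lines hold previous rows, and mask rows
    are assigned to PEs round-robin (an assumption for illustration only).
    Returns the valid-region output, one row per list.
    """
    n = len(mask)
    width = len(image[0])
    num_pes = n if num_pes is None else num_pes

    # Each delay line stores exactly one image row of pixels.
    delay_lines = [deque([0] * width, maxlen=width) for _ in range(n - 1)]
    # window[i] holds the last n pixels seen for mask row i.
    window = [deque([0] * n, maxlen=n) for _ in range(n)]

    output = []
    for r, row in enumerate(image):
        out_row = []
        for c, pixel in enumerate(row):
            # Build the vertical column: column[k] is the pixel k rows above.
            column = [pixel]
            value = pixel
            for line in delay_lines:
                delayed = line[0]      # pixel that entered one row ago
                line.append(value)     # shift the new value in
                value = delayed
                column.append(value)

            # Shift the column into the window (mask row 0 is the oldest row).
            for i in range(n):
                window[i].append(column[n - 1 - i])

            # Output is valid only once the window lies fully inside the image.
            if r >= n - 1 and c >= n - 1:
                partial = [0] * num_pes
                for i in range(n):
                    pe = i % num_pes   # round-robin row-to-PE assignment
                    for j in range(n):
                        partial[pe] += mask[i][j] * window[i][j]
                out_row.append(sum(partial))
        if out_row:
            output.append(out_row)
    return output
```

With a 3×3 mask and `num_pes=2`, this model gives mask rows 0 and 2 to one PE and row 1 to the other; the result is identical to using three PEs, and only the amount of work per PE changes.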