ArtAdapter: Text-to-Image Style Transfer using
Multi-Level Style Encoder and Explicit Adaptation

Dar-Yen Chen Hamish Tennent Ching-Wen Hsu

Cardinal Blue

This work introduces ArtAdapter, a transformative text-to-image (T2I) style transfer framework that transcends traditional limitations of color, brushstrokes, and object shape, capturing high-level style elements such as composition and distinctive artistic expression. The integration of a multi-level style encoder with our proposed explicit adaptation mechanism enables ArtAdapter to achieve unprecedented fidelity in style transfer while ensuring close alignment with textual descriptions. Additionally, the incorporation of an Auxiliary Content Adapter (ACA) effectively separates content from style, alleviating the borrowing of content from style references. Moreover, our novel fast finetuning approach can further enhance zero-shot style representation while mitigating the risk of overfitting. Comprehensive evaluations confirm that ArtAdapter surpasses current state-of-the-art methods.


ArtAdapter employs a multi-level style encoder to obtain style embeddings. To allow the diffusion backbone to adapt to style concepts, we introduce the Explicit Adaptation mechanism within the cross-attention layers. Furthermore, our framework incorporates the Auxiliary Content Adapter (ACA), a key component during training that is excluded at inference. The ACA helps eliminate the influence of content semantics in the style reference. Finally, our fast finetuning method is designed to capture more nuanced style characteristics and applies to both individual and collective style references.
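To make the idea of adapting cross-attention to style embeddings more concrete, here is a minimal NumPy sketch. It assumes one plausible design: style tokens get their own key and value projections, and the results are concatenated with the text keys and values before attention. The function name `cross_attention_with_style`, the parameter names, and all shapes are hypothetical and chosen for illustration; the page does not specify the actual layer layout.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention_with_style(latents, text_emb, style_emb, params):
    """Single-head cross-attention over text and style tokens.

    Assumption (not confirmed by the source): style tokens use separate
    key/value projections from the text tokens.
    """
    q = latents @ params["w_q"]

    # Text tokens go through the text projections.
    k_text = text_emb @ params["w_k_text"]
    v_text = text_emb @ params["w_v_text"]

    # Style tokens go through their own (hypothetical) projections.
    k_style = style_emb @ params["w_k_style"]
    v_style = style_emb @ params["w_v_style"]

    # Image queries attend jointly to text and style tokens.
    k = np.concatenate([k_text, k_style], axis=0)
    v = np.concatenate([v_text, v_style], axis=0)

    scores = q @ k.T / np.sqrt(q.shape[-1])
    attn = softmax(scores, axis=-1)
    return attn @ v, attn


# Toy usage with random weights and embeddings.
rng = np.random.default_rng(0)
dim = 8
names = ("w_q", "w_k_text", "w_v_text", "w_k_style", "w_v_style")
params = {name: rng.normal(size=(dim, dim)) for name in names}
latents = rng.normal(size=(16, dim))    # 16 image positions
text_emb = rng.normal(size=(5, dim))    # 5 text tokens
style_emb = rng.normal(size=(3, dim))   # 3 style tokens
out, attn = cross_attention_with_style(latents, text_emb, style_emb, params)
```

In this sketch, keeping the style projections separate means the style pathway can be trained without changing how text tokens are projected, which is one way a backbone could adapt to style while preserving text alignment.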

Style Mixing

Our style mixing approach leverages the multi-level style encoder to extend T2I style transfer by applying distinct styles at different hierarchical levels. We show mixtures of two styles: one affects low- and mid-level features, while the other shapes high-level attributes.
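The level-wise mixing described above can be sketched as a simple selection over per-level style embeddings. This is an illustrative sketch only: it assumes the encoder output can be represented as a dictionary with one token array per level, and the names `LEVELS`, `mix_styles`, and `to_token_sequence` are hypothetical.

```python
import numpy as np

# Hypothetical level names for the multi-level style embeddings.
LEVELS = ("low", "mid", "high")


def mix_styles(style_a, style_b, levels_from_a=("low", "mid")):
    """Take the listed levels from style A and the remaining levels from style B."""
    mixed = {}
    for level in LEVELS:
        source = style_a if level in levels_from_a else style_b
        mixed[level] = source[level].copy()
    return mixed


def to_token_sequence(style):
    """Concatenate per-level tokens into one sequence in a fixed level order."""
    return np.concatenate([style[level] for level in LEVELS], axis=0)


# Toy usage: two styles with 2 tokens per level and embedding size 4.
style_a = {level: np.full((2, 4), 1.0) for level in LEVELS}
style_b = {level: np.full((2, 4), 2.0) for level in LEVELS}
mixed = mix_styles(style_a, style_b)
tokens = to_token_sequence(mixed)
```

With the default arguments, style A supplies the low- and mid-level tokens and style B supplies the high-level tokens, matching the two-style setup described in the text.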

Structure Control

ArtAdapter can be combined with existing structure-control modules such as T2I-Adapter.


