Latent Style Representation Learning and Knowledge-Driven Inference for Consistent Prompt-Conditioned Image Generation | IJET – Volume 12 Issue 2 | IJET-V12I2P132

International Journal of Engineering and Techniques (IJET)

Open Access • Peer Reviewed • High Citation & Impact Factor • ISSN: 2395-1303

Volume 12, Issue 2  |  Published: April 2026

Authors: Dinesh S, Kalai Kumar K, Jaya Murugan V, Harishwaran P, Naveen K

DOI: https://doi.org/{{doi}}

Abstract

PaletteAI addresses the challenge of maintaining a consistent artistic identity in AI-generated images by capturing and reusing the visual “DNA” of reference artwork. Existing systems often demand repetitive style descriptions in every prompt, leading to inconsistent results. PaletteAI allows users to upload reference images, which are analysed with multimodal AI to extract visual attributes such as color palettes and stylistic techniques. These attributes are converted into structured style representations and vector embeddings, forming a reusable style profile. During image generation, the system applies reasoning-based prompt fusion to integrate the learned style profile with the user’s text prompt, ensuring stylistic consistency. The framework supports both text-to-image and image-to-image style transformation through a graph-based pipeline, and users can control the style’s influence via adjustable weights. The platform also includes a voice-driven assistant for brainstorming. Experiments confirm that PaletteAI improves consistency, reduces prompt-engineering effort, and enhances human–AI creative collaboration.
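The abstract describes the style profile and weighted prompt fusion only at a high level. The following minimal Python sketch is one way these ideas could look in practice; every name here (`StyleProfile`, `fuse_prompt`, and the `(term:weight)` emphasis idiom borrowed from common diffusion front-ends) is an illustrative assumption, not PaletteAI's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class StyleProfile:
    """Reusable style profile extracted from reference artwork (hypothetical structure)."""
    name: str
    palette: list[str]       # dominant colors, e.g. hex codes
    techniques: list[str]    # stylistic descriptors, e.g. "flat vector shapes"
    embedding: list[float] = field(default_factory=list)  # vector representation

def fuse_prompt(user_prompt: str, profile: StyleProfile, weight: float = 0.7) -> str:
    """Blend the learned style description into the user's text prompt.

    `weight` (0..1) scales how strongly the style attributes are emphasized;
    here it is rendered as a parenthesized emphasis tag, a prompt idiom
    supported by some diffusion front-ends.
    """
    style_terms = ", ".join(
        profile.techniques + [f"palette of {', '.join(profile.palette)}"]
    )
    return f"{user_prompt}, in the style of {profile.name}: ({style_terms}:{weight:.2f})"

# Example usage: the same profile can be reused across many prompts,
# which is the paper's alternative to re-describing the style each time.
profile = StyleProfile(
    name="reference-artwork-01",
    palette=["#1d3557", "#e63946", "#f1faee"],
    techniques=["flat vector shapes", "high contrast", "minimal shading"],
)
print(fuse_prompt("a lighthouse at dusk", profile, weight=0.8))
```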

Keywords

Generative AI, Artistic Style Learning, Visual Feature Extraction, Prompt Fusion, Image Generation, Multimodal AI, Style Transfer, Human–AI Creative Systems.

Conclusion

PaletteAI provides an effective AI-powered framework for maintaining stylistic consistency in prompt-conditioned image generation. By integrating multimodal style extraction, vector-based style representations, and reasoning-based prompt fusion, the system successfully captures artistic characteristics from reference images and applies them during image generation. The proposed approach enables users to generate visually coherent images without repeatedly describing stylistic attributes within prompts. Evaluation results demonstrate that the system achieves strong style consistency, accurate prompt alignment, and efficient generation performance. The ability to store reusable style profiles and control the influence of style during generation enhances both usability and creative flexibility. These capabilities make PaletteAI suitable for a wide range of applications, including digital art creation, graphic design, and AI-assisted media production. Overall, PaletteAI offers a practical and scalable solution for improving the reliability and consistency of AI-generated visual content while supporting human–AI collaborative creativity.
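The conclusion highlights adjustable style influence over the vector-based representations. One plausible realization of that control, assumed here rather than confirmed by the paper, is linear interpolation between the prompt embedding and the stored style embedding:

```python
import numpy as np

def blend_embeddings(prompt_emb: np.ndarray, style_emb: np.ndarray, alpha: float) -> np.ndarray:
    """Linearly interpolate between prompt and style embeddings.

    alpha = 0.0 ignores the style profile; alpha = 1.0 uses style alone.
    The result is re-normalized so downstream cosine-similarity scoring
    (or a conditioning encoder expecting unit vectors) is unaffected.
    """
    mixed = (1.0 - alpha) * prompt_emb + alpha * style_emb
    norm = np.linalg.norm(mixed)
    return mixed / norm if norm > 0 else mixed

# Example: 4-dimensional toy vectors stand in for real embeddings.
prompt_emb = np.array([0.9, 0.1, 0.0, 0.4])
style_emb = np.array([0.2, 0.8, 0.5, 0.1])
print(blend_embeddings(prompt_emb, style_emb, alpha=0.5))
```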


Cite this article

APA
{{author}} (April 2026). {{title}}. International Journal of Engineering and Techniques (IJET), 12(2). https://doi.org/{{doi}}
IEEE
{{author}}, “{{title}},” International Journal of Engineering and Techniques (IJET), vol. 12, no. 2, April 2026, doi: {{doi}}.