This beautiful afternoon, I sit by the pine trees munching on brownies, trying to decide whether I should name this book 「天籃」or「天藍」.... (The difference is in the top radical of the second character: 籃 "basket" vs 藍 "blue".) Then it occurs to me: why not both? Yes, I decide I want a beautiful piece of Chinese calligraphy that can be read both ways! It needs to be abstract and subtly metaphorical: 墨中藏色,天籃亦天藍 (the ink carries hidden colour, so that it reads as both "sky basket" and "sky blue"). So, like most of us do in 2025, I turn to my AI buddies for help.
Here is what I want. Try to imagine it with your human intelligence first.
...「天籃」in Chinese calligraphy so that: the black ink strokes form「天藍」, but with the addition of two subtle strokes (abstract, e.g. as water-droplet dots), the visual impression evokes「天籃」("sky basket"). Artistic concept: write 「天藍」 in elegant calligraphy using black ink:「天」at the top, upright and spacious like the sky;「藍」below, complex yet balanced like a woven basket... Then, add two small translucent water-droplet dots (possibly in very light blue) under the radical 艹 in「藍」, mimicking the 竹 in「籃」. Write the parts where the grass radical 艹 and the bamboo radical 竹 differ in blue. Visual and symbolic effect: from a distance, or with a viewer's imagination, the droplets complete the transformation from 藍 to 籃; the original meaning "heavenly basket" (天籃) becomes double-layered with the carefree feeling of sky blue (天藍)...
After trying lots of variations of prompts in both English and Chinese, I give up on ChatGPT's DALL·E model series. Most of ChatGPT's attempts are simply wrong. Even the best attempt (see below) lacks the calligraphy strokes, let alone the emotion and creativity of a true calligrapher. They are certainly not what I'm looking for.
I ended up testing all the usual tools—and even signing up to some not-so-usual ones that claim to be able to write Chinese calligraphy well:
DALL·E (diffusion-based) – beautiful in many domains, but struggled with precise character rendering and stylistic control.
Stable Diffusion (with LoRA & fine prompts) – got us closer, but the strokes were often warped, and it couldn’t handle 「籃」with the integrity calligraphy demands.
NightCafe – easy to use, but the outputs felt more like a decorative font than brush-born art.
There’s An AI For That – it lists plenty of candidates, but many perform poorly...
YesChat Calligraphy plugin – font-based generation, better control, and we even layered droplets with PIL. It was clever… but lacked soul. Actually, it lacked the whole second character. LOL. 🤣
Some of YesChat's attempts are interesting; it actually wrote a script to draw the piece. Here's the script:
```python
from PIL import Image, ImageDraw, ImageFont
import matplotlib.pyplot as plt

# Load the custom Chinese calligraphy font
font_path = "/mnt/data/海纳百川-上首华凤书法体.ttf"
font_size = 240  # for individual character sizing

# Create a blank white image for 2 characters stacked vertically (300x600)
image_width = 300
image_height = 600
background_color = "white"
text_color = "black"

img = Image.new("RGB", (image_width, image_height), color=background_color)
draw = ImageDraw.Draw(img)
font = ImageFont.truetype(font_path, font_size)

# Characters to draw
characters = ["天", "籃"]

# Draw each character centred in its respective 300x300 box, vertically aligned
for i, char in enumerate(characters):
    # textsize() was removed in Pillow 10; textbbox() is the replacement
    left, top, right, bottom = draw.textbbox((0, 0), char, font=font)
    w, h = right - left, bottom - top
    x = (image_width - w) // 2
    y = (300 * i) + (300 - h) // 2
    draw.text((x, y), char, font=font, fill=text_color)

# Add two small translucent blue dots above the "艹" of 籃 to hint at "藍"
# Position estimated: near the top of the second 300x300 box
dot_radius = 6
dot_color = (173, 216, 230, 180)  # light blue with some transparency

# Convert to RGBA for transparency support
img = img.convert("RGBA")
draw = ImageDraw.Draw(img)

# Estimated positions for the water droplets (based on character layout)
dot_positions = [(120, 330), (160, 330)]  # rough x positions over "艹" of 籃
for x, y in dot_positions:
    draw.ellipse((x - dot_radius, y - dot_radius,
                  x + dot_radius, y + dot_radius), fill=dot_color)

# Convert back to RGB and save
img = img.convert("RGB")
output_path = "/mnt/data/TianLan_artistic.jpg"
img.save(output_path)

# Display the image
plt.imshow(img)
plt.axis("off")
plt.title("Artistic Calligraphy: 天籃 evoking 天藍")
plt.show()
```
Wow. That's disappointing. I guess diffusion-based models currently lack the level of symbolic, compositional control that can be specified through prompts...
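Incidentally, there is a subtle Pillow pitfall in a script like that: `ImageDraw` writes an RGBA fill's alpha straight into the pixels instead of blending it, and converting back to RGB then discards the alpha, so the "translucent" droplets come out as opaque dots. A minimal sketch of genuine translucency using a separate overlay layer and `Image.alpha_composite` (the colours and positions are just the ones from the script above):

```python
from PIL import Image, ImageDraw

# Base canvas: a white RGBA image standing in for the rendered characters
base = Image.new("RGBA", (300, 600), (255, 255, 255, 255))

# Draw the translucent droplets on a separate, fully transparent overlay
overlay = Image.new("RGBA", base.size, (0, 0, 0, 0))
draw = ImageDraw.Draw(overlay)
dot_color = (173, 216, 230, 180)  # light blue at ~70% opacity
for x, y in [(120, 330), (160, 330)]:
    draw.ellipse((x - 6, y - 6, x + 6, y + 6), fill=dot_color)

# alpha_composite blends per-pixel, so whatever is underneath shows through
result = Image.alpha_composite(base, overlay).convert("RGB")
# Dot centre is now a white/blue blend, not the opaque fill colour (173, 216, 230)
print(result.getpixel((120, 330)))
```

Drawing straight onto the RGBA image with the same fill would have stamped `(173, 216, 230)` over the pixels wholesale; the overlay route is what keeps the black ink visible through the droplets.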
Makes you wonder what state-of-the-art academic research has been able to achieve so far? Same here.
My favourite research buddy recently is SciSpace. It's great; just try it if you are reading this line and haven't used it already. In short, there are a lot of really neat diffusion-based and GAN-based models that are good at generating individual calligraphic characters with stylistic precision; however, none gives us the spatial and symbolic control we need to see「藍」through「籃」.
| Model | Year | Method | Input Control | Limitation |
|---|---|---|---|---|
| Calliffusion | 2023 | Diffusion | Chinese prompts + calligraphy style label | Only one character at a time; no layout or symbolic overlay support |
| CalliffusionV2 | 2024 | Diffusion + LoRA + multi-modal | Natural-language prompt + image guidance | No support for multi-character layout or compositional creativity |
| Moyun | 2024 | Diffusion + Vision Mamba | Triplet input (script, calligrapher, character) | No prompt- or image-based control; layout and symbolic overlays not supported |
| DP-Font | 2024 | Diffusion + PINN (physics-informed neural network) | Attribute + stroke constraints | No compositional understanding or emotional prompting |
| CalliPaint | 2023 | Diffusion + inpainting (RePaint) | Masked image + character prompt | Stroke repair only; no layout or generative composition |
| GlyphDraw | 2023 | Diffusion + glyph injection | Glyph image + placement mask | Not calligraphy-specific; supports scene text only |
| AnyText | 2024 | Diffusion (Alibaba ModelScope) | Text + layout mask or image | Works well for stylized text insertion; not built for brush artistry |
So, anyone out there reading this who wants to give it a go? I think what we really need is a hybrid of:
Fine control over layout, stroke dynamics, and multi-character composition
Support for overlays or compositional semantics (e.g., knowing where to place droplets symbolically)
Promptability, so we can say things like: “Write 天籃 in a way that some might read it as 天藍. 王羲之’s style, like it’s under a sky before rain”
Stroke-level output, so we can animate or render brush pressure—not just pixels
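None of this exists as an API today; every class and field below is hypothetical, invented purely to make the wishlist concrete. The input to such a system might look like a declarative composition spec that carries layout, style, mood, and the *meaning* of each symbolic overlay:

```python
from dataclasses import dataclass, field

# Hypothetical input spec for a layout- and symbol-aware calligraphy model.
# These classes correspond to no real library; they just illustrate the
# kind of control the wishlist above is asking for.

@dataclass
class Overlay:
    """A symbolic element placed relative to a character's component."""
    kind: str          # e.g. "water_droplet"
    anchor_char: str   # which character it attaches to
    anchor_part: str   # which component, e.g. the 艹 radical
    color: str
    meaning: str       # why it is there -- the model should "know" this

@dataclass
class CompositionSpec:
    characters: list         # multi-character layout, top to bottom
    style: str               # e.g. a calligrapher whose hand to emulate
    mood_prompt: str         # free-text emotional guidance
    overlays: list = field(default_factory=list)

spec = CompositionSpec(
    characters=["天", "藍"],
    style="王羲之",
    mood_prompt="like a sky just before rain",
    overlays=[Overlay(
        kind="water_droplet",
        anchor_char="藍",
        anchor_part="艹",
        color="light blue",
        meaning="two extra strokes nudge 艹 toward 竹, so 藍 can also be read as 籃",
    )],
)
```

The point of the `meaning` field is exactly the gap in the table above: today's models accept style labels and layout masks, but nothing in their input can say *why* a mark is being placed.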
Perhaps start by:
Fine-tuning a diffusion model like Calliffusion with multi-character layout data?
Training on annotated datasets with symbolic annotations (like “here, a dot implies something”)?
Using Hanzi stroke decomposition datasets to reconstruct SVGs and animate the strokes?
Blending this with a custom rendering engine that understands both brush and poetry?
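On the stroke-decomposition idea: datasets such as Make Me a Hanzi publish each character as an ordered list of per-stroke SVG paths, and once you have strokes in brush order, animation is just revealing them cumulatively, frame by frame. A toy sketch of that idea (the four path strings are placeholders, not the real strokes of 天):

```python
# Sketch: turn an ordered list of per-stroke SVG paths into cumulative
# "frames" for stroke-order animation. Stroke-decomposition datasets
# provide real per-character paths in this shape; the paths below are
# placeholders, not actual strokes of any character.
strokes = [
    "M 100 200 L 700 200",   # placeholder horizontal stroke
    "M 100 400 L 700 400",   # placeholder horizontal stroke
    "M 400 200 L 200 800",   # placeholder left-falling stroke
    "M 400 400 L 600 800",   # placeholder right-falling stroke
]

def stroke_frames(stroke_paths):
    """Frame i shows strokes 0..i, mimicking brush order."""
    return [stroke_paths[: i + 1] for i in range(len(stroke_paths))]

def frame_to_svg(paths, size=1000):
    """Wrap one frame's visible strokes in a minimal SVG document."""
    body = "".join(
        f'<path d="{d}" stroke="black" stroke-width="24" fill="none"/>'
        for d in paths
    )
    return (f'<svg xmlns="http://www.w3.org/2000/svg" '
            f'viewBox="0 0 {size} {size}">{body}</svg>')

frames = stroke_frames(strokes)
# frames[0] shows one stroke; the last frame contains every stroke
```

From there, a renderer that varies stroke width along each path is where brush-pressure modelling could begin, rather than stopping at flat pixel output.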
Who knows, maybe yours will be able to take a set of Chinese characters, a text prompt, and even an image, and paint something magical like the video my grandma shared with me earlier last week... speaking of which, maybe that's why the idea of calligraphic art came to me this beautiful afternoon...
Liao, Q., Li, L., Fei, Y., & Xia, G. (2024). CalliffusionV2: Personalized natural calligraphy generation with flexible multi-modal control. arXiv. https://doi.org/10.48550/arXiv.2410.03787
Liao, Q., Li, L., Fei, Y., & Xia, G. (2023). Calliffusion: Personalized Chinese calligraphy generation with diffusion models. ICCC 2023.
Wu, S. J., Yang, C.-Y., & Hsu, J. Y. (2020). CalliGAN: Style and structure-aware Chinese calligraphy character generator. arXiv. http://export.arxiv.org/pdf/2005.12500
Liu, Z., Li, Y., Wang, K., Wu, Y., & Sun, W. (2024). Moyun: A diffusion-based Chinese calligraphy generation model with triple-label control. arXiv. https://arxiv.org/abs/2410.07618
Zhang, X., Han, T., Bai, S., Liu, H., & Xie, E. (2024). DP-Font: Diffusion-based calligraphy font generation with physical constraints. IJCAI 2024.
Liao, Q., Li, L., Fei, Y., & Xia, G. (2023). CalliPaint: Chinese calligraphy inpainting with diffusion models. Workshop on AI for Cultural Heritage, ICCV 2023.
Ma, L., Chen, C., Jin, X., Wang, Y., & Zeng, B. (2023). GlyphDraw: Integrating glyph control into diffusion for better Chinese text generation. arXiv. https://arxiv.org/abs/2303.17870
Lu, Y., Zhao, Y., Pan, J., & Han, J. (2024). AnyText: Scene text editing and generation with diffusion models. ModelScope, Alibaba DAMO Academy.
Sphene Labs