

This beautiful afternoon, I sit by the pine trees munching on brownies, trying to decide whether I should name this book 「天籃」or「天藍」... (The two differ only in the second character's radical: bamboo 竹 in 籃 "basket" vs grass 艹 in 藍 "blue".) Then it occurs to me: why not both? Yes, I decide I want a beautiful piece of Chinese calligraphy that can be read both ways! It needs to be abstract and subtly metaphorical: 墨中藍色，天籃亦天藍 (blue hidden within the ink, so that it reads as both "sky basket" and "sky blue"). So, like most of us do in 2025, I turn to my AI buddies for help.
Here is what I want. Try to imagine it with your human intelligence first.
...「天籃」in Chinese calligraphy, so that the black ink strokes form「天藍」, but with the addition of two subtle strokes (abstracted, e.g., as water-droplet dots), the visual impression evokes「天籃」(sky basket).
Artistic Concept: Write「天藍」in elegant calligraphy using black ink:「天」at the top, upright and spacious like the sky;「藍」below, complex yet balanced like a woven basket... Then add two small translucent water-droplet dots (possibly in very light blue) under the radical 艹 in「藍」, mimicking the 竹 in「籃」. 藍字頭跟竹字頭不一樣的部分用藍色寫 (write the part where the top of 藍 differs from the top of 竹 in blue).
Visual and Symbolic Effect: From a distance, or with a viewer's imagination, the droplets complete the transformation from 藍 to 籃; the original meaning "heavenly basket" (天籃) becomes double-layered with the carefree feeling of sky blue (天藍)...
After trying lots of prompt variations in both English and Chinese, I give up on ChatGPT's DALL·E model series. Most of ChatGPT's attempts are simply wrong. Even the best attempt (see below) lacks genuine calligraphic strokes, let alone the emotion and creativity of a true calligrapher. They are certainly not what I'm looking for.

I ended up testing all the usual tools – and even signing up to some not-so-usual ones that claim to write Chinese calligraphy well:
DALL·E (diffusion-based) – beautiful in many domains, but struggled with precise character rendering and stylistic control.
Stable Diffusion (with LoRA & fine prompts) – got us closer, but the strokes were often warped, and it couldn't handle「籃」with the integrity calligraphy demands.
NightCafe – easy to use, but the outputs felt more like a decorative font than brush-born art.
There's An AI For That – yes, there is, but many of them perform poorly...
YesChat Calligraphy plugin – font-based generation, better control, and we even layered droplets with PIL. It was clever… but lacked soul. Actually, it lacked the whole second character. LOL. 🤣




Some of YesChat's attempts are interesting. It actually wrote a script to draw the piece. Here's the script:
from PIL import Image, ImageDraw, ImageFont
import matplotlib.pyplot as plt

# Load the custom Chinese calligraphy font
font_path = "/mnt/data/ζ΅·ηΊ³ηΎε·-δΈι¦εε€δΉ¦ζ³δ½.ttf"
font_size = 240  # for individual character sizing

# Create a blank white canvas for two characters stacked vertically (300x600)
image_width = 300
image_height = 600
img = Image.new("RGB", (image_width, image_height), color="white")
draw = ImageDraw.Draw(img)
font = ImageFont.truetype(font_path, font_size)

# Characters to draw
characters = ["天", "籃"]

# Draw each character centred in its own 300x300 box
for i, char in enumerate(characters):
    # textsize() was removed in Pillow 10; textbbox() is the replacement
    left, top, right, bottom = draw.textbbox((0, 0), char, font=font)
    w, h = right - left, bottom - top
    x = (image_width - w) // 2
    y = (300 * i) + (300 - h) // 2
    draw.text((x, y), char, font=font, fill="black")

# Add two small translucent blue dots above the "艹" of 籃 to hint at 藍,
# positioned near the top of the second 300x300 box
dot_radius = 6
dot_color = (173, 216, 230, 180)  # light blue with some transparency

# Convert to RGBA for transparency support
img = img.convert("RGBA")
draw = ImageDraw.Draw(img)

# Estimated droplet positions (based on the character layout)
dot_positions = [(120, 330), (160, 330)]
for x, y in dot_positions:
    draw.ellipse((x - dot_radius, y - dot_radius,
                  x + dot_radius, y + dot_radius), fill=dot_color)

# Convert back to RGB, save, and display
img = img.convert("RGB")
output_path = "/mnt/data/TianLan_artistic.jpg"
img.save(output_path)
plt.imshow(img)
plt.axis('off')
plt.title("Artistic Calligraphy: 天籃 evoking 天藍")
plt.show()
Wow. That's disappointing. I guess diffusion-based models currently lack the level of symbolic, compositional control that can be specified through prompts...
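Incidentally, the translucency in that script never actually blends: Pillow's ImageDraw writes the RGBA fill value straight into the pixels rather than compositing it, so once the image is converted back to RGB the dots come out as flat, fully opaque light blue. Proper source-over blending is just out = src*a + dst*(1-a) per channel; a stdlib-only sketch of the wash the droplets were meant to have:

```python
def blend_over(dst, src, alpha):
    """Source-over compositing of one RGB pixel: src at opacity alpha on dst."""
    return tuple(round(s * alpha + d * (1 - alpha)) for s, d in zip(src, dst))

white_paper = (255, 255, 255)
light_blue = (173, 216, 230)

# The wash the script intended: alpha 180/255 over white paper
droplet = blend_over(white_paper, light_blue, 180 / 255)
print(droplet)  # → (197, 227, 237), a paler wash than the opaque (173, 216, 230)
```

(Pillow can get this right if the dots are drawn on a separate transparent RGBA overlay that is then merged with Image.alpha_composite.)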
Makes you wonder what state-of-the-art academic research has been able to achieve so far? Same here.
My favourite research buddy lately is SciSpace. It's great – just try it if you're reading this line and haven't used it already. In short, there are a lot of really neat diffusion-based and GAN-based models that are good at generating individual calligraphic characters with stylistic precision; however, none gives us the spatial and symbolic control we need to see「藍」through「籃」.
Model | Year | Method | Input Control | Limitation |
Calliffusion | 2023 | Diffusion | Chinese prompts + calligraphy style label | Only one character at a time; no layout or symbolic overlay support |
CalliffusionV2 | 2024 | Diffusion + LoRA + multi-modal | Natural-language prompt + image guidance | No support for multi-character layout or compositional creativity |
Moyun | 2024 | Diffusion + Vision Mamba | Triplet input (script, calligrapher, character) | No prompt- or image-based control; layout & symbolic overlays not supported |
DP-Font | 2024 | Diffusion + PINN (physics-informed neural network) | Attribute + stroke constraints | Single-character font generation; no layout or symbolic overlay support |
CalliPaint | 2023 | Diffusion + inpainting (RePaint) | Masked image + character prompt | Stroke repair only; no layout or generative composition |
GlyphDraw | 2023 | Diffusion + glyph injection | Glyph image + placement mask | Not calligraphy-specific; supports scene text only |
AnyText | 2024 | Diffusion (Alibaba ModelScope) | Text + layout mask or image | Works well for stylized text insertion; not built for brush artistry |
So, anyone out there reading this who wants to give it a go? I think what we really need is a hybrid offering:
Fine control over layout, stroke dynamics, and multi-character composition
Support for overlays or compositional semantics (e.g., knowing where to place droplets symbolically)
Promptability, so we can say things like: "Write 天籃 in a way that some might read it as 天藍, in 王羲之 (Wang Xizhi)'s style, like it's under a sky before rain"
Stroke-level output, so we can animate or render brush pressure – not just pixels
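To make "compositional semantics" concrete, here is a hypothetical spec for the 天籃/天藍 piece, the kind of structured input a promptable renderer might consume. Every field name is invented for illustration; no current model accepts anything like it:

```python
# Hypothetical compositional spec for the 天籃/天藍 piece (all keys invented).
spec = {
    "layout": "vertical",
    "characters": [
        {"glyph": "天", "style": "王羲之"},
        {
            "glyph": "藍",
            "style": "王羲之",
            "overlays": [
                {
                    "kind": "droplet",
                    "count": 2,
                    "color": "#ADD8E6",
                    "anchor": {"component": "艹", "position": "below"},
                    "intent": "evoke the 竹 of 籃",
                }
            ],
        },
    ],
}

def validate(spec):
    """Minimal sanity checks a renderer front-end might run on a spec."""
    assert spec["layout"] in {"vertical", "horizontal"}
    for ch in spec["characters"]:
        assert len(ch["glyph"]) == 1  # one Hanzi per slot
        for ov in ch.get("overlays", []):
            assert "kind" in ov and "anchor" in ov
    return True
```

The point of the anchor field is exactly what the table above shows is missing: a way to tie an overlay to a named component of a glyph rather than to pixel coordinates.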
Perhaps start by:
Fine-tuning a diffusion model like Calliffusion with multi-character layout data?
Training on annotated datasets with symbolic annotations (like βhere, a dot implies somethingβ)?
Using Hanzi stroke decomposition datasets to reconstruct SVGs and animate the strokes?
Blending this with a custom rendering engine that understands both brush and poetry?
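For the stroke-decomposition route, open datasets such as Make Me a Hanzi publish per-stroke outlines and median polylines for each character. Given a stroke median, you can resample it by arc length and taper the nib width along the stroke to fake brush pressure, then emit SVG that is easy to animate. A stdlib-only sketch (the three-point median below is made up; real medians come from the dataset):

```python
import math

def resample(points, n):
    """Resample a polyline to n points evenly spaced by arc length."""
    dists = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dists.append(dists[-1] + math.hypot(x1 - x0, y1 - y0))
    total = dists[-1]
    out = []
    for i in range(n):
        target = total * i / (n - 1)
        j = max(k for k in range(len(dists)) if dists[k] <= target)
        j = min(j, len(points) - 2)
        seg = dists[j + 1] - dists[j]
        t = 0.0 if seg == 0 else (target - dists[j]) / seg
        (x0, y0), (x1, y1) = points[j], points[j + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out

def brush_stroke_svg(median, base_width=8.0):
    """Render one stroke median as overlapping SVG circles whose radius
    tapers toward the stroke's end, a crude stand-in for brush pressure."""
    pts = resample(median, 24)
    circles = []
    for i, (x, y) in enumerate(pts):
        pressure = 1.0 - 0.7 * (i / (len(pts) - 1))  # heavy entry, light exit
        r = base_width * pressure
        circles.append(f'<circle cx="{x:.1f}" cy="{y:.1f}" r="{r:.1f}"/>')
    return "<g>" + "".join(circles) + "</g>"

# A made-up horizontal stroke median; real ones come from the dataset
svg = brush_stroke_svg([(100, 400), (400, 380), (700, 400)])
```

Sweeping a fixed pressure profile along real medians is crude next to a trained stroke model, but it yields vector output you can animate stroke by stroke.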
Who knows – maybe yours will be able to take a set of Chinese characters, a text prompt, and even an image, and paint something magical like the video my grandma shared with me last week... speaking of which, maybe that's why the idea of calligraphic art came to me this beautiful afternoon...

Liao, Q., Li, L., Fei, Y., & Xia, G. (2024). CalliffusionV2: Personalized natural calligraphy generation with flexible multi-modal control. arXiv. https://doi.org/10.48550/arXiv.2410.03787
Liao, Q., Li, L., Fei, Y., & Xia, G. (2023). Calliffusion: Personalized Chinese calligraphy generation with diffusion models. ICCC 2023.
Wu, S. J., Yang, C.-Y., & Hsu, J. Y. (2020). CalliGAN: Style and structure-aware Chinese calligraphy character generator. arXiv. http://export.arxiv.org/pdf/2005.12500
Liu, Z., Li, Y., Wang, K., Wu, Y., & Sun, W. (2024). Moyun: A diffusion-based Chinese calligraphy generation model with triple-label control. arXiv. https://arxiv.org/abs/2410.07618
Zhang, X., Han, T., Bai, S., Liu, H., & Xie, E. (2024). DP-Font: Diffusion-based calligraphy font generation with physical constraints. IJCAI 2024.
Liao, Q., Li, L., Fei, Y., & Xia, G. (2023). CalliPaint: Chinese calligraphy inpainting with diffusion models. Workshop on AI for Cultural Heritage, ICCV 2023.
Ma, L., Chen, C., Jin, X., Wang, Y., & Zeng, B. (2023). GlyphDraw: Integrating glyph control into diffusion for better Chinese text generation. arXiv. https://arxiv.org/abs/2303.17870
Lu, Y., Zhao, Y., Pan, J., & Han, J. (2024). AnyText: Scene text editing and generation with diffusion models. ModelScope, Alibaba DAMO Academy.

Sphene Labs