# What is a nilpotent matrix?

By [sonadorje](https://paragraph.com/@sonadorje) · 2026-04-02

---

![](https://storage.googleapis.com/papyrus_images/21484978c4b97bbaa70ccc910b9a7abdc69b16f766eace3214af4126cea13227.png)

To understand the position of the nilpotent matrix in Jordan canonical form, we can think of it as a "Lego brick" in the world of linear algebra.

If a diagonal matrix is a "pure stretch," then a nilpotent matrix is an "ordered collapse." The core idea of the Jordan canonical form is to break any complex linear transformation down into a combination of **"stretching"** and **"collapsing"**.

* * *

1\. What is a nilpotent matrix? (Intuitive understanding)
---------------------------------------------------------

A matrix $$N$$ is called a **nilpotent matrix** if there exists some positive integer $$k$$ such that: $$N^k = 0$$

Geometrically, nilpotent matrices are quite interesting. Unlike rotations or scalings, which preserve the dimension of the space, a nilpotent matrix acts more like a "black hole" or a "crusher":

*   **Step 1:** It flattens a portion of the space.
    
*   **Step 2:** It continues to flatten the remaining portion.
    
*   **Finally:** After a finite number of applications, the entire space is compressed to the origin (the zero vector).
    

**Illustrated Concept:** Imagine a staircase in three-dimensional space, where each step represents a dimension. A nilpotent matrix makes a vector "descend the stairs." Once it reaches the ground (the null space), it vanishes.

* * *

2\. Anatomy of the Jordan Block: Stretch + Offset
-------------------------------------------------

Any Jordan block $$J(\\lambda)$$ can be split into two parts:

$$\\begin{aligned} J &= \\underbrace{\\begin{pmatrix} \\lambda & 0 & 0 \\\\ 0 & \\lambda & 0 \\\\ 0 & 0 & \\lambda \\end{pmatrix}}\_{\\text{Diagonal Part}} + \\underbrace{\\begin{pmatrix} 0 & 1 & 0 \\\\ 0 & 0 & 1 \\\\ 0 & 0 & 0 \\end{pmatrix}}\_{\\text{Nilpotent Part } N} \\end{aligned}$$

Here:

1.  **$$\\lambda I$$ (diagonal part):** Responsible for uniformly stretching or compressing in all directions (scaling factor of $$\\lambda$$).
    
2.  **$$N$$ (nilpotent part):** Responsible for **chaining** (a shear). It is the matrix whose entries are all 0 except for the 1s on the superdiagonal (the diagonal just above the main one).
    

### Why is this $$N$$ needed?

Ideally, if a matrix has enough eigenvectors, we can diagonalize it. But some matrices are "stubborn" and don't have enough eigenvectors. This is where $$N$$ comes in; it fills in the missing dimensions using a **Jordan chain**.
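As a quick numerical sanity check (a minimal NumPy sketch; λ = 2 is an arbitrary choice, not from the text above), the split of a Jordan block into its two parts, and the fact that those parts commute, can be verified directly:

```python
import numpy as np

# A 3x3 Jordan block with eigenvalue λ = 2, split into its two parts.
lam = 2.0
D = lam * np.eye(3)              # diagonal part: pure stretch
N = np.diag([1.0, 1.0], k=1)     # nilpotent part: 1s on the superdiagonal
J = D + N

# λI commutes with everything, in particular with N -- this is what
# later makes functions of J (such as e^J) easy to compute.
assert np.allclose(D @ N, N @ D)
assert np.allclose(J, np.array([[2, 1, 0],
                                [0, 2, 1],
                                [0, 0, 2]]))
```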

* * *

3\. Core Application: Transitivity of the Jordan Chain
------------------------------------------------------

The core role of a nilpotent matrix in the decomposition is to construct a **chain of vectors**.

Let's take a 3x3 example (the most intuitive one):

$$ N = \\begin{pmatrix} 0 & 1 & 0 \\\\ 0 & 0 & 1 \\\\ 0 & 0 & 0 \\end{pmatrix} $$

Its "disappearance process" is super fun:

$$ N^2 = \\begin{pmatrix} 0 & 0 & 1 \\\\ 0 & 0 & 0 \\\\ 0 & 0 & 0 \\end{pmatrix}, \\quad N^3 = \\begin{pmatrix} 0 & 0 & 0 \\\\ 0 & 0 & 0 \\\\ 0 & 0 & 0 \\end{pmatrix} $$

(N³ = 0, but N² ≠ 0, so the index of nilpotency is 3.)
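The "disappearance process" above takes only a couple of lines to check (a NumPy sketch of the same 3×3 block):

```python
import numpy as np

# The 3x3 nilpotent block: 1s on the superdiagonal, 0 elsewhere.
N = np.diag([1.0, 1.0], k=1)

N2 = N @ N    # one lonely 1 survives in the top-right corner
N3 = N2 @ N   # all zeros: the index of nilpotency is 3

assert N2[0, 2] == 1.0
assert not np.allclose(N2, 0)
assert np.allclose(N3, 0)
```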

Its effect on a basis $$\{v\_1, v\_2, v\_3\}$$ is as follows:

*   $$N v\_3 = v\_2$$
    
*   $$N v\_2 = v\_1$$
    
*   $$N v\_1 = 0$$
    

This forms a **cascaded structure**: $$v\_3 \\to v\_2 \\to v\_1 \\to 0$$.

**Geometric Interpretation:** N acts like a "conveyor chain," pushing the generalized eigenvectors along one by one until they reach 0. This is why the size of a Jordan block reflects the degree to which the matrix fails to be diagonalizable!
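Taking the standard basis as a concrete stand-in for $$v\_1, v\_2, v\_3$$, the chain action can be verified directly (a NumPy sketch):

```python
import numpy as np

N = np.diag([1.0, 1.0], k=1)
e1, e2, e3 = np.eye(3)    # standard basis playing the roles of v1, v2, v3

assert np.allclose(N @ e3, e2)            # v3 -> v2
assert np.allclose(N @ e2, e1)            # v2 -> v1
assert np.allclose(N @ e1, np.zeros(3))   # v1 -> 0
```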

In short: **the nilpotent matrix is the "glue" of the Jordan canonical form: it strings the generalized eigenvectors for each λ together, allowing any matrix to be brought to normal form. Without it, the Jordan form would not exist!**

**Visual Imagination:** This is like a set of falling dominoes. Each vector points to its "upstream," while the last vector points to nothingness. In the Jordan canonical form, each Jordan block is actually such an independent sequence of dominoes.

* * *

Summary
-------

**The nilpotent matrix plays a "glue" role in the Jordan decomposition:**

*   **Structurally:** It fills the gap left by an insufficient supply of eigenvectors, stringing generalized eigenvectors together into a "chain."
    
*   **Computationally:** It exploits the property that "a finite number of multiplications always yields zero" to reduce complicated transcendental functions (such as $$e^A$$) to simple polynomials.
    
*   **Intuitively:** It represents the **shearing effect** or **delay effect** in the system.
    

It is because of the nilpotent matrix that we can say any linear transformation can essentially be decomposed into a combination of several independent "scaling + shearing" units.
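The computational point about $$e^A$$ can be made concrete. Because $$N^3 = 0$$, the exponential series for the 3×3 block terminates after the $$N^2$$ term, and since $$\lambda I$$ commutes with $$N$$ we get $$e^J = e^\lambda (I + N + N^2/2)$$. A minimal NumPy sketch (λ = 2 chosen arbitrarily) cross-checks this closed form against a brute-force truncation of the power series:

```python
import numpy as np

lam = 2.0
N = np.diag([1.0, 1.0], k=1)
J = lam * np.eye(3) + N

# N^3 = 0, so the series for e^N stops after N^2; λI commutes with N,
# hence e^J = e^λ (I + N + N^2 / 2).
closed_form = np.exp(lam) * (np.eye(3) + N + N @ N / 2)

# Brute-force truncation of the power series for e^J as a cross-check.
series, term = np.eye(3), np.eye(3)
for k in range(1, 30):
    term = term @ J / k
    series = series + term

assert np.allclose(closed_form, series)
```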

* * *

Calculation of generalized eigenvectors
=======================================

The process of computing generalized eigenvectors is essentially about finding vectors that, although not collinear with the eigenvectors, eventually "collapse" back onto them under repeated application of the matrix.

If eigenvectors are the "trunk" of a system, then generalized eigenvectors are the "branches and leaves" attached to the trunk.

* * *

1\. Why do we need to look for generalized eigenvectors?
--------------------------------------------------------

When we find that the geometric multiplicity of an eigenvalue $$\lambda$$ (the number of linearly independent eigenvectors) is less than its algebraic multiplicity (its multiplicity as a root of the characteristic equation), ordinary eigenvectors are insufficient and part of the space is "missing".

To complete the base, we need to build the **Jordan Chain**.

* * *

2\. Algorithm Core: "Climbing Up" Strategy
------------------------------------------

The most intuitive way to find generalized eigenvectors is recursion. Suppose we have found a true eigenvector $$v\_1$$ (satisfying $$(A - \\lambda I)v\_1 = 0$$).

Our goal is to find $$v\_2, v\_3, \\dots$$ such that:

1.  $$(A - \\lambda I)v\_1 = 0$$ (This is the foundation: eigenvectors)
    
2.  $$(A - \\lambda I)v\_2 = v\_1$$ (This is the first-order generalized eigenvector)
    
3.  $$(A - \\lambda I)v\_3 = v\_2$$ (This is the second-order generalized eigenvector)
    
4.  ...and so on.
    

### The mathematical logic here:

Note that $$v\_2$$, although not an eigenvector, becomes zero if we apply the operator **twice**: $$(A - \\lambda I)^2 v\_2 = (A - \\lambda I) v\_1 = 0$$ This means that $$v\_k$$ belongs to the null space of the matrix $$(A - \\lambda I)^k$$.
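The "climbing up" recursion can be carried out numerically. The sketch below uses a hypothetical 2×2 matrix with a double eigenvalue and only one eigenvector, and solves the singular system $$(A - \lambda I)v\_2 = v\_1$$ by least squares (one of several ways to handle the singularity):

```python
import numpy as np

# Hypothetical example: A has eigenvalue λ = 2 with algebraic
# multiplicity 2 but only one eigenvector, so a Jordan chain is needed.
A = np.array([[2.0, 1.0],
              [0.0, 2.0]])
lam = 2.0
M = A - lam * np.eye(2)

v1 = np.array([1.0, 0.0])       # genuine eigenvector: M v1 = 0
assert np.allclose(M @ v1, 0)

# Climb up: solve M v2 = v1. M is singular, so use least squares;
# a ~0 residual confirms v1 lies in the range of M.
v2, *_ = np.linalg.lstsq(M, v1, rcond=None)
assert np.allclose(M @ v2, v1)     # first-order generalized eigenvector
assert np.allclose(M @ M @ v2, 0)  # and (A - λI)^2 v2 = 0
```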

* * *

3\. Visual Understanding: The "Baton" of Vectors
------------------------------------------------

We can think of this process as a **relay race**:

*   **$$v\_k$$** holds the baton.
    
*   Through the action of the operator $$(A - \lambda I)$$, it passes the baton to **$$v\_{k-1}$$**, and so on down the chain until the baton reaches **$$v\_1$$**. After one final action, $$v\_1$$ crosses the finish line directly (becoming **$$0$$**).
    

**Diagram Structure:**

> $$v\_k \xrightarrow{A-\lambda I} v\_{k-1} \xrightarrow{A-\lambda I} \dots \xrightarrow{A-\lambda I} v\_1 \xrightarrow{A-\lambda I} 0$$

This sequence of vectors forms the basis of a Jordan block.

* * *

4\. Actual Calculation Steps (Taking a $$3 \\times 3 $$ matrix as an example)
-----------------------------------------------------------------------------

Focusing solely on a specific Jordan chain of length 3, let's examine what happens to its vectors under the powers $$N$$, $$N^2$$, $$N^3$$ of the nilpotent matrix $$N$$.

This is the cornerstone of understanding the whole problem.

* * *

### 4.1 Setting the Scene

Suppose we have a nilpotent matrix $$N$$ (e.g., $$N = A - \lambda I$$, where $$\lambda$$ is an eigenvalue). We find a **Jordan chain** of length 3 consisting of three linearly independent vectors $$\{v\_1, v\_2, v\_3\}$$ that satisfy the following **core recurrence relation**: $$ \begin{aligned} N v\_1 &= 0 \\ N v\_2 &= v\_1 \\ N v\_3 &= v\_2 \end{aligned} $$ This relation is the **definition** of the Jordan chain. Please remember it.

* * *

### 4.2 Examining, level by level, which kernel each of the three vectors belongs to

**Definition of the kernel $$\ker(N^m)$$**: the set of all vectors $$\mathbf{x}$$ satisfying $$N^m \mathbf{x} = \mathbf{0}$$.

**(1) First level: $$\\ker N$$ (vectors satisfying $$N\\mathbf{x}=0$$)**

*   $$N v\_1 = 0$$ ⇒ Therefore $$v\_1 \\in \\ker N$$.
    
*   $$N v\_2 = v\_1 \\neq 0$$ ⇒ So $$v\_2 \\notin \\ker N$$.
    
*   $$N v\_3 = v\_2 \\neq 0$$ ⇒ So $$v\_3 \\notin \\ker N$$.
    

**Conclusion:** In the subspace generated by this chain, **only $$v\_1$$ belongs to $$\\ker N$$**. It contributes **1 dimension**.

**(2) Second level: $$\ker N^2$$ (vectors satisfying $$N^2\mathbf{x}=0$$)**

We apply $$N^2$$ to each vector:

*   $$N^2 v\_1 = N(N v\_1) = N(0) = 0$$ ⇒ Therefore $$v\_1 \\in \\ker N^2$$. (This is natural because $$\\ker N \\subset \\ker N^2$$)
    
*   $$N^2 v\_2 = N(N v\_2) = N(v\_1) = 0$$ ⇒ **Here's the key point**: Although $$N v\_2 \\neq 0$$, $$N^2 v\_2 = 0$$. Therefore, $$v\_2 \\in \\ker N^2$$.
    
*   $$N^2 v\_3 = N(N v\_3) = N(v\_2) = v\_1 \\neq 0$$ ⇒ So $$v\_3 \\notin \\ker N^2$$.
    

**Conclusion:** In the subspace generated by this chain, $$v\_1$$ and $$v\_2$$ both satisfy $$N^2 \mathbf{x} = 0$$; they are linearly independent and therefore **together span a 2-dimensional subspace contained in $$\ker N^2$$**. Compared to $$\ker N$$, **one additional vector $$v\_2$$ has been admitted**, so the dimension increases by **1**.

**(3) Third level: $$\\ker N^3$$ (vectors that satisfy $$N^3\\mathbf{x}=0$$)**

*   $$N^3 v\_1 = N^2(0) = 0$$ ⇒ $$v\_1 \\in \\ker N^3$$.
    
*   $$N^3 v\_2 = N^2(v\_1) = 0$$ ⇒ $$v\_2 \\in \\ker N^3$$.
    
*   $$N^3 v\_3 = N^2(v\_2) = N(v\_1) = 0$$ ⇒ **Look here**: Now $$v\_3$$ also satisfies $$N^3 v\_3 = 0$$! So $$v\_3 \\in \\ker N^3$$.
    

**Conclusion:** The entire chain $$\{v\_1, v\_2, v\_3\}$$ falls within $$\ker N^3$$, contributing a 3-dimensional subspace. Compared to $$\ker N^2$$, **one additional vector $$v\_3$$ has been admitted**, increasing the dimension by **1**.
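The three kernel dimensions derived above (1, 2, 3) can be confirmed by rank computations (a NumPy sketch of the same length-3 block):

```python
import numpy as np

N = np.diag([1.0, 1.0], k=1)   # the length-3 chain's nilpotent block

# Rank-nullity: dim ker N^m = 3 - rank(N^m). The dimensions climb
# 1 -> 2 -> 3, one new chain vector "unlocked" at each level.
dims = [int(3 - np.linalg.matrix_rank(np.linalg.matrix_power(N, m)))
        for m in (1, 2, 3)]
assert dims == [1, 2, 3]
```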

* * *

### 4.3 An "unlocking" picture of the kernels

You can think of $$\\ker N^m$$ as an "m-level security door".

*   $$v\_1$$ has the highest privileges and can pass through **Level 1 doors** ($$N$$).
    
*   $$v\_2$$ has lower privileges; it cannot pass through level 1 doors, but it can pass through **level 2 doors** ($$N^2$$). Its "pass" is $$N^2 v\_2 = 0$$.
    
*   $$v\_3$$ has the lowest privilege level; it cannot even pass through a level 2 door, but it can pass through a level 3 door ($$N^3$$).
    

**From $$\\ker N$$ to $$\\ker N^2$$**, the gate is relaxed, and $$v\_2$$ is "released" in. **The dimension is thus increased by 1**. **From $$\\ker N^2$$ to $$\\ker N^3$$**, the access control becomes looser, and $$v\_3$$ is also "released" in. **The dimension increases by 1**.

* * *

### 4.4 Connecting Block Length to the Kernel's "Growth"

Now, consider this chain as a "block":

*   This block contributes **1 dimension** (namely $$v\_1$$) to $$\dim \ker N$$.
    
*   This block contributes **2 dimensions** ($$v\_1, v\_2$$) to $$\dim \ker N^2$$. Compared to the first level, the **growth is +1**. This "+1" occurs **if and only if** the block has length ≥ 2.
    
*   This block contributes **3 dimensions** ($$v\_1, v\_2, v\_3$$) to $$\dim \ker N^3$$. Compared to the second level, the **growth is again +1**. This "+1" occurs **only if** the block has length ≥ 3.

**Generalizing:**

*   If a block has length 2 (only $$\{v\_1, v\_2\}$$), it contributes +1 growth from $$\ker N$$ to $$\ker N^2$$, but no further growth at $$\ker N^3$$ (because there is no $$v\_3$$).
    
*   If a block has length 1 (only $$\{v\_1\}$$), it contributes just 1 dimension to every $$\ker N^m$$ **and never produces any growth**.
    
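These counting rules can be checked on a small example: a nilpotent matrix assembled from one length-2 block and one length-1 block (a NumPy sketch; the kernel-dimension jumps count the blocks of each length):

```python
import numpy as np

# Two chains glued into one nilpotent matrix: a length-2 block
# (rows/cols 0-1) and a length-1 block (row 2).
N = np.zeros((3, 3))
N[0, 1] = 1.0

def kernel_dim(M):
    return 3 - np.linalg.matrix_rank(M)

d1, d2 = kernel_dim(N), kernel_dim(N @ N)

assert d1 == 2                       # number of blocks = dim ker N
assert d2 - d1 == 1                  # +1 growth: one block of length >= 2
assert kernel_dim(N @ N @ N) == d2   # no further growth: no block of length >= 3
```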

* * *

5\. Analogy from a Developer's Perspective
------------------------------------------

If you are familiar with **singly linked lists**, the structure of generalized eigenvectors really is a linked list: each vector $$v\_i$$ carries a "pointer" to the next vector in the chain. In code, this is like a recursive call that continues until the base case (the zero vector) is reached.
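The linked-list analogy can be made literal. The sketch below uses a hypothetical `walk_chain` helper that follows the "pointers" (repeated applications of $$N$$) until the base case:

```python
import numpy as np

def walk_chain(N, v, tol=1e-12):
    """Follow a Jordan chain like a linked list: apply N repeatedly
    ("follow the pointer") until the zero vector is reached."""
    chain = []
    while np.linalg.norm(v) > tol:
        chain.append(v)
        v = N @ v        # pointer to the next node
    return chain

N = np.diag([1.0, 1.0], k=1)
v3 = np.array([0.0, 0.0, 1.0])
chain = walk_chain(N, v3)
assert len(chain) == 3   # v3 -> v2 -> v1 -> 0
```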

This chained structure ensures that under the transformation $$P^{-1}AP$$, the matrix is forced into the form where:

*   the main diagonal is $$\lambda$$;
    
*   the superdiagonal (immediately above the main diagonal) is all 1s, where each 1 is the algebraic expression of the relation $$(A - \lambda I)v\_k = v\_{k-1}$$.
    

**Advanced Thinking:** In practical numerical computation (such as processing noisy data), finding generalized eigenvectors is a very "ill-conditioned" problem because even slight perturbations can break the Jordan chain. This is why, in engineering practice, we tend to use **SVD (Singular Value Decomposition)** or **Schur Decomposition** rather than rigidly adhering to the Jordan canonical form.
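The ill-conditioning can be illustrated with a hypothetical perturbed Jordan block: a corner entry of size $$10^{-12}$$ splits the double eigenvalue by roughly $$2\sqrt{10^{-12}} = 2 \times 10^{-6}$$, enormously larger than the perturbation itself:

```python
import numpy as np

# Perturb a 2x2 Jordan block with eigenvalue 2 by ε in the corner.
# The double eigenvalue splits by ±sqrt(ε): the Jordan structure is
# destroyed by a perturbation far below working precision.
eps = 1e-12
A = np.array([[2.0, 1.0],
              [eps, 2.0]])

eigs = np.linalg.eigvals(A)
spread = abs(eigs[0] - eigs[1])
assert spread > 1e-7   # ~2e-6 split from a 1e-12 perturbation
```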

---

*Originally published on [sonadorje](https://paragraph.com/@sonadorje/what-is-a-nilpotent-matrix)*
