Smart contracts are the backbone of decentralized finance (DeFi), NFTs, DAOs, and many other Web3 innovations. But they’re also permanently deployed, immutable, and often handle millions in assets. Once a contract is deployed to the blockchain, you can’t just patch a bug like in traditional software—any vulnerability becomes a target.
In 2022 and 2023 alone, over $3 billion was lost due to smart contract exploits, many of which could have been avoided with better auditing tools or detection systems.
That’s why security is critical—not just during audits, but throughout the development lifecycle. And now, thanks to advances in AI, developers have access to powerful LLM-based tools that can assist in writing, reviewing, and even fixing vulnerable code.
The table below summarizes recent research papers from 2024 that offer valuable insights for developers and security researchers exploring AI-assisted auditing.
Paper Title | Insight & Approach | Main Use Case | Results |
---|---|---|---|
Smart Contract Vulnerability Detection: The Role of LLM (Biagio Boi, Christian Esposito, Sokjoon Lee; ACM SAC Review, May 2024) | Uses GPT-3.5 trained on annotated vulnerabilities to detect complex bugs in Solidity code. | Pre-deployment bug detection | LLMs significantly improve vulnerability coverage during audits. |
LLMSmartSec (Viraaji Mothukuri, Reza M. Parizi, James L. Massa; Aug 2024) | Integrates fine-tuned GPT-4 with annotated control flow graphs for logic-aware auditing. | Logic-based vulnerability detection and auto-fix | Reduces reliance on manual audits with cost-effective automation. |
GPTScan (Yuqiang Sun, Daoyuan Wu, Yue Xue, et al.; Apr 2024) | Combines GPT with semantic slicing and program analysis for deep code understanding. | Finding logic bugs in money flow and control structures | Identifies ~80% of logic flaws missed by traditional tools. |
ContractTinker (Che Wang, Jiashuo Zhang, Jianbo Gao, et al.; Oct 2024) | Uses Chain-of-Thought prompting with static analysis to suggest code patches. | Automated vulnerability repair | Successfully fixes high-severity real-world contract bugs. |
Prompt-Guided ChatGPT (Jiarun Ma, Shiling Feng, Jiahao Zeng, et al.; Aug 2024) | Enhances ChatGPT with structured prompts and opcode-aware heuristics. | Detecting known vulnerability patterns | Prompt tuning significantly improves detection accuracy. |
SLLM System (Yunlai Zhou, Jianzhong Qi, Jin Zhu; Oct 2024) | Fuses Slither with GPT-4 using few-shot prompting for better bug classification. | Classifying vulnerabilities accurately | Reduces false positives and improves audit precision. |
ACFIX (Lyuye Zhang, Kaixuan Li, Kairan Sun, et al.; Mar 2024, arXiv) | Guides GPT-4 using mined access control (RBAC) patterns for secure patch generation. | Fixing access control bugs | Achieves 94.9% repair accuracy; outperforms baseline GPT. |
A few clear themes emerge from this research. LLMs can detect bugs well, especially when trained or fine-tuned on real vulnerability data (e.g., GPT-3.5, GPT-4, Llama-2).
Hybrid approaches dominate — the most effective tools combine LLMs with traditional program analysis (like Control Flow Graphs or static analysis via Slither).
Contract repair is real — tools like ContractTinker and ACFIX show LLMs can not only find bugs but also suggest patches with high success rates.
Prompt engineering works — ChatGPT’s vulnerability detection improves drastically with structured prompts and domain-specific guidance.
False positives are being reduced — systems like SLLM convert results into pseudocode and loop in feedback, making audits more accurate and less noisy.
Access control is a major focus — several papers, especially ACFIX, target fixing privilege escalation and role-based vulnerabilities effectively.
Cross-contract logic and subtle money flow bugs — previously missed by static tools — are now detectable by LLM-assisted tools like GPTScan and xFuzz.
While AI and LLM-based tools offer great support, manual review remains essential for catching subtle, business logic–specific issues. Here's a simplified checklist for developers doing a hands-on inspection:
Understand What Your Contract is Supposed to Do
Write a plain-English description of your contract's purpose (see the example after this list).
List expected user roles (e.g., admin, user) and what they should/shouldn’t be able to do.
This becomes your mental threat model.
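For example, a hypothetical staking contract might be described as: "Users stake TOKEN and earn rewards over time. Only the admin can change the reward rate. A user can withdraw only their own stake, and no role can move another user's funds." Every sentence in that description is an assumption you can later test against the code.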
Know the Architecture of Your Project
List all your contracts. Which ones are yours? Which ones are imported (e.g., OpenZeppelin)?
Understand how contracts are connected: Who calls whom? Who owns what?
Use the VS Code extension Solidity Visual Developer to visually inspect contract structure, call graphs, function dependencies, and inheritance.
This gives you a map of your system—which is essential to understand the blast radius of any function.
Build a Simple Threat Model
For each contract, answer:
What can an attacker try to manipulate?
Where are funds stored or moved?
Who can call upgrade/admin functions?
Are there external calls? Are return values checked?
Example output (illustrative; the contract and function names below are hypothetical):
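An attacker controls the `amount` and `recipient` arguments of `withdraw()`.
Funds sit in the contract's token balance and move only through `withdraw()` and `sweepFees()`.
`upgradeTo()` and `setFeeRate()` should be callable only by the owner.
`stake()` makes an external `token.transferFrom()` call whose return value must be checked.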
Use annotations like `@audit` in your code to bookmark lines of concern if using the Solidity Visual Developer extension.
Consider documenting findings in a spreadsheet with columns: Function | Risk Level | Reason | Mitigation
The goal is to list assumptions and then try to break them or ensure they’re enforced in code.
Check Transfer Logic and Access Control
Search your contract code for sensitive keywords such as `transfer`, `transferFrom`, `call`, `delegatecall`, `selfdestruct`, and `tx.origin`.
For each occurrence, ask yourself (a short Solidity sketch illustrating these checks follows the list):
Who is allowed to call this function?
Is access control enforced with a modifier or `require()` check?
Are return values checked (e.g., does `transfer()` return `true`)?
Is the value coming from user input or an untrusted source?
If using `delegatecall`, is it protected by strict access control and input validation?
If using token transfers, do you confirm success and avoid silent failures?
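To make these checks concrete, here is a minimal Solidity sketch (illustrative only; the contract and token interface are assumptions for this example, not code from any audited project) showing access control via a modifier, input validation with `require()`, and a checked transfer return value:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical minimal token interface: only the call this example needs.
interface IERC20 {
    function transfer(address to, uint256 amount) external returns (bool);
}

contract Treasury {
    address public owner;
    IERC20 public token;

    constructor(IERC20 _token) {
        owner = msg.sender; // the deployer becomes the owner
        token = _token;
    }

    // Access control enforced with a modifier using msg.sender, never tx.origin.
    modifier onlyOwner() {
        require(msg.sender == owner, "not owner");
        _;
    }

    function withdraw(address to, uint256 amount) external onlyOwner {
        // Validate untrusted input before moving funds.
        require(to != address(0), "zero address");
        // Check the return value so a failing transfer cannot fail silently.
        bool ok = token.transfer(to, amount);
        require(ok, "transfer failed");
    }
}
```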
Use Slither to Catch Common Vulnerabilities
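Slither ships as a Python package: install it with `pip install slither-analyzer`, then run `slither .` from your project root to scan every contract it can compile. Results are printed per detector, along with the impacted functions.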
Example findings:
`NaiveReceiverPool.flashLoan(...)` uses an arbitrary `from` in `transferFrom()` → potential for unauthorized token movement
`UnstoppableVault.execute(...)` uses `delegatecall` on user-supplied data → critical control risk
`DamnValuableStaking.stake()` ignores the return value of `transferFrom(...)` → may fail silently
`ShardsNFTMarketplace._closeOffer(...)` performs external calls before updating state → reentrancy risk (fix sketched below)
`FreeRiderRecoveryManager.onERC721Received()` uses `tx.origin` for auth → replace with `msg.sender`
Math.sol and other files use `^` (bitwise XOR) where `**` (exponentiation) was likely intended → incorrect math
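The standard fix for the reentrancy-style finding above is the checks-effects-interactions pattern. Here is a minimal sketch (the contract and its fields are illustrative, not taken from the flagged code):

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Illustrative checks-effects-interactions example, not the real marketplace.
contract OfferBook {
    struct Offer { address seller; uint256 deposit; bool closed; }
    mapping(uint256 => Offer) public offers;

    function openOffer(uint256 offerId) external payable {
        require(offers[offerId].seller == address(0), "taken");
        offers[offerId] = Offer(msg.sender, msg.value, false);
    }

    function closeOffer(uint256 offerId) external {
        Offer storage offer = offers[offerId];
        // Checks: validate the caller and the offer state first.
        require(msg.sender == offer.seller, "not seller");
        require(!offer.closed, "already closed");
        uint256 refund = offer.deposit;
        // Effects: clear state before the external call...
        offer.deposit = 0;
        offer.closed = true;
        // Interactions: ...so a re-entrant call sees updated state
        // and cannot trigger a second refund.
        (bool ok, ) = msg.sender.call{value: refund}("");
        require(ok, "refund failed");
    }
}
```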
Don’t assume a contract is safe just because it’s small.
Don’t trust input values without `require()` checks.
Don’t use `tx.origin` for authentication (see the contrast sketch after this list).
Don’t ignore compiler warnings—they often flag real risks.
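A quick contrast for the `tx.origin` rule (the contract is hypothetical): if the owner interacts with an attacker's contract, that contract can forward a call here, and `tx.origin` will still be the owner, so the "bad" check passes on an attacker-controlled path.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Illustrative only: shows why tx.origin-based auth is spoofable.
contract AuthExample {
    address public owner = msg.sender; // set to the deployer

    function badAuth() external view returns (bool) {
        return tx.origin == owner;  // DON'T: passes even when routed through a malicious contract
    }

    function goodAuth() external view returns (bool) {
        return msg.sender == owner; // DO: checks the immediate caller
    }
}
```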
AI-assisted security tools are not meant to replace traditional audits—but they streamline the development process, help detect bugs early, and reduce time-to-audit.
Here’s how smart contract developers and auditors can practically integrate these tools into their workflows:
IDE Integration: Use tools like Armur AI or LLMSmartSec that plug into VS Code or your editor of choice to get real-time feedback as you write Solidity.
AI Code Review: After writing a function or module, prompt GPT-based tools to simulate an adversary—ask: “Where can this function break?” (an example prompt follows this list).
Prompt-Guided Checks: Apply techniques from Prompt-Guided ChatGPT or ACFIX to guide LLMs with your specific logic or access control patterns.
Scan Entire Repos: Use SolidityScan, GPTScan, or SLLM to scan contracts end-to-end for known bugs and logic inconsistencies.
Hybrid Analysis: Combine LLM suggestions with static analyzers like Slither, and fuzzers like Echidna to increase signal and reduce false positives.
Threat Modeling: Pair LLMs with your manual threat model to validate assumptions (e.g., “Can anyone bypass this modifier?”).
Regression Checks: When updating contracts via proxy or upgradable patterns, rerun LLM scans to identify newly introduced risks.
Bug Repair: Use tools like ContractTinker to generate patches for known vulnerabilities and test suggested fixes.
Knowledge Sharing: Store past prompts and their LLM responses to build an internal library of reusable prompts for future audits.
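For example, a reusable adversarial-review prompt (purely illustrative) might read: “You are auditing the Solidity function below. List every caller that can reach it, every external call it makes, every state variable it writes, and any way its require() checks could be bypassed. Rank the findings by severity.” Prompts that surface real issues are worth saving; they compound in value across audits.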
Used together, LLMs offer a powerful way to speed up reviews, detect non-obvious flaws, and elevate the baseline security posture of your contracts.
If you're building or auditing smart contracts:
Combine Slither/Echidna with LLM tools like GPTScan or SLLM to catch logic bugs early (an Echidna property sketch follows this list).
Use prompt techniques from ACFIX or LLMSmartSec to enhance GPT-based contract review.
Look into tools like ContractTinker for automatic remediation of discovered flaws.
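As a concrete example of the fuzzing half of that combination, here is a minimal Echidna property sketch (the `Token` contract is hypothetical). Echidna treats any public function whose name starts with `echidna_` and returns `bool` as an invariant and searches for call sequences that make it return false; in recent releases you can run it with `echidna TokenTest.sol --contract TokenTest`.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical fixed-supply token used only to illustrate an Echidna property.
contract Token {
    mapping(address => uint256) public balanceOf;
    uint256 public constant TOTAL = 1_000_000;

    constructor() { balanceOf[msg.sender] = TOTAL; }

    function transfer(address to, uint256 amount) external {
        require(balanceOf[msg.sender] >= amount, "insufficient");
        balanceOf[msg.sender] -= amount;
        balanceOf[to] += amount;
    }
}

contract TokenTest is Token {
    // Invariant: no single account can ever hold more than the total supply.
    // Echidna reports a bug if this ever returns false.
    function echidna_no_balance_exceeds_total() public view returns (bool) {
        return balanceOf[msg.sender] <= TOTAL;
    }
}
```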
For readers newer to smart contract development, here’s a quick breakdown of terms used in this article:
Term | What It Means |
---|---|
Smart Contract | A program stored on a blockchain that runs when triggered. Used to automate agreements or logic like token transfers or DAO voting. |
Immutable | Once deployed, the code cannot be changed unless it uses upgradeable patterns like proxy contracts. |
DeFi | Short for Decentralized Finance—apps like lending, trading, or yield farming that operate without centralized intermediaries. |
DAO | Decentralized Autonomous Organization—an on-chain group governed by smart contracts and token holders. |
LLM (Large Language Model) | AI models (like GPT-4) trained to understand and generate human language, and increasingly used to analyze or write code. |
Static Analysis | Examining code without executing it, often used to catch security issues in smart contracts. |
Prompt Engineering | Crafting effective prompts to guide an AI (like ChatGPT) to produce accurate, relevant, or secure responses. |
CFG (Control Flow Graph) | A diagram representing the execution paths within a smart contract. Useful for understanding logic and detecting flaws. |
Slither / Echidna | Popular open-source tools for Solidity: Slither performs static analysis, while Echidna does property-based fuzz testing. |
Reentrancy | A type of vulnerability where an external contract repeatedly calls into a function before the first invocation finishes—can drain funds. |
tx.origin | A global variable in Solidity that should not be used for authentication—it refers to the externally owned account that originated the transaction, not necessarily the immediate caller (msg.sender). |
Fuzzing | Automated testing technique where random inputs are fed to a contract to try and trigger bugs or vulnerabilities. |
Proxy Pattern | A design that enables upgradeable smart contracts by separating logic and data storage. |
Blast Radius | The extent of damage if a function is exploited. |