OpenAI GPT 4o ranked as best AI model for writing Solidity smart contract code by IQ

October 21, 2024


SolidityBench by IQ has launched as the first leaderboard to evaluate LLMs in Solidity code generation. Available on Hugging Face, it introduces two new benchmarks, NaïveJudge and HumanEval for Solidity, designed to assess and rank the proficiency of AI models in generating smart contract code.

Developed by IQ’s BrainDAO as part of its forthcoming IQ Code suite, SolidityBench serves to refine their own EVMind LLMs and compare them against generalist and community-created models. IQ Code aims to offer AI models tailored for generating and auditing smart contract code, addressing the growing need for secure and efficient blockchain applications.

As IQ told CryptoSlate, NaïveJudge offers a novel approach by tasking LLMs with implementing smart contracts based on detailed specifications derived from audited OpenZeppelin contracts. These contracts provide a gold standard for correctness and efficiency. The generated code is evaluated against a reference implementation using criteria such as functional completeness, adherence to Solidity best practices and security standards, and optimization efficiency.

The evaluation process leverages advanced LLMs, including different versions of OpenAI’s GPT-4 and Claude 3.5 Sonnet, as impartial code reviewers. They assess the code based on rigorous criteria, including implementation of all key functionalities, handling of edge cases, error management, proper syntax usage, and overall code structure and maintainability.

Optimization considerations such as gas efficiency and storage management are also evaluated. Scores range from 0 to 100, providing a comprehensive assessment across functionality, security, and efficiency, mirroring the complexities of professional smart contract development.

Which AI models are best for Solidity smart contract development?

Benchmarking results showed that OpenAI’s GPT-4o model achieved the highest overall score of 80.05, with a NaïveJudge score of 72.18 and HumanEval for Solidity pass rates of 80% at pass@1 and 92% at pass@3.

Interestingly, newer reasoning models like OpenAI’s o1-preview and o1-mini were beaten to the top spot, scoring 77.61 and 75.08, respectively. Models from Anthropic and xAI, including Claude 3.5 Sonnet and grok-2, demonstrated competitive performance with overall scores hovering around 74. Nvidia’s Llama-3.1-Nemotron-70B scored lowest in the top 10 at 52.54.

SolidityBench scores for LLMs (Hugging Face)

Per IQ, HumanEval for Solidity adapts OpenAI’s original HumanEval benchmark from Python to Solidity, encompassing 25 tasks of varying difficulty. Each task includes corresponding tests compatible with Hardhat, a popular Ethereum development environment, facilitating accurate compilation and testing of generated code. The evaluation metrics, pass@1 and pass@3, measure the model’s success on initial attempts and over multiple tries, offering insights into both precision and problem-solving capabilities.
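For readers unfamiliar with the pass@1 and pass@3 metrics, the sketch below shows how a pass@k figure is typically computed. It assumes SolidityBench follows the unbiased estimator from OpenAI’s original HumanEval work, which the article does not confirm; the function names and the sample results are hypothetical and are not SolidityBench data.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one task.

    n: number of code samples generated for the task
    c: number of those samples that pass every test
    k: number of attempts the metric allows
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # Fewer than k failing samples: any draw of k must include a passing one.
    if n - c < k:
        return 1.0
    # 1 minus the probability that all k drawn samples are failures.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average the per-task pass@k over (n, c) pairs, one pair per task."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical results for four tasks, three samples each: (samples, passing).
example_results = [(3, 3), (3, 1), (3, 0), (3, 2)]
print(f"pass@1 = {benchmark_pass_at_k(example_results, 1):.2f}")
print(f"pass@3 = {benchmark_pass_at_k(example_results, 3):.2f}")
```

Under this assumption, pass@3 can never be lower than pass@1, since a task solved on the first attempt also counts as solved within three. That is consistent with the gap between the 80% and 92% figures reported for GPT-4o.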

Objectives of using AI models in smart contract development

By introducing these benchmarks, SolidityBench seeks to advance AI-assisted smart contract development. It encourages the creation of more sophisticated and reliable AI models while providing developers and researchers with valuable insights into AI’s current capabilities and limitations in Solidity development.

The benchmarking toolkit aims to advance IQ Code’s EVMind LLMs and also sets new standards for AI-assisted smart contract development across the blockchain ecosystem. The initiative hopes to address a critical need in the industry, where the demand for secure and efficient smart contracts continues to grow.

Developers, researchers, and AI enthusiasts are invited to explore and contribute to SolidityBench, which aims to drive the continuous refinement of AI models, promote best practices, and advance decentralized applications.

Visit the SolidityBench leaderboard on Hugging Face to learn more and begin benchmarking Solidity generation models.
