Open source models race to beat GPT-4 on coding tasks

Two open source models, WizardCoder 34B by Wizard LM and CodeLlama-34B by Phind, have been released in the last few days. Both models are based on Code Llama, a large language model (LLM) developed by Meta.

Wizard LM claims that WizardCoder 34B outperformed GPT-4, ChatGPT-3.5, and Claude-2 on HumanEval, a benchmark for evaluating the coding abilities of LLMs. However, Wizard LM appears to have compared WizardCoder 34B’s score with the HumanEval result for GPT-4’s March version rather than the August version, which scored 82%.

Phind also claims that its fine-tuned versions, CodeLlama-34B and CodeLlama-34B-Python, achieved pass rates of 67.6% and 69.5% on HumanEval, respectively. These numbers are close to the HumanEval score originally reported for GPT-4.
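For context, HumanEval scores like these are usually reported as pass@k: the estimated chance that at least one of k generated solutions for a problem passes its unit tests, averaged over the benchmark's 164 problems. The sketch below assumes the standard unbiased estimator from the original HumanEval paper. The function names are illustrative, and the article does not say which k or sampling setup Phind or Wizard LM used.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: number of solutions sampled for the problem
    c: number of those solutions that passed the unit tests
    k: number of attempts the metric allows
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # Probability that a random set of k samples contains no passing one,
    # subtracted from 1.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_rate(results: list[tuple[int, int]], k: int = 1) -> float:
    """Average pass@k over (n, c) pairs, one pair per benchmark problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

With one sample per problem and k = 1, this reduces to the fraction of problems solved on the first try, which is how a single headline percentage is typically read.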

The open source community is said to be obsessed with beating GPT-4, which is considered the ultimate benchmark for LLMs. Meta, for its part, is building models aimed at specific tasks and trying to surpass GPT-4 on those tasks.

The HumanEval benchmark may not be a perfect measure of the coding abilities of LLMs. It does not capture skills such as code explanation, docstring generation, code infilling, answering Stack Overflow questions, and writing tests.

OpenAI, for its part, has not released any details about the training data or evaluation metrics used for GPT-4. This has led some to speculate that OpenAI is guarding its trade secrets in order to maintain its lead in the LLM market.

The sources for this piece include an article in AnalyticsIndiaMag.
