Google launched its latest artificial intelligence (AI) model, Gemini, on Dec. 6, billing it as the most advanced AI model currently on the market and claiming it surpasses OpenAI’s GPT-4.
Gemini is multimodal, meaning it was built to understand and combine different types of information, such as text, code, audio, images and video. It comes in three versions (Ultra, Pro and Nano) to serve different use cases, and one area where it appears to beat GPT-4 is advanced math and specialized coding.
At its debut, Google released multiple benchmark tests comparing Gemini with GPT-4. The Gemini Ultra version achieved “state-of-the-art performance” in 30 of the 32 academic benchmarks widely used in large language model (LLM) research and development.
However, those benchmarks are precisely where critics across the internet have been poking at Gemini, questioning both the testing methods behind the claims of its superiority and Google’s marketing of the product.
“Misleading” Gemini promotion
One user on the social media platform X, who works in machine learning development, questioned whether Gemini’s claimed superiority over GPT-4 holds up.
He pointed out that Google may be hyping up Gemini or “cherry-picking” examples of its superiority. Still, he concluded, “my bet is that Gemini is very competitive and will give GPT-4 a run for its money” and that competition in the space is good.
However, shortly afterward, he made a second post saying Google should be “embarrassed” by the “misleading” promotional video it created for Gemini’s release.
In response to his post, other X users said they felt deceived by Google’s portrayal of Gemini. One user said claims that Gemini would end the era of GPT-4 are “canceled.”
Another user, a computer scientist, agreed and called Google’s portrayal of Gemini’s superiority “disingenuous.”
Botching benchmarks
Users pointed out that Google’s benchmarks were run against an outdated version of GPT-4 rather than its current capabilities, making the comparisons moot.
Another area of concern for social media sleuths was the evaluation setup Google used to compare Gemini with GPT-4. The prompting methods applied to the two models were not identical: Gemini Ultra’s headline MMLU score, for instance, was reportedly produced with chain-of-thought prompting over 32 samples, while the GPT-4 figure was a simpler five-shot result, a mismatch that could have major implications for the outcomes.
Critics also pointed out that the results were achieved using tests of a model that “isn’t publicly available” at the moment. Another user noted that the scores could look different if Gemini’s most advanced model were tested against GPT-4’s most advanced version, known as “Turbo.”
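To see what an apples-to-apples comparison looks like in practice, the sketch below sends one identical prompt, with no model-specific prompting tricks, to both systems through their public Python SDKs. It is illustrative only: it assumes the reader has API keys for both services and uses the publicly accessible “gpt-4” and “gemini-pro” tiers, since Gemini Ultra was not available for outside testing at the time.

```python
# A minimal apples-to-apples check: one identical prompt to both models.
# Assumes OPENAI_API_KEY and GOOGLE_API_KEY are set in the environment,
# and uses the publicly available model tiers (not the unreleased Ultra).
import os

import google.generativeai as genai
from openai import OpenAI

PROMPT = "A train leaves at 9:40 and arrives at 13:25. How long is the trip?"

# GPT-4 via the OpenAI SDK.
openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment
gpt4_reply = openai_client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": PROMPT}],
)
print("GPT-4:", gpt4_reply.choices[0].message.content)

# Gemini Pro via the google-generativeai SDK.
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
gemini = genai.GenerativeModel("gemini-pro")
gemini_reply = gemini.generate_content(PROMPT)
print("Gemini:", gemini_reply.text)
```

Holding the prompt and sampling setup fixed in this way is the baseline that critics argued Google’s published comparison lacked.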
Related: Elon Musk’s xAI files with SEC for private sale of $1B in unregistered securities
To the test
Other social media users have dismissed the benchmarks published by Google altogether, instead describing their own experiences with Gemini compared to GPT-4.
Anne Moss, who works in web publishing services and claims to be a regular user of AI, particularly GPT-4, said she used Gemini through Google’s Bard tool and felt “underwhelmed by the experience.”
She concluded that she would stick with GPT-4 for now, explaining that among the differences she noticed were Gemini/Bard refusing to answer political questions and “lying” about knowing personal information.
Another user, who works in app development, posted screenshots in which he gave both models the same prompt asking them to generate code based on a photo. He pointed out Gemini/Bard’s underwhelming response compared with GPT-4’s.
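An image-to-code test like his can be reproduced by giving both vision-capable endpoints the exact same prompt and image. The sketch below is a rough illustration, not a reconstruction of his screenshots: “mockup.png” is a hypothetical UI screenshot, and “gpt-4-vision-preview” and “gemini-pro-vision” are the vision model tiers that were publicly exposed at the time.

```python
# Send one identical image-plus-text prompt to both vision models.
# "mockup.png" is a hypothetical screenshot of a UI to turn into code.
import base64
import os

import google.generativeai as genai
import PIL.Image
from openai import OpenAI

PROMPT = "Write the HTML and CSS for the user interface shown in this image."

# GPT-4 with vision: the image travels as a base64 data URL.
with open("mockup.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

openai_client = OpenAI()
gpt4_reply = openai_client.chat.completions.create(
    model="gpt-4-vision-preview",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": PROMPT},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
        ],
    }],
    max_tokens=1024,
)
print("GPT-4:", gpt4_reply.choices[0].message.content)

# Gemini Pro Vision: the SDK accepts a PIL image directly.
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
gemini = genai.GenerativeModel("gemini-pro-vision")
gemini_reply = gemini.generate_content([PROMPT, PIL.Image.open("mockup.png")])
print("Gemini:", gemini_reply.text)
```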
According to Google, Gemini will roll out more broadly to the public in early 2024 and will be integrated into Google’s suite of apps and services.
Magazine: Real AI use cases in crypto: Crypto-based AI markets, and AI financial analysis