<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Gemini on Carles Abarca</title><link>https://carlesabarca.com/tags/gemini/</link><description>Recent content in Gemini on Carles Abarca</description><generator>Hugo -- gohugo.io</generator><language>en</language><copyright>© 2026 Carles Abarca</copyright><lastBuildDate>Wed, 10 Dec 2025 00:00:00 +0000</lastBuildDate><atom:link href="https://carlesabarca.com/tags/gemini/index.xml" rel="self" type="application/rss+xml"/><item><title>Gemini vs ChatGPT: The AI Race Changes Leaders Every Quarter</title><link>https://carlesabarca.com/posts/gemini-vs-chatgpt-ai-race/</link><pubDate>Wed, 10 Dec 2025 00:00:00 +0000</pubDate><guid>https://carlesabarca.com/posts/gemini-vs-chatgpt-ai-race/</guid><description>Google overtakes ChatGPT in PhD-level benchmarks. OpenAI responds in 14 days. AI leadership now lasts cycles, not years.</description><content:encoded>&lt;p&gt;Eight days ago Sam Altman declared &amp;ldquo;Code Red&amp;rdquo; at OpenAI. Today Google has just overtaken ChatGPT in PhD-level benchmarks.&lt;/p&gt;
&lt;p&gt;This is not a definitive victory &amp;ndash; it is a change of leadership in a race that has barely begun.&lt;/p&gt;

&lt;h2 class="relative group"&gt;Gemini 3 Pro wins today in:
 &lt;div id="gemini-3-pro-wins-today-in" class="anchor"&gt;&lt;/div&gt;
 
 &lt;span
 class="absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100 select-none"&gt;
 &lt;a class="text-primary-300 dark:text-neutral-700 !no-underline" href="#gemini-3-pro-wins-today-in" aria-label="Anchor"&gt;#&lt;/a&gt;
 &lt;/span&gt;
 
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;1M-token context window (vs 400k for ChatGPT)&lt;/li&gt;
&lt;li&gt;Native integration in Google Search and Workspace&lt;/li&gt;
&lt;li&gt;Complex reasoning benchmarks&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 class="relative group"&gt;ChatGPT remains better in:
 &lt;div id="chatgpt-remains-better-in" class="anchor"&gt;&lt;/div&gt;
 
 &lt;span
 class="absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100 select-none"&gt;
 &lt;a class="text-primary-300 dark:text-neutral-700 !no-underline" href="#chatgpt-remains-better-in" aria-label="Anchor"&gt;#&lt;/a&gt;
 &lt;/span&gt;
 
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Familiar conversational interface&lt;/li&gt;
&lt;li&gt;Partner ecosystem integration&lt;/li&gt;
&lt;li&gt;Speed in iterative tasks&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;But here is the reality: in six months someone else may hold first place. In Europe, Mistral is a fast-growing contender. Amazon is investing in its own chips. Meta is keeping a suspiciously low profile.&lt;/p&gt;

&lt;h2 class="relative group"&gt;OpenAI&amp;rsquo;s Response: 14 Days
 &lt;div id="openais-response-14-days" class="anchor"&gt;&lt;/div&gt;
 
 &lt;span
 class="absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100 select-none"&gt;
 &lt;a class="text-primary-300 dark:text-neutral-700 !no-underline" href="#openais-response-14-days" aria-label="Anchor"&gt;#&lt;/a&gt;
 &lt;/span&gt;
 
&lt;/h2&gt;
&lt;p&gt;The updated scoreboard:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Gemini 3 Pro:&lt;/strong&gt; 1M context tokens, ~130 tokens/second, $2.00/1M tokens. Dominates in multimodal reasoning.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;GPT-5.2:&lt;/strong&gt; 400k context tokens, ~90 tokens/second, $1.75/1M tokens. Dominates in structured professional tasks.&lt;/p&gt;
&lt;p&gt;Current leader: &lt;strong&gt;tie&lt;/strong&gt;. It depends on which benchmark you look at.&lt;/p&gt;
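&lt;p&gt;To put the pricing in concrete terms, the per-token rates above translate directly into per-request costs. A minimal sketch in Python, assuming the prices quoted on the scoreboard and an illustrative 50k-token request size:&lt;/p&gt;

```python
# Dollar prices per 1M tokens, taken from the scoreboard above.
PRICES_PER_1M = {
    "Gemini 3 Pro": 2.00,
    "GPT-5.2": 1.75,
}

def request_cost(model, tokens):
    """Dollar cost of a request of `tokens` tokens on `model`."""
    return PRICES_PER_1M[model] * tokens / 1_000_000

# Example: a 50,000-token request (the request size is an assumption).
for model in PRICES_PER_1M:
    print(model, request_cost(model, 50_000))
```

&lt;p&gt;At these list prices the two models sit within roughly 15% of each other on cost; context length and throughput differentiate them far more than price does.&lt;/p&gt;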
&lt;p&gt;This is no longer a marathon. It is a relay sprint where the baton changes hands every quarter &amp;ndash; or even sooner.&lt;/p&gt;

&lt;h2 class="relative group"&gt;Why It Matters
 &lt;div id="why-it-matters" class="anchor"&gt;&lt;/div&gt;
 
 &lt;span
 class="absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100 select-none"&gt;
 &lt;a class="text-primary-300 dark:text-neutral-700 !no-underline" href="#why-it-matters" aria-label="Anchor"&gt;#&lt;/a&gt;
 &lt;/span&gt;
 
&lt;/h2&gt;
&lt;p&gt;The classic mistake: picking a &amp;ldquo;winner&amp;rdquo; and betting everything on it for three years. Today that is a systemic risk.&lt;/p&gt;
&lt;p&gt;The best-positioned organizations do not ask &amp;ldquo;which is the best AI?&amp;rdquo; They ask: &amp;ldquo;which is the best AI for this specific problem&amp;hellip; knowing that in 6 months it may change?&amp;rdquo;&lt;/p&gt;
&lt;p&gt;We need:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Flexible architecture &amp;ndash; do not marry a platform&lt;/li&gt;
&lt;li&gt;Teams that understand technical trade-offs, not hype&lt;/li&gt;
&lt;li&gt;Processes that adapt quickly&lt;/li&gt;
&lt;/ul&gt;</content:encoded><media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://carlesabarca.com/posts/gemini-vs-chatgpt-ai-race/featured.png"/></item><item><title>Comparing Popular AI Models: My Test Results</title><link>https://carlesabarca.com/posts/comparing-ai-models/</link><pubDate>Mon, 27 Jan 2025 00:00:00 +0000</pubDate><guid>https://carlesabarca.com/posts/comparing-ai-models/</guid><description>A personal comparison of ChatGPT 4o, ChatGPT o1, Claude 3.5, Gemini, Perplexity Pro, and DeepSeek across creative writing, image reasoning, and math.</description><content:encoded>&lt;p&gt;I recently tested several leading AI models to see how they stack up against one another. The models I compared were: &lt;strong&gt;ChatGPT 4o&lt;/strong&gt;, &lt;strong&gt;ChatGPT o1&lt;/strong&gt;, &lt;strong&gt;Claude 3.5 Sonnet&lt;/strong&gt;, &lt;strong&gt;Gemini 2.0 Flash Experimental&lt;/strong&gt;, &lt;strong&gt;Perplexity Pro&lt;/strong&gt;, and &lt;strong&gt;DeepSeek&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Using a consistent set of inputs, I evaluated their performance across a range of tasks: creative writing, image description and reasoning, and multi-step mathematical problem solving.&lt;/p&gt;
&lt;p&gt;These results are not meant to be a scientific or exhaustive comparison; they reflect my own preferences after comparing the models&amp;rsquo; answers to the exact same prompts.&lt;/p&gt;
&lt;hr&gt;

&lt;h2 class="relative group"&gt;1. Creative Writing Tasks
 &lt;div id="1-creative-writing-tasks" class="anchor"&gt;&lt;/div&gt;
 
 &lt;span
 class="absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100 select-none"&gt;
 &lt;a class="text-primary-300 dark:text-neutral-700 !no-underline" href="#1-creative-writing-tasks" aria-label="Anchor"&gt;#&lt;/a&gt;
 &lt;/span&gt;
 
&lt;/h2&gt;

&lt;h3 class="relative group"&gt;Song Lyrics: &amp;ldquo;Nostalgia for a place you&amp;rsquo;ve never visited&amp;rdquo;
 &lt;div id="song-lyrics-nostalgia-for-a-place-youve-never-visited" class="anchor"&gt;&lt;/div&gt;
 
 &lt;span
 class="absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100 select-none"&gt;
 &lt;a class="text-primary-300 dark:text-neutral-700 !no-underline" href="#song-lyrics-nostalgia-for-a-place-youve-never-visited" aria-label="Anchor"&gt;#&lt;/a&gt;
 &lt;/span&gt;
 
&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;ChatGPT 4o&lt;/strong&gt; delivered evocative lyrics with dusty streets, twilight breezes, and photographs &amp;ndash; a strong emotional arc. &lt;strong&gt;ChatGPT o1&lt;/strong&gt; (&amp;ldquo;Faraway Memories&amp;rdquo;) chose salt, distant shores, and cobbled roads &amp;ndash; warm and melodic. &lt;strong&gt;Claude 3.5&lt;/strong&gt; went minimalist with painted scenes in travel books and cherry blossoms &amp;ndash; clean and visual. &lt;strong&gt;Gemini&lt;/strong&gt; offered sun-bleached postcards and whispering trees &amp;ndash; atmospheric. &lt;strong&gt;Perplexity&lt;/strong&gt; (&amp;ldquo;Echoes of Elsewhere&amp;rdquo;) wrote cobblestone streets and ancient bells &amp;ndash; effective. &lt;strong&gt;DeepSeek&lt;/strong&gt; (&amp;ldquo;Ghosts of Nowhere&amp;rdquo;) stood out with amber streetlamp glow, a door never turned, and whispers clinging to cobblestones &amp;ndash; the most poetic of the group.&lt;/p&gt;

&lt;h3 class="relative group"&gt;Short Story: &amp;ldquo;A memory from childhood&amp;rdquo;
 &lt;div id="short-story-a-memory-from-childhood" class="anchor"&gt;&lt;/div&gt;
 
 &lt;span
 class="absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100 select-none"&gt;
 &lt;a class="text-primary-300 dark:text-neutral-700 !no-underline" href="#short-story-a-memory-from-childhood" aria-label="Anchor"&gt;#&lt;/a&gt;
 &lt;/span&gt;
 
&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;ChatGPT 4o&lt;/strong&gt; placed us barefoot under a mango tree with sticky fruit juice &amp;ndash; vivid sensory detail. &lt;strong&gt;ChatGPT o1&lt;/strong&gt; described a cracked concrete porch with faded green cushions &amp;ndash; intimate and grounded. &lt;strong&gt;Claude 3.5&lt;/strong&gt; took us to a grandmother&amp;rsquo;s backyard with a sprawling fig tree fortress &amp;ndash; deeply nostalgic. &lt;strong&gt;Gemini&lt;/strong&gt; evoked damp earth and Mrs. Gable&amp;rsquo;s garden &amp;ndash; warm neighborhood storytelling. &lt;strong&gt;Perplexity&lt;/strong&gt; offered a tire swing and ancient oak &amp;ndash; classic Americana. &lt;strong&gt;DeepSeek&lt;/strong&gt; described golden light, barefoot in grass, chasing fireflies &amp;ndash; romantic and warm.&lt;/p&gt;
&lt;hr&gt;

&lt;h2 class="relative group"&gt;2. Image Description and Reasoning
 &lt;div id="2-image-description-and-reasoning" class="anchor"&gt;&lt;/div&gt;
 
 &lt;span
 class="absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100 select-none"&gt;
 &lt;a class="text-primary-300 dark:text-neutral-700 !no-underline" href="#2-image-description-and-reasoning" aria-label="Anchor"&gt;#&lt;/a&gt;
 &lt;/span&gt;
 
&lt;/h2&gt;
&lt;p&gt;I uploaded an image of an espresso in a white paper cup on a wooden surface.&lt;/p&gt;

&lt;h3 class="relative group"&gt;Basic Description
 &lt;div id="basic-description" class="anchor"&gt;&lt;/div&gt;
 
 &lt;span
 class="absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100 select-none"&gt;
 &lt;a class="text-primary-300 dark:text-neutral-700 !no-underline" href="#basic-description" aria-label="Anchor"&gt;#&lt;/a&gt;
 &lt;/span&gt;
 
&lt;/h3&gt;
&lt;p&gt;All models correctly identified a white disposable paper cup containing espresso on a polished wooden surface. The models varied in detail: &lt;strong&gt;ChatGPT 4o&lt;/strong&gt; noted matte finish and vertical seams. &lt;strong&gt;Claude&lt;/strong&gt; specifically identified the tapered shape typical of paper cups. &lt;strong&gt;Gemini&lt;/strong&gt; organized its response into subject matter and visual details. &lt;strong&gt;Perplexity&lt;/strong&gt; noted the golden-brown crema layer.&lt;/p&gt;

&lt;h3 class="relative group"&gt;Deductive Reasoning
 &lt;div id="deductive-reasoning" class="anchor"&gt;&lt;/div&gt;
 
 &lt;span
 class="absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100 select-none"&gt;
 &lt;a class="text-primary-300 dark:text-neutral-700 !no-underline" href="#deductive-reasoning" aria-label="Anchor"&gt;#&lt;/a&gt;
 &lt;/span&gt;
 
&lt;/h3&gt;
&lt;p&gt;When asked what could be deduced about the environment, time of day, or possible events:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;ChatGPT 4o&lt;/strong&gt; sketched a likely indoor office environment with artificial lighting, suggesting a morning or early afternoon coffee break &amp;ndash; complete and imaginative. &lt;strong&gt;ChatGPT o1&lt;/strong&gt; was more cautious, admitting uncertainty while leaning toward morning. &lt;strong&gt;Claude&lt;/strong&gt; indicated a cafe-style setting with medium natural light &amp;ndash; creative but slightly speculative. &lt;strong&gt;Gemini&lt;/strong&gt; appropriately highlighted the challenge in determining precise time of day. &lt;strong&gt;Perplexity&lt;/strong&gt; creatively placed the scene at &amp;ldquo;Tuesday morning at 9 AM&amp;rdquo; &amp;ndash; inventive but unsupported. &lt;strong&gt;DeepSeek&lt;/strong&gt; did not support this task.&lt;/p&gt;
&lt;hr&gt;

&lt;h2 class="relative group"&gt;3. Multi-Step Mathematical Problem Solving
 &lt;div id="3-multi-step-mathematical-problem-solving" class="anchor"&gt;&lt;/div&gt;
 
 &lt;span
 class="absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100 select-none"&gt;
 &lt;a class="text-primary-300 dark:text-neutral-700 !no-underline" href="#3-multi-step-mathematical-problem-solving" aria-label="Anchor"&gt;#&lt;/a&gt;
 &lt;/span&gt;
 
&lt;/h2&gt;

&lt;h3 class="relative group"&gt;First Problem
 &lt;div id="first-problem" class="anchor"&gt;&lt;/div&gt;
 
 &lt;span
 class="absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100 select-none"&gt;
 &lt;a class="text-primary-300 dark:text-neutral-700 !no-underline" href="#first-problem" aria-label="Anchor"&gt;#&lt;/a&gt;
 &lt;/span&gt;
 
&lt;/h3&gt;
&lt;p&gt;&lt;em&gt;&amp;ldquo;A rectangular garden is 10 meters long and 5 meters wide. Calculate the area, then find the cost of fencing it if fencing costs $5 per meter.&amp;rdquo;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The right answer: an area of 50 square meters and a fencing cost of $150 (the fence follows the perimeter, 2 &amp;times; (10 + 5) = 30 meters, at $5 per meter). All models answered correctly with 2-3 step breakdowns. &lt;strong&gt;Perplexity&lt;/strong&gt; was most concise, with just two steps and detailed formulas.&lt;/p&gt;
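&lt;p&gt;The arithmetic is easy to verify in a few lines of Python (a minimal sketch; the dimensions and price come straight from the problem statement):&lt;/p&gt;

```python
# Garden from the problem: 10 m long, 5 m wide, fencing at $5 per meter.
length, width = 10, 5
price_per_meter = 5

area = length * width                       # 50 square meters
perimeter = 2 * (length + width)            # the fence runs along the perimeter: 30 m
fencing_cost = perimeter * price_per_meter  # $150

print(area, fencing_cost)  # 50 150
```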

&lt;h3 class="relative group"&gt;Second Problem
 &lt;div id="second-problem" class="anchor"&gt;&lt;/div&gt;
 
 &lt;span
 class="absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100 select-none"&gt;
 &lt;a class="text-primary-300 dark:text-neutral-700 !no-underline" href="#second-problem" aria-label="Anchor"&gt;#&lt;/a&gt;
 &lt;/span&gt;
 
&lt;/h3&gt;
&lt;p&gt;&lt;em&gt;&amp;ldquo;If half of the garden&amp;rsquo;s area is for vegetables and the other half for flowers, and you need 4 flowers per square meter, how many flower plants do you need? Also, if a sprinkler covers 2 square meters, how many sprinklers for the entire garden?&amp;rdquo;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The right answer: 100 flower plants (half of the 50 square meters is 25, times 4 plants per square meter) and 25 sprinklers (50 &amp;divide; 2). All models answered correctly. &lt;strong&gt;ChatGPT o1&lt;/strong&gt; added a preliminary step recalculating the garden area. &lt;strong&gt;Perplexity&lt;/strong&gt; was again most concise.&lt;/p&gt;
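&lt;p&gt;The follow-up arithmetic can be checked the same way (a small sketch, reusing the 50-square-meter area from the first problem):&lt;/p&gt;

```python
# Follow-up: half the garden is flowers at 4 plants per square meter;
# each sprinkler covers 2 square meters of the entire garden.
area = 50                    # square meters, from the first problem
flower_area = area // 2      # 25 square meters for flowers
flowers = flower_area * 4    # 100 plants
sprinklers = area // 2       # 25 sprinklers

print(flowers, sprinklers)  # 100 25
```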
&lt;hr&gt;

&lt;h2 class="relative group"&gt;Conclusions
 &lt;div id="conclusions" class="anchor"&gt;&lt;/div&gt;
 
 &lt;span
 class="absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100 select-none"&gt;
 &lt;a class="text-primary-300 dark:text-neutral-700 !no-underline" href="#conclusions" aria-label="Anchor"&gt;#&lt;/a&gt;
 &lt;/span&gt;
 
&lt;/h2&gt;
&lt;p&gt;There is no single &amp;ldquo;best&amp;rdquo; model &amp;ndash; it depends on what you need:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;For creative writing&lt;/strong&gt;, DeepSeek and Claude impressed with their poetic and literary qualities&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;For image reasoning&lt;/strong&gt;, ChatGPT 4o offered the most complete and imaginative analysis&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;For mathematical problem solving&lt;/strong&gt;, all models performed well, with Perplexity standing out for conciseness&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;For cautious, accurate responses&lt;/strong&gt;, ChatGPT o1 consistently avoided overreach&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The AI landscape is evolving so rapidly that these results represent a snapshot in time. In six months, the rankings may look entirely different.&lt;/p&gt;</content:encoded><media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://carlesabarca.com/posts/comparing-ai-models/featured.png"/></item></channel></rss>