Recent advances in large language models (LLMs) have opened new avenues for accelerating scientific research. While models are increasingly capable of assisting with routine tasks, their ability to contribute to novel, expert-level mathematical discovery is less understood. We present a collection of case studies demonstrating how researchers have successfully collaborated with advanced AI models, specifically Google's Gemini-based models (in particular Gemini Deep Think and its advanced variants), to solve open problems, refute conjectures, and generate new proofs across diverse areas in theoretical computer science, as well as other areas such as economics, optimization, and physics. Based on these experiences, we extract common techniques for effective human-AI collaboration in theoretical research, such as iterative refinement, problem decomposition, and cross-disciplinary knowledge transfer. While the majority of our results stem from this interactive, conversational methodology, we also highlight specific instances that push beyond standard chat interfaces. These include deploying the model as a rigorous adversarial reviewer to detect subtle flaws in existing proofs, and embedding it within a "neuro-symbolic" loop that autonomously writes and executes code to verify complex derivations. Together, these examples highlight the potential of AI not just as a tool for automation, but as a versatile, genuine partner in the creative process of scientific discovery.
Collaborating with experts on 18 research problems, an advanced version of Gemini Deep Think helped resolve long-standing bottlenecks across algorithms, ML and combinatorial optimization, information theory, and economics. Highlights from our “Accelerating Research with Gemini” paper (with corresponding section numbers given in the paper) include:
We were able to demonstrate a “Top-5” LongMemEval result with minimal modifications to dspy.RLM: just a few helper functions to process the “multi-chat” sessions.
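The post does not show what those helper functions look like. As a rough illustration only, the sketch below flattens a list of chat sessions into a single tagged transcript; the function name, data shapes, and tagging scheme are hypothetical and are not part of dspy's actual API.

```python
def flatten_sessions(sessions):
    """Flatten a list of chat sessions into one transcript string.

    Each session is a list of {'role': ..., 'content': ...} dicts.
    Each turn is tagged with its session index so a model reading the
    transcript can tell the separate conversations apart.
    """
    lines = []
    for i, session in enumerate(sessions):
        for turn in session:
            lines.append(f"[session {i}] {turn['role']}: {turn['content']}")
    return "\n".join(lines)
```

A preprocessing step along these lines would let a single-context method consume LongMemEval's multi-session histories without changing the underlying module.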
Humans always remain in the loop, but work at a different layer of abstraction than we used to. We prioritize work, translate user feedback into acceptance criteria, and validate outcomes. When the agent struggles, we treat it as a signal: identify what is missing—tools, guardrails, documentation—and feed it back into the repository, always by having Codex itself write the fix.
Our most difficult challenges now center on designing environments, feedback loops, and control systems that help agents accomplish our goal: build and maintain complex, reliable software at scale.
The engineering team used Codex to optimize and adapt the harness for GPT‑5.3-Codex. When we started seeing strange edge cases impacting users, team members used Codex to identify context-rendering bugs and to root-cause low cache hit rates. GPT‑5.3-Codex is continuing to help the team throughout the launch by dynamically scaling GPU clusters to adjust to traffic surges and keeping latency stable.
A 61-year-old Tennessee man is finally free after spending a shocking 37 days in jail — all for posting a meme.
Of those, GVA said there were five confirmed transgender shooters, or fewer than a tenth of one per cent. (There have also been four cases of mass shootings by females in the U.S. since 1982.)
Across frontier models, gpt-5.3-codex achieves the best overall performance (solving 19/22 tasks, 86.4%), outperforming claude-opus-4.6 (15/22, 68.2%), while kimi-2.5 exhibits the strongest performance among open-source models.
The firm is also whitelisting a handful of market makers, including longtime crypto liquidity provider Wintermute, to facilitate trading. Meanwhile, access to BUIDL is restricted to qualified purchasers, a legal designation for those with assets of $5 million or more.
For years, Trump has claimed he had “no idea” about Epstein’s abuse of underage girls. Yet records show that in 2006, he privately told Palm Beach police that “everyone” knew about Epstein’s activities and described Ghislaine Maxwell as evil.
Trump’s call to Palm Beach police chief
According to an FBI interview conducted in October 2019 with former Palm Beach Police Chief Michael Reiter, Trump personally called him in July 2006, just as Epstein’s criminal sex charges became public. Reiter told agents that Trump said, “Thank goodness you’re stopping him, everyone has known he’s been doing this.”
Observational Memory achieves the highest score ever recorded on LongMemEval — 94.87% with gpt-5-mini — while maintaining a completely stable, cacheable context window. It beats the oracle, outperforms complex multi-step reranking systems with a single pass, and scales better with model quality than existing approaches.
"I mean, there's tons of redacted stuff. ... And [Trump's] name, I think I put his name, and it appears more than a million times. So it's all over the place."
The bottom line: "To me, this whole rollout of saying that members can come from nine to five to sit at those four computers, is just part of the coverup," Raskin asserted.
The 3 million documents that the administration has not publicly released "are the ones I'd like to see," he said.
"The administration says that these are duplicative. Well go ahead and release them then! If they're duplicative, what's the problem? We'll be the judge of that." "Epstein's lawyers synopsized and quoted Trump as saying that Jeffrey Epstein was not a member of his club at Mar-a-Lago, but he was a guest at Mar-a-Lago, and he had never been asked to leave," Raskin said. "That was redacted for some indeterminate, inscrutable reason."
Among participants who use AI, we find a stark divide in skill-formation outcomes between high-scoring interaction patterns (65%-86% quiz score) and low-scoring interaction patterns (24%-39% quiz score). High scorers either asked the AI only conceptual questions rather than requesting code generation, or asked for explanations to accompany the generated code; these usage patterns demonstrate a high level of cognitive engagement.
We develop a model of political cycles driven by time-varying risk aversion. Agents choose to work in the public or private sector and to vote Democratic or Republican. In equilibrium, when risk aversion is high, agents elect Democrats—the party promising more redistribution. The model predicts higher average stock market returns under Democratic presidencies, explaining the well-known “presidential puzzle.” The model can also explain why economic growth has been faster under Democratic presidencies. In the data, Democratic voters are more risk averse, and risk aversion declines during Democratic presidencies. Public workers vote Democratic, while entrepreneurs vote Republican, as the model predicts.
We may be on the descending portion of a productivity J-curve. As Brynjolfsson, Rock, and Syverson illustrate, when firms adopt transformative general-purpose technologies, measured productivity often initially falls because resources are diverted to investment, reorganization, and learning that do not show up as measured output.
The task-completion time horizon is the task duration (measured by expert human completion time) at which an AI agent is predicted to succeed with a given level of reliability.
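One common way to estimate such a horizon is to fit a logistic curve of success probability against log task duration and solve for the duration where predicted success equals the target reliability. The sketch below is a minimal illustration of that idea on toy data; it is not the original authors' code, and the fitting details (plain gradient descent, log2 duration scale) are assumptions for the example.

```python
import math

def fit_horizon(durations_min, successes, target=0.5, steps=5000, lr=0.1):
    """Estimate the time horizon at reliability `target`.

    Fits p(success) = sigmoid(a + b * log2(duration)) to observed
    (duration, success) pairs via gradient descent on the logistic
    loss, then solves sigmoid(a + b*x) = target for the duration.
    """
    xs = [math.log2(d) for d in durations_min]
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        ga = gb = 0.0
        for x, y in zip(xs, successes):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            ga += (p - y) / n          # gradient w.r.t. intercept
            gb += (p - y) * x / n      # gradient w.r.t. slope
        a -= lr * ga
        b -= lr * gb
    # sigmoid(a + b*x) = target  =>  x = (logit(target) - a) / b
    logit = math.log(target / (1.0 - target))
    return 2 ** ((logit - a) / b)
```

With successes on short tasks and failures on long ones, the fitted slope is negative, so raising the reliability target (say from 50% to 80%) shortens the estimated horizon, matching the definition above.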
It will automatically set all users’ accounts to a “teen-appropriate” experience unless they demonstrate that they’re adults.