Froth on the Daydream (FOD) – our weekly summary of over 150 AI newsletters. We connect the dots and cut through the froth, bringing you a comprehensive picture of the ever-evolving AI landscape. Stay tuned for clarity amidst surrealism and experimentation.
The recent surge in the sophistication of Large Language Models (LLMs) — both proprietary and open-source — presents a paradox of potential and perplexity. These systems, characterized by their remarkable natural language processing capabilities, have propelled us into a new era of technological marvels. Yet, they also bring many challenges, chiefly in the realm of trustworthiness.
Last week’s paper on the trustworthiness of LLMs takes this challenge head-on. It poses a pointed question: “To what extent can we genuinely trust LLMs?”

Our short answer: we can’t.
A better stance is to adopt the principle of ‘trust, but verify.’ This approach, reminiscent of Cold War-era diplomacy, is increasingly relevant in the digital age, especially with advancements in AI. It suggests a balanced strategy: embracing the utility and potential of these models while stringently scrutinizing their mechanisms and outcomes.
When working with LLMs, you can trust your own expertise to verify the work the model automates or accelerates for you, but you can’t extend genuine trust to the model itself. I even think that, alongside the new role of the AI engineer, we should have a new job position: an in-house AI Verifier, akin to a fact-checker at a media publication.
The other news from last week ‘complements’ the paper’s insights. Anthropic’s research reveals a startling facet of deceptive ‘sleeper agents’: models can be trained with hidden backdoor behaviors that persist even after standard safety training, including supervised fine-tuning and RLHF.
Meanwhile, there was a nuanced shift in OpenAI’s policy: the company discreetly removed the blanket prohibition on ‘military and warfare’ uses from its usage guidelines. The terms under which we extend trust to these systems, it turns out, can change without announcement.
On a more commercial and, so to speak, physical note, the launch of the Rabbit R1 stole much of the spotlight: a handheld AI device built around what its makers call a Large Action Model (LAM)*, designed to operate apps on the user’s behalf.
*Many publications mistakenly attribute the coining of LAM to the Rabbit R1 team, when, in fact, it was Salesforce Chief Scientist Silvio Savarese who coined it in June 2023 in his blog post “Towards Actionable Generative AI”. Trust, but verify ;)
Adding to the global perspective: in its “Global Risks Report 2024,” the World Economic Forum identifies AI-driven misinformation and disinformation as the top short-term global risk, a reminder that the trust problem extends well beyond any single model or vendor.
As we navigate this era of groundbreaking AI advancements, the “trust, but verify” principle remains a beacon. We need to balance the excitement of AI’s potential with rigorous, ongoing scrutiny of its trustworthiness, safety, and ethical implications.
MoE-Mamba: Efficient Selective State Space Models with MoE. Researchers from the University of Warsaw developed MoE-Mamba, integrating Mamba, a State Space Model (SSM), with a Mixture of Experts (MoE) layer. This model outperforms both Mamba and Transformer-MoE in efficiency and performance, achieving equivalent results to Mamba with fewer training steps.
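For intuition, here is a minimal sketch of the alternating layout the paper describes: a sequence-mixing block followed by a sparse MoE feed-forward block, each with a residual connection. The `SequenceMixer` below is a simple gated-convolution stand-in for the real Mamba SSM, and the top-1 router is our illustrative assumption, not the paper’s exact design.

```python
# Minimal sketch of the MoE-Mamba layout: sequence mixing, then sparse MoE.
# SequenceMixer is a placeholder for a real Mamba block.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SequenceMixer(nn.Module):
    """Stand-in for a Mamba block: causal depthwise conv + gating."""
    def __init__(self, d_model):
        super().__init__()
        self.conv = nn.Conv1d(d_model, d_model, kernel_size=3,
                              padding=2, groups=d_model)  # causal padding
        self.gate = nn.Linear(d_model, d_model)

    def forward(self, x):                       # x: (batch, seq, d_model)
        h = self.conv(x.transpose(1, 2))[..., : x.size(1)].transpose(1, 2)
        return h * torch.sigmoid(self.gate(x))  # gated branch

class MoEFeedForward(nn.Module):
    """Top-1 (switch-style) mixture of expert MLPs."""
    def __init__(self, d_model, n_experts=8, d_ff=256):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts))

    def forward(self, x):
        probs = F.softmax(self.router(x), dim=-1)   # (B, S, E)
        top_p, top_i = probs.max(dim=-1)            # top-1 routing
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = top_i == e                       # tokens routed to e
            if mask.any():
                out[mask] = expert(x[mask]) * top_p[mask].unsqueeze(-1)
        return out

class MoEMambaBlock(nn.Module):
    """One layer: sequence mixing, then sparse MoE, both residual."""
    def __init__(self, d_model):
        super().__init__()
        self.mix, self.moe = SequenceMixer(d_model), MoEFeedForward(d_model)
        self.norm1, self.norm2 = nn.LayerNorm(d_model), nn.LayerNorm(d_model)

    def forward(self, x):
        x = x + self.mix(self.norm1(x))
        return x + self.moe(self.norm2(x))

x = torch.randn(2, 16, 64)
print(MoEMambaBlock(64)(x).shape)  # torch.Size([2, 16, 64])
```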
Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM. Researchers from the University of Cambridge and University College London introduced “Blending,” a method combining smaller AI models to match or exceed the performance of larger models like ChatGPT.
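The mechanism, as described, is strikingly simple: at each conversation turn, one of the component chat models is sampled at random to produce the reply, with all models conditioning on the shared history. Here is a toy sketch; `make_echo_model` is a hypothetical stand-in for a real small chat model.

```python
# Toy sketch of "Blending": sample one small model per turn.
import random

def make_echo_model(name):
    """Hypothetical stand-in for a small chat model's generate call."""
    def generate(history):
        return f"[{name}] reply to: {history[-1]['content']}"
    return generate

models = [make_echo_model(f"small-model-{i}") for i in range(3)]

def blended_reply(history, models, rng=random):
    model = rng.choice(models)      # uniform sampling over component models
    return model(history)           # all models see the shared history

history = [{"role": "user", "content": "Hi there!"}]
for _ in range(3):
    reply = blended_reply(history, models)
    history.append({"role": "assistant", "content": reply})
    history.append({"role": "user", "content": "Tell me more."})
    print(reply)
```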
Soaring from 4K to 400K: Extending LLM’s Context with Activation Beacon. Researchers from the Beijing Academy of AI and Gaoling School of AI developed Activation Beacon, a plug-in module that condenses an LLM’s activations so that far longer inputs fit into its native window, extending usable context length from 4K to 400K tokens.
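Here is a rough, model-free sketch of the condensing idea as we understand it: distant context is compressed into a small number of “beacon” activations while recent tokens keep their full states. Mean pooling stands in for the paper’s learned condensing attention, so treat this as an assumption-laden illustration.

```python
# Rough sketch: condense distant activations, keep a recent raw window.
# Mean pooling is a stand-in for Activation Beacon's learned condensing.
import torch

def condense_context(activations, chunk=64, ratio=8, window=128):
    """activations: (seq, d_model) hidden states of a long context."""
    past, recent = activations[:-window], activations[-window:]
    beacons = []
    for start in range(0, past.size(0), chunk):
        block = past[start:start + chunk]
        k = max(1, block.size(0) // ratio)     # condensed length per chunk
        # pool each chunk down to k "beacon" activations
        beacons.append(torch.stack([c.mean(0) for c in block.chunk(k, dim=0)]))
    return torch.cat(beacons + [recent], dim=0)

x = torch.randn(4096, 32)
print(condense_context(x).shape)   # torch.Size([624, 32]): ~6.5x fewer states
```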
CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution. Researchers from MIT CSAIL and Meta AI developed CRUXEval, a benchmark comprising 800 Python functions for evaluating code models’ reasoning and execution skills.
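To make the task concrete, here is a hand-made example in the spirit of a CRUXEval item (not taken from the benchmark itself). CRUXEval-O asks a model to predict a function’s output from its input; CRUXEval-I asks for an input that produces a given output.

```python
# A CRUXEval-style task: reason about what a short Python function does.
def f(s):
    return s.replace("a", "").replace("b", "")[::-1]

# Output prediction (CRUXEval-O style): what does f("abcde") return?
assert f("abcde") == "edc"   # "abcde" -> "cde" -> reversed -> "edc"

# Input prediction (CRUXEval-I style): find an x with f(x) == "gf".
assert f("fag") == "gf"      # "fag" -> "fg" -> reversed -> "gf"
```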
Narrowing the Knowledge Evaluation Gap: Open-Domain Question Answering with Multi-Granularity Answers. Researchers from Google Research and Tel Aviv University introduced GRANOLA QA, an evaluation setting for open-domain question answering (QA) that scores answers at multiple levels of granularity.
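A sketch of how multi-granularity scoring might look: each question carries gold answers ordered from most specific to most coarse, and a prediction is credited by the finest level it matches. The string-containment check and the credit weights below are our illustrative assumptions, not the paper’s exact metric.

```python
# Illustrative multi-granularity QA scoring in the spirit of GRANOLA.
def granola_score(prediction, gold_levels):
    """gold_levels: answers ordered fine -> coarse."""
    pred = prediction.lower()
    for level, gold in enumerate(gold_levels):
        if gold.lower() in pred:
            # full credit at the finest level, less for coarser matches
            return 1.0 - level / len(gold_levels)
    return 0.0

levels = ["August 4, 1961", "1961", "the early 1960s"]
print(granola_score("He was born on August 4, 1961.", levels))  # 1.0
print(granola_score("Sometime in 1961.", levels))               # ~0.67
```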
TOFU: A Task of Fictitious Unlearning for LLMs. Researchers from Carnegie Mellon University introduced TOFU, a benchmark for evaluating unlearning in LLMs using synthetic author profiles.
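The evaluation logic is easy to sketch: after unlearning, the model should fail on questions about the “forget” authors while staying accurate on the “retain” set. `answers_correctly` below is a hypothetical oracle standing in for a real QA check against the model.

```python
# Schematic TOFU-style unlearning check: forget accuracy should drop,
# retain accuracy should hold.
def unlearning_report(model, forget_qa, retain_qa, answers_correctly):
    forget_acc = sum(answers_correctly(model, q, a)
                     for q, a in forget_qa) / len(forget_qa)
    retain_acc = sum(answers_correctly(model, q, a)
                     for q, a in retain_qa) / len(retain_qa)
    # Ideal unlearning: forget_acc near chance, retain_acc unchanged.
    return {"forget_acc": forget_acc, "retain_acc": retain_acc}

# Toy usage with a fake "model" that has forgotten one author:
fake_model = {"known": {"Who wrote X?": "Author B"}}
check = lambda m, q, a: m["known"].get(q) == a
print(unlearning_report(
    fake_model,
    forget_qa=[("Who wrote Y?", "Author A")],   # unlearned: should miss
    retain_qa=[("Who wrote X?", "Author B")],   # retained: should hit
    answers_correctly=check))
# {'forget_acc': 0.0, 'retain_acc': 1.0}
```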
Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in LLMs. Researchers from OpenNLPLab developed Lightning Attention-2, an advanced linear attention mechanism for LLMs that efficiently handles unlimited sequence lengths without increased memory usage or decreased speed.
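The core trick is a divide-and-conquer computation of causal linear attention: each tile is computed conventionally (intra-block), while everything before the tile is carried as a running d-by-d key-value state (inter-block), so memory stays constant in sequence length. Here is a NumPy sketch of that decomposition, minus the paper’s kernel-level optimizations; it uses unnormalized linear attention for clarity.

```python
# Blockwise causal linear attention: intra-block tile + inter-block kv state.
import numpy as np

def blockwise_linear_attention(Q, K, V, block=64):
    seq, d = Q.shape
    out = np.zeros_like(V)
    kv_state = np.zeros((d, V.shape[1]))        # accumulated K^T V so far
    for s in range(0, seq, block):
        q, k, v = Q[s:s+block], K[s:s+block], V[s:s+block]
        intra = np.tril(q @ k.T) @ v            # causal attention inside tile
        inter = q @ kv_state                    # contribution of past tiles
        out[s:s+block] = intra + inter
        kv_state += k.T @ v                     # roll the state forward
    return out

Q, K = np.random.randn(256, 16), np.random.randn(256, 16)
V = np.random.randn(256, 32)
ref = np.tril(Q @ K.T) @ V                      # O(seq^2) reference
print(np.allclose(blockwise_linear_attention(Q, K, V), ref))  # True
```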
Transformers are Multi-State RNNs. Researchers from The Hebrew University of Jerusalem and FAIR AI at Meta redefined decoder-only transformers as a variant of Recurrent Neural Networks (RNNs) called infinite Multi-State RNNs (MSRNNs).
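The view is natural once you treat the KV cache as RNN state, one (key, value) pair per past token; capping the number of cached pairs turns the model into a finite MSRNN. In the sketch below, the eviction rule drops the least-attended state, loosely in the spirit of the paper’s TOVA policy; the exact policy details are our assumption.

```python
# Decoder as a finite multi-state RNN: bounded KV cache with eviction.
import numpy as np

def msrnn_step(query, keys, values, max_states=4):
    """One decoding step over a bounded multi-state cache."""
    scores = keys @ query                       # (n_states,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    output = weights @ values                   # attention readout
    if len(keys) > max_states:                  # evict least-attended state
        drop = int(np.argmin(weights))
        keys = np.delete(keys, drop, axis=0)
        values = np.delete(values, drop, axis=0)
    return output, keys, values

d = 8
keys, values = np.random.randn(1, d), np.random.randn(1, d)
for _ in range(10):                             # simulate 10 decode steps
    q, k_new, v_new = (np.random.randn(d) for _ in range(3))
    keys = np.vstack([keys, k_new]); values = np.vstack([values, v_new])
    out, keys, values = msrnn_step(q, keys, values)
print(keys.shape)   # (4, 8): the state stays bounded at max_states
```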
Patchscopes: A Unifying Framework for Inspecting Hidden Representations of Language Models. Researchers from Google Research and Tel Aviv University introduced “Patchscopes,” a new framework for analyzing hidden representations in LLMs by letting the model itself decode them into natural language.
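The recipe, roughly: run the model on a source prompt, grab the hidden state of a token at some layer, patch it into a chosen position of a separate inspection prompt, and read off what the model decodes there. Below is a toy, model-free sketch of that patching flow; the tanh “layers” stand in for a real transformer stack.

```python
# Toy sketch of the Patchscopes flow: extract a hidden state, patch it
# into a different prompt's forward pass, continue from that layer.
import numpy as np

rng = np.random.default_rng(0)
W = [rng.standard_normal((8, 8)) * 0.1 for _ in range(4)]  # toy "layers"

def run_layers(hidden, start=0, patch=None):
    """hidden: (seq, d). Optionally overwrite one position before `start`."""
    if patch is not None:
        pos, vec = patch
        hidden = hidden.copy()
        hidden[pos] = vec                      # the patching operation
    for layer in W[start:]:
        hidden = np.tanh(hidden @ layer)
    return hidden

source = rng.standard_normal((5, 8))           # source prompt embeddings
target = rng.standard_normal((3, 8))           # separate inspection prompt

# 1) run the source prompt up to layer 2; take token 4's hidden state
h_src = run_layers(source, start=0)            # full pass for reference
h_src = np.tanh(np.tanh(source @ W[0]) @ W[1]) # state after layer 2
probe_vec = h_src[4]

# 2) run the target prompt to layer 2, patch position 1, continue
h_tgt = np.tanh(np.tanh(target @ W[0]) @ W[1])
patched = run_layers(h_tgt, start=2, patch=(1, probe_vec))
print(patched.shape)  # (3, 8): position 1 now "decodes" the probed state
```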
Machine Translation and Cross-Lingual Applications
Tuning LLMs with Contrastive Alignment Instructions for Machine Translation (MT) in Unseen, Low-resource Languages. Researchers from Apple introduced “contrastive alignment instructions” (AlignInstruct) to enhance MT in LLMs for unseen, low-resource languages.
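A heavily hedged sketch of what contrastive alignment instructions might look like: from a parallel sentence pair and a gold word alignment, build yes/no instruction examples that contrast true alignments with corrupted ones. The prompt template below is our invention for illustration, not the paper’s wording.

```python
# Illustrative construction of contrastive alignment instruction data.
import random

def alignment_instructions(src, tgt, alignment, rng=random):
    """alignment: list of (src_word, tgt_word) gold pairs."""
    examples = []
    tgt_words = tgt.split()
    for s_word, t_word in alignment:
        prompt = (f'Sentence pair: "{src}" / "{tgt}". '
                  f'Does "{s_word}" align with "{t_word}"?')
        examples.append((prompt, "yes"))
        # contrastive negative: pair the source word with a wrong target
        wrong = rng.choice([w for w in tgt_words if w != t_word])
        neg = (f'Sentence pair: "{src}" / "{tgt}". '
               f'Does "{s_word}" align with "{wrong}"?')
        examples.append((neg, "no"))
    return examples

pairs = alignment_instructions(
    "the cat sleeps", "le chat dort",
    [("cat", "chat"), ("sleeps", "dort")])
for prompt, label in pairs:
    print(label, "|", prompt)
```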