Physics • Machine Learning • Curiosity
Welcome to Tensors & Quarks
Exploring the cosmos of physics and the depths of machine learning with hands-on experiments, notes, and essays.
Latest Posts
From “Why” to “How”: ReAct’s Unified Reasoning-Acting Paradigm
Large language models (LLMs) have reshaped natural language processing by demonstrating impressive capabilities in text generation, summarization, and translation. Yet, as powerful as they are, these models often struggle when asked to perform complex, multi-step tasks that require deliberate planning and interaction with external information sources. Traditional chain-of-thought (CoT) prompting enables LLMs to articulate intermediate reasoning steps, but it remains confined to the model’s internal knowledge and inference capabilities. Conversely, action-based approaches have allowed models to execute external operations—such as querying an API or navigating an environment—but lack explicit internal reasoning, leading to unexplainable or brittle behavior. The ReAct framework addresses this gap by synergizing reasoning and acting in a unified prompt-based paradigm that interleaves “thoughts” and “actions” to solve complex tasks more effectively and transparently.
Read more →
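To make that interleaving concrete, here is a minimal, self-contained sketch of a ReAct-style loop. The `toy_llm` and `toy_search` functions are hypothetical stand-ins for a real language model and a real search tool (they are not code from the ReAct paper); the only point is the Thought → Action → Observation cycle.

```python
# Minimal sketch of a ReAct-style loop (hypothetical stand-ins, not the paper's code).
# The model alternates free-text "Thought" steps with "Action" steps; each action is
# executed against an external tool and its result is fed back as an "Observation".

def toy_llm(prompt: str) -> str:
    """Stand-in for a real LLM call; returns a canned Thought/Action pair."""
    if "Observation: Paris" in prompt:
        return "Thought: I now know the answer.\nAction: finish[Paris]"
    return "Thought: I should look up the capital of France.\nAction: search[capital of France]"

def toy_search(query: str) -> str:
    """Stand-in for an external tool such as a search API."""
    return "Paris"

def react(question: str, max_steps: int = 5) -> str:
    prompt = f"Question: {question}\n"
    for _ in range(max_steps):
        step = toy_llm(prompt)                      # model emits a Thought and an Action
        prompt += step + "\n"
        action = step.splitlines()[-1].removeprefix("Action: ")
        if action.startswith("finish["):
            return action[len("finish["):-1]        # terminate with the final answer
        if action.startswith("search["):
            observation = toy_search(action[len("search["):-1])
            prompt += f"Observation: {observation}\n"  # feed the tool output back in
    return "no answer"

print(react("What is the capital of France?"))  # -> Paris
```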
From Facts to Insight: Bridging the Compositionality Gap in Language Models
Large language models (LLMs) such as GPT-3 have transformed natural language understanding by memorizing vast amounts of text. Yet, when faced with questions that require combining multiple pieces of knowledge—so-called compositional reasoning—even the biggest models stumble. In their paper Measuring and Narrowing the Compositionality Gap in Language Models, Press et al. introduce a new metric for this shortfall, show that it persists despite model scale, and propose practical prompting techniques to close it.
Read more →
LoRA: A Breakthrough in Efficient Fine-Tuning of Large Language Models
As large language models (LLMs) like GPT-3, LLaMA, and BERT continue to grow in size and influence, one challenge becomes increasingly apparent: while these models offer exceptional capabilities, adapting them for new tasks remains expensive and resource-intensive. Fine-tuning a model with billions of parameters typically requires large datasets, massive compute power, and hours or even days of training time — luxuries not everyone can afford.
Read more →
Fine-Tuning Language Models: Welcome to the Nerdy Playground of LLMs
From LoRA to RLHF — and all the acronyms in between
So, you’ve got your hands on a fancy pre-trained language model. Great. It’s read more text than any human ever will, speaks in Shakespearean iambic pentameter and Python, and can tell you the capital of Burkina Faso at 3 AM.
Read more →
Welcome to Tensors & Quarks
This is the first post! Here I’ll share ideas in physics, AI, and their cosmic overlaps.
Read more →