AI Feed

Distill.pub Papers January 30, 2021

Curve Circuits

Reverse engineering the curve detection algorithm from InceptionV1 and reimplementing it from scratch.

Distill.pub Papers January 27, 2021

High-Low Frequency Detectors

A family of early-vision neurons reacting to directional transitions from high to low spatial frequency.

Jay Alammar Tech Media January 19, 2021

Finding the Words to Say: Hidden State Visualizations for Language Models

By visualizing the hidden state between a model's layers, we can get some clues as to the model's "thought process".

Lilian Weng Tech Media January 2, 2021

Controllable Neural Text Generation

[Updated on 2021-02-01: Updated to version 2.0 with several works added and many typos fixed.] [Updated on 2021-05-26: Add P-tuning and Prompt Tuning in the “prompt…

Jay Alammar Tech Media December 17, 2020

Interfaces for Explaining Transformer Language Models

Interfaces for exploring transformer language models by looking at input saliency and neuron activation. Explorable #1: Input saliency of a list of countries generated by a…

Distill.pub Papers December 8, 2020

Naturally Occurring Equivariance in Neural Networks

Neural networks naturally learn many transformed copies of the same feature, connected by symmetric weights.

Distill.pub Papers November 17, 2020

Understanding RL Vision

With diverse environments, we can analyze, diagnose and edit deep reinforcement learning models using attribution.

Lilian Weng Tech Media October 29, 2020

How to Build an Open-Domain Question Answering System?

[Updated on 2020-11-12: add an example on closed-book factual QA using OpenAI API (beta).] A model that can answer any question with regard to factual knowledge…

Distill.pub Papers September 11, 2020

Communicating with Interactive Articles

Examining the design of interactive articles by synthesizing theory from disciplines such as education, journalism, and visualization.

Distill.pub Papers August 27, 2020

Thread: Differentiable Self-organizing Systems

A collection of articles and comments with the goal of understanding how to design robust and general purpose self-organizing systems.

Distill.pub Papers August 27, 2020

Self-classifying MNIST Digits

Training an end-to-end differentiable, self-organising cellular automata for classifying MNIST digits.

Lilian Weng Tech Media August 6, 2020

Neural Architecture Search

Although most popular and successful model architectures are designed by human experts, it doesn’t mean we have explored the entire network architecture space and settled down…

Jay Alammar Tech Media July 27, 2020

How GPT3 Works – Visualizations and Animations

The tech world is abuzz with…

Distill.pub Papers June 17, 2020

Curve Detectors

Part one of a three-part deep dive into the curve neuron family.

Lilian Weng Tech Media June 7, 2020

Exploration Strategies in Deep Reinforcement Learning

[Updated on 2020-06-17: Add “exploration via disagreement” in the “Forward Dynamics” section.] Exploitation versus exploration is a critical topic in Reinforcement Learning. We’d like the RL…

Distill.pub Papers May 5, 2020

Exploring Bayesian Optimization

How to tune hyperparameters for your machine learning model using Bayesian optimization.

Lilian Weng Tech Media April 7, 2020

The Transformer Family

[Updated on 2023-01-27: After almost three years, I did a big refactoring update of this post to incorporate a bunch of new Transformer models since 2020.…

Distill.pub Papers April 1, 2020

An Overview of Early Vision in InceptionV1

An overview of all the neurons in the first five layers of InceptionV1, organized into a taxonomy of 'neuron groups.'

Distill.pub Papers March 16, 2020

Visualizing Neural Networks with the Grand Tour

By focusing on linear dimensionality reduction, we show how to visualize many dynamic phenomena in neural networks.

Distill.pub Papers March 10, 2020

Thread: Circuits

What can we learn if we invest heavily in reverse engineering a single neural network?

Distill.pub Papers March 10, 2020

Zoom In: An Introduction to Circuits

By studying the connections between neurons, we can find meaningful algorithms in the weights of neural networks.

Distill.pub Papers February 11, 2020

Growing Neural Cellular Automata

Training an end-to-end differentiable, self-organising cellular automata model of morphogenesis, able to both grow and regenerate specific patterns.

Lilian Weng Tech Media January 29, 2020

Curriculum for Reinforcement Learning

[Updated on 2020-02-03: mentioning PCG in the “Task-Specific Curriculum” section.] [Updated on 2020-02-04: Add a new “curriculum through distillation” section.]

Distill.pub Papers January 10, 2020

Visualizing the Impact of Feature Attribution Baselines

Exploring the baseline input hyperparameter, and how it impacts interpretations of neural network behavior.
