Neural Architecture Search
Although most popular and successful model architectures are designed by human experts, it doesn’t mean we have explored the entire network architecture space and settled down…
Discussions: Hacker News (397 points, 97 comments), Reddit r/MachineLearning (247 points, 27 comments) Translations: German, Korean, Chinese (Simplified), Russian, Turkish The tech world is abuzz with…
Part one of a three-part deep dive into the curve neuron family.
[Updated on 2020-06-17: Add “exploration via disagreement” in the “Forward Dynamics” section.] Exploitation versus exploration is a critical topic in Reinforcement Learning. We’d like the RL…
How to tune hyperparameters for your machine learning model using Bayesian optimization.
[Updated on 2023-01-27: After almost three years, I did a big refactoring update of this post to incorporate a bunch of new Transformer models since 2020.…
An overview of all the neurons in the first five layers of InceptionV1, organized into a taxonomy of 'neuron groups.'
By focusing on linear dimensionality reduction, we show how to visualize many dynamic phenomena in neural networks.
What can we learn if we invest heavily in reverse engineering a single neural network?
By studying the connections between neurons, we can find meaningful algorithms in the weights of neural networks.
Training an end-to-end differentiable, self-organising cellular automata model of morphogenesis, able to both grow and regenerate specific patterns.
[Updated on 2020-02-03: mentioning PCG in the “Task-Specific Curriculum” section.] [Updated on 2020-02-04: Add a new “curriculum through distillation” section.]
Exploring the baseline input hyperparameter, and how it impacts interpretations of neural network behavior.
Detailed derivations and open-source code to analyze the receptive fields of convnets.