‘Finding Neurons in a Haystack’ Initiative at MIT, Harvard and Northeastern University Uses Sparse Probing

Neural networks are commonly thought of as adaptive “feature extractors” that learn by progressively refining useful representations from raw inputs. This raises the question: which features are represented, and how? To better understand how human-interpretable, high-level features are represented in the neuron activations of LLMs, a research team from the Massachusetts Institute of Technology…