Eoin Farrell

Astrophysics PhD & AI Researcher


Published at

Jun 3, 2024

Unlearning with sparse autoencoders

AI

We trained sparse autoencoders on the open-source language model Pythia-2.8b and used them to unlearn Harry Potter-related knowledge. We can unlearn a significant amount of this knowledge with little to no side effects. This technique is worth exploring further.

Published at

Apr 18, 2024

A method for non-linear inversion of the stellar structure applied to gravity-mode pulsators

Astrophysics

Published in Farrell et al. (2024) Astronomy & Astrophysics, Volume 686

Published at

Apr 16, 2024

Experiments with an alternative method to promote sparsity in sparse autoencoders

AI

I experimented with alternatives to the standard L1 penalty used to promote sparsity in sparse autoencoders (SAEs). I found that including terms based on an alternative differentiable approximation of the feature sparsity in the loss function was an effective way to generate sparsity in SAEs trained on the residual stream of GPT2-small.

Published at

Dec 21, 2023

Experiments with Sparse Autoencoders on Attention Heads

AI

I trained sparse autoencoders on the key and query vectors of previous token heads and induction heads of attn-only-2l and gpt2-small and found interpretable features which I could intervene on in a predictable and interpretable way.

Published at

Oct 6, 2023

'Recency bias' in an induction head

AI

The induction head in a 2-layer attention-only transformer model has a slight bias towards tokens later in the context over earlier ones. Interestingly, its notion of position appears not to depend on positional embeddings or on any specific output from an attention head in the previous layer.

Published at

Oct 3, 2023

Induction head circuits for longer sequences

AI

In a 2-layer attention-only transformer model, an induction head can combine with an "averaging" head that stores some kind of average over the previous ~4-5 tokens to produce a circuit that can predict the next token in repeated sequences of length 2 to 5.

Published at

Oct 2, 2023

The previous token head and the "look-back-two" head

AI

A few plots of previous token heads, a discussion of how they work, and a comparison to a similar type of attention head, the "look-back-two" head.

Published at

Oct 1, 2023

Positional Embeddings in a 2-layer attention-only transformer model

AI

This post contains some visualisations and discussion of positional embeddings. The positional embeddings in a 2-layer attention-only transformer model arrange themselves into a helical structure. This presumably allows the model to generate QK matrices that move a few positions in relative terms using a similar transformation for all positions. The positional embeddings at positions 0 and 1023 have special properties.

Published at

Oct 1, 2022

The Initial Magnetic Field Distribution in AB Stars

Astrophysics

Published in Farrell et al. (2022), The Astrophysical Journal, Volume 938, Issue 1

Published at

May 1, 2022

Numerical experiments to help understand cause and effect in massive star evolution

Astrophysics

Published in Farrell et al. (2022), Monthly Notices of the Royal Astronomical Society, Volume 512, Issue 3

Published at

Mar 1, 2021

Is GW190521 the merger of black holes from the first stellar generations?

Astrophysics

Published in Farrell et al. (2021), Monthly Notices of the Royal Astronomical Society: Letters, Volume 502, Issue 1

Published at

Jul 1, 2020

SNAPSHOT: connections between internal and surface properties of massive stars

Astrophysics

Published in Farrell et al. (2020), Monthly Notices of the Royal Astronomical Society, Volume 495, Issue 4

Published at

May 1, 2020

The uncertain masses of progenitors of core-collapse supernovae and direct-collapse black holes

Astrophysics

Published in Farrell et al. (2020), Monthly Notices of the Royal Astronomical Society, Volume 494, Issue 1

Published at

Jan 1, 2019

Impact of binary interaction on the evolution of blue supergiants

Astrophysics

Published in Farrell et al. (2019), Astronomy & Astrophysics, Volume 621
