research

Selected publications and ongoing research.

I focus on mechanistic interpretability and practical AI safety research.

2024

  1. Unveiling the Black Box: Causal Inference and Feature Analysis in Fine-Tuned Language Models Using Sparse Autoencoders
    Rini Gupta and Sean Sica
    Aug 2024