
Organizations

@Entangled-Causality


Diksha Shrivastava

Diksha is an AI Safety researcher based in India who has spent the last eight months working full-time on safety research at a capability-first lab and is now moving to independent work. Her research sits at the intersection of dynamic agency, multi-agent risks, developmental interpretability, and scalable oversight: she's building on Causal Incentives research to study how co-evolving environments shape temporal goal structures in RL agents, using regret-based Unsupervised Environment Design. She's particularly interested in what it means for an agent to model its own training process, and what that implies for oversight. Alongside her research, she volunteers in reading groups and mentors people new to AI Safety. She's always glad to talk about risks from open-endedness, agent epistemics, or alignment as an environment design problem.

diksha-shrivastava13.github.io

Note: I’m not active on social media — the best way to reach me is by email.

Pinned

  1. goal-composition (Public)

    Somewhere between shard theory, open-endedness and causal modeling of agency

    Python

  2. AI-Makerspace/AI-Makerspace (Public)

    AI Makerspace: Blueprints for developing AI applications with cutting-edge technologies.

    JavaScript · 74 stars · 38 forks