I am a PhD student in Computer Science at Stanford University. My research interests are in scaling up decision-making methods such as reinforcement learning.
- Stanford University
- California
- asap7772.github.io
- https://orcid.org/0000-0001-5286-6082
- @Anikait_Singh_
- in/asap7772
- https://huggingface.co/Asap7772
Pinned
- fewshot-preference-optimization (Public): Few-Shot Preference Optimization (FSPO) personalizes LLMs by reframing reward modeling as a meta-learning problem, enabling rapid adaptation to user preferences with minimal labeled data, leveragin…
- understanding-rlhf (Public): Learning from preferences is a common paradigm for fine-tuning language models. Yet, many algorithmic design decisions come into play. Our new work finds that approaches employing on-policy samplin…
- OfflineRlWorkflow (Public): This repository accompanies the following paper: A Workflow for Offline Model-Free Robotic RL
- Personalized-Text-To-Image-Diffusion (Public): Public implementation of PPD
- DeepCriminalize (Public): Project that uses GANs to develop a sketch-artist-like representation of a criminal. Winners of the Cal Hack Fellowship 2019




