-
Notifications
You must be signed in to change notification settings - Fork 0
Open
Description
Currently, our Q-table (self.q_table) uses exact string keys for state-action values, e.g., "2,2|8,0|13". This means only exact state matches are possible. For better generalization and learning, we want to experiment with fuzzy lookup strategies that can retrieve similar states, not just exact matches.
Goals
- Implement and benchmark three different fuzzy lookup strategies for the Q-table:
- Manual Iteration with Custom Similarity Function
- Structured Key with Spatial Index (e.g., KD-tree)
- Embedding-based Approximate Nearest Neighbor (ANN) Search
Tasks
1. Manual Iteration with Custom Similarity Function
- Write a function that iterates over all Q-table keys and computes a similarity score (e.g., negative Manhattan distance) to the query key.
- Return the top-k most similar states.
- Benchmark lookup speed and quality for small/medium table sizes.
2. Structured Key with Spatial Index
- Change state keys to numeric tuples (e.g.,
(x, y, tx, ty, steps)). - Use a spatial index (e.g.,
scikit-learn's KD-tree or BallTree) to perform fast nearest-neighbor lookups. - Compare lookup speed and accuracy to manual iteration.
3. Embedding-based ANN Search
- Define an embedding function for states (could be hand-crafted or learned).
- Use an ANN library (e.g.,
faiss,annoy, orscann) to index and search state embeddings. - Benchmark performance and generalization.
Evaluation
- Compare the three approaches in terms of:
- Lookup speed (for various Q-table sizes)
- Memory usage
- Quality of retrieved states (does it help RL performance?)
- Document findings and recommend the best approach for our use case.
References
Metadata
Metadata
Assignees
Labels
No labels