Feature Summary
- Category: Data Collection
- Priority: High
Background and Objective
- Problem Statement: HN posts need to be represented optimally for later unsupervised segmentation
- Objective: Determine what additional information besides post title and URL, if any, should be encoded
Technical Requirements
- Languages and Frameworks: Python - sklearn, PyTorch
- Dependencies:
- representative extract of data to test
- Embedding model selected
Tests and Evaluation Metrics
Evaluation Metrics: (e.g., Inertia, Silhouette Score for $k$-means clustering):
Feature Summary
Background and Objective
Technical Requirements
Tests and Evaluation Metrics
Evaluation Metrics: (e.g., Inertia, Silhouette Score for$k$ -means clustering):