This repository was archived by the owner on Sep 26, 2025. It is now read-only.
Merged
15 changes: 11 additions & 4 deletions LESSONS_LEARNT.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
# Lessons Learned While Experimenting with Windsurf, Claude Sonnet 3.7, and Gemini 2.5 Pro
This guide outlines practical tips and best practices for working effectively with LLM-based coding assistants. These insights are drawn from hands-on experience building Stridely and experimenting with models like Windsurf, Claude Sonnet 3.7, and Gemini 2.5 Pro.
This guide outlines practical tips and best practices for working effectively with LLM-based coding assistants. These insights are drawn from hands-on experience building ZoneLens and experimenting with models like Windsurf, Claude 3.5/3.7 Sonnet, and Gemini 2.5 Pro.

## Git Is Your Best Friend
Large Language Models (LLMs) are excellent at generating large volumes of code quickly. However, they can unintentionally modify existing logic or introduce regressions. Robust version control is essential.
@@ -83,11 +83,18 @@ LLMs often introduce unnecessary complexity. Stay focused on core functionality

## LLM Collaboration Nuances

When working with LLMs (e.g., Gemini 2.5 Pro, Claude 3 Sonnet) on intricate code modifications, such as adjusting specific filter conditions within a complex method, they may sometimes propose larger-scale refactors rather than small, surgical changes. These expansive refactors can often be perplexing, creating unnecessary mess and potentially breaking existing functionality. For such fine-grained adjustments, it is frequently more effective to leverage one's own coding expertise to make the precise change, or to guide the LLM with very specific, constrained instructions. This highlights the importance of human oversight and a critical approach when pair-programming with AI, especially for delicate modifications in an existing codebase.
When working with LLMs (e.g., Gemini 1.5 Pro, Claude 3.5 Sonnet) on intricate code modifications, such as adjusting specific filter conditions within a complex method, they may propose larger-scale refactors instead of small, surgical changes. These expansive refactors can be perplexing, create unnecessary mess, and break existing functionality. For such fine-grained adjustments, it is often more effective to use one's own expertise for the precise change or to guide the LLM with specific, constrained instructions. This highlights the importance of human oversight and a critical approach when pair-programming with AI, especially for delicate modifications in an existing codebase.

**Django Forms and LLMs:** Generating well-functioning Django forms with LLMs can be particularly frustrating. It usually leads to breaking a ton of stuff and getting stuck in a vicious circle. This highlights a specific area where LLM assistance requires extreme caution and often more manual intervention.


## LLMs Are Sometimes Surprisingly Dumb

An LLM can be surprisingly dumb once a project's scope grows. When changing the heart rate (HR) calculation from elapsed time to moving time, the LLM broke the code three times and never delivered a working result.
Even though the change required only minimal code edits, the LLM failed spectacularly. Very disappointing.

**Django Forms and LLMs:** Generating well-functioning Django forms with LLMs can be particularly frustrating. It usually leads to breaking a ton of stuff and circulating in a vicious circle. This highlights a specific area where LLM assistance requires extreme caution and often more manual intervention.

## Final Thoughts
Effective LLM-assisted development is a skill, one that improves with experience. By applying structured workflows, clear communication, and a collaborative mindset, you can get the most out of tools like Claude Sonnet, Gemini, and Windsurf.
Effective LLM-assisted development is a skill, one that improves with experience. By applying structured workflows, clear communication, and a collaborative mindset, you can get the most out of tools like Claude 3.5/3.7 Sonnet, Gemini 2.5 Pro, and Windsurf.

Co-authored with a free-tier OpenAI ChatGPT.
78 changes: 59 additions & 19 deletions backend/api/hr_processing.py
@@ -22,21 +22,35 @@

from __future__ import annotations

from collections import defaultdict
from typing import TYPE_CHECKING, Any

from api.logging import get_logger
from api.models import ActivityType

if TYPE_CHECKING:
    from collections.abc import Sequence

    from api.models import CustomZonesConfig

logger = get_logger(__name__)

OUTSIDE_ZONES_KEY = "Time Outside Defined Zones"
# Minimum distance in meters between consecutive samples for a sample to count as moving.
# The values are somewhat arbitrary, chosen to match Strava's moving time as closely as possible.
# Strava's algorithm is certainly different; this is a simple poor man's approximation.
MOVING_DISTANCE_THRESHOLDS = defaultdict(
    lambda: 0.8,
    {
        ActivityType.RIDE: 3.0,
        ActivityType.RUN: 2.0,
        ActivityType.DEFAULT: 0.8,
    },
)


def parse_activity_streams(
    streams_data: dict[str, Any] | None,
) -> tuple[list[int] | None, list[int] | None]:
) -> tuple[list[int] | None, list[int] | None, list[float] | None, list[bool] | None]:
    """Parse the raw activity stream data from Strava to extract time and heart rate series.

    Parameters
@@ -51,30 +65,32 @@ def parse_activity_streams(
        Time series in seconds, or None if not found or invalid.
    heartrate_data
        Heart rate series in bpm, or None if not found or invalid.
    distance_data
        Distance series in meters, or None if not found or invalid.
    moving_data
        Moving data series, or None if not found or invalid.
    """
    if not streams_data:
        logger.warning("No stream data provided to parse.")
        return None, None
        return None, None, None, None

    time_data = _parse_activity_stream(streams_data, "time")
    heartrate_data = _parse_activity_stream(streams_data, "heartrate")
    distance_data = _parse_activity_stream(streams_data, "distance", float)
    moving_data = _parse_activity_stream(streams_data, "moving", bool)

    if time_data and heartrate_data and len(time_data) != len(heartrate_data):
        logger.warning(
            f"Time stream (len {len(time_data)}) and heartrate stream (len {len(heartrate_data)}) "
            "have different lengths. This might indicate an issue with the data."
        )
    return time_data, heartrate_data, distance_data, moving_data  # type: ignore[return-value]

    return time_data, heartrate_data


def _parse_activity_stream(data_streams: dict[str, Any], stream_type: str) -> list[int] | None:
def _parse_activity_stream(
    data_streams: dict[str, Any], stream_type: str, expected_type: type[bool | int | float] = int
) -> list[int] | list[bool] | list[float] | None:
    stream = data_streams.get(stream_type)
    if isinstance(stream, dict) and isinstance(stream.get("data"), list):
        if not (data := stream["data"]):
            logger.warning(f"{stream_type.capitalize()} stream data array is empty.")
            return None
        if not all(isinstance(t, int) for t in data):
        if not all(isinstance(t, expected_type) for t in data):
            logger.warning(f"{stream_type.capitalize()} stream data contains values of an unexpected type.")
            return None
        return data
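Review note on the `isinstance(t, expected_type)` check above: in Python, `bool` is a subclass of `int`, and an `int` is not an instance of `float`. The sketch below shows a stricter validator under those rules. The helper name `is_valid_stream` is hypothetical and not part of this PR, and whether Strava ever sends whole-number distance samples is an assumption, not something the diff confirms.

```python
from __future__ import annotations

from typing import Any


def is_valid_stream(data: list[Any], expected_type: type) -> bool:
    """Hypothetical stricter check: return True if every sample fits expected_type."""
    if expected_type is bool:
        return all(isinstance(v, bool) for v in data)
    if expected_type is int:
        # bool is a subclass of int, so exclude it explicitly.
        return all(isinstance(v, int) and not isinstance(v, bool) for v in data)
    if expected_type is float:
        # Accept whole-number samples (for example 0) alongside floats.
        return all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in data)
    return False


# The two language facts this sketch is built around:
bool_is_int = isinstance(True, int)    # True
int_is_float = isinstance(0, float)    # False
```

With the plain `isinstance` check, a distance stream such as `[0, 12.5]` would be rejected and a heart rate stream such as `[True, 120]` would be accepted; the sketch reverses both outcomes.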
@@ -127,6 +143,8 @@ def determine_hr_zone(hr_value: int, zones_config: CustomZonesConfig | None) ->
def calculate_time_in_zones(
    time_data: list[int] | None,
    heartrate_data: list[int] | None,
    distance_data: list[float] | None,
    moving_data: list[bool] | None,
    zones_config: CustomZonesConfig | None,
) -> dict[str, int]:
    """Calculate the total time spent in each custom heart rate zone for an activity.
@@ -137,6 +155,10 @@ def calculate_time_in_zones(
        A list of integers representing the time series in seconds (sorted).
    heartrate_data
        A list of integers representing the heart rate series in bpm.
    distance_data
        A list of floats representing the distance series in meters.
    moving_data
        A list of booleans representing whether the activity was moving at each time point.
    zones_config
        The CustomZonesConfig object containing the zone definitions.

@@ -179,16 +201,34 @@ def calculate_time_in_zones(
        logger.info("Insufficient data points (need at least 2) to calculate time in zones.")
        return time_spent_in_zones

    for i in range(len(time_data) - 1):
        hr_value = heartrate_data[i]
        if (duration_seconds := time_data[i + 1] - time_data[i]) <= 0:
    moving_threshold = MOVING_DISTANCE_THRESHOLDS[
        zones_config.activity_type if zones_config is not None else ActivityType.DEFAULT
    ]
    for idx in range(1, len(heartrate_data)):
        # Skip non-moving times if data available
        if not _is_moving_datapoint(moving_data, distance_data, moving_threshold, idx):
            continue

        if zone_name := determine_hr_zone(hr_value, zones_config):
            time_spent_in_zones[zone_name] = (
                time_spent_in_zones.get(zone_name, 0) + duration_seconds
            )
        if (duration := time_data[idx] - time_data[idx - 1]) <= 0:
            continue

        # Take the average of the current and previous heart rate data points
        heart_rate = round((heartrate_data[idx] + heartrate_data[idx - 1]) / 2)
        if zone_name := determine_hr_zone(heart_rate, zones_config):
            time_spent_in_zones[zone_name] = time_spent_in_zones.get(zone_name, 0) + duration
        else:
            time_spent_in_zones[OUTSIDE_ZONES_KEY] += duration_seconds
            time_spent_in_zones[OUTSIDE_ZONES_KEY] += duration

    return time_spent_in_zones


def _is_moving_datapoint(
    moving_data: Sequence[bool] | None,
    distance_data: Sequence[float] | None,
    moving_threshold: float,
    idx: int,
) -> bool:
    # Cannot evaluate if moving/distance data are not available
    if not (moving_data and distance_data):
        return True
    return moving_data[idx] or (distance_data[idx] - distance_data[idx - 1] > moving_threshold)
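To illustrate the switch from elapsed time to moving time, here is a minimal self-contained sketch of the same idea: skip intervals that are neither flagged as moving nor cover more than the distance threshold, then attribute each remaining interval to the zone of the averaged heart rate. The `ZONES` table, `zone_for`, and `time_in_zones` are hypothetical stand-ins for `CustomZonesConfig`, `determine_hr_zone`, and `calculate_time_in_zones`, and the sample values are made up. The sketch assumes both the moving and distance streams are present.

```python
from __future__ import annotations

# Hypothetical zone bounds (bpm), for illustration only.
ZONES = {"Z1": (0, 120), "Z2": (121, 150), "Z3": (151, 250)}
OUTSIDE = "Time Outside Defined Zones"


def zone_for(hr: int) -> str | None:
    """Return the name of the zone containing hr, or None if no zone matches."""
    for name, (low, high) in ZONES.items():
        if low <= hr <= high:
            return name
    return None


def time_in_zones(
    time_s: list[int],
    hr: list[int],
    distance_m: list[float],
    moving: list[bool],
    threshold_m: float,
) -> dict[str, int]:
    """Sum interval durations per zone, counting only moving intervals."""
    totals: dict[str, int] = {}
    for idx in range(1, len(hr)):
        covered_distance = distance_m[idx] - distance_m[idx - 1] > threshold_m
        if not (moving[idx] or covered_distance):
            continue  # stationary interval: contributes no time
        duration = time_s[idx] - time_s[idx - 1]
        if duration <= 0:
            continue
        # Average of the two heart rate samples that bound the interval.
        avg_hr = round((hr[idx] + hr[idx - 1]) / 2)
        zone = zone_for(avg_hr) or OUTSIDE
        totals[zone] = totals.get(zone, 0) + duration
    return totals


result = time_in_zones(
    time_s=[0, 10, 20, 30, 40],
    hr=[110, 130, 140, 160, 150],
    distance_m=[0.0, 30.0, 30.0, 30.5, 60.0],
    moving=[False, True, False, False, True],
    threshold_m=2.0,
)
print(result)  # {'Z1': 10, 'Z3': 10}
```

The sample activity lasts 40 seconds, but only 20 seconds are counted because the two middle intervals are stationary. Note that Python's built-in `round` rounds halves to the nearest even integer, so an average of 120.5 becomes 120, which can matter at a zone boundary.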
2 changes: 1 addition & 1 deletion backend/api/models.py
@@ -321,7 +321,7 @@ def __str__(self) -> str:

def get_default_processing_start_time() -> datetime:
    """Return the default start time for activity processing."""
    return timezone.make_aware(datetime(2025, 1, 1)) - timedelta(days=1)
    return timezone.make_aware(datetime(2025, 6, 1)) - timedelta(days=1)


class ActivityProcessingQueue(models.Model):
2 changes: 1 addition & 1 deletion backend/api/strava_client.py
@@ -360,7 +360,7 @@ def fetch_activity_streams(
        response = self.get(
            url=STRAVA_API_STREAMS_URL_TEMPLATE.format(activity_id=activity_id),
            access_token=self.access_token,
            params={"keys": "heartrate,time,", "key_by_type": "true"},
            params={"keys": "heartrate,time,distance,moving", "key_by_type": "true"},
        )
        response.raise_for_status()  # Raise HTTPError for bad responses (4XX or 5XX)
        return response.json()
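For orientation, the parsing code in this PR reads the response as a mapping from stream type to an object with a `data` list. The sketch below mirrors that shape with a made-up payload; `sample_streams` and `stream_values` are hypothetical names, and any further fields a real response may carry are left out.

```python
from __future__ import annotations

from typing import Any

# Hypothetical payload in the shape the parser expects: one entry per requested key.
sample_streams: dict[str, Any] = {
    "time": {"data": [0, 5, 10]},
    "heartrate": {"data": [98, 104, 111]},
    "distance": {"data": [0.0, 12.5, 27.1]},
    "moving": {"data": [False, True, True]},
}


def stream_values(streams: dict[str, Any], key: str) -> list[Any] | None:
    """Return the non-empty data list for a stream, or None if missing or malformed."""
    stream = streams.get(key)
    if isinstance(stream, dict) and isinstance(stream.get("data"), list):
        return stream["data"] or None
    return None


requested = ("time", "heartrate", "distance", "moving")
lengths = {key: len(stream_values(sample_streams, key) or []) for key in requested}
print(lengths)  # {'time': 3, 'heartrate': 3, 'distance': 3, 'moving': 3}
```

Comparing the lengths of all four streams, not only time and heart rate, would be one way to catch misaligned samples before the zone calculation indexes them in parallel.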