It would be ideal to have a function like am.get_within_time_range(segments: pd.DataFrame, **kwargs) that filters an arbitrary DataFrame of segments by time range, instead of requiring us to pass in something like am.annotations.
I've had the need for such a function on a few occasions.
Below is a function I wrote a while back for some quick experimentation. It seems to behave slightly differently from the ChildProject (CP) one: for one, I don't think the CP one actually cuts segments at the edges. Of course, we could parametrize that.
from typing import Callable

import pandas as pd

# TimeInterval is not defined in this snippet; it is assumed to expose
# `start` and `stop` as datetime objects.


def _get_within_time_range(
    segments: pd.DataFrame, interval: TimeInterval
) -> pd.DataFrame:
    """
    The ChildProject function doesn't seem to be correct sometimes(?) Besides, it clips segments at the edge,
    which I don't want for the purpose of this script.
    This function is also significantly faster, and from profiling it's clear that
    the ChildProject routine was the main performance bottleneck in running this script.
    """
    # Keep only segments whose clock time overlaps the interval.
    segments = segments[
        (
            segments["offset_time"].map(lambda t: t.to_pydatetime().time())
            >= interval.start.time()
        )
        & (
            segments["onset_time"].map(lambda t: t.to_pydatetime().time())
            <= interval.stop.time()
        )
    ]
    # Move onsets and offsets that fall outside the interval onto its bounds.
    segments = segments.apply(get_row_callback_min(interval), axis=1)
    segments = segments.apply(get_row_callback_max(interval), axis=1)
    return segments


def get_row_callback_min(
    time_interval: TimeInterval,
) -> Callable[[pd.Series], pd.Series]:
    def row_callback(row: pd.Series) -> pd.Series:
        onset_time: pd.Timestamp = row["onset_time"]
        # Onset starts before the interval: move it to the interval start,
        # keeping the segment's own calendar day.
        if onset_time.to_pydatetime().time() <= time_interval.start.time():
            row["onset_time"] = pd.Timestamp(
                year=onset_time.year,
                month=onset_time.month,
                day=onset_time.day,
                hour=time_interval.start.hour,
                minute=time_interval.start.minute,
                second=time_interval.start.second,
            )
        return row

    return row_callback


def get_row_callback_max(
    time_interval: TimeInterval,
) -> Callable[[pd.Series], pd.Series]:
    def row_callback(row: pd.Series) -> pd.Series:
        offset_time: pd.Timestamp = row["offset_time"]
        # Offset ends after the interval: move it to the interval stop,
        # keeping the segment's own calendar day.
        if offset_time.to_pydatetime().time() >= time_interval.stop.time():
            row["offset_time"] = pd.Timestamp(
                year=offset_time.year,
                month=offset_time.month,
                day=offset_time.day,
                hour=time_interval.stop.hour,
                minute=time_interval.stop.minute,
                second=time_interval.stop.second,
            )
        return row

    return row_callback
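
To make the "parametrize that" idea concrete, here is a rough sketch of what a segment-level function with a clip switch could look like. This is not the ChildProject API: the function name, the start/stop/clip parameters, and the use of plain datetime.time bounds instead of a TimeInterval are all assumptions for illustration. It also assumes onset_time and offset_time are datetime64 columns and that the window does not cross midnight. It replaces the row-wise apply calls with vectorized operations, which should help if performance matters.

```python
from datetime import time

import pandas as pd


def get_within_time_range(
    segments: pd.DataFrame,
    start: time,
    stop: time,
    clip: bool = True,
) -> pd.DataFrame:
    """Hypothetical sketch: keep segments overlapping the daily window [start, stop].

    Assumes `onset_time` / `offset_time` are datetime64 columns and start <= stop.
    With clip=True, segment edges outside the window are moved onto its bounds.
    """
    # Compare clock times only, ignoring the date.
    overlaps = (segments["offset_time"].dt.time >= start) & (
        segments["onset_time"].dt.time <= stop
    )
    result = segments[overlaps].copy()
    if not clip:
        return result

    # Window bounds placed on each segment's own calendar day.
    day_start = result["onset_time"].dt.normalize() + pd.Timedelta(
        hours=start.hour, minutes=start.minute, seconds=start.second
    )
    day_stop = result["offset_time"].dt.normalize() + pd.Timedelta(
        hours=stop.hour, minutes=stop.minute, seconds=stop.second
    )

    # Keep values already inside the window; otherwise use the bound.
    result["onset_time"] = result["onset_time"].where(
        result["onset_time"] >= day_start, day_start
    )
    result["offset_time"] = result["offset_time"].where(
        result["offset_time"] <= day_stop, day_stop
    )
    return result


# Example call on a toy DataFrame (values are made up).
example = pd.DataFrame(
    {
        "onset_time": pd.to_datetime(["2023-01-01 08:50:00", "2023-01-01 09:30:00"]),
        "offset_time": pd.to_datetime(["2023-01-01 09:10:00", "2023-01-01 09:40:00"]),
    }
)
clipped = get_within_time_range(example, time(9, 0), time(10, 0))
unclipped = get_within_time_range(example, time(9, 0), time(10, 0), clip=False)
```

Defaulting clip to True mirrors the snippet above; whether the default should instead match the current CP behaviour is an open question.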