Hello,
In KT1, does the elapsed_time correspond to the prior question, as is the case in the Kaggle competition, or to the current question? Also, is it the average time spent on questions from the same bundle?
Specifically, could you explain why the lag_time computed below can contain negative values:
import pandas as pd

df = pd.read_csv('u42.csv')
bundle_size = dict(df.groupby('solving_id').size())
if not df['timestamp'].is_monotonic_increasing:
    df = df.sort_values('timestamp')
df['lag_time'] = df.apply(
    lambda r:
        0 if r.name == 0 or r['solving_id'] == df.loc[r.name - 1, 'solving_id']
        else r['timestamp'] - (df.loc[r.name - bundle_size[r['solving_id'] - 1], 'timestamp'] + df.loc[r.name - 1, 'elapsed_time'] * bundle_size[r['solving_id'] - 1]),
    axis=1
).squeeze()
print(df[df['lag_time'] < 0])
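In case the row-wise lambda above is hard to follow, here is a minimal vectorized sketch of the quantity I am trying to compute. It rests on my own assumptions, which may be wrong: that elapsed_time belongs to the current question, and that it is a per-question average, so a bundle ends at its first timestamp plus elapsed_time times the bundle size. The names add_lag_time and bundle are hypothetical helpers of mine, not part of EdNet. Unlike the snippet above, it does not rely on solving_id values being consecutive or on the row index being positional.

```python
import pandas as pd

def add_lag_time(df: pd.DataFrame) -> pd.DataFrame:
    # Stable sort keeps the original order of rows that share a timestamp.
    df = df.sort_values('timestamp', kind='stable').reset_index(drop=True)

    # Number each run of consecutive rows with the same solving_id as one bundle.
    changed = df['solving_id'] != df['solving_id'].shift()
    bundle = changed.cumsum().rename('bundle')

    per_bundle = df.groupby(bundle).agg(
        start=('timestamp', 'first'),
        last_elapsed=('elapsed_time', 'last'),
        n_rows=('solving_id', 'size'),
    )

    # ASSUMPTION: bundle end = first timestamp + elapsed_time * number of rows.
    assumed_end = per_bundle['start'] + per_bundle['last_elapsed'] * per_bundle['n_rows']

    # Lag of a bundle = its start minus the assumed end of the previous bundle.
    # The first bundle has no predecessor, so its lag is set to 0.
    lag = (per_bundle['start'] - assumed_end.shift()).fillna(0)

    # Only the first row of each bundle carries the lag; other rows get 0.
    df['lag_time'] = bundle.map(lag).where(changed, 0)
    return df
```

With this version, a negative lag_time means the assumed end of the previous bundle falls after the start of the next one, which is exactly the case I would like to understand.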
Thank you.