Looking at the tests, there is a lot of boilerplate that could be removed, and the tests would be more readable, if we could specify dtypes for the functions in core.construction (including from_any, from_list, and from_series).
For example:
df1 = pd.DataFrame(
    [
        ['chr1', 1, 1]
    ],
    columns=['chrom', 'start', 'end']
).astype({"start": pd.Int64Dtype(), "end": pd.Int64Dtype()})
would become:
df1 = bf.from_any(['chr1', 1, 1])
We provide a dictionary of default column names in core.specs; however, there does not seem to be a dictionary (or other specification) of default dtypes.
One option would be to add them right after the default column names in core.specs:
https://github.com/open2c/bioframe/blob/main/bioframe/core/specs.py#L11C1-L12C1
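To make the idea concrete, here is a minimal sketch of what such a specification could look like. The names `DEFAULT_COLNAMES`, `DEFAULT_DTYPES`, and `apply_default_dtypes` are hypothetical and do not exist in bioframe; the dtype values are placeholders for whatever is decided.

```python
import pandas as pd

# Hypothetical defaults, NOT part of bioframe: a dtype dictionary that
# sits alongside the default column names.
DEFAULT_COLNAMES = ("chrom", "start", "end")
DEFAULT_DTYPES = {
    "chrom": object,
    "start": pd.Int64Dtype(),
    "end": pd.Int64Dtype(),
}


def apply_default_dtypes(df, dtypes=None):
    """Cast the columns listed in `dtypes` that are present in `df`."""
    dtypes = DEFAULT_DTYPES if dtypes is None else dtypes
    present = {col: dt for col, dt in dtypes.items() if col in df.columns}
    return df.astype(present)


# The constructors could call a helper like this internally, so tests
# would no longer need the explicit .astype(...) call.
df1 = apply_default_dtypes(
    pd.DataFrame([["chr1", 1, 1]], columns=list(DEFAULT_COLNAMES))
)
```

With a helper along these lines, from_any, from_list, and from_series could accept an optional dtypes argument and fall back to the defaults when none is given.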
If added, should they be int, pd.Int64Dtype(), or something else for start and end?
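For reference, one practical difference between the two candidates is how they handle missing coordinates. The sketch below uses only plain pandas, with made-up values: the nullable pd.Int64Dtype() keeps a missing entry as <NA>, while casting the same data to int raises an error.

```python
import pandas as pd

# A start column with one missing value; pandas infers float64 here.
starts = pd.Series([1, None, 3])

# The nullable extension dtype keeps the missing entry as <NA>.
nullable = starts.astype(pd.Int64Dtype())

# The plain NumPy integer dtype cannot represent a missing value.
try:
    starts.astype(int)
    plain_int_ok = True
except ValueError:
    plain_int_ok = False
```

So the choice partly depends on whether frames with null start or end values should remain constructible with the default dtypes.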