Skip to content

Handle host frames in serialization#5309

Merged
kkraus14 merged 6 commits into
rapidsai:branch-0.15from
jakirkham:hdl_host_frames_ser
May 31, 2020
Merged

Handle host frames in serialization#5309
kkraus14 merged 6 commits into
rapidsai:branch-0.15from
jakirkham:hdl_host_frames_ser

Conversation

@jakirkham
Copy link
Copy Markdown
Member

Depends on PR ( #5308 )

As some serialization strategies may rely on some or all frames on host, add logic to Serializable to track which frames are on host or device and ensure they are handled appropriately. This should help cases like pack/unpack ( #5025 ) and allow Serializable to be used by other objects using host frames like those in FIL ( rapidsai/cuml#2263 ).

Note: Adding this as a draft for now, though am not aware of anything missing from it (unless reviewers spot something ;)

cc @shwina @hcho3

As some serialization strategies may rely on some or all frames on host,
add logic to `Serializable` to track which frames are on host or device
and ensure they are handled appropriately.
@jakirkham jakirkham added 2 - In Progress Currently a work in progress Python Affects Python cuDF API. labels May 28, 2020
@jakirkham jakirkham marked this pull request as ready for review May 28, 2020 21:26
@jakirkham jakirkham requested a review from a team as a code owner May 28, 2020 21:26
@jakirkham jakirkham changed the title WIP: Handle host frames in serialization Handle host frames in serialization May 28, 2020
@jakirkham jakirkham added 4 - Needs cuDF (Python) Reviewer and removed 2 - In Progress Currently a work in progress labels May 28, 2020
Copy link
Copy Markdown
Contributor

@shwina shwina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@shwina
Copy link
Copy Markdown
Contributor

shwina commented May 28, 2020

FWIW the pickle failure you're seeing here is resolved in the downstream PR #5025; although I'm not clear on why it comes up here.

@jakirkham
Copy link
Copy Markdown
Member Author

Thanks! Yeah was looking at that as well. It looks like you were able to drop serialize/deserialize from DataFrame. Wonder if there is something missing in those methods.

@kkraus14 kkraus14 added 5 - Ready to Merge Testing and reviews complete, ready to merge and removed 4 - Needs cuDF (Python) Reviewer labels May 29, 2020
@jakirkham
Copy link
Copy Markdown
Member Author

After looking more closely at the test failures here, I think this is the same issue we ran into yesterday. IOW the pickle test is assuming approximate equality of the size of the pickled data to the size of the Python object. However pickling will always tack in more information for the unpickling process. So we should just be checking that the pickled data is at least as big as the original data (though most likely bigger). Submitted PR ( #5334 ), which grabs your change from PR ( #5025 ). Hopefully once that is in we will clear the failure here.

@jakirkham
Copy link
Copy Markdown
Member Author

Merged in branch-0.15 with the pickle test fix. Let's see if that resolves it.

@kkraus14 kkraus14 merged commit 8209b96 into rapidsai:branch-0.15 May 31, 2020
@jakirkham jakirkham deleted the hdl_host_frames_ser branch May 31, 2020 07:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

5 - Ready to Merge Testing and reviews complete, ready to merge Python Affects Python cuDF API.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants