Pass device memory to MParT through bindings

Ultimately, I'd like to create a PyTorch tensor (or Julia array), move it to the device, and then wrap that GPU memory in a way that Kokkos can use inside MParT. This issue is for assessing whether that is feasible and, if so, producing an initial implementation in the Python bindings.
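
To make the idea concrete, here is a rough sketch of the Python half only: collecting the raw pointer, shape, and strides that a binding would need in order to wrap the memory on the C++ side (presumably as an unmanaged Kokkos view). `DeviceBufferInfo`, `describe_tensor`, `is_row_major`, and `example_with_torch` are hypothetical names for illustration and are not part of MParT; the C++ function that would consume this information does not exist yet and is not shown. The only PyTorch calls assumed are `data_ptr()`, `stride()`, `element_size()`, `shape`, and `is_cuda`.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DeviceBufferInfo:
    """Hypothetical descriptor of a tensor's memory (not an MParT type)."""
    ptr: int                  # raw address of the first element
    shape: Tuple[int, ...]    # extent of each dimension
    strides: Tuple[int, ...]  # strides in elements, not bytes
    itemsize: int             # bytes per element
    on_gpu: bool              # True if the memory lives on a CUDA device


def describe_tensor(t) -> DeviceBufferInfo:
    """Collect pointer and layout information from a PyTorch-like tensor.

    Nothing is copied. The tensor must stay alive for as long as anything
    on the C++ side uses the pointer.
    """
    return DeviceBufferInfo(
        ptr=int(t.data_ptr()),
        shape=tuple(int(s) for s in t.shape),
        strides=tuple(int(s) for s in t.stride()),
        itemsize=int(t.element_size()),
        on_gpu=bool(t.is_cuda),
    )


def is_row_major(info: DeviceBufferInfo) -> bool:
    """Check whether the strides describe a contiguous row-major layout."""
    expected = 1
    for extent, stride in zip(reversed(info.shape), reversed(info.strides)):
        if extent != 1 and stride != expected:
            return False
        expected *= extent
    return True


def example_with_torch() -> DeviceBufferInfo:
    """Describe a small tensor; uses the GPU when one is available."""
    import torch  # imported here so the helpers above do not require it

    device = "cuda" if torch.cuda.is_available() else "cpu"
    x = torch.zeros(3, 4, dtype=torch.float64, device=device)
    return describe_tensor(x)
```

Calling `example_with_torch()` on a machine with PyTorch installed returns the descriptor for a 3x4 `float64` tensor. Whether the strides match the layout Kokkos expects for a given view type would still need to be checked on the C++ side, and the lifetime of the tensor has to be managed so the memory is not freed while MParT holds the pointer.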