Can we add support for running operations on GPU arrays?
Right now we rely on local numpy primitives to implement tile-level operations. To use GPU arrays, we would likely need to implement the operations ourselves so that we could generate CUDA code as necessary. This is easy for simple operations like add or multiply, but much harder if the user wants to do:
def f(data):
    # some python code
    return new_data

Y = map(f, X)
Here we would need to analyze arbitrary Python code, and I'm not sure how to handle this case without a serious amount of work. It might be possible to capture the internal operations using a lazy expression:
input = lazy_expr()
result = f(input)
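Now result records the cumulative effect of every operation the user's function applied to its input. But this only works if the user's code avoids things like data-dependent control flow or printing a result, both of which need concrete values rather than a recorded expression.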
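
As a rough illustration of the lazy-expression idea, here is a minimal sketch in plain Python. Everything in it is hypothetical: the `LazyExpr` class, its `evaluate` method, and the placeholder name `x` are made up for this example and are not part of the existing code base. It only records `+`, `*` and `>`, and it replays the recorded tree on a plain Python value; a real version would hand the tree to a code generator instead.

```python
import operator

# Operators this sketch knows how to record and replay.
_OPS = {"+": operator.add, "*": operator.mul, ">": operator.gt}


class LazyExpr:
    """Hypothetical node that records operations instead of running them."""

    def __init__(self, op=None, args=()):
        self.op = op      # None marks the input placeholder
        self.args = args  # (left, right) operands for recorded operations

    def __add__(self, other):
        return LazyExpr("+", (self, other))

    def __radd__(self, other):
        return LazyExpr("+", (other, self))

    def __mul__(self, other):
        return LazyExpr("*", (self, other))

    def __rmul__(self, other):
        return LazyExpr("*", (other, self))

    def __gt__(self, other):
        return LazyExpr(">", (self, other))

    def __bool__(self):
        # An `if` on a lazy value needs a concrete answer, which we lack.
        raise TypeError("data-dependent control flow cannot be traced")

    def __str__(self):
        if self.op is None:
            return "x"
        left, right = (str(a) for a in self.args)
        return f"({left} {self.op} {right})"

    def evaluate(self, value):
        """Replay the recorded operations on a concrete input value."""
        if self.op is None:
            return value
        left, right = (
            a.evaluate(value) if isinstance(a, LazyExpr) else a
            for a in self.args
        )
        return _OPS[self.op](left, right)


def f(data):
    # Straight-line user code: traces cleanly.
    return data * 2 + 1


def g(data):
    # Branches on the data: cannot be traced by this approach.
    if data > 0:
        return data
    return data * -1


traced = f(LazyExpr())
print(traced)              # ((x * 2) + 1)
print(traced.evaluate(3))  # 7

try:
    g(LazyExpr())
except TypeError as exc:
    print("tracing failed:", exc)
```

The trace of `f` is an expression tree that could in principle be turned into a single GPU kernel, while `g` shows the limitation mentioned above: as soon as the user's function branches on its input, the placeholder has no concrete value to test and tracing has to give up.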