Closed
85 commits
fdca505
Add basic slice support to positivify.
bgrant May 7, 2014
47e89ba
Add a docstring to positivify.
bgrant May 7, 2014
8b4d6d3
Add better support for slices to positivify.
bgrant May 7, 2014
29707fe
Merge branch 'master' into feature/add-slicing
bgrant May 20, 2014
3732ced
Fix indexing errors by using Integral instead of int.
bgrant May 20, 2014
4b8e155
Merge branch 'master' into feature/add-slicing
bgrant May 20, 2014
72b5ab8
WIP: Add failing slice test.
bgrant May 20, 2014
4c66e91
Add a tuple_intersection function to metadata_utils.
bgrant May 20, 2014
b6d0ae2
Add slice support to dist/maps (for BlockMap)...
bgrant May 20, 2014
9eb9a03
Add slice support to local/maps (for BlockMap).
bgrant May 20, 2014
9bdae51
Allow multiple results through.
bgrant May 20, 2014
74b8acf
Merge branch 'refactor/restrict-use-of-checked-getitem' into feature/…
bgrant May 21, 2014
ef49a24
Unwrap a docstring.
bgrant May 21, 2014
176cfd9
Slicing works for __getitem__.
bgrant May 21, 2014
5a9483a
Fix positivify's behavior with slices.
bgrant May 22, 2014
0fa6ac3
Don't test for int, test for Integral
bgrant May 22, 2014
d969417
Add `targets` arg to context.apply calls
bgrant May 22, 2014
f0ddae1
Factor sanitize_indices out into metadata_utils
bgrant May 22, 2014
0813be2
Fix positivify and add regression tests.
bgrant May 22, 2014
9459142
Get rid of reference to old `client_map` module
bgrant May 22, 2014
19c7d52
Add classmethod Distribution.from_slice.
bgrant May 22, 2014
35668f3
`__getitem__` slicing works!?
bgrant May 23, 2014
96367c9
Fix bug.
bgrant May 23, 2014
f5b0149
Make Distribution.from_slice into slice instancemethod
bgrant May 23, 2014
4b829ad
Move new method below constructors.
bgrant May 23, 2014
e37f472
Clean up Distribution slice tests.
bgrant May 23, 2014
42e64cc
Slightly expand a Distribution.slice test.
bgrant May 23, 2014
909a64f
Add a Distribution.slice test.
bgrant May 23, 2014
cb43abf
Generalize sanitize_indices for incomplete indexing.
bgrant May 23, 2014
61788f0
Fill out sanitize_indices docstring.
bgrant May 23, 2014
70195c0
Remove a competing sanitize_indices.
bgrant May 23, 2014
c2d63e2
Add more tests to test_distarray. Some fail.
bgrant May 23, 2014
514857b
Add a call to positivify.
bgrant May 23, 2014
b392a84
Call positivify in sanitize_indices...
bgrant May 23, 2014
a9f161c
Whitespace.
bgrant May 23, 2014
6f284a0
Call `sanitize_indices` with full args.
bgrant May 23, 2014
3287d2d
Rename a value more descriptively.
bgrant May 23, 2014
af94432
Add a local slicing test.
bgrant May 26, 2014
1a86d81
Fix the slicing bug.
bgrant May 26, 2014
f779fcd
Remove another skiptest.
bgrant May 26, 2014
f94b9e3
Fix the local slicing test after API change.
bgrant May 26, 2014
64845dd
Fix output dimensionality.
bgrant May 26, 2014
03eef23
Unskip the last test.
bgrant May 26, 2014
daa6d7b
Line wrap.
bgrant May 26, 2014
a3c6efc
Remove a debugging statement.
bgrant May 26, 2014
78277dd
Preserve no-dist maps.
bgrant May 26, 2014
cb22426
Merge branch 'master' into feature/add-slicing
bgrant May 26, 2014
608d0ba
Test dist_type preservation.
bgrant May 26, 2014
c17f456
Add ellipsis support to sanitize_indices.
bgrant May 26, 2014
178356f
Add getitem ellipsis tests.
bgrant May 27, 2014
68e6e31
WIP: Add failing test.
bgrant May 27, 2014
ac301b9
Add a comment.
bgrant May 27, 2014
992fad5
Already works for a setitem that doesn't span procs.
bgrant May 27, 2014
c03b560
Add a new failing test.
bgrant May 27, 2014
b38afa3
Add `__setitem__` slicing.
bgrant May 27, 2014
d59e31e
Make work for 2d slices.
bgrant May 27, 2014
23afae0
Add more setitem slice tests.
bgrant May 27, 2014
ec659ad
Add more tests.
bgrant May 27, 2014
5ddd354
Make setUp and tearDown classmethods.
bgrant May 27, 2014
858ac21
Remove completed TODO comment.
bgrant May 27, 2014
342f19f
Convert a non-array rvalue to array.
bgrant May 27, 2014
6c7e84a
Add failing ValueError test.
bgrant May 27, 2014
df4f699
Remove an obsolete comment.
bgrant May 27, 2014
29a33a7
Raise an IndexError instead of a TypeError...
bgrant May 27, 2014
c78e8d2
Raise a ValueError if rvalue shape is incorrect...
bgrant May 27, 2014
9f98b80
Add failing test.
bgrant May 28, 2014
678d76d
Generalize tuple_intersection.
bgrant May 28, 2014
857ba3b
Merge branch 'master' into feature/add-slicing
bgrant May 29, 2014
af1e916
Add a slice method to individual client Map types.
bgrant May 29, 2014
e1ae360
Merge branch 'feature/add-slicing' into feature/add-ellipsis-support-…
bgrant May 29, 2014
572b1a4
Merge branch 'feature/add-ellipsis-support-to-slicing' into feature/s…
bgrant May 29, 2014
46f94dc
Merge branch 'feature/setitem-slicing' into feature/allow-slice-steps
bgrant May 29, 2014
53ca612
Fix tuple_intersection for start0 >= start1.
bgrant May 29, 2014
913f4a0
Slicing with steps.
bgrant May 29, 2014
546acac
Fix a bug.
bgrant May 30, 2014
2b45d3f
Add some setitem tests.
bgrant May 30, 2014
9db6153
Merge branch 'master' into feature/add-slicing
bgrant Jun 5, 2014
dd4c74c
Merge branch 'feature/add-slicing' into feature/add-ellipsis-support-…
bgrant Jun 5, 2014
d34bb17
Fix test in test_distarray.
bgrant Jun 5, 2014
8e2c109
Merge branch 'feature/add-ellipsis-support-to-slicing' into feature/s…
bgrant Jun 5, 2014
dbc70f4
Merge branch 'feature/setitem-slicing' into feature/allow-slice-steps
bgrant Jun 5, 2014
5e933e4
Fix tests.
bgrant Jun 5, 2014
13c7a43
Add APUG slides.
bgrant Jun 12, 2014
fac83d9
Add accompanying IPython notebook.
bgrant Jun 12, 2014
91c408d
Add CSS to use with Notebook slideshow view.
bgrant Jun 12, 2014
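For readers skimming the commit list: several commits above concern index normalization (`positivify`, `sanitize_indices`, `Integral` instead of `int`, ellipsis support, incomplete indexing). The sketch below is not the code from `metadata_utils`; it is a hypothetical, standard-library-only illustration of that kind of normalization. The function name `sanitize_index_sketch`, its signature, and the rule that any slice makes the result a `'view'` are assumptions, loosely inferred from how the diff consumes the `(return_type, index)` pair.

```python
from numbers import Integral


def sanitize_index_sketch(index, shape):
    """Return ('value' | 'view', normalized_index).

    Hypothetical helper for illustration only; not distarray's
    actual sanitize_indices.
    """
    ndim = len(shape)
    if not isinstance(index, tuple):
        index = (index,)

    # Expand a single Ellipsis into as many full slices as needed.
    is_ellipsis = [i is Ellipsis for i in index]
    if sum(is_ellipsis) > 1:
        raise IndexError("Only one Ellipsis is allowed.")
    if any(is_ellipsis):
        pos = is_ellipsis.index(True)
        fill = (slice(None),) * (ndim - (len(index) - 1))
        index = index[:pos] + fill + index[pos + 1:]

    if len(index) > ndim:
        raise IndexError("Too many indices for array.")
    # Incomplete indexing: pad the trailing dimensions with full slices.
    index = index + (slice(None),) * (ndim - len(index))

    return_type = 'value'
    normalized = []
    for idx, size in zip(index, shape):
        if isinstance(idx, Integral):  # also accepts non-`int` integer types
            positive = idx + size if idx < 0 else idx
            if not 0 <= positive < size:
                raise IndexError("Index %r is out of bounds." % (idx,))
            normalized.append(int(positive))
        elif isinstance(idx, slice):
            start, stop, step = idx.indices(size)
            if step < 0:
                raise IndexError("Negative steps are not handled here.")
            return_type = 'view'
            normalized.append(slice(start, stop, step))
        else:
            raise IndexError("Invalid index type: %r" % (idx,))
    return return_type, tuple(normalized)
```

Testing against `numbers.Integral` rather than `int` matters because integer scalars from other libraries (NumPy's, for example) are not necessarily `int` instances, yet they are valid indices.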
165 changes: 109 additions & 56 deletions distarray/dist/distarray.py
@@ -17,10 +17,12 @@
import operator
from itertools import product
from functools import reduce
from collections import Sequence

import numpy as np

import distarray
from distarray.metadata_utils import sanitize_indices
from distarray.dist.maps import Distribution
from distarray.utils import _raise_nie
from distarray.metadata_utils import normalize_reduction_axes
@@ -32,7 +34,6 @@
# Code
# ---------------------------------------------------------------------------


class DistArray(object):

__array_priority__ = 20.0
@@ -84,7 +85,8 @@ def get_dim_datas_and_dtype(arr):

# has context, get dist and dtype
elif (distribution is None) and (dtype is None):
res = context.apply(get_dim_datas_and_dtype, args=(key,))
res = context.apply(get_dim_datas_and_dtype, args=(key,),
targets=targets)
dim_datas = [i[0] for i in res]
dtypes = [i[1] for i in res]
da._dtype = dtypes[0]
@@ -95,7 +97,8 @@ def get_dim_datas_and_dtype(arr):
# has context and dtype, get dist
elif (distribution is None) and (dtype is not None):
da._dtype = dtype
dim_datas = context.apply(getattr, args=(key, 'dim_data'))
dim_datas = context.apply(getattr, args=(key, 'dim_data'),
targets=targets)
da.distribution = Distribution.from_dim_data_per_rank(context,
dim_datas,
targets)
@@ -128,10 +131,31 @@ def __repr__(self):
(self.shape, self.targets)
return s

def _process_return_value(self, result, return_proxy, index, targets):

if return_proxy:
# proxy returned as result of slice
# slicing shouldn't alter the dtype
result = result[0]
return DistArray.from_localarrays(key=result,
context=self.context,
targets=targets,
dtype=self.dtype)

elif isinstance(result, Sequence):
somethings = [i for i in result if i is not None]
if len(somethings) == 0:
# using checked_getitem and all return None
raise IndexError("Index %r is not present." % (index,))
if len(somethings) == 1:
return somethings[0]
else:
return result
else:
assert False # impossible is nothing


def __getitem__(self, index):
#TODO: FIXME: major performance improvements possible here,
# especially for special cases like `index == slice(None)`.
# This would dramatically improve tondarray's performance.

# to be run locally
def checked_getitem(arr, index):
@@ -141,38 +165,35 @@ def checked_getitem(arr, index):
def raw_getitem(arr, index):
return arr.global_index[index]

if isinstance(index, int) or isinstance(index, slice):
tuple_index = (index,)
return self.__getitem__(tuple_index)

elif isinstance(index, tuple):
targets = self.distribution.owning_targets(index)

args = (self.key, index)
if self.distribution.has_precise_index:
result = self.context.apply(raw_getitem, args=args,
targets=targets)
else:
result = self.context.apply(checked_getitem, args=args,
targets=targets)
result = [i for i in result if i is not None]
if len(result) != 1:
raise IndexError("Getting more than one result (%s) is not "
"supported yet." % (result,))
elif result is None:
raise IndexError("Index %r is out of bounds" % (index,))
else:
return result[0]
else:
raise TypeError("Invalid index type.")
# to be run locally
def get_slice(arr, index, ddpr, comm):
from distarray.local.maps import Distribution
local_distribution = Distribution(comm=comm,
dim_data=ddpr[comm.Get_rank()])
result = arr.global_index.get_slice(index, local_distribution)
return proxyize(result)

return_type, index = sanitize_indices(index, ndim=self.ndim,
shape=self.shape)
return_proxy = (return_type == 'view')
targets = self.distribution.owning_targets(index)

args = [self.key, index]
if self.distribution.has_precise_index:
if return_proxy: # returning a new DistArray view
new_distribution = self.distribution.slice(index)
ddpr = new_distribution.get_dim_data_per_rank()
args.extend([ddpr, new_distribution.comm])
local_fn = get_slice
else: # returning a value
local_fn = raw_getitem
else: # returning a value from unstructured
local_fn = checked_getitem

result = self.context.apply(local_fn, args=args, targets=targets)
return self._process_return_value(result, return_proxy, index, targets)

def __setitem__(self, index, value):
#TODO: FIXME: major performance improvements possible here.
# Especially when `index == slice(None)` and value is an
# ndarray, since for block and cyclic, we can generate slices of
# `value` and assign to local arrays. This would dramatically
# improve the fromndarray method's performance.

# to be run locally
def checked_setitem(arr, index, value):
return arr.global_index.checked_setitem(index, value)
@@ -181,26 +202,57 @@ def checked_setitem(arr, index, value):
def raw_setitem(arr, index, value):
arr.global_index[index] = value

if isinstance(index, int) or isinstance(index, slice):
tuple_index = (index,)
return self.__setitem__(tuple_index, value)

elif isinstance(index, tuple):
targets = self.distribution.owning_targets(index)
args = (self.key, index, value)
if self.distribution.has_precise_index:
self.context.apply(raw_setitem, args=args, targets=targets)
# to be run locally
def set_slice(arr, index, value, value_slices):
local_slice = value_slices[arr.comm_rank]
arr.global_index[index] = value[local_slice]

set_type, index = sanitize_indices(index, ndim=self.ndim,
shape=self.shape)

targets = self.distribution.owning_targets(index)
args = [self.key, index, value]
if self.distribution.has_precise_index:
if set_type == 'value':
local_fn = raw_setitem
elif set_type == 'view':
args[-1] = np.asarray(args[-1]) # convert to array
# this could be made more efficient
# we only need the bounds computed by distribution.slice
new_distribution = self.distribution.slice(index)
if args[-1].shape != new_distribution.shape:
msg = "Slice shape does not equal rvalue shape."
raise ValueError(msg)
ddpr = new_distribution.get_dim_data_per_rank()
def bounds_slice(dd):
if dd['dist_type'] == 'b':
return slice(dd['start'], dd['stop'])
elif dd['dist_type'] == 'n':
return slice(0, dd['size'])
else:
msg = "Function only works for 'n' and 'b' 'dist_type's"
raise TypeError(msg)
value_slices = [tuple(bounds_slice(dd) for dd in dim_data)
for dim_data in ddpr]
# but we need a data structure indexable by a target's rank
# assume contiguous range of targets here
value_slices_per_target = [None] * len(self.targets)
value_slices_per_target[targets[0]:targets[-1]] = value_slices
args.append(value_slices_per_target)
local_fn = set_slice
else:
result = self.context.apply(checked_setitem, args=args,
targets=targets)
result = [i for i in result if i is not None]
if len(result) > 1:
raise IndexError("Setting more than one result (%s) is "
"not supported yet." % (result,))
elif result == []:
raise IndexError("Index %s is out of bounds" % (index,))
else:
raise TypeError("Invalid index type.")
assert False
self.context.apply(local_fn, args=args, targets=targets)

else: # setting unstructured elements
local_fn = checked_setitem
result = self.context.apply(local_fn, args=args, targets=targets)
result = [i for i in result if i is not None]
if len(result) > 1:
raise IndexError("Setting more than one result (%s) is "
"not supported yet." % (result,))
elif result == []:
raise IndexError("Index %s is out of bounds" % (index,))

@property
def context(self):
@@ -246,7 +298,8 @@ def tondarray(self):
"""Returns the distributed array as an ndarray."""
arr = np.empty(self.shape, dtype=self.dtype)
local_name = self.context._generate_key()
self.context._execute('%s = %s.copy()' % (local_name, self.key), targets=self.targets)
self.context._execute('%s = %s.copy()' % (local_name, self.key),
targets=self.targets)
local_arrays = self.context._pull(local_name, targets=self.targets)
for local_array in local_arrays:
maps = (list(ax_map.global_iter) for ax_map in
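A note on the `__setitem__` 'view' branch in the diff above: each rank's `dim_data` is turned into a tuple of slices into the rvalue, and those tuples are placed in a list indexable by target rank. The following is a minimal, self-contained sketch of that bookkeeping. `bounds_slice` mirrors the helper in the diff; `slices_per_target`, the example `dim_data` dictionaries, and the 1-D list standing in for the rvalue are hypothetical. The sketch also assumes sorted, contiguous targets and uses an inclusive upper bound (`targets[-1] + 1`) so that the slice assignment leaves the list length unchanged; the diff itself writes `targets[0]:targets[-1]`.

```python
def bounds_slice(dd):
    """Slice of the rvalue covered along one dimension, given its dim_data."""
    if dd['dist_type'] == 'b':      # block-distributed dimension
        return slice(dd['start'], dd['stop'])
    elif dd['dist_type'] == 'n':    # undistributed dimension
        return slice(0, dd['size'])
    raise TypeError("Function only works for 'n' and 'b' 'dist_type's")


def slices_per_target(ddpr, targets, n_targets):
    """Build a list, indexable by target rank, of per-rank rvalue slices.

    Hypothetical helper; assumes `targets` is sorted and contiguous.
    """
    value_slices = [tuple(bounds_slice(dd) for dd in dim_data)
                    for dim_data in ddpr]
    per_target = [None] * n_targets
    # Inclusive upper bound, so the list keeps exactly n_targets entries.
    per_target[targets[0]:targets[-1] + 1] = value_slices
    return per_target


# Made-up dim_data: a length-6 result split in two blocks owned by ranks 1 and 2.
ddpr = [
    ({'dist_type': 'b', 'start': 0, 'stop': 3, 'size': 6},),
    ({'dist_type': 'b', 'start': 3, 'stop': 6, 'size': 6},),
]
table = slices_per_target(ddpr, targets=[1, 2], n_targets=3)

value = list(range(10, 16))             # stand-in for the rvalue
piece_for_rank_2 = value[table[2][0]]   # the part rank 2 would assign locally
```

Ranks that do not own any part of the slice keep `None` in the table, which is harmless here because only owning targets are passed to `context.apply`.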