Skip to content

Reduce the number of roundtrips for DistArray.get_localshapes to zero.#423

Merged
kwmsmith merged 10 commits into
enthought:masterfrom
cowlicks:get_localshapes
Jun 11, 2014
Merged

Reduce the number of roundtrips for DistArray.get_localshapes to zero.#423
kwmsmith merged 10 commits into
enthought:masterfrom
cowlicks:get_localshapes

Conversation

@cowlicks
Copy link
Copy Markdown
Contributor

@cowlicks cowlicks commented Jun 4, 2014

This also adds functions to metadata utils that could be useful for local map construction. But that can wait for another PR.

Comment thread distarray/metadata_utils.py Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if stop or start is None? Is stop - start supposed to raise an exception in that case?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TypeError: unsupported operand type(s) for -: `int` and `NoneType`

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems a bit more straightforward to let it error out earlier, giving a KeyError rather than a TypeError, which to my mind is more informative:

stop, start = dim_data['stop'], dim_data['start']
return stop - start

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That does make more sense. Which things in the dim_data should have default values incase they are not provided. Did I get it right here and here?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Tue, Jun 10, 2014 at 11:35 AM, Blake Griffith notifications@github.com
wrote:

In distarray/metadata_utils.py:

@@ -229,3 +229,87 @@ def normalize_reduction_axes(axes, ndim):
else:
axes = tuple(positivify(a, ndim) for a in axes)
return axes
+
+
+# functions for getting a size from a dim_data for each dist_type
+# n
+def non_dist_size(dim_data):

  • return dim_data['size']

+# b
+def block_size(dim_data):

  • stop = dim_data.get('stop', None)
  • start = dim_data.get('start', None)
  • return stop - start

That does make more sense. Which things in the dim_data should have
default values incase they are not provided. Did I get it right here
https://github.com/enthought/distarray/pull/423/files#diff-4eed00415103bcf79667fada17612dbaR271
and here
https://github.com/enthought/distarray/pull/423/files#diff-4eed00415103bcf79667fada17612dbaR282
?

The distributed array protocol goes into the details:

http://distributed-array-protocol.readthedocs.org/en/rel-0.10.0/


Reply to this email directly or view it on GitHub
https://github.com/enthought/distarray/pull/423/files#r13604231.

Kurt W. Smith, Ph.D. ksmith@enthought.com
Enthought, Inc. http://www.enthought.com | 512.536.1057

@cowlicks
Copy link
Copy Markdown
Contributor Author

@kwmsmith @bgrant what is the standard way to make a cyclic distribution with block_size > 1?

@bgrant
Copy link
Copy Markdown
Contributor

bgrant commented Jun 10, 2014

Use a global_dim_data:

For example,

In [3]: from distarray.dist import DistArray, Context, Distribution
In [4]: c = Context()
In [5]: bc_dim = dict(dist_type='c', proc_grid_size=4, size=50, block_size=3)
In [6]: global_dim_data = (bc_dim,)
In [8]: d = Distribution(c, global_dim_data)
In [11]: d.maps[0]
Out[11]: <distarray.dist.maps.BlockCyclicMap at 0x104ca8750>
In [12]: d.maps[0].block_size
Out[12]: 3

Comment thread distarray/dist/distarray.py Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cowlicks I'd prefer if there was a .get_localshapes() method on client-side Distribution objects, and DistArray.get_localshapes() just returns self.distribution.get_localshapes(). The Distribution.get_localshapes() would just do:

return get_shape_from_dim_data_per_rank(self.get_dim_data_per_rank())

Having a Distribution.get_localshapes() method will be useful in other places as well.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On that note, since these no longer require communication, should we drop the get_ prefixes?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, keeping it Distribution.get_localshapes() is fine, the get_ prefix doesn't imply communication, just that it's a getter. We might make it into a localshapes property down the line, but since there's some non-trivial logic going on here, I'd prefer the get_localshapes() name.

@cowlicks
Copy link
Copy Markdown
Contributor Author

I think this is ready, just waiting on Travis.

@coveralls
Copy link
Copy Markdown

Coverage Status

Coverage increased (+0.1%) when pulling 548fc40 on cowlicks:get_localshapes into e4acdac on enthought:master.

@kwmsmith
Copy link
Copy Markdown
Contributor

Restarting Travis.

@cowlicks
Copy link
Copy Markdown
Contributor Author

Travis is having a bad day.

kwmsmith added a commit that referenced this pull request Jun 11, 2014
Reduce the number of roundtrips for DistArray.get_localshapes to zero.
@kwmsmith kwmsmith merged commit de18b91 into enthought:master Jun 11, 2014
@cowlicks cowlicks deleted the get_localshapes branch June 11, 2014 23:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants