fix #925: ht.nonzero() returns tuple of 1-D arrays instead of n-D arrays by Mystic-Slice · Pull Request #937 · helmholtz-analytics/heat

Mystic-Slice · 2022-03-23T03:52:21Z

Description

Swapped torch's nonzero function for the NumPy version. It works as described in the issue ticket. Testing is still to be done.

Issue/s resolved: #925

Due Diligence

All split configurations tested
Multiple dtypes tested in relevant functions
Documentation updated (if needed)
Updated changelog.md under the title "Pending Additions"

Does this change modify the behaviour of other functions? If so, which?

no

mtar · 2022-03-23T03:52:24Z

GPU cluster tests are currently disabled on this Pull Request.

Mystic-Slice · 2022-03-23T03:55:35Z

Hi @ClaudiaComito. This is the solution that I could come up with at first glance. It works well for the arrays I tested, but I still have to test edge cases such as 1-D arrays. I just thought I could get your input before proceeding further.
I also don't know if I handled the split, gout, and other things correctly. Let me know what you think.

Mystic-Slice · 2022-03-23T08:04:21Z

My fix failed for 1-D arrays. I think I fixed it now!

Mystic-Slice · 2022-03-23T08:05:57Z

I will update the documentation and the tests a little bit later.

ClaudiaComito

Hey @Mystic-Slice thanks a lot for diving into this right away!

Note that we only use torch operations internally. Our tensors might reside on GPUs, and a numpy operation would require copying the tensor to CPU.
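As an illustration of staying inside torch, here is a minimal sketch (not the code from this PR) showing that `torch.nonzero` can already produce the NumPy-style tuple through its `as_tuple` argument, with the result staying on the input's device. The input tensor and the variable names are made up for the example.

```python
import torch

# Illustrative input; it is placed on the GPU only if one is available.
device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.tensor([[0, 3, 0], [4, 0, 5]], device=device)

# Old-style result: one 2-D tensor of shape (n_nonzero, ndim).
stacked = torch.nonzero(x)

# NumPy-style result: a tuple with one 1-D index tensor per dimension.
per_dim = torch.nonzero(x, as_tuple=True)

# Both forms carry the same indices: the tuple is the transposed 2-D result, unpacked.
assert all(torch.equal(a, b) for a, b in zip(per_dim, stacked.transpose(0, 1)))

# No host copy is involved: every index tensor lives where x lives.
assert all(t.device == x.device for t in per_dim)
```

This only covers the process-local part; the distributed metadata (global shape, split) still has to be handled separately.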

ClaudiaComito

Well done @Mystic-Slice, we're getting close, thanks a lot! The tests will be more work because many existing functions rely on the previous nonzero format.

ClaudiaComito · 2022-03-25T08:55:35Z

@Mystic-Slice I changed the base branch of this PR, because the change you implemented is needed in our indexing overhaul PR #938.

I've taken care of resolving conflicts (hence the commits). I hope you don't mind.

Mystic-Slice · 2022-03-25T12:54:44Z

Not at all.
I still can make the required changes to this branch right?

ClaudiaComito · 2022-03-25T13:12:40Z

Not at all. I still can make the required changes to this branch right?

Absolutely!

Mystic-Slice · 2022-03-25T13:12:40Z

@ClaudiaComito
I don't know if we even need to keep track of gout (shape of the output), split, etc...
Because anyways we convert it into a tuple and we lose all this information.
What do you think?

Mystic-Slice · 2022-03-28T17:02:38Z

@ClaudiaComito Hi, I made the fixes. But I am not able to get rid of a few errors that arise in testing. Can you please take a look?

ClaudiaComito · 2022-03-30T07:22:22Z

@ClaudiaComito I don't know if we even need to keep track of gout (shape of the output), split, etc... Because anyways we convert it into a tuple and we lose all this information. What do you think?

@Mystic-Slice , the DNDarray metadata always have to be correct. The information doesn't get lost, it gets used in all subsequent operations, and the returned tuple is a tuple of DNDarrays whose metadata will be based on the original one.

ClaudiaComito · 2022-03-30T07:24:17Z

-        comm=x.comm,
-        balanced=False,
+    lcl_nonzero = lcl_nonzero.transpose(0, 1)
+


gout must reflect the transposed global shape

ClaudiaComito

@Mystic-Slice thanks for the changes. Just two more changes are needed as far as I'm concerned.

Don't worry about the tests, they fail because our getitem/setitem is transitioning (among other things) from the old nonzero format to the new one, so basically every operation fails. We will take care of this in the parent branch/PR.

Mystic-Slice · 2022-03-30T14:41:13Z

@Mystic-Slice thanks for the changes. Just two more changes are needed as far as I'm concerned.

Don't worry about the tests, they fail because our getitem/setitem is transitioning (among other things) from the old nonzero format to the new one, so basically every operation fails. We will take care of this in the parent branch/PR.

Ahh, OK. I was so lost because the changes I made had nothing to do with the errors I got. Thanks for the clarification.
I have made the changes you requested.

Mystic-Slice · 2022-03-30T14:43:38Z

I made this change because when the DNDarray is converted into a tuple, the meta-data is not really being passed on to the tuple members. This fix solves it.

ClaudiaComito

I've been losing (lengthy) comments so I'll submit this review now even if it's not quite finished and fix it later. Thanks @Mystic-Slice !

ClaudiaComito · 2022-03-31T03:17:40Z

+def nonzero(x: DNDarray) -> Tuple[DNDarray, ...]:
    """
-    Return a :class:`~heat.core.dndarray.DNDarray` containing the indices of the elements that are non-zero.. (using ``torch.nonzero``)
+    Return a Tuple of :class:`~heat.core.dndarray.DNDarray`s, one for each dimension of a,


"... one for each dimension of x"

ClaudiaComito · 2022-03-31T03:18:55Z

    """
-    Return a :class:`~heat.core.dndarray.DNDarray` containing the indices of the elements that are non-zero.. (using ``torch.nonzero``)
+    Return a Tuple of :class:`~heat.core.dndarray.DNDarray`s, one for each dimension of a,
+    containing the indices of the non-zero elements in that dimension. (using ``torch.nonzero``)


I know it was there before you started working on it, but I would remove "(using torch.nonzero)" now.

ClaudiaComito · 2022-03-31T03:35:20Z

    can be UNBALANCED as it contains the indices of the non-zero elements on each node.
-    Returns an array with one entry for each dimension of ``x``, containing the indices of the non-zero elements in that dimension.
-    The values in ``x`` are always tested and returned in row-major, C-style order.
+    The values in ``x`` are always tested and returned in column-major, F-style order.


No, they are still tested in row-major, C-style order, otherwise nonzero would return indices starting from the last dimension. Here's a good reference: Internal memory layout of an ndarray.
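A small sketch of what row-major testing order means for the returned indices, using plain `torch.nonzero` on a made-up 3x2 tensor. If the elements are visited in C order, the flattened positions of the reported indices must be increasing.

```python
import torch

# Illustrative 3x2 input with three nonzero entries.
x = torch.tensor([[0, 1], [1, 0], [0, 1]])

rows, cols = torch.nonzero(x, as_tuple=True)

# Row-major (C-style) flat position of each reported index pair.
flat = rows * x.shape[1] + cols

# The flat positions come out already sorted, i.e. the last index varies fastest.
assert torch.equal(flat, torch.sort(flat).values)
```

A column-major traversal would instead report the entry at (1, 0) before the one at (0, 1), so `rows` would start with 1 rather than 0.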

ClaudiaComito · 2022-04-01T04:32:00Z

        gout = list(lcl_nonzero.size())
        gout[0] = x.comm.allreduce(gout[0], MPI.SUM)


I had written a lengthy detailed explanation of what's happening here, and then I lost it. Doh. Anyway.

The original torch.nonzero output shape is (number_of_nonzero_elements, number_of_dimensions). The process-local lcl_nonzero only knows how many nonzero elements are on its own process, but the global output DNDarray must know the total.

That allreduce call replaces the local value of gout[0] (local number of rows = local number of nonzero elements) with the sum of all gout[0] on all processes, in order to synchronize gout to the global (albeit memory-distributed) number of rows of the output.

Problem: lcl_nonzero has been transposed, so it's no longer the rows, but the columns that represent the number of nonzero elements.

My suggestion:

transpose before the if split check

adjust line 71 lcl_nonzero[..., x.split] += displs[x.comm.rank] to correct the row (not the column) corresponding to the split dimension

the MPI collective call at line 78 should sum along the dimension containing the number of nonzero elements, now the columns
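The three suggestions above can be sketched without MPI by simulating two processes with `torch.chunk`. This is an illustration under stated assumptions, not the code in `heat/core/indexing.py`: the input tensor, the two-way split, the `displs` values and the plain Python `sum` standing in for `x.comm.allreduce(..., MPI.SUM)` are all made up for the example.

```python
import torch

# Global array and a hypothetical distribution over 2 "ranks" along split=0.
x = torch.tensor([[0, 1, 0], [2, 0, 3], [0, 0, 4], [5, 0, 0]])
split = 0
chunks = torch.chunk(x, 2, dim=split)  # stand-in for each process's local tensor
displs = [0, 2]  # global offset of each chunk along the split dimension

local_results = []
for rank, local_x in enumerate(chunks):
    # Transpose first: the shape becomes (ndim, local_n_nonzero).
    lcl_nonzero = torch.nonzero(local_x).transpose(0, 1)
    # After transposing, the split dimension is a row, not a column.
    lcl_nonzero[split] += displs[rank]
    local_results.append(lcl_nonzero)

# Stand-in for the allreduce sum: the nonzero count now lives in dimension 1.
gout = list(local_results[0].size())
gout[1] = sum(r.size(1) for r in local_results)

# Concatenating the corrected local pieces reproduces the global result.
global_nonzero = torch.cat(local_results, dim=1)
assert torch.equal(global_nonzero, torch.nonzero(x).transpose(0, 1))
```

Here `gout` ends up as `[2, 5]`: two dimensions, five nonzero elements in total, even though the first simulated rank only holds three of them.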

Mystic-Slice · 2022-04-01T14:16:26Z

@ClaudiaComito Thanks for the detailed review. I have made the changes.

ClaudiaComito

Thanks @Mystic-Slice, one more change needed and then I think we're good to go.

ClaudiaComito · 2022-04-05T03:21:44Z

+        [
+            DNDarray(
+                dim_indices,
+                gshape=tuple(dim_indices.size()),


gshape needs to be the global size of the array (you calculated it with the allreduce call), dim_indices are local slices of the array.
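To make the local-versus-global distinction concrete, a minimal sketch with two hypothetical per-rank slices of one index array. The slice contents are invented, and the Python `sum` stands in for the allreduce call.

```python
import torch

# Hypothetical per-rank index slices for one dimension of the result.
local_slices = [torch.tensor([0, 1, 1]), torch.tensor([2, 3])]

# What each rank would report if gshape were taken from its local tensor.
local_shapes = [tuple(s.size()) for s in local_slices]

# Stand-in for the allreduce sum: the global length all ranks must agree on.
global_shape = (sum(s.size(0) for s in local_slices),)
```

With `gshape=tuple(dim_indices.size())` the simulated ranks would disagree (`(3,)` versus `(2,)`), whereas the summed value `(5,)` is the same everywhere.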

Mystic-Slice · 2022-04-05

Ahh, makes sense. Will make the change.

ClaudiaComito

@Mystic-Slice can you update the changelog? I'll merge this one into the parent branch afterwards. Thanks a lot!

Mystic-Slice · 2022-04-08T04:11:18Z

@Mystic-Slice can you update the changelog? I'll merge this one into the parent branch afterwards. Thanks a lot!

Done :)

ClaudiaComito

Great, thanks a lot @Mystic-Slice , I will merge this into the parent branch and we can take care of the tests with the rest.

mtar and others added 19 commits February 25, 2022 13:28

Create ci.yaml

f7adcf2

Update ci.yaml

2ab82b5

Update ci.yaml

f261e8e

Create CITATION.cff

9b863a7

Update CITATION.cff

2b2622a

Update ci.yaml

a15b299

different python and pytorch versions

Update ci.yaml

8910bf7

[pre-commit.ci] auto fixes from pre-commit.com hooks

f8dc8b8

for more information, see https://pre-commit.ci

Delete pre-commit.yml

767eabc

Merge branch 'main' into enhancement/203-ghactions-matrix

3cd1d33

Update ci.yaml

61cef7f

Update CITATION.cff

74b1a30

Update tutorial.ipynb

93cd831

delete example with different split axis

Merge pull request helmholtz-analytics#931 from helmholtz-analytics/d…

2a25d22

…oc/901-tutorial_update Update tutorial.ipynb

Merge branch 'main' into docs/927-citation

e154ab9

Merge pull request helmholtz-analytics#929 from helmholtz-analytics/d…

7c57942

…ocs/927-citation Create CITATION.cff

Merge branch 'main' into enhancement/203-ghactions-matrix

114e74e

Merge pull request helmholtz-analytics#924 from helmholtz-analytics/e…

14aae08

…nhancement/203-ghactions-matrix run tests on gh actions

Delete logo_heAT.pdf

dd1b83d

Removal of old logo

Mystic-Slice force-pushed the NonzeroFunction branch from 1d683cf to 96fabe5 Compare March 23, 2022 07:30

Mystic-Slice changed the title ~~fix #925: Nonzero function works like the numpy counterpart~~ fix #925: ht.nonzero() returns tuple of 1-D arrays instead of n-D arrays Mar 23, 2022

ClaudiaComito requested changes Mar 23, 2022

Comment thread heat/core/indexing.py Outdated

Comment thread heat/core/indexing.py Outdated

Comment thread heat/core/indexing.py Outdated

Mystic-Slice closed this Mar 23, 2022

Mystic-Slice force-pushed the NonzeroFunction branch from be699be to dd1b83d Compare March 23, 2022 15:18

ht.nonzero() returns tuple of 1-D arrays instead of n-D arrays

7e6ad4a

Mystic-Slice reopened this Mar 23, 2022

ClaudiaComito requested changes Mar 25, 2022

Comment thread heat/core/indexing.py Outdated

Comment thread heat/core/tests/test_indexing.py Outdated

ClaudiaComito changed the base branch from main to 914_adv-indexing-outshape-outsplit March 25, 2022 08:43

Merge branch '914_adv-indexing-outshape-outsplit' into NonzeroFunction

03e1287

ClaudiaComito reviewed Mar 25, 2022

Comment thread heat/core/indexing.py Outdated

replace x.larray with local_x

420f064

Code fixes

a00ed61

ClaudiaComito reviewed Mar 30, 2022

Comment thread heat/core/indexing.py Outdated

ClaudiaComito reviewed Mar 30, 2022

ClaudiaComito requested changes Mar 30, 2022

Mystic-Slice added 2 commits March 30, 2022 20:01

Fix return type of nonzero function and gout value

d4a8813

Made sure DNDarray meta-data is available to the tuple members

67fcdc8

ClaudiaComito requested changes Apr 1, 2022

Transpose before if-branching + adjustments to accomodate it

39103fa

ClaudiaComito requested changes Apr 5, 2022

Fixed global shape assignment

3ed205c

ClaudiaComito requested changes Apr 8, 2022

Updated changelog

70dded6

ClaudiaComito approved these changes Apr 8, 2022

ClaudiaComito merged commit eb297fb into helmholtz-analytics:914_adv-indexing-outshape-outsplit Apr 8, 2022

ClaudiaComito mentioned this pull request Apr 27, 2023

ht.nonzero() should return tuple of 1-D arrays instead of an n-D array #925

Closed
