I'm getting an IndexError on line #317:
random_element = random.choice(clusters[biggest_cluster].members)
I have a large dataframe (10,000+ rows and 15+ columns). I first tried this with k=2. While debugging, I found that the error occurs because two elements of cluster_sizes are 0, but I can't work out why.
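Here is a minimal sketch of how I understand the failure, assuming `members` is a plain Python list. `random.choice` raises IndexError on an empty sequence, so if every cluster is empty, the line above fails. The `Cluster` class and the `pick_random_member` helper are hypothetical stand-ins for illustration, not the actual code:

```python
import random


class Cluster:
    """Hypothetical stand-in; only the `members` attribute comes from the real code."""

    def __init__(self, members=None):
        self.members = list(members) if members is not None else []


def pick_random_member(clusters):
    """Return a random member of the biggest cluster, or None if all clusters are empty."""
    cluster_sizes = [len(c.members) for c in clusters]
    biggest_cluster = cluster_sizes.index(max(cluster_sizes))
    if cluster_sizes[biggest_cluster] == 0:
        # Every cluster is empty, so there is nothing to choose from.
        return None
    return random.choice(clusters[biggest_cluster].members)


# Reproduce the situation from the debugger: k=2 and both sizes are 0.
clusters = [Cluster(), Cluster()]
try:
    random.choice(clusters[0].members)
except IndexError as exc:
    print("IndexError:", exc)

print(pick_random_member(clusters))  # None, because both clusters are empty
```

The guard only avoids the crash; it does not explain why no points are being assigned to either cluster in the first place.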
If I limit the dataframe to, say, 100 rows, this error goes away, but after 3 iterations of the algorithm I get a different error: 'More clusters than data points?'
Any ideas on how to solve this?