Hello,
I am getting a TypeError with the current version of this module. Whether it appears depends on the number of clusters I request: on the same dataset, I never see it with 2 clusters, I sometimes see it with 4, and I always see it with 10.
File "/usr/local/lib/python3.5/dist-packages/pyspark_kmodes/pyspark_kmodes.py", line 430, in fit
self.n_clusters,self.max_dist_iter)
File "/usr/local/lib/python3.5/dist-packages/pyspark_kmodes/pyspark_kmodes.py", line 271, in k_modes_partitioned
clusters = check_for_empty_cluster(clusters, rdd)
File "/usr/local/lib/python3.5/dist-packages/pyspark_kmodes/pyspark_kmodes.py", line 315, in check_for_empty_cluster
partition_sizes = cluster_sizes[n_clusters*(partition_index):n_clusters*(partition_index+1)]
TypeError: slice indices must be integers or None or have an __index__ method
This is Spark 2.2.
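In case it helps narrow this down, the same message can be reproduced in plain Python 3 whenever a slice bound is a float rather than an int. The sketch below is only an illustration under that assumption: the variable names are copied from the traceback, but the values are hypothetical and I have not confirmed which of n_clusters or partition_index actually ends up non-integer inside pyspark_kmodes.

```python
# Hypothetical stand-ins for the values used in check_for_empty_cluster;
# the real values inside pyspark_kmodes are not known here.
cluster_sizes = list(range(8))
n_clusters = 4
partition_index = 2 / 2  # true division in Python 3 yields the float 1.0

# Slicing with float bounds raises the TypeError from the traceback.
try:
    cluster_sizes[n_clusters * partition_index:n_clusters * (partition_index + 1)]
    error = None
except TypeError as exc:
    error = str(exc)

# Casting the bounds to int makes the same slice valid.
start = int(n_clusters * partition_index)
stop = int(n_clusters * (partition_index + 1))
partition_sizes = cluster_sizes[start:stop]
```

If that assumption holds, casting the slice bounds to int (or using // instead of / wherever the index is computed) would avoid the error, but I have not verified this against the module's source.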
Any ideas will be appreciated.