
Parallel-KNN

A parallel implementation of the K-Nearest Neighbors (KNN) algorithm in both CUDA and MPI.
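For reference, the classification step that the parallel versions accelerate can be sketched serially in C. This is an illustrative sketch, not the repository's code; the function name and signature are assumptions.

```c
#include <stdlib.h>

/* Illustrative serial KNN: classify one query point by majority vote
   among its k nearest training points (squared Euclidean distance).
   X_train is row-major, n_train x dim. Not the repository's code. */
static int knn_classify(const float *X_train, const int *y_train,
                        int n_train, int dim,
                        const float *query, int k, int n_classes) {
    float *dist = malloc(n_train * sizeof(float));
    int *idx = malloc(n_train * sizeof(int));

    /* squared distance from the query to every training point */
    for (int i = 0; i < n_train; i++) {
        float d = 0.0f;
        for (int j = 0; j < dim; j++) {
            float diff = X_train[i * dim + j] - query[j];
            d += diff * diff;
        }
        dist[i] = d;
        idx[i] = i;
    }

    /* partial selection sort: move the k smallest distances to the front */
    for (int i = 0; i < k; i++) {
        int min = i;
        for (int j = i + 1; j < n_train; j++)
            if (dist[j] < dist[min]) min = j;
        float td = dist[i]; dist[i] = dist[min]; dist[min] = td;
        int ti = idx[i]; idx[i] = idx[min]; idx[min] = ti;
    }

    /* majority vote over the k nearest labels */
    int *votes = calloc(n_classes, sizeof(int));
    for (int i = 0; i < k; i++)
        votes[y_train[idx[i]]]++;
    int best = 0;
    for (int c = 1; c < n_classes; c++)
        if (votes[c] > votes[best]) best = c;

    free(dist); free(idx); free(votes);
    return best;
}
```

The distance loop is embarrassingly parallel, which is what makes KNN a natural fit for both MPI (rows split across processes) and CUDA (rows split across threads).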

Dependencies

For KNN in MPI:

  • gcc-5.4.0
  • openmpi-1.10.2

For KNN in CUDA:

  • nvcc release 9.2

Usage

Clone the repository using:

foo@bar:~$ git clone https://github.com/neeldani/Parallel-KNN.git

Navigate to the project directory:

For MPI:

foo@bar:~$ cd Knn-MPI

Compile and execute the MPI code:

foo@bar:~$ mpicc knnInMPI.c -o knnInMpi.out -Wall
foo@bar:~$ mpirun -n 3 knnInMpi.out

For CUDA:

foo@bar:~$ cd Knn-CUDA

Compile and execute the CUDA code:

foo@bar:~$ nvcc -o Knn-Cuda.out Knn-Cuda.cu
foo@bar:~$ ./Knn-Cuda.out

Configuration

The config.h file in both the MPI and CUDA directories can be used to set hyperparameters and the path to the dataset. The dataset should be split into separate .csv files for the training examples (X_train), training labels (y_train), testing examples (X_test) and testing labels (y_test). Sample data (the Iris dataset) is included in the cloned repository.
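The macro names and values below are hypothetical; the actual config.h in the repository may use different identifiers. A sketch of what such a file typically looks like:

```c
/* Hypothetical sketch of config.h -- the actual macro names and
   values in the repository may differ. */
#ifndef CONFIG_H
#define CONFIG_H

#define K            5    /* number of nearest neighbours           */
#define NUM_TRAIN    120  /* rows in X_train / y_train              */
#define NUM_TEST     30   /* rows in X_test / y_test                */
#define NUM_FEATURES 4    /* columns per example (Iris has 4)       */

#define X_TRAIN_PATH "data/X_train.csv"
#define Y_TRAIN_PATH "data/y_train.csv"
#define X_TEST_PATH  "data/X_test.csv"
#define Y_TEST_PATH  "data/y_test.csv"

#endif /* CONFIG_H */
```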

Future Work

Currently, the code works only for matrices whose number of rows is divisible by the number of processes (in MPI) or the number of threads (in CUDA); both can be configured in config.h. Future work includes generalizing the algorithm to matrices of any dimensions and redesigning it to minimize communication overhead.
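The divisibility restriction can be lifted with an uneven row split of the kind MPI_Scatterv expects. This is a sketch of one possible generalization, not the repository's code; the function name is an assumption.

```c
/* Illustrative sketch (not the repository's code): compute per-rank
   row counts and offsets for splitting an n_rows x dim matrix across
   nprocs ranks. The current code requires n_rows % nprocs == 0; a
   Scatterv-style split like this one would remove that restriction. */
static void split_rows(int n_rows, int nprocs, int *counts, int *displs) {
    int base = n_rows / nprocs;   /* rows every rank receives         */
    int extra = n_rows % nprocs;  /* first `extra` ranks get one more */
    int offset = 0;
    for (int r = 0; r < nprocs; r++) {
        counts[r] = base + (r < extra ? 1 : 0);
        displs[r] = offset;
        offset += counts[r];
    }
}
```

The counts and displs arrays produced here map directly onto the sendcounts and displacements arguments of MPI_Scatterv / MPI_Gatherv.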