
Parallel-KNN

A parallel implementation of the K-Nearest Neighbors (KNN) algorithm in both CUDA and MPI.
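For reference, the classification step that the parallel versions accelerate can be sketched serially in C. This is an illustrative sketch, not the repository's code; the function name and signature are assumptions.

```c
#include <stdlib.h>

/* Illustrative serial KNN: classify one query point by majority vote
   among its k nearest training points (squared Euclidean distance).
   X_train is row-major, n_train x dim. Not the repository's code. */
static int knn_classify(const float *X_train, const int *y_train,
                        int n_train, int dim,
                        const float *query, int k, int n_classes) {
    float *dist = malloc(n_train * sizeof(float));
    int *idx = malloc(n_train * sizeof(int));

    /* squared distance from the query to every training point */
    for (int i = 0; i < n_train; i++) {
        float d = 0.0f;
        for (int j = 0; j < dim; j++) {
            float diff = X_train[i * dim + j] - query[j];
            d += diff * diff;
        }
        dist[i] = d;
        idx[i] = i;
    }

    /* partial selection sort: move the k smallest distances to the front */
    for (int i = 0; i < k; i++) {
        int min = i;
        for (int j = i + 1; j < n_train; j++)
            if (dist[j] < dist[min]) min = j;
        float td = dist[i]; dist[i] = dist[min]; dist[min] = td;
        int ti = idx[i]; idx[i] = idx[min]; idx[min] = ti;
    }

    /* majority vote over the k nearest labels */
    int *votes = calloc(n_classes, sizeof(int));
    for (int i = 0; i < k; i++)
        votes[y_train[idx[i]]]++;
    int best = 0;
    for (int c = 1; c < n_classes; c++)
        if (votes[c] > votes[best]) best = c;

    free(dist); free(idx); free(votes);
    return best;
}
```

The distance loop is embarrassingly parallel, which is what makes KNN a natural fit for both MPI (rows split across processes) and CUDA (rows split across threads).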

Dependencies

For KNN in MPI:

  • gcc-5.4.0
  • openmpi-1.10.2

For KNN in CUDA:

  • nvcc release 9.2

Usage

Clone the repository using:

foo@bar:~$ git clone https://github.com/neeldani/Parallel-KNN.git

Navigate to the project directory:

For MPI:

foo@bar:~$ cd Knn-MPI

Compile and execute the MPI code:

foo@bar:~$ mpicc knnInMPI.c -o knnInMpi.out -Wall
foo@bar:~$ mpirun -n 3 knnInMpi.out

For CUDA:

foo@bar:~$ cd Knn-CUDA

Compile and execute the CUDA code:

foo@bar:~$ nvcc -o Knn-Cuda.out Knn-Cuda.cu
foo@bar:~$ ./Knn-Cuda.out

Configuration

The config.h file in both the MPI and CUDA directories can be used to set hyperparameters and the path to the dataset. The dataset should be split into separate .csv files for the training examples (X_train), training labels (y_train), testing examples (X_test) and testing labels (y_test). Sample data (the Iris dataset) is included in the cloned repository.
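The macro names and values below are hypothetical; the actual config.h in the repository may use different identifiers. A sketch of what such a file typically looks like:

```c
/* Hypothetical sketch of config.h -- the actual macro names and
   values in the repository may differ. */
#ifndef CONFIG_H
#define CONFIG_H

#define K            5    /* number of nearest neighbours           */
#define NUM_TRAIN    120  /* rows in X_train / y_train              */
#define NUM_TEST     30   /* rows in X_test / y_test                */
#define NUM_FEATURES 4    /* columns per example (Iris has 4)       */

#define X_TRAIN_PATH "data/X_train.csv"
#define Y_TRAIN_PATH "data/y_train.csv"
#define X_TEST_PATH  "data/X_test.csv"
#define Y_TEST_PATH  "data/y_test.csv"

#endif /* CONFIG_H */
```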

Future Work

Currently, the code works only for matrices whose number of rows is divisible by the number of processes (in MPI) or the number of threads (in CUDA); both can be configured in config.h. Future work includes generalizing the algorithm to matrices of any dimensions and redesigning it to minimize communication overhead.
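The divisibility restriction can be lifted with an uneven row split of the kind MPI_Scatterv expects. This is a sketch of one possible generalization, not the repository's code; the function name is an assumption.

```c
/* Illustrative sketch (not the repository's code): compute per-rank
   row counts and offsets for splitting an n_rows x dim matrix across
   nprocs ranks. The current code requires n_rows % nprocs == 0; a
   Scatterv-style split like this one would remove that restriction. */
static void split_rows(int n_rows, int nprocs, int *counts, int *displs) {
    int base = n_rows / nprocs;   /* rows every rank receives         */
    int extra = n_rows % nprocs;  /* first `extra` ranks get one more */
    int offset = 0;
    for (int r = 0; r < nprocs; r++) {
        counts[r] = base + (r < extra ? 1 : 0);
        displs[r] = offset;
        offset += counts[r];
    }
}
```

The counts and displs arrays produced here map directly onto the sendcounts and displacements arguments of MPI_Scatterv / MPI_Gatherv.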