A simple version of ARC-II model implemented in Keras. Please reference paper:Convolutional Neural Network Architectures for Matching Natural Language Sentences
Based on the assumption that big amount of tweets having similar structure can indicate a bot, we suggest a convolutional architecture. The architectures include: Word embedding, Convolution and pooling layers, Flattening, Multi- layer perceptron (MLP) with activation function, Post - processing, Tweet classification.
Assumption: tweets written by the same generator have the same sentence structure. Main goal: measure the similarity between two tweets.
We have the tweet the user want to classify. we created a list that contains bot tweets from the dataset at the training phase. Then using our CNN architecture we get the measure of similarity between every bot tweet from the list and the tweet we want to classify. One feed forward and back propagate. Than we convert the degrees to classes when zero means the two tweet have different sentence structure and one means the opposite. The next step we sum the classes IDs. If the percentage of similarity pass the threshold selected by the user, the model will identify the tweet as bot tweet. Else, we will say that the tweet probably was written by human.
different threshold value could contribute to various purposes of the application. Therefore we gave the user the option to change the threshold value according to the purpose for which he use the application.
- first things you need to download the DB files and the vector files and put them in the db folder:
- DB files - https://drive.google.com/file/d/1zzdaJkl-647LKOXLZ_OUtqQx5Cto7wfJ/view?usp=sharing https://drive.google.com/file/d/1-8Lz5TLU1uVAvDpjpVbHqGw3OK0sG1Ep/view?usp=sharing
- vector file - https://www.kaggle.com/yesbutwhatdoesitmean/wikinews300d1mvec
-
Executable file - If you just want to run the project without a compiler or the liabery. you can download the Executable file from here: https://drive.google.com/file/d/1Go1kh0UDIt5eTSDCCWZcYAMuJm_Wavna/view?usp=sharing
-
Notbook (There are explanations inside the folder) in the folder 'Notebook' there is our notbook that have only pure code without GUI.
-
Complete project (There are explanations inside the folder). all the project files are in the 'TwitterBotDetector' folder
-
I put my environment in the root 'environment.yml' you can use it to have the same version like me.
-
video of the program: https://drive.google.com/file/d/15V1YjUt0BQeysEbKZiOeMLJkL2Y41ZU1/view?usp=sharing

