Big-data

This repository is a collection of programs implementing big data concepts and algorithms, many of them direct implementations of well-known research papers. It includes the following:

Classification based on Associations
A-Close
Improved Apriori implementation using hashing
Improved Apriori implementation using partition based approach
Improved Apriori implementation using transaction reduction
CHARM
Dynamic Itemset Counting
Equivalence Class Clustering and bottom-up Lattice Traversal (ECLAT)
Maximal Frequent Itemset Algorithm (MAFIA)
Pincer Search
PySpark programs: a collection of basic programs written with PySpark
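
Most of the entries above are frequent-itemset miners that refine the basic Apriori join, prune, and count loop. As a point of reference, that baseline loop can be sketched as follows. This is a minimal illustration and not the code in this repository; the function name `apriori` and its parameters are assumptions made for the example, and `min_support` is taken to be an absolute transaction count.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support count} for all frequent itemsets.

    Hypothetical helper for illustration; min_support is an absolute count.
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count how many transactions contain each single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count step: one pass over the transactions per level.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1

    return frequent


if __name__ == "__main__":
    data = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk"},
    ]
    for itemset, support in apriori(data, min_support=2).items():
        print(sorted(itemset), support)
```

The hashing, partition-based, and transaction-reduction variants listed above each target the cost of the count step, which scans every transaction once per level in this naive form.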

The generate_itemsets.py script can be used to generate custom datasets for the programs above.
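
The interface of generate_itemsets.py is not documented here, so the following is only a sketch of how a random transaction dataset could be produced. The function names, parameters, item labels, and the one-transaction-per-CSV-row layout are all assumptions for illustration and may not match the script or itemsets.csv.

```python
import csv
import io
import random


def generate_transactions(n_transactions, n_items, max_len, seed=None):
    """Build random transactions (hypothetical helper, not the repo's script).

    Each transaction holds between 1 and max_len distinct items drawn
    from the labels item1 .. item<n_items>.
    """
    rng = random.Random(seed)  # a local generator keeps runs reproducible
    items = [f"item{i}" for i in range(1, n_items + 1)]
    rows = []
    for _ in range(n_transactions):
        size = rng.randint(1, min(max_len, n_items))
        rows.append(sorted(rng.sample(items, size)))
    return rows


def write_csv(rows, handle):
    """Write one transaction per CSV row (assumed layout)."""
    csv.writer(handle).writerows(rows)


if __name__ == "__main__":
    buffer = io.StringIO()
    write_csv(generate_transactions(5, n_items=6, max_len=3, seed=42), buffer)
    print(buffer.getvalue())
```

Passing a fixed `seed` makes a generated dataset repeatable, which helps when comparing the output of two mining programs on the same input.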

Repository contents:

pyspark_progs
CBA.py
README.md
aclose.py
apriori_hash.py
apriori_partition.py
apriori_transaction_reduction.py
charm.py
dataset.csv
dic.py
eclat.py
friend_data.txt
generate_itemsets.py
itemsets.csv
mafia.py
pincer.py
preprocess.py
