% ======================================================================= %
% *** Canonical Consistent Weighted Sampling *** %
% Author: Wei WU (william.third.wu@gmail.com) %
% CAI, University of Technology, Sydney (UTS) %
% ----------------------------------------------------------------------- %
% Citation: W. Wu, B. Li, L. Chen, & C. Zhang, "Canonical Consistent %
% Weighted Sampling for Real-Value Weighted Min-Hash", ICDM 2016. %
% ======================================================================= %
function [ fingerprintK,fingerprintY, runtime ] = ccws( C, weightedSet, D)
% Input:
% C - the scaling parameter
% weightedSet - an m*n matrix of weighted sets (non-negative weights)
% rows - the m features of the universal set
% columns - the n weighted sets
% D - the number of hash functions
% Output:
% fingerprintK - 'k' in the returned hash code '(k,y)'
% fingerprintY - 'y' in the returned hash code '(k,y)'
% runtime - total runtime in seconds
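%
% Example (a minimal sketch; the toy matrix W and the values C = 10 and
% D = 500 are illustrative assumptions, not settings from the paper):
% W = [0.5 0.4; 0 0.3; 1.2 1.0]; % 3 features, 2 weighted sets
% [K, Y] = ccws(10, W, 500);
% % fraction of hash functions on which both (k,y) codes collide
% est = mean(K(1,:) == K(2,:) & Y(1,:) == Y(2,:));
% % exact generalized Jaccard similarity of the two sets, for comparison
% gj = sum(min(W, [], 2)) / sum(max(W, [], 2));
% The collision rate est is meant to approximate gj; how close it gets
% depends on C and D, so treat the comparison as a sanity check only.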
n = size(weightedSet, 2); % the number of weighted sets
fingerprintK = zeros(n,D); % 'k' fingerprints of length D for the n weighted sets
fingerprintY = zeros(n,D); % 'y' fingerprints of length D for the n weighted sets
m = size(weightedSet, 1); % the number of features
tic;
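% Random variables shared by all weighted sets, one per (feature, hash function).
% betarnd, gamrnd and unifrnd require the Statistics and Machine Learning Toolbox.
gamma = betarnd(2,1, m, D); % gamma ~ Beta(2,1)
c = gamrnd(2,1, m, D); % c ~ Gamma(2,1)
beta = unifrnd(0,1, m, D); % beta ~ Uniform(0,1)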
for j=1:n
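[wordId,~] = find(weightedSet(:,j)>0); % features with positive weight in set j
% quantized index t for every active feature and hash function
tMatrix = floor(C*repmat(weightedSet(wordId,j),1,D)./ gamma(wordId,:) + beta(wordId,:));
yMatrix = gamma(wordId,:) .* (tMatrix - beta(wordId,:));
aMatrix = c(wordId,:)./yMatrix - 2*gamma(wordId,:).*c(wordId,:);
[~,Imin] = min(aMatrix,[],1); % row of the minimal 'a' for each hash function
% column-major linear indices of (Imin(d), d), used to pick the matching y
yWM1 = yMatrix(size(aMatrix,1).*(0:size(aMatrix,2)-1)+Imin);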
fingerprintK(j,:) = wordId(Imin); % fingerprint for the j-th weighted set
fingerprintY(j,:) = yWM1;
end
runtime = toc;
end