<!DOCTYPE html>
<HTML>
<HEAD>
<title>Free Variables Perfect Hashing Algorithm - Alexandr Poltavsky, software developer</title>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
<meta http-equiv="Cache-Control" content="no-cache, no-store, must-revalidate" />
<meta http-equiv="Pragma" content="no-cache" />
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<link rel="stylesheet" type="text/css" href="css/common.css" />
<link rel="stylesheet" href="highlight/styles/hybrid.css">
</HEAD>
<BODY>
<div id="wrap">
<!-- header part -->
<div id="header">
<a href="/color-throne.html"><img class="ad" src="images/color-throne-logo-promo.png" title="Color Throne Game"/></a>
<div class="avatar">
<a href="/">
<img class="avatar" width="160" src="images/alexandr-poltavsky-avatar.jpeg" title="Alexandr Poltavsky" align="left"/>
</a>
</div>
<div class="text">
<span class="bio">
Alexandr Poltavsky <br/>
Software Developer <br/>
Location: Russia, Moscow <br/>
<span class="email">
<a href="mailto:poltavsky.alexandr@gmail.com">poltavsky.alexandr@gmail.com</a><br/>
</span>
</span>
<div class="break"></div>
<span class="links">
<a href="/">Blog</a>
<a href="https://github.com/alexpolt/">Github</a>
<a href="https://www.shadertoy.com/user/alexpolt">Shadertoy</a>
<a href="https://twitter.com/poltavsky_alex">Twitter</a>
</span>
</div>
</div>
<div id="content" style="clear: left">
<!-- here goes the main content -->
<div id="main-menu">
<a id="main-menu-0" href="index.html">Programming</a>
<a id="main-menu-1" href="index-gfx.html">Graphics</a>
<a id="main-menu-2" href="index-off.html">Off-topic</a>
</div>
<h2>Free Variables Perfect Hashing Algorithm</h2>
<p>Perfect hashing is the alluring idea of a hash function that maps a fixed set of keys to
bucket locations one-to-one, with no collisions. Ideally you also get a minimal perfect hash function,
so that no space is wasted. The <a href="http://cmph.sourceforge.net/">C Minimal Perfect Hashing Library</a>
contains a good summary of the most popular algorithms.</p>
<p>One such perfect hashing algorithm is <a href="http://cmph.sourceforge.net/chm.html">CHM</a>.
It uses an acyclic graph to compute the unique bucket locations, so implementing it involves
quite a bit of work. This post is about a simplification and generalization of that scheme. It is called
the <em>Free Variables Perfect Hashing Algorithm</em>.</p>
<p>So we have an object (a key). Let's assume that we have <em>N hash functions</em>. The N=1 case is the usual
hash mapping, and N=2 is roughly equivalent to CHM.
Hashing the object gives N bucket locations:</p>
<pre><code>size_t const ts = size_of_hash_table;
b1 = h1( object ) % ts;
b2 = h2( object ) % ts;
...
bN = hN( object ) % ts;
</code></pre>
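<p>The N hash functions can be derived from a single seeded hash. The sketch below is only an illustration: seeded_hash and bucket_locations are hypothetical names, and the FNV-1a style loop with a final fold is an arbitrary choice standing in for h1..hN.</p>

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical seeded hash: an FNV-1a style loop whose start value depends on
// the seed. Any family of independent hash functions could be used instead.
inline std::uint64_t seeded_hash( std::string const& key, std::uint64_t seed ) {
  std::uint64_t h = 14695981039346656037ull ^ ( seed * 0x9E3779B97F4A7C15ull );
  for( unsigned char c : key ) {
    h ^= c;
    h *= 1099511628211ull;
  }
  return h ^ ( h >> 32 );  // fold the high bits into the low bits before the modulo
}

// Computes the N bucket locations b1..bN of one key for a table of ts buckets.
template< std::size_t N >
std::array< std::size_t, N > bucket_locations( std::string const& key, std::size_t ts ) {
  std::array< std::size_t, N > b{};
  for( std::size_t i = 0; i < N; ++i )
    b[i] = static_cast< std::size_t >( seeded_hash( key, i + 1 ) % ts );
  return b;
}
```

<p>Two of the N locations of one key may coincide. Under XOR such a pair cancels out, so a construction routine has to take that into account.</p>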
<p>We call these N bucket locations the free variables. They are so called because any of them that is still
unoccupied can be set freely to adjust the resulting value. The perfect hash location during hash table lookup is computed as follows:</p>
<pre><code>perfect_hash_location = hash_table[b1] ^ hash_table[b2] ^ ... ^ hash_table[bN];
</code></pre>
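<p>As a minimal sketch of the lookup, assuming the N bucket locations of the key have already been computed and are passed in as a vector (the function signature is an assumption of this example):</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// XORs together the table values stored at the key's N bucket locations.
std::size_t perfect_hash_location( std::vector< std::size_t > const& hash_table,
                                   std::vector< std::size_t > const& buckets ) {
  std::size_t location = 0;
  for( std::size_t b : buckets )
    location ^= hash_table[b];
  return location;
}
```

<p>The lookup costs N table reads and no comparisons. A key that was not part of the original set still produces some location, so membership has to be verified separately if that matters.</p>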
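<p><em>XOR</em> works well here because it is its own inverse: XORing the values of the other buckets out of the desired location gives the value to store in the free bucket. Here is how we compute the values in the hash table during table construction:</p>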
<pre><code>free_b = find_first_free_bucket( b1, b2, ..., bN );
//let's assume it is b2 that is unoccupied (free_b == b2)
//store in b2 the value that makes the XOR over all N buckets equal the desired perfect_hash_location
hash_table[ b2 ] = perfect_hash_location ^ hash_table[b1] ^ hash_table[b3] ^ ... ^ hash_table[bN];
</code></pre>
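<p>The construction step for a single key could look roughly like this. It is a sketch under two assumptions that the text above does not spell out: a free bucket is only usable if it occurs an odd number of times among the key's locations (otherwise it cancels under XOR), and the key's other free buckets are frozen at 0 so that a later key cannot disturb the result. The names assign_key and free_bucket are hypothetical.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Marker for an unoccupied bucket (the -1 convention).
constexpr std::size_t free_bucket = static_cast< std::size_t >( -1 );

// Tries to place one key so that the XOR over its buckets equals `location`.
// Returns false when the key has no usable free variable left.
bool assign_key( std::vector< std::size_t >& hash_table,
                 std::vector< std::size_t > const& buckets,
                 std::size_t location ) {
  // Pick the first free bucket that occurs an odd number of times.
  std::size_t chosen = free_bucket;
  for( std::size_t b : buckets ) {
    if( hash_table[b] != free_bucket ) continue;
    std::size_t count = 0;
    for( std::size_t other : buckets ) count += ( other == b );
    if( count % 2 == 1 ) { chosen = b; break; }
  }
  if( chosen == free_bucket ) return false;

  // Assumption of this sketch: freeze the remaining free buckets at 0, then
  // XOR together everything except the chosen bucket.
  std::size_t rest = 0;
  for( std::size_t b : buckets ) {
    if( b == chosen ) continue;
    if( hash_table[b] == free_bucket ) hash_table[b] = 0;
    rest ^= hash_table[b];
  }
  hash_table[chosen] = location ^ rest;
  return true;
}
```

<p>Freezing buckets keeps the sketch short but uses up free variables quickly, so it needs a larger table than a scheme that searches the graph.</p>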
<p>A free bucket can be marked with a sentinel such as -1 (an implementation detail). What remains is how to
handle the case of no free variables. There are two options: either increase the number of hash functions or the table
size so that the problem does not arise, or search the graph. The graph is formed by aliasing buckets, that is, buckets shared by more than one object.</p>
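<p>The first option can be sketched as a retry loop that doubles the table size until a construction pass succeeds. The callback try_build is hypothetical: it is assumed to run one construction pass for a table of ts buckets and to return false when some key had no free variable. The doubling policy and the attempt limit are arbitrary choices for this example.</p>

```cpp
#include <cassert>
#include <cstddef>

// Calls try_build( ts ) with a doubling table size until it reports success.
// Returns the table size that worked, or 0 if max_attempts was exhausted.
template< typename TryBuild >
std::size_t grow_until_built( std::size_t key_count, TryBuild try_build, int max_attempts = 16 ) {
  std::size_t ts = key_count > 0 ? key_count : 1;
  for( int attempt = 0; attempt < max_attempts; ++attempt, ts *= 2 )
    if( try_build( ts ) ) return ts;
  return 0;
}
```

<p>The attempt limit matters: if two keys produce identical hashes under every hash function, no table size will ever separate them, and only different hash functions can help.</p>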
<p>For N=3 you could have all 3 locations occupied by values that belong to different objects. You need to keep back-references from
buckets to the objects linked to them, so that you can then search the graph for a free variable. If all 3 locations
happen to be occupied by a single other object, then you need to redesign your hash functions,
increase their number, or increase the hash table size.</p>
<p>Here is a rough illustration of what the graph looks like for N = 3.</p>
<p><img src="images/free-variables-N3.png" alt="" title="Free Variables Algorithm for N=3" /></p>
<p>Each triangle is an object that is hashed. One object (dark red) has all 3 locations occupied, so we have
to search for a free variable and, after finding it, propagate the change along the path (the change bubbles).</p>
<script>
document.addEventListener( "DOMContentLoaded", function() {
if( /iPhone|iPad|iPod|Android|Windows Phone/i.test(navigator.userAgent) ) {
document.getElementById("wrap").classList.add("mobile");
}
} );
</script>
<script src="highlight/highlight.pack.js"></script>
<script>
//hljs.configure({languages: ["C++"]});
hljs.initHighlightingOnLoad();
</script>
<div id="disqus_comments"><a href="javascript: run_disqus()">read and write comments</a></div>
<div id="disqus_thread"></div>
<script>
/**
* RECOMMENDED CONFIGURATION VARIABLES: EDIT AND UNCOMMENT THE SECTION BELOW TO INSERT DYNAMIC VALUES FROM YOUR PLATFORM OR CMS.
* LEARN WHY DEFINING THESE VARIABLES IS IMPORTANT: https://disqus.com/admin/universalcode/#configuration-variables
*/
/*
var disqus_config = function () {
this.page.url = PAGE_URL; // Replace PAGE_URL with your page's canonical URL variable
this.page.identifier = PAGE_IDENTIFIER; // Replace PAGE_IDENTIFIER with your page's unique identifier variable
};
*/
function run_disqus() { // DON'T EDIT BELOW THIS LINE
var d = document, s = d.createElement('script');
s.src = '//alexpolt-github-io.disqus.com/embed.js';
s.setAttribute('data-timestamp', +new Date());
(d.head || d.body).appendChild(s);
d.getElementById("disqus_comments").innerHTML = ""; // alexpolt
return undefined;
};
if( location.hash.indexOf("comment") != -1 ) run_disqus();
</script>
<!--noscript>Please enable JavaScript to view the <a href="https://disqus.com/?ref_noscript" rel="nofollow">comments
powered by Disqus.</a></noscript-->
</div> <!-- end content -->
</div> <!-- end wrap -->
<!-- Google Analytics -->
<script>
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-75341409-1', 'auto');
ga('send', 'pageview');
</script>
<!-- Yandex.Metrika counter -->
<script type="text/javascript">
(function (d, w, c) {
(w[c] = w[c] || []).push(function() {
try {
w.yaCounter43425229 = new Ya.Metrika({
id:43425229,
clickmap:true,
trackLinks:true,
accurateTrackBounce:true
});
} catch(e) { }
});
var n = d.getElementsByTagName("script")[0],
s = d.createElement("script"),
f = function () { n.parentNode.insertBefore(s, n); };
s.type = "text/javascript";
s.async = true;
s.src = "https://mc.yandex.ru/metrika/watch.js";
if (w.opera == "[object Opera]") {
d.addEventListener("DOMContentLoaded", f, false);
} else { f(); }
})(document, window, "yandex_metrika_callbacks");
</script>
<noscript><div><img src="https://mc.yandex.ru/watch/43425229" style="position:absolute; left:-9999px;" alt="" /></div></noscript>
<!-- /Yandex.Metrika counter -->
</BODY>