tag:blogger.com,1999:blog-2715968472735546962.post6633711826933804459..comments2023-12-14T02:21:18.222+01:00Comments on Bannalia: trivial notes on themes diverse: Lookup cost in hash tables with duplicate elementsJoaquín M López Muñozhttp://www.blogger.com/profile/08579853272674211100noreply@blogger.comBlogger4125tag:blogger.com,1999:blog-2715968472735546962.post-30449665663233465632014-07-20T12:35:07.251+02:002014-07-20T12:35:07.251+02:00Hi Mute Invert,
According to Wolfram MathWorld, t...Hi Mute Invert,<br /><br />According to <a href="http://mathworld.wolfram.com/BinomialDistribution.html" rel="nofollow">Wolfram MathWorld</a>, the second moment of a binomial distribution with parameters <i>N</i> and <i>p</i> (= 1/<i>B</i> in our case) is<br /><br /><i>Np</i>(1 - <i>p</i> + <i>Np</i>) = <i>F</i> + <i>F</i>^2 -<i>F</i>/<i>B</i>,<br /><br />so the average number of elements checked is smaller than in the Poisson case by an offset of 1/(2<i>B</i>).Joaquín M López Muñozhttps://www.blogger.com/profile/08579853272674211100noreply@blogger.comtag:blogger.com,1999:blog-2715968472735546962.post-54178791319754086532014-07-20T07:59:25.514+02:002014-07-20T07:59:25.514+02:00Fascinating. I too happened to work on this proble...Fascinating. I too happened to work on this problem, a few months ago! (After comparing to yours, turns out I was off by a factor of F!)<br /><br />I'd just like to point out that you can use the binomial distribution directly (rather than approximating using a poisson) because the moments work out to the same thing.Mute Inverthttps://www.blogger.com/profile/15274158594031361717noreply@blogger.comtag:blogger.com,1999:blog-2715968472735546962.post-9843070654143237072014-04-30T12:12:58.562+02:002014-04-30T12:12:58.562+02:00Hi Pete,
Please note that the purpose of adding b...Hi Pete,<br /><br />Please note that the purpose of adding bucket sorting is <i>not</i> to increase performance in the general case (where hashing is working OK and buckets are consequently sparsely populated) but to defend against malicious DoS attacks via injection values calculated to defeat hashing.<br /><br />The scenario described in the article, where we consider duplicate values, might benefit of bucket ordering, but group skipping is going to be faster --tipically only ~2 nodes are checked regardless of duplicate group sizes.Joaquín M López Muñozhttps://www.blogger.com/profile/08579853272674211100noreply@blogger.comtag:blogger.com,1999:blog-2715968472735546962.post-81249924271505011552014-04-28T11:48:51.593+02:002014-04-28T11:48:51.593+02:00Hi Joaquín,
I read this fascinating series a few ...Hi Joaquín,<br /><br />I read this fascinating series a few months ago and should have congratulated you on it then.<br /><br />I was reminded of it today when I ready about changed HashMap behaviour in Java 8 - http://www.javacodegeeks.com/2014/04/hashmap-performance-improvements-in-java-8.html<br /><br />So, in the case of heavily-occupied buckets, Java 8 is putting an order on the bucket (assuming the key supports it) and getting O(log(bucket size)). Is this something you looked at? (Sorry if I missed it)<br /><br />Pete<br /> Anonymoushttps://www.blogger.com/profile/07978090648632337561noreply@blogger.com