alan.mcgovern at gmail.com
Sat Dec 1 13:02:40 EST 2007
Also, worst case scenario for a zero length string would mean a 22%
increase, not a 100% increase as was said before.
On Dec 1, 2007 6:01 PM, Alan McGovern <alan.mcgovern at gmail.com> wrote:
> Well, 'N' could be computed by asking what percent bloat is 'OK' or it
> could be computed by asking 'What % speedup do we want to make this a
> worthwhile tradeoff'.
> Currently a string has these two fields, an int and a char. Add in the
> object overhead of (i think) 12 bytes per object, you're up to 18 bytes
> straight off for a zero length string. Lets assume that you're only going to
> precalculate the hashcode for strings greater than length 4.
> So, take the case of a string of length 5. This puts the size of the
> string at 18 + 5*2 = 28 bytes. Adding 4 bytes to this is 15% extra memory
> usage. As string length goes up, this % goes down.
> I'm unsure what % speedup precalculating the hashcode would result in, but
> i'm sure someone could run a few benchmarks if they have the time free. You
> could say that you want at least a 5x or maybe 10x improvement before it
> becomes worthwhile, this would then put a lower bound on the string length
> to make it worthwhile.
> On Dec 1, 2007 5:29 PM, Robert Jordan <robertj at gmx.net> wrote:
> > Hi Alan,
> > Alan McGovern wrote:
> > > Also, just looking at the string source a bit more closely, it has a
> > > GetCaseInsensitiveHashcode method too, so i'd assume that would need
> > to be
> > > cached too which would mean 8 bytes would be needed. This wouldn't
> > scale
> > > well.
> > >
> > > Fair enough. Twas just an idea.
> > nah, don't give up too early! :) If the string is longer than "n",
> > you could allocate 4 or even 8 additional bytes at the end of the
> > array for lazily computed hash codes.
> > Open questions:
> > - What's the optimal "n" for all common application classes
> > mono is used for? :) It looks like it should be dynamically
> > computed & adjusted based on profiling data collected at
> > runtime.
> > - What's the real gain of this optimization? If "n" tends to
> > be really large (say 1-4KB), what's the probability of
> > such large strings being used as a hash key?
> > This question can be reduced to "how frequently is
> > s.GetHashCode called, where s.Length > n".
> > In mono's class libs sources there is only one place (I know
> > of) where large strings are potentially hashed: Sys.Web's
> > cache management.
> > Robert
> > _______________________________________________
> > Mono-devel-list mailing list
> > Mono-devel-list at lists.ximian.com
> > http://lists.ximian.com/mailman/listinfo/mono-devel-list
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the Mono-devel-list