Package com.github.tommyettinger.ds
Class Utilities
java.lang.Object
com.github.tommyettinger.ds.Utilities
Utility code shared by various data structures in this package.
-
Field Summary
FieldsModifier and TypeFieldDescriptionstatic final floatA float that is meant to be used as the smallest reasonable tolerance for methods likeisEqual(float, float, float).static final int[]A final array of 512 int multipliers that have shown to have few collisions in some kinds of hash table.static final ObjectA placeholder Object that should never be reference-equivalent to any Object used as a key or value. -
Method Summary
Modifier and TypeMethodDescriptionstatic voidSet all elements inobjectsto null.static voidSet allsizeelements inobjects, starting at indexstart, to null.static intLikeString.compareToIgnoreCase(String), but works for allCharSequencevalues, such asStrings orStringBuilders; this ignores case by upper-casing any cased letters.static booleanA simple equality comparison forCharSequencevalues such asStrings orStringBuilders that ignores case by upper-casing any cased letters.static floatGets the default load factor, meant to be used when no load factor is specified during the construction of a data structure such as a map or set.static intGets the default capacity for maps and sets backed by hash tables, meant to be used when no capacity is specified during the construction of a map or set.static intGets a 32-bit thoroughly-random hashCode from the given CharSequence, ignoring the case of any cased letters.static inthashCodeIgnoreCase(CharSequence data, int seed) Gets a 32-bit thoroughly-random hashCode from the given CharSequence, ignoring the case of any cased letters.static booleanisEqual(float a, float b) Equivalent to libGDX's isEqual() method in MathUtils; this compares two floats for equality and allows just enough tolerance to ignore a rounding error.static booleanisEqual(float a, float b, float tolerance) Equivalent to libGDX's isEqual() method in MathUtils; this compares two floats for equality and allows the given tolerance during comparison.static longGets a 64-bit thoroughly-random hashCode from the given CharSequence, ignoring the case of any cased letters.static longlongHashCodeIgnoreCase(CharSequence data, long seed) Gets a 64-bit thoroughly-random hashCode from the given CharSequence, ignoring the case of any cased letters.static voidsetDefaultLoadFactor(float loadFactor) Sets the load factor that will be used when none is specified during construction (for data structures that have a load factor, such as all sets and maps here).static inttableSize(int capacity, float loadFactor) Used to establish the size of a hash table forObjectSet,ObjectObjectMap, and related code.
-
Field Details
-
GOOD_MULTIPLIERS
public static final int[] GOOD_MULTIPLIERSA final array of 512 int multipliers that have shown to have few collisions in some kinds of hash table. These multipliers aren't currently used within jdkgdxds except inhashCodeIgnoreCase(CharSequence, int), the CaseInsensitive maps and sets, and the Filtered maps and sets, but may still be used externally.
You can mutate this array, but you should only do so if you encounter high collision rates or resizes with a particular multiplier from this table. Any int m you set into this array must satisfy(m & 0x80000001) == 0x80000001, and should ideally have an "unpredictable" bit pattern. This last quality is the hardest to define, but generally dividing 2 to the 32 by an irrational number between 1 and 2 using BigDecimal or double math gets such an unpredictable pattern. Not all irrational numbers work, and the ones here were found empirically.
The specific numbers in this array were chosen because they performed well (that is, without ever colliding more than about 25% over the average) on a data set of 200000 Vector2 objects. Vector2 is from libGDX, and it can have an extremely challenging hashCode() for hashed data structures to handle if they don't adequately mix the hash's bits. On this data set, no multiplier used recorded fewer than 600000 collisions, and any multipliers with collision counts above a threshold of 677849 were rejected. That means these 512 are about as good as it gets. When tested with an equivalent to GridPoint2 from libGDX, vastly more objects had identical hashCode() results to a different GridPoint2 in the set. Only about a quarter as many hashCode() values were returned as there would be with all-unique results. Even with that more-challenging situation, this set did not have any multipliers that were significantly worse than average. An earlier version of the multipliers did have 97 "problem multipliers" for GridPoint2. The multipliers were also tested on a similar data set of 235970 English dictionary words (including many proper nouns), with fewer collisions here (a little more than 1/20 as many). All multipliers satisfied a threshold of 35000 collisions with the dictionary data set. -
neverIdentical
A placeholder Object that should never be reference-equivalent to any Object used as a key or value. This is only public so data structures can use it for comparisons; never put it into a data structure. -
FLOAT_ROUNDING_ERROR
public static final float FLOAT_ROUNDING_ERRORA float that is meant to be used as the smallest reasonable tolerance for methods likeisEqual(float, float, float).- See Also:
-
-
Method Details
-
setDefaultLoadFactor
public static void setDefaultLoadFactor(float loadFactor) Sets the load factor that will be used when none is specified during construction (for data structures that have a load factor, such as all sets and maps here). The load factor will be clamped so that it is greater than 0 (the lowest possible isFLOAT_ROUNDING_ERROR, but it should never actually be that low) and less than or equal to 1. The initial value for the default load factor is 0.7.
If multiple libraries and/or your own code depend on jdkgdxds, then they may attempt to set the default load factor independently of each other, but this only has one setting at a time. The best solution for this is to specify the load factor you want when it matters, possibly to a variable set per-library or even per section of a library that needs some load factor. That means not using the default load factor in this class, and always using the constructors that specify a load factor. Libraries are generally discouraged from setting the default load factor; that decision should be left up to the application using the library.- Parameters:
loadFactor- a float that will be clamped between 0 (exclusive) and 1 (inclusive)
-
getDefaultLoadFactor
public static float getDefaultLoadFactor()Gets the default load factor, meant to be used when no load factor is specified during the construction of a data structure such as a map or set. The initial value for the default load factor is 0.7.- Returns:
- the default load factor, always between 0 (exclusive) and 1 (inclusive)
-
getDefaultTableCapacity
public static int getDefaultTableCapacity()Gets the default capacity for maps and sets backed by hash tables, meant to be used when no capacity is specified during the construction of a map or set. This depends on the currentgetDefaultLoadFactor(), and is equivalent to the floor of64 * getDefaultLoadFactor(), or 44 if unchanged.- Returns:
- the default capacity for hash-based maps and sets, when none is given
-
tableSize
public static int tableSize(int capacity, float loadFactor) Used to establish the size of a hash table forObjectSet,ObjectObjectMap, and related code. The table size will always be a power of two, and should be the next power of two that is at least equal tocapacity / loadFactor.- Parameters:
capacity- the amount of items the hash table should be able to holdloadFactor- between 0.0 (exclusive) and 1.0 (inclusive); the fraction of how much of the table can be filled- Returns:
- the size of a hash table that can handle the specified capacity with the given loadFactor
-
clear
Set all elements inobjectsto null. This method is faster thanArrays.fill(long[], long)for large arrays (> 128).
From Apache Fury's ObjectArray class. -
clear
Set allsizeelements inobjects, starting at indexstart, to null. This method is faster thanArrays.fill(long[], long)for large arrays (> 128).
From Apache Fury's ObjectArray class. -
isEqual
public static boolean isEqual(float a, float b) Equivalent to libGDX's isEqual() method in MathUtils; this compares two floats for equality and allows just enough tolerance to ignore a rounding error. An example is0.3f - 0.2f == 0.1fvs.isEqual(0.3f - 0.2f, 0.1f); the first is incorrectly false, while the second is correctly true.- Parameters:
a- the first float to compareb- the second float to compare- Returns:
- true if a and b are equal or extremely close to equal, or false otherwise.
-
isEqual
public static boolean isEqual(float a, float b, float tolerance) Equivalent to libGDX's isEqual() method in MathUtils; this compares two floats for equality and allows the given tolerance during comparison. An example is0.3f - 0.2f == 0.1fvs.isEqual(0.3f - 0.2f, 0.1f, 0.000001f); the first is incorrectly false, while the second is correctly true.- Parameters:
a- the first float to compareb- the second float to comparetolerance- the maximum difference between a and b permitted for this to return true, inclusive- Returns:
- true if a and b have a difference less than or equal to tolerance, or false otherwise.
-
equalsIgnoreCase
A simple equality comparison forCharSequencevalues such asStrings orStringBuilders that ignores case by upper-casing any cased letters. This works for all alphabets in Unicode except Georgian.- Parameters:
a- a non-null CharSequence, such as a String or StringBuilderb- a non-null CharSequence, such as a String or StringBuilder- Returns:
- whether the contents of
aandbare equal ignoring case
-
compareIgnoreCase
LikeString.compareToIgnoreCase(String), but works for allCharSequencevalues, such asStrings orStringBuilders; this ignores case by upper-casing any cased letters. This technique works for all alphabets in Unicode except Georgian.- Parameters:
l- a non-null CharSequence, such as a String or StringBuilderr- a non-null CharSequence, such as a String or StringBuilder- Returns:
- ignoring case: 0 if the two
CharSequences are equal; a negative integer iflis lexicographically less thanr; or a positive integer iflis lexicographically greater thanr
-
longHashCodeIgnoreCase
Gets a 64-bit thoroughly-random hashCode from the given CharSequence, ignoring the case of any cased letters. Uses Water hash, which is a variant on mum-hash and wyhash. This gets the hash as if all cased letters have been converted to upper case byCasing.caseUp(char); this should be correct for all alphabets in Unicode except Georgian. Typically, place() methods in Sets and Maps here that want case-insensitive hashing would use this with(int)(longHashCodeIgnoreCase(text) >>> shift).
This is very similar to theHasher.hash64(CharSequence)method, and shares the same constants and mum() method with Hasher.- Parameters:
data- a non-null CharSequence; often a String, but this has no trouble with a StringBuilder- Returns:
- a long hashCode; quality should be similarly good across any bits
-
longHashCodeIgnoreCase
Gets a 64-bit thoroughly-random hashCode from the given CharSequence, ignoring the case of any cased letters. Uses Water hash, which is a variant on mum-hash and wyhash. This gets the hash as if all cased letters have been converted to upper case byCasing.caseUp(char); this should be correct for all alphabets in Unicode except Georgian. Typically, place() methods in Sets and Maps here that want case-insensitive hashing would use this with(int)(longHashCodeIgnoreCase(text) >>> shift).
This is very similar to theHasher.hash64(CharSequence)method, and shares the same constants and mum() method with Hasher.- Parameters:
data- a non-null CharSequence; often a String, but this has no trouble with a StringBuilderseed- any long; must be the same between calls if two equivalent values fordatamust be the same- Returns:
- a long hashCode; quality should be similarly good across any bits
-
hashCodeIgnoreCase
Gets a 32-bit thoroughly-random hashCode from the given CharSequence, ignoring the case of any cased letters. Uses wyhash version 4.2, but shrunk down to work on 16-bit char values instead of 64-bit long values. This gets the hash as if all cased letters have been converted to upper case byCasing.caseUp(char); this should be correct for all alphabets in Unicode except Georgian. Typically, place() methods in Sets and Maps here that want case-insensitive hashing would use this with(hashCodeIgnoreCase(text) >>> shift).- Parameters:
data- a non-null CharSequence; often a String, but this has no trouble with a StringBuilder- Returns:
- an int hashCode; quality should be similarly good across any bits
-
hashCodeIgnoreCase
Gets a 32-bit thoroughly-random hashCode from the given CharSequence, ignoring the case of any cased letters. Uses wyhash version 4.2, but shrunk down to work on 16-bit char values instead of 64-bit long values. This gets the hash as if all cased letters have been converted to upper case byCasing.caseUp(char); this should be correct for all alphabets in Unicode except Georgian. Typically, place() methods in Sets and Maps here that want case-insensitive hashing would use this with(hashCodeIgnoreCase(text, seed) >>> shift).- Parameters:
data- a non-null CharSequence; often a String, but this has no trouble with a StringBuilderseed- any int; must be the same between calls if two equivalent values fordatamust be the same- Returns:
- an int hashCode; quality should be similarly good across any bits
-