Class ObjectSet<T>

java.lang.Object
com.github.tommyettinger.ds.ObjectSet<T>
All Implemented Interfaces:
EnhancedCollection<T>, Iterable<T>, Collection<T>, Set<T>
Direct Known Subclasses:
CaseInsensitiveSet, FilteredIterableSet, FilteredStringSet, IdentitySet, ObjectOrderedSet

public class ObjectSet<T> extends Object implements Iterable<T>, Set<T>, EnhancedCollection<T>
An unordered set where the keys are objects. Null keys are not allowed. No allocation is done except when growing the table size.

This class performs fast contains and remove (typically O(1), worst case O(n) but that is rare in practice). Add may be slightly slower, depending on hash collisions. Hashcodes are rehashed to reduce collisions and the need to resize. Load factors greater than 0.91 greatly increase the chances to resize to the next higher POT size.

Unordered sets and maps are not designed to provide especially fast iteration. Iteration is faster with Ordered types like ObjectOrderedSet and ObjectObjectOrderedMap.

You can customize most behavior of this map by extending it. place(Object) can be overridden to change how hashCodes are calculated (which can be useful for types like StringBuilder that don't implement hashCode()), and equate(Object, Object) can be overridden to change how equality is calculated.

This implementation uses linear probing with the backward shift algorithm for removal. It tries different hashes from a simple family, with the hash changing on resize. Linear probing continues to work even when all hashCodes collide; it just works more slowly in that case.

  • Field Details

    • size

      protected int size
    • keyTable

      protected T[] keyTable
    • loadFactor

      protected float loadFactor
      Between 0f (exclusive) and 1f (inclusive, if you're careful), this determines how full the backing table can get before this increases its size. Larger values use less memory but make the data structure slower.
    • threshold

      protected int threshold
      Precalculated value of (int)(keyTable.length * loadFactor), used to determine when to resize.
    • shift

      protected int shift
      Used by place(Object) typically, this should always equal com.github.tommyettinger.digital.BitConversion.countLeadingZeros(mask). For a table that could hold 2 items (with 1 bit indices), this would be 64 - 1 == 63. For a table that could hold 256 items (with 8 bit indices), this would be 64 - 8 == 56.
    • mask

      protected int mask
      A bitmask used to confine hashcodes to the size of the table. Must be all 1 bits in its low positions, ie a power of two minus 1. If place(Object) is overridden, this can be used instead of shift to isolate usable bits of a hash.
    • hashMultiplier

      protected int hashMultiplier
      Used by place(Object) to mix hashCode() results. Changes on every call to resize(int) by default. This should always change when shift changes, meaning, when the backing table resizes. This only needs to be serialized if the full key and value tables are serialized, or if the iteration order should be the same before and after serialization. Iteration order is better handled by using ObjectOrderedSet.
    • iterator1

      protected transient ObjectSet.ObjectSetIterator<T> iterator1
    • iterator2

      protected transient ObjectSet.ObjectSetIterator<T> iterator2
  • Constructor Details

    • ObjectSet

      public ObjectSet()
      Creates a new set with an initial capacity of Utilities.getDefaultTableCapacity() and a load factor of Utilities.getDefaultLoadFactor().
    • ObjectSet

      public ObjectSet(int initialCapacity)
      Creates a new set with a load factor of Utilities.getDefaultLoadFactor().
      Parameters:
      initialCapacity - If not a power of two, it is increased to the next nearest power of two.
    • ObjectSet

      public ObjectSet(int initialCapacity, float loadFactor)
      Creates a new set with the specified initial capacity and load factor. This set will hold initialCapacity items before growing the backing table.
      Parameters:
      initialCapacity - If not a power of two, it is increased to the next nearest power of two.
      loadFactor - what fraction of the capacity can be filled before this has to resize; 0 < loadFactor <= 1
    • ObjectSet

      public ObjectSet(Iterator<? extends T> coll)
      Creates a new instance containing the items in the specified iterator.
      Parameters:
      coll - an iterator that will have its remaining contents added to this
    • ObjectSet

      public ObjectSet(ObjectSet<? extends T> set)
      Creates a new set identical to the specified set.
    • ObjectSet

      public ObjectSet(Collection<? extends T> coll)
      Creates a new set that contains all distinct elements in coll.
    • ObjectSet

      public ObjectSet(T[] array, int offset, int length)
      Creates a new set using length items from the given array, starting at offset (inclusive).
      Parameters:
      array - an array to draw items from
      offset - the first index in array to draw an item from
      length - how many items to take from array; bounds-checking is the responsibility of the using code
    • ObjectSet

      public ObjectSet(T[] array)
      Creates a new set containing all of the items in the given array.
      Parameters:
      array - an array that will be used in full, except for duplicate items
  • Method Details

    • place

      protected int place(Object item)
      Returns an index >= 0 and <= mask for the specified item, mixed.
      Parameters:
      item - a non-null Object; its hashCode() method should be used by most implementations
      Returns:
      an index between 0 and mask (both inclusive)
    • equate

      protected boolean equate(Object left, Object right)
      Compares the objects left and right, which are usually keys, for equality, returning true if they are considered equal. This is used by the rest of this class to determine whether two keys are considered equal. Normally, this returns left.equals(right), but subclasses can override it to use reference equality, fuzzy equality, deep array equality, or any other custom definition of equality. Usually, place(Object) is also overridden if this method is.
      Parameters:
      left - must be non-null; typically a key being compared, but not necessarily
      right - may be null; typically a key being compared, but can often be null for an empty key slot, or some other type
      Returns:
      true if left and right are considered equal for the purposes of this class
    • locateKey

      protected int locateKey(Object key)
      Returns the index of the key if already present, else ~index for the next empty index. This calls equate(Object, Object) to determine if two keys are equivalent.
      Parameters:
      key - a non-null K key
      Returns:
      a negative index if the key was not found, or the non-negative index of the existing key if found
    • add

      public boolean add(T key)
      Returns true if the key was not already in the set. If this set already contains the key, the call leaves the set unchanged and returns false.
      Specified by:
      add in interface Collection<T>
      Specified by:
      add in interface Set<T>
    • containsAll

      public boolean containsAll(Collection<?> c)
      Specified by:
      containsAll in interface Collection<T>
      Specified by:
      containsAll in interface Set<T>
    • containsAll

      public boolean containsAll(Object[] array)
      Exactly like containsAll(Collection), but takes an array instead of a Collection.
      Specified by:
      containsAll in interface EnhancedCollection<T>
      Parameters:
      array - array to be checked for containment in this set
      Returns:
      true if this set contains all the elements in the specified array
      See Also:
    • containsAll

      public boolean containsAll(Object[] array, int offset, int length)
      Like containsAll(Object[]), but only uses at most length items from array, starting at offset.
      Specified by:
      containsAll in interface EnhancedCollection<T>
      Parameters:
      array - array to be checked for containment in this set
      offset - the index of the first item in array to check
      length - how many items, at most, to check from array
      Returns:
      true if this set contains all the elements in the specified range of array
      See Also:
    • containsAnyIterable

      public boolean containsAnyIterable(Iterable<?> values)
      Returns true if this set contains any of the specified values.
      Specified by:
      containsAnyIterable in interface EnhancedCollection<T>
      Parameters:
      values - must not contain nulls, and must not be null itself
      Returns:
      true if this set contains any of the items in values, false otherwise
    • containsAny

      public boolean containsAny(Object[] values)
      Returns true if this set contains any of the specified values.
      Specified by:
      containsAny in interface EnhancedCollection<T>
      Parameters:
      values - must not contain nulls, and must not be null itself
      Returns:
      true if this set contains any of the items in values, false otherwise
    • containsAny

      public boolean containsAny(Object[] values, int offset, int length)
      Returns true if this set contains any items from the specified range of values.
      Specified by:
      containsAny in interface EnhancedCollection<T>
      Parameters:
      values - must not contain nulls, and must not be null itself
      offset - the index to start checking in values
      length - how many items to check from values
      Returns:
      true if this set contains any of the items in the given range of values, false otherwise
    • addAll

      public boolean addAll(Collection<? extends T> coll)
      Specified by:
      addAll in interface Collection<T>
      Specified by:
      addAll in interface Set<T>
    • retainAll

      public boolean retainAll(Collection<?> c)
      Specified by:
      retainAll in interface Collection<T>
      Specified by:
      retainAll in interface Set<T>
    • removeAll

      public boolean removeAll(Collection<?> c)
      Specified by:
      removeAll in interface Collection<T>
      Specified by:
      removeAll in interface Set<T>
    • removeAll

      public boolean removeAll(Object[] values)
      Description copied from interface: EnhancedCollection
      Removes from this collection all occurrences of any elements contained in the specified array.
      Specified by:
      removeAll in interface EnhancedCollection<T>
      Parameters:
      values - a non-null array of items to remove fully
      Returns:
      true if this collection was modified.
    • removeAll

      public boolean removeAll(Object[] values, int offset, int length)
      Description copied from interface: EnhancedCollection
      Removes from this collection all occurrences of any elements contained in the specified array, but only starts reading elements from the array starting at the given offset and only uses length items.
      Specified by:
      removeAll in interface EnhancedCollection<T>
      Parameters:
      values - a non-null array of items to remove fully
      offset - the first index in array to use
      length - how many items from array should be used
      Returns:
      true if this collection was modified.
    • addAll

      public boolean addAll(T[] array)
      Specified by:
      addAll in interface EnhancedCollection<T>
    • addAll

      public boolean addAll(T[] array, int offset, int length)
      Specified by:
      addAll in interface EnhancedCollection<T>
    • addAll

      public boolean addAll(ObjectSet<T> set)
    • addResize

      protected void addResize(T key)
      Like add(Object), but skips checks for existing keys, and doesn't increment size.
    • remove

      public boolean remove(Object key)
      Returns true if the key was removed.
      Specified by:
      remove in interface Collection<T>
      Specified by:
      remove in interface Set<T>
    • notEmpty

      public boolean notEmpty()
      Returns true if the set has one or more items.
    • size

      public int size()
      Returns the number of elements in this set (its cardinality). If this set contains more than Integer.MAX_VALUE elements, returns Integer.MAX_VALUE.
      Specified by:
      size in interface Collection<T>
      Specified by:
      size in interface Set<T>
      Returns:
      the number of elements in this set (its cardinality)
    • isEmpty

      public boolean isEmpty()
      Returns true if the set is empty.
      Specified by:
      isEmpty in interface Collection<T>
      Specified by:
      isEmpty in interface Set<T>
    • shrink

      public void shrink(int maximumCapacity)
      Reduces the size of the backing arrays to be the specified capacity / loadFactor, or less. If the capacity is already less, nothing is done. If the set contains more items than the specified capacity, the next highest power of two capacity is used instead.
    • clear

      public void clear(int maximumCapacity)
      Clears the set and reduces the size of the backing arrays to be the specified capacity / loadFactor, if they are larger. The reduction is done by allocating new arrays, though for large arrays this can be faster than clearing the existing array.
    • clear

      public void clear()
      Clears the set, leaving the backing arrays at the current capacity. When the capacity is high and the population is low, iteration can be unnecessarily slow. clear(int) can be used to reduce the capacity.
      Specified by:
      clear in interface Collection<T>
      Specified by:
      clear in interface Set<T>
    • contains

      public boolean contains(Object key)
      Specified by:
      contains in interface Collection<T>
      Specified by:
      contains in interface Set<T>
    • get

      public T get(T key)
    • first

      public T first()
      Description copied from interface: EnhancedCollection
      Attempts to get the first item in this EnhancedCollection, where "first" is only defined meaningfully if this type is ordered. Many times, this applies to a class that is not ordered, and in those cases it can get an arbitrary item, and that item is permitted to be different for different calls to first().
      This is useful for cases where you would normally be able to call something like List.get(int) to get an item, any item, from a collection, but whatever class you're using doesn't necessarily provide a get(), first(), peek(), or similar method.
      The default implementation uses Collection.iterator(), tries to get the first item, or throws an IllegalStateException if this is empty.
      Specified by:
      first in interface EnhancedCollection<T>
      Returns:
      the first item in this EnhancedCollection, as produced by Collection.iterator()
    • ensureCapacity

      public void ensureCapacity(int additionalCapacity)
      Increases the size of the backing array to accommodate the specified number of additional items / loadFactor. Useful before adding many items to avoid multiple backing array resizes.
      Parameters:
      additionalCapacity - how many additional items this should be able to hold without resizing (probably)
    • resize

      protected void resize(int newSize)
    • getHashMultiplier

      public int getHashMultiplier()
      Gets the current hashMultiplier, used in place(Object) to mix hash codes. If setHashMultiplier(int) is never called, the hashMultiplier will always be drawn from Utilities.GOOD_MULTIPLIERS, with the index equal to 64 - shift.
      Returns:
      the current hashMultiplier
    • setHashMultiplier

      public void setHashMultiplier(int hashMultiplier)
      Sets the hashMultiplier to the given int, which will be made odd if even and always negative (by OR-ing with 0x80000001). This can be any negative, odd int, but should almost always be drawn from Utilities.GOOD_MULTIPLIERS or something like it.
      Parameters:
      hashMultiplier - any int; will be made odd if even.
    • getTableSize

      public int getTableSize()
      Gets the length of the internal array used to store all items, as well as empty space awaiting more items to be entered. This is also called the capacity.
      Returns:
      the length of the internal array that holds all items
    • toArray

      public Object[] toArray()
      Specified by:
      toArray in interface Collection<T>
      Specified by:
      toArray in interface Set<T>
    • toArray

      public <E> E[] toArray(E[] a)
      Returns an array containing all the elements in this set; the runtime type of the returned array is that of the specified array. If the set fits in the specified array, it is returned therein. Otherwise, a new array is allocated with the runtime type of the specified array and the size of this set.
      Implementation is mostly copied from GWT, but uses Arrays.copyOf() instead of their internal APIs.
      Specified by:
      toArray in interface Collection<T>
      Specified by:
      toArray in interface Set<T>
      Type Parameters:
      E - must be the same as this ObjectSet's T or a superclass/interface of it; E items may be null
      Parameters:
      a - the array into which the elements of this set are to be stored, if it is big enough; otherwise, a new array of the same runtime type is allocated for this purpose. This array must not be null.
      Returns:
      an array containing all the elements in this set
    • getLoadFactor

      public float getLoadFactor()
    • setLoadFactor

      public void setLoadFactor(float loadFactor)
    • hashCode

      public int hashCode()
      Specified by:
      hashCode in interface Collection<T>
      Specified by:
      hashCode in interface Set<T>
      Overrides:
      hashCode in class Object
    • equals

      public boolean equals(Object o)
      Specified by:
      equals in interface Collection<T>
      Specified by:
      equals in interface Set<T>
      Overrides:
      equals in class Object
    • appendTo

      public StringBuilder appendTo(StringBuilder builder, String separator)
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • truncate

      public void truncate(int newSize)
      Reduces the size of the set to the specified size. If the set is already smaller than the specified size, no action is taken. This indiscriminately removes items from the backing array until the requested newSize is reached, or until the full backing array has had its elements removed.
      This tries to remove from the end of the iteration order, but because the iteration order is not guaranteed by an unordered set, this can remove essentially any item(s) from the set if it is larger than newSize.
      Parameters:
      newSize - the target size to try to reach by removing items, if smaller than the current size
    • iterator

      public ObjectSet.ObjectSetIterator<T> iterator()
      Returns an iterator for the keys in the set. Remove is supported.

      Reuses one of two iterators for this set. For nested or multithreaded iteration, use ObjectSetIterator(ObjectSet).

      Specified by:
      iterator in interface Collection<T>
      Specified by:
      iterator in interface Iterable<T>
      Specified by:
      iterator in interface Set<T>
    • with

      public static <T> ObjectSet<T> with()
      Constructs an empty set given the type as a generic type argument. This is usually less useful than just using the constructor, but can be handy in some code-generation scenarios when you don't know how many arguments you will have.
      Type Parameters:
      T - the type of items; must be given explicitly
      Returns:
      a new set containing nothing
    • with

      public static <T> ObjectSet<T> with(T item)
      Creates a new ObjectSet that holds only the given item, but can be resized.
      Type Parameters:
      T - the type of item, typically inferred
      Parameters:
      item - one T item
      Returns:
      a new ObjectSet that holds the given item
    • with

      public static <T> ObjectSet<T> with(T item0, T item1)
      Creates a new ObjectSet that holds only the given items, but can be resized.
      Type Parameters:
      T - the type of item, typically inferred
      Parameters:
      item0 - a T item
      item1 - a T item
      Returns:
      a new ObjectSet that holds the given items
    • with

      public static <T> ObjectSet<T> with(T item0, T item1, T item2)
      Creates a new ObjectSet that holds only the given items, but can be resized.
      Type Parameters:
      T - the type of item, typically inferred
      Parameters:
      item0 - a T item
      item1 - a T item
      item2 - a T item
      Returns:
      a new ObjectSet that holds the given items
    • with

      public static <T> ObjectSet<T> with(T item0, T item1, T item2, T item3)
      Creates a new ObjectSet that holds only the given items, but can be resized.
      Type Parameters:
      T - the type of item, typically inferred
      Parameters:
      item0 - a T item
      item1 - a T item
      item2 - a T item
      item3 - a T item
      Returns:
      a new ObjectSet that holds the given items
    • with

      public static <T> ObjectSet<T> with(T item0, T item1, T item2, T item3, T item4)
      Creates a new ObjectSet that holds only the given items, but can be resized.
      Type Parameters:
      T - the type of item, typically inferred
      Parameters:
      item0 - a T item
      item1 - a T item
      item2 - a T item
      item3 - a T item
      item4 - a T item
      Returns:
      a new ObjectSet that holds the given items
    • with

      public static <T> ObjectSet<T> with(T item0, T item1, T item2, T item3, T item4, T item5)
      Creates a new ObjectSet that holds only the given items, but can be resized.
      Type Parameters:
      T - the type of item, typically inferred
      Parameters:
      item0 - a T item
      item1 - a T item
      item2 - a T item
      item3 - a T item
      item4 - a T item
      item5 - a T item
      Returns:
      a new ObjectSet that holds the given items
    • with

      public static <T> ObjectSet<T> with(T item0, T item1, T item2, T item3, T item4, T item5, T item6)
      Creates a new ObjectSet that holds only the given items, but can be resized.
      Type Parameters:
      T - the type of item, typically inferred
      Parameters:
      item0 - a T item
      item1 - a T item
      item2 - a T item
      item3 - a T item
      item4 - a T item
      item5 - a T item
      item6 - a T item
      Returns:
      a new ObjectSet that holds the given items
    • with

      public static <T> ObjectSet<T> with(T item0, T item1, T item2, T item3, T item4, T item5, T item6, T item7)
      Creates a new ObjectSet that holds only the given items, but can be resized.
      Type Parameters:
      T - the type of item, typically inferred
      Parameters:
      item0 - a T item
      item1 - a T item
      item2 - a T item
      item3 - a T item
      item4 - a T item
      item5 - a T item
      item6 - a T item
      Returns:
      a new ObjectSet that holds the given items
    • with

      @SafeVarargs public static <T> ObjectSet<T> with(T... varargs)
      Creates a new ObjectSet that holds only the given items, but can be resized. This overload will only be used when an array is supplied and the type of the items requested is the component type of the array, or if varargs are used and there are 9 or more arguments.
      Type Parameters:
      T - the type of item, typically inferred
      Parameters:
      varargs - a T varargs or T array; remember that varargs allocate
      Returns:
      a new ObjectSet that holds the given items
    • parse

      public static <T> ObjectSet<T> parse(String str, String delimiter, PartialParser<T> parser)
      Calls parse(String, String, PartialParser, boolean) with brackets set to false.
      Parameters:
      str - a String that will be parsed in full
      delimiter - the delimiter between items in str
      parser - a PartialParser that returns a T item from a section of str
      Returns:
      a new collection parsed from str
    • parse

      public static <T> ObjectSet<T> parse(String str, String delimiter, PartialParser<T> parser, boolean brackets)
      Creates a new collection and fills it by calling EnhancedCollection.addLegible(String, String, PartialParser, int, int) on either all of str (if brackets is false) or str without its first and last chars (if brackets is true). Each item is expected to be separated by delimiter.
      Parameters:
      str - a String that will be parsed in full (depending on brackets)
      delimiter - the delimiter between items in str
      parser - a PartialParser that returns a T item from a section of str
      brackets - if true, the first and last chars in str will be ignored
      Returns:
      a new collection parsed from str
    • parse

      public static <T> ObjectSet<T> parse(String str, String delimiter, PartialParser<T> parser, int offset, int length)
      Creates a new collection and fills it by calling EnhancedCollection.addLegible(String, String, PartialParser, int, int) with the given five parameters as-is.
      Parameters:
      str - a String that will have the given section parsed
      delimiter - the delimiter between items in str
      parser - a PartialParser that returns a T item from a section of str
      offset - the first position to parse in str, inclusive
      length - how many chars to parse, starting from offset
      Returns:
      a new collection parsed from str