Class CharBitSet

java.lang.Object
com.github.tommyettinger.ds.CharBitSet
All Implemented Interfaces:
PrimitiveCollection<Character>, PrimitiveCollection.OfChar, PrimitiveSet<Character>, PrimitiveSet.SetOfChar, com.github.tommyettinger.function.CharPredicate

public class CharBitSet extends Object implements PrimitiveSet.SetOfChar, com.github.tommyettinger.function.CharPredicate
A set of primitive char items, implicitly sorted in ascending order, that can resize based on its highest-value char. This is like CharBitSetFixedSize, but doesn't need 2048 ints in an array for smaller sets, such as ASCII characters. For the specific case of ASCII characters, this only uses an array of 4 ints. This is based on OffsetBitSet, but doesn't have an offset (it acts as if its offset was always 0). Like CharBitSetFixedSize, this is a CharPredicate. It is also a PrimitiveCollection.OfChar and PrimitiveSet.SetOfChar.
This is very similar to the CharBitSet in RegExodus, but isn't compatible because RegExodus doesn't have the PrimitiveSet.SetOfChar class available to it.
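The storage idea described above (one bit per char, 32 bits per int, so the 128 ASCII chars fit in 4 ints) can be sketched in plain Java. This is a minimal illustration, not the library's source: the class name TinyCharBits and its methods are hypothetical, and the growth policy shown (grow to exactly the needed number of ints) is an assumption.

```java
import java.util.Arrays;

/** Hypothetical, simplified stand-in for a growable char bit set; not the library's code. */
final class TinyCharBits {
    // One bit per char; the int at index i covers chars 32*i through 32*i + 31.
    private int[] bits = new int[1];

    /** Adds c, growing the array if needed; returns true if the set changed. */
    boolean add(char c) {
        int word = c >>> 5;                 // which int holds this char's bit
        if (word >= bits.length) {
            // Assumption: grow to exactly the number of ints needed.
            bits = Arrays.copyOf(bits, word + 1);
        }
        int mask = 1 << c;                  // Java uses only the low 5 bits of the shift count
        boolean changed = (bits[word] & mask) == 0;
        bits[word] |= mask;
        return changed;
    }

    /** Returns true if c is present; chars past the end of the array are simply absent. */
    boolean contains(char c) {
        int word = c >>> 5;
        return word < bits.length && (bits[word] & (1 << c)) != 0;
    }

    /** Number of ints currently backing the set. */
    int words() {
        return bits.length;
    }
}
```

In this sketch, adding the highest ASCII char (127) leaves the backing array at 4 ints, which matches the figure given above for ASCII.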
  • Field Details

    • bits

      protected int[] bits
      The raw bits, each one representing the presence or absence of a char at that position.
  • Constructor Details

    • CharBitSet

      public CharBitSet()
      Creates a bit set with an initial size that can store positions between 0 and 31, inclusive, without needing to resize. This can resize to fit larger positions.
    • CharBitSet

      public CharBitSet(int bitCapacity)
      Creates a bit set whose initial size is large enough to explicitly represent bits with indices in the range 0 through bitCapacity-1. This can resize to fit larger positions.
      Parameters:
      bitCapacity - the initial size of the bit set
    • CharBitSet

      public CharBitSet(char end)
      Creates a bit set whose initial size is large enough to explicitly represent bits with indices in the range 0 through end-1. This can resize to fit larger positions. This will not contain end at the start, though it can be added without needing to resize.
      Parameters:
      end - the initial end of the range of the bit set
    • CharBitSet

      public CharBitSet(CharBitSet toCopy)
      Creates a bit set from another bit set. This will copy the raw bits.
      Parameters:
      toCopy - bitset to copy
    • CharBitSet

      public CharBitSet(CharSequence toCopy)
      Creates a bit set from a CharSequence, such as a CharList or String.
      Parameters:
      toCopy - the char sequence to copy
    • CharBitSet

      public CharBitSet(char[] toCopy)
      Creates a bit set from an entire char array.
      Parameters:
      toCopy - the non-null char array to copy
    • CharBitSet

      public CharBitSet(char[] toCopy, int off, int length)
      Creates a bit set from a char array, starting to read at an offset and continuing for a given length.
      Parameters:
      toCopy - the char array to copy
      off - which index to start copying from toCopy
      length - how many items to copy from toCopy
    • CharBitSet

      public CharBitSet(com.github.tommyettinger.function.CharPredicate predicate)
      Meant primarily for offline use to store the results of a CharPredicate on one target platform so those results can be recalled identically on all platforms. This can be relevant because of changing Unicode versions on newer JDK versions, or partial implementations of JDK predicates like Character.isLetter(char) on GWT.
      Parameters:
      predicate - a CharPredicate, which could be a method reference like Character::isLetter
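      The "store the results of a predicate" idea can be sketched without the library. The example below is an assumption-laden illustration: PredicateSnapshot is a hypothetical class, and java.util.function.IntPredicate stands in for the library's CharPredicate so that the code compiles with only the JDK.

```java
import java.util.function.IntPredicate;

/** Hypothetical helper; IntPredicate stands in for the library's CharPredicate. */
final class PredicateSnapshot {
    /** Tests every char value once and packs the results, 32 per int. */
    static int[] snapshot(IntPredicate predicate) {
        int[] bits = new int[0x10000 >>> 5]; // 65536 chars / 32 bits per int = 2048 ints
        for (int c = 0; c <= 0xFFFF; c++) {
            if (predicate.test(c)) {
                bits[c >>> 5] |= 1 << c;
            }
        }
        return bits;
    }

    /** Looks a char up in a snapshot without calling the original predicate again. */
    static boolean test(int[] bits, char c) {
        return (bits[c >>> 5] & (1 << c)) != 0;
    }
}
```

      Once taken, a snapshot answers the same way on every platform, regardless of how that platform implements the original predicate.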
    • CharBitSet

      public CharBitSet(int[] ints, boolean useAsRawBits)
      Allows passing an int array either to be treated as char contents to enter (ignoring any ints outside the valid char range) or as the raw bits that are used internally (which can be accessed with getRawBits()). Note that ints should always have a length of 1 or more; otherwise, it won't be used directly (or if useAsRawBits is false, it won't have any contents copied out).
      Parameters:
      ints - depending on useAsRawBits, this will be used as either char items or raw bits
      useAsRawBits - if true, ints will be used as raw bits and used directly, not copied as char items
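      The two readings of the same int[] differ a lot, which a small sketch may make clearer. IntArrayModes and both of its methods are hypothetical names used only for illustration; the bit layout (bit b of the int at index w stands for char 32*w + b) is an assumption consistent with the 32-chars-per-int storage described for this class.

```java
/** Hypothetical illustration of the two readings of an int[]; not the library's code. */
final class IntArrayModes {
    /** Reads each int as a char item to add, ignoring values outside 0..65535. */
    static int[] fromItems(int[] items) {
        int max = 0;
        for (int item : items) {
            if (item >= 0 && item <= 0xFFFF) max = Math.max(max, item);
        }
        int[] bits = new int[(max >>> 5) + 1];
        for (int item : items) {
            if (item >= 0 && item <= 0xFFFF) bits[item >>> 5] |= 1 << item;
        }
        return bits;
    }

    /** Reads the array as already-packed bits and checks whether c is present. */
    static boolean containsRaw(int[] rawBits, char c) {
        int word = c >>> 5;
        return word < rawBits.length && (rawBits[word] & (1 << c)) != 0;
    }
}
```

      Under these assumptions, the array {1, 33} read as items holds the chars 1 and 33, while the same array read as raw bits holds the chars 0, 32, and 37.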
  • Method Details

    • getRawBits

      public int[] getRawBits()
      This gets the internal int[] used to store bits in bulk. This is not meant for typical usage; it may be useful for serialization or other code that would typically need reflection to access the internals here. This may and often does include padding at the end.
      Returns:
      the raw int array used to store positions, one bit per on and per off position
    • setRawBits

      public void setRawBits(int[] bits)
      This allows setting the internal int[] used to store bits in bulk. This is not meant for typical usage; it may be useful for serialization or other code that would typically need reflection to access the internals here. Be very careful with this method. If bits is null or empty, it is ignored; this is the only error validation this does.
      Parameters:
      bits - a non-null, non-empty int array storing positions, typically obtained from getRawBits()
    • contains

      public boolean contains(char index)
      Returns true if the given position is contained in this bit set. If the index is out of bounds, this returns false.
      Specified by:
      contains in interface PrimitiveCollection.OfChar
      Parameters:
      index - the index of the bit
      Returns:
      whether the bit is set
    • test

      public boolean test(char value)
      Returns true if the given char is contained in this bit set, or false otherwise.
      Specified by:
      test in interface com.github.tommyettinger.function.CharPredicate
      Parameters:
      value - the char to check
      Returns:
      true if the char is present, or false otherwise
    • remove

      public boolean remove(char index)
      Deactivates the given position and returns true if the bit set was modified in the process. If the index is out of bounds, this does not modify the bit set and returns false.
      Specified by:
      remove in interface PrimitiveCollection.OfChar
      Parameters:
      index - the index of the bit
      Returns:
      true if this modified the bit set
    • add

      public boolean add(char index)
      Activates the given position and returns true if the bit set was modified in the process. If the index is out of bounds, this does not modify the bit set and returns false.
      Specified by:
      add in interface PrimitiveCollection.OfChar
      Parameters:
      index - the index of the bit
      Returns:
      true if this modified the bit set
    • add

      public boolean add(int index)
      Activates the given position and returns true if the bit set was modified in the process. If the index is out of bounds, this does not modify the bit set and returns false.
      Parameters:
      index - the index of the bit
      Returns:
      true if this modified the bit set
    • addAll

      public boolean addAll(char[] indices)
      Specified by:
      addAll in interface PrimitiveCollection.OfChar
    • addAll

      public boolean addAll(char[] indices, int off, int length)
      Specified by:
      addAll in interface PrimitiveCollection.OfChar
    • addAll

      public boolean addAll(int[] indices)
    • addAll

      public boolean addAll(int[] indices, int off, int length)
    • addSeq

      public boolean addSeq(CharSequence indices)
      Like addAll(char[]), but takes a CharSequence. Named differently to avoid ambiguity with addAll(OfChar) when a type is both a CharSequence and a PrimitiveCollection.OfChar.
      Parameters:
      indices - the CharSequence to read distinct chars from
      Returns:
      true if this was modified, or false otherwise
    • addSeq

      public boolean addSeq(CharSequence indices, int off, int length)
      Like addAll(char[], int, int), but takes a CharSequence. Named differently to avoid ambiguity with addAll(OfChar) when a type is both a CharSequence and a PrimitiveCollection.OfChar.
      Parameters:
      indices - the CharSequence to read distinct chars from
      off - the first position to read from indices
      length - how many chars to read from indices; because the CharSequence may have duplicates, this is not necessarily the length that will be added
      Returns:
      true if this was modified, or false otherwise
    • addAll

      public boolean addAll(PrimitiveCollection.OfChar indices)
      Adds another PrimitiveCollection.OfChar, such as a CharList, to this set. If you have another CharBitSet, you can use or(CharBitSet), which is faster.
      Specified by:
      addAll in interface PrimitiveCollection.OfChar
      Parameters:
      indices - another primitive collection of char
      Returns:
      true if this was modified
    • activateAll

      public void activateAll(char[] indices)
    • activateAll

      public void activateAll(char[] indices, int off, int length)
    • activateAll

      public void activateAll(int[] indices)
    • activateAll

      public void activateAll(int[] indices, int off, int length)
    • activateSeq

      public void activateSeq(CharSequence indices)
      Like activateAll(char[]), but takes a CharSequence. Named differently to avoid ambiguity with activateAll(OfChar) when a type is both a CharSequence and a PrimitiveCollection.OfChar.
      Parameters:
      indices - the CharSequence to read distinct chars from
    • activateSeq

      public void activateSeq(CharSequence indices, int off, int length)
      Like activateAll(char[], int, int), but takes a CharSequence. Named differently to avoid ambiguity with activateAll(OfChar) when a type is both a CharSequence and a PrimitiveCollection.OfChar.
      Parameters:
      indices - the CharSequence to read distinct chars from
      off - the first position to read from indices
      length - how many chars to read from indices; because the CharSequence may have duplicates, this is not necessarily the length that will be added
    • activateAll

      public void activateAll(PrimitiveCollection.OfChar indices)
      Adds another PrimitiveCollection.OfChar, such as a CharList, to this set. If you have another CharBitSet, you can use or(CharBitSet), which is faster.
      Parameters:
      indices - another primitive collection of char
    • iterator

      public CharBitSet.CharBitSetIterator iterator()
      Returns a new iterator for the keys in the set; remove is supported.
      Specified by:
      iterator in interface PrimitiveCollection<Character>
      Specified by:
      iterator in interface PrimitiveCollection.OfChar
      Returns:
      a new iterator for the keys in the set; remove is supported
    • activate

      public void activate(char index)
      Sets the given char position to true.
      Parameters:
      index - the index of the bit to set
    • deactivate

      public void deactivate(char index)
      Sets the given char position to false.
      Parameters:
      index - the index of the bit to clear
    • toggle

      public void toggle(char index)
      Changes the given char position from true to false, or from false to true.
      Parameters:
      index - the index of the bit to flip
    • activate

      public void activate(int index)
      Sets the given int position to true, unless the position is negative or greater than 65535 (then it does nothing).
      Parameters:
      index - the index of the bit to set
    • deactivate

      public void deactivate(int index)
      Sets the given int position to false, unless the position is out of bounds (then it does nothing).
      Parameters:
      index - the index of the bit to clear
    • toggle

      public void toggle(int index)
      Changes the given int position from true to false, or from false to true, unless the position is negative or greater than 65535 (then it does nothing).
      Parameters:
      index - the index of the bit to flip
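      The range check described for the int overloads (do nothing when the position is negative or greater than 65535) can be sketched as follows. IntIndexGuard is a hypothetical class working on a bare int[]; returning the possibly-grown array is a choice made for this sketch, whereas the real methods return void and update the set itself.

```java
import java.util.Arrays;

/** Hypothetical sketch of an int-indexed toggle with a char-range guard. */
final class IntIndexGuard {
    /** Flips the bit at index and returns the (possibly grown) array; ints outside 0..65535 are ignored. */
    static int[] toggle(int[] bits, int index) {
        if (index < 0 || index > 0xFFFF) {
            return bits;                            // not a valid char position: do nothing
        }
        int word = index >>> 5;
        if (word >= bits.length) {
            bits = Arrays.copyOf(bits, word + 1);   // assumption: grow just enough to fit
        }
        bits[word] ^= 1 << index;                   // XOR flips exactly one bit
        return bits;
    }
}
```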
    • clear

      public void clear()
      Clears the entire bitset, removing all contained chars. Doesn't change the capacity.
      Specified by:
      clear in interface PrimitiveCollection<Character>
    • numBits

      public int numBits()
      Gets the capacity in bits, including both true and false values, and including any false values that may be after the last contained position. Runs in O(1) time.
      Returns:
      the number of bits currently stored, not the highest set bit
    • length

      public int length()
      Returns the "logical extent" of this bitset: the index of the highest set bit in the bitset plus one. Returns zero if the bitset contains no set bits. If this has any set bits, it will return an int at least equal to 1. Runs in O(n) time.
      Returns:
      the logical extent of this bitset
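      One way to compute a "highest set bit plus one" value over packed ints is to scan from the last int backwards and use Integer.numberOfLeadingZeros on the first non-zero one. The class LogicalExtent below is a hypothetical sketch of that technique, not the library's implementation.

```java
/** Hypothetical sketch of a "logical extent" computation over packed bits. */
final class LogicalExtent {
    /** Returns the index of the highest set bit plus one, or 0 if no bit is set. */
    static int length(int[] bits) {
        for (int word = bits.length - 1; word >= 0; word--) {
            int w = bits[word];
            if (w != 0) {
                // 32 - numberOfLeadingZeros(w) is the highest set bit in w, plus one.
                return (word << 5) + 32 - Integer.numberOfLeadingZeros(w);
            }
        }
        return 0;
    }
}
```

      The scan stops at the first non-zero int it meets from the end, so the worst case (an empty set) visits every int, which is in line with the O(n) bound stated above.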
    • size

      public int size()
      Returns the size of the set, or its cardinality; this is the count of distinct activated positions in the set. Note that unlike most Collection types, which typically have O(1) size() runtime, this runs in O(n) time, where n is on the order of the capacity.
      Specified by:
      size in interface PrimitiveCollection<Character>
      Returns:
      the count of distinct activated positions in the set.
    • notEmpty

      public boolean notEmpty()
      Checks if there are any positions contained in this at all. Runs in O(n) time, but usually takes less.
      Specified by:
      notEmpty in interface PrimitiveCollection<Character>
      Returns:
      true if this bitset contains at least one bit set to true
    • isEmpty

      public boolean isEmpty()
      Checks if there are no positions contained in this at all. Runs in O(n) time, but usually takes less.
      Specified by:
      isEmpty in interface PrimitiveCollection<Character>
      Returns:
      true if this bitset contains no bits that are set to true
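      The running times given for size(), notEmpty(), and isEmpty() follow naturally from a packed-int layout, as the sketch below suggests. Cardinality is a hypothetical class operating on a bare int[]; it is meant to show why counting must visit every int while an emptiness check can stop early, and is not the library's code.

```java
/** Hypothetical sketch of counting and emptiness checks over packed bits. */
final class Cardinality {
    /** Counts set bits; every int has to be visited, so this is O(n) in the array length. */
    static int size(int[] bits) {
        int count = 0;
        for (int w : bits) {
            count += Integer.bitCount(w);
        }
        return count;
    }

    /** Stops at the first non-zero int, so it often finishes long before the end. */
    static boolean notEmpty(int[] bits) {
        for (int w : bits) {
            if (w != 0) {
                return true;
            }
        }
        return false;
    }

    /** The exact opposite of notEmpty. */
    static boolean isEmpty(int[] bits) {
        return !notEmpty(bits);
    }
}
```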
    • nextSetBit

      public int nextSetBit(int fromIndex)
      Returns the index of the first bit that is set to true that occurs on or after the specified starting index. If no such bit exists then -1 is returned (this class acts as if its offset is always 0).
      Parameters:
      fromIndex - the index to start looking at
      Returns:
      the first position that is set to true that occurs on or after the specified starting index
    • nextClearBit

      public int nextClearBit(int fromIndex)
      Returns the index of the first bit that is set to false that occurs on or after the specified starting index. If no such bit exists then numBits() is returned (this class acts as if its offset is always 0).
      Parameters:
      fromIndex - the index to start looking at
      Returns:
      the first position that is set to false that occurs on or after the specified starting index
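      A common way to implement these two scans over packed ints is to mask off the bits below fromIndex in the first int and then use Integer.numberOfTrailingZeros. The sketch below assumes an offset of 0, so the "not found" results are -1 for set bits and the capacity in bits for clear bits; it also assumes fromIndex is non-negative. BitScan is a hypothetical class, not the library's source.

```java
/** Hypothetical sketch of forward scans for set and clear bits. */
final class BitScan {
    /** First set bit at or after fromIndex, or -1 if there is none. */
    static int nextSetBit(int[] bits, int fromIndex) {
        int word = fromIndex >>> 5;
        if (word >= bits.length) {
            return -1;
        }
        int w = bits[word] & (-1 << fromIndex);   // drop bits below fromIndex in the first int
        while (true) {
            if (w != 0) {
                return (word << 5) + Integer.numberOfTrailingZeros(w);
            }
            if (++word == bits.length) {
                return -1;
            }
            w = bits[word];
        }
    }

    /** First clear bit at or after fromIndex, or the capacity in bits if there is none. */
    static int nextClearBit(int[] bits, int fromIndex) {
        int word = fromIndex >>> 5;
        if (word >= bits.length) {
            return bits.length << 5;
        }
        int w = ~bits[word] & (-1 << fromIndex);  // invert, so clear bits become set bits
        while (true) {
            if (w != 0) {
                return (word << 5) + Integer.numberOfTrailingZeros(w);
            }
            if (++word == bits.length) {
                return bits.length << 5;
            }
            w = ~bits[word];
        }
    }
}
```

      With this convention, every set position can be visited in ascending order with a loop such as for (int i = BitScan.nextSetBit(bits, 0); i >= 0; i = BitScan.nextSetBit(bits, i + 1)).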
    • and

      public void and(CharBitSet other)
      Performs a logical AND of this target bit set with the argument bit set. This bit set is modified so that each bit in it has the value true if and only if it both initially had the value true and the corresponding bit in the bit set argument also had the value true.
      Parameters:
      other - another CharBitSet
    • andNot

      public void andNot(CharBitSet other)
      Clears all the bits in this bit set whose corresponding bit is set in the specified bit set. This can be seen as an optimized version of PrimitiveCollection.OfChar.removeAll(com.github.tommyettinger.ds.PrimitiveCollection.OfChar).
      Parameters:
      other - another CharBitSet
    • or

      public void or(CharBitSet other)
      Performs a logical OR of this bit set with the bit set argument. This bit set is modified so that a bit in it has the value true if and only if it either already had the value true or the corresponding bit in the bit set argument has the value true.
      Parameters:
      other - another CharBitSet
    • xor

      public void xor(CharBitSet other)
      Performs a logical XOR of this bit set with the bit set argument. This bit set is modified so that a bit in it has the value true if and only if one of the following statements holds:
      • The bit initially has the value true, and the corresponding bit in the argument has the value false.
      • The bit initially has the value false, and the corresponding bit in the argument has the value true.
      Parameters:
      other - another CharBitSet
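      The four bulk operations above (and, andNot, or, xor) map onto one bitwise operator per int. The sketch below uses a hypothetical class, WordOps, and returns new arrays for clarity, whereas the documented methods modify this bit set in place. How arrays of different lengths are handled here (missing ints count as all-false) is an assumption.

```java
import java.util.Arrays;

/** Hypothetical sketch of word-wise set operations; each method returns a new array. */
final class WordOps {
    /** Keeps only bits present in both; ints that b lacks count as all-false. */
    static int[] and(int[] a, int[] b) {
        int[] out = new int[a.length];
        int common = Math.min(a.length, b.length);
        for (int i = 0; i < common; i++) out[i] = a[i] & b[i];
        return out;
    }

    /** Clears every bit of a that is set in b. */
    static int[] andNot(int[] a, int[] b) {
        int[] out = a.clone();
        int common = Math.min(a.length, b.length);
        for (int i = 0; i < common; i++) out[i] &= ~b[i];
        return out;
    }

    /** Keeps bits present in either; the result is as long as the longer input. */
    static int[] or(int[] a, int[] b) {
        int[] out = Arrays.copyOf(a, Math.max(a.length, b.length));
        for (int i = 0; i < b.length; i++) out[i] |= b[i];
        return out;
    }

    /** Keeps bits present in exactly one of the two inputs. */
    static int[] xor(int[] a, int[] b) {
        int[] out = Arrays.copyOf(a, Math.max(a.length, b.length));
        for (int i = 0; i < b.length; i++) out[i] ^= b[i];
        return out;
    }
}
```

      Each operation handles 32 positions per step, which is the reason a bit-set-to-bit-set operation can be faster than adding or removing items one at a time.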
    • intersects

      public boolean intersects(CharBitSet other)
      Returns true if the specified CharBitSet has any bits set to true that are also set to true in this CharBitSet.
      Parameters:
      other - another CharBitSet
      Returns:
      boolean indicating whether this bit set intersects the specified bit set
    • containsAll

      public boolean containsAll(CharBitSet other)
      Returns true if this bit set is a super set of the specified set, i.e. it has all bits set to true that are also set to true in the specified CharBitSet.
      Parameters:
      other - another CharBitSet
      Returns:
      boolean indicating whether this bit set is a super set of the specified set
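      Both intersects and containsAll can be answered per int without building a new set. The following is a hypothetical sketch (the class SetRelations is not part of the library) that treats ints missing from the shorter array as all-false, which is an assumption about how differing capacities are reconciled.

```java
/** Hypothetical sketch of intersection and superset tests over packed bits. */
final class SetRelations {
    /** True if some position is set in both arrays. */
    static boolean intersects(int[] a, int[] b) {
        int common = Math.min(a.length, b.length);
        for (int i = 0; i < common; i++) {
            if ((a[i] & b[i]) != 0) {
                return true;
            }
        }
        return false;
    }

    /** True if every position set in b is also set in a. */
    static boolean containsAll(int[] a, int[] b) {
        for (int i = 0; i < b.length; i++) {
            int mine = i < a.length ? a[i] : 0;   // ints a lacks count as all-false
            if ((b[i] & ~mine) != 0) {
                return false;                     // b has a bit that a does not
            }
        }
        return true;
    }
}
```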
    • hashCode

      public int hashCode()
      Specified by:
      hashCode in interface PrimitiveCollection<Character>
      Specified by:
      hashCode in interface PrimitiveSet<Character>
      Overrides:
      hashCode in class Object
    • equals

      public boolean equals(Object obj)
      Specified by:
      equals in interface PrimitiveCollection<Character>
      Specified by:
      equals in interface PrimitiveSet<Character>
      Overrides:
      equals in class Object
    • contents

      public char[] contents()
      Gets every char in this CharBitSet, as a char[]. This simply delegates to PrimitiveCollection.OfChar.toArray().
      Returns:
      a char[] of every char in this set, in ascending order
    • appendContents

      public StringBuilder appendContents(StringBuilder builder, String delimiter)
      Given a StringBuilder, this appends part of the toString() representation of this CharBitSet, without allocating a String. This does not include the opening [ and closing ] chars, and only appends the int positions in this CharBitSet, each pair separated by the given delimiter String. You can use this to choose a different delimiter from what toString() uses.
      Parameters:
      builder - a StringBuilder that will be modified in-place and returned
      delimiter - the String that separates every pair of integers in the result
      Returns:
      the given StringBuilder, after modifications
    • appendTo

      public StringBuilder appendTo(StringBuilder builder)
      Given a StringBuilder, this appends the toString() representation of this CharBitSet, without allocating a String. This includes the opening [ and closing ] chars; it uses ", " as its delimiter.
      Parameters:
      builder - a StringBuilder that will be modified in-place and returned
      Returns:
      the given StringBuilder, after modifications
    • appendTo

      public <S extends CharSequence & Appendable> S appendTo(S sb, String separator, boolean brackets, CharAppender appender)
      Appends to a StringBuilder from the contents of this PrimitiveCollection, but uses the given CharAppender to convert each item to a customizable representation and append them to a StringBuilder. To use the default String representation, you can use CharAppender.DEFAULT as an appender.
      Specified by:
      appendTo in interface PrimitiveCollection.OfChar
      Type Parameters:
      S - any type that is both a CharSequence and an Appendable, such as StringBuilder, StringBuffer, CharBuffer, or CharList
      Parameters:
      sb - a StringBuilder that this can append to
      separator - how to separate items, such as ", "
      brackets - true to wrap the output in square brackets, or false to omit them
      appender - a function that takes a StringBuilder and an int, and returns the modified StringBuilder
      Returns:
      sb, with the appended items of this PrimitiveCollection
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • toJavaCode

      public String toJavaCode()
      A convenience method that returns a String of Java source that constructs this CharBitSet directly from its raw bits, without any extra steps involved.
      This is intended to allow tests on one platform to set up CharBitSet values that store the results of some test, such as Character.isLetter(char), and to load those results on any platform without having to recalculate the results (potentially with incorrect results on other platforms). Notably, GWT doesn't calculate many Unicode queries correctly (at least according to their JVM documentation), and this can store their results for a recent Unicode version by running on the most recent desktop JDK, and storing to be loaded on other platforms. Some already-calculated bit sets are available in CharPredicates.
      Returns:
      a String of Java code that can be used to construct an exact copy of this CharBitSet
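      To give a feel for what "Java source that constructs this CharBitSet directly from its raw bits" could look like, here is a hypothetical generator. The exact text that toJavaCode() emits is not shown on this page, so the format below is an assumption; it only relies on the CharBitSet(int[] ints, boolean useAsRawBits) constructor documented above. JavaCodeSketch is an illustrative name.

```java
/** Hypothetical sketch of turning raw bits into constructor source text. */
final class JavaCodeSketch {
    /** Builds source text that passes the given ints to the raw-bits constructor. */
    static String toJavaCode(int[] rawBits) {
        StringBuilder sb = new StringBuilder("new CharBitSet(new int[]{");
        for (int i = 0; i < rawBits.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(rawBits[i]);   // decimal form; negative values are valid int literals too
        }
        return sb.append("}, true)").toString();
    }
}
```

      Pasting such generated text into source code avoids recomputing the predicate on a platform that might answer differently.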
    • with

      public static CharBitSet with(char index)
      Static builder for a CharBitSet; this overload does not allocate an array for the index/indices, but only takes one index.
      Parameters:
      index - the one position to place in the built bit set; must be non-negative
      Returns:
      a new CharBitSet with the given item
    • with

      public static CharBitSet with(char... chars)
      Static builder for a CharBitSet; this overload allocates an array for the indices unless given an array already, and can take many indices.
      Parameters:
      chars - the positions to place in the built bit set; must be non-negative
      Returns:
      a new CharBitSet with the given items
    • parse

      public static CharBitSet parse(String str, String delimiter)
      Calls parse(String, String, boolean) with brackets set to false.
      Parameters:
      str - a String that will be parsed in full
      delimiter - the delimiter between items in str
      Returns:
      a new collection parsed from str
    • parse

      public static CharBitSet parse(String str, String delimiter, boolean brackets)
      Creates a new collection and fills it by calling PrimitiveCollection.OfChar.addLegible(String, String, int, int) on either all of str (if brackets is false) or str without its first and last chars (if brackets is true). Each item is expected to be separated by delimiter.
      Parameters:
      str - a String that will be parsed in full (depending on brackets)
      delimiter - the delimiter between items in str
      brackets - if true, the first and last chars in str will be ignored
      Returns:
      a new collection parsed from str
    • parse

      public static CharBitSet parse(String str, String delimiter, int offset, int length)
      Creates a new collection and fills it by calling PrimitiveCollection.OfChar.addLegible(String, String, int, int) with the given four parameters as-is.
      Parameters:
      str - a String that will have the given section parsed
      delimiter - the delimiter between items in str
      offset - the first position to parse in str, inclusive
      length - how many chars to parse, starting from offset
      Returns:
      a new collection parsed from str