Class CsvFormatDetector

    • Constructor Summary

      Constructors 
      Constructor Description
      CsvFormatDetector​(int maxRowSamples, CsvParserSettings settings, int whitespaceRangeStart)
      Builds a new CsvFormatDetector
    • Method Summary

      Modifier and Type Method Description
      (package private) abstract void apply​(char delimiter, char quote, char quoteEscape)
      Applies the discovered CSV format elements to the CsvParser
      private java.util.Map<java.lang.Character,​java.lang.Integer> calculateTotals​(java.util.List<java.util.Map<java.lang.Character,​java.lang.Integer>> symbolsPerRow)  
      void execute​(char[] characters, int length)
      Analyzes a sequence of characters from the input buffer.
      private char getChar​(java.util.Map<java.lang.Character,​java.lang.Integer> map, java.util.Map<java.lang.Character,​java.lang.Integer> totals, char defaultChar, boolean min)
      Returns the character with the highest or lowest associated number.
      private static void increment​(java.util.Map<java.lang.Character,​java.lang.Integer> map, char symbol)
      Increments the number associated with a character in a map by 1
      private static void increment​(java.util.Map<java.lang.Character,​java.lang.Integer> map, char symbol, int incrementSize)
      Increments the number associated with a character in a map
      private boolean isAllowedDelimiter​(char ch)  
      private boolean isSymbol​(char ch)  
      private char max​(java.util.Map<java.lang.Character,​java.lang.Integer> map, java.util.Map<java.lang.Character,​java.lang.Integer> totals, char defaultChar)
      Returns the character with the highest associated number.
      private char min​(java.util.Map<java.lang.Character,​java.lang.Integer> map, java.util.Map<java.lang.Character,​java.lang.Integer> totals, char defaultChar)
      Returns the character with the lowest associated number.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • MAX_ROW_SAMPLES

        private final int MAX_ROW_SAMPLES
      • comment

        private final char comment
      • suggestedDelimiter

        private final char suggestedDelimiter
      • normalizedNewLine

        private final char normalizedNewLine
      • whitespaceRangeStart

        private final int whitespaceRangeStart
      • allowedDelimiters

        private char[] allowedDelimiters
      • delimiterPreference

        private char[] delimiterPreference
    • Constructor Detail

      • CsvFormatDetector

        CsvFormatDetector​(int maxRowSamples,
                          CsvParserSettings settings,
                          int whitespaceRangeStart)
        Builds a new CsvFormatDetector
        Parameters:
        maxRowSamples - the number of row samples to collect before analyzing the statistics
        settings - the configuration provided by the user with potential defaults in case the detection is unable to discover the proper column delimiter or quote character.
        whitespaceRangeStart - starting range of characters considered to be whitespace.
    • Method Detail

      • calculateTotals

        private java.util.Map<java.lang.Character,​java.lang.Integer> calculateTotals​(java.util.List<java.util.Map<java.lang.Character,​java.lang.Integer>> symbolsPerRow)
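        This page gives no description for calculateTotals. Judging only by its name and signature, one plausible reading is that it sums the per-row symbol counts into a single map. The sketch below illustrates that reading; the class name CalculateTotalsSketch and the method body are assumptions, not the library's code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CalculateTotalsSketch {

    // Assumed behavior: add up, per character, the counts found in every row.
    static Map<Character, Integer> calculateTotals(List<Map<Character, Integer>> symbolsPerRow) {
        Map<Character, Integer> totals = new HashMap<>();
        for (Map<Character, Integer> rowSymbols : symbolsPerRow) {
            for (Map.Entry<Character, Integer> entry : rowSymbols.entrySet()) {
                // merge() stores the value if the key is new, otherwise sums it with the existing one.
                totals.merge(entry.getKey(), entry.getValue(), Integer::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        Map<Character, Integer> row1 = new HashMap<>();
        row1.put(',', 2);
        row1.put(';', 1);
        Map<Character, Integer> row2 = new HashMap<>();
        row2.put(',', 2);

        List<Map<Character, Integer>> rows = new ArrayList<>();
        rows.add(row1);
        rows.add(row2);

        Map<Character, Integer> totals = calculateTotals(rows);
        System.out.println(totals.get(',')); // prints 4
        System.out.println(totals.get(';')); // prints 1
    }
}
```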
      • execute

        public void execute​(char[] characters,
                            int length)
        Description copied from interface: InputAnalysisProcess
        Analyzes a sequence of characters from the input buffer.
        Specified by:
        execute in interface InputAnalysisProcess
        Parameters:
        characters - the input buffer
        length - the last character position loaded into the buffer.
      • increment

        private static void increment​(java.util.Map<java.lang.Character,​java.lang.Integer> map,
                                      char symbol)
        Increments the number associated with a character in a map by 1
        Parameters:
        map - the map of characters and their numbers
        symbol - the character whose number should be incremented
      • increment

        private static void increment​(java.util.Map<java.lang.Character,​java.lang.Integer> map,
                                      char symbol,
                                      int incrementSize)
        Increments the number associated with a character in a map
        Parameters:
        map - the map of characters and their numbers
        symbol - the character whose number should be incremented
        incrementSize - the size of the increment
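        The two increment overloads describe a per-character counter kept in a map. A minimal sketch of that idea follows. The class name IncrementSketch and the use of Map.merge are illustrative assumptions; the page does not show how the private helpers are actually written.

```java
import java.util.HashMap;
import java.util.Map;

public class IncrementSketch {

    // Hypothetical stand-in for increment(map, symbol, incrementSize).
    static void increment(Map<Character, Integer> map, char symbol, int incrementSize) {
        // Inserts incrementSize when the symbol is absent, otherwise adds it to the current count.
        map.merge(symbol, incrementSize, Integer::sum);
    }

    // Hypothetical stand-in for increment(map, symbol): same operation with a step of 1.
    static void increment(Map<Character, Integer> map, char symbol) {
        increment(map, symbol, 1);
    }

    public static void main(String[] args) {
        Map<Character, Integer> counts = new HashMap<>();
        increment(counts, ',');    // ',' -> 1
        increment(counts, ',', 2); // ',' -> 3
        increment(counts, ';');    // ';' -> 1
        System.out.println(counts.get(',')); // prints 3
        System.out.println(counts.get(';')); // prints 1
    }
}
```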
      • min

        private char min​(java.util.Map<java.lang.Character,​java.lang.Integer> map,
                         java.util.Map<java.lang.Character,​java.lang.Integer> totals,
                         char defaultChar)
        Returns the character with the lowest associated number.
        Parameters:
        map - the map of characters and their numbers
        defaultChar - the default character to return in case the map is empty
        Returns:
        the character associated with the lowest number.
      • max

        private char max​(java.util.Map<java.lang.Character,​java.lang.Integer> map,
                         java.util.Map<java.lang.Character,​java.lang.Integer> totals,
                         char defaultChar)
        Returns the character with the highest associated number.
        Parameters:
        map - the map of characters and their numbers
        defaultChar - the default character to return in case the map is empty
        Returns:
        the character associated with the highest number.
      • getChar

        private char getChar​(java.util.Map<java.lang.Character,​java.lang.Integer> map,
                             java.util.Map<java.lang.Character,​java.lang.Integer> totals,
                             char defaultChar,
                             boolean min)
        Returns the character with the highest or lowest associated number.
        Parameters:
        map - the map of characters and their numbers
        defaultChar - the default character to return in case the map is empty
        min - a flag indicating whether to return the character associated with the lowest number in the map. If false then the character associated with the highest number found will be returned.
        Returns:
        the character associated with the highest or lowest number, depending on the min flag.
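        The relationship between min, max and getChar can be sketched as a single scan over the map, with min and max as thin wrappers that fix the flag. This is an illustration under assumptions: the class name GetCharSketch is hypothetical, the totals parameter is left out because this page does not document its role, and ties are resolved here by keeping the first entry encountered, which may differ from the real implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class GetCharSketch {

    // Returns the key with the lowest (min == true) or highest (min == false) value,
    // or defaultChar when the map is empty.
    static char getChar(Map<Character, Integer> map, char defaultChar, boolean min) {
        char result = defaultChar;
        boolean found = false;
        int best = 0;
        for (Map.Entry<Character, Integer> entry : map.entrySet()) {
            int value = entry.getValue();
            // Strict comparison: on a tie the earlier entry is kept (an assumption of this sketch).
            if (!found || (min ? value < best : value > best)) {
                best = value;
                result = entry.getKey();
                found = true;
            }
        }
        return result;
    }

    static char min(Map<Character, Integer> map, char defaultChar) {
        return getChar(map, defaultChar, true);
    }

    static char max(Map<Character, Integer> map, char defaultChar) {
        return getChar(map, defaultChar, false);
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order so the scan order is predictable.
        Map<Character, Integer> counts = new LinkedHashMap<>();
        counts.put(',', 5);
        counts.put(';', 2);
        System.out.println(max(counts, '?')); // prints ,
        System.out.println(min(counts, '?')); // prints ;
        System.out.println(max(new LinkedHashMap<Character, Integer>(), '?')); // prints ?
    }
}
```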
      • isSymbol

        private boolean isSymbol​(char ch)
      • isAllowedDelimiter

        private boolean isAllowedDelimiter​(char ch)
      • apply

        abstract void apply​(char delimiter,
                            char quote,
                            char quoteEscape)
        Applies the discovered CSV format elements to the CsvParser
        Parameters:
        delimiter - the discovered delimiter character
        quote - the discovered quote character
        quoteEscape - the discovered quote escape character.
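        Because apply is abstract, a concrete subclass decides what to do with the detected characters. The sketch below shows only that callback shape. DetectorSketch, DetectedFormat, finish and detectWithFixedResult are hypothetical names invented for illustration; they are not part of the library, and the "detection" here is a fixed result rather than real analysis.

```java
public class ApplySketch {

    // Hypothetical stand-in that mirrors only the abstract apply(...) signature.
    static abstract class DetectorSketch {
        abstract void apply(char delimiter, char quote, char quoteEscape);

        // Hypothetical hook: hands a finished result to apply, as a detector might do after analysis.
        void finish(char delimiter, char quote, char quoteEscape) {
            apply(delimiter, quote, quoteEscape);
        }
    }

    // Hypothetical holder for the values passed to apply.
    static final class DetectedFormat {
        char delimiter;
        char quote;
        char quoteEscape;
    }

    static DetectedFormat detectWithFixedResult() {
        final DetectedFormat format = new DetectedFormat();
        DetectorSketch detector = new DetectorSketch() {
            @Override
            void apply(char delimiter, char quote, char quoteEscape) {
                // The subclass chooses where the discovered characters go.
                format.delimiter = delimiter;
                format.quote = quote;
                format.quoteEscape = quoteEscape;
            }
        };
        detector.finish(';', '"', '\\');
        return format;
    }

    public static void main(String[] args) {
        DetectedFormat format = detectWithFixedResult();
        System.out.println(format.delimiter);   // prints ;
        System.out.println(format.quote);       // prints "
        System.out.println(format.quoteEscape); // prints \
    }
}
```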