Class: Ferret::Analysis::AsciiLetterAnalyzer

Summary

An AsciiLetterAnalyzer creates a TokenStream that splits the input up into maximal strings of ASCII letters. If implemented in Ruby it would look like:

  class AsciiLetterAnalyzer
    def initialize(lower = true)
      @lower = lower
    end

    def token_stream(field, str)
      if @lower
        return AsciiLowerCaseFilter.new(AsciiLetterTokenizer.new(str))
      else
        return AsciiLetterTokenizer.new(str)
      end
    end
  end

As you can see, it makes use of the AsciiLetterTokenizer and AsciiLowerCaseFilter. Note that this analyzer won't recognize non-ASCII characters, so you should use the LetterAnalyzer if you want to analyze multi-byte data like "UTF-8".
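To illustrate the behavior described above without requiring Ferret itself, here is a rough pure-Ruby approximation. It is only a sketch: the helper name ascii_letter_tokens is hypothetical and not part of Ferret, and it returns plain strings rather than a TokenStream. It assumes that a token is a maximal run of the letters A-Z and a-z, and that lowercasing is applied afterwards when requested.

```ruby
# Approximation of the analyzer's tokenization; NOT Ferret's actual code.
# ascii_letter_tokens is a hypothetical helper introduced for illustration.
def ascii_letter_tokens(str, lower = true)
  # Each match is a maximal run of ASCII letters; everything else
  # (digits, punctuation, whitespace, non-ASCII characters) separates tokens.
  tokens = str.scan(/[A-Za-z]+/)
  lower ? tokens.map(&:downcase) : tokens
end

ascii_letter_tokens("Hello, World! It's 2024")  # => ["hello", "world", "it", "s"]
ascii_letter_tokens("Hello, World!", false)     # => ["Hello", "World"]
```

Under these assumptions, digits and apostrophes act as separators, which is why "It's" yields two tokens and "2024" yields none.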

Public Class Methods


AsciiLetterAnalyzer.new(lower = true) → analyzer

Create a new AsciiLetterAnalyzer which downcases tokens by default but can optionally leave case as is. Lowercasing will only be done to ASCII characters.

lower: set to false if you don't want the field's tokens to be downcased
/*
 *  call-seq:
 *     AsciiLetterAnalyzer.new(lower = true) -> analyzer
 *
 *  Create a new AsciiLetterAnalyzer which downcases tokens by default
 *  but can optionally leave case as is. Lowercasing will only be done to
 *  ASCII characters.
 *
 *  lower:: set to false if you don't want the field's tokens to be downcased
 */
static VALUE
frt_a_letter_analyzer_init(int argc, VALUE *argv, VALUE self)
{
    Analyzer *a;
    GET_LOWER(true);
    a = letter_analyzer_new(lower);
    Frt_Wrap_Struct(self, NULL, &frt_analyzer_free, a);
    object_add(a, self);
    return self;
}