Class: Ferret::Analysis::WhiteSpaceAnalyzer

Summary

The WhiteSpaceAnalyzer recognizes tokens as maximal strings of non-whitespace characters. If implemented in Ruby, the WhiteSpaceAnalyzer would look like this:

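  class WhiteSpaceAnalyzer
    # lower defaults to false, matching WhiteSpaceAnalyzer.new(lower = false)
    def initialize(lower = false)
      @lower = lower
    end

    # The field name is ignored; every field is tokenized the same way.
    def token_stream(field, str)
      WhiteSpaceTokenizer.new(str, @lower)
    end
  end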

As you can see, it delegates all tokenization to the WhiteSpaceTokenizer.
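
To make "maximal strings of non-whitespace characters" concrete, here is a minimal pure-Ruby sketch of that splitting rule. It does not use Ferret: the names SketchToken and sketch_whitespace_tokens are hypothetical, the offsets are character offsets rather than whatever Ferret reports, and String#downcase stands in for the locale-based lowercasing the real analyzer performs.

```ruby
# Hypothetical illustration only; not part of the Ferret API.
SketchToken = Struct.new(:text, :start_offset, :end_offset)

# Split str into maximal runs of non-whitespace characters.
# When lower is true, each token's text is downcased (assumption:
# String#downcase approximates Ferret's locale-based lowercasing).
def sketch_whitespace_tokens(str, lower = false)
  tokens = []
  str.scan(/\S+/) do
    match = Regexp.last_match
    text = lower ? match[0].downcase : match[0]
    tokens << SketchToken.new(text, match.begin(0), match.end(0))
  end
  tokens
end

sketch_whitespace_tokens("Quick  Brown fox", true).map(&:text)
# => ["quick", "brown", "fox"]
```

Punctuation is not treated specially in this sketch: a string such as "foo-BAR," comes out as a single token, because only whitespace separates tokens.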

Public Class Methods


WhiteSpaceAnalyzer.new(lower = false) → analyzer

Create a new WhiteSpaceAnalyzer which leaves token case as is by default but can optionally downcase tokens. Lowercasing is done based on the current locale.

lower: set to true if you want the field's tokens to be downcased

/*
 *  call-seq:
 *     WhiteSpaceAnalyzer.new(lower = false) -> analyzer
 *
 *  Create a new WhiteSpaceAnalyzer which leaves token case as is by default
 *  but can optionally downcase tokens. Lowercasing will be done based on the
 *  current locale.
 *
 *  lower:: set to true if you want the field's tokens to be downcased
 */
static VALUE
frt_white_space_analyzer_init(int argc, VALUE *argv, VALUE self)
{
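    Analyzer *a;
    /* read the optional `lower` argument, defaulting to false */
    GET_LOWER(false);
#ifndef POSH_OS_WIN32
    /* on non-Windows platforms, pick up LC_CTYPE from the environment once */
    if (!frt_locale) frt_locale = setlocale(LC_CTYPE, "");
#endif
    a = mb_whitespace_analyzer_new(lower);
    /* attach the C analyzer to the Ruby object, with its free function */
    Frt_Wrap_Struct(self, NULL, &frt_analyzer_free, a);
    object_add(a, self);
    return self;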
}