Class: Ferret::Analysis::AsciiStandardTokenizer

Summary

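The ASCII standard tokenizer is an advanced tokenizer that splits most words correctly and also handles tokens such as email addresses, web addresses, and phone numbers. It works on ASCII text only: as the example below shows, non-ASCII characters such as "é" are dropped and split the word they appear in.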

Example

  "Dave's résumé, at http://www.davebalmain.com/ 1234"
    => ["Dave's", "r", "sum", "at", "http://www.davebalmain.com", "1234"]
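
The behavior in this example can be imitated with a rough pure-Ruby sketch. This is not Ferret's implementation. The names ASCII_TOKEN and ascii_tokens are hypothetical, and the patterns are simplified assumptions chosen to reproduce the example above. The real tokenizer recognizes more token types.

```ruby
# Rough approximation of ASCII-only tokenization.
# NOT Ferret's implementation: ASCII_TOKEN and ascii_tokens are
# hypothetical names, and the patterns below are simplified assumptions.
ASCII_TOKEN = %r{
    [a-z]+://[A-Za-z0-9.\-]+(?:/[A-Za-z0-9._~%\-]+)*        # web address
  | [A-Za-z0-9._+\-]+@[A-Za-z0-9\-]+(?:\.[A-Za-z0-9\-]+)+  # email address
  | [A-Za-z0-9]+(?:'[A-Za-z]+)*                            # word or number
}x

# Returns every ASCII token found in text. Characters that match none of
# the patterns (punctuation, whitespace, non-ASCII letters) act as separators.
def ascii_tokens(text)
  text.scan(ASCII_TOKEN)
end

p ascii_tokens("Dave's résumé, at http://www.davebalmain.com/ 1234")
# => ["Dave's", "r", "sum", "at", "http://www.davebalmain.com", "1234"]
```

Because "é" matches none of the ASCII character classes, "résumé" breaks into "r" and "sum", which mirrors the output shown above. For text with accented or other multi-byte characters, a tokenizer that is not limited to ASCII is likely the better choice.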

Public Class Methods


AsciiStandardTokenizer.new(str) → tokenizer

Creates a new AsciiStandardTokenizer that tokenizes the string str.

/*
 *  call-seq:
 *     AsciiStandardTokenizer.new(str) -> tokenizer
 *
 *  Creates a new AsciiStandardTokenizer that tokenizes the string +str+.
 */
static VALUE
frb_a_standard_tokenizer_init(VALUE self, VALUE rstr)
{
    return get_wrapped_ts(self, rstr, standard_tokenizer_new());
}