Ticket #68 (closed defect: wontfix)
Exponential behaviour in StandardTokenizer
| Reported by: | anonymous | Owned by: | somebody |
|---|---|---|---|
| Priority: | critical | Milestone: | |
| Component: | component1 | Version: | |
| Keywords: | | Cc: | |
Description
The regular expression used in StandardTokenizer shows exponential behaviour on simple strings: every underscore added to the example below doubles the processing time.
To reproduce:
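def test_lots_of_underscore
  sa = StandardAnalyzer.new
  input = "_________________________"  # 25 underscores
  t = sa.token_stream("field", input)
  start = Time.now
  # Walk the whole stream so the tokenizer actually runs.
  t.each do |token|
    puts token.text
  end
  # Fail if tokenizing this short input takes a second or more.
  assert Time.now - start < 1, "tokenizing taking too long"
end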
See http://www.codinghorror.com/blog/archives/000488.html for more information.
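A rough sketch of why one extra underscore can double the work, assuming the tokenizer's pattern contains a nested quantifier along the lines of `(_+)+` followed by something that fails to match. The actual pattern is not shown in this ticket, so that pattern and the helper name `count_splits` are hypothetical. When such a match fails, a backtracking engine may try every way of splitting the run of underscores into non-empty groups, and the number of such splits doubles with each added character:

```ruby
# Count the ways a run of n characters can be split into one or more
# non-empty groups. For a hypothetical nested pattern like /(_+)+x/, these
# are the candidate paths a naive backtracking engine may try before it
# gives up on a string of n underscores.
def count_splits(n, memo = {})
  return 1 if n.zero?

  # Pick the length of the first group, then split whatever is left.
  memo[n] ||= (1..n).sum { |first| count_splits(n - first, memo) }
end

(1..8).each do |n|
  puts "#{n} underscores: #{count_splits(n)} splits"
end
```

Under that assumption the count is 2^(n-1), so the 25 underscores in the test input above would allow 2^24 = 16,777,216 splits. This only counts paths; it does not measure the real tokenizer.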
