Changes between Version 1 and Version 2 of FerretVsLucene


Timestamp:
10/11/06 00:53:31 (4 years ago)
Author:
dbalmain
  • FerretVsLucene

    v1 v2  
    55*Disclaimer*: These benchmarks were written by myself, the developer of Ferret, so they may be slightly biased. I have submitted them to the Lucene mailing list so that Lucene developers can check the fairness of these benchmarks. The numbers below are in no way an indication of the quality of either library. Lucene is currently a lot more stable than Ferret. The reason I have run these benchmarks against Lucene is that Ferret was originally ported from Lucene and is still very strongly influenced by that library. I also believe Lucene is the gold standard for information retrieval libraries. 
    66 
    7 All suggestions/comments/critiques are most welcome and should be directed to myself at dbalmain.ml at gmail.com. 
     7All suggestions/comments/critiques are most welcome and should be directed to myself at dbalmain.ml at gmail.com. Alternatively just add your comments to this page. 
    88 
    99[http://lucene.apache.org/java/docs/index.html Apache Lucene] can be downloaded [http://www.apache.org/dyn/closer.cgi/lucene/java/ here]. 
     
    2121== Indexing performance == 
    2222 
    23 For the indexing benchmark we need to look at a few different situations. Most importantly, the following benchmarks look at performance both when storing the field with term vectors and when storing neither the field nor term vectors. They also have options for reopening the IndexWriter at regular intervals. I use a !WhiteSpaceAnalyzer so that analysis time will have little effect on the results. Here are the indexing benchmarking programs: 
     23For the indexing benchmark we need to look at a few different situations. Most importantly, the following benchmarks look at performance both when storing the field with term vectors and when storing neither the field nor term vectors. They also have options for reopening the !IndexWriter at regular intervals. I use a !WhiteSpaceAnalyzer so that analysis time will have little effect on the results. Here are the indexing benchmarking programs: 
    2424 
    2525  * LuceneIndexingBenchmarker 
    2626  * FerretIndexingBenchmarker 
    2727 
    28 I run each test 6 times and throw away the top and bottom results, so the HotSpot warmup should have no effect on the Lucene results. 
     28I run each test 6 times and throw away the top and bottom results, so the !HotSpot warmup should have no effect on the Lucene results. 
    2929 
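The truncated mean described above can be sketched as follows. This is a minimal illustration, not the code of the benchmark programs themselves: it sorts the timings, discards the fastest and slowest run, and averages what is left. The method name `truncated_mean` and its `discard` parameter are assumptions made for this example.

```ruby
# Truncated (trimmed) mean: sort the timings, drop the `discard` lowest and
# the `discard` highest values, then average the remaining ones.
# `truncated_mean` and `discard` are hypothetical names for this sketch.
def truncated_mean(times, discard = 1)
  if times.size <= 2 * discard
    raise ArgumentError, "need more than #{2 * discard} samples"
  end

  kept = times.sort[discard...(times.size - discard)]
  kept.sum.to_f / kept.size
end

# The six Lucene timings from the unstored, no-term-vector run on this page.
lucene_secs = [37.96, 24.17, 23.19, 22.43, 21.23, 21.86]
puts format("Truncated mean (4 kept, 2 discarded): %.2f secs",
            truncated_mean(lucene_secs))
```

With six repetitions and `discard = 1`, four timings are kept, which matches the "4 kept, 2 discarded" lines in the benchmark output on this page. Dropping the slowest run is what removes the first, warmup-heavy Lucene repetition from the average.
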
    3030=== Unstored Without Term-Vectors === 
    3131 
    32 to be continued. 
     32{{{ 
     33dbalmain@ubuntu:~/sandpit/benchmarks $ java -classpath lucene-core-2.0.0.jar:. -server -Xmx500M -XX:CompileThreshold=100 LuceneIndexer -reps 6 
     34--------------------------------------------------- 
     351   Secs: 37.96  Docs: 19043 
     362   Secs: 24.17  Docs: 19043 
     373   Secs: 23.19  Docs: 19043 
     384   Secs: 22.43  Docs: 19043 
     395   Secs: 21.23  Docs: 19043 
     406   Secs: 21.86  Docs: 19043 
     41--------------------------------------------------- 
     42Lucene 2.0.0 
     43JVM 1.5.0_06 (Sun Microsystems Inc.) 
     44Linux 2.6.15-27-386 i386 
     45Mean: 25.14 secs 
     46Truncated mean (4 kept, 2 discarded): 22.91 secs 
     47--------------------------------------------------- 
     48dbalmain@ubuntu:~/sandpit/benchmarks $ ruby ferret_indexer.rb --reps 6 
     49------------------------------------------------------------ 
     500  Secs: 6.18  Docs: 19043 
     511  Secs: 6.37  Docs: 19043 
     522  Secs: 7.25  Docs: 19043 
     533  Secs: 6.15  Docs: 19043 
     544  Secs: 6.15  Docs: 19043 
     555  Secs: 6.23  Docs: 19043 
     56------------------------------------------------------------ 
     57Mean 6.39 secs 
     58Truncated Mean (4 kept, 2 discarded): 6.23 secs 
     59------------------------------------------------------------ 
     60}}} 
     61 
     62=== Stored Without Term-Vectors === 
     63{{{ 
     64dbalmain@ubuntu:~/sandpit/benchmarks $ java -classpath lucene-core-2.0.0.jar:. -server -Xmx500M -XX:CompileThreshold=100 LuceneIndexer -reps 6 -store 1 
     65--------------------------------------------------- 
     661   Secs: 53.70  Docs: 19043 
     672   Secs: 37.56  Docs: 19043 
     683   Secs: 36.50  Docs: 19043 
     694   Secs: 34.90  Docs: 19043 
     705   Secs: 41.11  Docs: 19043 
     716   Secs: 34.32  Docs: 19043 
     72--------------------------------------------------- 
     73Lucene 2.0.0 
     74JVM 1.5.0_06 (Sun Microsystems Inc.) 
     75Linux 2.6.15-27-386 i386 
     76Mean: 39.68 secs 
     77Truncated mean (4 kept, 2 discarded): 37.52 secs 
     78--------------------------------------------------- 
     79dbalmain@ubuntu:~/sandpit/benchmarks $ ruby ferret_indexer.rb --reps 6 --store 
     80------------------------------------------------------------ 
     810  Secs: 12.47  Docs: 19043 
     821  Secs: 13.59  Docs: 19043 
     832  Secs: 12.50  Docs: 19043 
     843  Secs: 12.44  Docs: 19043 
     854  Secs: 12.60  Docs: 19043 
     865  Secs: 12.81  Docs: 19043 
     87------------------------------------------------------------ 
     88Mean 12.74 secs 
     89Truncated Mean (4 kept, 2 discarded): 12.60 secs 
     90------------------------------------------------------------ 
     91}}}