Ticket #340 (closed enhancement: fixed)

Opened 2 years ago

Last modified 3 months ago

[PATCH] Batch processing functions

Reported by: francois.lagunas@…
Owned by: somebody
Priority: major
Milestone: milestone2
Component: component1
Version: 2.0
Keywords: batch processing
Cc:

Description

This patch adds new functions to batch update and batch delete documents in a Ferret index.
They are modeled directly on their single-document versions.
The main advantage is that locking and committing costs (mainly file system access) are shared across a whole set of documents. In practice, batch updating a few thousand documents at a time leads to a 10x speed-up in indexing.
This can be used in the acts_as_ferret Rails plugin to speed up re-indexing and other operations: http://projects.jkraemer.net/acts_as_ferret/ticket/202

Francois Lagunas
Scientific Director, Dailymotion
 http://www.tourteaser.com

Attachments

batch_processing.diff (8.2 KB) - added by francois.lagunas@… 2 years ago.
Patch for batch processing

Change History

Changed 2 years ago by francois.lagunas@…

Patch for batch processing

Changed 2 years ago by dbalmain

  • status changed from new to closed
  • resolution set to fixed

Applied patch and then made significant modifications. Here are some examples.

To batch delete documents you can use the IndexWriter#delete method;

@index_writer.delete(:id, ['12', '34', '123'])

You can also batch delete with the Ferret::Index::Index#delete method;

# The field used is whatever the :id_field was set to
@index.delete(['12', '34', '123'])

# You can also batch delete by Ferret document number
@index.delete([12, 34, 123])

To batch update you need to use the Index#batch_update method;

@index.batch_update([
    {:id => 234, :content => 'yada yada yada'},
    {:id => 897, :content => 'blah blah blah'},
    {:id => 932, :content => 'nani nani nani'}
  ])
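
As a rough illustration of why batching helps, here is a minimal sketch that feeds a large record set to batch_update in slices. CountingIndex is a hypothetical stand-in written for this example, not Ferret's implementation: it assumes one lock/commit cycle per call and only counts those cycles. The slice size of 1000 is likewise an arbitrary choice.

```ruby
# Hypothetical stand-in for an index; NOT Ferret's implementation.
# Assumption: every call costs exactly one lock/commit cycle.
class CountingIndex
  attr_reader :commits, :docs

  def initialize
    @commits = 0
    @docs = {}
  end

  # Single-document update: one lock/commit cycle per document.
  def update(id, doc)
    @docs[id] = doc
    @commits += 1
  end

  # Batch update: one lock/commit cycle for the whole batch.
  def batch_update(batch)
    batch.each { |doc| @docs[doc[:id]] = doc }
    @commits += 1
  end
end

records = (1..5000).map { |i| { :id => i, :content => "document #{i}" } }

# Update documents one at a time.
one_by_one = CountingIndex.new
records.each { |doc| one_by_one.update(doc[:id], doc) }

# Update the same documents in slices of 1000.
batched = CountingIndex.new
records.each_slice(1000) { |slice| batched.batch_update(slice) }

puts "single updates:  #{one_by_one.commits} lock/commit cycles"
puts "batched updates: #{batched.commits} lock/commit cycles"
```

With a real index the same each_slice loop would presumably call @index.batch_update(slice); slicing keeps each batch at a bounded size while still sharing the per-commit cost across many documents.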
