[plug] importing a large text database for fast search

Michael Van Delft michael at hybr.id.au
Mon Sep 5 08:34:20 WST 2011


You could also look at ack (http://betterthangrep.com). I can't vouch
for it, because I've never used it on really large files, but it claims
to be faster than grep.
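
On Onno's suggestion (quoted below) of converting everything to one case
first: here is a minimal sketch, assuming the data is UTF-8 text with one
record per line. It is in Python purely for illustration, and the function
names and file paths are made up.

```python
def fold_case(src_path, dst_path):
    """Write a lower-cased copy of src_path to dst_path, line by line."""
    with open(src_path, encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(line.lower())


def search(folded_path, term):
    """Return (line_number, line) pairs whose text contains term.

    The file is already lower-cased, so only the search term needs
    folding and the comparison itself is a plain case-sensitive one.
    """
    needle = term.lower()
    hits = []
    with open(folded_path, encoding="utf-8") as folded:
        for line_no, line in enumerate(folded, start=1):
            if needle in line:
                hits.append((line_no, line.rstrip("\n")))
    return hits
```

The folding is paid for once, and the line numbers still match the
original file, so hits can be mapped back to the unfolded text. The folded
copy can also be searched with an ordinary case-sensitive grep.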

On Mon, Sep 5, 2011 at 6:52 AM, Onno Benschop <onno at itmaze.com.au> wrote:
> I know you've said that grep is slow, but in my experience that's only
> really true for case-insensitive (-i) searching.
>
> Have you tested a case-sensitive search?
>
> If that speed is acceptable, perhaps converting the whole lot to one case
> and searching that might be a whole lot simpler.
>
> Also, consider your file system mount options, such as whether access
> times are updated on every read (atime vs. noatime).
>
> Onno Benschop, ITmaze
>
> On 04/09/2011 12:56 PM, "Michael Holland" <michael.holland at gmail.com> wrote:
>> Thanks folks. Perl frontend with SQL database it is then.
>> I'll go look at DBD::SQLite and DBIx::Class. My Perl is rusty, but I like
>> it.
>>
>> On Fri, Sep 2, 2011 at 4:30 PM, Alexander Hartner <alex at j2anywhere.com>
>> wrote:
>>> When importing large amounts of data there are some things to consider
>>> which are common to all databases.
>>>
>>> If you are using individual insert statements, you will want to disable
>>> AUTOCOMMIT and manually commit every X thousand entries, as well as use a
>>> prepared statement so that the SQL statement is not parsed over and over
>>> again.
>>>
>>> If you are not going to use insert statements and opt for something like
>>> PostgreSQL's COPY or DB2's LOAD command, you should get much better
>>> performance for the import. Some of these (DB2's LOAD, for example) import
>>> the data without firing triggers or checking referential integrity;
>>> PostgreSQL's COPY still applies both but avoids the per-statement overhead.
>>> You should have a look at the options offered by the database engine you
>>> decide on.
>>>
>>> Regards
>>> Alex
>>>
>> _______________________________________________
>> PLUG discussion list: plug at plug.org.au
>> http://lists.plug.org.au/mailman/listinfo/plug
>> Committee e-mail: committee at plug.org.au
>> PLUG Membership: http://www.plug.org.au/membership
>
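
As a rough illustration of the batching advice in Alex's message above
(no autocommit, one parameterised statement, a commit every few thousand
rows), here is a sketch using Python's built-in sqlite3 module as a
stand-in for Perl's DBD::SQLite. The table name, column names and batch
size are assumptions made for the example, not anything from the thread.

```python
import sqlite3


def bulk_insert(conn, rows, batch_size=5000):
    """Insert (line_no, body) rows, committing once per batch_size rows.

    By default sqlite3 opens a transaction before the first INSERT and
    keeps it open until commit(), so rows are not committed one by one.
    The SQL string never changes, so the module can reuse its compiled
    statement instead of parsing it again for every row.
    """
    sql = "INSERT INTO lines (line_no, body) VALUES (?, ?)"
    cur = conn.cursor()
    total = 0
    for row in rows:
        cur.execute(sql, row)
        total += 1
        if total % batch_size == 0:
            conn.commit()
    conn.commit()  # flush the final partial batch
    return total


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lines (line_no INTEGER PRIMARY KEY, body TEXT)")
inserted = bulk_insert(conn, ((i, "line %d" % i) for i in range(1, 12001)))
```

A larger batch means fewer commits but more work lost if the import is
interrupted part way through, so the right figure depends on the data and
the engine.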


