Speeding up small databases on Linux

ArticleCategory: [Choose a category for your article]

Applications

AuthorImage:[Here we need a little image form you]

[Photo of the Author]

TranslationInfo:[Author and translation history]

original in en Mark Nielsen 

AboutTheAuthor:[A small biography about the author]

The author works at The Computer Underground, Inc. as a Linux geek and enjoys doing silly things and making up silly projects, because hey, computers are SUPPOSED to be fun. Mark also works at ZING, www.genericbooks.com as a volunteer to help and promote free and open software and literature.

Abstract:[Here you write a little summary]

This article shows how to speed up a PostgreSQL database server using a RAMDISK.

ArticleIllustration:[This is the title picture for your article]

[Illustration]

ArticleBody:[The article body]

Introduction

Okay, what is RAMDISK and how do we use it to speed up a database server like PostgreSQL?

Ramdisk is basically memory that you pretend to use as a hard disk drive. Anytime you use a ramdisk, you basically are using memory and not your hard drive. There are advantages and disadvantages to this. Basically, the big advantage is that since you are using memory, whatever you are doing will be much faster since your hard drive is slower than your memory. The big disadvantage is that if you make changes to the database server and you reboot your computer, you will loose all your changes.

For the database server PostgreSQL, if you load all the databases in memory, you will increase its speed.

Setting up a ramdisk for /Test

To setup your one of your default ramdisks to the directory /Test, do this.

mkdir -p /Test
mkfs -t ext2 /dev/ram0
mount /dev/ram0 /Test
If this fails then you might have no ramdisk support compiled into the Kernel. CONFIG_BLK_DEV_RAM is the Kernel configuration option which you need to enable it.

The above gives you a ramdisk with available space of just under 4Mb. Have a look at the Ramdisk Article to see how you can change this to e.g 50Mb.

Setting up a ramdisk for PostgreSQL

Ideally, you would have two database servers. One which you can make you changes to, and the other which is a copy that exists in your ramdisk. To do this, you would use commands like "pg_dump" or "pg_dumpall".

NOTE: We assume you have set the ramdisk option in your lilo.conf file to be larger than the size of your database server. If you need to find the approximate size, issue this command "cd /var/lib/pgsql; du ".

However, to take your current postgresql server installed at "/var/lib/pgsql" and to put it into memory, do this,

        ### Stop the current postgresql server
/etc/rc.d/init.d/postgres stop
        ### rename the current directory
mv /var/lib/pgsql /var/lib/pgsql_main
        #### Create a directory to have our ramdisk on
mkdir -p /var/lib/pgsql_memory
 	#### change the ownership of the new directory to postgres or whatever
	#### the actual owner is. 
chown postgres /var/lib/pgsql_memory
	#### Make an alias or link to the original name, /var/lib/pgsql
ln -s /var/lib/pgsql_memory /var/lib/pgsql

	#### Format the ramdisk
mkfs -t ext2 /dev/ram0
	#### Mount the ramdisk to the postgresql directory
mount /dev/ram0 /var/lib/pgsql_memory

	#### Copy everything from the main directory into the ramdisk
tar -C /var/lib/pgsql_main -cp . | tar -C /var/lib/pgsql_memory -xp
        ### Start the current postgresql server
/etc/rc.d/init.d/postgres start


Comments on the speed

I have been trying to test the speed performance with and without putting postgresql in a ramdisk. Sometimes the speed performance increased by 50% when postgresql was in ramdisk, and other times it did not.

What I noticed is that when the data is cached by the system, the performance gain is not very much, about 10% or 20%. When there is lots of unique data being pumped into the database, the performance goes up dramatically.

The biggest advantage with the ramdisk trick is that you force the database to remain in memory and you avoid using your hard drive. If you don't use ramdisk, your database will probably get swapped out of cache eventually, and then when it gets used again, it will have to reload from the hard drive.

Comments

  1. For the Linux operating system, apparently once a file is read, it is kept in memory until that memory space needs to get used by something else. Thus, it caches it. Well, this might be a solution, except from practical experience, when my database server gets hit, I always hear the hard drive moving. Using a ramdisk forces the database server into memory making it so it doesn't get "swapped out". If you use your database server a lot, I would force it into memory.
  2. The advantages of using a ramdisk for your database is speed, clear and simple. For read-only databases, this is really cool.
  3. The disadvantage is that you have to know how to maintain it and if you make changes, you better make sure you backup the changes you made.

References

  1. How to use a Ramdisk for Linux by Mark Nielsen
  2. PostgreSQL homepage

mirror server hosted at Truenetwork, Russian Federation.