Saturday, April 6, 2013

Speedup mysql index using ssd disk

You probably ask yourself sometimes how you can boost MySQL indexes when you are working with BIG databases/tables.

Really the solution is here for a long time already. You can use SSD disks (also known as flash disks). These disks are not that big that traditional SATA hard drives but they provide a superior performance. If your database is rather big to be placed on SSD disk you can still place your database indexes onto it. So, your data will be saved on an old disk and the index will be placed in faster SSD disk.

If can be easily done.

1.    You need to create a directory on your SSD disk that will be used for mysql indexes. In addition you need to give to mysql service write permission to that directory.


mkdir /ssd/mysql-indexes
mkdir /ssd/mysql-indexes/my-big-db-name
chmod 700  /ssd/mysql-indexes
chown –R mysql:mysql /ssd/mysql-indexes


2.    OPTIONAL STEP:

Here is another tip shared with me by another MySQL expert. Add the following line to [mysqld] configuration section in mysql. It will speed up the index rebuild operation in a number of times if you work with MyISAM tables. You need to restart mysql server for the changes take effect.

[mysqld]
myisam_repair_threads=10


3.    Finally you can run the following query to alter location of the index files.

ALTER TABLE my-table INDEX DIRECTORY= '/ssd/mysql-indexes/my-big-db-name';


After that you will have to wait some time till MySQL indexes will be rebuild and you are ready to work with your faster database!

You are welcome to publish your benchmarks.

Best regards,

Yuli

Monday, February 25, 2013

Optimize MySQL COUNT (*) query

There is not magic solution to count table contents especially if you have a client that has a table with 40+ million records and you need to do the filtering using the 'OR' SQL statement.

Original query was the following:

SELECT COUNT(*) from table1 WHERE field1 IN ('val1','val2') OR field2 IN ('val3','val4');

First benchmark before doing optimization showed me results after 4 minutes.

My first step was to create a compound key (multi-column key) using the field1 and field2.
I created it as following:
ALTER TABLE table1 ADD INDEX `field1_field2_idx` (`field1`,`field2`);

After waiting for 3 hours (you remember I had a table with 40 million records) I was able to continue.

This new compound key does not gave me any performance increase because OR statement was used in the original query.

As some of you know "ORs are notoriously bad performers, because it splinters the execution path" (from: http://stackoverflow.com/questions/6551682/mysql-indexing-in-an-or-statement).

After some thinking I decided to get rid of the OR and create 2 queries instead.

The following 2 queries gave me the same result ( I had the sum up the results):

SELECT COUNT(*) from table1 WHERE field1 IN ('val1','val2') AND field2 NOT IN ('val3','val4');
SELECT COUNT(*) from table1 WHERE field2 IN ('val3','val4');

Now the results were ready after 2 minutes. The first query was almost instant because it used the compound key and the second query took all the time. It was because the index was missing for field2.

So, as a final step I had to add an another index on field2. I used the following command:

ALTER TABLE table1 ADD INDEX `field2_idx` (`field2`);

I had to waite another 3 hours for the index to be ready and finally:

JUST 5 SECS to count rows on 40+ million records' table result was achieved.


Hope you will find this information useful in your case.

Best regards,
Yuli



Thursday, October 4, 2012

3 times speed increase with static pages

Internet was born from static pages. People were creating websites with emacs and vi. With increased popularity of scripting languages and open source databases like MySQL and Postgresql, more and more developers started to use these advanced tools.

CMS systems were build. They made our lives much easier. Instead of altering file contents with editor, the user can simply go to the website admin panel and change page from their.

The speed of loading page generated in PHP with MySQL is much slower than a static page. In addition, on every page request CMS can generate tens or event hundreds of database requests.

With the advances of new web technologies like Jquery and HTML5 we have a chance to get back to the roots. We can create static websites that behaves the same as dynamic ones.

I would like to show you some benchmarks.

In my test I will fetch two small test pages. One with static content and another one generated with PHP and MySQL.

Contents of test.html

<html>
<head>
<title>hello world</title>
</head>
<body>
<h2>Hello world!</h2>
</body>
</html>


 Contents of test.php

<html>
<head>
<title>hello world</title>
</head>
<body>
<?php
$handler =  mysql_connect('127.0.0.1', 'user', 'password');
$result = mysql_query("SELECT 'Hello World!'");
$row = mysql_fetch_row($result);
print "<h2>$row[0]</h2>\n";
?>
</body>
</html>


In my test I used "ab" command to test web application speed. It stands for Apache Bench – a small application to perform quick test for website speed.

To test static page speed I use the following command:
ab -c 100 -n 10000 http://website.com/test.html

To test dynamic page speed I use the following command:
ab -c 100 -n 10000 http://website.com/test.php

In this test I got the following result. Static pages are at least 3 times faster than dynamic pages generated with PHP and MySQL.

In my production server, at average, ab was able to fetch almost 15k requests per second for test.html static page and 5k requests per second for test.php dynamic page.

Feel free to leave your benchmark results here.

Best regards,
Yuli

Sunday, September 30, 2012

Speedup mysql inserts

By default, when connected to MySQL server, every command is executed as a distinct transaction. As a result, after each INSERT SQL statement MYSQL saves data to the database and commits transaction. When your application inserts hundreds and thousands of records, your database is busy in saving new records all the time and you are loosing in speed.

The solution for this problem is very easy. You need to prevent MySQL from committing transaction after every INSERT and commit transaction for example every 50 records.

It can be easily done. You just need to issue "START TRANSACTION;" SQL statement, insert your records, and when the counter reaches 50 new records, do commit "COMMIT';

You can easily do it in a web based scripts and in stand alone applications.

For example:

START TRANSACTION;
INSERT INTO TABLE t1 (f1,f2,f3) values ('v1','v2','v3');
INSERT INTO TABLE t1 (f1,f2,f3) values ('v1','v2','v3');
INSERT INTO TABLE t1 (f1,f2,f3) values ('v1','v2','v3');
INSERT INTO TABLE t1 (f1,f2,f3) values ('v1','v2','v3');
-- more inserts

COMMIT;


After the COMMIT query all new records are flushed and written to disk.

This method gives in some cases up to 10x performance boost.

It can be applied to other types of databases too.


Thursday, September 27, 2012

Micro optimization or why code runs slower on 64 bit

One of my friends works for the company doing video processing. He told me that the company code runs slowly on 64 bit processor than on 32. It can be number of reasons. I would like to talk about the most important one.

The main problem is the size of the integer.

When your code runs on 32 bit processor you were using int for all integer numbers. When you compile the same code on 64 bit processor, the integer stays 32 bit.

  int i;

NOTE: The size of int variable in C++ stays 32 bit when code is compiled for 64bit processor or for 32 bit.

So, when you have the following code:

  unsigned char * data = new char[n];
  int i = 10;
  data[i] = 'a';

When the code is compiled and executed on 64 bit processor, the memory addressing is 64 bit too. As a result when doing simple and quick operation like data[i] which equals to offset_to_data+I is converted to offset_to_data + convert_to_64(i). The int i variable needs to be converted to 64 bit.

As a result, when working with memory all the int variables are converted to 64 bit. And eventually additional code is executed each time you access memory !


Solution.

The solution is very simple. You need to convert your code to use ssize_t  type variable instead of integers and use size_t type variable instead of unsigned int.

When the code is compiled on 32 bit processor the ssize_t type variable has the size of 32 bit.

When code is compiled on 64 bit processor, the ssize_t type  variable has the size of 64 bit.


Any problems?

One issue discovered when working with code that must compile on windows and linux. The problem is with printing these variables. Somehow Linux and Windows compilers use different standards when printing these numbers.

Here is solution:

#ifdef WIN32
#define SIZE_T_FORMAT "%Iu"
#define SSIZE_T_FORMAT "%Id"
#else
#define SIZE_T_FORMAT "%zd"
#define SSIZE_T_FORMAT "%zu"
#endif
 

ssize_t num = 1234;
printf("ssize_t number: " SIZE_T_FORMAT "\n", num);


Feel free to leave a comments/questions here.

Tuesday, September 25, 2012

Hello World Post

Hello !

This is my first post at this blog.

Enjoy your stay !