IBM would add further complications for MySQL

March 19th, 2009 by atomic

The rumoured acquisition of Sun by IBM makes sense on many levels. IBM has always been a strong proponent of business-friendly open source and is probably more invested in Java than Sun itself. The indirect impact on recently-acquired MySQL is a bit personal for me, however.

Many moons ago, I worked in the IBM software lab in Toronto where DB2 UDB is primarily developed. I learned a great deal there, but something stirred within me, a desire for freedom and flexibility in software. I took a risk and moved to California, working for a number of start ups, believing that open source database technology (and software in general) was the wave of the future, and I wanted to work in companies that were willing to use OSS software.

To say the least, many of my then-current and former IBM colleagues that had taken cushy jobs with big companies to administer large DB2 installations, or were on track to do so, thought I was crazy. MySQL was a toy database, and PostgreSQL was just a research project! In retrospect I feel it was the right decision.

So, here we are, years later, and IBM may now end up owning MySQL.

On the positive side, I have no doubt of IBM’s commitment to open source and that they will continue to support MySQL with the tremendous resources it can bring to bear. It would be quite ironic if the Toronto lab had some people shift to MySQL development!

However, if IBM’s previous acquisitions are any indication, they will almost certainly push MySQL as a free “entry-level” database to a larger pool of customers and smooth the way for eventual upgrading to more expensive enterprise products. This is not inherently a bad thing, but it could translate into poor support and release-inflation (i.e. beta = alpha, GA = beta, etc) as a way to encourage captive customers to just bite the bullet and upgrade to DB2.

IBM may be able to squeeze quite a bit of profit out of MySQL in this way from enterprise customers who are used to this kind of abuse, but there are significant challenges:

They will need to go out of their way to convince skeptics like myself that quality and stability will not be compromised as MySQL gets integrated into IBM’s product offerings, otherwise MySQL users without deep pockets may begin defecting to postgres rather than be stuck with a key part of their infrastructure poorly-supported.

Then there is the still-unresolved problem of Oracle owning the maker of MySQL’s only viable transactional engine, InnoDB, an issue I wrote about last year when Sun acquired MySQL. With the release of the new InnoDB as a downloadable “plug-in”, Oracle has made it clear that they want to track and find as many of MySQL’s customers as they can.

Most important is that IBM will need to maintain MySQL’s reputation as the DB of choice for new web sites and other budget-conscious startups. Mindshare is everything. This is inherently difficult to track and acheive, because the next twitter will be developed by a few clever people who start up something for fun and just download all the OSS tools they need, based on what others before them did. Right now, typically, that’s a standard LAMP stack. If people starting up sites today begin to question the “M” part of LAMP (eg. swap postgresql for mysql), this effect on MySQL’s reputation and stature won’t be obvious for quite a while but it will be difficult to reverse.

Posted in mysql | 2 Comments »

Some partition movement tricks with MySQL 5.1

January 21st, 2009 by atomic

I have been using partitioning in MySQL 5.1 since it has been in late beta, and with the odd bump I’ve been quite satisfied with its stability and ease of use. Perhaps the days of maintaining nasty application logic to maintain tables like table_yyyymm manually are finally over!

While some control freaks out there might still prefer the painful approach of manually keeping dated tables and using merge tables/views to link them together, you still have a lot of flexibility in how you manage individual partitions with the ALTER TABLE REORGANIZE/CHANGE/ADD PARTITION commands.

One trick i’ve discovered to work pretty well, is that you can swap in and out partitions right out from under MySQL’s nose. This can be useful in some situations.

WARNING!! This is not supported by MySQL to the best of my knowledge, and possibly is a risky, bad idea. Proceed only if you think you know what you’re doing!

Let’s say we have a table big_table, with self-explanatory partitions

p_200811
p_200812
p_200901

In your data directory, assuming you are using MyISAM, you will see MySQL has created files like these:

big_table#P#p_200811.MYD/MYI
big_table#P#p_200812.MYD/MYI
big_table#P#p_200901.MYD/MYI

Now on the server, running live, to make sure MySQL releases any locks on files at the OS level:
LOCK TABLE big_table WRITE; # alternatively use FLUSH TABLES WITH READ LOCK to lock whole DB
FLUSH TABLE big_table;

Now let’s say you want to do some intensive processing on your November data set. You can simply copy  the file to a temporary location, or possibly to another machine:

cp $DATADIR/dbname/big_table*200811* /tmplocation

Then you can unlock your table with UNLOCK TABLES;

Let’s say you have a box you want to do this on. Simply create a table-structure only copy of your table on that machine (*exactly* the same layout!! Bad Things(TM) will probably happen if this gets confused). Then repeat the procedure above in MySQL and on the OS:

mysql> FLUSH TABLES WITH READ LOCK;
$ cp /tmploc/big_table*200811* $DATADIR_SECOND_BOX/dbname/ # this overwrites the empty files; make sure they are owned by “mysql” or the db owner!
mysql> UNLOCK TABLES;

After unlocking, mysql will reload the files, and assuming the permissions and table structures are all correct, doing a count(*) on your date range should confirm that the data has moved over successfully. Now apply your datafixes, and repeat the procedure to return the data back to the host machine!

One possible application of this approach could be to distribute a very large index rebuild to a number of machines if your tables are partitioned into several chunks.

Given that InnoDB is sensitive to the dbname, I don’t expect that this would be possible with innodb tables, although I have not tried. I expect it to work with the ARCHIVE engine but also have not confirmed.

Hoping to get some comments and suggestions on how to improve on this process, or hear about other people’s experience hacking at partitioned tables.

Posted in mysql, partitioning | 3 Comments »

Setting up MySQL monitoring with Nagios

October 8th, 2008 by atomic

It’s been a while since my last post! I have a few half-written entries, but I figured I just get one out there so I don’t appear to have dropped off the face of the planet :)

Being more of a DBA than a sysadmin myself, I’ve noticed that there is a surprising dearth of information regarding how to actually get nagios set up to monitor MySQL, especially if you’re not experienced with nagios.

In my own experience, MySQL monitoring often ends up being a homegrown thing, especially if you’re a small shop. It is definitely a good idea to consolidate all of your monitoring, DB-related or not, into one solution: roll-your-own monitoring solutions can be brittle and burn you in the end. One example that comes to mind, a few years back i had a quick perl-based script to monitor replication and set it to mail a few people based on some threshold. Took me 1 hour to write and worked well enough, except, six months later, when some package upgrades were done on the machine I had the monitoring script running on, the mailer stopped working, and we found out the hard way about an alert situation with our replication.

Nagios

Nagios is something your systems/operations people are probably familiar with. In case you are the systems/ops person, it’s a 15-20 minute process to get nagios installed on a typical ubuntu machine. Make sure to set up nagios with the embedded perl interpreter in case you decide to make use of this feature in the future. Once you have it set up and monitoring, here’s how you can set up nagios to monitor your mysql databases:

You can grab some perl-based nagios plugins to monitor mysql here:

http://www.shinguz.ch/MySQL/nagios-mysql-plugins-0.3.tar.gz

Untar the plugins into /usr/local/nagios/libexec

If you want to check out how a particular plugin is executed, run it from the command line as follows:

vmware@vmware-desktop:~/dl/nagios-mysql-plugins-0.3$ perl check_repl_mysql_seconds_behind_master.pl --help

SYNOPSIS

check_repl_mysql_seconds_behind_master.pl flags

DESCRIPTION

Nagios plugin to check how many seconds slave is behind master...

FLAGS

help, ?      Print this help.
host, h      Host where database is located (default localhost)
password, p  Password of user root (default ><)
port         Port where database listens to (default 3306)
user, u      User for connecting to the database (default root)
critical|c   Seconds for critical level (default 600)
warning|w    Seconds for warning level (default 60)

PARAMETERS

none

As an example, let’s say we want to monitor the lag of a particular replicator. Nagios operates on thresholds, the first stage being a warning, the second being a critical alert.

Adding a Nagios Command

In your /usr/local/nagios/etc/objects/commands.cfg file, add the following “command” entry — you should see others like it. This specifies a type of command you will run on a given host:

# 'check_repl_mysql_seconds_behind_master' command definition
define command{
command_name    check_repl_mysql_seconds_behind_master
command_line    $USER1$/check_repl_mysql_seconds_behind_master.pl --host=$HOSTADDRESS$ --password=$ARG1
}

In this case we’re just using most of the default parameters, but if we wanted to specify a different critical or warning threshold, we could add it with –critical=$ARG2, etc.

Then, if you’re monitoring localhost (probably not in most cases) you would just add the following entry to /usr/local/nagios/etc/localhost.cfg:

define service{
use                             local-service         ; Name of service template to use
host_name                       localhost
service_description             MySQL Replication
check_command                   check_repl_mysql_seconds_behind_master!"<password>"
notifications_enabled           0
}

If you want to add another host to be monitored from this nagios instance, you can specify a new configration file, say, db01.cfg and write that to your etc/objects directory. Then add an entry to /usr/local/nagios/etc/nagios.cfg as follows, after making the appropriate adjustments to the hostname in the service:

cfg_file=/usr/local/nagios/etc/objects/db01.cfg

If everything has been configured correctly, you should be able to restart nagios with /etc/init.d/nagios restart, and then go into your nagios console and see the configuration entries for your new command and service, and also see if your slaves are far behind :-)

Once you have one service added in this way, adding new ones becomes a lot easier. I hope to investigate some more of the performance monitoring features of the nagios mysql plugins, but i’ll leave that for a future blog post.

Posted in monitoring, mysql, replication | 3 Comments »

« Previous Entries