How can I identify and delete unused MySQL indexes? I have a large table that has somehow accumulated way too many indexes over the years, and I'd now like to identify and delete indexes that are no longer used by the application.

OK, so I found this blog post by the one and only Percona, that says that I can simply do select * from sys.schema_unused_indexes; which does give me a list of indexes. However, it says it's based on having:

update performance_schema.setup_consumers set enabled = 'yes' where name = 'events_waits_current';
update performance_schema.setup_instruments set enabled = 'yes' where name = 'wait/io/table/sql/handler';

The second one was already set to yes but the first one was not. What does it do? Why is it required?

I thought it would be as simple as

DROP INDEX index_name ON table_name;

As for removing only unused indexes, I don't know how MySql would be able to determine if an index is used, or not. You'd have to decide. Since indexes are used to find things quickly I would imagine that if you delete an index and then take a performance hit it's likely not an unused one.

commented: This is incorrect. I should not have to decide. -8

As some general background, that's the syntax to delete an index, of course, but MySQL knows if an index is used because, well, it's the one that uses it, and it tracks that stuff internally. Now on to my question ... ;)

I already read that article, as well as related ones such as this and this but it all went above my head and didn’t seem to directly answer my question.

For example, when I currently run select * from sys.schema_unused_indexes; I get a list of indexes. Is that list incomplete or straight-up wrong?

Is that list incomplete or straight-up wrong?

I have no way of knowing. If I add a bunch of numbers and get an answer, is it right or wrong? Again, as a guess, if you have enabled the correct flags and let the system run through typical processing, the returned list should be, according to the docs, a list of unused indexes. Worst case scenario, if you delete an index and your processes slow down then you probably deleted an active index. In that case just recreate it. Nothing lost.

Creating and removing database indexes is very complex and is not a matter of let's throw spaghetti at the wall and see what sticks.

When working with big data, loads and loads of thought goes into each and every index. Nearly all indexes can work, but not as well as others might, and having too many indexes on a table can slow down inserts and updates to a drag. Having the exact correct indexes can lead to huge performance wins.

For us, it takes 12+ hours to add an index to a table, during which for most of that time we need to bring the database offline, as it's too resource intensive to add an index to a table that is concurrently being hit with requests. Removing an index that is being used, even if it's sub-optimal, can easily take down the server and break replication. The last time I mistakingly removed an index that was being used, we had 2 days of downtime.

I faced a similar problem when I created and maintained the corporate side databases to mirror the EMS (AGC/SCADA) real time data. One month of data was roughly 300 meg and it was critical that this be available 24x7. Sometimes databases break and it is important to recover them as quickly as possible. Knowing that around 90% of the queries were on the current month, 5-10% on the previous month, and only rarely on older months, I created a new set of tables for each month, and views that combined the previous, current, and next months. Inserts were only ever done on the current month. Recovering any particular month that got corrupted was a matter of minutes rather than hours. Inserts were done on the base tables, and selects (almost always) on the views.

I wrote scripts to automatically create the new databases and the views. When corporate IT decided to create their own copy I heard that their database people were trash talking my setup behind my back. I sat down with one of them (who had been a part of the Process Control Group which I joined when I was hired) and explained my reasoning. They ended up doing it my way.

I can see your rationale, but I don't necessarily agree with a separate table in the database for every month. There are much more elegant ways of handling that use case these days, especially in MySQL. You didn't specify how long ago this was.

Either way, still hoping a MySQL expert can come along and answer my question here.

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.