Optimize SQL Queries:
- Indexes
- Symbol Operator
- Wildcard
- NOT Operator
- COUNT VS EXIST
- Wildcard VS Substr
- Index Unique Column
- Max and Min Operators
- Data Types
- Primary Index
- String indexing
- Limit The Result
- In Subquery
- union vs OR
Indexes
Symbol Operator
Symbol operator such as >,<,=,!=, etc. are very helpful in our query. We can optimize some of our query with symbol operator provided the column is indexed.
example:
SELECT * FROM TABLE WHERE COLUMN > 16
SELECT * FROM TABLE WHERE COLUMN >= 15
Wildcard
In SQL, wildcard is provided for us with ‘%’ symbol. Using wildcard will definitely slow down.
We can optimize our query with wildcard by doing a postfix wildcard instead of pre (or) full wildcard.
Note: That column must be indexed for such optimize to be applied
#Full wildcard SELECT * FROM TABLE WHERE COLUMN LIKE '%hello%';
#Postfix wildcard
SELECT * FROM TABLE WHERE COLUMN LIKE 'hello%';
#Prefix wildcard
SELECT * FROM TABLE WHERE COLUMN LIKE '%hello';
NOT Operator
- Positive Operator much faster than negative operator.
- Using a positive operator just stop immediately once the result has been found. But negative operator check the all the records
COUNT VS EXIST
- The above query is very bad query since count will search for all record exist on the table to determine the numeric value of field ‘COLUMN’.
- The better alternative will be to use the EXIST operator where it will stop once it found the first record. Hence, it exist.
Wildcard VS Substr
"To use wildcard instead of substr". The column should be indexed for using wildcard.
#BAD
SELECT * FROM TABLE WHERE substr ( COLUMN, 1, 1 ) = 'E'.
#BETTER
SELECT * FROM TABLE WHERE COLUMN = 'E%'.
Index Unique Column
- MySQL search better with column that are unique and indexed.
- Hence, it is best to remember to index those columns that are unique.
Max and Min Operators
Using Max and Min operators should help speed up the searching. If you are going to use min and max, The column must be indexed.
Data Types
- Use the most efficient (smallest) data types possible.
- VARCHAR will be better than long text to store an email or small details.
Primary Index
The primary column that is used for indexing should be made as short as possible. This makes identification of each row easy and efficient by the DBMS.
String indexing
- It is unnecessary to index the whole string when a prefix or postfix of the string can be indexed instead.
- Especially if the prefix or postfix of the string provides a unique identifier for the string, it is advisable to perform such indexing.
- Shorter indexes are faster, not only because they require less disk space, but because they also give you more hits in the index cache, and thus fewer disk seeks.
Limit The Result
Another common way of optimizing your query is to minimize the number of row return.SELECT * FROM TABLE
SELECT * FROM TABLE WHERE 1 LIMIT 10
In Subquery
Using dummy table is better than using an IN operator. Alternative, an exist
operator is also better.
SELECT * FROM TABLE WHERE COLUMN IN (SELECT COLUMN FROM TABLE)
Doing this is very expensive because SQL query will evaluate the outer query first before proceed with the inner query. Instead we can use this instead.
SELECT * FROM TABLE, (SELECT COLUMN FROM TABLE) as dummytable WHERE dummytable.COLUMN = TABLE.COLUMN;
Utilize Union instead of OR
Indexes lose their speed advantage when using them in OR situations in MySQL at least. Hence, this will not be useful although indexes is being applied
SELECT * FROM TABLE WHERE COLUMN_A = 'value' OR COLUMN_B = 'value'
Union: run faster
SELECT * FROM TABLE WHERE COLUMN_A = 'value'
UNION
SELECT * FROM TABLE WHERE COLUMN_B = 'value'
SQL Performance Tips
- use index in the column.
- use primary index.
- use index unique
- use symbol operators like, <,>,=
- use positive operator like IN, EXIST instead of NOT IN, NOT EXIST
- use full/pre wildcard operator
- use prefix/postfix string index.
- use LIMIT operator.
- use EXIST instead of COUNT
- use small data types VARCHAR
- Use Slow Query Log (always have it on!)
- Don't use DISTINCT when you have or could use GROUP BY
- Use LOAD DATA instead of INSERT
- Don't use ORDER BY RAND() if you have > ~2K records
- Use SQL_NO_CACHE when you are Selecting frequently updated data or large sets of data.
- use UNION instead of OR
- Avoid using IN(...) when selecting on indexed fields, It will kill the performance of SELECT query.
- Use stored procedures to avoid bandwidth wastage
- enable key_buffer, query cache, table cache.