Wednesday, September 5, 2018

Optimize MYSQL

Optimize SQL Queries:  

  1. Indexes 
  2. Symbol Operator 
  3. Wildcard 
  4. NOT Operator 
  5. COUNT VS EXIST 
  6. Wildcard VS Substr 
  7. Index Unique Column
  8. Max and Min Operators 
  9. Data Types 
  10. Primary Index
  11. String indexing 
  12. Limit The Result 
  13. In Subquery
  14. union vs OR

Indexes 


To create Index for your column is a common way to optimize your search result. 

Symbol Operator



Symbol operator such as >,<,=,!=, etc. are very helpful in our query. We can optimize some of our query with symbol operator provided the column is indexed.


example:
SELECT * FROM TABLE WHERE COLUMN > 16
SELECT * FROM TABLE WHERE COLUMN >= 15


Wildcard


In SQL, wildcard is provided for us with ‘%’ symbol. Using wildcard will definitely slow down.

We can optimize our query with wildcard by doing a postfix wildcard instead of pre (or) full wildcard. 


Note: That column must be indexed for such optimize to be applied

#Full wildcard  SELECT * FROM TABLE WHERE COLUMN LIKE '%hello%'; 


#Postfix wildcard
 SELECT * FROM TABLE WHERE COLUMN LIKE 'hello%';


#Prefix wildcard 
 SELECT * FROM TABLE WHERE COLUMN LIKE '%hello'; 


NOT Operator

To Avoid NOT operator in SQL. 


  • Positive Operator much faster than negative operator. 
  • Using a positive operator just stop immediately once the result has been found. But negative operator check the all the records



Positive operators: 
LIKE, IN, EXIST (or) = symbol operator

Negative operator:

NOT LIKE, NOT IN, NOT EXIST or != symbol. 


COUNT VS EXIST


"To use Exist operator instead of Count operator" to determine whether a particular data exist.


SELECT COLUMN FROM TABLE WHERE COUNT(COLUMN) > 0


  • The above query is very bad query since count will search for all record exist on the table to determine the numeric value of field ‘COLUMN’. 
  • The better alternative will be to use the EXIST operator where it will stop once it found the first record. Hence, it exist.

Wildcard VS Substr



"To use wildcard instead of substr". The column should be indexed for using wildcard.

SEARCH FOR ALL ROWS WITH THE FIRST CHARACTER AS 'E'

#BAD
 SELECT * FROM TABLE WHERE substr ( COLUMN, 1, 1 ) = 'E'.

#BETTER
 SELECT * FROM TABLE WHERE COLUMN = 'E%'.


Index Unique Column



  • MySQL search better with column that are unique and indexed. 
  • Hence, it is best to remember to index those columns that are unique.


Max and Min Operators



Using Max and Min operators should help speed up the searching. If you are going to use min and max, The column must be indexed

Data Types


  • Use the most efficient (smallest) data types possible. 
  • VARCHAR will be better than long text to store an email or small details.


Primary Index


The primary column that is used for indexing should be made as short as possible. This makes identification of each row easy and efficient by the DBMS.

String indexing


  • It is unnecessary to index the whole string when a prefix or postfix of the string can be indexed instead. 
  • Especially if the prefix or postfix of the string provides a unique identifier for the string, it is advisable to perform such indexing. 
  • Shorter indexes are faster, not only because they require less disk space, but because they also give you more hits in the index cache, and thus fewer disk seeks.


Limit The Result

Another common way of optimizing your query is to minimize the number of row return. 


SELECT * FROM TABLE

SELECT * FROM TABLE WHERE 1 LIMIT 10



In Subquery



Using dummy table is better than using an IN operator. Alternative, an exist
operator is also better.

SELECT * FROM TABLE WHERE COLUMN IN (SELECT COLUMN FROM TABLE)

Doing this is very expensive because SQL query will evaluate the outer query first before proceed with the inner query. Instead we can use this instead.

SELECT * FROM TABLE, (SELECT COLUMN FROM TABLE) as dummytable WHERE dummytable.COLUMN = TABLE.COLUMN;

Utilize Union instead of OR


Indexes lose their speed advantage when using them in OR ­situations in MySQL at least. Hence, this will not be useful although indexes is being applied

SELECT * FROM TABLE WHERE COLUMN_A = 'value' OR COLUMN_B = 'value'

Union: run faster

SELECT * FROM TABLE WHERE COLUMN_A = 'value'
UNION
SELECT * FROM TABLE WHERE COLUMN_B = 'value'



SQL Performance Tips

  1. use index in the column.
  2. use primary index.
  3. use index unique
  4. use symbol operators like, <,>,=
  5. use positive operator like IN, EXIST instead of NOT IN, NOT EXIST
  6. use full/pre wildcard operator
  7. use prefix/postfix string index.
  8. use LIMIT operator.
  9. use EXIST instead of COUNT
  10. use small data types VARCHAR
  11. Use Slow Query Log (always have it on!)
  12. Don't use DISTINCT when you have or could use GROUP BY
  13. Use LOAD DATA instead of INSERT
  14. Don't use ORDER BY RAND() if you have > ~2K records
  15. Use SQL_NO_CACHE when you are Selecting frequently updated data or large sets of data.
  16. use UNION instead of OR
  17. Avoid using IN(...) when selecting on indexed fields, It will kill the performance of SELECT query.
  18. Use stored procedures to avoid bandwidth wastage
  19. enable key_buffer, query cache, table cache.


Mysql server Performance:

  1. more RAM is good so faster disk speed
  2. use 64­bit architectures


key_buffer:

key_buffer = 384M - When tuning a MySQL server, key_buffer_size is very important. This number works well for me and with the mysqlreport script I rarely use 50% of the available memory.

table_cache:
table_cache = 1800 – After key_buffer the next most important variable is your table cache.

query cache:
query_cache_size = 52428800;
query_cache_type = 1
Enable the query cache in MySQL to improve performance

Database Design Performance Tips:

Design sane query schemas. don't be afraid of table joins, often they are faster than denormalization

  1. Don't use boolean flags
  2. Use Indexes
  3. Don't Index Everything
  4. Do not duplicate indexes
  5. Do not use large columns in indexes if the ratio of SELECTs:INSERTs is low.
  6. Use a clever key and ORDER BY instead of MAX

Normalize first, and denormalize where appropriate.


No comments:

Post a Comment