Showing posts with label Database Index. Show all posts

Reorganize or Rebuild Indexes to remove fragmentation on database table

Fragmentation
Fragmentation occurs when you perform any INSERTION, UPDATION or DELETION operation against table. Over the time this operation cause to data become scattered in database. Heavily fragmented indexes can degrade query performance and cause your application to respond very slowly.

Note: I'm using sample database(AdventureWorks2008R2) from Microsoft. Change database name and table name where required.

Finding Fragmentation On Table-Index
Execute following query to determine fragmentation on particular table of database.
SELECT ind.name, phy.avg_fragmentation_in_percent, phy.fragment_count, phy.avg_fragment_size_in_pages
FROM sys.dm_db_index_physical_stats(DB_ID(N'AdventureWorks2008R2'), OBJECT_ID(N'HumanResources.Employee'), NULL, NULL, NULL) AS phy
JOIN sys.indexes AS ind
ON phy.object_id = ind.object_id
AND phy.index_id = ind.index_id
Index + REBUILD + REORGANIZE + Detecting Fragmentation + Microsoft SQL Server

As you can see in Image that index 'AK_Employee_LoginID' is Heavily fragmented. It will lead to lower performance of your database.

Column Description
avg_fragmentation_in_percent The percent of logical fragmentation (out-of-order pages in the index).
fragment_count The number of fragments (physically consecutive leaf pages) in the index.
avg_fragment_size_in_pages Average number of pages in one fragment in an index.
Source: http://technet.microsoft.com/

REORGANIZE Index or REBUILD Index?

avg_fragmentation_in_percent value Operation
> 5% and < = 30% ALTER INDEX REORGANIZE
> 30% ALTER INDEX REBUILD WITH (ONLINE = ON)*

What is Index Reorganize?
Database will Reorganize data pages on an index. Reorganizing an index is always executed with online. It means when you perform Reorganize operation on an Index, Database will keep the old index value to serve the incoming queries. When its completed, it'll drop the old index data.

What is Index Rebuild?
Database will drop the current index and create an index from scratch. It comes with two option online and offline. As I said online option will keep the data of old index to serve incoming queries until it completes the rebuild operation. Offline option will drop the old index data right away and create index again, index won't be available until it completes the rebuild operation.

To reorganize an index
ALTER INDEX IX_Employee_OrganizationalLevel_OrganizationalNode ON HumanResources.Employee
REORGANIZE;
To reorganize all indexes in a table
ALTER INDEX ALL ON HumanResources.Employee
REORGANIZE;
To rebuild an index
ALTER INDEX AK_Employee_LoginID ON HumanResources.Employee
REBUILD;
To rebuild all indexes in a table
ALTER INDEX ALL ON HumanResources.Employee
REBUILD WITH (FILLFACTOR = 80, SORT_IN_TEMPDB = ON,
              STATISTICS_NORECOMPUTE = ON);
Warning: This is part of database maintenance so please be carefully while you are dealing with live database server. I'd suggest perform this task under maintenance hours.

Microsoft SQL Server Index: Warning! The maximum key length is 900 bytes.

Today I was creating an Index on columns and got warning from SQL server. I never know that Microsoft SQL Server has put Maximum size limitation on Index key column. I prefer to write SQL script rather design mode of SQL Server Studio. I got warning message in console as follow...

Warning! The maximum key length is 900 bytes. The index 'IndxNc_index_master__MultiCol01' has maximum length of 1000 bytes. For some combination of large values, the insert/update operation will fail.

I did Google for more information and read few articles and also SQL Server library articles. Lets understand warning message with example.

Step 1: First create table. Lets assume below table for the sake of example.
CREATE TABLE [dbo].[index_master](
 [id] [bigint] IDENTITY(1,1) NOT NULL,
 [indexCol1] [nvarchar](450) NULL,
 [indexCol2] [bigint] NULL,
 [indexCol3] [int] NULL,
 [indexCol4] [varchar](442) NULL,
 [indexCol5] [varchar](450) NULL
) ON [PRIMARY]

Step 2: Find out individual column size in bytes.
SELECT name, max_length as size_in_bytes
FROM sys.columns
WHERE OBJECT_NAME(object_id) = 'index_master'

Warning! The maximum key length is 900 bytes


Step 3: Create an index on indexCol2 [8] + indexCol3 [4] + indexCol4 [442] + indexCol5 [450] = 904 bytes.
CREATE NONCLUSTERED INDEX [IndxNc_index_master__MultiCol01] ON [dbo].[index_master] 
(
 [indexCol3] ASC,
 [indexCol4] ASC,
 [indexCol5] ASC,
 [indexCol2] ASC
)WITH (PAD_INDEX  = ON, FILLFACTOR = 80)

/* If you create an index using above script, it'll gives you warning as follow */
/* Warning! The maximum key length is 900 bytes. The index 'IndxNc_index_master__MultiCol01' has maximum length of 904 bytes. For some combination of large values, the insert/update operation will fail.*/

/* If you create an index using SQL Studio design mode, it'll gives you warning as follow */
/* Adding the selected columns will result in an index key with a maximum length of 904 bytes.  The maximum permissible index length is 900 bytes. INSERT and UPDATE operations fail if the combined value of the key columns exceeds 900 bytes. */

Step 4: Lets try to insert record with random data to check what happens if you try cross the limit of 900 bytes.
INSERT INTO index_master values(
'indexCol1',
1234567891012312312,
12345,
'12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364656667686970717273747576777879808182838485868788899091929394959697989910010110210310410510610710810911011111211311411511611711811912012112212312412512612712812913013113213313413513613713813914014114214314414514614714814915015115215315415515615715815916016116216316416516616716816917017117217317417517617717817918018118218',
'123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186');

It'll throw an error as follow...
Operation failed. The index entry of length 902 bytes for the index 'IndxNc_index_master__MultiCol01' exceeds the maximum length of 900 bytes.

Notes
- If a table column is a Unicode data type such as nchar or nvarchar, then column size in bytes is two times the number of characters specified in the CREATE TABLE statement. Check indexCol1.
- If you have index that exceed the 900 bytes, It'll allow insert/update until unless total sum of key columns not more than 900 bytes. In above insert query cut down long string into small and you'll be able to insert the record without any error.
- You can add big columns in index using INCLUDE clause [except text, ntext, and image datatype columns], SQL Server will not consider included columns for 900 bytes.

Hibernate and Database (Microsoft SQL Server) Indexes

Hibernate logo

Hibernate
Hibernate is an Object Relational Mapping (ORM) library used in java. This framework used to map relational database into Object Oriented domain model. This framework can handle small and enterprise level application both. Hibernate uses Prepared Statement (pre-compiled query) for data retrieval until unless you write hard coded native queries.


Database Index
A database index is a data structure that improves the speed of data retrieval operations on a database table at the minimal cost of additional writes and little more additional space to maintain a restricted set of selected data.


Hibernate and Microsoft SQL Server Index
Today we will intercept the interaction of Hibernate and SQL Server Index. We will see How actually hibernate queries stored in database.

In the following example we have table user_master. It has column user_name with datatype varchar(100). We will create an Index on column user_name and test it against Hibernate.

Step 1: Create an Index on column user_name.
Note: Index will be used only when table holds huge number of records minimum 1500-2000 records.
CREATE NONCLUSTERED INDEX [IXNc_user_master__user_name] ON [dbo].[user_master] 
(
 [user_name] ASC
)WITH (PAD_INDEX  = ON, FILLFACTOR = 80)

Step 2: Executing simple select query in SQL Server Studio.
/* Press Ctrl + M or enable execution plan in SQL Studio. */
/* After execution of query you can see the execution plan, that shows Index Seek(Search in index) of 'IXNc_user_master__user_name' */
SELECT * FROM user_master WHERE user_name = 'vicky_thakor'

Step 3: We will verify that index used for seek (search) or scan. Seek cost is very low but scan cost lot more than regular. Execute following query that shows SEEK and SCAN count on Index and save/remember the details for reference. (*Change database name and table name.)
SELECT DISTINCT OBJECT_NAME(sis.OBJECT_ID) TableName,
  si.name AS IndexName, sc.Name AS ColumnName, sic.Index_ID, sis.user_seeks, sis.user_scans, sis.user_lookups, sis.user_updates
FROM sys.dm_db_index_usage_stats sis
INNER JOIN sys.indexes si 
 ON sis.OBJECT_ID = si.OBJECT_ID AND sis.Index_ID = si.Index_ID
INNER JOIN sys.index_columns sic 
 ON sis.OBJECT_ID = sic.OBJECT_ID AND sic.Index_ID = si.Index_ID
INNER JOIN sys.columns sc 
 ON sis.OBJECT_ID = sc.OBJECT_ID AND sic.Column_ID = sc.Column_ID
WHERE sis.Database_ID = DB_ID('database_name') AND sis.OBJECT_ID = OBJECT_ID('user_master');

Step 4: Now clear all available Prepared Statements from SQL Server cache.
DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS

Step 5: Execute same query using Hibernate Criteria
Criteria criteria = session.createCriteria(User.class);
criteria.add(Restrictions.eq("Username", "vicky_thakor"));        
criteria.list();

Step 6: Execute following query to check what query Hibernate generated to fetch the data.
SELECT usecounts, text 
FROM sys.dm_exec_cached_plans
CROSS APPLY sys.dm_exec_sql_text(plan_handle)
WHERE cacheobjtype != 'Parse Tree'

/* Hibernate generated following query */
(@P0 nvarchar(4000))select this_.uid as uid0_0_, this_.Firstname as Firstname0_0_, this_.Lastname as Lastname0_0_, this_.user_name as user4_0_0_ from user_master this_ where this_.user_name= @P0

Hibernate and Microsoft SQL Server database indices

Important: We have column user_name with datatype varchar but hibernate fetching data using nvarchar datatype.

Step 7: Execute Step 3 to check index is used with SEEK or SCAN. You will find that index is scanned and that cost a lot.


Cross-check
Lets check how database act with different datatype for index. You are requested to turn on execution plan setting in Microsoft SQL Server [Read more: Execution Plan in MSSQL Server]. We will use Prepared Statement at database level with both datatype.
/* Execute following query which leads to Index Scan (Check attached image) */
EXECUTE sp_executesql
@statement = N'SELECT * FROM user_master WHERE user_name = @P0',
@parameters = N'@P0 nvarchar(4000)',
@P0 = 'vicky_thakor'

/* Execute following query which leads to Index Seek (Check attached image) */
EXECUTE sp_executesql
@statement = N'SELECT * FROM user_master WHERE user_name = @P0',
@parameters = N'@P0 varchar(4000)',
@P0 = 'vicky_thakor'

Hibernate and Microsoft SQL Server database indices

Issue
Hibernate used nvarchar to support Unicode characters and we have column with varchar datatype. Database always upgrade lower datatype to higher datatype to eliminate data lose. Datatype conversion leads to index scan rather index seek.


Solution

You are suggested to change your datatype from varchar to nvarchar. If you are not interested in datatype change I have another solution for it.

Add parameter called sendStringParametersAsUnicode in your connection string. It will force hibernate to use varchar instead nvarchar.
<property name="hibernate.connection.url">jdbc:jtds:sqlserver://127.0.0.1:1433/javaQuery;sendStringParametersAsUnicode=false</property>

Conclusion
Its not only the hibernate use nvarchar, You should check all other framework you are working with. Indexes are meant to increase the performance but this intermediate framework and poor database knowledge leads to low performance of application.  

MSSQL: Queries that do not return large result sets. Nonclustered Index limitation

What is Nonclustered Index?
A nonclustered index contains the index key values and row locators that point to the storage location of the table data.

Queries that do not return large result sets. - Nonclustered Index Design Guidelines
Source: http://technet.microsoft.com/en-us/library/ms179325(v=sql.105).aspx

We are using hibernate in our project to support multiple dialect. Its our routine process to get queries that are generated by hibernate. We do analysis on that queries and suggest indexes and other stuffs. We replaced ? in query with 0. We fired that query in Microsoft SQL Server Management Studio and Execution plan suggested index. We created index with bit modification. After executing same query again what we see is index is not used by query.

We were clueless why is database not using index. We went through lots of article but no one wrote specifically for this rule of nonclustered index. We did few more run on different tables and queries. We were not able to derive perfect rule that...

Database will not use Nonclustered Index when Its returning 'n' records.
It's yet to be discovered from my side. If you guys have perfect rule for this do share in comment. I'll put here if its really a case.

Case Scenario:
SELECT * 
FROM s_master this_ 
LEFT OUTER JOIN t_master tlist2_ 
ON this_.id = tlist2_.s_id 
WHERE this_.document_id = 0 
AND this_.library_id = 0
This query returned 2248 records and as Microsoft SQL server design guide line suggests, query won't use index. We replaced 0 with specific value and It started using Index.

Conclusion:
Nonclustered Index is only useful when you are sure that your query won't return much records.

You'd like to read more about index, scroll through this articles...

MSSQL: How to find all database indexes on column?

I was playing with MSSQL system tables for fun and popup one thing that lets try to find all indexes on column. I ended up with one long query so thought let me post it and you guys don't have work for this query again.

Professionals I don't have to explain much about these tables. Beginner try to work around this table. It'll help you lot in future, way to become database expert in other word Database Administrator(DBA).

/* Change table name and column name in where condition */
SELECT distinct ind.name AS [idx_name], t.name AS [table_name], c.name AS [column_name], ic.is_included_column
FROM sys.indexes ind 
RIGHT JOIN sys.tables t 
 ON ind.object_id = t.object_id 
RIGHT JOIN sys.columns c 
 ON t.object_id = c.object_id
RIGHT JOIN sys.index_columns ic 
 ON ind.index_id = ic.index_id
       AND c.column_id = ic.column_id
WHERE ind.is_primary_key = 0  
  AND ind.is_unique_constraint = 0 
  AND t.is_ms_shipped = 0
  AND ind.name IS NOT NULL
  AND ind.is_unique IN (0,1)
  AND t.name = 'table_name'
  AND c.name = 'column_name'
  AND OBJECT_NAME(ic.object_id) = 'table_name'
There ain't much to write for this query. Change table_name and column_name

MySQL: Queries to 'Show All Database Indexes' and 'Drop All Table Indexes'

Database Index
A database index is a data structure that improves the speed of data retrieval operations on a database table at the cost of additional writes and the use of more storage space to maintain the extra copy of data. Indexes are used to quickly locate data without having to search every row in a database table every time a database table is accessed. Indexes can be created using one or more columns of a database table, providing the basis for both rapid random lookups and efficient access of ordered records.

As described above database index helps only for search operation. Insert, Update and Delete will suffer. Index should be created only selected fields based on your retrieval of data (Columns).

- Source (WikiPedia: http://en.wikipedia.org/wiki/Database_index)

In my previous article we discussed about MSSQL Indexes . Now we'll see how you can achieve same in MySQL. 

SHOW ALL INDEXES 
/* It'll show all indexes except Primary Key */
SELECT *
FROM INFORMATION_SCHEMA.STATISTICS
WHERE TABLE_SCHEMA = (SELECT DATABASE())
AND INDEX_NAME != 'PRIMARY';

DROP ALL TABLE INDEXES
Basic Syntax of dropping index is DROP INDEX INDEX_NAME ON TABLE_NAME; but to drop index in one single shot is not possible. You've to loop through all indexes of database. Thus, We've to use Stored Procedure in MySQL.
DROP PROCEDURE IF EXISTS usp_DropIndexOfTable;
DELIMITER $$
CREATE PROCEDURE usp_DropIndexOfTable(IN TableName VARCHAR(100))
BEGIN
 /**
  * @Created By: Vicky Thakor
  * @Created On: 6th March, 2014
  * This will delete all Indexes of table except Primary Key
  * 
  * @Test: CALL usp_DropIndexOfTable('user_master');
  */
 DECLARE v_finished INTEGER DEFAULT 0;
 DECLARE v_index varchar(200) DEFAULT "";
        DEClARE index_cursor CURSOR FOR 
        /* In below query remove TABLE_NAME criteria to remove all indexes */
        SELECT INDEX_NAME
  FROM INFORMATION_SCHEMA.STATISTICS
  WHERE TABLE_SCHEMA = (SELECT DATABASE())
  AND TABLE_NAME = TableName
  AND INDEX_NAME != 'PRIMARY';
        DECLARE CONTINUE HANDLER FOR NOT FOUND SET v_finished = 1;

 DECLARE exit handler FOR sqlexception
 BEGIN
 -- ERROR
 SHOW ERRORS LIMIT 1;
 ROLLBACK;
 END;

 DECLARE exit handler FOR sqlwarning
 BEGIN
 -- WARNING
 SHOW WARNINGS LIMIT 1;
 ROLLBACK;
 END;

 OPEN index_cursor; 
  get_index: LOOP
   FETCH index_cursor INTO v_index;

   IF v_finished = 1 THEN 
    LEAVE get_index;
   END IF;
  
   SET @v_query = CONCAT('DROP INDEX ',v_index,' ON ', TableName);
   PREPARE stmt FROM @v_query;
   EXECUTE stmt;
   DEALLOCATE PREPARE stmt;
  END LOOP get_index;
       CLOSE index_cursor;
END$$
DELIMITER ;
Just fire above Stored Procedure in your database and then execute this stored procedure using
CALL usp_DropIndexOfTable('user_master'). You can modify above stored procedure according to your requirement.

MSSQL: Queries to 'Show All Database Indexes' and 'Drop All Database Indexes'

Database Index
A database index is a data structure that improves the speed of data retrieval operations on a database table at the cost of additional writes and the use of more storage space to maintain the extra copy of data. Indexes are used to quickly locate data without having to search every row in a database table every time a database table is accessed. Indexes can be created using one or more columns of a database table, providing the basis for both rapid random lookups and efficient access of ordered records.

As described above database index helps only for search operation. Insert, Update and Delete will suffer. Index should be created only selected fields based on your retrieval of data (Columns).

- Source (WikiPedia: http://en.wikipedia.org/wiki/Database_index)

Indexing is serious business for any large application or small application. Haphazard use of Indexing will create an overhead on database so be careful while dealing with index.

Today, There is nothing much to say. All I can say is you can bookmark this article for future use of these queries. It may help you to save your time rather then wasting time on Googling.

SHOW ALL INDEXES
/* Use `*` to check out other fields. */
SELECT distinct ind.name as IndexName,t.name as TableName
FROM 
sys.indexes ind 
INNER JOIN 
sys.index_columns ic ON ind.object_id = ic.object_id and ind.index_id = ic.index_id
INNER JOIN 
sys.columns col ON ic.object_id = col.object_id and ic.column_id = col.column_id 
INNER JOIN 
sys.tables t ON ind.object_id = t.object_id 
WHERE 
ind.is_primary_key = 0 
AND ind.is_unique_constraint = 0 
AND t.is_ms_shipped = 0
AND ind.is_unique IN(0, 1)
DROP ALL INDEXES "UNIQUE" and "NON UNIQUE"
DECLARE @query nvarchar(max);
SELECT @query = 
(SELECT distinct 'DROP INDEX '+ ind.name + ' ON ' + t.name +';'
FROM 
sys.indexes ind 
INNER JOIN 
sys.index_columns ic ON ind.object_id = ic.object_id and ind.index_id = ic.index_id
INNER JOIN 
sys.columns col ON ic.object_id = col.object_id and ic.column_id = col.column_id 
INNER JOIN 
sys.tables t ON ind.object_id = t.object_id 
WHERE 
ind.is_primary_key = 0  
AND ind.is_unique_constraint = 0 
AND t.is_ms_shipped = 0
AND ind.is_unique IN(0, 1)
for xml path(''));
EXEC sp_executesql @query