When Hibernate uses INNER JOIN?

Hibernate Logo

After years of experience working around Hibernate there are very few people actually knows "When Hibernate Uses INNER JOIN". I'm one of 'em till today. Today I drilled further and found 3 cases where Hibernate uses INNER JOIN.

1. not-null attribute
When you specify attribute not-null = "true" in hbm.xml for particular column which used for table joining, It will use INNER JOIN no matter you specified FetchMode.JOIN in criteria.
/* Having not-null = "true" in hbm file, It'll override following FetchMode.JOIN */
criteria.setFetchMode("user", FetchMode.JOIN);

2. createAlias
When you create an alias of table, Hibernate will automatically uses INNER JOIN. Its default setting of hibernate.
criteria.createAlias("user", "user");

3. CriteriaSpecification.INNER_JOIN
Well you can ignore CriteriaSpecification.INNER_JOIN, as it can be used with createAlias. And when you use createAlias hibernate will automatically use INNER JOIN but this is just for the sake of your knowledge.
criteria.createAlias("user", "user", CriteriaSpecification.INNER_JOIN);

Note: If I'm missing any other way let's all know about it. Use comment box below.

How to change collation_database value in MySQL?

collation_database

A collation is a set of rules for comparing characters in a character set.
Read more about Character Sets and Collations in General.

How to check database collation character set in MySQL?
Execute following query to know the collation character set of current database.Currently its 'utf8_general_ci'.
SHOW VARIABLES LIKE '%collation_database%';
Variable_name value Value
collation_database utf8_general_ci


How to change database collation character set in MySQL?
We need to alter the database with new collation character set. We will change it to 'utf8_unicode_ci', change character set as per your requirement.
ALTER DATABASE inventory COLLATE utf8_unicode_ci;
Now lets check the collation character set again by executing same query.
SHOW VARIABLES LIKE '%collation_database%';
Variable_name value Value
collation_database utf8_unicode_ci


How to change character_set_database value in MySQL?

character_set_database + MySQL

MySQL is highly configurable database. It support different character set at Database level, Table level and Column level as well. Sometime it gets worse when you provide so many options.

We were facing an issue because of mixed character set at different level in MySQL. I've tried to change configuration [my.cnf in Linux and my.ini in Windows] file of MySQL but didn't work. I've searched to change value of 'character_set_database' but no one explained to the point. After many try and error I figured out solution.

How to check database character set in MySQL?
Execute following query to know the character set of current database. Currently its 'latin1'.
SHOW VARIABLES LIKE '%character_set_database%';
Variable_name value Value
character_set_database latin1


How to change database character set in MySQL?
We need to alter the database with new character set. We will change it to 'utf8', change character set as per your requirement.
ALTER DATABASE inventory CHARACTER SET utf8;
Now lets check the character set again by executing same query.
SHOW VARIABLES LIKE '%character_set_database%';
Variable_name value Value
character_set_database utf8

Reorganize or Rebuild Indexes to remove fragmentation on database table

Fragmentation
Fragmentation occurs when you perform any INSERTION, UPDATION or DELETION operation against table. Over the time this operation cause to data become scattered in database. Heavily fragmented indexes can degrade query performance and cause your application to respond very slowly.

Note: I'm using sample database(AdventureWorks2008R2) from Microsoft. Change database name and table name where required.

Finding Fragmentation On Table-Index
Execute following query to determine fragmentation on particular table of database.
SELECT ind.name, phy.avg_fragmentation_in_percent, phy.fragment_count, phy.avg_fragment_size_in_pages
FROM sys.dm_db_index_physical_stats(DB_ID(N'AdventureWorks2008R2'), OBJECT_ID(N'HumanResources.Employee'), NULL, NULL, NULL) AS phy
JOIN sys.indexes AS ind
ON phy.object_id = ind.object_id
AND phy.index_id = ind.index_id
Index + REBUILD + REORGANIZE + Detecting Fragmentation + Microsoft SQL Server

As you can see in Image that index 'AK_Employee_LoginID' is Heavily fragmented. It will lead to lower performance of your database.

Column Description
avg_fragmentation_in_percent The percent of logical fragmentation (out-of-order pages in the index).
fragment_count The number of fragments (physically consecutive leaf pages) in the index.
avg_fragment_size_in_pages Average number of pages in one fragment in an index.
Source: http://technet.microsoft.com/

REORGANIZE Index or REBUILD Index?

avg_fragmentation_in_percent value Operation
> 5% and < = 30% ALTER INDEX REORGANIZE
> 30% ALTER INDEX REBUILD WITH (ONLINE = ON)*

What is Index Reorganize?
Database will Reorganize data pages on an index. Reorganizing an index is always executed with online. It means when you perform Reorganize operation on an Index, Database will keep the old index value to serve the incoming queries. When its completed, it'll drop the old index data.

What is Index Rebuild?
Database will drop the current index and create an index from scratch. It comes with two option online and offline. As I said online option will keep the data of old index to serve incoming queries until it completes the rebuild operation. Offline option will drop the old index data right away and create index again, index won't be available until it completes the rebuild operation.

To reorganize an index
ALTER INDEX IX_Employee_OrganizationalLevel_OrganizationalNode ON HumanResources.Employee
REORGANIZE;
To reorganize all indexes in a table
ALTER INDEX ALL ON HumanResources.Employee
REORGANIZE;
To rebuild an index
ALTER INDEX AK_Employee_LoginID ON HumanResources.Employee
REBUILD;
To rebuild all indexes in a table
ALTER INDEX ALL ON HumanResources.Employee
REBUILD WITH (FILLFACTOR = 80, SORT_IN_TEMPDB = ON,
              STATISTICS_NORECOMPUTE = ON);
Warning: This is part of database maintenance so please be carefully while you are dealing with live database server. I'd suggest perform this task under maintenance hours.

Microsoft SQL Server Index: Warning! The maximum key length is 900 bytes.

Today I was creating an Index on columns and got warning from SQL server. I never know that Microsoft SQL Server has put Maximum size limitation on Index key column. I prefer to write SQL script rather design mode of SQL Server Studio. I got warning message in console as follow...

Warning! The maximum key length is 900 bytes. The index 'IndxNc_index_master__MultiCol01' has maximum length of 1000 bytes. For some combination of large values, the insert/update operation will fail.

I did Google for more information and read few articles and also SQL Server library articles. Lets understand warning message with example.

Step 1: First create table. Lets assume below table for the sake of example.
CREATE TABLE [dbo].[index_master](
 [id] [bigint] IDENTITY(1,1) NOT NULL,
 [indexCol1] [nvarchar](450) NULL,
 [indexCol2] [bigint] NULL,
 [indexCol3] [int] NULL,
 [indexCol4] [varchar](442) NULL,
 [indexCol5] [varchar](450) NULL
) ON [PRIMARY]

Step 2: Find out individual column size in bytes.
SELECT name, max_length as size_in_bytes
FROM sys.columns
WHERE OBJECT_NAME(object_id) = 'index_master'

Warning! The maximum key length is 900 bytes


Step 3: Create an index on indexCol2 [8] + indexCol3 [4] + indexCol4 [442] + indexCol5 [450] = 904 bytes.
CREATE NONCLUSTERED INDEX [IndxNc_index_master__MultiCol01] ON [dbo].[index_master] 
(
 [indexCol3] ASC,
 [indexCol4] ASC,
 [indexCol5] ASC,
 [indexCol2] ASC
)WITH (PAD_INDEX  = ON, FILLFACTOR = 80)

/* If you create an index using above script, it'll gives you warning as follow */
/* Warning! The maximum key length is 900 bytes. The index 'IndxNc_index_master__MultiCol01' has maximum length of 904 bytes. For some combination of large values, the insert/update operation will fail.*/

/* If you create an index using SQL Studio design mode, it'll gives you warning as follow */
/* Adding the selected columns will result in an index key with a maximum length of 904 bytes.  The maximum permissible index length is 900 bytes. INSERT and UPDATE operations fail if the combined value of the key columns exceeds 900 bytes. */

Step 4: Lets try to insert record with random data to check what happens if you try cross the limit of 900 bytes.
INSERT INTO index_master values(
'indexCol1',
1234567891012312312,
12345,
'12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364656667686970717273747576777879808182838485868788899091929394959697989910010110210310410510610710810911011111211311411511611711811912012112212312412512612712812913013113213313413513613713813914014114214314414514614714814915015115215315415515615715815916016116216316416516616716816917017117217317417517617717817918018118218',
'123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186');

It'll throw an error as follow...
Operation failed. The index entry of length 902 bytes for the index 'IndxNc_index_master__MultiCol01' exceeds the maximum length of 900 bytes.

Notes
- If a table column is a Unicode data type such as nchar or nvarchar, then column size in bytes is two times the number of characters specified in the CREATE TABLE statement. Check indexCol1.
- If you have index that exceed the 900 bytes, It'll allow insert/update until unless total sum of key columns not more than 900 bytes. In above insert query cut down long string into small and you'll be able to insert the record without any error.
- You can add big columns in index using INCLUDE clause [except text, ntext, and image datatype columns], SQL Server will not consider included columns for 900 bytes.

Hibernate and Database (Microsoft SQL Server) Indexes

Hibernate logo

Hibernate
Hibernate is an Object Relational Mapping (ORM) library used in java. This framework used to map relational database into Object Oriented domain model. This framework can handle small and enterprise level application both. Hibernate uses Prepared Statement (pre-compiled query) for data retrieval until unless you write hard coded native queries.


Database Index
A database index is a data structure that improves the speed of data retrieval operations on a database table at the minimal cost of additional writes and little more additional space to maintain a restricted set of selected data.


Hibernate and Microsoft SQL Server Index
Today we will intercept the interaction of Hibernate and SQL Server Index. We will see How actually hibernate queries stored in database.

In the following example we have table user_master. It has column user_name with datatype varchar(100). We will create an Index on column user_name and test it against Hibernate.

Step 1: Create an Index on column user_name.
Note: Index will be used only when table holds huge number of records minimum 1500-2000 records.
CREATE NONCLUSTERED INDEX [IXNc_user_master__user_name] ON [dbo].[user_master] 
(
 [user_name] ASC
)WITH (PAD_INDEX  = ON, FILLFACTOR = 80)

Step 2: Executing simple select query in SQL Server Studio.
/* Press Ctrl + M or enable execution plan in SQL Studio. */
/* After execution of query you can see the execution plan, that shows Index Seek(Search in index) of 'IXNc_user_master__user_name' */
SELECT * FROM user_master WHERE user_name = 'vicky_thakor'

Step 3: We will verify that index used for seek (search) or scan. Seek cost is very low but scan cost lot more than regular. Execute following query change database name and table name.
SELECT DISTINCT OBJECT_NAME(sis.OBJECT_ID) TableName,
  si.name AS IndexName, sc.Name AS ColumnName, sic.Index_ID, sis.user_seeks, sis.user_scans, sis.user_lookups, sis.user_updates
FROM sys.dm_db_index_usage_stats sis
INNER JOIN sys.indexes si 
 ON sis.OBJECT_ID = si.OBJECT_ID AND sis.Index_ID = si.Index_ID
INNER JOIN sys.index_columns sic 
 ON sis.OBJECT_ID = sic.OBJECT_ID AND sic.Index_ID = si.Index_ID
INNER JOIN sys.columns sc 
 ON sis.OBJECT_ID = sc.OBJECT_ID AND sic.Column_ID = sc.Column_ID
WHERE sis.Database_ID = DB_ID('database_name') AND sis.OBJECT_ID = OBJECT_ID('user_master');

Step 4: Clear all available Prepared Statements from SQL Server cache.
DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS

Step 5: Execute same query using Hibernate Criteria
Criteria criteria = session.createCriteria(User.class);
criteria.add(Restrictions.eq("Username", "vicky_thakor"));        
criteria.list();

Step 6: Execute following query to check what query Hibernate generated to fetch the data.
SELECT usecounts, text 
FROM sys.dm_exec_cached_plans
CROSS APPLY sys.dm_exec_sql_text(plan_handle)
WHERE cacheobjtype != 'Parse Tree'

/* Hibernate generated following query */
(@P0 nvarchar(4000))select this_.uid as uid0_0_, this_.Firstname as Firstname0_0_, this_.Lastname as Lastname0_0_, this_.user_name as user4_0_0_ from user_master this_ where this_.user_name= @P0

Hibernate and Microsoft SQL Server database indices

Important: We have column user_name with datatype varchar but hibernate fetching data using nvarchar datatype.

Execute query 3 to check index is used with seek or scan. You will find that index is scanned and that cost a lot.


Cross-check
Lets check how database act with different datatype for index. You are requested to turn on execution plan setting in Microsoft SQL Server [Read more: Execution Plan in MSSQL Server]. We will Prepare Statement at database level with both datatype.
/* Execute following query which leads to Index Scan (Check attached image) */
EXECUTE sp_executesql
@statement = N'SELECT * FROM user_master WHERE user_name = @P0',
@parameters = N'@P0 nvarchar(4000)',
@P0 = 'vicky_thakor'

/* Execute following query which leads to Index Seek (Check attached image) */
EXECUTE sp_executesql
@statement = N'SELECT * FROM user_master WHERE user_name = @P0',
@parameters = N'@P0 varchar(4000)',
@P0 = 'vicky_thakor'

Hibernate and Microsoft SQL Server database indices

Issue
Hibernate used nvarchar to support Unicode characters and we have column with varchar datatype. Database always upgrade lower datatype to higher datatype to eliminate data lose. Datatype conversion leads to index scan rather index seek.


Solution

You are suggested to change your datatype from varchar to nvarchar. If you are not interested in datatype change I have another solution for it.

Add parameter called sendStringParametersAsUnicode in your connection string. It will force hibernate to use varchar instead nvarchar.
<property name="hibernate.connection.url">jdbc:jtds:sqlserver://127.0.0.1:1433/javaQuery;sendStringParametersAsUnicode=false</property>

Conclusion
Its not only the hibernate use nvarchar, You should check all other framework you are working with. Indexes are meant to increase the performance but this intermediate framework and poor database knowledge leads to low performance of application.  

Prepared Statement in Microsoft SQL Server

Prepared Statement in Microsoft SQL Server

Database prepares execution plan(steps) for each query fired on it. Preparing execution plan for each query will increase execution cost. To reduce cost on creating execution plan for each query database provides Prepared Statements.

Prepared Statement
Prepared Statements are pre-compiled queries stored in database. Prepared Statement used when same query used with different values. Preparing execution plan in advance for particular query will increase query performance.

When, Prepared Statement?
Multiple occurrence of same query with different values will lead to Prepared Statement. Lets understand following queries...
/* Database will prepare two execution plan for following queries */
SELECT * FROM user_master WHERE user_name = 'vicky.thakor'
SELECT * FROM user_master WHERE user_name = 'chirag.thakor'

/* Prepared Statement */
SELECT * FROM user_master WHERE user_name = :param1
- First two queries are simple SQL statement, both are same except its value. For this queries database will prepare two execution plan.
- Third query is pre-compiled query and can accept n number of values and database don't have to prepare execution plan multiple times. Third query template/Prepared Statement used for above two query.

Rule for execution plan
Database will consider space, comma, special symbol while creating execution plan either its Prepared Statement or Simple SQL Query. Database will prepare different execution plan for following queries which seems similar to us but not for database.
/* Execution Plan 1; Comment is part of query*/
/* Native Query */ SELECT * FROM user_master WHERE user_name = 'vicky.thakor'

/* Execution Plan 2; Comment is part of query, We just changed 'q' in comment */
/* Native query */ SELECT * FROM user_master WHERE user_name = 'vicky.thakor'

/* Execution Plan 3 */
SELECT * FROM user_master WHERE user_name = 'vicky.thakor'

/* Execution Plan 4 (Prepared Statement)*/
SELECT * FROM user_master WHERE user_name = :param1

/* Execution Plan 5 (Prepared Statement); Comment is part of query */
/* Prepared Statement */SELECT * FROM user_master WHERE user_name = :param1
So far you noticed how importance of query writing and Prepared Statement. We all are using Prepared Statement from different programming languages, Program just prepare one of the above Prepared Statement and maintain same query structure for all database execution.

Now we will see how you can use Prepared Statements while writing Stored Procedures, functions in Microsoft SQL Server.

Step 1: Clear all available execution plan/Prepared Statements from cache.
DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS

Step 2: We will see the execution plan for simple SQL Query
SELECT * FROM user_master WHERE user_name = 'vicky.thakor'
SELECT * FROM user_master WHERE user_name = 'chirag.thakor'

Step 3: Check your execution plan in cache. You will find two execution plan in it.
SELECT usecounts, size_in_bytes, cacheobjtype, text 
FROM sys.dm_exec_cached_plans
CROSS APPLY sys.dm_exec_sql_text(plan_handle)
WHERE cacheobjtype != 'Parse Tree'

Step 4: Now again clear all available execution plan/Prepared Statements from cache.
DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS

Step 5: Now we will Prepare Statement in Database. Execute following query two times with different values. Change your table name, column name and value.
EXEC sp_executesql
@statement = N'SELECT * FROM user_master WHERE user_name = @param1',
@parameters = N'@param1 varchar(100)',
@param1 = N'vicky.thakor'

Step 6: Check your execution plan in cache. You will notice that same execution plan used twice.
SELECT usecounts, size_in_bytes, cacheobjtype, text 
FROM sys.dm_exec_cached_plans
CROSS APPLY sys.dm_exec_sql_text(plan_handle)
WHERE cacheobjtype != 'Parse Tree'