As a DBA, you would be faced with situations where databases would begin to grow exponentially causing issues like low disk space. One of the most common databases that is famous for growing is the almighty [TempDB]. TempDB is well known over the decades for being the major culprit for grow and filling up the disk space. As you continue to read this blog, you will learn how to handle such situations..
SUMMARY
This blog provides procedures and recommendations to help you diagnose and troubleshoot problems caused by insufficient disk space in the tempdb database. Running out of disk space in tempdb can cause significant disruptions in the SQL Server production environment and can prohibit applications that are running from completing operations.
tempdb Space Requirements
The tempdb system database is a global resource that is available to all users that are connected to an instance of SQL Server. The tempdb database is used to store the following objects: user objects, internal objects, and version stores. You can use the sys.dm_db_file_space_usage dynamic management view to monitor the disk space used by the user objects, internal objects, and version stores in the tempdb files. Additionally, to monitor the page allocation or deallocation activity in tempdb at the session or task level, you can use the sys.dm_db_session_space_usage and sys.dm_db_task_space_usage dynamic management views. These views can be used to identify large queries, temporary tables, or table variables that are using a large amount of tempdb disk space.
ANALYSIS
Any connection to the SQL instance fails and application is not able to connect. Any queries executed produces error:
Msg 9002, Level 17, State 4, Procedure xxxx, Line 3764 The transaction log for database 'tempdb' is full due to 'ACTIVE_TRANSACTION'
The disk space as shown below is FULL and tempdb database used up all the space on the disk.
Below is an example of errors you may encounter in such scenarios
SOLUTION
The query run below shows us what type of log wait is occuring on the database in the where clause. In this scenario, ‘tempdb’ database is our focus; this finds what is the log wait. This confirms the error you were recieving from above.
This query lists out the active session_ids and the reserved transaction log bytes specific to the database ‘tempdb’. This initial query gives you a general idea of which session is taking up most of the transaction log disk space for tempdb.
SELECT database_transaction_log_bytes_reserved,session_id FROM sys.dm_tran_database_transactions AS tdt INNER JOIN sys.dm_tran_session_transactions AS tst ON tdt.transaction_id = tst.transaction_id WHERE database_id = 2; --database_id 2 is equivalent to tempdb
Run this query to find culprit spid that has been running the longest (active transaction) with the largest log bytes used. The begin time column will also give you an insight as to how long this query has been running.
SELECT tst.[session_id], s.[login_name] AS [Login Name], DB_NAME (tdt.database_id) AS [Database], tdt.[database_transaction_begin_time] AS [Begin Time], tdt.[database_transaction_log_record_count] AS [Log Records], tdt.[database_transaction_log_bytes_used] AS [Log Bytes Used], tdt.[database_transaction_log_bytes_reserved] AS [Log Bytes Rsvd], SUBSTRING(st.text, (r.statement_start_offset/2)+1, ((CASE r.statement_end_offset WHEN -1 THEN DATALENGTH(st.text) ELSE r.statement_end_offset END - r.statement_start_offset)/2) + 1) AS statement_text, st.[text] AS [Last T-SQL Text], qp.[query_plan] AS [Last Plan] FROM sys.dm_tran_database_transactions tdt JOIN sys.dm_tran_session_transactions tst ON tst.[transaction_id] = tdt.[transaction_id] JOIN sys.[dm_exec_sessions] s ON s.[session_id] = tst.[session_id] JOIN sys.dm_exec_connections c ON c.[session_id] = tst.[session_id] LEFT OUTER JOIN sys.dm_exec_requests r ON r.[session_id] = tst.[session_id] CROSS APPLY sys.dm_exec_sql_text (c.[most_recent_sql_handle]) AS st OUTER APPLY sys.dm_exec_query_plan (r.[plan_handle]) AS qp WHERE DB_NAME (tdt.database_id) = 'tempdb' ORDER BY [Log Bytes Used] DESC GO
Run this query after you obtain the culprit session_id from the above query. This query gives you more details on the session.
select * from sys.dm_exec_sessions where session_id = 99 --make sure to change session_id to match the id in your environment
Run this query if the desired solution is to kill/terminate the session. Note; this will cause a rollback of the uncommitted transactions.
kill 99 --make sure to change session_id to match the id in your environment
CONCLUSION
As you notice, a full SQL Server restart will do the same thing. A restart will close all sessions on the instance and drop and recreate tempdb again. This would relief you of the encountered performance issues and errors. However, you will not have a full understanding and RCA of why the issue occurred. Only with the process provided above will you get a better understanding of the root cause and will provide you with a more informed judgement before considering any solutions. Also in some scenarios, a SQL Server restart can not be afforded as it is production and other applications are running. The steps provided will help you pinpoint the culprit and resolve the issue. Please be aware that killing the culprit session(s) will not release the disk space used by tempdb back to the OS. It is required to restart the SQL Server to allow that to happen. That can be scheduled later during a maintenance window or outside business hours.
HTH
Leave a Reply