My Sixth Week at SQL Services

Tuesday 11.04.2017

This morning I did some research into alert sev22-25, as I had no idea what those alerts actually related to.  This is what I found out:

Alerts 20-24:

Indicate system problems and are fatal errors, which means that the Database Engine task that is executing a statement or batch is no longer running. The task records information about what occurred and then terminates. In most cases, the application connection to the instance of the Database Engine may also terminate. If this happens, depending on the problem, the application might not be able to reconnect.

Error messages in this range can affect all of the processes accessing data in the same database and may indicate that a database or object is damaged. Error messages with a severity level from 19 through 24 are written to the error log.

Alert 22

Indicates that the table or index specified in the message has been damaged by a software or hardware problem.

Severity level 22 errors occur rarely. If one occurs, run DBCC CHECKDB to determine whether other objects in the database are also damaged. The problem might be in the buffer cache only and not on the disk itself. If so, restarting the instance of the Database Engine corrects the problem. To continue working, you must reconnect to the instance of the Database Engine; otherwise, use DBCC to repair the problem. In some cases, you may have to restore the database.

If restarting the instance of the Database Engine does not correct the problem, then the problem is on the disk. Sometimes destroying the object specified in the error message can solve the problem. For example, if the message reports that the instance of the Database Engine has found a row with a length of 0 in a non-clustered index, delete the index and rebuild it.

Alert 23

Indicates that the integrity of the entire database is in question because of a hardware or software problem.

Severity level 23 errors occur rarely. If one occurs, run DBCC CHECKDB to determine the extent of the damage. The problem might be in the cache only and not on the disk itself. If so, restarting the instance of the Database Engine corrects the problem. To continue working, you must reconnect to the instance of the Database Engine; otherwise, use DBCC to repair the problem. In some cases, you may have to restore the database.

Alert 24

Indicates a media failure. The system administrator may have to restore the database from a previous point in time. You may also have to call your hardware vendor. Going forward hardware may need replacing or repairing.

Alert 25

Indicates some type of system error. I am yet to find any more information on Alert sev25. So more research will need to be done!

 

Later in the morning, Kathryn had time to go over a few of my alerts/diagrams with me. She was really encouraging and said I had done a great job, which boosted my courage slightly before my meeting with Adam.  Working with Kathryn also helped clarify a few things. There were some points in my diagrams where I wasn’t entirely sure of when a DBA might need to escalate, and then what happened after that escalation. Or even who exactly they should escalate to. Kathryn helped confirm that process for me. She also pointed out a few places where she would do checks just to make sure the issue causing the alert was still in affect, and hadn’t in fact resolved itself. So that was very helpful, as I’m sure this is common practice among all of the DBA’s. Also, alerts have often resolved themselves by the time a DBA checks the problem, so these are just causing ‘noise’.

My meeting with Adam in the afternoon went really well. He was happy with what I had achieved so far, and confirmed that the level of detail I  have is exactly as he wants it. Which was a relief! A few minor changes were made to a couple of the diagrams along the way, but I now have 3 that have been checked and approved by Adam. Three more which need to be checked by Adam, and 9 more in the pipeline. I have a lot of work to do before the end of the month, which is when Adam would like me to have all of the diagrams completed and ready to upload to SharePoint. But I feel like I am on track!

Wednesday 12.04.2017

I spent the morning completing the initial diagrams for the remaining alerts. This initially resulted in 11 more diagrams. However when looking at the SharePoint page for mirroring alerts, I noticed there are 9 separate mirroring alerts which all have their own resolution process. Because of this, I created a further 9 diagrams for each of those mirroring alerts, although I’m not sure these will actually be used, but it definitely helped my understanding. So, by the end of the morning I had the first versions of 20 alert resolution processes.

My next step will be to find times to meet with Rosie, Kathryn and Tim in order to compare my initial iterations with the process they actually carry out.

Because we had a staff lunch thing that afternoon, half of the DBA team were out. This meant everyone was super busy so I felt I shouldn’t take up any of the others time. Because of this, I spent the rest of the afternoon doing some further research on mirroring. Below is some of what I ldiscovered:

Mirroring

Database mirroring is a solution for increasing the availability of a SQL Server database. Mirroring is implemented on a per-database basis and works only with databases that use the full recovery model.

Database mirroring involves redoing every insert, update, and delete operation that occurs on the principal database onto the mirror database as quickly as possible. This is accomplished by sending a stream of active transaction log records to the mirror server, which applies log records to the mirror database, in sequence, as quickly as possible. Unlike replication, which works at the logical level, database mirroring works at the level of the physical log record.

Benefits of Database Mirroring

  • Increases availability of a database
  • Increases data protection
  • Improves the availability of the production database during upgrades

Components 

There are two databases involved in the mirroring process; principal and mirror

  • Principal Database: a read-write database whose transaction log records are applied to a read-only copy of the database (a mirror database).
  • Mirror Database: The copy of the database that is typically fully synchronized with the principal database.

There are 3 server role types available in mirroring; principal, mirror and witness.

  • Principal Server: the partner whose database is currently the principal database.
  • Mirror Server: the server instance on which the mirror database resides.
  • Witness server: For use only with high-safety mode, an optional instance of SQL Server that enables the mirror server to recognize when to initiate an automatic failover. Unlike the two failover partners, the witness does not serve the database. Supporting automatic failover is the only role of the witness.

 

Reasons to monitor mirroring

  • Verify that mirroring is functioning
  • Determine whether the mirror database is keeping up with the principal database
  • Determine how much data was lost when the principal server instance becomes unavailable during high-performance mode
  • Compare current performance with past performance
  • Troubleshoot the cause of reduced data flow between monitoring partners
  •  Set warning thresholds on key performance metrics

 

Thursday 13.04.2017

I spent all of Thursday sitting with Nadine, resolving certain alerts and writing a couple of reports. Since it was our last day before Easter we were treated to Easter eggs and hot cross buns! Yum. I didn’t make any further progress on my diagrams today as everyone is busy before the break, and cant spare any time to chat with me. Maybe next Wednesday will be better after they have a chance to catch up after the long weekend. Well I hope so anyway!

By the end of this week I have now spent 160 hours at my work placement.

Leave a comment