Jumat, 04 November 2016

Work around for calculating age using the SQL Server DateDiff function


Problem

Have you ever wanted to compute age, but the results from the DATEDIFF function seemed to be wrong some of the time? In this tip we cover why the DATEDIFF function does not always reliably compute age?

Solution

The SQL Server online documentation describes how to compute the number of datepart boundary crossings between two dates and/or times with the DATEDIFF function. The documentation indicates you can specify many different types of boundary crossings, such as borders betweens years, quarters, months, days of a year, a month, or a week. If you do not think about this much, you might conclude that boundary crossings for years correctly reflects the duration in years between two dates or datetime values. This would be a mistake.
This tip explains why the number of year boundary crossings does not reliably compute the duration between two dates or datetime values. The tip also applies this understanding to reliably computing the age in years between two date values, such as a birthdate and a hospital admission date .

Demonstrating the Age In Years Issue for the DATEDIFF Function

When you specify a DATEDIFF function, you indicate two date or datetime values as well as the datepart for which you want to count the boundary crossings. The first date is often called the start date and the second date the end date, but there is no requirement for the values to be in a specific order. In the source data for a query or the parameter values for a stored procedure, the "start date" can always precede, always follow, or sometimes precede and sometimes follow the "end date". For example, while it is normal with new born babies for the birth date to be after the admission date for entry to a hospital, it may be necessary sometimes for a baby to return to the hospital shortly after birth for a medical procedure. On the return visit, the birth date will always be before the hospital admission date
The screen shot below illustrates two cases in which the start date is before the end date. The age in days between the two dates is either 2 or 3 days, but in one case the DATEDIFF function returns an Int data type. AgeInYears value of 0 while in the other case the AgeInYears value is 1.


It is important to understand that the DATEDIFF function is both reliable and valid in both cases. The DATEDIFF function counts the boundary crossings between the start and end dates identically in both instances. In the first case, both dates are on the same side of the boundary between 2011 and 2012 -- namely, the 2012 side. Because there is no boundary separating these two dates, the Int. AgeInYears value is 0. In the second case, the start date is on the 2011 side of the boundary, and the end date is on the 2012 side of the boundary. There is one year boundary between the two dates, and the DATEDIFF function returns a value of 1 for Int. AgeInYears.

A Fix for the Age In Years Issue with the DATEDIFF Function

A fix for the DATEDIFF function depends on your computing objective. The DATEDIFF function performs correctly, but its application may not be suitable for your computing objective. Let say, your objective is to fix the example above so the Int. AgeInYears value is 0 in both cases -- using the exact same pair of start and end dates specified above.
Looking at the preceding screen shot, you can see that the Int. AgeInYears value is wrong when the day of the year for the start date is greater than the day of the year for the end date. Notice that January 1, 2012 is the first day of the year for 2012, which is obviously before January 3, 2012 in the year 2012. On the other hand, December 31, 2011 is the 365th day of the year for 2011. Therefore, December 31, 2011 has a day of the year value far exceeding the day of the year value for January 3, 2012.
You can use the DATEPART function to assess separately the day of the year for the start date and the end date. In the preceding example, so long as the day of the year value for the start date is less than or equal to the day of the year value for the end date, the count of the year boundary crossings matches the age in years. In contrast, when the day of the year for the start date is greater than the day of the year for the end date, the DATEDIFF function returns an age in years that is one greater than is valid. Therefore, you can fix the problem by reducing the end date by one year in the preceding example. The following screen shot shows how to implement this fix with T-SQL.


If you run the preceding script with the same two pair of date values as previously, you'll get the right result for the two different start dates on either side of 2011 year boundary. The screen shot below shows the outcome of running the fixed code. Notice that the Int. AgeInYears value is 0 for both start dates.

A More Robust Fix for the DATEDIFF Function Age In Years Issue

While the above solution fixes the DATEDIFF function age in years issue for the same dates as used in the first example, the solution does not necessarily extend to other pairs of start and end dates. In particular, when the start date is after the end date in the same year, the Int. AgeInYears value equals -1. Also, the AgeInDays value is negative. Since age cannot be negative, we need another, more elaborate fix.
The negative value outcome requires different fixes for years and days. In the case of years, the age in years computes to a negative value of -1 just when the start date is after the end date in the same year. As a result, our fix can check for a -1 computed value and replace it with 0. In the case of days, the age in days computes to a negative value that reflects the true difference in days between the dates. Therefore, for days, we can simply wrap the DATEDIFF function inside an ABS function. The following screen shot shows the application of these ideas to the prior fix.
The following excerpt from a Results pane confirms the ability of fixed code to properly handle the negative outcomes from the previous fix. The first row is from the previous fix, and the second row shows the valid results returned by the second fix. Since both the start date and the end date are in the same year, the age in years should obviously be 0 and not -1. In the case of age in days, the start date and end date are one day apart. Age is not directional. It makes no sense to say a baby is - 1 day old. Wrapping the DATEDIFF function in the ABS allows the result to be positive whether the start date is before or after the end date.


Up until this point, all the examples were for cases in which the age in years should be 0, but the final expression for age in years also works correctly for dates that are one or more years apart. The final expression even validly accounts for leap years. The first row in the following Results pane excerpt shows two dates with an age in days of 1460 days, which happens to be equivalent to 4 times 365. However, the reported age in years between the dates in the first row is 3 years -- not 4 years. This is because the two dates encompass the extra day of February 29, 2012. The inclusion of a leap year can require that the dates be at least 1461 days apart for the difference in years to be 4 years apart. The second row in the excerpt below illustrates that the expression for age in years correctly handles this detail.


EmoticonEmoticon