Remove multiple CR LF characters from the end of text strings in SQL

Data entry, it’s where users love to copy and paste. Especially when their text is nicely arranged with multiple spaces, cartridge returns and line feeds amongst other special character formatting.
In a perfect world this would be captured during input (see end of article), but if some slips through the net you need to be able to clean the text string within the SQL database when faced with a situation like this.

User formatted input with multiple CR LF characters

To clean this up you could blaze in and use the function REPLACE

SELECT REPLACE(column_name, CHAR(13) + CHAR(10), '')
FROM table_name
GO

This will remove ALL entries so the users will be calling up complaining their formatting has gone.

Alternatively you need to find the LAST valid text and remove the remainder.  Finding the end of a string is not straightforward in SQL as most functions only take an expression such as ‘foo’ and not a pattern.  The function PATINDEX does accept a ‘% like’ statement, but finds the first occurrence.  Yes, you can use it along with the function REVERSE if you are using SQL 2008+ but it’s not very nice.

My solution, in this instance, where we have multiple formatting characters is to look for the first instance of two identical characters and then keep all preceding information.

This update statement will clean and update a column storing a string contaminated like the above example.

UPDATE table_name
SET column_name = SUBSTRING(column_name,0,CHARINDEX(CHAR(10)+CHAR(10), column_name))
WHERE column_name LIKE '%'+CHAR(10)+CHAR(10)+'%'

Additional

Prevention is better than the cure, so if users are submitting data in Excel you can use this combination of Excel functions to pre-clean the data.

=TRIM(CLEAN(SUBSTITUTE(Your_Cell_Here,CHAR(160),CHAR(7))))
Posted in Excel, Microsoft, SQL Server | Leave a comment

Find max or min values from a list of duplicates in Excel

In Excel sometimes you will need to find either the first or last value from a mixed list of values that also include duplicates.

Later versions of Excel feature a Remove Duplicates button under the Data tab in the ribbon.  This only removes unique duplicates, if you start including your date or sales values in the list you find Excel sees only unique values.

The work around is to use an Excel Array Formula.

You can combing MIN / MAX and IF to lookup the value from an array and then grab the relevant version of the data.

Below I use MAX to find the latest date for each of the unique values.

Syntax

=MAX(IF(<value to lookup in list> = <unique value>,<range of date for all values>))

To create this as an Array Function press Ctrl + Shift + Enter rather than just Enter

You can tell an Array Function by the curly brackets around the entire request (including equals)

{=MAX(IF(A:A=D2,B:B))}

Example

Excel Max List Array Function

You can then fill down appropriately to populate the rest of your values.

If using dates you will need to format the result cell as a date field or Excel will display the numeric value for the date instead.

Posted in Excel, Microsoft | Leave a comment