Data entry, it’s where users love to copy and paste. Especially when their text is nicely arranged with multiple spaces, cartridge returns and line feeds amongst other special character formatting.
In a perfect world this would be captured during input (see end of article), but if some slips through the net you need to be able to clean the text string within the SQL database when faced with a situation like this.
To clean this up you could blaze in and use the function REPLACE
SELECT REPLACE(column_name, CHAR(13) + CHAR(10), '') FROM table_name GO
This will remove ALL entries so the users will be calling up complaining their formatting has gone.
Alternatively you need to find the LAST valid text and remove the remainder. Finding the end of a string is not straightforward in SQL as most functions only take an expression such as ‘foo’ and not a pattern. The function PATINDEX does accept a ‘% like’ statement, but finds the first occurrence. Yes, you can use it along with the function REVERSE if you are using SQL 2008+ but it’s not very nice.
My solution, in this instance, where we have multiple formatting characters is to look for the first instance of two identical characters and then keep all preceding information.
This update statement will clean and update a column storing a string contaminated like the above example.
UPDATE table_name SET column_name = SUBSTRING(column_name,0,CHARINDEX(CHAR(10)+CHAR(10), column_name)) WHERE column_name LIKE '%'+CHAR(10)+CHAR(10)+'%'
Prevention is better than the cure, so if users are submitting data in Excel you can use this combination of Excel functions to pre-clean the data.