This blog has, IMO, some great resources. Unfortunately, some of those resources are becoming less relevant. I'm still blogging, learning tech and helping others...please find me at my new home on http://www.jameschambers.com/.

Thursday, May 7, 2009

MS SQL Server: Writing String.Split

UPDATE: A reader pointed out a scenario that I had not tested for: trailing data after the delimiter.  I did not run into this in my testing (I was using line breaks and always had them at the end of the document I was testing) but I have updated the code below to reflect how to grab the rest of the text.

When you come from a programming background you might find it hard to believe when you flop over to the MS SQL side of the world that SQL Server does not have a SPLIT function for string manipulation.  In .Net, we have the string.split method that allows us to break up text based on a delimiter.

In MS SQL, you have to write a string SPLIT function yourself.  Not having a great head (yet) for performance, I would suggest that because of the walking the string bits in here, my stab at this shouldn’t be used in high volume situations.  That said, I walked across 3,500 rows generating 70,000 rows of split-output (and inserting them to the DB) in less than 20 seconds.

The following method returns as a table variable, so the correct syntax for calling requires a SELECT as such:

select * from Split(YOUR_TEXT, ',')


Here’s the function for your compilation happiness:



create function dbo.Split
(@SourceText varchar(max), @Delimiter char(1
))

returns @Table table(StringID int, Value varchar(max
))
AS
begin
declare
@count
integer

declare
@curPos
integer
declare
@textPartLength
integer
declare
@prevPos
integer
declare
@nextOcc
integer
declare
@textPart varchar(max
)

set @count = 1
set @prevPos = 0
set @curPos = 0
set @nextOcc = charindex(@Delimiter, @SourceText
)

while @nextOcc > 0
begin
set
@curPos = @nextOcc
set @nextOcc = charindex(@Delimiter, @SourceText, @nextOcc + 1
)
set @textPartLength = @curPos - @prevPos

-- grab the substring from the source text and remove the delimiter
set @textPart = substring(@SourceText, @prevPos , @textPartLength
)
set @textPart = replace(@textPart, @Delimiter, ''
)

insert into @Table (StringID, Value) values(@count, @textPart
)

set @prevPos = @curPos
set @count += 1
end



-- get any text after the last delimiter
set @textPartLength = LEN(@SourceText) - (@prevPos - 1
)
set @textPart = substring(@SourceText, @prevPos , textPartLength
)
set @textPart = replace(@textPart, @Delimiter, ''
)
if(LEN(@textPart) > 0
)
insert into @Table values (@count, @textPart)




return


end




2 comments:

  1. it does not return the last element!

    I place '123.456.78.90' and it returns 3 elements only:
    123
    456
    78

    Not working!!!

    ReplyDelete
  2. Thanks for pointing this out, anon. I had originally written the code to break apart a document stored in the database by line breaks, of which there was always one at the end of the document, so I never ran into this.

    I updated the code accordingly so that it works with text after the last delimiter.

    Cheers,
    -jc

    ReplyDelete