The legacy database I’m working from is MySQL version 4.0 and I am moving to SQL Server 2008. My migration strategy is fairly straightforward: at some point in time (hopefully the near future) we will be able to run a script at night that copies all data to MS SQL, then transforms it to the new data model.
My tools for this will be:
- dbForge Studio
- MySQL ODBC data connector
- RegEx for search and replace
- MS SQL Server Management Studio (SSMS)
The first step is getting all the table data over to MS SQL. I don’t want to impact the live db and I want to be able to run the script as many times as I like, whenever I want to. Depending on performance, I may even set it up for a nightly batch for the short term.
I don’t care about preserving indexes (which there are none) nor the identification of primary keys (of which there aren’t on all tables). I need to be able to ‘flash’ my copy and then operate off of the data at that location. I will not be sync’ing anything and I really don’t care if IDs line up.
All of the above statements on the state of the data actually help me out a fair bit. While it may not be too efficient, I am free to DROP all the tables I have imported each pass and recreate (and populate) them. This gives me the side-effect bonus of being able to ‘accidentally’ toast any parts of the data that I like.
Here’s another benefit: being in SQL Server, I can now start writing transforms that move data to my new tables, then output reports through SQL Reporting Services nightly to PDF, giving me a head start on some Raven development…but I digress…
So, here are the steps I followed:
- Establish a connection to the legacy database on the new database server through the MySQL ODBC connector.
- Get a list of existing tables from the legacy DB.
- Convert the list into a reusable script using RegEx
Establish A Connection
I installed the 5.1 build of the MySQL ODBC driver, but I failed to establish a connection to the database when I tried to create the system DSN. The version of MySQL that is running on the server is 4.0, so I took a shot in the dark and installed the 3.51 build of the ODBC driver and was able to connect.
You will need to name the DSN, know the IP address (or host name) that resolves to old server as well as the username and password of a user with appropriate privileges. I’m an arse, so I just used root. ;o)
Pick your database and away you go.
Next, pop over to SQL Server and navigate to your Server Objects in the tree, then expand and right-click on Linked Servers to create a new one. Name your Linked Server; this will be the name you use when you reference the server in TSQL.
Select Microsoft OLE DB Provider for ODBC Drivers as the Provider, specify MySQL as the Product Name and then use the DSN name you specified when you created your System DSN for the Data Source property.
All connected, and good to go.
Get Your List of Tables
This actually proved to be much easier than I anticipated. You can likely use any client/editor/SQL environment that connects to MySQL, but I’ve been lovin’ on dbForge Studio.
I opened up my editor, selected the database and started a new query. Brace yourself, but you’ll have to type two whole words: SHOW TABLES. Nice.
Copy to the clipboard your results and head over to SSMS.
Convert Table Names to a Script
I love regular expressions. I actually was contracted by MSDN magazine to write an article on RegEx as the release of .Net 2.0 was about to be made. Sadly (for me), the content of the article didn’t really fit the theme for a good 4-6 months, and by then there was a good amount of content out on the market. Oh well, got my $150 from MS as a retainer. :oD
Paste your list of tables into a new query window (on any existing database). We’re not actually going to run anything here, we just want to use the features of the editor to do some crazy-mad scripting wizardry.
After you’ve pasted, make sure there are no ‘extra’ blank lines at the end of the editor…it will save a clean up step at the end.
Hit CTRL-H to open up the Find and Replace dialogue. In ‘Find what:’, type “^.*$” without the quotes. This is a regular expression that will read a whole line of text. The caret is ‘start of string’, the .* eats all text to the $, which is ‘end of string’.
Next, paste the following into the ‘Replace with:’ field:
if object_id('\0', 'U') is not null drop table \0 \n select * into \0 from openquery(YOUR_LINKED_SERVER_NAME, 'SELECT * FROM YOUR_LEGACY_DB_NAME.\0')
Basically, what we’re doing is using the regular expression engine to replace a table name with our little bit of script. Anywhere the \0 appears, regex will inject whatever we ‘ate’ with .*, which in our case was a single table name.
Be sure to change the obvious placeholders for your linked server name and the name of your legacy DB.
Hit replace all and boom! you will have your script. Save it out, then select all and copy to the clipboard. You can close the query at this time.
Create your new database in SQL Server, then right-click on the db and start a new query. Paste your script into the editor and…be brave now…hit F5.
Now go for coffee. If you have a lot of data in your legacy DB, you’ll likely have time to go for coffee in a nearby neighbourhood. And, if it’s really large, you might even be able to grow your own beans.
My MySQL server is on the same network and the DB I’m ripping has approximately 100,000 rows. It takes just over 12 minutes for me to run the above.