I want to talk about how we wanted to do our own database backup for Microsoft SQL Server, and what came out of it.
We use Microsoft SQL Server in our company and it is expected that we have to do database backups for it, which we do successfully.This is done as follows :
- The first script makes a backup of the necessary databases
- The second script goes through the folders with backup databases and archives them in 7zip
- The third script takes these backups to the external archive
It seems to be logical in my opinion, but there is some weirdness in all this. Because for some time on the server there are files that actively take up space, and then these files are deleted for lack of use. I wanted to optimize it all. Besides, our company has a product, one of the tasks of which is database synchronization, and it also could need this kind of code to optimize the task.
Compressed backup with SQL tools
Good idea, but Not every SQL version supports it, and the size of this backup is not the smallest anyway. And given that in most cases, backups are never useful, it would be nice to have more compression. But it is a great option in terms of convenience.
Backup to Azure
SQL Server as of 2012 supports Azure backup I had the idea to specify my own service as a URL and do everything in it. But it was very complicated and dirty technology. And the SQL version must be quite recent.
Backup to pipe
Great idea, but SQL is not supported since 2005.
Backup via VDI
A secret technology from Microsoft, replacing pipe. Why is it secret? Try to find the documentation for it. It’s a real challenge. There’s a documentation on the Microsoft web site which is kind of clear, but it doesn’t make any sense. Because it seems to be about VDI, but it seems to be about VSS. There are reprints of it and plush examples of its use on various sites.
In fact, the clearest documentation is in the chm file at this link
After all this documentation I had a feeling that it was just a service technology, but since I couldn’t find anything else, I settled on this one. It works with any SQL since 2005 (even 2000 with some restrictions), in any edition.
Nothing to do, took the documentation, got Anya, gave out the documentation, googled and sent off to learn this technology.
Anya studies the documentation
If I was coming up with this technology, I would do this :
- Registered virtual device in SQL
- Would make a backup command to this device with the parameters
- Would write instructions to the user on how to use
That would make sense, but Microsoft is not looking for easy ways. They came up with the following scheme
- Register disposable device in SQL with a random name (usually everyone uses guides)
- We are waiting for data from SQL
- At the same time, another thread makes a backup into this device (we made up the name ourselves, so only we can call it)
- We get data from SQL in chunks
- We got it all, close the device and finish the database backup command
Of course, everything is written in the "wonderful" COM, so that life doesn’t seem like honey and is spiced up with the following restrictions :
- SQL and application must be on the same server (thank goodness different SQL instances are supported)
- SQL user must be sysadmin (humble)
- The user who runs the application must be an administrator or a user with magical permissions Create Global Objects
More on this last point: there is a masterpiece phrase in the documentation :
The CreateEx (and Create) calls will modify the security DACL on the process handle in the client process. Because of this, any other modification of the process handle must be serialized with invocation of CreateEx. CreateEx will serialize with other calls to CreateEx, but is unable to serialize with external processing. Access is granted to the account running the SQL Server service.
Typical phrase like : feel like a loser That is, all the words are clear, but there is no meaning. In fact, as I wrote above, you need the right Create Global Objects which is available to administrators, Network Service and Local Service by default. That is, in principle, you can run everything from the service, but not the administrator (the user who makes backups, usually not the administrator), can not. But it is possible to grant permissions manually, through group policy (we just did a command in the utility, which grants the rights to a specified user, i.e. once administrator, you can continue to work). Unfortunately, we didn’t manage to figure out UAC. If you forget to run the program explicitly as the administrator, you won’t get any rights. It was possible to explicitly specify in the manifest that it must have administrator rights, but then you can not use the program not the administrator. But I would like to.
That’s all taken care of, let’s move on. I won’t write code snippets and describe what to do in details, because nobody is interested in this, and those who are interested will figure it out themselves. 🙂 In general, there is an usual loop like GetCommand, CompleteCommand, in the middle of which the results are processed. I will just tell you some details, which appeared when using this COM in .NET
A lyrical digression I completely forgot to tell you what utility we wanted to make (and did in the end). We wanted to make a backup of the database straight into the archive. Normal zip archive. And of course we wanted to restore from it. That is, no special file formats. We could either unpack and restore without this utility, or pack the existing backup and then restore it with the utility. Of course, the first iteration had a problem: not all compression libraries could normally process files larger than 2GB, so we settled for 7zip and its wrapper SevenZipLib As a result you can archive to 7z, zip, bz2, xz, gz. But 7zip in this mode also works via COM… which gave us a lot of joy.
COM The Great and Terrible
Yes, we decided to write everything in .NET. Actually, the marshalling in COM is pretty good there (half of the system libraries are basically wrappers over COM), and it seemed like there wouldn’t be any problems. But they were encountered literally every step of the way.
This part will probably be obvious to those who have already encountered it, but for those who haven’t, it might be interesting.
When declaring objects, method names do not matter
Only the order is important (this is the difference from WinAPI methods with attribute DllImport). That is, if you accidentally mixed up two methods (especially similar ones, such as Create and CreateEx), you can get great bugs, when everything seems to work, but not quite as it should be.
NET will do your excerpts for you
In COM methods usually return HRESULT, i.e. error code. NET takes advantage of this and in case of non-zero code throws a COMException, in which you can look at the code and handle it. Seems convenient, but not really. In addition, on one of the servers with old .NET there is a wonderful bug. When this action occurs (for example, when accessing a closed device, which is actually a normal situation according to the documentation), ExecutionEngineException popped up and fatal crash of the application. Google suggested installing updates as a solution, but in general this is not a solution to the utility, so we worked around the problem as follows: We added a PreserveSig attribute to each COM method, which in case of an error will not throw an action, but will simply return the error code. In this case, no terrible problems did not arise. And the code is more logical now.
STAThread and MTAThread
For .NET developers, this is some magic attribute that the studio adds to the Main method. If anyone tries to figure out what it is, they get the following chic in its simplicity :
STAThreadAttribute indicates that the COM threading model for the application is single-threaded apartment. This attribute must be present on the entry point of any application that uses Windows Forms; if it is omitted, the Windows components might not work correctly. If the attribute is not present, the application uses the multithreaded apartment model, which is not supported for Windows Forms.
Which translates to: dude, it’s necessary.
Well, of course we should, and when our application was archiving to zip, everything was fine, but when we tried to use 7z, everything froze epically. So what’s the problem, and what do these attributes really mean?
In a nutshell, the point is this. COM is multi-threaded out of the box. That is, different objects receive and return data in different threads. The ugly-armed code (or Windows Forms) cannot handle such a complex concept, so they wrap all COM calls in an internal queue and execute in one thread. No problems with multithreading! But we had an epic run-in with Dead Lock. How much it cost to figure out that it’s a deadlock, and a deadlock somewhere in the guts of .NET… ugh…
But we’re lucky, we have console application, so we just put MTAThread attribute and relax. If that doesn’t work for you… well, suffer. There’s a workaround.
Creating objects can be strange
NET creates handy wrappers, so instantiating objects looks simple and obvious :
var obj = new ClientVirtualDeviceSet();
And everything worked for us until we tried to run everything on 32-bit Windows. When we created the object, an exit prompt popped up. It was logical to assume that the problem was with data types and marshalling (wrong size), but everything was OK. We didn’t know what the problem was, we just created the object like in the good old days, with CoCreateInstance
object comObject;var errorCode = CoCreateInstance(CLSID_MSSQL_ClientVirtualDeviceSet, IntPtr.Zero, CLSCTX_INPROC_SERVER, CLSID_MSSQL_ClientVirtualDeviceSet, out comObject);
I won’t describe all the small stuff with marshalling, proper structure layouts, in principle, they are fine-tuned once and never touch again. Just a tip, if it’s hard to deal with pointers, give IntPtr and convert it to the right type by hand. Might be easier than picking up the right attributes for .NET
A little about 7z
More precisely about SevenZipSharp, but since it’s a wrapper in general (very good), it’s about it too.
The first thing that might be confusing is that the library is geared to work with files (not streams), so the logic is inside out. So in .NET the compressor is usually a wrapper over the writable stream, you write into it, it compresses and passes the compressed data into the internal stream. In SevenZipSharp, on the contrary, it reads data from your stream and writes it into its internal stream. Considering that we also read from SQL, Anya had a lot of pleasant moments of troubleshooting with asynchronous streams, locks, events… Well, it’s an amateur’s game 🙂
The second, again related to the files, is that the library really likes to run in different directions along the flow, which is generally not bad, but if the flow doesn’t allow to do so (e.g., stdout), problems begin. In principle, we managed to work around that (also a separate quest, after which Anya started to hate me) for gzip, xz, gzip, 7z, but there were difficulties with zip. The thread structure does not allow to do that. The situation is even worse with reading from stdin – the library is not able to do that. We had to abandon this feature for now and leave only the possibility to read uncompressed backups from stdin.
After all this fascinating quest, we have a utility with the original name PackDb, which can make backups of databases immediately in the archive, restore them from there, and a bonus, another fascinating feature: copying the database without creating temporary files. What is this for? For test databases, when you want to copy the master database to a new one and test it on this database without touching the main one. And all sorts of little things like rights issuance, backup validation or ability to not use archives if you don’t want, but just take bak files.
It’s up to you to decide whether this quest is worth it if you want to make unusual backups. But in my opinion, it’s a very thankless task. There are too many nuances to make everything beautiful.