Home PostgreSQL Bucardo: Multimaster replication

Bucardo: Multimaster replication

by admin

In the process of agony I dug through a ton of articles and decided to write a thoroughly commented manual.Especially since the information on configuring multimaster and in Russian is very scarce and it’s some kind of piecemeal.
A little introduction.To make Bucardo work, we have to :
1) Tell him which member bases on which servers even exist.
2) Tell him which tables are involved in replication.
Warning : if the developers will add a new table to the application, we must inform bucardo about it. The same applies to changing the schema of existing tables.
3) Tell it which table groups exist and which tables fall into which groups. Groups are needed in case you need to replicate different tables between different servers. It’s more convenient to work with a group than to specify each one separately (very similar to groups in Nagios).
4) Tell him what database groups exist. The purpose is the same as for tables.
Let’s move on to installation. A variant for Debian 7. It assumes that postgresql-9.1 and postgresql-client-9.1 are already installed.

Preliminary preparation

The servers will be called node1 and node2 Be sure to also check that all participating PostreSQL servers listen to the external interfaces :

# netstat -plnt4 | grep 5432tcp 0 0* LISTEN 12345/postgres

Install Bucardo package and PL/Perl support for PostgreSQL on each of the servers :

# apt install bucardo postgresql-plperl-9.1

Activate on each of the servers :

# sed -i 's/ENABLED=0/ENABLED=1/' /etc/default/bucardo

For some reason the maintainers of the package did not think of creating a directory for PIDs, so we will create one on each of the :

# mkdir /var/run/bucardo

Make sure that we can connect via TCP socket to the DBMS on each of the servers :

# psql -U postgres -h

If you don’t remember the password, the easiest instructions are here
If the PG does not want to accept requests from a specific address of a specific user, then configure /etc/postgresql/9.1/main/pg_hba.conf
Next will be the initialization of the database. It will be created by the postgres user, but populated by the bucardo user, so you can get bogged down with a connection problem.
To avoid this we will put a line for it in /etc/postgresql/9.1/main/pg_hba.conf beforehand. Moreover already during the process Bucardo will not only access his cluster node, but also the paired one. That’s why we mustn’t forget it either. If you have more servers in your cluster do not forget about them. On each of the servers :

host all bucardo trusthost all bucardo SECOND.NODE.IP.ADDRESS/32 password

After that, we restart the NMS :

# pg_ctlcluster 9.1 main restart

Bucardo Installation

The bucardo_ctl utility has been replaced by bucardo in recent versions of Debian, so we will be using it.
Initialize Database :

# bucardo install

The dialog looks something like this :

# bucardo installThis will install the bucardo database into an existing Postgres cluster.Postgres must have been compiled with Perl support, and you must connect as a superuserCurrent connection settings:1. Host: localhost2. Port: 54323. User: postgres4. Database: postgres5. PID directory: /var/run/bucardoEnter a number to change it, P to proceed, or Q to quit: PPassword for user postgres:Postgres version is: 9.1Password for user postgres:Creating superuser 'bucardo'Password for user postgres:Attempting to create and populate the bucardo database and schemaPassword for user postgres:Database creation is completeUpdated configuration setting "piddir"Installation is now complete.If you see errors or need help, please email bucardo-general@bucardo.orgYou may want to check over the configuration variables next, by running:bucardo show allChange any setting by using: bucardo set foo=bar

During the initialization process the database was created from the file /usr/share/bucardo/bucardo.schema, so you don’t need to fill it in manually, as described in the manuals of previous versions.
Bucardo is installed, you can run it :

# bucardo start

Setting up replication

Before setting up replication, let’s create the test bases that we will replicate.
On each of the servers :

# psql -U postgres -c "CREATE DATABASE mydb;"# psql -U postgres mydb -c "CREATE TABLE mytable ( num123 integer PRIMARY KEY, abc varchar(10) );"

Another important point about security. After adding the replicated database to the setup, Bucardo will put the user’s password into the database. As he didn’t ask for it during the installation he uses the same password as for the postgres user. In other words we have in the bucardo database the password of the superuser which is a bit risky.
So let’s make him a different password. On each of the servers :

# psql -U postgres -c "ALTER USER bucardo WITH PASSWORD 'eiP4uSash5';"

Next we let Bucardo know how to connect to the database we want to replicate. I’m not a fan of Unix sockets in a high load environment (a separate topic to talk about), so even where local, we’ll specify the TCP socket.
WARNING : We do this on the server node1. And in general further work only with node1 until it is specified what to do on both.
Let’s add a local (mydb_node1) and its remote copy (mydb_node2) from server node2:

# bucardo add database mydb_node1 dbname=mydb dbhost= dbuser=bucardo dbpass=eiP4uSash5Added database "mydb_node1"

# bucardo add database mydb_node2 dbname=mydb dbhost=node2.example.com dbuser=bucardo dbpass=eiP4uSash5Added database "mydb_node2"

Here :
mydb_nodeX is internal base name. This is the name Bucardo uses in its internal work with the base.
dbname=mydb – the real name of the database in PostgreSQL, to which mydb_nodeX refers.
dbuser=bucardo – under which Bucardo will connect to the database to work with this database.
We can see the result this way :

# bucardo list databaseDatabase: mydb_node1 Status: active Conn: psql -p -U bucardo -d mydb -h mydb_node2 Status: active Conn: psql -p -U bucardo -d mydb -h node2.example.com

These settings are taken from the db table of the bucardo database, where the aforementioned password sits:

# psql -U postgres bucardo -c "SELECT name, dbname, dbhost, dbuser, dbpass, status FROM db;"name | dbname | dbhost | dbuser | dbpass | status------------+--------+-------------------+---------+------------+--------mydb_node1 | mydb | | bucardo | eiP4uSash5 | activemydb_node2 | mydb | node2b.forbet.net | bucardo | eiP4uSash5 | active(2 rows)

Now we need to add a table which we will replicate between them. Most of the time people replicate the whole database, so we will add everything at once (a group of tables (herd) will be created automatically). If the developers will come up with a new table, we’ll just add it to the group and everything will work by itself – since further settings will apply to the whole group.

# bucardo add table all --db=mydb_node1 --herd=mydb_herdCreating herd: mydb_herdAdded table public.mytable to herd mydb_herdNew tables added: 1

Here :
–herd=mydb_herd – the name of table group, so you can set up synchronization not to each table individually, but to all tables together.
And we can watch it right away :

# bucardo list tablesTable: public.mytable DB: mydb_node1 PK: num123 (int4)Here we need to focus on the PK. Bucardo doesn't seem to work with tables without primary keys. You won't be able to do a sync afterwards.

You can see the group, too :

# bucardo list herdHerd: mydb_herd DB: mydb_node1 Members: public.mytable

The same goes for sequences. We don’t have them in our example, but maybe someone will use them. We won’t create our own group for them, so as not to complicate things. The probability that the tables are replicated in one direction, and the sequences in the other is extremely low. So let there be one group for tables and sequences:

# bucardo add sequence all --db=mydb_node1 --herd=mydb_herdSorry, no sequences were foundNew sequences added: 0

Our next task is to create a replication group. In this group, we will tell which base will be the source and which will be the recipient of the data. Let’s first create the group itself, which is empty for now:

# bucardo add dbgoup other_mydb_serversCreated database group "mydb_servers_group"

Add both of our servers to the group, specifying who will play which role. This is the only point where master-slave is different from master-master
Initially, you might think that source is the source and target is the recipient. In fact, this is not exactly true. source is someone who works as both source and recipient, and target is only the recipient.
That is, if we have master-slave, we specify one source and the other target. And if we have master-master, then both will be source and no target at all.

Option for MASTER–> SLAVE:

# bucardo add dbgroup mydb_servers_group mydb_node1:sourceAdded database "mydb_node1" to group "mydb_servers_group" as source# bucardo add dbgroup mydb_servers_group mydb_node2:targetAdded database "mydb_node2" to group "mydb_servers_group" as target

Option for MASTER<–> MASTER:

# bucardo add dbgroup mydb_servers_group mydb_node1:sourceAdded database "mydb_node1" to group "mydb_servers_group" as source# bucardo add dbgroup mydb_servers_group mydb_node2:sourceAdded database "mydb_node2" to group "mydb_servers_group" as source

That’s it! We have written what bases there are. It says what tables are in them. It says who is in which group. It remains to say the final touch – to say what group of tables will "shuttle" between the bases of what group. In other words, create a "sync":

# bucardo add sync mydb_sync herd=mydb_herd dbs=mydb_servers_groupAdded sync "mydb_sync"

We can see what we have got :

# bucardo list syncSync: mydb_sync Herd: mydb_herd [Active]DB group mydb_servers_group: mydb_node1 (source) mydb_node2 (source or target - as configured)

After changing the settings, be sure to restart Bucardo:

# bucardo restart

Check : on the first node of node1 run :

# psql -U postgres mydb -c "INSERT INTO mytable VALUES (1, 'a');"

and on the second node2 we check :

# psql -U postgres mydb -c "SELECT * FROM mytable;"

Who made multimaster, he needs to check in the opposite direction as well. You create on node2 and check on node1.
Questions that most people will have :
1) What happens to the table on the target base if the table on the source base was changed while Bucardo was down or the network was unavailable?
Answer : everything is Ok. At startup or when the network appears, Bucardo will pass the data to the target server. So the target server can crash as you want. The only requirement is that it must have the same data schema (table structure) as the first one.
2) If the database is big (tens or hundreds of gigabytes), Bucardo "breaks off" and does not sync all the way. What to do?
Answer : put sync in inactive state. But Bucardo must be enabled for sourcebase to log queries.
bucardo update sync mydb_sync status=inactive (for multimaster on all nodes)
Then do pg_dump/pg_restore manually and reset sync to active mode (for multimaster first on the one where new requests were sent after dump).

You may also like