Moving CouchDb database files between servers
There are several methods available for copying data between CouchDb servers. The most obvious is replication, which is built in and which CouchDb does extremely well. If that option is viable, it is probably the way to go. A while back I posted about a method that uses CouchDb's bulk document handling to copy data. That process works well too, and I continue to use it from time to time for some data.
Recently I had a situation in which I needed to set up several development and testing servers with the same initial state for the data. These were not on the same networks, and replication wasn’t convenient. I was moving a good bit of data across several databases, so the bulk document approach wasn’t attractive either. So I resorted to just copying the data files between servers. CouchDb’s design makes this easy to do.
The steps I take are probably overly cautious, but here’s what I do:
- Stop the couchdb service on the source host
- tar.gz the data files. On my Ubuntu servers these typically live in /var/lib/couchdb (sometimes in a subdirectory based on the Couch version). If you aren't sure where the files are, you can find the path in your CouchDb config files, or often by running
ps -A w
to see the full command that started CouchDb. Make sure you include the subdirectories that start with . when you archive the files.
- Restart the couchdb service on the source host.
- scp the tar.gz file to the destination host and unpack it in a temporary location there.
- chown the files to the user and group that own the files already in the database directory on the destination. This is likely couchdb:couchdb. This is important, as messing up the file permissions is the only way I've managed to break this process so far.
- Stop CouchDb on the destination host.
- cp the files into the destination directory. Again on my hosts this has been /var/lib/couchdb.
- Double check the file permissions in their new home.
- Restart CouchDb on the destination host.
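The steps above can be sketched as a script. The /var/lib/couchdb path, hostnames, and service commands are assumptions from my own Ubuntu setup; the remote and root-only steps are shown as comments, and the archive step runs against a stand-in directory so you can see that the hidden (dot) entries survive the round trip.

```shell
# Sketch of the copy procedure. Paths, hostnames, and service names are
# assumptions; demonstrated on a stand-in data directory.
DATA=$(mktemp -d)                         # stand-in for /var/lib/couchdb (source)
mkdir -p "$DATA/.mydb_design"             # hidden view subdirectory
touch "$DATA/mydb.couch" "$DATA/.mydb_design/mrview.view"

# sudo service couchdb stop               # on the real source host

# Archiving with `-C dir .` includes the entries that start with a dot:
tar -czf couch-data.tar.gz -C "$DATA" .

# scp couch-data.tar.gz user@dest:/tmp/   # copy to the destination host
# sudo service couchdb start              # restart the source

# On the destination (shown locally): stop CouchDb, unpack, fix ownership.
DEST=$(mktemp -d)                         # stand-in for /var/lib/couchdb (dest)
tar -xzf couch-data.tar.gz -C "$DEST"
# sudo chown -R couchdb:couchdb "$DEST"   # needs root on a real host
ls -A "$DEST"                             # the .mydb_design directory is there
```

The `-C "$DATA" .` form is what keeps the dot-directories in the archive; a glob like `$DATA/*` would silently skip them.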
You may or may not have to stop the CouchDb services, but it seems like a good idea to me, since it decreases the chances of copying inconsistent files.
I have done this a number of times now with no problems, other than when I manage to mess up the file permissions.
This doesn't work when moving down a version, from 1.1.0 to 1.0.1:
An error occurred retrieving a list of all documents in futon.
Unexpected message, restarting couch_server: {'EXIT',,
    {{badmatch,{error,eacces}},
     [{couch_file,init,1},
      {gen_server,init_it,6},
      {proc_lib,init_p_do_apply,3}]}}
Right, this approach requires the databases to be the same version. If they aren’t, use replication or a dump and bulk load instead.
Nice article David, good recipe to have! In addition to this, would you happen to know how to copy or move documents between two databases on the same CouchDB instance?
If I were moving documents between two databases in the same instance, and had a lot of them to move at once, I’d probably dump the documents to a file and use a bulk load. The other option would be replication.
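A minimal sketch of that dump-and-bulk-load path, assuming a CouchDB on localhost:5984 with hypothetical source_db and target_db databases. The curl lines are shown as comments, and the sample response below is hand-made so the reshaping step can run on its own.

```shell
# Hand-made sample of what GET /source_db/_all_docs?include_docs=true returns.
# A real dump would come from something like:
#   curl http://127.0.0.1:5984/source_db/_all_docs?include_docs=true > dump.json
cat > dump.json <<'EOF'
{"total_rows":2,"offset":0,"rows":[
 {"id":"a","key":"a","value":{"rev":"1-x"},"doc":{"_id":"a","_rev":"1-x","n":1}},
 {"id":"b","key":"b","value":{"rev":"1-y"},"doc":{"_id":"b","_rev":"1-y","n":2}}]}
EOF

# Reshape the rows into the {"docs":[...]} body that POST /target_db/_bulk_docs
# expects. To keep the revision history, add "new_edits": false to the body;
# otherwise strip the _rev field from each doc before loading.
python3 -c '
import json
rows = json.load(open("dump.json"))["rows"]
json.dump({"docs": [r["doc"] for r in rows]}, open("bulk.json", "w"))
'

# Load into the target database:
#   curl -X POST http://127.0.0.1:5984/target_db/_bulk_docs \
#        -H 'Content-Type: application/json' -d @bulk.json
cat bulk.json
```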
“There are several methods available for copying data between CouchDb servers”.
=> How about other ways?
The only ones I can really comment on are things I’ve actually done, which include:
1. replication, mentioned above. This works well if you are copying thousands of documents, less well if you are copying tens of millions or are simultaneously writing a lot and have several views.
2. copying files as described in the post above. This is often the quickest, but it does require compatible couchdb versions and that you can take databases offline.
3. dumping to json text files and loading them in bulk (described in another post). This works well if the couchdb versions don't match and the quantity of data is relatively small (thousands of records).
4. writing a script to walk one database and write to the other. This is useful when you need logic in the middle rather than just a raw copy. It is obviously the most resource intensive.
In the vast majority of cases I’d do the first or second options of these four.
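For the first option, a one-shot replication can be kicked off over HTTP. A sketch of the request body that POST /_replicate accepts; the hostnames and database name are placeholders, and the curl line is shown as a comment.

```shell
# Request body for CouchDB's POST /_replicate endpoint (placeholder hosts/db).
cat > repl.json <<'EOF'
{"source": "http://sourcehost:5984/mydb",
 "target": "http://targethost:5984/mydb",
 "create_target": true}
EOF

# Kick off a one-shot replication (run against either server):
#   curl -X POST http://127.0.0.1:5984/_replicate \
#        -H 'Content-Type: application/json' -d @repl.json
cat repl.json
```

With create_target set, the destination database is created if it doesn't exist yet.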
We tried doing it that way with a massive 500GB database, since the inbuilt replication would have taken weeks to finish the initial replication of both databases. Both servers are connected through a rather small connection which we can't change.
We assumed that copying those files from A to B would do the trick, yet after doing so we set up replication only to find that the initial replication still takes a long time. Do those two databases need to “exchange” their documents' states after the files have been copied from one place to another?
Did you ever run into similar issues with that method? From what I know there’s a small delta on the source database already, could that cause the issue?
Yes, Couch still has to go through all of the documents to start the replication, which takes a long time when there are many of them. As far as I know, there isn't a good way to skip that. Copying the files does save the replication from transferring the document data itself, however; the replication just has to walk through and compare IDs and revision numbers.