While setting up Amazon RDS instance for a Ruby on Rails application, I followed simple Amazon RDS Setup Guide , but still why I ended up banging my head for long time before success.
So I proceed by following steps, Before that I want to share my Rails application stack which is
Rails 3.2.14 Ruby 1.9.3 PostgreSQL 9.1
Step 1: Launch DB instances
This is the simplest step and pretty straight forward
- Go to http://aws.amazon.com/rds and click Sign Up for Amazon RDS.
- Follow the on-screen instructions.
Here Point (2) is important => on-screen instruction , includes following key points
2. a) Selecting Database for a DB instance like MySQL, PostgreSQL, Oracle etc – I selected PostgreSQL
2. b) Selecting DB version, Amazon offers only PostgreSQL 9.3 as of date – as obvious I selected PostgreSQL 9.3
2. c) Selecting Instance types from various types according to our need. – I selected db.m1.medium
2. d) Setting up master user this is the most important, Amazon RDS setup wizard lets us to create a master user with inherited permissions from rds_admin user which is the default superuser for RDS – I created a master user as MyDBUser
2. e) Choosing DB name – should be same as the DB name which is going to be imported into RDS instance
Apart from this selecting DB identifier, Allocated Storage, Backup setting and Backup Retention Period etc can be configured as per our choice and need or leave them to default.
Step 2: Most Important – Importing Data to Amazon RDS
Before that I want to share my config/database.yml and local DB user and roles details which is
adapter: postgresql encoding: unicode database: blog_production port: 5432 pool: 5 username: MyDBUser password: ******** min_messages: WARNING host: localhost
**Note: here I am connecting to ‘blog_production’ as MyDBUser and I have created master user MyDBUser
Also I imported some tables as ‘postgres’ user form third party DB which was the need, with a simple table import commands
pg_dump -U postgres -t TABLE_NAME DATABASE_NAME > FILE.sql
What this command does?, This imports TABLE_NAME DATABASE_NAME in FILE.sql with ‘postgres’ as a owner.
Note: I had local database with multiple users as ‘MyDBUser’ and ‘postgres’ both as superuser permissions, here interesting to note that ‘blog_production’ was owned by ‘postgres’ user because this database was also imported initially and ‘MyDBUser ‘ is created for Rails Application.
So the scenario was
– Local DB ‘blog_production’ is owned by ‘postgres’ user
– Some tables of ‘blog_production’ are owned by ‘postgres’ (Data Import form Third Party DB)
– Some tables of ‘blog_production’ are owned by ‘MyDBUser’ (Tables generated by via Rails Migration)
– ‘blog_production’ has two user as ‘MyDBUser’ and ‘postgres’ both with role superuser
Above 4 facts caused serious trouble while importing ‘blog_production’ to Amazon RDS instance
Back to RDS setup step 2) Importing Data
For this Amazon Data Import Guide is perfect this gives following simple commands
2. a ) Create a file using pg_dump that contains the data to be loaded
pg_dump -U postgres -d blog_production > blog_production_for_rds.sql
This command will dump ‘blog_production’ db into blog_production_for_rds.sql
2. b) Create target DB instance which we did in Step 1
2. c) Use psql to create the database on the DB instance and load the data as
psql -f blog_production_for_rds.sql --host=some-identifire.c6c8mntzhgv0.us-west-2.rds.amazonaws.com --username=myawsuser --passwerd --dbname=blog_production
— username => should be username should be master which we created is step 1
— host => this is the endpoint give after RDS instance launch like some-identifire.c9lqugmslkvf.us-west-2.rds.amazonaws.com
— password enter if prompted
BANG!!! First Error => psql: FATAL: role “postgres” does not exist
As obvious since we trying to import data which is owned by ‘postgers’ user into RDS which does not have user ‘postgres’
For this I tried following ways
I) Creating ‘postgres’ user in RDS instance which is the easiest one, So I created one and
BANG!!! Second Error => ERROR: must be superuser to create superusers
This is also valid error since I connected to RDS DB as ‘MyDBUser’ which is the master user but non superuser,
RDS does not allow to create a user with role as superuser again since its having a default superuser ‘rds_admin’.
But we can certainly create a ‘postgers’ NOSUPERUSER with GRANT ‘rds_superuser’ to ‘postgres’ user from ‘MyDBUser’, So I created one and tried Step 2(2. c) with username = ‘postgres’, since ‘postgres’ user exists now.
BANG!!! Third Error =>
ERROR: must be owner of relation attachments
ERROR: must be owner of relation attachments_id_seq
This error because I generated attachments table through rails migration hence the of owner attachments table is ‘MyDBUser’
To resolve this I tried Step 2 (2. c) but with username = ‘MyDBUser’
Got Same error but for different table
ERROR: must be owner of relation scores
ERROR: must be owner of relation scores_id_seq
This error because I imported scores table into ‘blog_production’ as ‘postgres’ from third party databases.
II) Using pg_dumpall instead of pg_dump, but pg_dumpall will dump
a) all database users along with roles
b) all databases which is not necessary
But while trying to restore after pg_dumpall (a) will cause same Second Error as, ERROR: must be superuser to create superusers
The Best Solution I preferred and resolved all above errors was
– Change ‘blog_production’ owner to MyDBUser form ‘postgres’
– Change ALL table’s owner to MyDBUser form ‘postgres’
– Change ALL stored procedure’s owner to MyDBUser form ‘postgres’
– Change ALL DB View’s owner to MyDBUser form ‘postgres’
– Change ALL sequences owner to MyDBUser form ‘postgres’
Which means pull all data under the ownership of single user that is ‘MyDBUser’, because I was using this user to connect to database in Rails Application.
To achieve all above I wrote simple shell script and executed following script with following command
sh bin/change_db_owner blog_production MyDBUser
After execution of following script, ‘blog_production’ has all data owned by ‘MyDBUser’
Then I deleted RDS DB instance and followed Step 1 and Step 2, restored database without any error, lastly changed config/database.yml to point new RDS instance.
#!/bin/bash database=$1 new_owner=$2 #collecting all tables tables=`psql -U 'postgres' -qAt -c "select tablename from pg_tables where schemaname = 'public';" $database` #collecting all sequences sequences=`psql -U 'postgres' -qAt -c "select sequence_name from information_schema.sequences where sequence_schema = 'public';" $database` #collecting all views views=`psql -qAt -U 'postgres' -c "select table_name from information_schema.views where table_schema = 'public';" $database` #collecting all procedures procs=`psql -qAt -U 'postgres' -c "SELECT proname FROM pg_catalog.pg_namespace n JOIN pg_catalog.pg_proc p ON pronamespace = n.oid WHERE nspname = 'public';" $database` #changing db owner psql -U 'postgres' -qAt -c "ALTER DATABASE $database OWNER TO $new_owner;" $database #changing tables owner for tbl in $tables ; do psql -U 'postgres' -c "alter table $tbl owner to $new_owner" $database ; done #changing sequences owner for tbl in $sequences ; do psql -U 'postgres' -c "alter table $tbl owner to $new_owner" $database ; done #changing views owner for tbl in $views ; do psql -U 'postgres' -c "alter table $tbl owner to $new_owner" $database ; done #changing procedures owner for tbl in $procs ; do psql -U 'postgres' -c "alter function $tbl() owner to $new_owner" $database; done
I faced above errors because the database(blog_production) which was to be loaded into RDS instance was not owned by single user, because I imported data into ‘blog_production’ form various Third party database’s.