Saturday, 5 July 2014

Updating pg_cast helps, however sometimes !!

I've seen one of our customer is migrating a table from SQL Server to PostgreSQL using EnterpriseDB's Migration ToolKit.  This table has a boolean datatype column. In migration process, MTK converts datatype "boolean" to "bit" in PostgreSQL and the process was taking 6 hrs to complete. Customer wanted to change the datatype from "Bit" to "Integer" and alter command for changing type was taking another 6 hrs in PostgreSQL. If he migrates only structure to PostgreSQL first, and then change the type to "Integer" from "Bit", then it does not allow you to load the data with below error. If it allows, it takes only 6 hrs as no need of alter the type after data load.

ERROR: column "hidehelm" is of type integer but expression is of type boolean
 Hint: You will need to rewrite or cast the expression.

So I found a work around to type cast from "boolean" to "Integer" implicitly by updating "pg_cast" table as below. By this, he can directly load the boolean data into integer column which saves the time of altering the type from BIT to Integer after migrating.

postgres=# insert into tarik values (1::boolean);
ERROR:  column "t" is of type integer but expression is of type boolean
LINE 1: insert into tarik values (1::boolean);
HINT:  You will need to rewrite or cast the expression.
postgres=# select * from pg_cast where castsource='boolean'::regtype and casttarget='int4'::regtype;
castsource | casttarget | castfunc | castcontext | castmethod
------------+------------+----------+-------------+------------
         16 |         23 |     2558 | e           | f
(1 row)
postgres=# update pg_cast set castcontext ='i' where castsource='boolean'::regtype and casttarget='int4'::regtype;
UPDATE 1
postgres=# select * from pg_cast where castsource='boolean'::regtype and casttarget='int4'::regtype;
 castsource | casttarget | castfunc | castcontext | castmethod 
------------+------------+----------+-------------+------------
         16 |         23 |     2558 | i           | f
(1 row)
postgres=# insert into tarik values (1::boolean);
INSERT 0 1

Irrespective of any side effects(which I'm not aware of) of this workaround, this worked and migration had take only 6 hrs. Of-course, customer rolled back this setting after migrating the table. Updating catalogs is very dangerous, so might be I should have concentrated on how to reduce the time of ALTER command after migration?, anyways, it worked, so I was happy !! ;-)

Thanks for any suggestions/comments.

Friday, 4 July 2014

Oops I corrupted my table, of-course just to recover salvaged data.


The way I started the title might be confusing, "I corrupted my table", so everyone starts with "Crap !! why did you do that !!, ...", so just to justify it.......
I see many customers coming for recovering the corrupted tables without any backup. In such cases, hard to recover the tables completely and it needs lot of work, however we can recover salvaged data if they dont care about corrupted rows.

Let me corrupt the table first.. :-)

 I created a million-row table called "to_be_damaged"
postgres=# select count(*) from to_be_damaged ;
  count 
---------
 1000000
(1 row)
postgres=# select relfilenode,relname from pg_class where relname='to_be_damaged';
relfilenode |    relname   
-------------+---------------
       461257 | to_be_damaged

(1 row)
I've used "hexedit" to damage it. Open relfilenode file from OS level using hexedit and try picking a line which is the start of an 8K boundary and typing hex DE AD BE EF across it.

postgres=# select count(*) from to_be_damaged ;
ERROR:  invalid page in block 0 of relation base/12896/461257

Now create an identical table "salvaged" to recover salvaged data from "to_be_damaged" table.
postgres=# create table salvaged(t int);
CREATE TABLE
Prepared below function which copies the rows which are still salvageable:
create or replace function salvage_damaged()
  returns void
  language plpgsql
as $$
declare
  pageno int;
  tupno int;
  pos tid;
begin
  <<pageloop>>
  for pageno in 0..35930 loop  -- pg_class.relpages for the damaged table
    for tupno in 1..1000 loop
      pos = ('(' || pageno || ',' || tupno || ')')::tid;
      begin
        insert into salvaged select * from to_be_damaged where ctid = pos;
      exception
        when sqlstate 'XX001' then
          raise warning 'skipping page %', pageno;
          continue pageloop;
        when others then
          raise warning 'skipping row %', pos;
      end;
    end loop;
  end loop;
end;
$$;

Now run the function to copy salvagable rows:
postgres# select salvage_damaged();
WARNING: skipping page 0
salvage_damaged
-----------------

(1 row)
postgres=# select count(*) from salvaged ;
count
----------
12999815
(1 row)

postgres=# select 13000000-12999815;
?column?
----------
185
(1 row)

I hope it helps someone. Thanks for reading.

Sunday, 20 April 2014

Difference between Warm, hot standby and Streaming Replication:


We all know about replication and its types in postgresql. There are basic 3 types of replication in postgresql i.e  Warm, hot standby and Streaming Replication. I used to get confused between these three and was unable to find the difference when I was in my initial stages of postgres learning. However I understand the difference later by reading the docs. So I would like to post key differences between these 3 types of replication in short(it also helps if anyone wants to find which one is configured for his/her environment).

Warm Standby:
==========
Its introduced in PostgreSQL 8.3(IIRC).

1. It is based on WAL log shipping which typically means WAL archives generated on Master will be transferred and applied at Standby side. So Warm Standby always waits for the WAL archive in which Master is currently writing and keeps throw messages like "cp: cannot stat <archive>: No such file or directory". So it is always one archive behind than Master and data loss will be max of 16MB(assuming a healthy warm standby by :-) )

2. In postgresql.conf file, you would need to change just three parameters in master; wal_level to archive, archive_mode and archive_command, however nothing in postgresql.conf file at standby side.
On Master:
wal_level = archive
archive_mode    = on
archive_command = 'cp %p /path_to/archive/%f'

3. In recovery.conf file, three parameters; standby_mode, restore_command and trigger_file.
4. You cannot connect to Standby, so database is not even open for read operations (or read operations are not permitted on db).

Detailed explanation and related docs are here
Hot Standby:
========
Its introduce in PostgreSQL 9.0.

1. It is also based on WAL log shipping(same as warm standby). And of-course, WALs will be transferred and applied at Standby, so one WAL behind and always waits for the WAL archive in which Master is currently writing.
2. In postgresql.conf file, you would need to change wal_level to hot_standby, archive_mode and archive_command. Since you'll likely want to use pg_basebackup you should also set max_wal_senders to at least 2 or 3. And hot_stanby = on in standby conf file.
On Master:
wal_level = hot_standby
max_wal_senders = 5
wal_keep_segments = 32
archive_mode    = on
archive_command = 'cp %p /path_to/archive/%f'
On Slave:
hot_standby = on
3. In recovery.conf file, three parameters; standby_mode, restore_command and trigger_file. 4. You can connect to Standby for read queries(you should set hot_stanby to ON in standby postgresql.conf file). 

Detailed explanation and related docs are here.
Steaming Replication: 
==============
 Its introduced in PostgreSQL 9.0.
1. XLOG records generated in the primary are periodically shipped to the standby via the network. XLOG records shipped are replayed as soon as possible without waiting until XLOG file has been filled. The combination of Hot Standby and SR would make the latest data inserted into the primary visible in the standby almost immediately. So minimal data loss(almost only open transactions will be lost if its async rep, 0 loss if it is sync rep) 2. In postgresql.conf file, this time 5 parameters, streaming related params like below:
On Master:
wal_level = hot_standby
max_wal_senders = 5
wal_keep_segments = 32
archive_mode    = on
archive_command = 'cp %p /path_to/archive/%f'
On Slave:
hot_standby=on
3. In recovery.conf file, you would need to an extra parameter including three which you add in hot/warm standby. i.e primary_conninfo, so below are four parameters:
standby_mode          = 'on'
primary_conninfo      = 'host=192.168.0.10 port=5432 user=postgres'
trigger_file = '/path_to/trigger'
restore_command = 'cp /path_to/archive/%f "%p"'
4. You can connect to Standby for read queries(you should set hot_stanby to ON in standby postgresql.conf file).
Detailed explanation and related docs are here:  && http://bajis-postgres.blogspot.in/2013/12/step-by-step-guide-to-setup-steaming.html

Kindly let me know if I miss anything.

Saturday, 5 April 2014

Anyone wants to change the tablespaces locations while running pg_upgrade ?? isn't it easy !!

As we all know pg_upgrade handles tablespaces in a smart way. However it creates the tablespaces in the same location of old ones. If you want to change the location after upgrade, then you would need some work including updating the catalog table with the new location(IIRC, in PG9.4, you can do it by using ALTER TABLESPACE command). So, for current versions I would like to give you a work around to change the locations of tablespaces while running the pg_upgrade. This tweak is also applicable, If you have any contrib modules installed in databases of old cluster, you have to install them in new cluster also, however we can not create databases in new cluster as it should be empty to run pg_upgrade. "pg_upgrade" creates the databases using template "template 0" in the new cluster. 

You can follow below steps to tweak the running pg_upgrade process.

1. I've upgraded PG9.1 cluster "Im_old_one" to PG9.3 cluster "Hey_Im_New", as you see below it is created new tablespace in the same directory.

-bash-4.1$ /opt/PostgreSQL/9.1/bin/psql -p 5666 -U postgres postgres
Timing is on.
psql.bin (9.1.7)
Type "help" for help.


postgres=# CREATE TABLESPACE tablsc_91 LOCATION '/tmp/tablsc_91';
CREATE TABLESPACE
Time: 1.663 ms
-bash-4.1$ /opt/PostgreSQL/9.3/bin/pg_upgrade -b /opt/PostgreSQL/9.1/bin/ -B /opt/PostgreSQL/9.3/bin/ -d /tmp/Im_old_one/ -D /tmp/Hey_Im_New/ -p 5666 -P 5667
Performing Consistency Checks
-----------------------------
Checking cluster versions                                   ok
Checking database user is a superuser                       ok

.
.
Upgrade Complete
----------------
Optimizer statistics are not transferred by pg_upgrade so,
once you start the new server, consider running:
    analyze_new_cluster.sh

Running this script will delete the old cluster's data files:
    delete_old_cluster.sh
-bash-4.1$
-bash-4.1$
-bash-4.1$ ls -ltrh /tmp/tablsc_91/
total 8.0K
drwx------. 2 postgres postgres 4.0K Apr  5 14:25 PG_9.1_201105231
drwx------. 2 postgres postgres 4.0K Apr  5 14:29 PG_9.3_201306121
2. Now delete the new cluster and run the pg_upgrade command and keep monitoring it.
3. Press control + z (pause the job) just after the catalog dump is created. You can pause the job when you see the below output:
-bash-4.1$ /opt/PostgreSQL/9.3/bin/pg_upgrade -b /opt/PostgreSQL/9.1/bin/ -B /opt/PostgreSQL/9.3/bin/ -d /tmp/Im_old_one/ -D /tmp/Hey_Im_New/ -p 5666 -P 5667
Performing Consistency Checks
-----------------------------
Checking cluster versions                                   ok
Checking database user is a superuser                       ok
Checking for prepared transactions                          ok
Checking for reg* system OID user data types                ok
Checking for contrib/isn with bigint-passing mismatch       ok
Creating dump of global objects                             ok
Creating dump of database schemas
                                                            ok ---> press control + z just after this.
^Z


**Note : If we fail to pause job after this "Creating catalog dump" step, then please kill this job and re-initiate the new pg_upgrade process. The re-initiate of pg_upgrade does not require any creation of new cluster or any other changes.

4. Open the global object dump and modify CREATE TABLESAPCE command to your new location:

-bash-4.1$ vi pg_upgrade_dump_globals.sql
Above file will be created in the location from where you run pg_upgrate command, open the file and change the create tablespace command from old location to new location.
Ex:-
Change all the CREATE TABLESPACE statements in the file "pg_upgrade_dump_globals.sql"

From
CREATE TABLESPACE "tablsc_91" OWNER "postgres" LOCATION '/tmp/tablsc_91';
To
CREATE TABLESPACE "tablsc_91" OWNER "postgres" LOCATION '/tmp/tablsc_93';
5.  Once modifications has been completed, then resume the job by using "fg" command. It should continue with the restoration process.

Any suggestions/comments would be most welcome!

Saturday, 29 March 2014

Is it failback or switchover with PostgreSQL? Ah, maybe not...

Sometimes I feel like I found something new in postgres, but most of the times it turned out to be known one, I mean, already known by people. I believe, this is a kinda same thing now(this might be known by many already). I see many asking about switchover/failback in PostgreSQL and I always suggest to go for a fresh backup of new master(after failover) to sync old master as a slave to it. Of course, it is the correct procedure. However there is something to achieve this(like below procedure): 
 1. Setup steaming replication between two clusters i.e "Master_but_Slave_soon" with 8888 port and "Slave_gonna_Master" with 8889. See here to setup SR.
-bash-4.1$ /opt/PostgreSQL/9.3/bin/pg_ctl -D /tmp/rep/Slave_gonna_Master/ start
server starting
-bash-4.1$ LOG:  database system was interrupted; last known up at 2014-03-30 09:26:59 IST
LOG:  entering standby mode
LOG:  restored log file "000000010000000000000003" from archive
LOG:  redo starts at 0/3000028
LOG:  consistent recovery state reached at 0/30000F0
LOG:  database system is ready to accept read only connections
cp: cannot stat `/tmp/rep/arch/000000010000000000000004': No such file or directory
LOG:  started streaming WAL from primary at 0/4000000 on timeline 1
-bash-4.1$
-bash-4.1$ ./pg_ctl -D /tmp/rep/Master_but_Slave_soon/ status
pg_ctl: server is running (PID: 26531)
/opt/PostgreSQL/9.3/bin/postgres "-D" "/tmp/rep/Master_but_Slave_soon"
-bash-4.1$
-bash-4.1$
-bash-4.1$ ./psql -p 8888 postgres<
Timing is on.
psql.bin (9.3.2)
Type "help" for help.

postgres=# \x 
Expanded display is on. 
postgres=# select * from pg_stat_replication; 
-[ RECORD 1 ]----+--------------------------------- 
pid              | 26651 
usesysid         | 10 
usename          | postgres 
application_name | walreceiver 
client_addr      | 127.0.0.1 
client_hostname  | 
client_port      | 26058 
backend_start    | 2014-03-30 09:28:59.204298+05:30 
state            | streaming 
sent_location    | 0/4000000 
write_location   | 0/4000000 
flush_location   | 0/4000000 
replay_location  | 0/4000000 
sync_priority    | 0 
sync_state       | async 
 
Time: 155.089 ms 
postgres=# 
postgres=# 
postgres=# \q 
-bash-4.1$ 
-bash-4.1$ 
-bash-4.1$ ./pg_ctl -D /tmp/rep/Slave_gonna_Master/ status 
pg_ctl: server is running (PID: 26643) 
/opt/PostgreSQL/9.3/bin/postgres "-D" "/tmp/rep/Slave_gonna_Master" 
-bash-4.1$ 
-bash-4.1$ 
-bash-4.1$ ./psql -p 8889 postgres 
Timing is on. 
psql.bin (9.3.2) 
Type "help" for help. 
 
postgres=# 
postgres=# select pg_is_in_recovery(); 
 pg_is_in_recovery 
------------------- 
 t 
(1 row) 
 
Time: 3.031 ms 
postgres=#
2. Go for Failover now by creating trigger file. Now our Slave is new Master.
-bash-4.1$ cat /tmp/rep/Slave_gonna_Master/recovery.conf 
standby_mode=on 
primary_conninfo='host=localhost port=8888' 
restore_command='cp /tmp/rep/arch/%f %p' 
trigger_file='/tmp/rep/8888.trigger' 
 
-bash-4.1$ touch /tmp/rep/8888.trigger 
LOG:  trigger file found: /tmp/rep/8888.trigger 
LOG:  redo done at 0/5000028 
cp: cannot stat `/tmp/rep/arch/000000010000000000000005': No such file or directory 
cp: cannot stat `/tmp/rep/arch/00000002.history': No such file or directory 
LOG:  selected new timeline ID: 2 
cp: cannot stat `/tmp/rep/arch/00000001.history': No such file or directory 
LOG:  archive recovery complete 
LOG:  autovacuum launcher started 
LOG:  database system is ready to accept connections 
 
-bash-4.1$ 
-bash-4.1$ /opt/PostgreSQL/9.3/bin/psql -p 8889 -U postgres postgres 
Timing is on. 
psql.bin (9.3.2) 
Type "help" for help. 
 
postgres=# 
postgres=# select pg_is_in_recovery(); 
 pg_is_in_recovery 
------------------- 
 f 
(1 row) 
 
3. Once failover is done, timeline ID of WALs will be changed.
bash-4.1$ ls -ltrh /tmp/rep/Slave_gonna_Master/pg_xlog/ 
total 97M 
-rw-------. 1 postgres postgres  16M Mar 30 09:27 000000010000000000000002 
-rw-------. 1 postgres postgres  16M Mar 30 09:27 000000010000000000000001 
-rw-------. 1 postgres postgres  16M Mar 30 09:28 000000010000000000000003 
-rw-------. 1 postgres postgres  16M Mar 30 09:35 000000010000000000000004 
-rw-------. 1 postgres postgres  16M Mar 30 09:35 000000010000000000000005 
-rw-------. 1 postgres postgres   41 Mar 30 09:36 00000002.history 
drwx------. 2 postgres postgres 4.0K Mar 30 09:36 archive_status 
-rw-------. 1 postgres postgres  16M Mar 30 09:41 000000020000000000000005 
-bash-4.1$ 
-bash-4.1$ 
-bash-4.1$ 
-bash-4.1$ cat /tmp/rep/Slave_gonna_Master/pg_xlog/*.history 
1  0/5000090    no recovery target specified
4. Now shutdown the Master and create a recovery.conf file(change the port number to new Master for primary_conninfo parameter) and change hot_standby parameter to on in postgresql.conf and add pg_hba.conf entries in new Master to allow connections.
-bash-4.1$ /opt/PostgreSQL/9.3/bin/pg_ctl -D /tmp/rep/Master_but_Slave_soon/ stop -mf 
waiting for server to shut down....LOG:  received fast shutdown request 
LOG:  aborting any active transactions 
LOG:  autovacuum launcher shutting down 
LOG:  shutting down 
LOG:  database system is shut down 
-bash-4.1$ 
-bash-4.1$ 
-bash-4.1$ 
-bash-4.1$ cp -R /tmp/rep/Slave_gonna_Master/recovery.done /tmp/rep/Master_but_Slave_soon/recovery.conf 
-bash-4.1$
5. Now move pg_control of Old Master to pg_control.old and Copy the pg_control of New Master to Old Master(I'm not sure if copying pg_control between replication clusters while they are in sync is recommended).
-bash-4.1$ mv /tmp/rep/Master_but_Slave_soon/global/pg_control /tmp/rep/Master_but_Slave_soon/global/pg_control.old 
-bash-4.1$ 
-bash-4.1$ cp -R /tmp/rep/Slave_gonna_Master/global/pg_control /tmp/rep/Master_but_Slave_soon/global/ 
6. Now try to start the Old Master:
-bash-4.1$ ./pg_ctl -D /tmp/rep/Master_but_Slave_soon/ start 
server starting 
-bash-4.1$ LOG:  database system was interrupted; last known up at 2014-03-30 09:36:11 IST 
LOG:  entering standby mode 
LOG:  restored log file "00000002.history" from archive 
LOG:  record with zero length at 0/50000F8 
LOG:  invalid primary checkpoint record 
LOG:  using previous checkpoint record at 0/4000060 
LOG:  database system was not properly shut down; automatic recovery in progress 
LOG:  redo starts at 0/4000028 
LOG:  record with zero length at 0/5000090 
LOG:  consistent recovery state reached at 0/5000090 
LOG:  database system is ready to accept read only connections 
cp: cannot stat `/tmp/rep/arch/000000020000000000000005': No such file or directory 
LOG:  restored log file "000000010000000000000005" from archive 
LOG:  record with zero length at 0/5000090 
LOG:  started streaming WAL from primary at 0/5000000 on timeline 2 
7. Check the replication status between New Master(Old Slave) and Old Master(New Slave).
-bash-4.1$ /opt/PostgreSQL/9.3/bin/psql -p 8889 -U postgres postgres 
Timing is on. 
psql.bin (9.3.2) 
Type "help" for help. 
 
postgres=# 
postgres=# select pg_is_in_recovery(); 
 pg_is_in_recovery 
------------------- 
 f 
(1 row) 
 
Time: 1.469 ms 
postgres=# \x 
Expanded display is on. 
postgres=# select * from pg_stat_replication ; 
-[ RECORD 1 ]----+--------------------------------- 
pid              | 26948 
usesysid         | 10 
usename          | postgres 
application_name | walreceiver 
client_addr      | 127.0.0.1 
client_hostname  | 
client_port      | 52443 
backend_start    | 2014-03-30 09:38:09.862199+05:30 
state            | streaming 
sent_location    | 0/5000340 
write_location   | 0/5000340 
flush_location   | 0/5000340 
replay_location  | 0/5000340 
sync_priority    | 0 
sync_state       | async 
 
Time: 52.213 ms 
postgres=# 
postgres=# 
postgres=# 
postgres=# create table after_failover(t int); 
CREATE TABLE 
Time: 67.959 ms 
postgres=# 
postgres=# 
postgres=# \q 
-bash-4.1$ /opt/PostgreSQL/9.3/bin/psql -p 8888 -U postgres postgres 
Timing is on. 
psql.bin (9.3.2) 
Type "help" for help. 
 
postgres=# 
postgres=# select pg_is_in_recovery(); 
 pg_is_in_recovery 
------------------- 
 t 
(1 row) 
 
Time: 1.411 ms 
postgres=# \dt after_failover 
             List of relations 
 Schema |      Name      | Type  |  Owner   
--------+----------------+-------+---------- 
 public | after_failover | table | postgres 
(1 row) 
Though it works here, I don't have any other test results as it might not work in other situations.

Saturday, 8 March 2014

Want to ensure that Postgres generates core dump after server crash?


As you all know, we have to start the server with "-c" option in order to generate core dump during server crash. However if you want to know whether your server is already started to generate core dump, then here you go:
Start  the PostgreSQL cluster with core dump option
.pg_ctl -c -D <Data directory location>

Here are the steps to find whether the server is started with -c option.

1. Check the procpid of the postmaster. Go to the location /proc/<procpid of postmaster>.
2. At this location, you can verify limits file to check whether server has started with "-c" option. If you do not have any limitations for enterprisedb user in "limits.conf" file at "/etc/security" location, then the "Soft Limit" of "Max core file size" in "limits" file will be "unlimited". The entry in limits file should be like below:

Limit Soft Limit Hard Limit Units
Max core file size unlimited unlimited bytes

If "Soft Limit" of "Max core file size" in "limits" file is "0", it indicates that your postmaster was not started with "-c" option.

Limit Soft Limit Hard Limit Units
Max core file size 0 unlimited bytes


If you have any limitations for all users or enterprisedb user in "limits.conf" file under "/etc/sercurity", then the above value for Soft Limit or Hard Limit will depends on value in "limits.conf" file.

In simple way, if your postmaster has started with "-c" option, then the "Soft Limit" of "Max core file size" in "limits" file should be some value or unlimited otherthan "0"(which depends on limits.conf file).

If we have started our postmaster with "-c" option, a core dump file will be generated under $PG_DATA directory during a backend crash. A core dump will be generated by the operating system, and we will be able to attach gdb to it to collect a stack trace or other information.

Here is one simple test case which shows us how core dump will be generated:

1. We have started the postmaster with "-c" option.
-bash-4.1$ /opt/PostgreSQL/9.3/bin/pg_ctl -D /opt/PostgreSQL/9.3/data -c start
server starting

2. Go to the location /proc/<procpid of postmaster> and check the limits file to know whether postmaster was started with "-c" option.
-bash-4.1$ ps -ef|grep PostgreSQL|grep 9.3
postgres 48892     1  0 14:28 pts/15   00:00:00 /opt/PostgreSQL/9.3/bin/postgres -D /opt/PostgreSQL/9.3/data
postgres 48980 46943  0 14:30 pts/15   00:00:00 grep PostgreSQL
-bash-4.1$
-bash-4.1$ cd /proc/48892/
-bash-4.1$ ls -ltrh


enterprisedb@ubuntu:/proc/6711$ ls -ltr limits
-r-------- 1 enterprisedb enterprisedb 0 2012-08-28 05:15 limits
-bash-4.1$ grep 'core' limits
Max core file size        unlimited            unlimited            bytes   
-bash-4.1$ cat limits
Limit                     Soft Limit           Hard Limit           Units   
Max cpu time              unlimited            unlimited            seconds 
Max file size             unlimited            unlimited            bytes   
Max data size             unlimited            unlimited            bytes  
Max stack size            10485760             unlimited            bytes   
Max core file size        unlimited            unlimited            bytes   
Max resident set          unlimited            unlimited            bytes   
Max processes             1024                 15777                processes
Max open files            1024                 1024                 files   
Max locked memory         65536                65536                bytes   
Max address space         unlimited            unlimited            bytes   
Max file locks            unlimited            unlimited            locks   
Max pending signals       15777                15777                signals 
Max msgqueue size         819200               819200               bytes   
Max nice priority         0                    0                   
Max realtime priority     0                    0                   
Max realtime timeout      unlimited            unlimited            us      

3. For example, aborting any backend process will generate a core dump file under $PG_DATA directory.
-bash-4.1$ ps -ef|grep 48892
postgres 48892     1  0 14:28 pts/15   00:00:00 /opt/PostgreSQL/9.3/bin/postgres -D /opt/PostgreSQL/9.3/data
postgres 48893 48892  0 14:28 ?        00:00:00 postgres: logger process         postgres 49090 48892  0 14:32 ?        00:00:00 postgres: checkpointer process   postgres 49091 48892  0 14:32 ?        00:00:00 postgres: writer process         postgres 49092 48892  0 14:32 ?        00:00:00 postgres: wal writer process     
postgres 49093 48892  0 14:32 ?        00:00:00 postgres: autovacuum launcher process           
postgres 49094 48892  0 14:32 ?        00:00:00 postgres: stats collector process
postgres 49114 46943  0 14:32 pts/15   00:00:00 grep 48892
-bash-4.1$
-bash-4.1$ kill -ABRT 49091 
-bash-4.1$ ls -ltrh /opt/PostgreSQL/9.3/data/core*
-rw-------. 1 postgres postgres 143M Mar  8 14:33 /opt/PostgreSQL/9.3/data/core.49091
4. We can attach gdb to core dump to collect a stack trace by using below command.
-bash-4.1$ gdb -q -c /opt/PostgreSQL/9.3/data/core.49091 /opt/PostgreSQL/9.3/bin/postgres

we should have debug symbols in our installer which was used for installation to get proper trace?, maybe, I'm poor at that, so better stay here.. ;-)

Saturday, 1 March 2014

Huge archive generation with PostgreSQL cluster??

Recently I found huge archive logs generation on one of the servers(for my client). Around 3500 files per day and it needs 3500 * 16 MB = 54GB space every day and space is low on the server. First thing came to my mind was checkpoint parameters, I had a look at them and they were at default. Hmm, so had changed them as below:

checkpoint_segments = 180
checkpoint_timeout = 30min
checkpoint_completion_target = 0.9 

Monitored for couple of days and found that archive logs generation decreased to 50%(around 1700 files). 

With the above parameter settings, archive log generation should be minimal. However if not, Here are some general suggestions to look at:

-- Make sure that you don't UPDATE a row to the same values it already has. The row will be logged even if old and new values match. One way to handle this is to make sure that any UPDATE statements which might set a column to its existing value include in the WHERE clause a test that oldvalue IS DISTINCT FROM new value. Or you can consider creating trigger on problem tables using the Postgres built-in function suppress_redundant_updates_trigger (you might want to look here: http://www.postgresql.org/docs/9.3/static/functions-trigger.html) 

-- Use one UPDATE statement instead of several where possible. If updates to different columns or from multiple processing steps can be combined, this can reduce WAL logging.

-- Consider using UNLOGGED or TEMPORARY tables where possible. For example, if data is inserted into the database and updated right away, perform the initial steps in an UNLOGGED or TEMPORARY table, and INSERT into the main table once the data has been prepared.

-- Identify and eliminate unused indexes. Index changes are WAL-logged so that indexes can be recovered on a crash. The storage space needed for indexes can be more than the storage space needed for the data itself, with a corresponding increase in generated WAL.