Saturday, 7 February 2015

Someone asked me: "How do I change the location of the core file generated by postgres?"

I remember someone asking me how to change the location of the core files generated by postgres. We all know they are created under PGDATA by default; however, some people want to avoid that, because core files can be very large and can eat all the space in the data directory, which can end in a shutdown of the cluster. So I thought it would be good to have an article that shows how to change the location.

On Linux servers, core file generation can be enabled by running "ulimit -c unlimited" before starting the server, or by using the -c option to pg_ctl start. On Windows, if you're running PostgreSQL 9.1 or later, you can create a "crashdumps" subdirectory inside the data directory. On earlier versions, it's harder.
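
Here is a minimal sketch of the two Linux options just mentioned, assuming pg_ctl is on the PATH. The helper names core_dumps_enabled and start_with_core_files are made up for this example and are not part of PostgreSQL.

```shell
#!/usr/bin/env bash
# Report whether the current shell would let child processes write core files.
# core_dumps_enabled is a hypothetical helper name used for this sketch.
core_dumps_enabled() {
    # "ulimit -S -c" prints the soft core-size limit; 0 disables core files.
    [ "$(ulimit -S -c)" != "0" ]
}

# Start a cluster with core files enabled; the data directory is the argument.
# This function is only defined here, it is not called by the sketch.
start_with_core_files() {
    local pgdata=$1
    # -c asks pg_ctl to lift the soft core-size limit for the server it starts.
    pg_ctl -D "$pgdata" -c start
}

if core_dumps_enabled; then
    echo "core files are enabled in this shell"
else
    echo "core files are disabled in this shell (soft limit is 0)"
fi
```

With the data directory used later in this post, the call would be start_with_core_files /opt/PostgreSQL/9.3/data, run as the cluster user.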

Before enabling or disabling, you may want to verify whether your cluster has already been started with core file generation enabled; the /proc/<pid>/limits check in the steps below shows how. OK, I have enabled core file generation for my cluster, so let's change the location. Here are the detailed steps:

-- Start the cluster using the "-c" option of pg_ctl (the cluster user must be allowed to generate core files).
-- Check the core file pattern as the root user using the command below:
[root@localhost ~]# sysctl kernel.core_pattern
kernel.core_pattern = |/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e

-- Change kernel.core_pattern to the location where you want core files to be generated (contact your admin to do that). Please note that the location given in kernel.core_pattern must be writable by the cluster user; otherwise the kernel will decline to write a core file there.
[root@localhost ~]# echo "kernel.core_pattern=/tmp/core-%e-%s-%u-%g-%p-%t" >> /etc/sysctl.conf
[root@localhost ~]# tail -5 /etc/sysctl.conf
# max OS transmit buffer size in bytes
net.core.wmem_max = 1048576
fs.file-max = 6815744
########
kernel.core_pattern=/tmp/core-%e-%s-%u-%g-%p-%t
[root@localhost ~]#
[root@localhost ~]#
[root@localhost ~]# sysctl -p |tail -5
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048576
fs.file-max = 6815744
kernel.core_pattern = /tmp/core-%e-%s-%u-%g-%p-%t
[root@localhost ~]#
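
Since the kernel will not write a core file into a directory the cluster user cannot write to, it can be worth checking the target directory first. Here is a rough sketch, to be run as the cluster user. The helper names core_pattern_dir and core_dir_writable are made up for this example, and it only handles plain path patterns, not piped ones that start with "|".

```shell
#!/usr/bin/env bash
# Print the directory part of a path-style core_pattern.
core_pattern_dir() {
    dirname "$1"
}

# Succeed only if that directory exists and the current user can write to it.
core_dir_writable() {
    local dir
    dir=$(core_pattern_dir "$1")
    [ -d "$dir" ] && [ -w "$dir" ]
}

pattern="/tmp/core-%e-%s-%u-%g-%p-%t"
if core_dir_writable "$pattern"; then
    echo "$(core_pattern_dir "$pattern") is writable by $(id -un)"
else
    echo "$(core_pattern_dir "$pattern") is NOT writable by $(id -un)"
fi
```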

The format specifiers used in kernel.core_pattern mean the following:
%% - A single % character
%p - PID of dumped process
%u - real UID of dumped process
%g - real GID of dumped process
%s - number of signal causing dump
%t - time of dump (seconds since 0:00h, 1 Jan 1970)
%h - hostname (same as ’nodename’ returned by uname(2))
%e - executable filename
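
To see how the specifiers turn into a file name, here is a small illustration that expands a pattern by hand. It is only a sketch: expand_core_pattern is a made-up helper, it covers just the specifiers used in this post, and the real expansion is done by the kernel. The values passed in are the ones from the core file shown later in this post.

```shell
#!/usr/bin/env bash
# Expand a core_pattern template for illustration only.
# Arguments: pattern, executable name, signal, UID, GID, PID, epoch time.
expand_core_pattern() {
    local pattern=$1 exe=$2 sig=$3 uid=$4 gid=$5 pid=$6 ts=$7
    local out="" i ch next
    for ((i = 0; i < ${#pattern}; i++)); do
        ch=${pattern:i:1}
        if [[ $ch != "%" ]]; then
            out+=$ch
            continue
        fi
        # Look at the character after the "%" and consume it.
        i=$((i + 1))
        next=${pattern:i:1}
        case $next in
            %) out+="%" ;;
            e) out+=$exe ;;
            s) out+=$sig ;;
            u) out+=$uid ;;
            g) out+=$gid ;;
            p) out+=$pid ;;
            t) out+=$ts ;;
            *) out+="%$next" ;;   # specifiers not handled here (e.g. %h) are left as-is
        esac
    done
    printf '%s\n' "$out"
}

expand_core_pattern "/tmp/core-%e-%s-%u-%g-%p-%t" postgres 6 501 501 6159 1423325468
# prints /tmp/core-postgres-6-501-501-6159-1423325468
```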

-- Verify that the cluster has been started with core file generation enabled.
bash-4.1$ ps -ef|grep data|grep "9.3"
504       3405     1  0 20:44 ?        00:00:00 /opt/PostgresPlus/9.3AS/bin/edb-postgres -D /opt/PostgresPlus/9.3AS/data
postgres  6155     1  0 21:37 pts/0    00:00:00 /opt/PostgreSQL/9.3/bin/postgres -D /opt/PostgreSQL/9.3/data
bash-4.1$ grep -i core /proc/6155/limits
Max core file size        unlimited            unlimited            bytes  
bash-4.1$
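
If you want to script this check, something like the following could work. It is only a sketch: core_soft_limit is a made-up helper name, and it assumes the Linux /proc/<pid>/limits layout shown above.

```shell
#!/usr/bin/env bash
# Read the contents of a /proc/<pid>/limits file on stdin and print the
# soft "Max core file size" limit.
core_soft_limit() {
    # Fields: Max core file size <soft> <hard> <units>; the soft limit is field 5.
    awk '/^Max core file size/ { print $5 }'
}

# Example against the current shell, skipped where /proc is not available.
if [ -r "/proc/$$/limits" ]; then
    echo "soft core limit for this shell: $(core_soft_limit < "/proc/$$/limits")"
fi
```

For the postmaster above the call would be core_soft_limit < /proc/6155/limits. A value of 0 means no core file will be written.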

-- Check that a crash creates core files in the given location. Let me kill a process to generate a core.
bash-4.1$ ps -ef|grep 6155
postgres  6155     1  0 21:37 pts/0    00:00:00 /opt/PostgreSQL/9.3/bin/postgres -D /opt/PostgreSQL/9.3/data
postgres  6156  6155  0 21:37 ?        00:00:00 postgres: logger process                                  
postgres  6158  6155  0 21:37 ?        00:00:00 postgres: checkpointer process                            
postgres  6159  6155  0 21:37 ?        00:00:00 postgres: writer process                                  
postgres  6160  6155  0 21:37 ?        00:00:00 postgres: wal writer process                              
postgres  6161  6155  0 21:37 ?        00:00:00 postgres: autovacuum launcher process                    
postgres  6162  6155  0 21:37 ?        00:00:00 postgres: stats collector process                        
postgres  6527  6001  0 21:38 pts/0    00:00:00 grep 6155

bash-4.1$ kill -ABRT 6159   # Kill the writer process to get a core dump.
bash-4.1$
bash-4.1$
bash-4.1$ ls -ltrh /tmp/core*postgre*
-rw-------. 1 postgres postgres 143M Feb  7 21:41 /tmp/core-postgres-6-501-501-6159-1423325468
bash-4.1$
bash-4.1$ date
Sat Feb  7 21:41:25 IST 2015

-- Check the log entries
-bash-4.1$ grep "6159"  postgresql-2015-02-07_213749.log
2015-02-07 21:41:09 IST LOG:  background writer process (PID 6159) was terminated by signal 6: Aborted
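
As a cross-check, the last field of the core file name is the %t value (seconds since the epoch), so it can be decoded and compared with the log timestamp. Here is a small sketch, assuming GNU date (for "-d @...") and a pattern that ends in %t, as the one above does. The helper name core_dump_time_utc is made up for this example.

```shell
#!/usr/bin/env bash
# Print the dump time (UTC) encoded at the end of a core file name.
core_dump_time_utc() {
    local name=$1
    local epoch=${name##*-}     # text after the last "-" is the %t field
    date -u -d "@$epoch" '+%Y-%m-%d %H:%M:%S'
}

core_dump_time_utc /tmp/core-postgres-6-501-501-6159-1423325468
# prints 2015-02-07 16:11:08
```

16:11:08 UTC is 21:41:08 IST, which lines up with the 21:41:09 log entry above.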

Wow, the core file was generated in the new location. Any comments/suggestions are most welcome.