Problem Description
A customer's Linux host was rebooting frequently. After receiving the repair call, engineers logged in and
found a large number of OOM (out-of-memory) records in the log. After every reboot,
the host memory was quickly exhausted even though no application had been started. At the same time,
other clients transferring data to the host over sftp often saw transfers fail because of the memory shortage,
which affected the business.
Troubleshooting
2.1 Messages file contents
Dec 24 15:49:55 xxxxxxxxxxxx kernel: [ 52.671197] BIOS EDD facility v0.16 2004-Jun-25, 2 devices found
Dec 24 15:49:56 xxxxxxxxxxxx kernel: [ 53.763632] oom_kill_process: 21 callbacks suppressed
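To get a rough idea of how often the OOM killer fired, the matching kernel lines in the messages file can be counted. This is a minimal sketch: the function name count_oom_events is made up for illustration, the search patterns are an assumption about what OOM lines look like on this kernel, and the log path differs between distributions.

```bash
#!/bin/bash
# Count log lines that mention the OOM killer.
# Assumption: these patterns cover the OOM-related lines in your log.
count_oom_events() {
    local logfile="$1"
    grep -c -i -E 'out of memory|oom_kill|oom-killer' "$logfile"
}

# Example (the path is an assumption; adjust for your distribution):
# count_oom_events /var/log/messages
```

A count that keeps growing right after a reboot, before any application starts, points at a system-level reservation rather than at a leaking process.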
2.2 System resource utilization
After talking with the customer, I learned that PotralAgent is a regular process that also runs on other machines,
where resource usage shows no abnormality.
The host is configured with 8 GB of RAM, of which very little is free,
and some swap is in use as well, indicating that memory resources are tight.
2.3 Memory Usage Analysis
Comparison of memory information before and after reboot
Field | Before reboot | After reboot |
MemTotal | 8062712 kB | 8062712 kB |
MemFree | 590024 kB | 566876 kB |
Buffers | 2584 kB | 968 kB |
Cached | 30332 kB | 16080 kB |
SwapCached | 832 kB | 552 kB |
AnonHugePages | 0 kB | 0 kB |
Slab | 35932 kB | 35308 kB |
SReclaimable | 4900 kB | 3736 kB |
SUnreclaim | 31032 kB | 31572 kB |
KernelStack | 1704 kB | 1304 kB |
PageTables | 3916 kB | 2432 kB |
HugePages_Total | 3559 | 3580 |
HugePages_Free | 3559 | 3580 |
HugePages_Rsvd | 0 | 0 |
HugePages_Surp | 0 | 0 |
Hugepagesize | 2048 kB | 2048 kB |
CommitLimit | 10872696 kB | 10851192 kB |
Committed_AS | 197048 kB | 542772 kB |
DirectMap4k | 57344 kB | 57344 kB |
DirectMap2M | 8331264 kB | 8331264 kB |
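The figures show that memory overcommit is not the problem (Committed_AS is far below CommitLimit) and that THP is not in use (AnonHugePages is 0).
The cause of the failure is now clear: the static huge page pool holds 3559 pages of 2MB each, about 7GB (3559 × 2048 kB), and none of them are in use (HugePages_Free equals HugePages_Total).
Together with other memory overhead, this reservation leaves almost none of the 8GB of physical memory for ordinary allocations.
Note that DirectMap2M only reports how much memory the kernel maps with 2MB pages in its direct mapping; it is not a measure of memory consumption.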
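The size of the huge page reservation can be worked out directly from the HugePages_Total, Hugepagesize and MemTotal fields of /proc/meminfo. The sketch below does that with awk; the function names are hypothetical, and the file argument exists only so the calculation can be tried on a saved copy of meminfo.

```bash
#!/bin/bash
# Memory reserved by the static huge page pool, in kB,
# read from a file in /proc/meminfo format.
hugepage_reserved_kb() {
    local meminfo="${1:-/proc/meminfo}"
    awk '/^HugePages_Total:/ {n = $2}
         /^Hugepagesize:/    {sz = $2}
         END {print n * sz}' "$meminfo"
}

# Share of MemTotal taken by the pool, as an integer percentage (rounded down).
hugepage_reserved_pct() {
    local meminfo="${1:-/proc/meminfo}"
    awk '/^MemTotal:/        {total = $2}
         /^HugePages_Total:/ {n = $2}
         /^Hugepagesize:/    {sz = $2}
         END {if (total > 0) print int(n * sz * 100 / total)}' "$meminfo"
}
```

With the "before reboot" values from the table (3559 pages of 2048 kB on a host with 8062712 kB), the pool comes to 7288832 kB, roughly 90% of physical memory.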
2.4 Check the system configuration
cat /proc/sys/vm/nr_hugepages
3559
cat /etc/sysctl.conf|grep nr_hugepages
vm.nr_hugepages = 5120
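The system is configured for 5,120 huge pages, but only 3,559 pages are actually allocated at run time, presumably because the kernel could not find more free contiguous memory. Those 3,559 pages already reserve about 7GB (3559 × 2048 kB = 7288832 kB).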
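The comparison between the configured and the actually allocated page count can be scripted. This is only a sketch: the function names are invented for illustration, and it assumes the setting appears as an uncommented vm.nr_hugepages line in the given file.

```bash
#!/bin/bash
# Read the vm.nr_hugepages value from a sysctl-style file.
# Commented-out lines are ignored; the last assignment wins.
configured_hugepages() {
    local conf="${1:-/etc/sysctl.conf}"
    grep -E '^[[:space:]]*vm\.nr_hugepages[[:space:]]*=' "$conf" \
        | tail -n 1 | cut -d= -f2 | tr -d '[:space:]'
}

# Pages requested in the configuration but not allocated by the kernel.
hugepage_shortfall() {
    local configured="$1" actual="$2"
    echo $(( configured - actual ))
}

# Usage on a live system:
# hugepage_shortfall "$(configured_hugepages)" "$(cat /proc/sys/vm/nr_hugepages)"
```

For the values in this case (5120 configured, 3559 allocated) the shortfall is 1561 pages, which means the kernel had already run out of memory to reserve.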
Solution
Lowering the nr_hugepages value, or removing the setting altogether, stops the huge page pool from filling the system memory,
and the problem is solved.
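One possible way to apply the fix is sketched below. The helper set_hugepages_in_conf is hypothetical, the value 1024 is only a placeholder (the right number depends on what actually needs huge pages on the host), and the sed invocation assumes GNU sed as shipped with Linux.

```bash
#!/bin/bash
# Rewrite the vm.nr_hugepages line in a sysctl-style file in place.
set_hugepages_in_conf() {
    local conf="$1" pages="$2"
    sed -i -E "s/^[[:space:]]*vm\.nr_hugepages[[:space:]]*=.*/vm.nr_hugepages = ${pages}/" "$conf"
}

# Usage as root (1024 is a placeholder, not a recommendation):
# set_hugepages_in_conf /etc/sysctl.conf 1024   # persist across reboots
# sysctl -w vm.nr_hugepages=1024                # apply to the running kernel
```

Changing both the file and the running kernel keeps the two in sync, so the next reboot does not bring the oversized reservation back.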
Lessons Learned
4.1 Introduction to HugePages
As the size of computing requirements continues to increase, so does the demand for memory by applications.
In order to realize the virtual memory management mechanism, the operating system implements memory paging.
Since the inception of the memory "paging mechanism", the default size of memory pages has been set to 4096 bytes (4KB).
Although in principle the memory page size is configurable, the majority of operating system implementations still use the default 4KB pages.
The 4KB page size was reasonable when the "paging mechanism" was introduced because memory was only a few tens of megabytes at that time,
but now that physical memory has grown to several gigabytes or even dozens of gigabytes, is it still reasonable for operating systems to use
the 4KB page size as the basic unit?
When memory-hungry applications run on Linux, the default page size of 4KB generates a lot of TLB misses and page faults,
which greatly affects the performance of the application.
When the operating system uses 2MB or more as the paging unit, the number of TLB misses and page faults is greatly reduced,
which significantly improves application performance. This is the direct reason why the Linux kernel introduced huge page support.
Assume an application requires 2MB of memory. If the OS uses 4KB as the paging unit, it needs 512 pages,
which in turn means 512 TLB entries
and 512 page table entries,
and the OS goes through at least 512 TLB misses and 512 page faults to map the whole 2MB of application space to physical memory.
When the OS uses 2MB as the paging unit,
only 1 TLB miss and 1 page fault
are needed to establish the virtual-to-physical mapping for the same 2MB of application space,
and no further TLB misses or page faults occur during operation (assuming no TLB entry replacement and no swapping).
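The page counts in this example follow from a simple ceiling division. The short sketch below reproduces them with shell arithmetic; pages_needed is an illustrative helper, not part of any tool.

```bash
#!/bin/bash
# Number of pages needed to map a region: ceil(bytes / page_size).
pages_needed() {
    local bytes="$1" page_size="$2"
    echo $(( (bytes + page_size - 1) / page_size ))
}

region=$(( 2 * 1024 * 1024 ))       # 2MB of application memory
pages_needed "$region" 4096         # with 4KB pages: prints 512
pages_needed "$region" "$region"    # with one 2MB page: prints 1
```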
In order to achieve large page support at minimal cost, the Linux operating system uses a hugetlbfs-based special file system for 2M-byte large page support.
This special file system approach to large page support allows applications the flexibility to choose the virtual page size as needed without being forced to use 2MB pages.
4.2 THP
Transparent Huge Pages (THP) are enabled by default for all applications in RHEL 6.
The kernel tries to satisfy memory allocations with huge pages whenever possible,
and the main kernel address space itself is mapped with huge pages, reducing the TLB pressure from kernel code.
If no huge pages are available (e.g. due to physical contiguous memory being unavailable), the kernel will fall back to normal 4KB pages.
THP is also swappable (unlike hugetlbfs).
This is accomplished by splitting large pages into smaller 4KB pages, which are then swapped out normally.
4.3 THP Considerations
Static huge pages are managed separately from normal memory, which consists of 4KB pages,
and static huge pages do not support swapping: they cannot be swapped out to external storage.
THP and static huge pages look similar, but their attributes and behavior in Linux are completely different.
One way to picture it:
a static huge page is molded in one piece,
while a THP is welded together from smaller parts.
In nature, THP is closer to a normal page.
When a THP has to be swapped out, it is broken up into 4KB pages that are written to swap space;
when it is swapped back in, the pages may need to be re-aggregated,
which has a significant impact on performance.
Oracle officially recommends against enabling Transparent Huge Pages (THP) on
RedHat 6/OEL 6/SLES 11/UEK2 kernels because of known issues with THP:
In a RAC environment, THP can cause abnormal node restarts and performance issues;
In a standalone environment.
Knowledge Expansion
5.1 Enabling HugePages
Edit sysctl.conf
vi /etc/sysctl.conf
vm.nr_hugepages = xxxx
Edit limits.conf
vi /etc/security/limits.conf
* soft memlock -1
* hard memlock -1
Apply the settings
sysctl -p
Verify
grep -i hugepages /proc/meminfo
5.2 Oracle's Use of HugePages
Prior to 11.2.0.2, the SGA of a database could only use huge pages for all of its memory or for none of it.
From 11.2.0.2 onward, Oracle adds a new parameter, USE_LARGE_PAGES, to manage how the database uses huge pages.
USE_LARGE_PAGES accepts four values: "true" (default), "only", "false", and "auto" (since the 11.2.0.3 patchset):
"true" (default): if huge pages are configured on the system, the SGA uses them in preference and uses as many as possible.
In 11.2.0.2, if there are not enough huge pages for the whole SGA, the SGA does not use them at all. This can result in an ORA-4030 error, because the huge pages
have already been carved out of physical memory, yet the SGA is allocated from the remaining memory,
leaving memory resources insufficient.
In 11.2.0.3, this policy changed: the SGA uses the huge pages that are available
and falls back to regular-sized pages once the huge pages are used up.
"false": the SGA does not use huge pages.
"only": the database instance cannot start if there are not enough huge pages (to prevent memory overflow).
"auto" (11.2.0.3 and later): triggers the oradism process to reconfigure the Linux kernel and increase
the number of huge pages. oradism needs to be given the appropriate permissions, as follows:
-rwsr-x--- 1 root
This option does not change the hugepages value in the /etc/sysctl.conf file, so when the OS reboots,
the system reverts to the hugepages value configured in /etc/sysctl.conf.
For Oracle-only servers, sizing HugePages to the total SGA size (the sum of all instance SGAs) is sufficient.
If you add physical memory, add new instances to the server, or change the SGA size,
recalculate the required HugePages setting.
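As a rough illustration of this sizing rule, the number of huge pages needed to hold a given total SGA is a ceiling division by the huge page size. The helper below is hypothetical and assumes a 2048 kB huge page unless told otherwise; the script in section 5.4 works from the real shared memory segments and is the better guide on a running system.

```bash
#!/bin/bash
# Huge pages needed to hold an SGA: ceil(sga_kb / hugepage_kb).
# Arguments: total SGA in MB, huge page size in kB (default 2048).
hugepages_for_sga() {
    local sga_mb="$1" hugepage_kb="${2:-2048}"
    local sga_kb=$(( sga_mb * 1024 ))
    echo $(( (sga_kb + hugepage_kb - 1) / hugepage_kb ))
}

hugepages_for_sga 4096    # a 4GB SGA with 2MB pages: prints 2048
```

On the 8GB host in this case, the configured 5120 pages would correspond to a 10GB reservation, more than the machine physically has.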
5.3 Turning off THP
To see if THP is turned on:
cat /sys/kernel/mm/transparent_hugepage/enabled
To turn it off at every boot, edit /etc/rc.local and add the following:
echo "never" >/sys/kernel/mm/transparent_hugepage/enabled
echo "never" >/sys/kernel/mm/transparent_hugepage/defrag
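The enabled file lists all modes on one line and marks the active one with square brackets, for example "always madvise [never]". The sketch below extracts the bracketed word; thp_active_mode is an illustrative name, and on some RHEL 6 kernels the directory may be named redhat_transparent_hugepage instead, so treat the path as an assumption.

```bash
#!/bin/bash
# Print the active THP mode, i.e. the word in square brackets,
# from a status line read on standard input.
thp_active_mode() {
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p'
}

# Usage on a live system:
# thp_active_mode < /sys/kernel/mm/transparent_hugepage/enabled
```

After the rc.local change has taken effect, the function should report never for both the enabled and the defrag files.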
5.4 HugePages Setup Script
Executing this script prints a suggested value for the HugePages setting.
The script requires the bc rpm package.
Run it as the oracle user, and make sure the database instance is up and running first.
#!/bin/bash
#
# hugepages_settings.sh
#
# Linux bash script to compute values for the
# recommended HugePages/HugeTLB configuration
# on Oracle Linux
#
# Note: This script does calculation for all shared memory
# segments available when the script is run, no matter it
# is an Oracle RDBMS shared memory segment or not.
#
# This script is provided by Doc ID 401749.1 from My Oracle Support
# http://support.oracle.com
# The script depends on bc; stop early if it is not installed
if rpm -q bc >/dev/null 2>&1
then
    echo "bc rpm package is already installed, continuing."
else
    echo "Please install the bc rpm package from the OS installation media."
    exit 1
fi
# Welcome text
echo "
This script is provided by Doc ID 401749.1 from My Oracle Support
(http://support.oracle.com) where it is intended to compute values for
the recommended HugePages/HugeTLB configuration for the current shared
memory segments on Oracle Linux. Before proceeding with the execution please note following:
* For ASM instance, it needs to configure ASMM instead of AMM.
* The 'pga_aggregate_target' is outside the SGA and
you should accommodate this while calculating SGA size.
* In case you changes the DB SGA size,
as the new SGA will not fit in the previous HugePages configuration,
it had better disable the whole HugePages,
start the DB with new SGA size and run the script again.
And make sure that:
* Oracle Database instance(s) are up and running
* Oracle Database 11g Automatic Memory Management (AMM) is not setup
(See Doc ID 749851.1)
* The shared memory segments can be listed by command:
# ipcs -m
Press Enter to proceed..."
read
# Check for the kernel version
KERN=$(uname -r | awk -F. '{ printf("%d.%d\n",$1,$2); }')
# Find out the HugePage size
HPG_SZ=$(grep Hugepagesize /proc/meminfo | awk '{print $2}')
if [ -z "$HPG_SZ" ]; then
    echo "The hugepages may not be supported in the system where the script is being executed."
    exit 1
fi
# Initialize the counter
NUM_PG=0
# Cumulative number of pages required to handle the running shared memory segments
for SEG_BYTES in $(ipcs -m | cut -c44-300 | awk '{print $1}' | grep "[0-9][0-9]*")
do
    MIN_PG=$(echo "$SEG_BYTES/($HPG_SZ*1024)" | bc -q)
    if [ "$MIN_PG" -gt 0 ]; then
        NUM_PG=$(echo "$NUM_PG+$MIN_PG+1" | bc -q)
    fi
done
RES_BYTES=`echo "$NUM_PG * $HPG_SZ * 1024" | bc -q`
# An SGA less than 100MB does not make sense
# Bail out if that is the case
if [ "$RES_BYTES" -lt 100000000 ]; then
    echo "***********"
    echo "** ERROR **"
    echo "***********"
    echo "Sorry! There are not enough total of shared memory segments allocated for
HugePages configuration. HugePages can only be used for shared memory segments
that you can list by command:
# ipcs -m
of a size that can match an Oracle Database SGA. Please make sure that:
* Oracle Database instance is up and running
* Oracle Database 11g Automatic Memory Management (AMM) is not configured"
    exit 1
fi
# Finish with results
case $KERN in
    '2.2') echo "Kernel version $KERN is not supported. Exiting." ;;
    '2.4') HUGETLB_POOL=$(echo "$NUM_PG*$HPG_SZ/1024" | bc -q);
           echo "Recommended setting: vm.hugetlb_pool = $HUGETLB_POOL" ;;
    '2.6'|'3.8'|'3.10'|'4.1')
           echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
    # Any other kernel version: say so instead of exiting silently
    *) echo "Kernel version $KERN is not supported by this script (yet). Exiting." ;;
esac
# End
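The per-segment arithmetic in the script's loop can be checked in isolation. The sketch below mirrors that logic with plain shell arithmetic instead of bc; pages_for_segment is a name introduced here for illustration and is not part of the Oracle script.

```bash
#!/bin/bash
# Pages counted for one shared memory segment, following the loop above:
# floor(seg_bytes / page_bytes) + 1, or 0 if the segment is smaller than one huge page.
pages_for_segment() {
    local seg_bytes="$1" hpg_kb="$2"
    local min_pg=$(( seg_bytes / (hpg_kb * 1024) ))
    if [ "$min_pg" -gt 0 ]; then
        echo $(( min_pg + 1 ))
    else
        echo 0
    fi
}

pages_for_segment 4294967296 2048   # a 4GB segment with 2MB pages: prints 2049
```

The extra page per segment is headroom for a segment that does not end exactly on a huge page boundary, which is why the recommendation can be slightly higher than the SGA size divided by the page size.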