Monday, 27 June 2011

AIX JFS error


How to resolve issues with local JFS file systems that have run out of available space.
 
Resolving full filesystems:
 
To resolve an out-of-filesystem-space issue, complete these steps: 
·         Determine what filesystems are full. 
·         Determine where space is allocated within the source filesystem.
·         Take the required steps to resolve the out-of-space condition. 
 
Once the above steps have been completed, the situation should be resolved,
or the reason for the problem should be understood.
 
If these steps do not resolve the issue, filesystem corruption may be involved. Unmount the filesystem and run a full fsck against it to verify that no corruption exists.
 
Determining what filesystems are full
 
The df command is used to get filesystem status information. The relevant 
field to consider is %Used.
 
%Used = percentage of total filesystem space currently allocated
 
Example: 
 
   Filesystem   1024-blocks      Free %Used    Iused %Iused Mounted on
   /dev/hd4           12288        68   99%     1823    23% /
   /dev/hd2          409600     20436   96%    16181    16% /usr
   /dev/hd9var         8192      6088   26%      163     8% /var
   /dev/hd3           12288     11340    8%       87     3% /tmp
   /dev/hd1           57344     13872   76%     1459    11% /home

In this example, nearly all of the space in the root filesystem / is allocated, so / is the filesystem with the free-space problem. The next step is to determine what kind of space problem exists.

Keep in mind that mounts are hierarchical, and that a later mount hides whatever is already beneath its mount point. For example, if you have a mount entry called /myfilesystem/mydata immediately followed by a mount entry called /myfilesystem, then the /myfilesystem mount covers the /myfilesystem/mydata mount point, and the /myfilesystem/mydata filesystem and any data that resides there can no longer be reached.
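
The check described above can be scripted. The following is a minimal sketch, assuming the AIX df -k column layout shown in the example (column 4 is %Used, column 7 is the mount point). The function name full_filesystems and the default threshold of 90 are illustrative choices, not part of AIX.

```shell
# Print the mount point and %Used of every filesystem at or above a threshold.
# Reads df -k output on stdin; assumes the AIX layout shown in the example:
#   Filesystem 1024-blocks Free %Used Iused %Iused Mounted on
full_filesystems() {
    threshold=${1:-90}
    awk -v limit="$threshold" '
        NR > 1 {                      # skip the header line
            used = $4
            sub(/%/, "", used)        # strip the percent sign
            if (used + 0 >= limit) print $7, $4
        }'
}

# Usage on a live system:
#   df -k | full_filesystems 90
```

On other systems the column positions differ, so the field numbers would need adjusting.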
 
Determining where space is allocated within each filesystem
 
There are two commands generally used to determine how and where filesystem
allocation is placed: df and du . 
 
df derives the space used in a filesystem from the space that is currently unallocated. For instance, if a filesystem consists of 8192 512-byte blocks, and 4096 of those blocks are currently not allocated to anything, then the total space in use in the filesystem is 4096 512-byte blocks.
 
Allocated Storage = Total Storage - Unallocated Storage
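
As a small worked example of this formula, here is a sketch in shell arithmetic. The function name allocated_blocks is hypothetical; both arguments must be in the same unit.

```shell
# Allocated storage = total storage - unallocated storage.
# Both arguments must use the same block size
# (for example the 1024-byte blocks reported by df -k).
allocated_blocks() {
    total=$1
    free=$2
    echo $(( total - free ))
}

# Using the / line from the df example (12288 total, 68 free):
#   allocated_blocks 12288 68
```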
 
df is inherently the most reliable command to report filesystem usage, because df reports information based on the filesystem as a whole.
 
du is a file-oriented command. It reports the space allocated to a specified file or directory. du works on the file or directory it is given (the current directory if none is named), and is not confined to a single filesystem. For instance, running du / would give allocation information for all files in / . This would include all files in the / filesystem and any other filesystem mounted under / , such as /tmp, /var, and /usr . You could use the -x option of du to keep the operation within one filesystem, but there are cases where the results of using this option may be incomplete.
 
du will only report space taken by files. It will not report space taken by filesystem metadata, such as inodes, inode maps, or disk maps. Inode maps, disk maps, and other areas reserved for filesystem use take up a negligible portion of the filesystem space, but the area reserved for inodes can be substantial. Its size is determined by the NBPI (Number of Bytes Per Inode) chosen when the filesystem was created. Each inode uses 128 bytes of filesystem space, so the share of the filesystem taken by inodes is the percentage defined below:
 
  (128 / NBPI) * 100 
 
By default, a filesystem uses an NBPI of 4096, so the general overhead for
a filesystem will be about 3% (128 / 4096 = 3.125%).
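
To see how the overhead changes with NBPI, the formula can be evaluated with a short helper. This is only a sketch: inode_overhead_pct is a hypothetical name, and the 128-byte inode size is the figure given in the text above.

```shell
# Percentage of a filesystem taken by inodes: (128 / NBPI) * 100.
# Shell arithmetic is integer-only, so awk does the division.
inode_overhead_pct() {
    nbpi=$1
    awk -v nbpi="$nbpi" 'BEGIN { printf "%.3f\n", (128 / nbpi) * 100 }'
}

# Usage:
#   inode_overhead_pct 4096
#   inode_overhead_pct 512
```

A smaller NBPI means more inodes and therefore more overhead.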
 
 
To determine what the NBPI is for the filesystem in question, issue the lsfs
command with the -q option on the mount point of the filesystem.
 
Example: 
# lsfs -q /
Name      Nodename   Mount Pt    VFS   Size    Options   Auto  Accounting
/dev/hd4  --         /           jfs   81920   --        yes   no 
(lv size: 24576, fs size: 24576, frag size: 4096, nbpi: 4096, compress: no, bf : false, ag: 8)
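
If the NBPI is needed in a script, it can be pulled out of that output. A minimal sketch, assuming the "nbpi: N" text appears exactly as in the example above; nbpi_from_lsfs is a hypothetical helper name.

```shell
# Extract the nbpi value from lsfs -q output read on stdin.
# Assumes the parenthesized detail line contains "nbpi: <number>".
nbpi_from_lsfs() {
    sed -n 's/.*nbpi: *\([0-9][0-9]*\).*/\1/p'
}

# Usage on a live system:
#   lsfs -q / | nbpi_from_lsfs
```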
 
 
du will only show allocation information about files it can reference. There are two cases where du may not show information about allocated storage.

Case 1: The file is hidden because a filesystem or file has been mounted on top of this entry. If you had a file that was stored in /bobby, and then mounted a filesystem on top of /bobby, then du would no longer see what was in the directory /bobby. It would only see the information in the filesystem that was mounted over /bobby.

Case 2: The file is open by other applications, and the file has been removed. In this case, the storage for that file will remain allocated until all references to that file have been closed. Without a directory entry, du will not show allocated space for that file, though df will show this space as taken from the filesystem as a whole.
 
 
 
Determining files that are using space in a filesystem
 
To address the situation presented in case 1 in the previous section, mount the primary mount point of the desired filesystem on a secondary mount point. The view through the secondary mount point shows only that filesystem, without any of the filesystems mounted under the primary mount point.

Example:
 
   mount / /mnt
 
 
In the above example, we mounted the / filesystem over /mnt . The effect is that if we go into /mnt , we see all the information about the / filesystem and no other filesystem mounted under / . If we run cd /mnt/tmp , then we are actually in the directory /tmp in the / filesystem, and not in the /tmp filesystem. Also, the total reported by du -sk /mnt should closely match the space used on / as reported by the df command. If it does not, then this indicates that case 2 may be occurring. For now, we will proceed with case 1.
 
We can now investigate disk usage accurately. First, go into the root directory of this filesystem. 
 
   cd /mnt
 
Run the du command to get an accurate accounting of the space that can be seen for all accessible files in this filesystem. 
 
# du -sk /mnt
11778   /mnt
 
This will report the accounted space taken for files in kilobytes. If you add
the overhead of the filesystem described above, this figure should closely
match that given by the df command. 
 
Example: 
# df -vk /
Filesystem  1024-blocks   Used  Free  %Used  Iused  Ifree  %Iused  Mounted on
/dev/hd4          12288  12220    68    99%   1823   1249     23%  /
 
The overhead in this case will be (128/4096) * 12288K = 384K

The total space that can be accounted for is 11778K + 384K = 12162K, versus the 12220K that df reports as used. The two figures agree closely, which indicates that the space in use is accounted for by files somewhere in the filesystem.

If the difference between the two is large, then case 2 is more likely to be occurring. The section "Resolving space taken by open files that have been deleted" addresses the situation in case 2. If the figures agree, continue with the following steps.
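
The same comparison can be done in a script. This is a sketch under the assumptions stated in the text (128 bytes per inode, sizes in kilobytes); account_for_space is a hypothetical function name. It uses exact integer arithmetic, so its overhead figure can differ slightly from a hand estimate rounded to 3%.

```shell
# Compare what du can see, plus inode overhead, with what df reports as used.
account_for_space() {
    fs_size_kb=$1   # total size from df -k (1024-blocks column)
    df_used_kb=$2   # Used column from df -vk
    du_kb=$3        # total from du -sk on the secondary mount point
    nbpi=$4         # nbpi value from lsfs -q

    overhead_kb=$(( fs_size_kb * 128 / nbpi ))
    accounted_kb=$(( du_kb + overhead_kb ))
    gap_kb=$(( df_used_kb - accounted_kb ))
    echo "overhead=${overhead_kb}K accounted=${accounted_kb}K unexplained=${gap_kb}K"
}

# Usage with the figures from the example:
#   account_for_space 12288 12220 11778 4096
```

A small unexplained figure points to case 1; a large one suggests that deleted files are still being held open (case 2).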
 
Run the following command on the new mount point of this filesystem to get 
a sorted disk usage of the filesystem's root directory. 
 
 
   ls -A . | while IFS= read -r name; do du -sk "$name"; done | sort -nr

Example:

   # ls -A . | while IFS= read -r name; do du -sk "$name"; done | sort -nr
   2168    etc
   192     lpp
   168     sbin
    40     dev
    28     export
    12     smit.log
     4     var
     4     usr
     4     tmp
     4     tftpboot
     4     src
     4     smit.script
     4     mnt
     4     .sh_history
     4     .profile
     0     unix
     0     u
     0     lib
     0     bootrec
     0     bin

This command lists the disk usage of every entry in the current directory, sorted by size in decreasing order. If the largest entry is a directory, change into that directory and re-run the preceding command to determine what is taking up space within it. Continue these steps until you find the file or files responsible, at which point you can take appropriate action.
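
Because this command is re-run at each level, it can be convenient to wrap it in a function. A minimal sketch follows; du_sorted is a hypothetical name, and the ./ prefix is added only so that names beginning with a dash are not mistaken for options.

```shell
# List every entry in a directory with its size in KB, largest first.
# Defaults to the current directory when no argument is given.
du_sorted() {
    dir=${1:-.}
    (
        cd "$dir" || exit 1
        ls -A | while IFS= read -r name; do
            du -sk "./$name"
        done | sort -nr
    )
}

# Usage:
#   du_sorted /mnt
#   du_sorted /mnt/etc
```

The cd runs in a subshell, so the caller's working directory is unchanged.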
 
 
Resolving space taken by open files that have been deleted
 
In case 2, there are files within the filesystem that are opened by applications but have been removed from the filesystem tree. This behavior is documented in the unlink() system call as follows:

   When all links to a file are removed and no process has the file open, all resources associated with the file are reclaimed, and the file is no longer accessible. If one or more processes have the file open when the last link is removed, the directory entry disappears. However, the removal of the file contents is postponed until all references to the file are closed.
 
You can use the fuser command with the -dV flag on the full path to the 
device on which the filesystem resides. This will display files that have
been removed but are still open. It will also report the inode number and 
size of such files. Using the process ID returned for these files, you can
instruct the source application to close these files, or you can exit the 
application. Once this has occurred, and fuser no longer shows this deleted 
file, the space will be returned to the filesystem for general use. 
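
The behavior is easy to reproduce on a scratch file. The sketch below is an illustration only: demo_unlinked_file and the scratch file name are hypothetical, and the function simply holds a file open on a spare file descriptor while removing it.

```shell
# Reproduce case 2: a file that is removed while still open.
demo_unlinked_file() {
    scratch=${TMPDIR:-/tmp}/unlink_demo.$$
    echo "still allocated" > "$scratch"

    exec 3< "$scratch"     # hold the file open on descriptor 3
    rm -f "$scratch"       # remove the last link; the directory entry disappears

    if [ -e "$scratch" ]; then
        echo "unexpected: directory entry still present" >&2
    fi

    # At this point du cannot see the file, but its blocks remain allocated.
    # On AIX, this is where fuser -dV on the filesystem's device, as described
    # above, would be expected to list the file.
    cat <&3                # the contents are still readable through descriptor 3

    exec 3<&-              # closing the last reference lets the space be reclaimed
}

# Usage:
#   demo_unlinked_file
```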
