RES != CODE + DATA in the output information of the top command, why?


I'll explain this with the help of an example of what happens when a program allocates and uses memory. Specifically, this program:

    #include <stdio.h>
    #include <stdlib.h>
    #include <errno.h>
    #include <string.h>

    int main()
    {
            int *data, size, count, i;

            /* sizeof yields a size_t, so print it with %zu rather than %d */
            printf( "fyi: your ints are %zu bytes large\n", sizeof(int) );

            printf( "Enter number of ints to malloc: " );
            scanf( "%d", &size );
            data = malloc( sizeof(int) * size );
            if( !data ){
                    perror( "failed to malloc" );
                    exit( EXIT_FAILURE );
            }

            printf( "Enter number of ints to initialize: " );
            scanf( "%d", &count );
            /* writing to the memory is what makes its pages resident */
            for( i = 0; i < count; i++ ){
                    data[i] = 1337;
            }

            printf( "I'm going to hang out here until you hit <enter>" );
            /* first loop eats the newline left behind by scanf,
               second loop waits for the actual <enter> */
            while( getchar() != '\n' );
            while( getchar() != '\n' );

            exit( EXIT_SUCCESS );
    }

This is a simple program that asks you how many integers to allocate, allocates them, asks how many of those integers to initialize, and then initializes them. For a run where I allocate 1250000 integers and initialize 500000 of them:

    $ ./a.out
    fyi: your ints are 4 bytes large
    Enter number of ints to malloc: 1250000
    Enter number of ints to initialize: 500000

Top reports the following information:

      PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  SWAP CODE DATA COMMAND
    <program start>
    11129 xxxxxxx   16   0  3628  408  336 S    0  0.0   0:00.00 3220    4  124 a.out
    <allocate 1250000 ints>
    11129 xxxxxxx   16   0  8512  476  392 S    0  0.0   0:00.00 8036    4 5008 a.out
    <initialize 500000 ints>
    11129 xxxxxxx   15   0  8512 2432  396 S    0  0.0   0:00.00 6080    4 5008 a.out

The relevant information is:

                              DATA CODE  RES VIRT
    before allocation:         124    4  408 3628
    after 5MB allocation:     5008    4  476 8512
    after 2MB initialization: 5008    4 2432 8512

After I malloc'd 5MB of data, both VIRT and DATA increased by ~5MB, but RES did not. RES did increase after I touched 2MB of the integers I allocated, but DATA and VIRT stayed the same.
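To make the "~5MB" and "2MB" figures concrete, here is a small sketch of the arithmetic. The helper name bytes_to_page_kib is made up for illustration, and the 4096-byte page size is an assumption (it is common on x86 Linux, but sysconf(_SC_PAGESIZE) gives the real value on a given machine).

```c
#include <assert.h>
#include <stddef.h>

/* Round a byte count up to whole pages and return the result in KiB,
 * the unit top uses for the VIRT, RES and DATA columns.
 * page_size is an assumption supplied by the caller (4096 is typical). */
static size_t bytes_to_page_kib(size_t bytes, size_t page_size)
{
    size_t pages;

    assert(page_size > 0 && page_size % 1024 == 0);
    pages = (bytes + page_size - 1) / page_size;   /* ceiling division */
    return pages * (page_size / 1024);
}
```

With 4-byte ints, bytes_to_page_kib(1250000 * 4, 4096) comes out at 4884, which is the amount DATA (5008 - 124) and VIRT (8512 - 3628) grew by after the malloc. Likewise bytes_to_page_kib(500000 * 4, 4096) comes out at 1956, in line with the growth of RES (2432 - 476) after the initialization. Treat the close match as illustrative; allocator bookkeeping can shift the numbers by a page or so.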

VIRT is the total amount of virtual memory used by the process, including what is shared and what is over-committed. DATA is the amount of virtual memory used that isn't shared and that isn't code-text, i.e., it is the virtual stack and heap of the process. RES is not virtual: it is a measurement of how much physical memory the process is actually using at that specific time.
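If you want to look at these numbers without top, the sketch below parses a line in the format of /proc/<pid>/statm, which, as far as I know, is where top on Linux takes these columns from (seven page counts: size, resident, shared, text, lib, data, dt). The struct top_mem and parse_statm names are made up for this example, and the mapping of fields to top columns is my understanding rather than something taken from this thread.

```c
#include <assert.h>
#include <stdio.h>

/* Memory figures in KiB, named after the top columns they correspond to. */
struct top_mem {
    unsigned long virt, res, shr, code, data;
};

/* Parse one line in /proc/<pid>/statm format:
 *     size resident shared text lib data dt      (all counted in pages)
 * page_kib is the page size in KiB (an assumption; 4 is typical).
 * Returns 0 on success, -1 if the line does not hold seven numbers. */
static int parse_statm(const char *line, unsigned long page_kib,
                       struct top_mem *out)
{
    unsigned long size, resident, shared, text, lib, data, dt;

    assert(line != NULL && out != NULL);
    if (sscanf(line, "%lu %lu %lu %lu %lu %lu %lu",
               &size, &resident, &shared, &text, &lib, &data, &dt) != 7)
        return -1;

    out->virt = size * page_kib;      /* VIRT: everything mapped         */
    out->res  = resident * page_kib;  /* RES:  pages currently in RAM    */
    out->shr  = shared * page_kib;    /* SHR:  resident and shareable    */
    out->code = text * page_kib;      /* CODE: program text              */
    out->data = data * page_kib;      /* DATA: data + stack, virtual     */
    return 0;
}
```

As a check against the example, a hypothetical statm line of "2128 608 99 1 0 1252 0" (reconstructed by me from the last top row, assuming 4 KiB pages) parses to VIRT 8512, RES 2432, SHR 396, CODE 4 and DATA 5008.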

So in your case, the large inequality CODE+DATA < RES is likely the shared libraries included by the process. In my example (and yours), SHR+CODE+DATA is a closer approximation to RES.

Hope this helps.

There's a lot of hand-waving and voodoo associated with top and ps. There are many articles (rants?) online about the discrepancies. E.g., this and this.


This explanation is terrific and resolves some of my questions. Thanks! Meanwhile, I'd like to add what I have worked out from my own understanding of Linux memory management. If I have misunderstood anything, please correct me!

  1. Modern OS process concepts are based on virtual memory. The virtual memory system covers RAM + swap, so I think most process-related memory concepts refer to virtual memory unless noted otherwise.

  2. Any virtual address (page) allocated to a process is in one of the following states:

    a) allocated, but not mapped to any physical memory (something like COW)

    b) allocated, and already mapped to physical memory

    c) allocated, and already mapped to swapped memory.

  3. The fields in the output of the top command:

    a) VIRT -- all virtual memory that the process has the right to access, whether it is already mapped to physical memory, mapped to swapped memory, or has no mapping at all.

    b) RES -- the virtual addresses that are already mapped to physical addresses and are still in RAM.

    c) SWAP -- the virtual addresses that were mapped to physical addresses and have since been swapped out to swap space.

    d) SHR -- the shared memory available to a process (VM?)

    e) CODE + DATA -- CODE could be in state 2.b or 2.c, and DATA could be in any of the three states 2.a/2.b/2.c. Fields 3.b and 3.c together also have a field name of their own, "USED".

  4. So the calculation may look like:

    a) VIRT (VM) = RES (VM in memory) + SWAP (VM in swap) + VM unmapped (DATA, SHR?).

    b) USED = RES + SWAP

    c) SWAP = CODE (VM in swap) + DATA (VM in swap) + SHR (VM in swap?)

    d) RES = CODE (VM in memory) + DATA (VM in memory) + SHR (VM in memory?)

At least the DATA segment still includes a "DATA (VM unmapped)" part, as can be observed from the malloc example above. That is a little different from the manpage of the top command, which says "DATA: The amount of physical memory devoted to other than executable code, also known as the Data Resident Set size or DRS". So the sum (CODE + DATA + SHR) is usually larger than RES, because at least the unmapped part of DATA is counted in DATA, contrary to what the manpage claims. Thanks again.
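To check this against the numbers from the malloc example, here is a small sketch. The struct top_row type and both helper names are made up for illustration; the values fed in are the three rows of the top output quoted earlier.

```c
#include <assert.h>
#include <stddef.h>

/* One row of top output, all values in KiB. */
struct top_row {
    long virt, res, shr, code, data;
};

/* VIRT - RES: virtual memory that is not currently resident in RAM. */
static long not_resident_kib(const struct top_row *r)
{
    assert(r != NULL);
    return r->virt - r->res;
}

/* (CODE + DATA + SHR) - RES: a positive result means the sum
 * overshoots RES, i.e. part of it is not backed by RAM. */
static long overshoot_kib(const struct top_row *r)
{
    assert(r != NULL);
    return r->code + r->data + r->shr - r->res;
}
```

For the three rows, CODE + DATA + SHR overshoots RES by 56, 4928 and 2976 KiB respectively, so the sum is larger than RES each time and by far the most after the untouched malloc. Also, VIRT - RES gives 3220, 8036 and 6080, which are exactly the values shown in the SWAP column of that output.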

Regards,