[ Shellcode ]

Position independent code (AKA Shellcode) is assembly code which can simply be copied to a memory location and run. Due to the lack of need for complex loading & initialization, it is popular for many tasks such as code injection. These challenges are designed to test your ability to reverse engineer malware shellcode.

Shellcode2

This time we are given a PE file named shellcode2.exe_.

Description:

shellcode2.exe contains a flag stored within the executable. When run, the program will output an MD5 hash of the flag but not the original. Can you extract the flag?

Rules & Information:

  • You are not required to run shellcode2.exe, this challenge is static analysis only.
  • Do not use a debugger or dumper to retrieve the decrypted flag from memory, this is cheating.
  • Analysis can be done using the free version of IDA Pro (you don’t need the debugger).

Detect It Easy result:

pic1

I opened the file in Ghidra, which recognized only one function, the entry point. Here is the decompiler output:

 MD5::MD5(local_bc);
 local_2c = (code)0x12;
 local_2b = 0x24;
 local_2a = 0x28;
 local_29 = 0x34;
 local_28 = 0x5b;
 local_27 = 0x23;
 local_26 = 0x26;
 local_25 = 0x20;
 local_24 = 0x35;
 local_23 = 0x37;
 local_22 = 0x4c;
 local_21 = 0x28;
 local_20 = 0x76;
 local_1f = 0x26;
 local_1e = 0x33;
 local_1d = 0x37;
 local_1c = 0x3a;
 local_1b = 0x27;
 local_1a = 0x3d;
 local_19 = 0x6e;
 local_18 = 0x25;
 local_17 = 0x48;
 local_16 = 0x6f;
 local_15 = 0x3c;
 local_14 = 0x58;
 local_13 = 0x3a;
 local_12 = 0x68;
 local_11 = 0x2c;
 local_10 = 0x43;
 local_f = 0x73;
 local_e = 0x10;
 local_d = 0xe;
 local_c = 0x10;
 local_b = 0x6b;
 local_a = 0x10;
 local_9 = 0x6f;
 dwBytes = 0x10;
 dwFlags = 0;
 hHeap = GetProcessHeap();
 local_8 = (code **)HeapAlloc(hHeap,dwFlags,dwBytes);
 *local_8 = LoadLibraryA_exref;
 local_8[1] = GetProcAddress_exref;
 local_8[2] = (code *)&local_2c;
 local_8[3] = (code *)0x24;
 _Dst = (code *)VirtualAlloc((LPVOID)0x0,0x248,0x1000,0x40);
 memcpy(_Dst,&DAT_00404040,0x248);
 (*_Dst)(local_8);
 lpText = MD5::digestString(local_bc,&local_2c);
 MessageBoxA((HWND)0x0,lpText,"We\'ve been compromised!",0x30);
                   /* WARNING: Subroutine does not return */
 ExitProcess(0);

Let’s break it down.

First, a byte array is built on the stack. I guessed that it holds encrypted data, so I named it encrypted_data.

 // presumably the encrypted data
 local_2c = (code)0x12;
 local_2b = 0x24;
 local_2a = 0x28;
 local_29 = 0x34;
 local_28 = 0x5b;
 local_27 = 0x23;
 local_26 = 0x26;
 local_25 = 0x20;
 local_24 = 0x35;
 local_23 = 0x37;
 local_22 = 0x4c;
 local_21 = 0x28;
 local_20 = 0x76;
 local_1f = 0x26;
 local_1e = 0x33;
 local_1d = 0x37;
 local_1c = 0x3a;
 local_1b = 0x27;
 local_1a = 0x3d;
 local_19 = 0x6e;
 local_18 = 0x25;
 local_17 = 0x48;
 local_16 = 0x6f;
 local_15 = 0x3c;
 local_14 = 0x58;
 local_13 = 0x3a;
 local_12 = 0x68;
 local_11 = 0x2c;
 local_10 = 0x43;
 local_f = 0x73;
 local_e = 0x10;
 local_d = 0xe;
 local_c = 0x10;
 local_b = 0x6b;
 local_a = 0x10;
 local_9 = 0x6f;
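
The same bytes can also be collected straight from the pasted decompiler text. This is a minimal Python sketch, assuming each assignment sits on its own line in stack order as in the listing above. The helper name parse_stack_bytes is made up for illustration.

```python
import re

# Matches "local_xx = 0x12;" and "local_xx = (code)0x12;". Ghidra sometimes
# prints decimal literals (e.g. 99), so both forms are accepted.
ASSIGN_RE = re.compile(r"=\s*(?:\([^)]*\))?\s*(0x[0-9a-fA-F]+|\d+)\s*;")

def parse_stack_bytes(listing):
    """Return the assigned byte values, in listing order, as bytes."""
    values = []
    for line in listing.splitlines():
        match = ASSIGN_RE.search(line)
        if match:
            values.append(int(match.group(1), 0) & 0xFF)
    return bytes(values)

# First four assignments from the listing above.
sample = """
 local_2c = (code)0x12;
 local_2b = 0x24;
 local_2a = 0x28;
 local_29 = 0x34;
"""
print(parse_stack_bytes(sample).hex())  # 12242834
```

Feeding it the full 36-line listing should give the complete encrypted_data buffer without touching the disassembly.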

Second, a small heap block is allocated, its pointer is stored in local_8, and four values are written into it.

dwBytes = 0x10;
dwFlags = 0;
hHeap = GetProcessHeap();
local_8 = (code **)HeapAlloc(hHeap,dwFlags,dwBytes);
*local_8 = LoadLibraryA_exref;
local_8[1] = GetProcAddress_exref;
local_8[2] = (code *)&local_2c;
local_8[3] = (code *)0x24;

This heap block is 0x10 bytes, i.e. four DWORDs, laid out as follows:

  • DWORD #1: address of LoadLibraryA
  • DWORD #2: address of GetProcAddress
  • DWORD #3: address of encrypted_data
  • DWORD #4: 0x24, the length of encrypted_data in bytes (local_2c through local_9)
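
To make the layout concrete, here is a small Python sketch of the same 16-byte block. The three addresses are placeholders invented for illustration (the real ones only exist at run time). Only the value 0x24 comes from the decompiled code.

```python
import struct

# Placeholder values for illustration only; real addresses are resolved at run time.
LOAD_LIBRARY_A = 0x76A10000    # hypothetical address of LoadLibraryA
GET_PROC_ADDRESS = 0x76A20000  # hypothetical address of GetProcAddress
ENCRYPTED_DATA = 0x0019FF00    # hypothetical stack address of local_2c
DATA_LENGTH = 0x24             # taken from the decompiled code

# Four little-endian 32-bit values, matching local_8[0] .. local_8[3].
block = struct.pack("<4I", LOAD_LIBRARY_A, GET_PROC_ADDRESS, ENCRYPTED_DATA, DATA_LENGTH)
print(len(block))  # 16

# The shellcode reads the fields back by index, like param_1[0] .. param_1[3].
fields = struct.unpack("<4I", block)
print(hex(fields[3]))  # 0x24
```

Passing one pointer to a block like this means the shellcode does not have to locate LoadLibraryA and GetProcAddress on its own.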

Third, a 0x248-byte region is allocated with VirtualAlloc using MEM_COMMIT (0x1000) and PAGE_EXECUTE_READWRITE (0x40), and its address is stored in _Dst.

_Dst = (code *)VirtualAlloc((LPVOID)0x0,0x248,0x1000,0x40);
memcpy(_Dst,&DAT_00404040,0x248);
(*_Dst)(local_8);

The 0x248 bytes at DAT_00404040 are then copied into that region and executed as shellcode, with local_8 passed as the only argument.

Now let’s analyze the shellcode: disassemble the bytes at DAT_00404040 and create a function there in Ghidra so the decompiler can process it.

First, the shellcode builds several strings on the stack, one byte at a time:

  local_5c = 0x6d;
  local_5b = 0x73;
  local_5a = 0x76;
  local_59 = 99;
  local_58 = 0x72;
  local_57 = 0x74;
  local_56 = 0x2e;
  local_55 = 100;
  local_54 = 0x6c;
  local_53 = 0x6c;
  local_52 = 0;
  local_a4 = 0x6b;
  local_a3 = 0x65;
  local_a2 = 0x72;
  local_a1 = 0x6e;
  local_a0 = 0x65;
  local_9f = 0x6c;
  local_9e = 0x33;
  local_9d = 0x32;
  local_9c = 0x2e;
  local_9b = 100;
  local_9a = 0x6c;
  local_99 = 0x6c;
  local_98 = 0;
  local_1bc = 0x66;
  local_1bb = 0x6f;
  local_1ba = 0x70;
  local_1b9 = 0x65;
  local_1b8 = 0x6e;
  local_1b7 = 0;
  local_50 = 0x66;
  local_4f = 0x72;
  local_4e = 0x65;
  local_4d = 0x61;
  local_4c = 100;
  local_4b = 0;
  local_10 = 0x66;
  local_f = 0x73;
  local_e = 0x65;
  local_d = 0x65;
  local_c = 0x6b;
  local_b = 0;
  local_64 = 0x66;
  local_63 = 99;
  local_62 = 0x6c;
  local_61 = 0x6f;
  local_60 = 0x73;
  local_5f = 0x65;
  local_5e = 0;
  local_7c = 0x47;
  local_7b = 0x65;
  local_7a = 0x74;
  local_79 = 0x4d;
  local_78 = 0x6f;
  local_77 = 100;
  local_76 = 0x75;
  local_75 = 0x6c;
  local_74 = 0x65;
  local_73 = 0x46;
  local_72 = 0x69;
  local_71 = 0x6c;
  local_70 = 0x65;
  local_6f = 0x4e;
  local_6e = 0x61;
  local_6d = 0x6d;
  local_6c = 0x65;
  local_6b = 0x41;
  local_6a = 0;
  local_80 = 0x72;
  local_7f = 0x62;
  local_7e = 0;

Decoding those bytes as ASCII gives:

local_5c = "msvcrt.dll";
local_a4 = "kernel32.dll";
local_1bc = "fopen";
local_50 = "fread";
local_10 = "fseek";
local_64 = "fclose";
local_7c = "GetModuleFileNameA";
local_80 = "rb";
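
Decoding can be done by hand with an ASCII table, or with a few lines of Python. This sketch uses values copied from the listing above. The function name decode_stack_string is only an illustrative choice.

```python
def decode_stack_string(values):
    """Join byte values into text, stopping at the first NUL terminator."""
    chars = []
    for value in values:
        if value == 0:
            break
        chars.append(chr(value))
    return "".join(chars)

# Values copied from the listing above (Ghidra mixes hex and decimal).
msvcrt = [0x6d, 0x73, 0x76, 99, 0x72, 0x74, 0x2e, 100, 0x6c, 0x6c, 0]
fopen_name = [0x66, 0x6f, 0x70, 0x65, 0x6e, 0]
mode = [0x72, 0x62, 0]

print(decode_stack_string(msvcrt))      # msvcrt.dll
print(decode_stack_string(fopen_name))  # fopen
print(decode_stack_string(mode))        # rb
```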

From those strings, I guessed that the shellcode loads these modules and resolves these functions dynamically.

Next, the shellcode copies the first two values from its parameter into local variables:

  local_8 = (code *)*param_1;
  local_48 = (code *)param_1[1];

Here local_8 holds the address of LoadLibraryA and local_48 holds the address of GetProcAddress. To make the code clearer, I renamed these two variables in Ghidra:

  • local_8 => LoadLibraryA
  • local_48 => GetProcAddress

Next, the shellcode loads msvcrt.dll and kernel32.dll and stores their base addresses in local variables:

  local_40 = (*LoadLibraryA)(&local_5c);    // local_5c = "msvcrt.dll"
  local_88 = (*LoadLibraryA)(&local_a4);    // local_a4 = "kernel32.dll"

Next, the shellcode resolves several function addresses and stores them in local variables:

  local_14 = (code *)(*GetProcAddress)(local_88,&local_7c);     // local_7c = "GetModuleFileNameA"
  local_84 = (code *)(*GetProcAddress)(local_40,&local_1bc);    // local_1bc = "fopen";
  local_a8 = (code *)(*GetProcAddress)(local_40,&local_10);     // local_10 = "fseek"
  local_94 = (code *)(*GetProcAddress)(local_40,&local_50);     // local_50 = "fread"
  local_68 = (code *)(*GetProcAddress)(local_40,&local_64);     // local_64 = "fclose"

Let’s rename those local variables:

  • local_14 = GetModuleFileNameA
  • local_84 = fopen
  • local_a8 = fseek
  • local_94 = fread
  • local_68 = fclose

Next, the shellcode gets the path of the running executable (shellcode2.exe_) and opens it in "rb" mode:

  (*GetModuleFileNameA)(0,local_1b4,0x104);
  local_44 = (*fopen)(local_1b4,&local_80);     // local_80 = "rb"

Next, the shellcode seeks to file offset 0x4e (origin 0, i.e. SEEK_SET), reads 0x26 bytes into the buffer local_3c, and closes the stream:

  (*fseek)(local_44,0x4e,0);
  (*fread)(local_3c,0x26,1,local_44);
  (*fclose)(local_44);

This is the data the shellcode reads:

pic2
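
The same read can be reproduced offline instead of copying bytes out of a hex editor by hand. This is a minimal Python sketch of the fseek/fread pair. The function name read_key is hypothetical, and the demo runs on a throwaway file so that it works without the challenge binary.

```python
import os
import tempfile

def read_key(path, offset=0x4E, size=0x26):
    """Mirror fseek(f, offset, SEEK_SET) followed by fread(buf, size, 1, f)."""
    with open(path, "rb") as handle:
        handle.seek(offset, os.SEEK_SET)
        return handle.read(size)

# Demo on a throwaway file whose byte at position n has the value n.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(range(256)))
    demo_path = tmp.name

key = read_key(demo_path)
os.remove(demo_path)
print(len(key), hex(key[0]))  # 38 0x4e
```

Pointing read_key at shellcode2.exe_ should return the same 0x26 bytes shown in the screenshot.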

Next, the shellcode copies the last two values from its parameter:

  puVar1 = param_1[3];
  puVar2 = param_1[2];

puVar1 holds the DWORD value 0x24 and puVar2 holds the address of encrypted_data, so let’s rename them:

  • puVar1 => DWORD_24h
  • puVar2 => encrypted_data

Next comes what I took to be the decryption: a loop that XORs each byte of encrypted_data with the byte at the same index in local_3c, 0x24 times. Only the first 0x24 of the 0x26 key bytes are therefore used.

  puVar1 = (undefined *)0x0;    // puVar1 is reused here as the loop counter
  do {
    encrypted_data[(int)puVar1] = encrypted_data[(int)puVar1] ^ local_3c[(int)puVar1];
    puVar1 = puVar1 + 1;
  } while (puVar1 != DWORD_24h);
  return;
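
The loop translates almost line for line into Python. The sketch below uses synthetic data rather than the challenge bytes, and the name xor_in_place is illustrative. It also shows why the same routine can both encrypt and decrypt.

```python
def xor_in_place(data, key, count):
    """XOR the first `count` bytes of `data` with `key`, byte by byte."""
    for i in range(count):
        data[i] ^= key[i]

# Synthetic values, not taken from the challenge.
buffer = bytearray(b"hello")
key = b"\x01\x02\x03\x04\x05\x06\x07"  # longer than needed, like the 0x26-byte read

xor_in_place(buffer, key, len(buffer))
scrambled = bytes(buffer)
xor_in_place(buffer, key, len(buffer))  # XOR is its own inverse
print(buffer.decode())  # hello
```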

Now let’s try to decrypt it. I extracted encrypted_data from Ghidra to a file by selecting the instructions that initialize it and running this script:

# Ghidra (Jython) script: select the instructions that build the array, then run it.
max_addr = currentSelection.getMaxAddress()
inst = getInstructionAt(currentSelection.getMinAddress())

array1 = bytearray()

# Walk the selected instructions; getNext() returns None after the last one.
while inst is not None and inst.address <= max_addr:
	# Operand 1 is expected to be the immediate byte being stored.
	array1.append(inst.getScalar(1).value & 0xff)
	inst = inst.getNext()

with open("C:\\Users\\Irfan\\Desktop\\encrypted_data.dat", "wb") as binary_file:
	binary_file.write(array1)

print("Done")

I extracted the XOR key to a file using a hex editor (HxD), so now we have two files, encrypted_data.dat and xor_key.dat:

pic3 pic4

I wrote a Python script that emulates the decryption and writes the result to a file:

import sys

def run():
	if len(sys.argv) != 4:
		print("USAGE: <file1> <file2> <output file>")
		return

	# Read both input files as byte arrays
	with open(sys.argv[1], 'rb') as f1, open(sys.argv[2], 'rb') as f2:
		file1_b = bytearray(f1.read())
		file2_b = bytearray(f2.read())

	# XOR byte by byte; zip() stops at the end of the shorter input
	xord_byte_array = bytearray(a ^ b for a, b in zip(file1_b, file2_b))

	# Write the XORed bytes to the output file
	with open(sys.argv[3], 'wb') as out:
		out.write(xord_byte_array)
	print("Done...")

run()

Here is the result file opened in HxD:

pic5

The decoded text is very interesting: FLAG{STORE-EVERYTHING-ON-THE-STACK}.
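
As a cross-check that needs neither intermediate file, the whole decryption fits in a few lines of Python. The ciphertext is copied from the decompiler listing. The key is an assumption: in a typical MSVC-linked PE, the bytes at file offset 0x4e are the start of the DOS stub message, so compare it against your own hex dump before relying on it. The MD5 line assumes the program hashes the string without its terminating NUL.

```python
import hashlib

# Ciphertext bytes local_2c .. local_9, copied from the decompiler listing.
ENCRYPTED = bytes.fromhex(
    "122428345b232620" "35374c2876263337"
    "3a273d6e25486f3c" "583a682c4373100e"
    "106b106f"
)

# ASSUMPTION: the 0x26 bytes at file offset 0x4e are the usual DOS stub text.
KEY = b"This program cannot be run in DOS mode"

# zip() stops after the 0x24 ciphertext bytes, like the shellcode's loop.
plain = bytes(c ^ k for c, k in zip(ENCRYPTED, KEY))
flag = plain.rstrip(b"\x00").decode("ascii")
print(flag)  # FLAG{STORE-EVERYTHING-ON-THE-STACK}

# ASSUMPTION: the program hashes the flag text without the trailing NUL.
print(hashlib.md5(flag.encode("ascii")).hexdigest())
```

If the assumption about NUL handling holds, the printed digest should match the hash the program shows in its message box, which would confirm the flag without submitting it.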

I submitted that string as the flag on the challenge page, and here is the result:

pic6

Yep, we’ve got the flag.

Challenge source: https://www.malwaretech.com/challenges/windows-reversing/shellcode2