Tutorials - Ideas and Theoryes on PE Infection

Ideas and theoryes on PE infection
by b0z0/iKx, Padania 1998

In this text I would like to offer some interesting (I hope) information about some things I got in contact while studing PE infection and SoftIcing around. Some parts will cover some more technical aspects of the PE structure while some are more general and are intended to bring further ideas to other virus writers. Hope someone will find this of use.

If you have any comment, question or news that about Windoze stuff described in this short article or something else, feel free to contact me, I am always pleased to share ideas and material with other guys involved with this kind of studies.

The .reloc section

First of all I would like to talk a bit about the .reloc section. As the name says this section contains the data for the relocations in the executable, better known as fixups in Win95. So, very basically, this must contain in some way the informations of where the loader must patch the adresses through the program, depending on where it was loaded.

The .reloc section name is just a standard choosed name used quite always but should be named otherwise. The real reference to the fixup section is infact in the PE header at the dword at offset 0A0h. This word contains the RVA of the fixup table, thus usually, when .reloc is present, this will be equal to the RVA of the .reloc. At PE header at offset 0A4h is the total lenght of all the fixups, and this is smaller (at least usually for aligns) or equal than the .reloc size (virtual and physical).

Just a fast look at the structure of the fixups. The entire fixup table is divided in blocks, where each block describes a 4kb chunk of the program. The first dword of the block refeers to the RVA of the chunk the block is refeering to, the second dword is the lenght of the block (including this first two dd. the lenght is aligned to a dword) and then all the fixup entryes. Each entry is a word, where the 4 most important bits are the type of the relocation and the remaining 12 bits are the offset in the chunk. Check out the specs for a deeper description.

But are the fixups really needed in Win95? Don't we exactly say in the PE header when we want to load the base of the program (image base in PE) and where all the objects (RVA in each object table entry)? That's it. So fixups are just needed when the program can't be loaded at the requested adress. But this actually could never occour, except when a memory wrap around or some sorta strange thing should have to occour. Infact even if ten programs would like to use the same memory the pager will do its work and will give some other physical memory even if the programs will see the same logical adress. So, are fixups of any use then? Not really, I think they are just used for some backward compatibility at the transaction from win16 trough win32s to win32. If you look at newer PExes, like many programs from Win98, you will see that the fixups aren't present anymore in many executables. It seems that some compilers as an optimization (of space expecially) don't include the fixups.

So what's the point? Instead of adding code to the program we are going to infect we can, where present, overwrite the fixups, because anyway they aren't needed anymore in any way. So the infected file size shouldn't grow (the .reloc sections are usually bigger than a not too huge virus) or should grow just a bit, not by the entire virus size. Or even better you should optimize the executable in size by deleting at all the part of the .reloc that should not be overwriten by the virus code :-) Should became a 'good' virus since gives the user even more free space on disk :-)) Of course you should now overwrite the .reloc entry in the object table too, so you won't need to have free space there too.

Anyway just think about one thing: why should you leave the fixups for the entire program if then you don't add the ones for the virus code? Anyway if the program shouldn't be loaded where requested at least the virus code will have problems. So if you don't add (usually viruses don't, it is quite expensive in space and isn't really needed as I said) your virus fixups too then you should not even care about the other ones. Got the point? Of course apart from overwriting the section and object table entry you must also force the loader not to use fixups at all in any way, so you must set to zero the dword in PE header at offset 0A0h (fixup RVA) and at offset 0A4h (fixups lenght).

PE midinfection ideas

Undoubtely midinfection is one of the most interesting (at least imho) topics in virus writing. Midinfectors are usually a lot harder to spot than other viruses and are also usually less noticeable anyway (since the delays due to virus activity should not be concentrated all at the beginning of the program execution but should occour in a random moment in the program execution).

So what about midinfection in PE files? For some aspects it is not too hard to implement. Infact we always know where in memory the program is loaded and where each object, so with a few additions and substractions we can easily calculate a jump from some given adress to our virus entry point. In short we should do something like

RVA_jmp_from:
        jmp     (offset RVA_Virus_EIP - (offset RVA_jmp_from + 5))

Of course this must be calculated at runtime. Well, RVA_Virus_EIP is the RVA of the entry point of the virus (easy to get out), the offset RVA_jmp_from is the RVA where the jmp is executed and the +5 is due to the lenght of the jmp. So the jump to the virus should be very easy to code, eventually one should also add the fixup entry in the fixup section for the jump but that are just peculiarities.

One step should be done, but now to the harder (and the hardest in this case) thing: how should we find a suitable instruction to overwrite? Of course we must be sure that we won't overwrite badly half of some other instruction or write our jump to some data, or everything should go bad, very bad, and the program should be badly corrupted.

This part of the midinfection stage can be done in more ways of course, the first one I'm going to describe is quite simple to implement and should be quite stable, but has some bad aspects. But first let's look at it.

This method relays totally on the fixups information of the program and on some code examination. The full 32bit fixups infact (type 0011b, 03h, _HIGHLOW in the PE specs) will always point to a dword that is likely to be an adress (used as a constant or as a reference). Of course it is very likely that this will be used by some instruction (ex. mov eax,[1234578h] or push 12345678h). Before the dword of the adress the instruction should be of any lenght, but it usually should be 1 or 2 bytes (even 3 if some prefix is used). So since the jump to our virus is 5 bytes long one of this instructions should be good, isn't it? We should read, let's say, 2 bytes before the adress that is pointed from the fixup and check for some given opcodes. If the opcode is a known good one (this check should be made to reduce the risk of corrupting unknown stuff) then we should replace it with our jump to the virus code. Of course depending on the lenght of the original opcode (totally 5, 6, 7 or even more) we should pad the code with some NOPs or move the jump a little back from the original instruction position. So some sample... let's think that the code we selected and got from the fixups is:

instr:
        call    [00401979]      ; db 0ffh,15h,79h,19h,40h,00h
after_instr:

Since it is a 6 byte code we can just do something like:

instr:
        nop                     ; db 90h
        jmp     (off to_virus)  ; db 0e9h,xxh,xxh,xxh,xxh
after_instr:

or even

instr:
        jmp     (off to_virus)  ; db 0e9h,xxh,xxh,xxh,xxh
        db      xxh
after_instr:

In the first case the opcode before the jmp is executed, so must be some good opcode (so a NOP), while in the second we will anyway try to come back at offset of after_instr. Of course if the selected code would be a 5 bytes one (such a mov eax,[00401978]) we should just overwrite it entirely. Using the first method we won't change the offset in the 4kb code chunk where the fixup should be done (the 00401979 is in both cases at instr + 2), so should be better if you're looking to preserve the fixups stuff.

Well, so now with a good routine to check for well known opcodes before the adress that is going to be fixed-up, we should have a fully working jmp to the virus code from the program.

Now we also need all the restoration stuff of course. So we shall execute, after the viral activities, the old code ovewritten by the virus jmp. This is simple, let's see first some code to make everything even clearer:

virus_entry:
        .
        .
        virus_code
        .
        .
back_2_host:
        (old_instruction)
        jmp     (offset after_instr)

What we see? Well, the virus will do its work (of course it must first preserve everything, registers, flags and so on), then will execute the old instruction (this must be saved at the infection stage before. Here too you must pay attention to the lenght of the original instruction and eventually pad with NOPs) and then will jump back (the label is the same as in the previous example to make things even easier to understand) to the offset after the instruction it replaced. This jump is also easy to calculate with a few adds and subs.

Alternatively you should pass the control to the virus with a CALL and then pop the return adress to the last JMP. But it highly depends on the instructions you are going to overwrite in the original code. Infact if you will just overwrite for example the moving operations from and to registers then you should even come to the virus with a CALL and go back with a RETF, since the moving operations don't require the stack status. While if you replace a PUSH that uses the stack you must alway clear the stack before and then do a normal JMP.

So basically this way should be ok, but which are the bad aspects of this? Well, you must have the fixups informations, and as I said some newer PExes don't carry this anymore. If no fixups are present then you just can't use this method, that's all :-( So anyway you should need another way to infect executables without the fixups or decide to leave them clean.

Another bad aspect the reader should notice is that there are really tons of fixups, so the probability of the infected one to be executed should be sometimes really, really low. I should suggest now two sorta solutions. One is obvisious and is multiple infection. You should infect many times the file with the same routines, so randomly different code to be fixed-up will be selected. Alternatively you should set more jumps to the virus code but with just one virus body on to the file. So you should have also more than one restoration routine that will execute a definite code that has been saved for a definite jump to virus code (to be more clear... if the JMP to the virus came from position 1 then we should execute at the restoration the code that was at position 1 and then jump at the next code after position 1, same for position 2 and so on).

Another way to solve the overpopulation of the fixups is to try to set the JMP to virus code somewhere near the program entry point. This way is used by the sample virus provided Padania_Libera that will set his jump to the virus somewhere quite near after the program EIP (it has a sorta rnd routine to decide when to get one... it anyway try to stay very near the program EIP for other problems connected with the memory residency routine, so a better midinfector should go deeper into the program). This is quite easy to do, since from the EIP in the PE header you can calculate which 4kb chunk of the fixups will contain code quite near to the entry point. So instead of scanning all the fixups you should start from the EIP 4kb chunk or something else (like scanning the 4kb chunk starting from the very after the original EIP adress like the sample virus does).

But as already stated the method described before and used in the sample virus needs to have the fixup section or it can't work. Of course there are other ways to find some instruction that is suitable for the substitution with the virus jump. Let's try to see some:

normal code scanning for some given opcodes. This is the easier maybe but at the same time the one with most risks, even if done with extreme careful. You should search through the executable for some given bytes and then try to understand if it is a suitable instruction and if it seems to be the beginning of an instruction. Here you would have to spend a lot of code to do many checks, but this shouldn't prevent errors anyway. For example you should start from somewhere and look for the two bytes 0ffh,015h, that are the opcodes for call [xxxx]. Doing only this of course is a suicide, you should overwrite too much other stuff. But you should check the adress where the call points to (the dword after the 0ffh,015h... of course if we got a real call [xxxx] and not some other data). Infact you should note that very likely this adress should point somewhere not too away from the program. So for example if the program base load adress is 00400000h it is likely that if a valid call [xxxx] was encountered the dword after the 0ffh,015h bytes should be something like 0040xxxxh or 004xxxxxh, depending on where then other objects are loaded (easy to check in the object table) or program lenght or whatever. In this way the opcode scanning becames a lot more stable and shouldn't create too much problems. Try to check out some executables searching 0ffh,015h and examine the dword after the two bytes and you will see. Of course there it is also probable that we should leave even a valid call because it has some strange adress, but the real important thing is to make the infection stable.
You should too for example again start the opcode scanning starting from the EIP or some point that is more likely going to be called (so virus will be more likely executed and the opcode check will have less probability to get some data for mistake) on execution.
In this way the infection should be stable and the probability to make everthing crash very low. But you should add some more checks. For example it is quite usual that before a call [xxxx] some pushes are made (for the params) or some check like this. Just examine some win32 code and you'll see.
Instead of trying to modify just an instruction that is long enough you should try to search for well known instruction sequences used often or anyway created by usual compilers. This should of course include the creation of a stack frame or the return from a procedure or a preparation and call to some specific initialization code or something like. Of course you must be sure that the code you are overwriting is long enough to carry the jump.
You should execute the program in debug mode, so executing the code sorta step-by-step and find some suitable place for the virus jump. This should be of course the clever solution since the probability to damage in some way the executable should be extremely reduced. Anyway this should be quite lenghty and everything except easy to implement in a good and efficent way.
Well, just some ideas for PE executables midinfection, you have now to develope and create new ones from yourself ;) Of course midinfection is only efficent with other anti-AV measures such as polymorphism, since a non-poly or an unencrypted virus should be anyway scanned dumbly at the bottom of the file even without getting its entry-point. The sample virus is though made to demonstrate some concepts about the midinfection stuff, not actually to be unscannable by AVs. I haven't had time (and been too lazy) to write a small poly for this too.

Too much unused space

Micro$oft and Windoze has always been synonymous of fatware. Expecially with new PE format with the various alignments, there is always plenty unused bytes in the executable. Starting from the very beginning with the free space after the objects to the first object code, to the gaps for the object alignments, to the unused space after the fixups (apart from the normal alignment padding) and so on.

Okay, this is well know you will say, but where is the point? Use that spaces! You should use that space to store the original EIP for example. You should write a small routine that finds the first dword actually unused somewhere between the executable objects and then place the original dword there. At the restoration you won't have a fixed position where that dword was saved but would use the same algorithm as when searching for the free space the first time. Of course the example done is anyway easy to get out, since an AV should just use the same algorithm to get the original dword. But this should anyway make the removal less standard and if done in a cool way also harder, so anyway a pro for the virus.

The free spaces avaiable (you can always get a good amount of bytes if you sum all the ones around) could also be used to create a poly loader to your virus code, thus making the detection harder again. So you should do something ala Commander Bomber but using the unused space in the middle of the file. Of course in this way you don't have to mess around with restoring the original data, because anyway it was unused (of course conceptually you should also insert the loader in the middle of the original code and then restore it). You should jump through the various holes you find, putting some garbage code or initializing some important virus stuff, up to the original virus entry point. Or alternatively you should do a poly restoration routine, making the disinfection part unstandard, so AVs will have to care a bit more at their work :)

So, ok, there are a lot of things we should do with the free gaps, but where and how can I get them?

Well, first of all just after the objects in the object table. After all the entryes of the object table (number of them in PE header at word at offset 06h) there is usually quite some free space before the code of the first object (first physically present) ranging from about 100h bytes (when the code begins at 400h) up also to 0d00h (when code begins at 1000h, very usual in newer PE). Of course you must check where the first object physically begins, not just check the entry header size at offset 54h of the PE. This dword infact just specifies how many bytes are loaded in memory from the beginning of the header, not the real physical lenght (so pay attention also if you add an object or some code in the header space since you must also check that the new stuff is actually loaded at runtime, not just if it fits there! forgetting this check should have bad consequences expecially in Win98, where they don't load anymore a lot more code from the header than is actually needed, while in 95 the entire header was usually loaded in memory even it wasn't needed)!

Another free space if founded between the objects. For example let's consider this entryes, with the alignment at 200h:

#   Name      VirtSize    RVA     PhysSize  Phys off
--  --------  --------  --------  --------  --------
01  .text     000096B0  00001000  00009800  00000400
02  .bss      0000094C  0000B000  00000000  00000000

Courtesy of TDUMP :) You see, the .text starting from 400h is physically 9800h long, but actually just 96b0h bytes are loaded into memory. So the 150h bytes starting in the file at the offset (96b0h + 400h) can be actually overwritten by our code (they are just there for the 200h align). Of course if you want that the added code will be loaded in memory too you must correct the virtual size of it too, or it will just result on disk. You must pay attention that the loading of your code won't in some way delete the other code. In this example the next object starts at 0b000h (+ base), so you could add the 150h bytes (or less) to be loaded without any problem. The same goes for all the other objects, I cut down the sample since this is very simple once got the idea, just get TDUMP and a hex file viewer.

Another place when you should put your code is after the fixups. The given lenght in the header is usual bigger than the real one. To get the real lenght of the fixups you must go through the chain of all the fixups up to the end (you just have to skip each block, using the lenght founded in the second dword of the block, until you get the block that apparently has lenght and RVA 0. Of course you have to leave this last 0,0 one as the end of the fixups). Here too you must pay attention to extend the loaded lenght if you add your code here.

Some final words

Well, the article had to came to some sorta end :) PE format is quite interesting and there are a lot more tricks that can be developed with it. So keep up working and developing new stuff, so we will make AVers earn their money for some real work.

With this issue of XINE#3 you can find also the Padania_Libera virus, that demonstrates some of the ideas included in this text file, such as the midinfection by using the .reloc section. So check out the code for some practical example and send me an email if you would like to have some more explanation about something or if you would like to exchange some ideas!