Hi Guys (and gals),

Thanks for taking the time to read this, hope someone can help.

I've been tasked with developing an anti-virus scanner as part of a university assignment. I've written all the signature matching code, which works fine, and the software can scan through a directory and find .exe files, checking them against the signature database.

The issue I have is that a virus can infect any COM executable. Is there any way of detecting whether a file is a COM executable from within C?

As a student I'm not expecting a complete answer, but any pointers in the right direction would be brilliant. Searching for .exe files alone is fine, but I'd love to take it that step further.

I also appreciate that some people may worry the code is being used maliciously. I assure you that if that were the case I'd be posting this in an assembly language forum instead... C is far too restrictive to write an effective virus.

Thanks in advance for any pointers / help, would genuinely appreciate it.

um, search the filename string for a ".com" extension? yes?

and it's pretty clear that you're trying to make some sort of malicious software. not a virus, obviously, because you clearly don't have the talent, but some sort of disruptive program nonetheless.

and FTR, your defense is laughable. viruses aren't written in assembly, you silly person.

>um, search the filename string for a ".com" extension? yes?
>and it's pretty clear that you're trying to make some sort of malicious software. not a virus, obviously, because you clearly don't have the talent, but some sort of disruptive program nonetheless.

But could a COM executable not be renamed to have a different file extension? I'm basically looking for a way of determining, from the contents of a file, whether or not the file is executable (an executable file being a potential carrier of malicious code).

And for the record, I'm building a simple signature-scanning A/V application. As I said, C is far too restrictive to write effective malicious code, with virus writers preferring to use assembly language. If this weren't the case, why are both spectral-scan and heuristic-based AV applications so widespread? As such, if I genuinely were writing a virus, I'd be doing it in assembly language.

jephthah, please take the time to research both spectral and heuristic-based AV scanning to appreciate that the need for these types of applications is derived from the use of assembly language in constructing viruses. As I say, C is far too restrictive on many levels to write effective viral applications.

Please see also the following references:

Filiol, E. (2005). Computer Viruses: From theory to applications. FR: Springer-Verlag.

Rune, S. (1998). Virus Detection and Elimination. MA, US: Academic Press Inc

I'm a genuine student, attempting to write a simple application to demonstrate the ability to perform signature-based scanning. If you can't accept that, please don't reply and make yourself look daft by saying assembly language isn't used to write viruses...

Aside from that, I do appreciate a search for .COM would work, but I need to be able to identify those files beyond simply checking the filename/extension, but thank you for that part of your reply anyway.

@jephthah

>um, search the filename string for a ".com" extension? yes?
No. What kind of simpleton would do a naive extension search when the extension is largely ignored by the program loader? AV writers need to be smarter than that, because virus writers definitely are.

>it's pretty clear that you're trying to make some sort of malicious software.
I don't think it's clear at all.

>viruses aren't written in assembly, you silly person.
Wow, are you drunk? Of course viruses are written in assembly. They're written in other languages too, with C and assembly being the most common, last I checked.

@Th3one234

>But could a COM executable not be renamed to have a different file extension?
Absolutely. COM executables are raw instructions with no format, so there's not a solid way to detect them. You could start by looking for a jump as the first instruction (a common occurrence in COM executables), or come up with some heuristics for guessing whether the first N bytes represent machine code.
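A minimal sketch of that first-instruction check in C, for illustration only. The name `starts_with_jump` is made up, the caller is assumed to have already read the start of the file into a buffer, and the check assumes 16-bit x86 encodings (0xEB for a short JMP, 0xE9 for a near JMP) together with the usual COM load offset of 0x100. A hit is only a hint, since ordinary data can begin with the same byte values.

```c
#include <stddef.h>

/* COM images are loaded at offset 0x100, right after the 256-byte PSP. */
#define COM_LOAD_OFFSET 0x100UL

/* Returns 1 if buf starts with a 16-bit x86 JMP whose target falls
 * inside the buffer, 0 otherwise. Illustrative helper, not a detector. */
int starts_with_jump(const unsigned char *buf, size_t len)
{
    unsigned long ip; /* instruction pointer after the jump is taken */

    if (buf == NULL || len < 2)
        return 0;

    if (buf[0] == 0xEB) {
        /* JMP rel8: signed 8-bit displacement from the next instruction. */
        int disp = buf[1] >= 0x80 ? (int)buf[1] - 0x100 : (int)buf[1];
        ip = (unsigned long)((long)COM_LOAD_OFFSET + 2 + disp);
    } else if (buf[0] == 0xE9 && len >= 3) {
        /* JMP rel16: little-endian displacement, IP wraps at 16 bits. */
        unsigned long disp = (unsigned long)buf[1] |
                             ((unsigned long)buf[2] << 8);
        ip = (COM_LOAD_OFFSET + 3 + disp) & 0xFFFFUL;
    } else {
        return 0;
    }

    /* Reject targets that land in the PSP or beyond the bytes we hold. */
    return ip >= COM_LOAD_OFFSET && (ip - COM_LOAD_OFFSET) < len;
}
```

If the whole file is passed in, the range check also filters out jumps that point past the end of the file, which trims some false positives from plain data.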

Alternatively you could assume that anything without a known non-threatening format or default loader is potentially malicious and scan it anyway.

>C is far too restrictive on many levels to write effective viral applications.
I disagree. The strongest viruses are more likely to be written in assembly due to the increasing need for clever hiding strategies, but you'll find that many effective viruses have been written in C (and continue to be).

commented: huffing ether, actually. :-/ +7

Narue,

Thanks for the backup, I appreciate it

You've put what I was thinking re COMs having no specific format into concrete... It looks like I'll be going down the heuristic route after all. I can't rely on just a JMP (that will, however, obviously be a heuristic rule and weighted accordingly). Cheers! It looks like I'm in for a long night.

Re C being too restrictive, I didn't quite word that right. I meant too restrictive to be able to write an effective virus from the perspective of evasion. After all, that's ultimately what my assignment's about. Any C coder can write a virus, but it's a lot harder to write an undetectable one (unless you go down the route of loading your C-coded virus as a driver and do a bit of kernel subverting... which is just as hard, if not harder). You're right, however: plenty of brilliant viruses have been written in C... they're just becoming less and less effective at evading AV software, so if I were to write one (unfortunately my uni would have a fit at the mere thought) it'd definitely be done in assembly.
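A rough sketch of how weighted heuristic rules like that could be combined in C. Everything here is an assumption for illustration: the function names, the rules chosen, the weights and the threshold are all hypothetical and would need tuning against real samples. The rules are: the file must fit the COM size limit (taken here as 0xFF00 bytes), must not begin with the "MZ"/"ZM" signature of an EXE, and earns points for a leading JMP opcode and for containing DOS `INT 21h` or `INT 20h` byte pairs.

```c
#include <stddef.h>

/* Assumed COM size limit: one 64 KiB segment minus the 256-byte PSP. */
#define COM_MAX_SIZE 0xFF00u

/* Hypothetical weights and threshold; tune against real samples. */
enum { W_JUMP = 40, W_INT21 = 35, W_INT20 = 15, SCORE_THRESHOLD = 50 };

/* Returns 1 if byte a immediately followed by byte b occurs in buf. */
static int has_byte_pair(const unsigned char *buf, size_t len,
                         unsigned char a, unsigned char b)
{
    size_t i;
    for (i = 0; i + 1 < len; i++)
        if (buf[i] == a && buf[i + 1] == b)
            return 1;
    return 0;
}

/* Sums the weights of every rule that fires; 0 means "ruled out". */
int com_score(const unsigned char *buf, size_t len)
{
    int score = 0;

    if (buf == NULL || len == 0 || len > COM_MAX_SIZE)
        return 0;

    /* An MZ (or ZM) signature marks an EXE, whatever the extension. */
    if (len >= 2 && ((buf[0] == 'M' && buf[1] == 'Z') ||
                     (buf[0] == 'Z' && buf[1] == 'M')))
        return 0;

    if (buf[0] == 0xE9 || buf[0] == 0xEB)     /* leading JMP opcode */
        score += W_JUMP;
    if (has_byte_pair(buf, len, 0xCD, 0x21))  /* INT 21h, DOS services */
        score += W_INT21;
    if (has_byte_pair(buf, len, 0xCD, 0x20))  /* INT 20h, terminate */
        score += W_INT20;

    return score;
}

/* Treats a file as a probable COM image once the score hits the threshold. */
int looks_like_com(const unsigned char *buf, size_t len)
{
    return com_score(buf, len) >= SCORE_THRESHOLD;
}
```

With these example weights a lone JMP scores 40 and stays under the threshold, so it only counts when another rule fires as well.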

Thanks again

commented: sorry. my bad. +7