I know I could parse the file's name for the extension, but that can be faked. I know I can get the MIME type, but that can also be faked. I know I could use magic numbers, but in some files they are not at the top of the file.

Is there any guaranteed way of getting the REAL file type so my server won't get infected?

IF NOT, then that is plain stupid. Why isn't there?

IF there IS, please tell me.

Thanks, Caelan.

Only reading the file headers/identifiers/magic numbers comes close to safely determining its type.

But don't they vary for each program that generates the image?

Formats for JPG, PNG, etc. are fixed, so you can read them, but it's a lot of work. The program that generates the file (usually) just adds some specific information. What kind of files do you want to check?
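As a rough illustration of reading the fixed signatures yourself, here is a minimal sketch. The function names sniffImageSignature() and sniffImageFile() are made up for this example, and only the PNG, JPEG and GIF signatures are listed. A matching signature only says what the file starts with; it does not prove the rest of the file is harmless.

```php
<?php
// Hypothetical helper: maps a file's leading bytes to a MIME type.
// Only three formats are covered here; extend the table as needed.
function sniffImageSignature(string $head): ?string
{
    $signatures = [
        "\x89PNG\r\n\x1a\n" => 'image/png',
        "\xFF\xD8\xFF"      => 'image/jpeg',
        "GIF87a"            => 'image/gif',
        "GIF89a"            => 'image/gif',
    ];
    foreach ($signatures as $magic => $mime) {
        // Binary-safe comparison of the first strlen($magic) bytes
        if (strncmp($head, $magic, strlen($magic)) === 0) {
            return $mime;
        }
    }
    return null; // unknown signature
}

// Hypothetical helper: reads only the first 8 bytes of a file on disk.
function sniffImageFile(string $path): ?string
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return null;
    }
    $head = fread($handle, 8);
    fclose($handle);
    return $head === false ? null : sniffImageSignature($head);
}
```

For an upload you would pass $_FILES["newFile"]["tmp_name"] to sniffImageFile() and reject anything that comes back as null.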

If ImageMagick and imagick are installed on your server (or you can install them), the safest way I would suggest is:

if (extension_loaded('imagick') && class_exists("Imagick"))
{
  echo "Imagick IS INSTALLED <br/>";
  try
  {
    // Let's suppose that the name of the file input is "newFile"
    $image = new Imagick($_FILES["newFile"]["tmp_name"]);
    echo "THE FILE IS AN IMAGE <br/>";
  }
  catch (Exception $e)
  {
    // Imagick could not decode the file, so treat it as a non-image
    echo "THE FILE ISN'T AN IMAGE <br/>";
    echo htmlspecialchars($e->getMessage());
  }
}
else
{
  echo "Imagick ISN'T INSTALLED <br/>";
}
exit;

.zip .png .jpg .gif .php .cpp .cxx .c .cc .h .hpp .asp .js .jse .jsp .cgi .py .axd .asx .asmx .ashx .aspx
.xml .rss .svg .jspx .css .cfm .yaws .swf .html .htm .xhtml .jspx .wss .do .action .js .pl .php .php4 .php3
.phtml .rb .rhtml .xml .dll .svc .atom .m .mm .uc .d .java .jav .j .jsl and a LOT more....

Well, thinking about it, I'm not too worried about text files; it's not as if they will ever be run on the server. But the other files, such as images, I need to validate, so they are guaranteed not to be spoofs.

Member Avatar for diafol

Images should be easy:

$info = getimagesize($filename); // false if the file is not a recognized image
echo $info === false ? 'not an image' : $info['mime'];

This gives image/png etc. It does NOT depend on the file extension. It actually checks the file itself.

Alternatively, use exif_imagetype() to get an integer constant relating to a MIME type:

http://www.php.net/manual/en/function.exif-imagetype.php

The constant returned can be converted to a MIME type using image_type_to_mime_type() if you need it, but for checking, the exif function should suffice.
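A minimal sketch of that check with an allow-list might look like the following. The function name allowedImageMime() and the choice of accepted types are assumptions for the example, and it falls back to getimagesize() when the exif extension is not loaded. Both functions only look at the start of the file, so this is a first filter rather than proof the file is safe.

```php
<?php
// Hypothetical allow-list check; the accepted types are an assumption.
function allowedImageMime(string $path): ?string
{
    $allowed = [IMAGETYPE_GIF, IMAGETYPE_JPEG, IMAGETYPE_PNG];

    if (function_exists('exif_imagetype')) {
        $type = @exif_imagetype($path);   // false when no signature matches
    } else {
        $info = @getimagesize($path);     // fallback when exif is not loaded
        $type = $info === false ? false : $info[2];
    }

    if ($type === false || !in_array($type, $allowed, true)) {
        return null;                      // not an accepted image type
    }
    return image_type_to_mime_type($type);
}
```

Calling allowedImageMime($_FILES["newFile"]["tmp_name"]) would then give either a MIME string such as image/png or null for anything else.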

You could even use finfo (http://www.php.net/manual/en/function.finfo-file.php) to get further MIME types. I was thinking of SVG files: I don't know whether this will return the correct type for those, but I'm pretty sure that the previous functions won't.

Getting finfo to work may be a pain if using PHP < 5.3.0.

Here's the trick if on Windows:

uncomment the line extension=php_fileinfo.dll in php.ini
restart Apache

I got errors with relative referencing, so had to go absolute:

echo "<table><tr><th>Filename</th><th>Mime</th></tr>";
$finfo = finfo_open(FILEINFO_MIME_TYPE); // return the MIME type, like the mimetype extension
foreach (glob("*.*") as $filename) {
    // Absolute path, because relative paths produced errors
    echo "<tr><td>$filename</td><td>" . finfo_file($finfo, dirname(__FILE__) . "/" . $filename) . "</td></tr>";
}
echo "</table>";
finfo_close($finfo);

You'd need to include a path for $filename if the script is not in the same directory.
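For checking a single upload rather than listing a directory, something along these lines could work. It is only a sketch: detectAllowedMime() is a hypothetical name, the allow-list is whatever you decide to accept, and the MIME type is still a guess made from the file contents.

```php
<?php
// Hypothetical helper: returns the detected MIME type only if it is allowed.
function detectAllowedMime(string $path, array $allowed): ?string
{
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    $mime  = $finfo->file($path);         // false on failure
    if ($mime === false || !in_array($mime, $allowed, true)) {
        return null;
    }
    return $mime;
}
```

For example, detectAllowedMime($_FILES["newFile"]["tmp_name"], ['image/png', 'image/jpeg', 'image/gif']) returns null for anything outside that list.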

Screenshot_32

Here's a screenshot of the finfo code. Notice that PHP files get the C++ treatment with whitespace or HTML! All the others I tested seemed to pan out OK. How reliable this would be, I don't know.

Good one, I will see how that goes. I did think of using that method; however, I think I read somewhere that fileinfo is vulnerable to MIME type spoofing.

But I just had an idea: how about I give the file a randomly generated extension, then remove that extension upon download? That way, nobody could upload PHP files and run them. Do you think that would work?

I wouldn't want user scripts in my site anyway, but changing their extension may work.
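One way to sketch the renaming idea: store each upload under an opaque random name and keep the original name separately (for instance in a database row) so it can be restored at download time. Both helper names below are hypothetical, and the ".upload" suffix is an arbitrary choice; the point is only that the stored name never ends in an extension the web server would execute.

```php
<?php
// Hypothetical helper: choose an opaque on-disk name for an upload.
function makeStoredName(): string
{
    // 16 random bytes -> 32 hex characters, plus a non-executable suffix
    return bin2hex(random_bytes(16)) . '.upload';
}

// Hypothetical helper: reduce the user-supplied name to a safe download name.
function cleanDownloadName(string $original): string
{
    // Drop any directory part, whichever slash style the client used
    $name = basename(str_replace('\\', '/', $original));
    // Replace anything outside a conservative character set
    $name = preg_replace('/[^A-Za-z0-9._-]/', '_', $name);
    // Avoid names that are empty or made only of leading dots
    $name = ltrim($name, '.');
    return $name === '' ? 'download' : $name;
}
```

On upload you would save the file as makeStoredName() and record cleanDownloadName($_FILES["newFile"]["name"]) next to it; on download you would send the stored file back under the recorded name. Keeping the storage directory outside the web root is a further precaution.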

It's for Made 2 Code, and I need users to be able to upload scripts, and other various files.

Ok, you can store the code as plain text - either in text files or in a DB - or for copying from a page. htmlentities() can probably output safely. I don't know how secure storing native files in a zip archive would be. So on upload they get archived, and download is the only action permitted? That way the native files are as required in the archive for the user, with no fussing with changing extensions. Have a look at phpclasses or GitHub. How do they do it?
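To illustrate the output side, here is a minimal sketch that escapes stored code before it is shown on a page. It uses htmlspecialchars() rather than htmlentities() (either escapes the characters that matter here), and renderSource() is a hypothetical name.

```php
<?php
// Hypothetical helper: render stored source code as inert HTML text.
function renderSource(string $code): string
{
    // Escape <, >, &, and both quote styles so nothing is interpreted as markup
    $escaped = htmlspecialchars($code, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
    return '<pre><code>' . $escaped . '</code></pre>';
}
```

The code is never executed on the server this way; it is just text read from a file or a DB column and echoed through renderSource().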

Could I use Gzip? It will be on a linux server.

@diafol is right when he says:

store the code as plain text - either in text files or in a DB - or for copying from a page. htmlentities() can probably output safely

You should (in a way) be able to upload the files (.cpp, .php, .htm, ...), but the contents of such a file should be parsed and stored inside a database. Then they can easily be displayed on a web page and, at the request of the user, downloaded as a file again.

The wrong way (if this is what I assume you're doing) is to allow people to upload .zip files and then download them. This could lead to a lot of problems from your users.

If your system is for collaborating, sharing code, etc., then it would be better and more efficient to parse and store the code inside a database rather than holding a whole bunch of .zip files on a server. Have you thought about version control? I.e. someone takes the zip folder, makes some changes in ONE of the files and then uploads it back to the server. That's a whole project change, just for probably 1 line in 1 file. That's not done efficiently. :)
