Python: Check if uploaded file is jpg
06-07-2019
Question
How can I check if a file uploaded by a user is a real jpg file in Python (Google App Engine)?
This is what I have so far.
The script receives the image via an HTML form POST, and the following code processes it:
...
incomming_image = self.request.get("img")
image = db.Blob(incomming_image)
...
I found mimetypes.guess_type, but it does not work for me.
Solution
If you need more than a look at the extension, one way is to read the JPEG header and check that it matches valid data. The format for this is:
Start Marker | JFIF Marker | Header Length | Identifier
0xff, 0xd8 | 0xff, 0xe0 | 2-bytes | "JFIF\0"
so a quick recogniser would be:
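def is_jpg(filename):
    # Read just enough bytes to cover the header fields listed above.
    with open(filename, 'rb') as f:
        data = f.read(11)
    # Start marker followed by the JFIF marker.
    if data[:4] != b'\xff\xd8\xff\xe0': return False
    # Bytes 4-5 hold the header length; the identifier follows.
    if data[6:] != b'JFIF\0': return False
    return True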
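Since the upload in the question arrives as a byte string rather than a file on disk, the same header check can be applied to the bytes directly. This is only a sketch: the name `is_jfif_bytes` and the sample headers are made up for illustration, and it tests nothing beyond the first 11 bytes.

```python
JFIF_PREFIX = b"\xff\xd8\xff\xe0"  # start marker + JFIF marker
JFIF_IDENTIFIER = b"JFIF\0"        # identifier at byte offsets 6..10


def is_jfif_bytes(data):
    """Return True if `data` (a bytes object) starts with a JFIF header."""
    header = data[:11]
    return header[:4] == JFIF_PREFIX and header[6:11] == JFIF_IDENTIFIER


# Hand-built sample headers (not real images) to exercise the check.
jfif_header = b"\xff\xd8\xff\xe0\x00\x10JFIF\0\x01\x01"
png_header = b"\x89PNG\r\n\x1a\n" + b"\0" * 8

print(is_jfif_bytes(jfif_header))  # True
print(is_jfif_bytes(png_header))   # False
```

In the handler from the question, this would presumably be called on the value returned by self.request.get("img") before it is stored.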
However, this won't catch any bad data in the body. If you want a more robust check, you could try loading it with PIL, e.g.:
from PIL import Image

def is_jpg(filename):
    try:
        i = Image.open(filename)
        return i.format == 'JPEG'
    except IOError:
        return False
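Where PIL cannot be installed, a cheap middle ground between the header check and a full decode is to also look for the end-of-image marker (0xff, 0xd9) at the end of the data, which catches many truncated uploads. Treat this as a rough sketch: `has_jpeg_bounds` is a hypothetical helper, and some valid files carry trailing bytes after the marker, so a failure is a hint rather than proof.

```python
SOI = b"\xff\xd8"  # start-of-image marker
EOI = b"\xff\xd9"  # end-of-image marker


def has_jpeg_bounds(data):
    """Return True if `data` starts with SOI and ends with EOI."""
    return len(data) >= 4 and data.startswith(SOI) and data.endswith(EOI)


# Hand-built byte strings standing in for uploads (not decodable images).
complete = SOI + b"\xff\xe0" + b"\x00" * 16 + EOI
truncated = complete[:-5]

print(has_jpeg_bounds(complete))   # True
print(has_jpeg_bounds(truncated))  # False
```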
OTHER TIPS
There is no need to install and use the PIL library for this; the standard imghdr module is exactly fitted for this sort of usage.
See http://docs.python.org/library/imghdr.html
import imghdr

image_type = imghdr.what(filename)
if not image_type:
    print("error")
else:
    print(image_type)
Since you have the image as a byte stream rather than a file, you can probably pass it as the second argument (when it is given, the filename is ignored):
image_type = imghdr.what(filename, incomming_image)
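Note that imghdr was deprecated in Python 3.11 and removed in 3.13, so on newer interpreters the same kind of sniffing has to be done by hand or with a third-party package. A minimal stdlib-only sketch for in-memory bytes follows; the helper name `sniff_image_type` and the short signature table are illustrative assumptions, not part of any library, and the table covers only a few formats.

```python
# Leading byte signatures for a few common formats (not exhaustive).
SIGNATURES = [
    (b"\xff\xd8\xff", "jpeg"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
]


def sniff_image_type(data):
    """Return a short format name for `data` (bytes), or None if unknown."""
    for prefix, name in SIGNATURES:
        if data.startswith(prefix):
            return name
    return None


# Hand-built sample prefixes standing in for uploaded data.
print(sniff_image_type(b"\xff\xd8\xff\xe1\x00\x10Exif\0\0"))  # jpeg
print(sniff_image_type(b"GIF89a\x01\x00\x01\x00"))            # gif
print(sniff_image_type(b"just some text"))                     # None
```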
Actually, this works for me in Pylons (even though I have not finished everything). In the Mako template:
${h.form(h.url_for(action="save_image"), multipart=True)}
Upload file: ${h.file("upload_file")} <br />
${h.submit("Submit", "Submit")}
${h.end_form()}
In the upload controller:
def save_image(self):
    upload_file = request.POST["upload_file"]
    image_type = imghdr.what(upload_file.filename, upload_file.value)
    if not image_type:
        return "error"
    else:
        return image_type
A more general solution is to use the Python binding to the Unix "file" command. For this, install the package python-magic. Example:
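import magic

ms = magic.open(magic.MAGIC_NONE)
ms.load()

# Identify a file on disk by path.
file_type = ms.file("/path/to/some/file")
print(file_type)

# Identify data already in memory; read it in binary mode.
with open("/path/to/some/file", "rb") as f:
    buffer = f.read(4096)
file_type = ms.buffer(buffer)
print(file_type)

ms.close()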
Use PIL. If it can open the file, it's an image.
From the tutorial...
>>> from PIL import Image
>>> im = Image.open("lena.ppm")
>>> print(im.format, im.size, im.mode)
The fourth byte of the JPEG header varies beyond just 0xe0 (Exif files, for example, use 0xe1). Matching only the first three bytes is a good enough heuristic signature to reliably identify whether the file is a JPEG. Please see the modified proposal below:
def is_jpg(filename):
    with open("uploads/" + filename, 'rb') as f:
        data = f.read(11)
    if data[:3] == b"\xff\xd8\xff":
        return True
    elif data[6:] == b'JFIF\0':
        return True
    else:
        return False
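To illustrate why the fourth byte matters, here is a small sketch that applies the three-byte signature and then reports which header variant follows. The function name `jpeg_flavour`, its return labels, and the sample byte strings are assumptions made up for this example; only the JFIF (0xe0) and Exif (0xe1) cases are recognised by name.

```python
def jpeg_flavour(data):
    """Return 'JFIF', 'Exif', 'other JPEG', or None for non-JPEG bytes."""
    if data[:3] != b"\xff\xd8\xff":
        return None
    marker = data[3:4]        # fourth byte: which segment comes first
    identifier = data[6:11]   # identifier string after the 2-byte length
    if marker == b"\xe0" and identifier == b"JFIF\0":
        return "JFIF"
    if marker == b"\xe1" and identifier == b"Exif\0":
        return "Exif"
    return "other JPEG"


# Hand-built headers (not complete images) to exercise each branch.
jfif = b"\xff\xd8\xff\xe0\x00\x10JFIF\0\x01\x01"
exif = b"\xff\xd8\xff\xe1\x00\x10Exif\0\0MM"
raw = b"\xff\xd8\xff\xdb\x00\x43\x00" + b"\x01" * 8
png = b"\x89PNG\r\n\x1a\n"

for sample in (jfif, exif, raw, png):
    print(jpeg_flavour(sample))
```

A strict JFIF-only check would reject the second and third samples even though both begin with the JPEG signature, which is the case the relaxed version is meant to cover.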