Overview
Here's how to tell what a file really is from its content. People sometimes think they can rename a file and it will suddenly work as a different document type. Wrong: renaming changes the extension, not the bytes inside. Due to the license agreements of Microsoft and Adobe products, I can't look into the output of their files and give you the file format. The file formats of Microsoft documents are available from Microsoft's web site.
| extension | file type | content |
|---|---|---|
| .7z | 7-zip | starts with 7z followed by 0xBC 0xAF 0x27 0x1C |
| .zip, .docx, .dotx, .pptx, .ppsx, etc. | zip/PKZIP (all ms office documents from 2007 on are zipped XML files along with your pictures, videos, and files) | starts with PK |
| .exe | windows/dos executable | starts with MZ |
| .gif | GIF image | starts with GIF89a (older files start with GIF87a) |
| .png | Portable Network Graphics (PNG) | 1st byte 0x89 followed by PNG |
| .xcf | GIMP native format xcf file | starts with gimp xcf file |
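| .mp3 | MP3 | varies. often starts with ID3 (an ID3v2 tag); otherwise starts with a frame header, 0xFF followed by a byte whose top three bits are set, for example 0xFF 0xE3 0x10 0x0C 0x00 0x01 |
| .m4a | m4a | bytes 5 to 8 spell ftyp; the rest varies by file. one example: 0x00 0x00 0x00 0x18 0x66 0x74 0x79 0x70 0x6D 0x70 0x34 0x32 0x00 0x00 0x00 0x00 0x6D 0x70 0x34 0x32 0x69 0x73 0x6F 0x6D 0x00 |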
| .mpg | mpeg | starts with 0x00 0x00 0x01 0xBA |
| .avi | AVI | starts with RIFF, with AVI and a space at bytes 9 to 12 (WAV files also start with RIFF) |
| .bmp | BMP image | starts with BM |
| .doc | document | many formats. could be WordPad RTF, could be ms office 97-2003, could be plain ASCII text. ms office 97-2003 binary formats are documented on MSDN. |
| .ppt, .pps, .xls, .xlw, .dot | ms office 97-2003 binary format documents | documented on MSDN. |
| .pub, .pubx | Microsoft Publisher | Unpublished format. |
| .csv | Comma Separated Values | a bunch of fields separated by commas. strings may or may not be surrounded by double quotes. line endings differ depending on platform. |
| .txt | text file or Tab Separated Values | if it's a text file, it has plain ASCII or Unicode text, viewable with Notepad. if it's Tab Separated Values, it's an ASCII file with the Horizontal Tab character (ASCII code 0x09) separating the fields. |
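
The table can be turned into a small content sniffer. What follows is only a minimal sketch in Python: the names `SIGNATURES`, `sniff`, and `sniff_file` are made up for illustration, and the list covers only the fixed signatures from the table. MP3 is left out because the byte sequence given for it is one example rather than a fixed header, and short signatures such as BM, MZ, and PK can also match ordinary text that happens to start with those letters, so treat a match as a hint.

```python
import os
import tempfile

# (offset, magic bytes, label) taken from the table; all names here are illustrative.
SIGNATURES = [
    (0, b"7z", "7-zip"),
    (0, b"PK", "zip"),            # also .docx, .pptx and other Office 2007+ files
    (0, b"MZ", "exe"),
    (0, b"GIF89", "gif"),
    (0, b"\x89PNG", "png"),
    (0, b"gimp xcf", "xcf"),
    (4, b"ftyp", "m4a"),          # bytes 0x66 0x74 0x79 0x70 in the table's m4a row
    (0, b"\x00\x00\x01\xba", "mpeg"),
    (0, b"RIFF", "riff"),         # the table lists AVI under RIFF
    (0, b"BM", "bmp"),
]


def sniff(data: bytes) -> str:
    """Return the label of the first signature found in data, else 'unknown'."""
    for offset, magic, label in SIGNATURES:
        if data[offset:offset + len(magic)] == magic:
            return label
    return "unknown"


def sniff_file(path: str, nbytes: int = 32) -> str:
    """Read the first nbytes of a file and sniff them; the extension is ignored."""
    with open(path, "rb") as f:
        return sniff(f.read(nbytes))


if __name__ == "__main__":
    # A PNG header saved under a .txt name is still reported by its content.
    with tempfile.NamedTemporaryFile(suffix=".txt", delete=False) as tmp:
        tmp.write(b"\x89PNG\r\n\x1a\n")
    try:
        print(sniff_file(tmp.name))
    finally:
        os.remove(tmp.name)
```

The check never looks at the file name, which is the point of the whole table: the renamed file in the usage part keeps its original content and is reported accordingly.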