We all know that there are a number of attacks where an attacker embeds shellcode in a PDF document. The shellcode exploits a vulnerability in how the PDF document is analyzed and presented to the user in order to execute malicious code on the targeted system. The next picture presents the number of vulnerabilities discovered in the popular PDF reader Adobe Acrobat Reader. This is an important indicator that we should regularly update our PDF reader, because the number of vulnerabilities discovered recently is quite daunting.
Whenever we want to discover new vulnerabilities in software, we should first understand the protocol or file format we’re targeting. In our case, that means understanding the PDF file format in detail. In this article we’ll take a look at the PDF file format and its internals. PDF (Portable Document Format) can be used to present documents that include text, images, multimedia elements, web page links, etc.
It has a wide range of features. The first thing we must understand is that the PDF file format specification is publicly available and can be used by anyone interested in the PDF file format.
Header: This is the first line of a PDF file and specifies the version number of the PDF specification that the document uses. PDF document uses the PDF specification 1.
In PDF, a line that starts with the % character is a comment, so the first and second lines of the above example are both comments, which is true for all PDF documents.
Body: In the body of the PDF document, there are objects that typically include text streams, images, other multimedia elements, etc. The Body section holds all of the document’s data that is shown to the user.
Table: This is the cross reference table, which contains the references to all the objects in the document. The purpose of a cross reference table is to allow random access to objects in the file, so we don’t need to read the whole PDF document to locate a particular object.
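To make the header description concrete, here is a minimal Python sketch that reads the version from the first line of a PDF. The function name `pdf_header_version` and the sample bytes are hypothetical and are not taken from the article's example file; the sketch only assumes the usual `%PDF-M.m` header form.

```python
import re

def pdf_header_version(data: bytes):
    """Return (major, minor) from a '%PDF-M.m' header, or None if absent."""
    # Only the first line is inspected, since the header is the first line.
    first_line = data.split(b"\n", 1)[0].rstrip(b"\r")
    match = re.match(rb"%PDF-(\d)\.(\d)", first_line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

# Hypothetical sample: a header line followed by a second comment line.
sample_header = b"%PDF-1.4\n%\xe2\xe3\xcf\xd3\n1 0 obj\n<< >>\nendobj\n"
print(pdf_header_version(sample_header))
```

Both of the first two lines in the sample start with %, which matches the observation that they are comments.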
Each object is represented by one entry in the cross reference table, and every entry is always 20 bytes long. We can display the cross reference table of the PDF document by simply opening the PDF with a text editor and scrolling to the bottom of the document. The first number in each subsection header corresponds to the number of the first object in that subsection, while the second number states the number of objects in the current subsection. The last object in the cross reference table uses the generation number 0. The second subsection has an object ID 3 and contains 1 element, the object 3 that starts at an offset 25324 bytes from the beginning of the document. The third subsection has four objects, the first of which has an ID 21 and starts at an offset 25518 from the beginning of the file. Other objects have subsequent numbers 22, 23 and 24.
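The subsection layout described above can be sketched as a small Python parser. This is only an illustration for a classic, uncompressed cross reference section; the function name `parse_xref` is hypothetical. The sample reuses the offsets 25324 and 25518 from the text, while the entry for object 0 and the offsets for objects 22, 23 and 24 are made up for the example.

```python
def parse_xref(section: bytes):
    """Parse a classic xref section into {obj_num: (offset, generation, in_use)}."""
    lines = [ln.strip() for ln in section.splitlines() if ln.strip()]
    if not lines or lines[0] != b"xref":
        raise ValueError("not a classic xref section")
    entries = {}
    i = 1
    while i < len(lines) and lines[i] != b"trailer":
        # Subsection header: first object number, then number of entries.
        first, count = (int(x) for x in lines[i].split())
        i += 1
        for obj_num in range(first, first + count):
            # Entry: 10-digit offset, 5-digit generation, 'n' (in use) or 'f' (free).
            offset, gen, flag = lines[i].split()
            entries[obj_num] = (int(offset), int(gen), flag == b"n")
            i += 1
    return entries

# Offsets 25324 and 25518 come from the text; all other values are hypothetical.
sample_xref = (
    b"xref\n"
    b"0 1\n"
    b"0000000000 65535 f \n"
    b"3 1\n"
    b"0000025324 00000 n \n"
    b"21 4\n"
    b"0000025518 00000 n \n"
    b"0000025635 00000 n \n"
    b"0000025777 00000 n \n"
    b"0000025888 00000 n \n"
)
for number, entry in sorted(parse_xref(sample_xref).items()):
    print(number, entry)
```

Each entry line in the sample is exactly 20 bytes including its line ending, which is what makes random access by object number possible.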
Every indirect object has its own entry in the cross reference table. In our case the cross reference table starts at offset 24212 bytes. An object can be listed once in the cross reference table and reused by any page, and even by other arrays. The generation number is set to zero for all objects in a newly created file. Let’s open that PDF in a text editor. An array is delimited by square brackets. We can refer to indirect objects with an indirect reference. We could of course apply a zlib decompression algorithm to the compressed data. We know that the Kids attribute specifies all the child elements directly accessible from the current node. We’ll continue with the Page Tree only.
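As a rough illustration of arrays and indirect references, the following Python sketch pulls the references out of a Kids array. It assumes the usual "object number, generation number, R" form of an indirect reference; the function name `indirect_refs` and the sample array contents are hypothetical.

```python
import re

# An indirect reference: object number, generation number, then the keyword R.
INDIRECT_REF = re.compile(rb"(\d+)\s+(\d+)\s+R\b")

def indirect_refs(source: bytes):
    """Return (object_number, generation_number) for each indirect reference found."""
    return [(int(num), int(gen)) for num, gen in INDIRECT_REF.findall(source)]

# Hypothetical Kids array; the square brackets delimit the array.
sample_kids = b"/Kids [3 0 R 21 0 R 22 0 R]"
print(indirect_refs(sample_kids))
```

A regular expression is enough for a sketch like this, but a real parser would tokenize the file instead, since the same byte pattern can also appear inside strings or stream data.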
The trailer section starts at byte offset 50291. Object 212 contains the actual pages of the PDF document. Extensions: information about the developer extensions in this document. The cross reference section has been reduced for clarity. A stream object is represented by a dictionary object followed by the keyword stream, a newline, the stream data, and the keyword endstream. The obj keyword must occur at the end of the object ID line. We can append objects to the end of the PDF file without rewriting the entire file. In a dictionary, the key must be a Name object.
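The stream layout can be tried out with a short Python sketch that builds a small compressed stream object in memory and decodes it again with zlib. It assumes the stream dictionary carries a direct integer /Length entry and that the data is zlib-compressed (the /FlateDecode filter); the function name `decode_flate_stream` and the sample object are hypothetical.

```python
import re
import zlib

def decode_flate_stream(obj_source: bytes) -> bytes:
    """Decompress the first stream in obj_source (assumes direct /Length, zlib data)."""
    length_match = re.search(rb"/Length\s+(\d+)", obj_source)
    start_match = re.search(rb"stream\r?\n", obj_source)
    if length_match is None or start_match is None:
        raise ValueError("no stream with a direct /Length found")
    # Read exactly /Length bytes after the newline that follows 'stream'.
    start = start_match.end()
    raw = obj_source[start:start + int(length_match.group(1))]
    return zlib.decompress(raw)

# Hypothetical object: dictionary, 'stream', newline, data, newline, 'endstream'.
payload = zlib.compress(b"BT /F1 12 Tf (Hello) Tj ET")
sample_obj = (
    b"5 0 obj\n<< /Length " + str(len(payload)).encode()
    + b" /Filter /FlateDecode >>\nstream\n"
    + payload
    + b"\nendstream\nendobj\n"
)
print(decode_flate_stream(sample_obj))
```

Reading by /Length rather than searching for endstream avoids cutting the data short when the compressed bytes happen to end with a line-ending character.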
The object number gives the object a unique object identifier; the second part is the generation number. We can see that the other part of the xref table is compressed.
Trailer: The PDF trailer specifies how the application reading the PDF document should find the cross reference table and other special objects. We can specify the Version entry in the document’s catalog dictionary to override the default version from the PDF header.
Parent: should be present in all page tree nodes except the root.
The stream data in an object stream contains N pairs of integers. After the stream data there should be a newline and the endstream keyword.
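To show how a reader might locate the cross reference table from the end of the file, here is a minimal Python sketch. It assumes the file ends with the startxref keyword, the byte offset of the cross reference table, and the %%EOF marker; the function name `find_startxref` and the sample trailer dictionary are hypothetical, while the offset 24212 is the one quoted earlier in the article.

```python
import re

def find_startxref(data: bytes, tail_size: int = 1024):
    """Return the xref offset given after the last 'startxref', or None."""
    # Only the end of the file is searched, since the trailer sits there.
    tail = data[-tail_size:]
    matches = re.findall(rb"startxref\s+(\d+)\s+%%EOF", tail)
    if not matches:
        return None
    # Appended updates add new trailers, so the last match is the current one.
    return int(matches[-1])

# Hypothetical end of file; only the offset 24212 comes from the article.
sample_tail = (
    b"trailer\n<< /Size 25 /Root 1 0 R >>\n"
    b"startxref\n24212\n%%EOF\n"
)
print(find_startxref(sample_tail))
```

Taking the last match fits the fact that objects can be appended to the end of a PDF without rewriting the whole file.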