setcooki
ALLROUND WEB DEVELOPER

11
Oct 11

Get bookmarks from PDF with PHP

  
  
  

Splitting a PDF according to its Table of Contents (Bookmarks) with PHP

With the command-line tool pdftk (http://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/) you can write the bookmark structure of a PDF to a .txt file using the following command:

pdftk /tmp/bookmark.pdf dump_data output /tmp/report.txt

Run the command from PHP with exec() or system(). The resulting report looks like this:

InfoKey: Creator
InfoValue: Writer
InfoKey: Producer
InfoValue: OpenOffice.org 3.3
InfoKey: Author
InfoValue: setcookie
InfoKey: CreationDate
InfoValue: D:20110527153051+02'00'
PdfID0: 51d2362ea610d06bd86af64d44b4f
PdfID1: 51d2362ea610d06bd86af64d44b4f
NumberOfPages: 66
BookmarkTitle: 1 Vorwort
BookmarkLevel: 1
BookmarkPageNumber: 4
BookmarkTitle: 2 Versionen / Historie
BookmarkLevel: 1
BookmarkPageNumber: 4
BookmarkTitle: 3 Allgemein
BookmarkLevel: 1
BookmarkPageNumber: 5
BookmarkTitle: 4 Gateway URL
BookmarkLevel: 1
BookmarkPageNumber: 5
BookmarkTitle: 5 Globale Parameter
BookmarkLevel: 1
BookmarkPageNumber: 5
....

The result can be parsed into a nested array. You can then iterate over the levels of that structure to split the PDF, or use it for a bookmark preview in a JS/AS3 PDF viewer.
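
The parsing step could look roughly like the sketch below. It is only one possible approach: parseBookmarks() and nestBookmarks() are hypothetical helper names, and the code assumes that every bookmark in the report starts with a BookmarkTitle line followed by its BookmarkLevel and BookmarkPageNumber, as in the listing above.

```php
<?php
// Sketch only: parseBookmarks() and nestBookmarks() are hypothetical helpers,
// not part of pdftk or PHP.

/** Collect the Bookmark* lines of a dump_data report into a flat list. */
function parseBookmarks(string $report): array
{
    $bookmarks = [];
    foreach (preg_split('/\R/', $report) as $line) {
        // Every useful line has the form "Key: Value"; skip the rest.
        if (!preg_match('/^\s*(\w+):\s*(.*)$/', $line, $m)) {
            continue;
        }
        $key = $m[1];
        $value = rtrim($m[2]);
        if ($key === 'BookmarkTitle') {
            // Assumption: a title line starts a new bookmark record.
            $bookmarks[] = ['title' => $value, 'level' => 1, 'page' => 0, 'children' => []];
        } elseif ($key === 'BookmarkLevel' && $bookmarks) {
            $bookmarks[count($bookmarks) - 1]['level'] = (int) $value;
        } elseif ($key === 'BookmarkPageNumber' && $bookmarks) {
            $bookmarks[count($bookmarks) - 1]['page'] = (int) $value;
        }
    }
    return $bookmarks;
}

/** Turn the flat list into a tree by hanging each node under the last node one level up. */
function nestBookmarks(array $flat): array
{
    $tree = [];
    $parents = []; // $parents[$level] references the last node seen at that level
    foreach ($flat as $node) {
        $level = max(1, $node['level']);
        if ($level > 1 && isset($parents[$level - 1])) {
            $parents[$level - 1]['children'][] = $node;
            $last = count($parents[$level - 1]['children']) - 1;
            $parents[$level] = &$parents[$level - 1]['children'][$last];
        } else {
            // Top-level node (or a node whose parent level is missing).
            $tree[] = $node;
            $parents[$level] = &$tree[count($tree) - 1];
        }
        // Forget deeper levels so later nodes cannot attach to a stale parent.
        foreach (array_keys($parents) as $known) {
            if ($known > $level) {
                unset($parents[$known]);
            }
        }
    }
    return $tree;
}
```

Usage would be something like $tree = nestBookmarks(parseBookmarks(file_get_contents('/tmp/report.txt'))); each node then carries its title, level, page and a children array that can be walked recursively.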
You should also be able to read pdftk's stdout directly from PHP rather than writing the report file first. In conjunction with SWFTools, a complete PDF-to-Flash converter can be built.
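
A minimal sketch of reading the report straight from stdout, assuming pdftk is on the PATH. It relies on dump_data printing to stdout when no output file is given, which is how the pdftk manual describes it; buildDumpCommand() and dumpPdfData() are hypothetical helper names.

```php
<?php
// Sketch only: buildDumpCommand() and dumpPdfData() are hypothetical helpers.

/** Build the shell command; the path is escaped so spaces and quotes are safe. */
function buildDumpCommand(string $pdfPath): string
{
    // Without an "output <file>" argument, dump_data reports to stdout.
    return 'pdftk ' . escapeshellarg($pdfPath) . ' dump_data';
}

/** Run pdftk and return the report as one string. */
function dumpPdfData(string $pdfPath): string
{
    $lines = [];
    $exitCode = 0;
    // 2>&1 folds error messages into the captured output for the exception text.
    exec(buildDumpCommand($pdfPath) . ' 2>&1', $lines, $exitCode);
    if ($exitCode !== 0) {
        throw new RuntimeException("pdftk failed ($exitCode): " . implode("\n", $lines));
    }
    return implode("\n", $lines);
}
```

Calling dumpPdfData('/tmp/bookmark.pdf') would then return the same text that the command above writes to /tmp/report.txt, without the temporary file.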
All of this is open source, so there is no need to install PDFlib or anything similar.
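
For the split itself, one option is to turn the bookmark page numbers into page ranges and hand each range to pdftk's cat operation. The sketch below is an assumption about how that might be done, not a tested recipe: bookmarkPageRanges() and buildSplitCommand() are hypothetical helper names, and the bookmarks are expected as a list of arrays with 'title' and 'page' keys, ordered by page.

```php
<?php
// Sketch only: bookmarkPageRanges() and buildSplitCommand() are hypothetical helpers.

/**
 * Each section runs from its own page to the page before the next section
 * that starts on a later page; the last section runs to the end of the PDF.
 */
function bookmarkPageRanges(array $bookmarks, int $numberOfPages): array
{
    $bookmarks = array_values($bookmarks);
    $count = count($bookmarks);
    $ranges = [];
    foreach ($bookmarks as $i => $bookmark) {
        $end = $numberOfPages;
        for ($j = $i + 1; $j < $count; $j++) {
            if ($bookmarks[$j]['page'] > $bookmark['page']) {
                $end = $bookmarks[$j]['page'] - 1;
                break;
            }
        }
        $ranges[] = ['title' => $bookmark['title'], 'from' => $bookmark['page'], 'to' => $end];
    }
    return $ranges;
}

/** Build a "pdftk <in> cat <from>-<to> output <out>" command for one range. */
function buildSplitCommand(string $pdfPath, array $range, string $outPath): string
{
    return sprintf(
        'pdftk %s cat %d-%d output %s',
        escapeshellarg($pdfPath),
        $range['from'],
        $range['to'],
        escapeshellarg($outPath)
    );
}
```

Sections that start on the same page, such as the first two bookmarks in the report above, each get that shared page, so the resulting files overlap. Whether that is acceptable depends on what the split is for.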

Good luck


Copyright © 2012 setcooki