This is an incredibly tall order. PDF is a maddeningly convoluted standard, so writing your own reader is rarely worth the effort. You should look for an existing PDF scanning and parsing API and interface with that instead.
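
To give a rough sense of why hand-rolled scanning tends to fail, here is a minimal sketch using only Python's standard library. It assumes a made-up page content stream and mimics the common case where a PDF stores that stream compressed with the FlateDecode filter (zlib/deflate). The `content` bytes and the `naive_find` helper are hypothetical, purely for illustration, and a real file adds many more layers (object tables, fonts, encodings) on top of this.

```python
import zlib

# Hypothetical page content stream: PDF text operators wrapped around a string.
content = b"BT /F1 12 Tf 72 720 Td (Hello, world) Tj ET"

# PDFs commonly store streams compressed with the FlateDecode filter
# (zlib/deflate), so the file on disk holds these bytes, not the text above.
stored = zlib.compress(content)


def naive_find(data: bytes, needle: bytes) -> bool:
    """Search raw bytes the way a quick hand-rolled scanner might."""
    return needle in data


# Searching the bytes as stored versus searching the decoded stream.
found_raw = naive_find(stored, b"Hello")
found_decoded = naive_find(zlib.decompress(stored), b"Hello")
print("found in stored bytes: ", found_raw)
print("found in decoded bytes:", found_decoded)
```

Decompression is only one of the steps a parser has to get right before the text is even visible, which is the kind of work an existing library already does for you.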