I have been busy recently working on a simple DRM project for a client. The requirement is to protect around 500 PDF files for use on a number of machines in the field. The client wants engineers to be able to read the files but nothing else. No printing, no copying, nada!
The challenges have been varied, and I’ll cover some of the others in another article more specific to the protection of the files. The problem I want to cover here is the need to get the files, some of which are as large as 350MB, slimmed down to a size that can be handled by the conversion to secure documents at a later stage.
Once upon a time, Adobe Acrobat was the only tool for the creation and manipulation of PDF documents. Now I have tools from six different suppliers that allow me to do just about everything I need to these information carriers. Except for this job! 4GB of PDF files, tens of thousands of pages, and differing requirements from file to file. PDFs with an A4 image per page...
While I could have taken the files and manually made them smaller by cutting them into sections, I realised that would take a substantial amount of time. I needed something that would allow me to:
1) chop the files into sections based on the number of pages
2) chop the files into subsets based on ranges of pages that coincide with sections
3) get round the fact that some are password protected (and I know the password)
4) repeat the above as many times as necessary, and then run the same process on a remote server which holds the actual distribution versions of the protected files
So the tool that got me through this? A-PDF Split Command Line, from www.a-pdf.com, at http://www.a-pdf.com/split/cmd.htm
In the end I opted for creating a DOS .cmd file with the commands in it. This allowed me to test and rerun the commands while getting the syntax correct.
An example line from the script shows the use of quotes to get round spaces in the PDF names, and how the password on a file is supplied:
pscmd "HM400-1_0502.pdf" -L"HM400-1_0502.psl" -Wpasswordonthefile
where the first parameter is the file to split, the second (-L) is the text file holding the ranges, and -W supplies the file password.
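With several hundred files to process, the script lines themselves could be generated rather than typed. The following is a minimal Python sketch of that idea, not the method actually used on this job. It relies only on the pscmd syntax in the example line (quoted file name, -L for the range file, -W for the password). The function names are hypothetical, it assumes each range file shares the PDF's name with a .psl extension, it assumes -W can simply be left off for unprotected files, and it makes no attempt to handle passwords containing spaces.

```python
from pathlib import Path


def build_split_command(pdf_name, password=None, exe="pscmd"):
    """Return one pscmd line for pdf_name, using a .psl file with the same stem."""
    # Assumption: the range file sits next to the PDF and shares its name.
    psl_name = Path(pdf_name).with_suffix(".psl").name
    # Quote the file names so that spaces in them survive on the command line.
    parts = [exe, f'"{pdf_name}"', f'-L"{psl_name}"']
    if password:
        # -W carries the file password, as in the article's example line.
        parts.append(f"-W{password}")
    return " ".join(parts)


def write_cmd_script(jobs, script_path):
    """Write one command per (pdf_name, password) pair to a .cmd file."""
    lines = [build_split_command(name, pw) for name, pw in jobs]
    # newline="" stops Python translating the explicit CRLF line endings.
    with open(script_path, "w", newline="") as handle:
        handle.write("\r\n".join(lines) + "\r\n")
    return lines
```

Calling write_cmd_script([("HM400-1_0502.pdf", "passwordonthefile")], "split_all.cmd") would write a script containing the example line; the name split_all.cmd is only a placeholder.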
The .psl file holds the ranges, one per line:
1-263
264-828
829-1097
and the result, once run, is three files named:
HM300-1.0001.pdf
HM300-1.0002.pdf
HM300-1.0003.pdf
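For the first requirement, splitting purely by page count, the range file does not need to be written by hand. Here is a small Python sketch that produces ranges in the same start-end layout as the .psl example. The function names are hypothetical, and the total page count is assumed to be known already (reading it from the PDF is outside this sketch).

```python
def page_ranges(total_pages, pages_per_file):
    """Return 'start-end' strings covering pages 1..total_pages in fixed-size chunks."""
    if total_pages < 1 or pages_per_file < 1:
        raise ValueError("total_pages and pages_per_file must be positive")
    ranges = []
    for start in range(1, total_pages + 1, pages_per_file):
        # The last chunk may be shorter than pages_per_file.
        end = min(start + pages_per_file - 1, total_pages)
        ranges.append(f"{start}-{end}")
    return ranges


def write_psl(path, ranges):
    """Write one range per line, matching the layout of the .psl example."""
    # newline="" stops Python translating the explicit CRLF line endings.
    with open(path, "w", newline="") as handle:
        handle.write("\r\n".join(ranges) + "\r\n")
```

For a 1097-page file cut into 400-page pieces, page_ranges(1097, 400) gives 1-400, 401-800 and 801-1097, which write_psl can then save alongside the PDF.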
So there you go: a tool that once again does what it says on the tin.