On everyday data processing: a Penner's command line

Hi there,

Say you have to do your bookkeeping (and, by the way, the renewal application for welfare benefits at the so-called Jobcenter office) every 3 months or once a year, on a Linux system and as a self-entertainer like me. Then here are some command lines as a starting point and reference:

#on handing out invoices

# this is the directory where all the invoices end up
mkdir -p ~/pathtoJC150312/print2

#first, as a workaround, remove all the white space from the bills' file names; bash goes crazy with it otherwise
for f in /pathtocollectedbills/1[45]* ; do mv "$f" "$(echo "$f" | tr ' ' '_')"; done
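
A slightly more defensive variant of the rename step, as a sketch only: it touches just the names that really contain a space and does not overwrite a file that already exists. The function name `underscore_names` and its directory argument are made up for illustration; the `1[45]*` prefix is the one from the loop above.

```shell
#!/bin/sh
# Sketch: replace spaces with underscores in the bills' file names.
# underscore_names is a hypothetical helper, not part of any tool.
# The body runs in a subshell ( ... ), so its variables stay local.
underscore_names() (
    dir=$1
    for f in "$dir"/1[45]*; do
        [ -e "$f" ] || continue                  # the glob matched nothing
        base=${f##*/}                            # file name without the directory
        new=$(printf '%s' "$base" | tr ' ' '_')  # same tr call as above
        [ "$base" = "$new" ] && continue         # no space, nothing to rename
        if [ -e "$dir/$new" ]; then
            echo "skipping $f: $new already exists" >&2
            continue
        fi
        mv -- "$f" "$dir/$new"
    done
    exit 0
)
```

Usage would be `underscore_names /pathtocollectedbills`. Only the file name is changed, never the directory part of the path.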

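#then copy the bills to your invoices dir like this (names matching A140[1-7] are left out); this relies on the file names being free of white space by now
for FILE in $(ls /pathtocollectedbills/1[45]* | grep -v 'A140[1-7]') ; do cp "$FILE" ~/pathtoJC150312/print2; done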
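
The same copy step can be sketched without parsing the output of `ls`, so that it also survives file names with spaces. `copy_bills` is a hypothetical helper; the filter is applied to the file name only, which is roughly what `grep -v A140[1-7]` does above as long as the directory path itself does not match.

```shell
#!/bin/sh
# Sketch: copy the bills into the print dir, skipping names that contain A140[1-7].
# copy_bills is a hypothetical helper; both arguments are directories.
copy_bills() (
    src=$1
    dest=$2
    mkdir -p -- "$dest"
    for f in "$src"/1[45]*; do
        [ -f "$f" ] || continue
        case ${f##*/} in
            *A140[1-7]*) continue ;;   # excluded, like grep -v A140[1-7]
        esac
        cp -- "$f" "$dest"/
    done
    exit 0
)
```

Usage would be `copy_bills /pathtocollectedbills ~/pathtoJC150312/print2`.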

#merge everything into 1 pdf file; otherwise the physical printer (hp) prints dirty with its Chinese cartridges

cd ~/pathtoJC150312/print2;

pdftk *.pdf cat output zusammen.pdf

#the merged file stays here as the digital archive; to get it onto the paper stack as well:

lp -o ColorModel=KGray -o fit-to-page -o outputorder=reverse zusammen.pdf
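
One thing to watch with `pdftk *.pdf`: on a second run the glob also matches the old zusammen.pdf. A minimal sketch that removes the stale result first and then merges and prints; `merge_and_print` is a hypothetical wrapper, and the `lp` options are simply the ones from the line above.

```shell
#!/bin/sh
# Sketch: merge all PDFs of a directory into one file and print it.
# merge_and_print is a hypothetical wrapper around the two commands above.
# The body runs in a subshell ( ... ), so the cd does not leak to the caller.
merge_and_print() (
    dir=$1
    out=${2:-zusammen.pdf}
    cd -- "$dir" || exit 1
    rm -f -- "$out"        # otherwise an old merge result is matched by *.pdf too
    set -- *.pdf           # the shell expands the glob sorted by name
    [ -e "$1" ] || { echo "no PDFs in $dir" >&2; exit 1; }
    pdftk "$@" cat output "$out" || exit 1
    lp -o ColorModel=KGray -o fit-to-page -o outputorder=reverse "$out"
)
```

Usage would be `merge_and_print ~/pathtoJC150312/print2`. Since the pages are merged in file name order, the date prefixes of the bills decide the order in the printout.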

On filling out PDF forms

#Linux PDF viewers such as okular print dirty to paper, especially when it comes to fdf forms that were filled out before. Just take lp for it.

lp Abschliessende-Angaben-Einkommen-Selbstaendiger.pdf

# on the ordinary bookkeeping and handout workflow:

Before bookkeeping: the source bills sat in our emails, in the online shop accounts, on some loose papers flying around...

-get them collected:

  1. Take stock, as a source/sink overview. Sinks:
    1. a survey of the historic handouts gives you all the sink files
    2. among them tab.xls, to be used for interactive copy & paste (c&p) into the sink form,
    3. sources: account statements, human memory, bills in our emails, from the online shop accounts, some loose papers
  2. That means: log in interactively and download (dl) the account statements as .xls for the period (BWZ) from online banking,
    1. by the way, also dl the account statement PDFs into the account statements dir, to be copied afterwards (with dolphin, or on the command line) to the handout print dir, merged there with pdftk, and printed with lp (see the command lines above)
  3. To find the expenses and fit them into your historic tab.xls, you enter the odyssey of Calc sorting: https://forum.ubuntuusers.de/topic/open-office-calc-tabelle-nach-datum-sortieren/
    # see the posts after: "Datümern, die Calc – warum auch immer – nicht als Datum erkennt" (dates that Calc, for whatever reason, does not recognize as dates).
  4. Dump the slips into pathtocollectedbills:
    1. scanned, from the slips dir, online accounts, where e
    2. named with dolphin, following the historic pattern
    3. add up their contents in Calc, c&p from the file name into tab.xls
    4. process the expense handout with the command lines above
  5. Leave the files of the historic survey updated, on disk and on the paper stack, 1,2,8. You did copy the historic handout dirs before messing with them, right?
  6. Make sure each form is signed twice, keep a note so you can keep an eye on each matter over the years, and bundle the stacks in an envelope: 1.

#Tool: search in pdfs

#for example for one particular admin notice among many bundles
for i in 1[45]*.pdf; do pdftotext "$i" - | grep "01.08" | grep "14"; done
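
The loop above prints matching lines but not the file they came from. A small sketch that prefixes each hit with the file name and treats the search strings as fixed text, so the dot in 01.08 is not a regex wildcard. `search_pdfs` and its three arguments are hypothetical; it assumes `pdftotext` from poppler-utils is installed.

```shell
#!/bin/sh
# Sketch: print every text line that contains both strings, with its file name.
# search_pdfs is a hypothetical helper; usage: search_pdfs DIR STRING1 STRING2
search_pdfs() (
    dir=$1
    first=$2
    second=$3
    for f in "$dir"/1[45]*.pdf; do
        [ -f "$f" ] || continue
        # "-" sends the extracted text to stdout; grep -F matches fixed strings
        pdftotext "$f" - 2>/dev/null |
            grep -F -- "$first" | grep -F -- "$second" |
            while IFS= read -r line; do
                printf '%s: %s\n' "$f" "$line"
            done
    done
    exit 0
)
```

Usage would be `search_pdfs . "01.08" "14"` inside the directory with the bundles.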

#First pages of the jc admin notices for the handout: unpaper, add to bundle, copy last 3 pages

#unpaper, scanning to a sandwich pdf; for the install see http://wiki.ubuntuusers.de/ocrmypdf. The point is that we have to deliver a seamless bundle of the administrative notices (Bescheide) of the social office (Jobcenter), that is to say their first pages, showing that we received welfare benefits for whole years.

#first scan the administrative notice (Bescheid) itself, sandwiched

hp-scan --adf -f /home/kubuntu/Downloads/scan002.pdf
#this drops a scan that is not sandwiched yet
/opt/OCRmyPDF-2.2-stable/OCRmyPDF.sh -f -l deu /home/kubuntu/hpscan002.pdf 150731JC_BescheidevorlaeufigeBewillBWZ1.8.14-31.1.15.pdf
rm 140301JC_BescheidevorlaeufigeBewillBWZ1.8.14-31.1.15.pdf
okular 150731JC_BescheidevorlaeufigeBewillBWZ1.8.14-31.1.15.pdf

# extract the first page out of one pdf, attach it to another bundle, then copy the last 3 pages into the working dir, to be printed out and archived as submitted

pdftk A="$A" B="$B" cat B A1 output "${B}_"

pdftk B="$B" cat B11-end output "$C"
xpdf "$C"
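
`B11-end` only gives the last 3 pages when the bundle has exactly 13 pages. As I read the pdftk man page, page numbers can also be counted from the end with an `r` prefix (`r1` is the last page), so a sketch that does not depend on the page count could look like this. Both function names are made up; A, B and C stand for the same files as above.

```shell
#!/bin/sh
# Sketch of the two pdftk steps with every expansion quoted.
# The functions are hypothetical wrappers; A, B and C are file names.

# Append page 1 of notice A to bundle B; the result is written to "${B}_".
append_first_page() (
    A=$1
    B=$2
    pdftk A="$A" B="$B" cat B A1 output "${B}_"
)

# Copy the last three pages of B to C, whatever the page count of B is.
# r3-r1 uses pdftk's reverse page numbering (r1 is the last page).
last_three_pages() (
    B=$1
    C=$2
    pdftk B="$B" cat Br3-r1 output "$C"
)
```

Usage would be `append_first_page "$A" "$B"` followed by `last_three_pages "${B}_" "$C"`, if the last pages are to be taken from the extended bundle.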


  • update

  • have a look at my recent elaborate scanndistribute.sh for any scanning/distribution purposes..


#sh jc-startup150823.sh

echo ‘https://softwareforhartz4entertainer.wordpress.com/2015/07/31/on-every-day-data-processing-penners-command-line/
254923374520398jg -wordpress pass
blecher.tom20164@yandex.com -wordpress user login
add first and last love letter, security pass
Pathes as variables /pathtocollectedbills (after bookkeeping) /pathtoJC150312/ (JC handouts’
#there is your work; enter it into the computer, the behaviourists among you are keen on new animal experiment data
kate --start "JC-Burokratie" --line 26891 # this recalls old session records, there we are at our local, dirty, non-public origin
ding & # no way without ding

Problems I got rid of
#P2 kate could not handle text files of more than 30,000 lines
#P3 k, had problems exporting from Calc to pdf: use the GUI (File / Export as PDF) and the page print preview
#P4 k, had problems sorting dates in Calc: Data / Sort, Data / Text to Columns because of the date (click the column's head in the dialog)
#P5 k, okular could not print form.pdf: lp does it
#P7 on pdf forms
#aha, there seems to be a non-interactive way to enter your Calc data into a pdf form…
#P8 I do not like that one cannot restart hp-scan --adf.
#P10 I do not like that there is no pdf preview in dolphin either.
#P7 scanimage instead of hp-scan --adf: tried this already, see http://ubuntuforums.org/showthread.php?t=2291873 and https://github.com/fritz-hh/OCRmyPDF/issues/116. k, what comes now? No k

#P11 lp -o ColorModel=KGray -o fit-to-page -o outputorder=reverse gave some message on the hp display: format not suitable, or so… probably because I had done the following (phew, we got problems with big black bars at the bottom):
#pdfcrop --margins '0 0 0 -189'
Now: to this day I do not understand these graphics measurements, or how to look them up. For example this one here, what does it mean? Is it suitable for printing on DIN A4 or not?

pdfinfo $Pstore$Nstore
Creator:        pdftk 2.01 - http://www.pdftk.com
Producer:       itext-paulo-155 (itextpdf.sf.net-lowagie.com)
CreationDate:   Mon Aug 24 23:15:39 2015
ModDate:        Mon Aug 24 23:15:39 2015
Tagged:         no
Form:           none
Pages:          6
Encrypted:      no
Page size:      609 x 819 pts
Page rot:       0
File size:      696187 bytes
Optimized:      no
PDF version:    1.4
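
As for what "609 x 819 pts" means: pdfinfo reports the page size in points, and 1 pt is 1/72 inch, that is 25.4/72 mm. A small sketch to convert it; `pts_to_mm` is a made-up helper, and the A4 size of 595 x 842 pts is the usual value rounded to whole points.

```shell
#!/bin/sh
# Sketch: convert the "Page size" that pdfinfo prints (in points) to millimetres.
# 1 pt = 1/72 inch and 1 inch = 25.4 mm. pts_to_mm is a hypothetical helper.
pts_to_mm() {
    # LC_ALL=C keeps the decimal point independent of the locale
    LC_ALL=C awk -v w="$1" -v h="$2" \
        'BEGIN { printf "%.1f x %.1f mm\n", w * 25.4 / 72, h * 25.4 / 72 }'
}

pts_to_mm 609 819    # the page size reported above -> 214.8 x 288.9 mm
pts_to_mm 595 842    # DIN A4 in whole points       -> 209.9 x 297.0 mm
```

So the cropped page is about 5 mm wider and about 8 mm shorter than DIN A4 (210 x 297 mm). That it no longer matches A4 could be the reason for the message on the printer display, though that is only a guess.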

Todos: T1: all full paths as variables (/pathtocollectedbills, /pathtoJC150312/), to be integrated into the scripts

T2: upload tab.xls some day, anonymised

T3: format this? rec&p to libreoffice for contents?
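
For T1, a minimal sketch of how the paths could live in one place. The variable names BILLS_DIR, HANDOUT_DIR and PRINT_DIR and the helper `require_dir` are made up; the values are just the placeholders used throughout this post.

```shell
#!/bin/sh
# Sketch for T1: keep the paths in one place and let the scripts use them.
# All names here are hypothetical; both base paths can be overridden
# from the environment.
BILLS_DIR=${BILLS_DIR:-/pathtocollectedbills}
HANDOUT_DIR=${HANDOUT_DIR:-$HOME/pathtoJC150312}
PRINT_DIR=$HANDOUT_DIR/print2

# Refuse to continue when a directory is missing, instead of failing halfway.
require_dir() {
    [ -d "$1" ] || { echo "missing directory: $1" >&2; return 1; }
}
```

A script would then start with `require_dir "$BILLS_DIR" && require_dir "$HANDOUT_DIR" && mkdir -p "$PRINT_DIR"` and use only the variables after that.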

