I have a few dozen PDF files whose names start with a date (e.g. 2019.07.01), followed by a long text in which the words are separated by spaces. I want to keep the date and the two words that follow it (the authors' names) and remove the rest of the text, like this:
2010.02.01 word1 word2 word3 word4 word5 ... .pdf ---> 2010.02.01 word1 word2.pdf
2010.04.02 wordX wordY wordZ wordXX wordXY ... .pdf ---> 2010.04.02 wordX wordY.pdf
The text and the length of 'word1' and 'word2' (the author names) differ from file to file. How do I keep the date and the author names and get rid of the remaining long text?
I tried to find out whether I can do this with RegEx, but haven't succeeded so far. Could someone advise how this can be achieved?
Thanks in advance!
Frank
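
One possible approach, sketched in Python under a few assumptions: the date always has the form YYYY.MM.DD, the two author names contain no spaces, and every file ends in a lowercase ".pdf". The names `shorten` and `rename_all` are hypothetical helpers made up for this illustration, and the folder path is a placeholder. The regular expression captures the date and the first two words and ignores everything between them and the extension.

```python
import re
from pathlib import Path
from typing import Optional

# Date (YYYY.MM.DD), a space, two space-separated words,
# then optionally any further text before the ".pdf" extension.
PATTERN = re.compile(r"^(\d{4}\.\d{2}\.\d{2}) (\S+) (\S+)(?: .*)?\.pdf$")


def shorten(name: str) -> Optional[str]:
    """Return 'date word1 word2.pdf', or None if the name does not match."""
    match = PATTERN.match(name)
    if match is None:
        return None
    date, first, second = match.groups()
    return f"{date} {first} {second}.pdf"


def rename_all(folder: str, dry_run: bool = True) -> None:
    """Rename every matching PDF in `folder`; only print when dry_run is True."""
    for path in sorted(Path(folder).glob("*.pdf")):
        new_name = shorten(path.name)
        if new_name is None or new_name == path.name:
            continue  # not in the expected format, or already short
        target = path.with_name(new_name)
        if target.exists():
            # Two files with the same date and authors would collide.
            print(f"skipped (target exists): {path.name}")
            continue
        print(f"{path.name} -> {new_name}")
        if not dry_run:
            path.rename(target)
```

Running `rename_all("path/to/pdfs")` first only prints the planned renames; calling it again with `dry_run=False` performs them. Files whose shortened name already exists are skipped rather than overwritten, since two papers by the same authors on the same date would otherwise clash.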