🤔 Is your feature request related to a problem? Please describe.
Most AI models are not trained on PDF data because it is difficult to parse. I'm working on a PDF parsing project that removes tables, charts, headers, etc., so that the output of extraction libraries like PyMuPDF improves significantly.
I solved table removal; I would love to solve header removal now.
💡 Describe the solution you'd like
Can we remove headers/footers on PDFs so the output of page.get_text() is cleaner?
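
To make the request concrete, here is a rough sketch of the behavior I have in mind. It is not an existing PyMuPDF feature. It assumes headers and footers sit inside fixed top/bottom margins, which will not hold for every document. The helper names and the 50-point margins are hypothetical. The filter works on the tuples that page.get_text("blocks") returns, in the form (x0, y0, x1, y1, text, block_no, block_type).

```python
def strip_margin_blocks(blocks, page_height, top=50.0, bottom=50.0):
    """Return the text of blocks lying fully between the top and bottom margins.

    `blocks` is a list of (x0, y0, x1, y1, text, block_no, block_type) tuples.
    `top` and `bottom` are margin heights in points (assumed values, tune per document).
    """
    kept = []
    for x0, y0, x1, y1, text, block_no, block_type in blocks:
        if block_type != 0:
            # Skip non-text (image) blocks.
            continue
        if y0 < top or y1 > page_height - bottom:
            # Block reaches into the header or footer band: drop it.
            continue
        kept.append(text.strip())
    return "\n".join(kept)


def page_text_without_margins(page, top=50.0, bottom=50.0):
    """Hypothetical wrapper; `page` is assumed to be a PyMuPDF Page object."""
    return strip_margin_blocks(page.get_text("blocks"), page.rect.height, top, bottom)


# Small hand-made example on a 792-point-high page (no PDF file needed).
sample_blocks = [
    (50, 20, 300, 35, "Report title\n", 0, 0),   # inside the top margin
    (50, 100, 300, 400, "Body text\n", 1, 0),    # main content
    (50, 760, 300, 780, "Page 3\n", 2, 0),       # inside the bottom margin
]
cleaned = strip_margin_blocks(sample_blocks, page_height=792)
```

A fixed margin is the simplest possible heuristic. Detecting text that repeats at the same position across pages would probably be more robust, but it needs the whole document rather than a single page.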