OCR on Screenshots with Pytesseract #14

jtabor · 2025-09-19T20:27:52Z

This adds functions for running OCR on screenshots. It helps find button locations when get_uilayout() can't find UI elements (for example, in games where the layout contains only a PlayerView element in which the game is rendered, so the game's own UI elements are not in the layout).

It passes full-sized screenshots into pytesseract and requires tesseract to be installed.
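As a rough illustration of the idea (not the code in this PR), the sketch below shows how word boxes from pytesseract could be turned into tap coordinates. The names `find_text_centers` and `ocr_words` are hypothetical and the 60 confidence cutoff is an arbitrary assumption. The only pytesseract call used is `image_to_data` with `Output.DICT`.

```python
from typing import Dict, List, Tuple


def find_text_centers(
    data: Dict[str, list], target: str, min_conf: float = 60.0
) -> List[Tuple[int, int]]:
    """Return the center (x, y) of every OCR word box whose text equals `target`.

    `data` is expected to have the shape returned by
    pytesseract.image_to_data(..., output_type=Output.DICT).
    Matching is case-insensitive and limited to single words.
    """
    wanted = target.strip().lower()
    centers = []
    for i, word in enumerate(data["text"]):
        # conf can be a string or a number depending on the pytesseract
        # version; -1 marks boxes that are not words (blocks, lines, ...).
        if float(data["conf"][i]) < min_conf:
            continue
        if word.strip().lower() != wanted:
            continue
        x = data["left"][i] + data["width"][i] // 2
        y = data["top"][i] + data["height"][i] // 2
        centers.append((x, y))
    return centers


def ocr_words(image_path: str) -> Dict[str, list]:
    """Run Tesseract on a full-sized screenshot (hypothetical helper)."""
    # Imported lazily so find_text_centers can be used and tested
    # on machines where Tesseract is not installed.
    import pytesseract
    from PIL import Image

    return pytesseract.image_to_data(
        Image.open(image_path), output_type=pytesseract.Output.DICT
    )
```

With a screenshot saved locally, `find_text_centers(ocr_words("screen.png"), "Play")` would give candidate tap points for a "Play" button. Multi-word labels would need adjacent boxes to be joined first, which this sketch does not attempt.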

As with get_uilayout(), the LLM can sometimes use the OCR functions to get what it needs instead of reading a screenshot, which saves tokens.

jtaborT and others added 2 commits September 19, 2025 16:10

added pytesseract ocr functions

df3d76e

added install instructions for tesseract

03c89a8
