The Wrong Team: Blinded by (Computer) Vision for PDF Processing
As an industry-agnostic AI expert who helps funds diligence early-stage startups, one pattern I see across the board – especially with the popularity of agentic service offerings – is pitches centered on building custom tools for PDF processing via image processing (e.g., computer vision) with LLMs. The fundraise then centers on hiring AI experts and prompt engineers to push the boundaries of computer vision and/or LLM algorithms to bring the MVP to market.
In a panel last month, hosted by Mucker Capital at LATechWeek on the topic of “Building an AI Startup,” I shared (YouTube recording here) why this is a terrible MVP strategy and why I’m skeptical that these teams have the right founder-solution fit to bring their idea to market. Here are three key mistakes – and what to ask during diligence.
Mistake #1: Requiring one or more PhDs in Math/Stats/AI to launch an MVP
When a team requires AI experts to push the envelope on current image processing (e.g., computer vision models), they’re essentially asking the team to earn multiple PhDs’ worth of research along the way; I explain why in an earlier Substack post: “Why Foundational Models are not SaaS Products: Advice for AI Startup Diligence”. They also have to prove that the ideas work in practice, on real customer data – all while building the MVP, i.e., building the plane while flying it, on very short timelines. When quick iteration and a minimal burn rate are critical for early-stage startups, is this even a realistic role or requirement?
Mistake #2: Not all PDFs are created equal
When startups pitch services that are powered by PDF processing behind the scenes, they’re typically focused on bills, invoices, and tax forms – domains where an exact answer is required, and an incorrect response has legal and financial consequences.
When LLMs process PDFs as images, they cannot guarantee exact answers (because they’re probabilistic), so the startup may be one error away from a lawsuit. I show how poorly LLMs do at processing IRS Tax Form 1040 in an earlier Substack post: “Magic LLM Models, Real IRS Penalties: What Tax Form Errors Reveal about the Risks of Blindly Trusting LLMs — and What It Means for AI Diligence”. When LLMs are used as a hammer, treating every problem as a nail, there is no founder-solution fit.
Mistake #3: Forgetting that PDFs are a data format
In the Substack post on LLM comparison and the LATechWeek panel, I point out that PDF is actually a data format (Wikipedia) – and open-source packages have existed for decades to process it, no computer vision or PhDs required.
For processing IRS Form 1040, as I outlined in this Substack, I found an open-source Python package, fillpdf, created by Tyler Houssian and based on this StackOverflow exchange, which did a great job – no LLMs required. LLMs, by contrast, fall flat.
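To make the point concrete, here is a minimal sketch in the spirit of that approach. The field names and file paths below are hypothetical placeholders (the real field keys come from calling fillpdf's get_form_fields on the actual form), and the fillpdf package must be installed separately.

```python
# Sketch: treating a fillable PDF form as structured data and filling it
# deterministically -- no computer vision, no LLMs. Field names here are
# hypothetical placeholders, not the real IRS 1040 field keys.

def build_1040_data(wages: float, interest: float) -> dict[str, str]:
    """Map (hypothetical) form field names to exact string values."""
    return {
        "f1_wages": f"{wages:.2f}",
        "f1_interest": f"{interest:.2f}",
        # Plain arithmetic: exact every time, unlike a probabilistic model.
        "f1_total_income": f"{wages + interest:.2f}",
    }

if __name__ == "__main__":
    # Requires `pip install fillpdf`; file names are illustrative.
    from fillpdf import fillpdfs

    # List the form's actual field names -- an exact read of the data format.
    print(fillpdfs.get_form_fields("f1040.pdf"))

    # Write the values into a filled copy of the form.
    fillpdfs.write_fillable_pdf(
        "f1040.pdf", "f1040_filled.pdf", build_1040_data(50_000.00, 123.45)
    )
```

Note the division of labor: the arithmetic and field mapping are ordinary deterministic code, and the package only handles reading and writing the PDF's form fields.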
When the goal of an MVP is to test the market quickly – for need and willingness to pay – startups need the right tool for the job, not shiny-object syndrome. Teams with shiny-object syndrome are not the right teams.