Why AI Research Stays Shallow -- Seven Barriers I Found in Overseas Company Investigations

AI-powered research has seven structural weaknesses: choosing information sources, constructing search queries, selecting search engines, fetching web pages, geo-blocking, reading PDFs, and managing tokens. Here’s what I’ve noticed from conducting overseas company investigations.

I once asked ChatGPT to “list 20 food manufacturers in Thailand.” What I got back was a list of well-known companies with English-language websites.

“I could have gotten the same results from a Google search myself,” I thought.

Overseas company research is my job. Using AI as a tool while conducting investigations, I noticed that AI hits the same walls every time, in the same predictable pattern. Let me walk through the seven barriers I’ve encountered.

There Are Barriers at All Seven Stages

When you hand research to AI, problems surface at all seven stages – from choosing information sources to managing tokens.

#	Research Stage	Where AI Stumbles
1	Where to look (source selection)	Only does web searches. Can’t predict “this data lives in that database”
2	Search query construction	Doesn’t use AND/OR or phrase search. Doesn’t search in local languages
3	Search engine selection	Doesn’t use Google. Disadvantaged for Asian languages
4	Web page retrieval	Can’t read JavaScript-rendered pages. Can’t operate database interfaces
5	IP-based geo-blocking	Servers are in the US and can’t access locally restricted sites
6	Document reading	Reads every page from top to bottom. Can’t skim
7	Token management	Poor allocation – stops right when it matters

Let me go through each one.

Barrier 1 – AI Doesn’t Know Where the Data Lives

AI starts with a web search no matter what you ask, but much of the data needed for overseas company research can’t be found through web search.

This was the first barrier I noticed.

When asked to “list food manufacturers in Thailand,” I used to start with Google myself. Then one day I discovered that the Thai Department of Industrial Works (DIW) maintains a factory registration database. Search by industry classification code, and you get the complete list of registered factories. The difference was eye-opening.

	What AI Does	What I Usually Do
First action	Web search for “Thailand food manufacturer”	Predict “Thai food factory data should be in the DIW database”
Information source	Google search results	DIW factory database, searched by industry classification code
What you get	10-20 well-known companies with English websites	Complete list of registered factories matching the code
What you miss	Every company without an English website	Registration-based, so coverage is comprehensive

The same pattern shows up in other investigations. “If a product requires certification, the certification database has a manufacturer list.” “US customs data is searchable on ImportYeti by HS code.” This ability to predict where data lives came from conducting hundreds of investigations.
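The "predict the source first" habit can be written down as a lookup that is consulted before any web search. The sketch below is only an illustration: the keys and the `plan_sources` function are hypothetical names, and the source descriptions are the ones mentioned in this article.

```python
# Hypothetical lookup: type of research question -> sources to try first.
# Source names are taken from this article; the keys are illustrative labels.
SOURCE_HINTS = {
    "thai_factory_list": [
        "DIW factory registration database (search by industry classification code)",
    ],
    "certified_product_makers": [
        "Certification database for the relevant mark (manufacturer list)",
    ],
    "us_import_records": [
        "ImportYeti (search by HS code)",
    ],
}


def plan_sources(question_type: str) -> list[str]:
    """Return sources in the order to check them; web search comes last."""
    return SOURCE_HINTS.get(question_type, []) + ["General web search"]


if __name__ == "__main__":
    for source in plan_sources("thai_factory_list"):
        print(source)
```

The point of the design is the ordering: a general web search is the fallback, not the first action.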

Here’s what happened when I actually asked ChatGPT and Perplexity.

ChatGPT's response. Results are dominated by well-known companies found through web search.

Perplexity's response. Source links reveal all information comes from English-language websites. Not a single Thai government database is referenced.

Meanwhile, accessing the DIW database directly with an industry classification code returns data like this.

(Sample entry – Ban Bung Dairy Cooperative, 24/15 Non Chak, Ban Bung District, Chonburi Province. Manufacture of pasteurized butter, processed milk, and yogurt. Machinery: 176.50 HP, Capital: 11.5 million baht, 10 employees.)

Thai Department of Industrial Works (DIW) search results. Shows factory registration numbers, addresses, industry codes, equipment capacity, and employee counts.

Barrier 2 – Search Queries Are Sloppy

AI tends to generate simple English keyword strings as search queries. It doesn’t use AND/OR combinations or phrase search, and it doesn’t search in local languages.

When you ask AI to “research food manufacturers in Thailand,” it starts searching with simple queries like “Thailand food manufacturer” or “Thai food company list.” I would take a more deliberate approach.

	AI’s Search Queries	Queries I Build
Language	English only	Search in Thai (“โรงงานอาหาร” = food factory)
Structure	String of keywords	Combine conditions with AND/OR
Filtering	None	Filter by province, industrial estate, industry code
Phrase search	Not used	Wrap in “” for exact match

For example, searching in Thai for “โรงงานอาหาร” surfaces a large number of local small and medium manufacturers that don’t appear in English-language results. Add a province name – “โรงงานอาหาร ชลบุรี” – and you narrow it to food factories in Chonburi Province.

The quality of your search query changes the number of companies you find by orders of magnitude. AI doesn’t seem to have learned this craft well, and defaults to generic English keywords.
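As a rough illustration of the difference, here is a minimal query builder that combines exact-match phrases with an OR group. It assumes the target search engine accepts double quotes for phrase search and an uppercase OR operator, which is common but not universal. `build_query` and its parameters are hypothetical names.

```python
from typing import Optional


def quote(phrase: str) -> str:
    """Wrap a phrase in double quotes for exact-match search."""
    return f'"{phrase}"'


def build_query(exact: list[str], any_of: Optional[list[str]] = None) -> str:
    """Join exact phrases (implicit AND) and append an optional OR group."""
    parts = [quote(p) for p in exact]
    if any_of:
        parts.append("(" + " OR ".join(quote(t) for t in any_of) + ")")
    return " ".join(parts)


if __name__ == "__main__":
    # Thai "food factory", restricted to Chonburi in either spelling.
    print(build_query(["โรงงานอาหาร"], any_of=["ชลบุรี", "Chonburi"]))
```

Running it prints `"โรงงานอาหาร" ("ชลบุรี" OR "Chonburi")`. Swapping the province term in a loop gives one narrowed query per province.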

Barrier 3 – The Search Engine Isn’t Google

Most AI tools use a search engine other than Google internally. For Japanese and other Asian languages, Google performs significantly better, putting AI-powered search at a disadvantage.

Even if you’ve constructed a local-language query as described in Barrier 2, the engine processing it may be the bottleneck. For English, most search engines deliver comparable results. But for Japanese, Thai, and Vietnamese, the differences become significant.

Consider searching in Thai for “โรงงานอาหาร ชลบุรี” (food factories in Chonburi Province).

Search Language	Google Search	AI’s Built-in Engine
English	Comparable results	Comparable results
Japanese	Government and industry sites rank high	Accuracy varies
Thai	DIW database and local companies rank high	English sites tend to mix in

ChatGPT, Perplexity, and Copilot use non-Google search engines; Gemini, which relies on Google Search, is the exception. Users have no visibility into which engine their AI tool uses. When Asian companies “somehow don’t show up,” the search engine itself is sometimes the reason.

Barrier 4 – AI Can’t Operate Databases

AI can read static web pages, but it can’t fill in forms or fetch JavaScript-rendered pages. Most information sources used in overseas company research hit this barrier.

Here are examples of sources that AI can’t read even when given the URL.

Thai Department of Business Development (DBD, under the Ministry of Commerce) – Enter company name in Thai → download financial data
UL Product iQ – Enter category code to find certified companies
National factory registration databases – Specify industry code and region to list factories

All require entering conditions into a browser form to retrieve data.

Thai Department of Industrial Works (DIW) factory search form. Requires entering industry codes and province names in Thai.

The fact that data “exists” on the web and the fact that AI can “access” it are two different things.
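One way to spot this gap early is to inspect the raw HTML a plain fetch returns. The sketch below is a heuristic, not a reliable detector: it uses only Python's standard `html.parser`, works on HTML you have already downloaded, and the 200-character threshold and the three labels are assumptions chosen for illustration.

```python
from html.parser import HTMLParser


class PageProbe(HTMLParser):
    """Count forms, scripts, and visible text characters in an HTML page."""

    def __init__(self):
        super().__init__()
        self.forms = 0
        self.scripts = 0
        self.text_chars = 0
        self._skip = 0  # nesting depth inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self.forms += 1
        if tag == "script":
            self.scripts += 1
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        # Only count text a reader would actually see.
        if not self._skip:
            self.text_chars += len(data.strip())


def classify(html: str, min_text: int = 200) -> str:
    """Guess whether the data is in the HTML or behind a form / script."""
    probe = PageProbe()
    probe.feed(html)
    probe.close()
    if probe.forms and probe.text_chars < min_text:
        return "form-gated"
    if probe.scripts and probe.text_chars < min_text:
        return "likely-js-rendered"
    return "static"
```

A page classified as "form-gated" or "likely-js-rendered" is a signal that a person, or a real browser session, has to operate it.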

Barrier 5 – Some Sites Can’t Be Opened from US Servers

ChatGPT, Claude, Perplexity, and other AI tools most likely access the web from servers in the United States. Sites that are only available within the local country are simply unreachable.

Some Thai and Indonesian government websites apply IP-based geo-blocking, allowing access only from domestic IP addresses.

Thai Department of Industrial Works (DIW) – Some search functions are restricted to Thai IPs
Indonesian company registration databases – Some pages are domestic-access only
Chinese corporate credit information sites – CAPTCHAs and connection limits are imposed on overseas access

I live in Thailand, so Thai government sites work fine for me. But as long as AI servers are located in the US, data from these sites is out of reach. AI tools don’t have VPN capabilities to work around these restrictions.

Barrier 6 – AI Can’t Skim a PDF

When you give AI a PDF, it starts reading from page 1. It can’t scan the table of contents and jump to the relevant chapter.

The PDFs I encounter in overseas research commonly exceed 100 pages. Compare how I read versus how AI reads.

How I read – Check the table of contents → open only the relevant chapters → pull out just the tables and charts
How AI reads – Process every page from the beginning → hit the token limit around pages 20 to 30

The data tables in the latter half are never reached. Reading every page is something even I almost never do.
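The table-of-contents approach can be sketched as a small page-selection step that runs before any page is read. This assumes the table of contents has already been extracted as (title, start page) pairs; that extraction needs a PDF library and is not shown. The chapter titles and page numbers in the example are made up.

```python
def pages_to_read(toc, total_pages, keywords):
    """Select chapters whose title matches any keyword.

    toc: list of (title, start_page), 1-indexed and sorted by start_page.
    Returns a list of (title, first_page, last_page).
    """
    selected = []
    for i, (title, start) in enumerate(toc):
        # A chapter ends one page before the next one starts.
        end = toc[i + 1][1] - 1 if i + 1 < len(toc) else total_pages
        if any(k.lower() in title.lower() for k in keywords):
            selected.append((title, start, end))
    return selected


if __name__ == "__main__":
    # Hypothetical 120-page report.
    toc = [
        ("Executive Summary", 1),
        ("Market Overview", 5),
        ("Manufacturer List", 62),
        ("Appendix: Data Tables", 98),
    ]
    for chapter in pages_to_read(toc, 120, ["manufacturer", "data tables"]):
        print(chapter)
```

With this input, only pages 62 to 97 and 98 to 120 are selected, so the data tables in the latter half are reached without reading the first 61 pages.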

Barrier 7 – Poor Token Allocation Stops Research Mid-Stream

AI has a hard limit on how much information it can process at once (the token limit). The problem isn’t the limit itself – it’s that AI is bad at allocating tokens.

I can glance at 50 search results and decide “only these 5 are worth reading.” AI struggles with this triage, reading results one by one in order. Around the 10th to 20th result, it hits the token limit.

Right when it’s getting close to the core information, it stops and asks “shall I continue?” You asked it to investigate, but it stopped halfway. Deciding where to concentrate tokens is something AI still finds difficult, and this is one root cause of the feeling that “AI research is shallow.”
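The triage I do by eye can be approximated as a budgeted selection: rank results by estimated relevance, then read only what fits. This is a minimal sketch under stated assumptions: the relevance scores and token estimates are supplied by the caller, and `Result`, `triage`, and the 0.5 cutoff are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Result:
    label: str
    relevance: float  # 0..1, estimated from title and snippet only
    est_tokens: int   # estimated cost of reading the full page


def triage(results, budget, min_relevance=0.5):
    """Pick results by descending relevance until the budget is spent."""
    chosen, spent = [], 0
    for r in sorted(results, key=lambda r: r.relevance, reverse=True):
        if r.relevance < min_relevance:
            break  # everything after this is even less relevant
        if spent + r.est_tokens <= budget:
            chosen.append(r)
            spent += r.est_tokens
    return chosen, spent


if __name__ == "__main__":
    results = [
        Result("a", 0.9, 4000),
        Result("b", 0.2, 1000),
        Result("c", 0.8, 5000),
        Result("d", 0.7, 3000),
    ]
    chosen, spent = triage(results, budget=8000)
    print([r.label for r in chosen], spent)
```

With an 8,000-token budget this selects "a" and "d" for 7,000 tokens: "c" does not fit after "a", and "b" falls below the relevance cutoff. Reading in listed order would have spent the budget on "a" and "b" and then stopped before "c" or "d".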

The Core Issue Behind All Seven Barriers Is Research Design

What AI lacks isn’t processing power – it’s the judgment to decide “where to look next.”

The “deep research” features being marketed by AI companies are evolving toward more searches and more pages read. But I think this is just spreading wider, not going deeper.

True depth means following one lead to the next, drilling down through multiple layers.

Discover a company name
→ Check certification DB for model numbers
→ Identify the OEM parent
→ Verify factory operations in local language
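The chain above can be sketched as a sequence of lookups where each finding becomes the input to the next step. Everything here is a placeholder: the three dictionaries stand in for a certification database, an OEM lookup, and a local-language check, and the company and model names are invented for illustration.

```python
from typing import Callable, Optional

Step = Callable[[str], Optional[str]]


def drill_down(lead: str, steps: list[tuple[str, Step]]) -> list[tuple[str, str]]:
    """Feed each step's finding into the next; stop when a step finds nothing."""
    trail = []
    for name, step in steps:
        found = step(lead)
        if found is None:
            break
        trail.append((name, found))
        lead = found
    return trail


if __name__ == "__main__":
    # Placeholder data standing in for real sources.
    cert_db = {"Example Foods Co.": "model EX-100"}
    oem_db = {"model EX-100": "Example Parent Ltd."}
    local_check = {"Example Parent Ltd.": "factory confirmed operating"}

    steps = [
        ("certification DB", cert_db.get),
        ("OEM parent", oem_db.get),
        ("local-language verification", local_check.get),
    ]
    for name, finding in drill_down("Example Foods Co.", steps):
        print(name, "->", finding)
```

Running more searches widens the first step. Depth comes from the length of the trail, and choosing which step comes next is the part that still needs a person.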

This judgment of “where to dig next” is what determines research depth. I leverage AI’s processing speed and multilingual comprehension while keeping a human in charge of source selection, search engine choice, and database access. A tool is only as good as the person using it. That’s what trial and error have taught me.

What Gets Missed When You Treat AI Search Results as “Research Complete”

Asking AI to “look this up” is natural. But treating the results as a finished investigation means gaps will remain.

#	Barrier	What Happens as a Result
1	Source selection	Every company not on the open web is missed
2	Search query construction	Information in local languages is entirely missed
3	Search engine	Search accuracy drops for Asian languages
4	Page retrieval constraints	Government and certification database data can’t be extracted
5	Geo-blocking	Locally restricted sites are inaccessible
6	Document reading	100-page reports can’t be read through
7	Token allocation	Research stops right when it matters

AI has the power to gather information broadly. Add human judgment about “which databases to use,” “which languages to search in,” and “how to interpret the data,” and the depth of your research changes.

About the Author

Takashi Kinoshita – CEO, Taitonmai Co., Ltd.

Graduate degree from a national university
8 years in procurement at a major Japanese electronics manufacturer
Including 2 years stationed at the company’s Thailand factory as Procurement Section Manager, managing local staff as the sole Japanese manager
Experienced in procurement operations conducted in English, Chinese, and Thai
After founding Taitonmai, has conducted corporate investigations spanning 80+ countries and 10,000+ companies

https://taitonmai.co.jp/en/column/20260216_02/
