Why AI engines ignore most university websites
ChatGPT, Perplexity and Google AI Overviews do not rank web pages. They synthesise answers from vast corpora and cite sources they deem reliable, structured and factually verifiable. Most university content fails on all three counts.
In the United States, only 29% of ChatGPT responses about higher education name a specific college or university (Source: Skolbot GEO Monitoring, 500 queries x 6 countries x 3 AI engines, Feb 2026). On Perplexity, the figure rises to 38%, with mentions concentrated among institutions that are consistently documented in Common App, College Scorecard and U.S. News. The remaining 62-71% of responses are generic summaries with no institutional mention. Your content exists online, but AI engines cannot extract anything citable from it.
Four factors separate citable content from invisible content: technical structure, data specificity, source authority and answer clarity. Each one is within your enrolment marketing team's control, whether you are a flagship state university, a regional campus or a private college.
What makes content "citable" by an LLM
Structure beats length every time
An LLM does not read a page start to finish. It extracts answer fragments from recognisable patterns: question-answer pairs, comparison tables, definitions framed by semantic markup. A 3,000-word article with no clear structure is less likely to be cited than an 800-word page with descriptive H2s, a data table and a marked-up FAQ.
Structural signals that LLMs exploit:
| Signal | Impact on citability | Implementation difficulty |
|---|---|---|
| FAQ marked up in JSON-LD | High: direct extraction | Low |
| Tables with descriptive headers | High: comparable data | Low |
| H2/H3 phrased as questions | Medium: semantic matching | Low |
| Schema.org EducationalOrganization | High: entity identification | Medium |
| Sourced numerical data | High: verifiable facts | Medium |
Specificity wins over superlatives
Content claiming "our university offers world-class programmes" will never be cited. Content stating "94% of 2025 graduates found employment or graduate study within 6 months, median earnings $58,000, institutional outcomes survey aligned with IPEDS reporting, 418 respondents" can be extracted as factual evidence.
Data points AI engines actively look for on US university websites:
- Graduation and employment outcomes, with methodology and sample size
- Tuition, fees and total cost of attendance by programme or residency status
- Official accreditations and recognitions (regional accreditation, AACSB, ABET, CCNE)
- Rankings with source and year (U.S. News, QS, THE)
- Student numbers, acceptance rates, Common App participation, campus location and housing
4 techniques to make your content citable
1. Implement Schema.org on key pages
Universities with structured Schema.org markup achieve an average of +12 visibility points in AI engine responses (Source: Skolbot GEO Monitoring, 500 queries x 6 countries x 3 AI engines, Feb 2026). The EducationalOrganization markup transforms your institution from a block of text into an identifiable entity. The Course schema does the same for each major, degree or graduate programme.
For the full technical implementation guide, see our Schema.org guide for universities.
The minimum implementation covers three schemas:
- EducationalOrganization on your homepage and About page
- Course on each programme page
- FAQPage on FAQ pages and blog articles containing Q&A sections
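A minimal sketch of the first schema, placed on the homepage; the institution name, URL, address and profile links below are illustrative placeholders, not real data:

```html
<!-- Homepage <head>: identifies the institution as an entity (all values illustrative) -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "EducationalOrganization",
  "name": "Example State University",
  "url": "https://www.example.edu",
  "logo": "https://www.example.edu/assets/logo.png",
  "address": {
    "@type": "PostalAddress",
    "addressLocality": "Springfield",
    "addressRegion": "TX",
    "addressCountry": "US"
  },
  "sameAs": [
    "https://en.wikipedia.org/wiki/Example_State_University",
    "https://www.linkedin.com/school/example-state-university"
  ]
}
</script>
```

The sameAs links do real work here: they tie your site to the external listings AI engines already cross-reference.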
The fields that matter most to LLMs: accreditation, numberOfStudents, aggregateRating, alumni and programPrerequisites. ChatGPT cross-references these data points against public listings such as Common App and College Scorecard, and against the transparency expectations set by FERPA guidance and the Federal Trade Commission's consumer-protection rules, to validate reliability. If your site overstates job outcomes or hides costs in PDFs, that inconsistency works against your citation probability.
2. Structure every page with direct answers
AI engines operate on a question-answer model. To maximise citation probability, each H2 should pose or imply a question, and the first 1-2 sentences must answer it directly. The rest of the paragraph adds context and nuance.
Before:
"Our College of Business is known for academic excellence and strong industry connections across the country."
After:
"The BS in Business Analytics at [University] is a 4-year programme costing $24,900 per year for in-state students, with 94% career outcomes at 6 months and a median starting salary of $58,000 (2025 outcomes survey, 418 respondents). It is AACSB-accredited, available through the Common App, and includes a required internship in the junior or senior year."
The second version contains six verifiable data points. The first contains none.
3. Build comparison tables with your data
Tables are the most extractable format for an LLM. A clean table with clear headers and numerical data will be preferred over a narrative paragraph containing the same information.
Example of a citable table for a programme page:
| Criterion | BS Business Analytics | MBA |
|---|---|---|
| Duration | 4 years | 2 years |
| Annual tuition | $24,900 | $38,500 |
| Employment rate at 6 months | 94% | 96% |
| Median starting salary | $58,000 | $92,000 |
| Accreditation | AACSB | AACSB |
| Intake size | 180 | 55 |
Publish these tables on your programme pages, not buried in downloadable brochures. AI engines do not reliably read PDFs hidden behind lead forms.
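When you publish the comparison as semantic HTML rather than an image or PDF, a caption and scoped header cells make each figure machine-readable. A truncated sketch of the table above:

```html
<table>
  <caption>BS Business Analytics vs MBA: key figures</caption>
  <thead>
    <tr>
      <th scope="col">Criterion</th>
      <th scope="col">BS Business Analytics</th>
      <th scope="col">MBA</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th scope="row">Annual tuition</th>
      <td>$24,900</td>
      <td>$38,500</td>
    </tr>
    <tr>
      <th scope="row">Employment rate at 6 months</th>
      <td>94%</td>
      <td>96%</td>
    </tr>
    <!-- remaining rows follow the same pattern -->
  </tbody>
</table>
```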
4. Add marked-up FAQ sections
An FAQ section serves two purposes: it answers the questions prospects ask AI engines, and FAQPage JSON-LD markup enables structured extraction.
The common mistake is writing marketing FAQs ("Why choose our university?") instead of informational FAQs ("What GPA and test policy apply?", "Is the programme on Common App?", "What is the net price after aid?"). AI engines favour the latter.
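A minimal FAQPage sketch using two of the informational questions above; the answer text is placeholder copy and must mirror the FAQ visible on the page:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "Is the programme on Common App?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Yes. First-year and transfer applicants can apply through the Common App."
      }
    },
    {
      "@type": "Question",
      "name": "What GPA and test policy apply?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "State your actual admitted-student GPA range and current test-optional policy here."
      }
    }
  ]
}
</script>
```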
To diagnose your current situation, use our ChatGPT visibility diagnostic tool.
How to measure whether your content is being cited
Checking whether AI engines cite your university requires a systematic approach.
3-step testing protocol
1. Identify your 20 strategic queries: the questions prospects ask about your institution, programmes, city and sector. Examples: "best business school in Texas", "ABET-accredited computer science degree Florida", "tuition [university] 2026".
2. Test across 3 AI engines: submit each query to ChatGPT, Perplexity and Gemini. Record whether your institution is mentioned, whether the information is accurate, and whether sources are cited.
3. Track monthly evolution: LLM corpora evolve. Content published or updated today may take 4-8 weeks to be integrated. Measure monthly to identify trends.
Key metrics to track
| Metric | Target | Measurement frequency |
|---|---|---|
| Mention rate (brand queries) | >80% | Monthly |
| Mention rate (generic queries) | >20% | Monthly |
| Accuracy of cited information | 100% | Monthly |
| Sources cited (Perplexity) | >2 pages from your site | Monthly |
For a complete methodology on tracking your AI visibility, see our GEO guide for universities.
Before and after: optimising a programme page
A regional US university wanted ChatGPT to mention its MS in Cybersecurity when prospects asked "best cybersecurity master's in the Southeast".
Before optimisation:
- Programme page without Schema.org
- Narrative text with no numerical data
- No FAQ section
- No comparison table
Result: ChatGPT never mentioned the university for this query.
After optimisation:
- Course markup with educationalLevel, provider and accreditation (sketched after this list)
- Table with tuition, duration, employment rate and median salary
- Marked-up FAQ with 5 questions (entry requirements, STEM designation, internship, class size, tuition)
- Link to College Scorecard and programme accreditation sources as authoritative references
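A sketch of the Course markup from the first bullet. Schema.org has no accreditation property on Course itself, so one common pattern is to express accreditation through educationalCredentialAwarded and its recognizedBy field; the university, accreditor and description below are illustrative:

```html
<!-- Programme page: all values illustrative placeholders -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Course",
  "name": "MS in Cybersecurity",
  "description": "STEM-designated master's degree in cybersecurity.",
  "educationalLevel": "Graduate",
  "provider": {
    "@type": "CollegeOrUniversity",
    "name": "Example State University",
    "url": "https://www.example.edu"
  },
  "educationalCredentialAwarded": {
    "@type": "EducationalOccupationalCredential",
    "credentialCategory": "degree",
    "recognizedBy": {
      "@type": "Organization",
      "name": "Example Regional Accrediting Commission"
    }
  }
}
</script>
```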
Result at 8 weeks: ChatGPT cites the university in 3 out of 5 responses for the same query. Perplexity links to the programme page as a source in 4 out of 5 cases.
This correlation between structured markup and citability holds across our full panel. For the technical mechanisms, our article on structured data for universities details each schema.
FAQ
How do I check if ChatGPT already cites my university?
Test 20 strategic queries directly in ChatGPT. Record every mention of your institution, the accuracy of the data and the presence of links. Repeat monthly to track changes. Perplexity is simpler to audit because it displays its sources beneath each response.
How long before optimised content gets cited?
Between 4 and 8 weeks after publication or modification. LLM corpora are updated in waves. Perplexity is more reactive because it queries the live web, but the page still needs strong structure and trustworthy sources.
Is Schema.org markup enough to get cited?
No, but it is necessary. Markup identifies your university as a verifiable entity. Without it, AI engines must extract this information from raw text, with a high error rate. Markup alone does not replace specific, data-rich, well-structured content.
Should I optimise for ChatGPT or Perplexity first?
Both, as the techniques overlap. But if you must prioritise, start with Perplexity: it cites sources explicitly, making tracking straightforward. Optimisations that work for Perplexity also benefit ChatGPT.
Which pages on my site should I optimise first?
Your homepage, the three most-enquired programme pages, your admissions page and your FAQ page. In the US context, prioritise the pages where prospects compare tuition, accreditation, Common App access, outcomes and programme modality.
Is your university cited by ChatGPT? Test your AI visibility for free