-
LLMs as Evaluators - Who Watches the Watchers?
As LLMs increasingly evaluate other LLMs, grade student work, and assess human performance, we create a circular system where artificial intelligence defines its own success criteria. The implications extend far beyond technical metrics to fundamental questions about authority, standards, and who gets to decide what constitutes quality.
-
Red Teaming AI for Social Good - Testing for Hidden Biases in the Age of Generative AI
As generative AI systems become integral to our digital lives, UNESCO's Red Teaming playbook underscores the urgent need for systematic bias testing. But should we test for biases or accept them as reflections of human complexity? How we answer raises fundamental questions about fairness, representation, and the future of AI for social good.
-
Can LLMs Be Unbiased? - The Dictionary Dilemma and the Weight of the World's Opinions
Large Language Models inherit the biases of human civilization even as they are presented as objective. But should they be neutral arbiters or faithful mirrors of human complexity? The answer turns on fundamental questions about truth, representation, and the nature of knowledge itself.
-
Teaching LLMs Like Teaching Kids to Ride - Why Analytical Tasks Need Focused Instruction
Just as teaching a child to ride a bike requires clear, focused instruction rather than an overwhelming flood of information, effective LLM prompt engineering for analytical tasks demands precision, specificity, and structured guidance to overcome cognitive biases and achieve reliable results.
-
The Representation Crisis - How LLM-Based Synthetic Users Obscure Rather Than Illuminate User Understanding
The proliferation of LLM-generated synthetic users in design and research creates a fundamental crisis of representation that undermines the very purpose of user-centered design. This analysis exposes the clarity deficit inherent in synthetic user generation and its profound implications for design validity.