Appendix E — Further Reading

This book is practitioner-focused, but its ideas are grounded in research. The references below point to the studies, papers, and books behind the key claims. They are organised by topic so you can follow up on whatever interests you most.

This is not a comprehensive literature review. It is a trail of breadcrumbs for curious readers.

E.1 Understanding AI and Large Language Models

Supporting Chapters 1–2: What Is AI and What Are Large Language Models

How LLMs work (prediction as the core mechanism):

  • Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. In Advances in Neural Information Processing Systems, 30, 5998–6008. The foundational paper introducing the transformer architecture that underpins all modern LLMs.

  • Bommasani, R., Hudson, D. A., Adeli, E., … & Liang, P. (2021). On the opportunities and risks of foundation models. arXiv preprint arXiv:2108.07258. Defines the category of “foundation models” and maps their capabilities, risks, and societal implications.

  • Brown, T. B., Mann, B., Ryder, N., … & Amodei, D. (2020). Language models are few-shot learners. In Advances in Neural Information Processing Systems, 33, 1877–1901. The GPT-3 paper demonstrating that giving a model a few examples in the prompt dramatically improves task performance.

Hallucination:

  • Ji, Z., Lee, N., Frieske, R., … & Fung, P. (2023). Survey of hallucination in natural language generation. ACM Computing Surveys, 55(12), Article 248. A comprehensive overview of why LLMs generate plausible but false content.

Bias and fairness:

  • Gallegos, I. O., Rossi, R. A., Barrow, J., … & Ahmed, N. K. (2024). Bias and fairness in large language models: A survey. Computational Linguistics, 50(3), 1097–1179. How biases in training data manifest in LLM outputs, and the limitations of current mitigation approaches.

  • Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). On the dangers of stochastic parrots: Can language models be too big? In Proceedings of FAccT ’21, 610–623. ACM. An influential early critique of ever-larger models, covering training data bias, environmental cost, and the risk of mistaking fluent text for understanding.

Deep learning foundations:

  • LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444. The landmark review providing accessible context for the layered pattern recognition that makes LLMs possible.

  • Zhao, W. X., Zhou, K., Li, J., … & Wen, J.-R. (2023). A survey of large language models. arXiv preprint arXiv:2303.18223. A comprehensive survey of the LLM landscape.

E.2 Prompt Engineering and Structured Communication

Supporting Chapters 3–6: Getting Started, First Steps, Seven Techniques, and Managing Context

Structured prompts outperform unstructured ones:

  • Sahoo, P., Singh, A. K., Saha, S., Jain, V., Mondal, S., & Chadha, A. (2024). A systematic survey of prompt engineering in large language models: Techniques and applications. arXiv preprint arXiv:2402.07927. The broader landscape in which frameworks like CRAFT, RTCF, and CO-STAR sit.

  • Federiakin, D., Molerov, D., Zlatkin-Troitschanskaia, O., & Maur, A. (2024). Prompt engineering as a new 21st century skill. Frontiers in Education, 9, 1366434. Makes the case that structured prompting is a transferable professional skill, not a niche technical ability.

  • Knoth, N., Tolzin, A., Janson, A., & Leimeister, J. M. (2024). AI literacy and its implications for prompt engineering strategies. Computers and Education: Artificial Intelligence, 6, 100225.

Chain-of-thought and reasoning:

  • Wei, J., Wang, X., Schuurmans, D., … & Zhou, D. (2022). Chain-of-thought prompting elicits reasoning in large language models. In Advances in Neural Information Processing Systems, 35, 24824–24837. The paper that formalised “show your working” as a prompting strategy.

  • Wang, X. et al. (2023). Self-consistency improves chain of thought reasoning in language models. ICLR 2023. Sampling multiple reasoning paths and selecting the most consistent answer improves accuracy.

Task decomposition and prompt chaining:

  • Zhou, D. et al. (2023). Least-to-most prompting enables complex reasoning in large language models. ICLR 2023. Breaking complex problems into sequential subproblems significantly improves accuracy.

RE2 (Re-Reading) prompting:

  • Xu, Y. et al. (2024). Re-reading improves reasoning in large language models. EMNLP 2024. Repeating the question in the prompt creates pseudo-bidirectional attention, improving reasoning accuracy; reading it twice is the sweet spot.

Iterative refinement over single-shot prompting:

  • Madaan, A. et al. (2023). Self-Refine: Iterative refinement with self-feedback. NeurIPS 2023. Iterative refinement consistently outperforms single-pass generation.

Teaching strategies with AI prompts:

  • Mollick, E. R., & Mollick, L. (2023). Assigning AI: Seven approaches for students, with prompts. arXiv preprint. Seven structured approaches to using AI for learning, including role play, debate, and self-testing.

  • Mollick, E. R., & Mollick, L. (2023). Using AI to implement effective teaching strategies in classrooms: Five strategies, including prompts. SSRN.

  • Mollick, E. R., & Mollick, L. (2024). Instructors as innovators: A future-focused approach to new AI learning opportunities, with prompts. SSRN.

E.3 Critical Evaluation and Staying Sceptical

Supporting Chapters 8 and 9: Critique Toolkit and Ethics, Data Governance & Integrity

Sycophancy in LLMs:

  • Sharma, M. et al. (2023). Towards understanding sycophancy in language models. arXiv preprint. How LLMs systematically tailor responses to match user beliefs, even when those beliefs are incorrect.

  • Perez, E. et al. (2023). Discovering language model behaviors with model-written evaluations. ACL 2023. Evidence of sycophantic behaviour across multiple model families and scales.

The AI Dismissal Fallacy:

  • Claessens, S., Veitch, P., & Everett, J. A. C. (2026). Negative perceptions of outsourcing to artificial intelligence. Computers in Human Behavior, 177, 108894. People systematically devalue work when they learn AI was involved.

Information literacy and lateral reading:

  • Wineburg, S. et al. (2022). Lateral reading and the nature of expertise. Teachers College Record. Experts verify claims by checking sources laterally rather than reading vertically.

E.4 Cognitive Offloading and AI Dependency

Supporting the book’s core argument: partner, don’t delegate

Cognitive offloading:

  • Risko, E. F., & Gilbert, S. J. (2016). Cognitive offloading. Trends in Cognitive Sciences, 20(9), 676–688. How humans use external tools to reduce cognitive demand, and when this helps versus hinders learning.

  • Sparrow, B., Liu, J., & Wegner, D. M. (2011). Google effects on memory: Cognitive consequences of having information at our fingertips. Science, 333(6043), 776–778. Access to searchable information changes what we bother to remember.

  • Hooper, V. J. (2025). Cognitive offloading and the reshaping of human thought: The subtle influence of artificial intelligence. Revista de Pensamiento y Cultura (Colloquia), 12, 1–14.

The generation effect (producing information improves retention):

  • Slamecka, N. J., & Graf, P. (1978). The generation effect: Delineation of a phenomenon. Journal of Experimental Psychology: Human Learning and Memory. Actively generating information leads to better memory than passively receiving it.

AI and learning outcomes:

  • Bastani, H., Bastani, O., Sungu, A., … & Mariman, R. (2025). Generative AI without guardrails can harm learning: Evidence from high school mathematics. Proceedings of the National Academy of Sciences, 122(26), e2422633122. Students using AI without guardrails perform worse on subsequent unaided tasks.

Metacognitive laziness:

  • Fan, Y., Tang, L., Le, H., … & Gašević, D. (2025). Beware of metacognitive laziness: Effects of generative artificial intelligence on learning motivation, processes, and performance. British Journal of Educational Technology, 56(2), 489–530.

  • Gerlich, M. (2025). AI tools in society: Impacts on cognitive offloading and the future of critical thinking. Societies, 15(1), Article 6.

Cognitive surrender:

  • Shaw, S. D., & Nave, G. (2026). Thinking — fast, slow, and artificial: How AI is reshaping human reasoning and the rise of cognitive surrender. Working paper, The Wharton School.

AI does not reduce work:

  • Ranganathan, A., & Ye, X. M. (2026, February 9). AI doesn’t reduce work — it intensifies it. Harvard Business Review.

E.5 Assessment Design and Academic Integrity

Supporting Chapters 10–11, 15, and 18: Process Assessment, Self-Assessment, Group Assessment, and Assessment Design

Assessment reform for the AI era:

  • Lodge, J. M., Howard, S., Bearman, M., Dawson, P., & Associates. (2023). Assessment reform for the age of artificial intelligence. TEQSA. The discussion paper that framed the Australian higher education response.

  • Swiecki, Z., Khosravi, H., Chen, G., … & Gašević, D. (2022). Assessment in the age of artificial intelligence. Computers and Education: Artificial Intelligence, 3, Article 100075.

  • Corbin, T., Dawson, P., & Liu, D. (2025). Talk is cheap: Why structural assessment changes are needed for a time of GenAI. Assessment & Evaluation in Higher Education, 50(7), 1087–1097.

  • Perkins, M., & Roe, J. (2025). The end of assessment as we know it: GenAI, inequality and the future of knowing. In AI and the future of education: Disruptions, dilemmas and directions, 76–80.

Validity over detection:

  • Dawson, P., Bearman, M., Dollinger, M., & Boud, D. (2024). Validity matters more than cheating. Assessment & Evaluation in Higher Education, 49(7), 1005–1016.

  • Corbin, T., Dawson, P., Nicola-Richmond, K., & Partridge, H. (2025). ‘Where’s the line? It’s an absurd line’: Towards a framework for acceptable uses of AI in assessment. Assessment & Evaluation in Higher Education, 50(5), 705–717.

Rubric design:

  • Dawson, P. (2017). Assessment rubrics: Towards clearer and more replicable design, research and practice. Assessment & Evaluation in Higher Education, 42(3), 347–360.

  • Perkins, M., Furze, L., Roe, J., & MacVaugh, J. (2024). The Artificial Intelligence Assessment Scale (AIAS): A framework for ethical integration of generative AI in educational assessment. Journal of University Teaching and Learning Practice, 21(6).

Evaluative judgement:

  • Bearman, M., Tai, J., Dawson, P., Boud, D., & Ajjawi, R. (2024). Developing evaluative judgement for a time of generative artificial intelligence. Assessment & Evaluation in Higher Education, 49(6), 893–905.

  • Boud, D., Ajjawi, R., Dawson, P., & Tai, J. (Eds.). (2018). Developing evaluative judgement in higher education: Assessment for knowing and producing quality work. Routledge.

Authentic assessment:

  • Villarroel, V., Bloxham, S., Bruna, D., Bruna, C., & Herrera-Seda, C. (2018). Authentic assessment: Creating a blueprint for course design. Assessment & Evaluation in Higher Education, 43(5), 840–854.

Retrieval practice:

  • Roediger, H. L., & Butler, A. C. (2011). The critical role of retrieval practice in long-term retention. Trends in Cognitive Sciences. Testing yourself improves learning more than re-reading.

Oral examination as alternative assessment:

  • Hartmann, C. (2025). Oral exams for a generative AI world: Managing concerns and logistics for undergraduate humanities instruction. College Teaching.

  • Buehler, M. J., & Schneider, L. U. (2009). Speak up! Oral examinations and political science. Journal of Political Science Education, 5(4), 315–331.

Bloom’s taxonomy and cognitive levels:

  • Anderson, L. W., & Krathwohl, D. R. (Eds.). (2001). A taxonomy for learning, teaching, and assessing: A revision of Bloom’s taxonomy of educational objectives (Complete ed.). Longman.

  • Webb, N. L. (2002). Depth-of-knowledge levels for four content areas. Wisconsin Center for Education Research.

E.6 AI in Higher Education Policy and Practice

Supporting Chapters 9, 13, and 20–21: Ethics, Unit Design, Accessibility, and Global Perspectives

Institutional frameworks:

  • Chan, C. K. Y. (2023). A comprehensive AI policy education framework for university teaching and learning. International Journal of Educational Technology in Higher Education, 20, 38.

  • UNESCO. (2023). Guidance for generative AI in education and research.

  • Sabzalieva, E., & Valentini, A. (2023). ChatGPT and artificial intelligence in higher education: Quick start guide. UNESCO IESALC.

  • Russell Group. (2023). Russell Group principles on the use of generative AI tools in education.

  • European Parliament & Council of the European Union. (2024). Regulation (EU) 2024/1689 laying down harmonised rules on artificial intelligence (Artificial Intelligence Act).

  • OECD. (2025). Empowering learners for the age of AI: An AI literacy framework.

AI and pedagogy:

  • Bearman, M., & Ajjawi, R. (2023). Learning to work with the black box: Pedagogy for a world with artificial intelligence. British Journal of Educational Technology, 54(5), 1160–1173.

  • Kasneci, E., Sessler, K., Küchemann, S., … & Kasneci, G. (2023). ChatGPT for good? On opportunities and challenges of large language models for education. Learning and Individual Differences, 103, Article 102274.

  • Labadze, L., Grigolia, M., & Machaidze, L. (2023). Role of AI chatbots in education: Systematic literature review. International Journal of Educational Technology in Higher Education, 20, Article 56.

Critical AI literacy:

  • Roe, J., Furze, L., & Perkins, M. (2025). Digital plastic: A metaphorical framework for Critical AI Literacy in the multiliteracies era. Pedagogies: An International Journal.

  • Madsen, D. O., & Puyt, R. W. (2025). When AI turns culture into slop. AI & Society.

E.7 Human-AI Collaboration and the Future of Work

Supporting Chapters 7, 12, and 17: Flight Simulator, Virtual Company, and Advanced Frontiers

  • Wilson, H. J., & Daugherty, P. R. (2018). Collaborative intelligence: Humans and AI are joining forces. Harvard Business Review, 96(4), 114–123. The greatest performance gains come from structured human-AI collaboration, not AI alone.

  • Dellermann, D., Ebel, P., Söllner, M., & Leimeister, J. M. (2019). Hybrid intelligence. Business & Information Systems Engineering, 61(5), 637–643.

  • Mosqueira-Rey, E., Hernández-Pereira, E., Alonso-Ríos, D., Bobes-Bascarán, J., & Fernández-Leal, Á. (2023). Human-in-the-loop machine learning: A state of the art. Artificial Intelligence Review, 56(4), 3005–3054.

  • Autor, D. H. (2015). Why are there still so many jobs? The history and future of workplace automation. Journal of Economic Perspectives, 29(3), 3–30.

  • Acemoglu, D., & Restrepo, P. (2019). Automation and new tasks: How technology displaces and reinstates labor. Journal of Economic Perspectives, 33(2), 3–30.

  • Deming, D. J. (2017). The growing importance of social skills in the labor market. The Quarterly Journal of Economics, 132(4), 1593–1640.

  • Bloom, B. S. (1984). The 2 sigma problem: The search for methods of group instruction as effective as one-to-one tutoring. Educational Researcher, 13(6), 4–16. The research showing that personalised tutoring produces dramatic learning gains — the aspiration behind AI-as-tutor approaches.

Agentic AI:

  • Wang, L., Ma, C., Feng, X., … & Wen, J.-R. (2024). A survey on large language model based autonomous agents. Frontiers of Computer Science, 18(6), Article 186345.

  • Shavit, Y., Agarwal, S., Brundage, M., … & Robinson, D. G. (2023). Practices for governing agentic AI systems. OpenAI white paper.

  • Tabassi, E. (2023). Artificial intelligence risk management framework (AI RMF 1.0). NIST AI 100-1.

E.8 General Background

For readers who want a broader foundation in how AI systems work and how to think about their role in society:

  • Mitchell, M. (2019). Artificial intelligence: A guide for thinking humans. Farrar, Straus and Giroux. An accessible, rigorous introduction to AI for non-specialists.

  • Christian, B. (2020). The alignment problem: Machine learning and human values. W. W. Norton. Explores the gap between what we want AI to do and what it actually does.

  • Mollick, E. (2024). Co-intelligence: Living and working with AI. Portfolio. A practitioner-oriented book on integrating AI into professional work.

  • Shneiderman, B. (2022). Human-centered AI. Oxford University Press. AI systems designed around human control and oversight rather than full automation.

  • Russell, S. J., & Norvig, P. (2021). Artificial intelligence: A modern approach (4th ed.). Pearson. The standard AI textbook for those who want deeper technical grounding.

  • Agrawal, A., Gans, J., & Goldfarb, A. (2018). Prediction machines: The simple economics of artificial intelligence. Harvard Business Review Press. An accessible economic framework for understanding what AI does and doesn’t change about decision-making.

E.9 The Companion Book

This book applies the methodology developed in Conversation, Not Delegation: How to Think With AI, Not Just Use It (Borck, 2025) to business education. That companion book covers the full framework in depth: the Conversation Loop, the VET framework for evaluating AI output, the cognitive traps that undermine critical thinking, and the principle of AI Last. For the underlying rationale and a discipline-neutral treatment, start there.