James Faulkner – September 13, 2024
Working on a literature review? Want to search your documents, presentations, and spreadsheets by asking questions in plain language? Want to offer students a better FAQ experience with important documents (e.g., course syllabus, Endicott policies)? Increasingly, foundation model providers (OpenAI with ChatGPT, Anthropic with Claude, etc.) offer services that let you “chat with your data,” an ability once confined to code-based approaches employing retrieval-augmented generation (RAG). Let’s take a look at some of the more accessible options available now.
NB: As with any third-party tool, first make sure that student data in any file you upload is de-identified or, better yet, excluded altogether.
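As a rough illustration of de-identifying text before upload, here is a minimal Python sketch. The patterns are assumptions for the sake of example: the 7-digit student ID format is hypothetical, and pattern matching like this will not catch names or other identifying details, so a manual review is still needed.

```python
import re

# Assumed patterns; adjust them to match your institution's real formats.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
STUDENT_ID = re.compile(r"\b\d{7}\b")  # hypothetical 7-digit student ID


def scrub(text):
    """Replace email addresses and ID-like numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    text = STUDENT_ID.sub("[ID]", text)
    return text


print(scrub("Contact jdoe@example.edu, ID 1234567."))
```

Running the scrubbed text past a second reader before uploading is a sensible extra check, since names and free-text details pass through this function unchanged.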
Typically, you will begin by creating a knowledge base of relevant files that you would like the LLM to know well. You may include “system instructions” that tell the LLM how to handle your data sources and how to interact with the user. You may have heard of OpenAI’s custom GPTs, which recently became free to use (though not to create), but a number of other options are now available.
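To make the idea concrete, here is a toy Python sketch of what these services do behind the scenes: pick the most relevant file for a question, then combine it with the system instructions into a prompt for the LLM. Everything in it is an assumption for illustration. The file names, their contents, and the simple word-overlap scoring are hypothetical, real services use far more capable retrieval, and no LLM is actually called.

```python
import re
from collections import Counter

# Hypothetical knowledge base: file name -> text extracted from that file.
KNOWLEDGE_BASE = {
    "syllabus.txt": "Late work loses ten percent per day. "
                    "Office hours are Tuesday at 2 pm.",
    "policies.txt": "Students must cite all sources. "
                    "Attendance is required for labs.",
}

# Hypothetical system instructions telling the LLM how to use the sources.
SYSTEM_INSTRUCTIONS = (
    "Answer only from the provided excerpts. "
    "If the answer is not in them, say you do not know."
)


def tokenize(text):
    """Lowercase the text and split it into words and numbers."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(question, passage):
    """Count how many question words also appear in the passage."""
    q = Counter(tokenize(question))
    p = Counter(tokenize(passage))
    return sum(min(q[word], p[word]) for word in q)


def retrieve(question, kb, top_k=1):
    """Return the top_k (file name, text) pairs ranked by word overlap."""
    ranked = sorted(kb.items(),
                    key=lambda item: score(question, item[1]),
                    reverse=True)
    return ranked[:top_k]


def build_prompt(question, kb, top_k=1):
    """Assemble system instructions, retrieved excerpts, and the question."""
    excerpts = retrieve(question, kb, top_k)
    context = "\n".join(f"[{name}] {text}" for name, text in excerpts)
    return (f"{SYSTEM_INSTRUCTIONS}\n\n"
            f"Excerpts:\n{context}\n\n"
            f"Question: {question}")


print(build_prompt("When are office hours?", KNOWLEDGE_BASE))
```

The tools below handle all of this for you; the sketch only shows why the quality of your uploaded files and system instructions shapes the answers you get.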
Watch this video to see how I used Google’s NotebookLM to create an assistant similar to a custom GPT.
- Custom GPTs (OpenAI)
  - Requires a ChatGPT Plus subscription to create; however, any user with a free OpenAI account may use a custom GPT
- NotebookLM (Google)
  - Free
  - Currently under development, so accessible only with a personal Gmail account
  - Can invite others to a notebook (as with a Google Doc)
  - Google’s Gemini LLM excels at searching long documents
- HuggingFace Assistants
  - Free
  - Easily shareable; often no login is required of the user
  - Cannot take files directly, but can search over (scrape) websites that contain relevant information
  - Option to select more efficient (lower-carbon-footprint) models, e.g., Microsoft’s Phi-3
- Claude Projects (Anthropic)
  - Requires a Claude Pro subscription
  - Not easily shareable