AI Agents Web App
This individual project is intended for you to practice deploying AI agents, creating a gateway that selects the correct agent, and building a web app to test it, all deployed on a Duke server.
Your grade will be determined primarily by your programming process rather than the exact functionality you choose to implement:
- using small commits, Tags, and Feature branches in Git to tell your project's story
- using Issues in Gitlab to help manage your project
- creating two or more Docker containers composed together into a single app deployed to your server
- creating a Pipeline in Gitlab to automate basic code Linting, testing, and security auditing
- logging important events in your program appropriately
The goals of the agents are entirely up to you, as is the technology (basic HTML/CSS/JavaScript, a web framework like React or Svelte, a Python framework like Flask or Django, or a Java framework like Spring). With proper attribution, you are free to start with our example code, any of the numerous online tutorials, or the help of an AI Assistant to make an app with at least the following characteristics:
- it selects between at least three different "experts": two simple agents and a complex agent
- it can be parameterized in at least two ways, such as the AI model's "randomness", the AI model used, or the length or style of the response
- it outputs the results in a user-friendly way, such as human-readable text, an image, a table, or a graph (rather than something like raw JSON data)
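A gateway that selects between experts can be as simple as a routing function. The sketch below is only an illustration of the idea, assuming three hypothetical experts and keyword-based selection; your agents, names, and selection logic will differ (the selection could even be another LLM call):

```python
# Minimal gateway sketch: route a user request to one of three "experts".
# The agent names and keyword rules are invented for illustration, not
# requirements from the assignment or starter code.

def weather_agent(text: str, temperature: float = 0.7) -> str:
    return f"[weather expert, temp={temperature}] {text}"

def code_review_agent(text: str, temperature: float = 0.7) -> str:
    return f"[code-review expert, temp={temperature}] {text}"

def research_agent(text: str, temperature: float = 0.7) -> str:
    # A "complex" agent would chain several data/LLM interactions here.
    return f"[research expert, temp={temperature}] {text}"

AGENTS = {
    "weather": weather_agent,
    "review": code_review_agent,
    "research": research_agent,
}

def select_agent(text: str):
    """Pick an expert by keyword; fall back to the complex agent."""
    lowered = text.lower()
    if "forecast" in lowered or "weather" in lowered:
        return AGENTS["weather"]
    if "def " in text or "review" in lowered:
        return AGENTS["review"]
    return AGENTS["research"]

agent = select_agent("What is the weather forecast for Durham?")
print(agent("What is the weather forecast for Durham?", temperature=0.2))
```

The `temperature` parameter stands in for one of the two required parameterizations; a model name or response-length option would plug in the same way.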
You are expected to build your own AI Agent rather than using pre-built ones or low-code tools to build one.
Submission
Use GIT to push your implementation to the main branch of the provided ai_agents_NETID repository hosted in the course's Gitlab group for each Phase:
- Tag the commit representing the submitted version of the code as PhaseN_Complete
- Submit your code, .gitignore, and .gitattributes files, as well as any properly attributed resources (images, sounds, configuration files, etc.) your web app needs
- Submit an updated project README file at the top-level (no separate folder)
- Submit a 2-4 minute video of you using your app, with a voice-over describing what is happening in the code to show you have a basic understanding of how the app works.
There are many free screen capture tools available for any platform or free trials of some pretty powerful tools as well.
- Submit the log output produced from the program's run that you recorded in your video.
Note, name it phaseN_log.txt so it is not excluded by your .gitignore file.
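If your app runs under Docker Compose, one way to produce this file is to redirect the container's console logs after your recorded run; the service name below is a placeholder for whatever your docker-compose.yml defines:

```shell
# Capture the console logs from the recorded run into the submission file.
# "backend" is a placeholder service name -- use the one from your
# docker-compose.yml, and adjust N to the phase number.
docker compose logs backend > phaseN_log.txt
```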
You are responsible for ensuring that all files are correctly pushed to the repository on time.
AI Agent Details
For this project, an AI Agent is anything that helps provide a response to user interaction (not necessarily just the next response in a chat conversation), such as something that:
- uses a weather API (REST or Python) to generate the data to pass to an LLM to output an appropriate sentence or image
- provides information based on Duke-specific data APIs
- interprets sensor data as a description or graph, or highlights trends in it
- answers questions based on information from a database or API
- reviews given code based on a set of guidelines
- finds articles from a set of sources that are relevant to given topics or notes
- summarizes reviews from a set of sources about a book, movie, restaurant, etc.
- finds an item to buy based on a Google search using a given image
- identifies Duke faculty or campus buildings based on a given image
- describes or "tags" a given image
In most cases, each AI Agent will need to keep track of information specific to its purpose, such as:
- system prompt detailing its role, expected input and output, and any other characteristics the response should have
- the criteria by which it should be chosen to provide a response
- specific APIs, databases, or data sets needed to provide a response
- data needed across queries or runs
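One way to keep this per-agent information together is a small structure like the sketch below; the field names are invented for illustration and are not taken from the starter code:

```python
from dataclasses import dataclass, field

# Illustrative container for the per-agent information listed above.
# Field names are assumptions for this sketch, not starter-code names.
@dataclass
class AgentSpec:
    name: str
    system_prompt: str           # role, expected input/output, response style
    selection_keywords: list     # criteria for the gateway to choose it
    data_sources: list = field(default_factory=list)  # APIs, databases, data sets
    state: dict = field(default_factory=dict)         # data kept across queries/runs

weather = AgentSpec(
    name="weather",
    system_prompt="You turn raw forecast data into one friendly sentence.",
    selection_keywords=["weather", "forecast"],
    data_sources=["https://api.weather.gov"],
)
print(weather.name, weather.selection_keywords)
```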
You will not be able to test your AI Agents precisely since each response may be unique, so instead focus on testing that the parts work together correctly, such as:
- are inputs given in the correct format?
- are the "calculations" performed correctly (API call or Database query completes, LLM responds, etc.)?
- are data returned in the correct format?
- are error cases and unexpected inputs properly reported and handled?
You are welcome to test the frontend similarly (e.g., was content placed in the element with the expected HTML id after the appropriate interaction occurred), but it is not necessary.
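Since exact LLM wording is not reproducible, tests can stub the LLM call and check formats and error handling instead. The `run_agent` function below and its contract are hypothetical, meant only to show the pattern; adapt it to your own agent's interface:

```python
# pytest-style checks that the parts fit together, without asserting on
# exact LLM wording. `run_agent` and its behavior are hypothetical.

def run_agent(prompt: str, llm_call) -> dict:
    """Hypothetical agent: validates input, calls the LLM, shapes output."""
    if not prompt.strip():
        raise ValueError("empty prompt")
    reply = llm_call(prompt)
    return {"reply": reply, "length": len(reply)}

def fake_llm(prompt: str) -> str:
    # Stand-in for the real LLM call, so tests are fast and deterministic.
    return "stubbed response"

def test_output_format():
    result = run_agent("hello", fake_llm)
    assert set(result) == {"reply", "length"}
    assert isinstance(result["reply"], str)

def test_empty_input_is_rejected():
    import pytest
    with pytest.raises(ValueError):
        run_agent("   ", fake_llm)
```

Injecting the LLM call as a parameter (or patching it with `unittest.mock`) is what lets the pipeline run these tests without network access or API keys.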
Phase 1
- Deploy the provided starter code in your provided repository on your Duke VCM Server
- Make a Gitlab Pipeline that Lints the code
- Add a simple AI agent to the framework that interacts with an LLM based on data provided by the user (but not necessarily only "chat" data)
Note, simple means one or two basic data/LLM interactions to produce a response with code that is not much more complicated than the provided example
- Add tests for your new AI Agent to your Gitlab Pipeline that test the code directly using pytest (for Python) rather than building the Docker container on Gitlab's server
Note, you do not need to make tests for the starter code unless you choose to program it in a different language.
- Log significant events for your AI Agent using the logging module (for Python) with appropriate levels to the console when run within Docker, to complement Docker's logging tools
Note, you do not need to add logs for the starter code unless you feel it helps you to more easily debug the program
Tag the final committed version of your app as Phase1_Complete
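Putting the pipeline bullets together, a Phase 1 `.gitlab-ci.yml` might look roughly like the sketch below; the image tag, linter choice (flake8), and directory names are assumptions to adapt to your project:

```yaml
# Sketch of a Phase 1 .gitlab-ci.yml -- image tag, linter choice (flake8),
# and paths are assumptions; adapt them to your project layout.
stages:
  - lint
  - test

lint:
  stage: lint
  image: python:3.11
  script:
    - pip install flake8
    - flake8 app/

test:
  stage: test
  image: python:3.11
  script:
    - pip install -r requirements.txt pytest
    - pytest tests/
```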
Phase 2
- Add at least one more simple AI agent and a complex AI agent to the framework that interact with an LLM based on data provided by the user (but not necessarily only "chat" data)
Note, complex means combining several data or LLM interactions to produce a response such as accessing multiple data sources to provide context for the LLM or chaining together multiple LLM outputs to achieve the final result.
Note, these agents must use "live" data sources, such as an API or database, rather than simply using a fixed file.
- Add frontend UI interactions to allow the user to provide values to the backend that are otherwise hardcoded, such as a slider that determines the LLM API's temperature value or a provided URL/search criteria that determines the set of data files to fetch.
Note, this requires that the backend web endpoint(s) be updated to accept additional parameters in the ChatRequest from the POST request created in the frontend.
- Add security checks for your AI Agents to your Gitlab Pipeline that run on the code directly using bandit (for Python) rather than building the Docker container on Gitlab's server
Note, you should try to minimize the number of security issues, but you do not need to fix all of the issues.
- Continue logging and testing the new backend code you write.
Tag the final committed version of your app as Phase2_Complete