https://docs./en/stable/understanding/putting_it_all_together/apps/fullstack_app_guide/

LlamaIndex is a Python library, which means that integrating it with a full-stack web application will be a little different than what you might be used to.

This guide walks through the steps needed to create a basic API service written in Python, and shows how it interacts with a TypeScript+React frontend.

All code examples here are available from the llama_index_starter_pack in the flask_react folder.

The main technologies used in this guide are Python, LlamaIndex, Flask, TypeScript, and React.

Flask Backend

For this guide, our backend will use a Flask API server to communicate with our frontend code. If you prefer, you can also easily translate this to a FastAPI server, or any other Python server library of your choice.

Setting up a server using Flask is easy. You import the package, create the app object, and then create your endpoints. Let's create a basic skeleton for the server first:

    from flask import Flask

    app = Flask(__name__)


    @app.route("/")
    def home():
        return "Hello World!"


    if __name__ == "__main__":
        app.run(host="0.0.0.0", port=5601)

flask_demo.py

If you run this file, it starts a server on port 5601 whose root route returns "Hello World!".

The next step is deciding what functions we want to include in our server, and to start using LlamaIndex.

To keep things simple, the most basic operation we can provide is querying an existing index. Using the Paul Graham essay from LlamaIndex, create a documents folder, then download the essay text file and place it inside.

Basic Flask - Handling User Index Queries

Now, let's write some code to initialize our index:

    import os

    from llama_index.core import (
        SimpleDirectoryReader,
        StorageContext,
        VectorStoreIndex,
        load_index_from_storage,
    )

    # NOTE: for local testing only, do NOT deploy with your key hardcoded
    os.environ["OPENAI_API_KEY"] = "your key here"

    index = None
    # Folder where the index is persisted. The snippet as published used
    # index_dir without defining it; "./.index" is an assumed location.
    index_dir = "./.index"


    def initialize_index():
        global index
        if os.path.exists(index_dir):
            # Reload the previously persisted index from disk.
            storage_context = StorageContext.from_defaults(persist_dir=index_dir)
            index = load_index_from_storage(storage_context)
        else:
            # First run: build the index from ./documents and persist it.
            storage_context = StorageContext.from_defaults()
            documents = SimpleDirectoryReader("./documents").load_data()
            index = VectorStoreIndex.from_documents(
                documents, storage_context=storage_context
            )
            storage_context.persist(index_dir)

This function will initialize our index.
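If we call this just before starting the Flask server, our index will be ready for user queries.

Our query endpoint will accept GET requests with the query text passed as the text URL parameter:

    from flask import request


    @app.route("/query", methods=["GET"])
    def query_index():
        global index
        query_text = request.args.get("text", None)
        if query_text is None:
            return (
                "No text found, please include a ?text=blah parameter in the URL",
                400,
            )
        query_engine = index.as_query_engine()
        response = query_engine.query(query_text)
        return str(response), 200

This introduces a few new concepts to our server: the request object imported from Flask, the request.args.get call that reads the text URL parameter, and the status code (400 or 200) returned alongside the response body.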
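
As a small illustration of how a client would call this endpoint, the sketch below builds the request URL with the Python standard library. The helper name build_query_url and the localhost base URL are assumptions for local testing; only the /query route, port 5601, and the text parameter come from the server code above.

```python
from urllib.parse import urlencode

# Assumed base URL for a server running locally on the port used in this guide.
BASE_URL = "http://localhost:5601"


def build_query_url(query_text: str, base_url: str = BASE_URL) -> str:
    """Return the /query URL with the question encoded as the text parameter."""
    # urlencode escapes spaces and punctuation so the question survives the trip.
    return f"{base_url}/query?{urlencode({'text': query_text})}"


if __name__ == "__main__":
    print(build_query_url("What did the author do growing up?"))
```

Pasting the printed URL into a browser, or passing it to urllib.request.urlopen, should send the GET request once the server is running.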

A full query that you can test in your browser has the form http://localhost:5601/query?text=<your question>, since the server listens on port 5601 and reads the text parameter.

Things are looking pretty good! We now have a functional API. Using your own documents, you can easily provide an interface for any application to call the Flask API and get answers to queries.

Advanced Flask - Handling User Document Uploads

Things are looking pretty cool, but how can we take this a step further? What if we want to allow users to build their own indexes by uploading their own documents? Have no fear, Flask can handle it all :muscle:.

To let users upload documents, we have to take some extra precautions. Instead of querying an existing index, the index will become mutable. If many users add to the same index, we need to think about how to handle concurrency. Our Flask server is threaded, which means multiple users can send requests at the same time, and those requests will be handled concurrently.

One option might be to create an index for each user or group, and store and fetch things from S3. But for this example, we will assume there is one locally stored index that users are interacting with.
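
To make the concurrency concern concrete, here is a minimal sketch that uses a plain Python list as a stand-in for the index. SharedIndex and simulate_uploads are hypothetical names, not LlamaIndex APIs; the point is only that a lock forces inserts from many threads to happen one at a time.

```python
import threading


class SharedIndex:
    """Toy stand-in for a mutable index shared by several request threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self.documents = []
        self.insert_count = 0

    def insert(self, document: str) -> None:
        # Only one thread at a time may run this block, so the list and the
        # counter are always updated together.
        with self._lock:
            self.documents.append(document)
            self.insert_count += 1


def simulate_uploads(index: SharedIndex, users: int, docs_per_user: int) -> None:
    """Start one thread per user, each inserting several documents."""

    def upload(user_id: int) -> None:
        for n in range(docs_per_user):
            index.insert(f"user{user_id}-doc{n}")

    threads = [threading.Thread(target=upload, args=(u,)) for u in range(users)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Calling simulate_uploads(SharedIndex(), users=8, docs_per_user=50) leaves every document in the list exactly once, which is the property we want from the real index as well.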

To handle concurrent uploads and ensure sequential inserts into the index, we can use the BaseManager class from Python's multiprocessing package together with a lock. All index operations move into a separate index server process, which the Flask server calls.

Here's a basic example of what our index_server.py will look like:

    import os
    from multiprocessing import Lock
    from multiprocessing.managers import BaseManager

    from llama_index.core import SimpleDirectoryReader, VectorStoreIndex, Document

    # NOTE: for local testing only, do NOT deploy with your key hardcoded
    os.environ["OPENAI_API_KEY"] = "your key here"

    index = None
    lock = Lock()


    def initialize_index():
        global index

        with lock:
            # same as before ...
            pass


    def query_index(query_text):
        global index
        query_engine = index.as_query_engine()
        response = query_engine.query(query_text)
        return str(response)


    if __name__ == "__main__":
        # init the global index
        print("initializing index...")
        initialize_index()

        # setup server
        # NOTE: you might want to handle the password in a less hardcoded way
        manager = BaseManager(("", 5602), b"password")
        manager.register("query_index", query_index)
        server = manager.get_server()

        print("starting server...")
        server.serve_forever()

index_server.py

So, we've moved our functions, introduced the Lock object, registered query_index with the manager, and started the manager server on port 5602.

Then, we can adjust our Flask code as follows:

    from multiprocessing.managers import BaseManager

    from flask import Flask, request

    # initialize manager connection
    # NOTE: you might want to handle the password in a less hardcoded way
    manager = BaseManager(("", 5602), b"password")
    manager.register("query_index")
    manager.connect()

    # the app object must exist before the route decorators below run
    app = Flask(__name__)


    @app.route("/query", methods=["GET"])
    def query_index():
        query_text = request.args.get("text", None)
        if query_text is None:
            return (
                "No text found, please include a ?text=blah parameter in the URL",
                400,
            )
        # the manager returns a proxy; _getvalue() fetches the actual result
        response = manager.query_index(query_text)._getvalue()
        return str(response), 200


    @app.route("/")
    def home():
        return "Hello World!"


    if __name__ == "__main__":
        app.run(host="0.0.0.0", port=5601)

flask_demo.py

The two main changes are connecting to our existing manager server and calling query_index through the manager in the /query endpoint.

One special thing to note is that a call through the manager returns a proxy rather than the value itself, so we call _getvalue() to get the actual result.

If we allow users to upload their own documents, we should probably remove the Paul Graham essay from the documents folder, so
let's do that first. Then, let's add an endpoint to upload files! First, let's define our Flask endpoint function:

    # also needed at the top of flask_demo.py
    import os

    from werkzeug.utils import secure_filename

    ...
    manager.register("insert_into_index")
    ...


    @app.route("/uploadFile", methods=["POST"])
    def upload_file():
        global manager
        if "file" not in request.files:
            return "Please send a POST request with a file", 400

        filepath = None
        try:
            uploaded_file = request.files["file"]
            filename = secure_filename(uploaded_file.filename)
            filepath = os.path.join("documents", os.path.basename(filename))
            uploaded_file.save(filepath)

            if request.form.get("filename_as_doc_id", None) is not None:
                manager.insert_into_index(filepath, doc_id=filename)
            else:
                manager.insert_into_index(filepath)
        except Exception as e:
            # cleanup temp file
            if filepath is not None and os.path.exists(filepath):
                os.remove(filepath)
            return "Error: {}".format(str(e)), 500

        # cleanup temp file
        if filepath is not None and os.path.exists(filepath):
            os.remove(filepath)

        return "File inserted!", 200

Not too bad! You will notice that we write the file to disk. We could skip this if we only accepted basic file formats like plain text, but writing to disk lets SimpleDirectoryReader handle the file loading. The optional filename_as_doc_id form field decides whether the filename is used as the document ID.

With these more complicated requests, I also suggest using a tool like Postman. Examples of using Postman to test our endpoints are in the repository for this project.

Lastly, you'll notice we added a new function to the manager. Let's implement that inside index_server.py:

    def insert_into_index(doc_text, doc_id=None):
        # doc_text is the path of the uploaded file on disk
        global index
        document = SimpleDirectoryReader(input_files=[doc_text]).load_data()[0]
        if doc_id is not None:
            document.doc_id = doc_id

        # hold the lock so inserts and persists happen one at a time
        with lock:
            index.insert(document)
            # persist() with no argument writes to the default storage folder
            index.storage_context.persist()


    ...
    manager.register("insert_into_index", insert_into_index)
    ...

Easy! If we launch index_server.py and then flask_demo.py, we have a Flask API server that can insert documents into the index and answer user queries.

To support some functionality in the frontend, I've adjusted what some responses look like from the Flask API, as well as added some functionality to keep track of which documents are stored in the index (LlamaIndex doesn't currently support this in a user-friendly way, but we can augment it ourselves!).
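
The pieces described above (a manager that owns the state, a lock around inserts, and a small registry of stored documents) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code from the repository: TrackerManager, insert_document, get_documents_list, start_server, and connect are hypothetical names, a dict stands in for the LlamaIndex index, and the server runs in a background thread here only so the example fits in one file.

```python
import threading
from multiprocessing.managers import BaseManager

# In-memory stand-in for the index plus the "which documents are stored" tracker.
stored_docs = {}
lock = threading.Lock()


def insert_document(doc_id, text):
    """Record a document under the lock so inserts happen one at a time."""
    with lock:
        stored_docs[doc_id] = text[:200]  # keep only a short preview
    return doc_id


def get_documents_list():
    """Return the tracked documents as a list of {id, text} dicts."""
    with lock:
        return [{"id": key, "text": value} for key, value in stored_docs.items()]


class TrackerManager(BaseManager):
    """Hypothetical subclass, so registrations stay off BaseManager itself."""


TrackerManager.register("insert_document", insert_document)
TrackerManager.register("get_documents_list", get_documents_list)


def start_server(authkey=b"password"):
    """Serve the registered functions from a background thread."""
    manager = TrackerManager(("127.0.0.1", 0), authkey)  # port 0: any free port
    server = manager.get_server()
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server.address


def connect(address, authkey=b"password"):
    """Connect a client to a running manager server."""
    client = TrackerManager(address, authkey)
    client.connect()
    return client
```

A client would call connect(start_server()) and then, for example, client.get_documents_list()._getvalue(). Every call returns a proxy, so _getvalue() is needed to obtain the plain Python value, just as in the /query endpoint.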

Lastly, I had to add CORS support to the server. Check out the complete flask_demo.py and index_server.py scripts in the repository for the final changes.

React Frontend

React and TypeScript are among the most popular libraries and languages for writing web apps today. This guide will assume you are familiar with how these tools work, because otherwise this guide would triple in length :smile:.

In the repository, the frontend code is kept in its own folder. The most relevant part of the frontend will be the API calls to the Flask server, which cover three queries: fetching the list of documents, querying the index, and uploading a document.

Using these three queries, we can build a robust frontend that allows users to upload and keep track of their files, query the index, and view the query response and information about which text nodes were used to form the response.

fetchDocuments.tsx

This file contains the function to, you guessed it, fetch the list of current documents in the index. The code is as follows:

    export type Document = {
      id: string;
      text: string;
    };

    const fetchDocuments = async (): Promise<Document[]> => {
      const response = await fetch("http://localhost:5601/getDocuments", {
        mode: "cors",
      });

      if (!response.ok) {
        return [];
      }

      const documentList = (await response.json()) as Document[];
      return documentList;
    };

As you can see, we make a query to the Flask server (here, it is assumed to be running on localhost). Notice that we need to include the mode: "cors" option, since the request goes to a different origin than the one serving the frontend.

Then, we check if the response was ok, and if so, get the response JSON and return it. Here, the response JSON is a list of Document objects, whose type is defined above the fetch function.

queryIndex.tsx

This file sends the user query to the Flask server, and gets the response back, as well as details about which nodes in our index provided the response.

    export type ResponseSources = {
      text: string;
      doc_id: string;
      start: number;
      end: number;
      similarity: number;
    };

    export type QueryResponse = {
      text: string;
      sources: ResponseSources[];
    };

    const queryIndex = async (query: string): Promise<QueryResponse> => {
      // Start from the bare endpoint. A preset "?text=1" would add a second
      // text parameter, and the server reads only the first one.
      const queryURL = new URL("http://localhost:5601/query");
      queryURL.searchParams.append("text", query);

      const response = await fetch(queryURL, { mode: "cors" });
      if (!response.ok) {
        return { text: "Error in query", sources: [] };
      }

      const queryResponse = (await response.json()) as QueryResponse;
      return queryResponse;
    };

    export default queryIndex;

This is similar to fetchDocuments.tsx, with the main difference being that we include the query text as a parameter in the URL.

insertDocument.tsx

Probably the most complex API call is uploading a document.
The function here accepts a file object and constructs a POST request using FormData.

The actual response text is not used in the app, but it could be used to tell the user whether the upload succeeded.

    const insertDocument = async (file: File) => {
      const formData = new FormData();
      formData.append("file", file);
      formData.append("filename_as_doc_id", "true");

      const response = await fetch("http://localhost:5601/uploadFile", {
        mode: "cors",
        method: "POST",
        body: formData,
      });

      const responseText = await response.text();
      return responseText;
    };

    export default insertDocument;

All the Other Frontend Good-ness

And that pretty much wraps up the frontend portion! The rest of the React frontend code is some pretty basic React components, and my best attempt to make it look at least a little nice :smile:.

I encourage you to read the rest of the codebase and submit any PRs for improvements!

Conclusion

This guide has covered a ton of information. We went from a basic "Hello World" Flask server written in Python to a fully functioning LlamaIndex-powered backend, and saw how to connect that to a frontend application.

As you can see, we can easily augment and wrap the services provided by LlamaIndex (like the little external document tracker) to help provide a good user experience on the frontend.

You could take this and add many features (multi-index/user support, saving objects into S3, adding a Pinecone vector server, etc.). And when you build an app after reading this, be sure to share the final result in the Discord! Good Luck! :muscle: