LangChain is a framework built around Large Language Models.
I used LangChain to create a medical report application.
Streamlit is an open-source Python library used to create web applications for data science and machine learning projects.
I used streamlit to create the application's user interface.
app0.py
hello world streamlit app
I updated my hello world application to call OpenAI's API.
I had to request a secret API key.
secret key
app1.py
Using LangChain to call OpenAI's API.
I used LangChain to return the text found on each page of a sample medical report.
https://www.med.unc.edu/medclerk/wp-content/uploads/sites/877/2018/10/hp4.pdf
hp4.pdf medical report
To improve performance and reduce costs I pre-processed the pdf file and stored the result.
I created a Jupyter Notebook to keep track of the steps I followed.
Running jupyter-notebook locally
I loaded pages from the medical report.
[1]
I broke the pages into paragraphs (texts).
[2] & [3]
I used Chroma to create an embeddings vector store and saved the store locally.
I used OpenAI's API to generate the embeddings
[4]
I converted the query 'Does the patient smoke?' to an embedding and compared the result with the embeddings in the vector store.
[5]
I ensured that the embeddings vector store could be loaded from the local folder.
[6]
I located the most similar paragraphs and sent those (see Context Injection) and the query to OpenAI's servers.
[7]
The streamlit code
app2.py
The finished application