Created data loader for sikt.no and changed the configuration to be domain...
Created a data loader for sikt.no and made the configuration domain-specific rather than part of the rag package. The sikt-no data loader stays here for now as an example of how it is done, a temporary step while the package is being figured out.
Closes #22 (closed)
Activity
assigned to @anca
- sikt_no_rag/configuration.py 0 → 100644
import asyncio
import random
import sys
from collections.abc import Callable
from typing import Optional


class SiktConfiguration:

Unless the plan for Configuration is something entirely different, I really think it should not even be a class. It doesn't need to be. There is some overlap between these cases in where the data is gathered from, so I can see a use case for making some utility functions for loading available through our package. But I don't think it is worth the effort to abstract and generalize this bit. It will only make the code less readable and complicate interaction with it.
I think that, long term, how the data is loaded should not be a concern of our package. It will vary massively between implementations, so the interface here should be as simple as possible, and a list of Document objects as the user interface to our package seems like the best option.
Configuration should be something different; this bit is now just a data loader and has lost most "configuration"-like properties. So perhaps that should be the change here: remove the class and just have the method get_docs here and in feide, and leave Configuration with the rest (environment reading can be pushed in there later).
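To make the suggestion concrete, here is a rough sketch of what a class-free loader could look like. It is only an illustration under assumptions: the Document dataclass below is a hypothetical stand-in for whatever Document type the package ends up exposing, the signature of get_docs is invented for the example, and fetching is left out (the real loader imports asyncio and presumably fetches pages itself).

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical stand-in for the package's Document type."""

    text: str
    metadata: dict[str, str] = field(default_factory=dict)


def get_docs(pages: dict[str, str]) -> list[Document]:
    """Turn already-fetched pages (URL -> raw text) into Document objects.

    A plain module-level function: no class and no shared state. Pages whose
    text is empty after stripping whitespace are skipped.
    """
    return [
        Document(text=text.strip(), metadata={"source": url})
        for url, text in pages.items()
        if text.strip()
    ]


if __name__ == "__main__":
    # Placeholder URLs and contents, for illustration only.
    example_pages = {
        "https://sikt.no/example-a": "  Some page text.  ",
        "https://sikt.no/example-b": "   ",
    }
    for doc in get_docs(example_pages):
        print(doc.metadata["source"], "->", doc.text)
```

With something like this, the package only ever sees a list of Document objects, and each domain (sikt.no, feide) keeps its own get_docs without sharing a base class.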
mentioned in commit 365a5654