TNG Community

Need to feed entire TNG site contents into an AI LLM, can anyone help?


PISCES TNG


My boss' boss wants the entire contents of our TNG site to augment his favorite AI chatbot, so he can "ask questions about anyone on the site" and have the chatbot answer accurately.

I figure I might be able to cobble something together using Python, BeautifulSoup etc. and feed that output into the LLM, but not sure what to do about all the data in the DB. 

Any ideas?  Has anyone written an agent to allow an LLM like ChatGPT or Claude to dig into the data in a TNG site and answer questions about its content with a degree of accuracy?

If not, does anyone know anyone who can be hired to write this?


Thank you.

Link to comment
Share on other sites

Rob Severijns

No idea where you are from, but here in Europe, where I'm from, there are strict rules and regulations about this (GDPR and other privacy laws and regulations).

If your database holds any data on living or recently deceased people, you're not allowed to publish it, let alone feed it into an AI Large Language Model like ChatGPT.

Once uploaded into an AI LLM it's available to everyone all over the world.

All in all I would advise my boss to think again about what he/she wants.

There's a big difference between wanting something and being allowed to do something.

He/she also needs to be made aware of the consequences of what he/she wants and what the guardrails are.

Best to do that in writing so you won't be blamed afterwards.

If something goes wrong and someone files a complaint or a lawsuit, the first reaction of a boss is usually to say that you didn't tell them about the guardrails, and to try to put the blame on you or make you responsible too.

Obviously you're not allowed to do anything that's against the law, even though we see more and more examples of folks who don't care about what the law says and do things anyway.

 

Link to comment
Share on other sites

Absolutely agree with @Rob Severijns on this. Check out the privacy laws wherever you live. By doing what you suggest you WILL open the business up to some possibly serious lawsuits, not only from individuals but from the governments of countries all around the world.

If your boss's boss wants to go forward with this, then again, as Rob says, put things in writing to keep yourself OUT of any issues. In fact, just DON'T do it, and sue the business if they choose to fire you. You cannot be fired for following the law.

6 hours ago, Rob Severijns said:

Obviously you're not allowed to do anything that's against the law, even though we see more and more examples of folks who don't care about what the law says and do things anyway.

This is partially correct. There are indeed those who think they are above the law, or that it does not apply to them, or who simply don't care. For the most part, though, people don't even understand most laws and inadvertently break them. I don't want to break the flow of the OP's thread, so if folks want to discuss this further, please open a NEW post to do so. And this is just my opinion.

Link to comment
Share on other sites

The privacy concerns have already been addressed, so I'll stick to the technical.

ChatGPT can read GEDCOM files. People have fed it a GEDCOM file and asked it to output biographical info about some individuals in the file. Those tests were done with a relatively small tree, so I'm not sure how it would manage a huge database. But you could exclude living individuals from the GEDCOM export.
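
If the export itself can't leave living people out, a rough sketch of doing it afterwards in Python might look like the block below. It assumes a plain-text GEDCOM file and treats any individual without a level-1 DEAT tag as possibly living. That rule is an assumption made for illustration, not how TNG decides who is living, and the function names are made up here.

```python
def split_records(gedcom_text):
    """Group GEDCOM lines into records; each record starts at a level-0 line."""
    records = []
    for line in gedcom_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("0 ") or not records:
            records.append([line])
        else:
            records[-1].append(line)
    return records


def is_individual(record):
    """True for records whose header looks like '0 @I1@ INDI'."""
    parts = record[0].split()
    return len(parts) >= 3 and parts[2] == "INDI"


def has_death_event(record):
    """True if the record carries a level-1 DEAT tag (with or without a value)."""
    return any(line == "1 DEAT" or line.startswith("1 DEAT ") for line in record)


def drop_possibly_living(gedcom_text):
    """Return the GEDCOM text without INDI records that lack a DEAT tag.

    Assumption: no DEAT tag means 'possibly living'. FAM records are left
    untouched, so they may still point at the IDs that were removed.
    """
    kept = []
    for record in split_records(gedcom_text):
        if is_individual(record) and not has_death_event(record):
            continue
        kept.append("\n".join(record))
    return "\n".join(kept) + "\n"


# Made-up sample data, not taken from any real tree.
SAMPLE = """0 HEAD
1 CHAR UTF-8
0 @I1@ INDI
1 NAME Anna /Example/
1 BIRT
2 DATE 1 JAN 1880
1 DEAT
2 DATE 3 MAR 1950
0 @I2@ INDI
1 NAME Ben /Example/
1 BIRT
2 DATE 5 MAY 1990
0 TRLR
"""

if __name__ == "__main__":
    print(drop_possibly_living(SAMPLE))
```

Erring on the side of dropping anyone without a recorded death loses some long-dead people whose death was never entered, but that seems the safer default when privacy is the concern.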

Link to comment
Share on other sites

Hi there--

Thank you for the input.

Rick M and Rob, re: privacy concerns: I am well aware that this data can't ever go into a public LLM. Without going into detail: we use a fully secure and private implementation of ChatGPT's LLM; information from RAG and our queries never goes into the public model.

Eric D, thank you for your tip as well. I am aware of the GEDCOM export: I used its output as RAG input to the ChatGPT client and showed it to my boss' boss. He didn't think it was complete enough, since the GEDCOM export ignored large parts of the "color" added to TNG objects, like stories told by relatives. I am still stuck on this. Next I will try to see if I can somehow get the info out of a text dump from the SQL DB. Thanks.
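
For anyone trying the same route, here is a rough sketch of the SQL-to-text idea: build one plain-text document per person, stories and notes included, and hand those documents to the RAG indexer. Everything named here is an assumption for illustration. The `people` and `notes` tables and their columns are hypothetical and are not TNG's actual schema, and an in-memory SQLite database stands in for the real MySQL one so the example runs on its own.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with hypothetical tables and made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE people (
            person_id  TEXT PRIMARY KEY,
            full_name  TEXT,
            birth_date TEXT,
            death_date TEXT,
            living     INTEGER
        );
        CREATE TABLE notes (
            note_id   INTEGER PRIMARY KEY,
            person_id TEXT,
            note_text TEXT
        );
        INSERT INTO people VALUES ('I1', 'Anna Example', '1 JAN 1880', '3 MAR 1950', 0);
        INSERT INTO people VALUES ('I2', 'Ben Example', '5 MAY 1990', NULL, 1);
        INSERT INTO notes VALUES (1, 'I1', 'Ran the village bakery for thirty years.');
        INSERT INTO notes VALUES (2, 'I2', 'Private note that must not be exported.');
        """
    )
    return conn


def build_person_documents(conn):
    """Return {person_id: text} for every person not flagged as living."""
    people = conn.execute(
        "SELECT person_id, full_name, birth_date, death_date "
        "FROM people WHERE living = 0 ORDER BY person_id"
    ).fetchall()
    docs = {}
    for person_id, name, birth, death in people:
        lines = [
            f"Name: {name}",
            f"Born: {birth or 'unknown'}",
            f"Died: {death or 'unknown'}",
        ]
        # Attach the free-text "color" that a GEDCOM export may leave out.
        notes = conn.execute(
            "SELECT note_text FROM notes WHERE person_id = ? ORDER BY note_id",
            (person_id,),
        )
        for (note_text,) in notes:
            lines.append(f"Note: {note_text}")
        docs[person_id] = "\n".join(lines)
    return docs


if __name__ == "__main__":
    for person_id, text in build_person_documents(make_demo_db()).items():
        print(f"--- {person_id} ---")
        print(text)
```

One document per person keeps each RAG chunk about a single individual, which should make it easier for the chatbot to answer "who was X" questions without mixing people up.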

Link to comment
Share on other sites

Rob Severijns
4 hours ago, PISCES TNG said:

Rick M and Rob, re: privacy concerns: I am well aware that this data can't ever go into a public LLM. Without going into detail: we use a fully secure and private implementation of ChatGPT's LLM; information from RAG and our queries never goes into the public model.

You didn't mention that earlier but I've seen those implementations before.

Can't help you with the feeding question though.
