embedder

Ranter

fullstackcircus

9498

Comments

1

fullstackcircus

9498

2y

Ok, first pothole: Pinecone vector uploads don't work. i go to look into it and find i might as well be reading Chinese:

https://docs.pinecone.io/docs/...

haha, and everything i read essentially boils down to "well, just brute-force every combination and pick the one that works best!"

seems like we've really made a lot of progress in ML 🙄
1

fullstackcircus

9498

2y

PineconeError: undefined

welcome to hell
2

fullstackcircus

9498

2y

"The session with the highest return is the session on Tuesday, March 14, 2023, with a return of 0.24%."

yeah, i don't think so buddy

sigh... just about what I expected... absolute garbage
1

fullstackcircus

9498

2y

it's somewhat decent if you ask exactly the right questions, but it hallucinates a bit and gets stuff wrong... need to research how i can fine-tune it
2

fullstackcircus

9498

2y

@atheist BuT tHeY tOLd mE iT wAs A TriLlIoN dOLLaR iNduStRy!!!!
0

fullstackcircus

9498

2y

I had an idea that instead of trying to "find" the answers itself, it would write the CODE to find the answers (the SQL query or similar), and then THAT could be the specific data for the answer (could use another step of LLM to actually summarize the results)

from my experience it never hurts to add deterministic steps mixed in with AI models
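A minimal sketch of that idea, assuming a `sessions` table with made-up rows and a hypothetical `fake_llm_write_sql` function standing in for the real model call (none of these names come from an actual library). The model only writes the query; the database computes the answer deterministically, and a final step turns the rows into a sentence:

```python
import sqlite3


def fake_llm_write_sql(question: str) -> str:
    """Hypothetical stand-in for an LLM call that turns a question into SQL."""
    # A real model would generate this text; it is hard-coded so the sketch runs offline.
    return 'SELECT day, "return" FROM sessions ORDER BY "return" DESC LIMIT 1;'


def run_readonly(conn: sqlite3.Connection, sql: str) -> list:
    """Deterministic step: accept a single SELECT only, then let the database do the math."""
    stripped = sql.strip().rstrip(";")
    if not stripped.lower().startswith("select") or ";" in stripped:
        raise ValueError("only a single SELECT statement is allowed")
    return conn.execute(stripped).fetchall()


def answer(conn: sqlite3.Connection, question: str) -> str:
    sql = fake_llm_write_sql(question)
    rows = run_readonly(conn, sql)
    # A second LLM call could summarize `rows`; a plain format string is used here.
    day, ret = rows[0]
    return f"Highest return: {day} ({ret:.2f}%)"


# Made-up sample data, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE sessions (day TEXT, "return" REAL)')
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?)",
    [("2023-01-03", 0.24), ("2023-01-04", 1.10), ("2023-01-05", -0.50)],
)
print(answer(conn, "Which session had the highest return?"))
```

The point of the split is that the number in the answer comes from the query result, not from the model's guess, so a wrong answer can be traced back to a wrong query.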
1

hack

6148

2y

How does that even work? I mean using Pinecone with OpenAI. Are you using chat completions?
1

fullstackcircus

9498

2y

@hack saw this on hackernews, it's a framework where you can just use typescript to connect all the pieces: https://github.com/axilla-io/ax/...

you use Pinecone to store your documents in vector form and OpenAI's embedder to actually make sense of them (from a machine's perspective)

but like i said, the model's responses themselves aren't working too well on the first try
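For anyone wondering what "store documents in vector form" amounts to, here is a rough sketch of the retrieval step. It does not call Pinecone or OpenAI: `toy_embed` is a hypothetical bag-of-words stand-in for a real embedding model, and `ToyVectorStore` is an in-memory stand-in for a vector database. The sample documents are made up.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """In-memory stand-in for a vector database."""

    def __init__(self) -> None:
        self.items = []  # list of (doc_id, text, vector)

    def upsert(self, doc_id: str, text: str) -> None:
        self.items.append((doc_id, text, toy_embed(text)))

    def query(self, question: str, top_k: int = 1) -> list:
        q = toy_embed(question)
        ranked = sorted(self.items, key=lambda it: cosine(q, it[2]), reverse=True)
        return [(doc_id, text) for doc_id, text, _ in ranked[:top_k]]


store = ToyVectorStore()
store.upsert("a", "trading session returns by day")
store.upsert("b", "office lunch menu for friday")

# Retrieve the closest document and paste it into the prompt for the chat model.
context = store.query("which trading session had the best returns")
prompt = f"Answer using only this context: {context[0][1]}"
print(prompt)
```

A real embedder captures meaning rather than word overlap, but the flow is the same: embed the documents, embed the question, fetch the nearest documents, and hand them to the model as context.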
0

max19931

355

2y

OCR the docs to get the text, feed it into MongoDB (or another document-based database), and just query it manually.

No need for a fancy algorithm that just does the same job.
0

fullstackcircus

9498

2y

@max19931 ... why would you use OCR when 1. you already have the literal text in file form

2. you could put the texts in a full-text search in postgres

the power of a chat based way to query data is that you can have a moving context window and have some intelligence behind it to dig deeper or show related things. you can't quite do that with a full-text search / regex based query
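A tiny sketch of what a moving context window could look like, assuming a hypothetical `ChatWindow` helper that keeps only the last few turns and prepends them to each new question (the name and the turn limit are illustrative, not from any framework):

```python
from collections import deque


class ChatWindow:
    """Keeps the most recent turns so follow-up questions carry their context."""

    def __init__(self, max_turns: int = 4) -> None:
        # deque with maxlen silently drops the oldest turn when full
        self.turns = deque(maxlen=max_turns)

    def add(self, role: str, text: str) -> None:
        self.turns.append((role, text))

    def as_prompt(self, question: str) -> str:
        history = "\n".join(f"{role}: {text}" for role, text in self.turns)
        return f"{history}\nuser: {question}" if history else f"user: {question}"


window = ChatWindow(max_turns=2)
window.add("user", "which session had the highest return?")
window.add("assistant", "the second one")
window.add("user", "ok thanks")  # pushes the oldest turn out of the window

# The follow-up only makes sense because earlier turns ride along in the prompt.
print(window.as_prompt("and the lowest?"))
```

A full-text or regex query has no equivalent of this: "and the lowest?" on its own matches nothing useful.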
0

max19931

355

2y

@fullstackcircus i thought you had scanned financial records on paper.

But if the data is already extracted, it's simply a database query. Very old school: a relational database structure.
0

fullstackcircus

9498

2y

@max19931 a financial scrub is not going to be able to write / query "SELECT * FROM sessions ORDER BY "return" DESC LIMIT 1;"

thus the whole point of embedding / RAG / chat based applications. it enables the non-programmer / non-engineer to do programming / engineering type things