Hacker News | jn2clark's comments

How does it compare to previous work on learning to learn? I don't see it referenced: https://arxiv.org/abs/1606.04474


With binary representations you still get 2^D possible configurations, so it's entirely possible from a representation perspective. The main issue (I think, at least) is around determining similarity. Hamming distance gives an output space of only D + 1 possible scores. As mentioned in the article, going to 0/1 with cosine gives better granularity, as it now penalizes embeddings that have differing numbers of positive elements (i.e. that live on different hyper-spheres). It is probably well suited to retrieval where there is a 1:1 correspondence between query and document, but if the degeneracy of queries is large there could be issues discriminating between similar documents. Regimes that combine binary and (small) dense embeddings could be quite good. I expect a lot more innovation in this space.
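A rough numpy sketch of that granularity point (purely illustrative, not from the article): with D-bit vectors Hamming distance can only take D + 1 values, while cosine on the same 0/1 vectors also depends on how many ones each embedding has, so it separates embeddings that live on different hyper-spheres.

    import numpy as np

    rng = np.random.default_rng(0)
    D = 64
    docs = (rng.random((1000, D)) > 0.5).astype(np.int8)  # binary embeddings in {0, 1}
    q = (rng.random(D) > 0.5).astype(np.int8)              # a binary query

    # Hamming distance: at most D + 1 distinct scores (0 .. D)
    hamming = (docs != q).sum(axis=1)
    print(len(np.unique(hamming)))

    # Cosine on the same 0/1 vectors: the score depends on the dot product
    # *and* on each vector's number of ones (its norm), so embeddings with
    # different numbers of positive elements are no longer tied
    cosine = (docs @ q) / (np.linalg.norm(docs, axis=1) * np.linalg.norm(q) + 1e-9)
    print(len(np.unique(cosine.round(6))))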


That's a great question. I think regimes like that could offer better trade-offs of memory/latency/retrieval performance, although I don't know what they are right now. It also assumes that going to larger dimensions can preserve more of the full-precision performance, which is still TBD. The other thing is how the binary embeddings play with ANN algorithms like HNSW (i.e. recall); with Hamming distance the space of similarity scores is quite limited.
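For what it's worth, one way to sanity-check that interaction is to compare exact Hamming search against a binary HNSW index and measure recall directly. A minimal sketch with faiss's binary indexes (the dimensions and HNSW parameters are just placeholders):

    import numpy as np
    import faiss

    rng = np.random.default_rng(0)
    D = 256                                  # dimension in bits, must be a multiple of 8
    xb = np.packbits(rng.random((10000, D)) > 0.5, axis=1)  # packed uint8 database
    xq = np.packbits(rng.random((100, D)) > 0.5, axis=1)    # packed uint8 queries

    k = 10
    flat = faiss.IndexBinaryFlat(D)          # exact Hamming search (ground truth)
    flat.add(xb)
    _, gt = flat.search(xq, k)

    hnsw = faiss.IndexBinaryHNSW(D, 32)      # approximate Hamming search via HNSW
    hnsw.add(xb)
    _, ann = hnsw.search(xq, k)

    recall = np.mean([len(set(gt[i]) & set(ann[i])) / k for i in range(len(xq))])
    print(f"recall@{k}: {recall:.3f}")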


I would love an LLM agent that could generate small API examples (reliably) from a repo like this for the various models and ways to use them.


What is accuracy in this case? Is it meant to be recall, or some other evaluation metric?


Yeah, it is recall.


We (Marqo) are doing a lot on 1 and 2. There is a huge amount to be done on the ML side of vector search and we are investing heavily in it. I think it has not quite sunk in that vector search systems are ML systems, with everything that comes with that. I would love to chat about 1 and 2, so feel free to email me (email is in my profile).


Take a look here: https://github.com/marqo-ai/local-image-search-demo. It is based on https://github.com/marqo-ai/marqo. We work on a lot of image search applications. Feel free to reach out if you have other questions (email in profile).
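For context, a minimal sketch of what indexing and searching images looks like with the Marqo Python client; the model name, field names, and exact arguments here are assumptions from memory rather than taken from the demo repo, so check its README for the real setup:

    import marqo

    mq = marqo.Client(url="http://localhost:8882")   # local Marqo instance

    # assumed settings: a CLIP-style model so text queries can retrieve images
    mq.create_index("my-image-index",
                    treat_urls_and_pointers_as_images=True,
                    model="ViT-L/14")

    mq.index("my-image-index").add_documents(
        [{"image": "https://example.com/photo.jpg", "caption": "a dog on a beach"}],
        tensor_fields=["image"],
    )

    results = mq.index("my-image-index").search("dog playing in the sand")
    print(results["hits"][0])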


That does indeed look pretty interesting, but I still feel it's not very convenient for use in a desktop environment with local files. That is of course not a criticism of the project itself, since I assume it simply targets different use cases and audiences.

In the meantime I also looked into whether such functionality could be implemented at all for GNOME Shell and, more specifically, its file browser, but the search and extension APIs either don't allow it or would require many hacks.


Can anyone comment on an open-source multi-modal LLM that can produce structured outputs based on an image? I have not found a good open-source one yet (this one included); it seems only closed-source models can do this reliably well. Any suggestions are very welcome!


Something like this?

https://imgur.com/a/hPAaZUv

https://huggingface.co/spaces/Qwen/Qwen-VL-Plus

You can also ask it to give you bounding boxes of objects.


I've only used LLaVA / BakLLaVA. It falls under the LLAMA 2 Community License. Not sure if you consider that open source or not.


That sounds much longer than it should. I am not sure of your exact use case, but I would encourage you to check out Marqo (https://github.com/marqo-ai/marqo - disclaimer, I am a co-founder). All inference and orchestration is included (no API calls) and many open-source or fine-tuned models can be used.


> That [pgvector index creation time] sounds much longer than it should... I would encourage you to check out Marqo

Your comment makes it sound like Marqo is a way to speed up pgvector indexing, but to be clear, Marqo is just another Vector Database and is unrelated to pgvector.


Fair enough, apologies for the confusion!


The reason I would use pgvector is that I am not interested in adding another piece of infrastructure.


Try https://github.com/marqo-ai/marqo, which handles all the chunking for you (and is configurable). It also handles chunking of images in an analogous way. This enables highlighting within longer docs, and within images, in a single retrieval step.
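As a rough illustration of that configurability (the settings keys below are from memory of older Marqo versions and may have changed, so treat them as a sketch and check the repo's README):

    import marqo

    mq = marqo.Client(url="http://localhost:8882")

    # assumed settings: chunk text into overlapping two-sentence pieces and split
    # images into patches so a highlight can point at a region of the image
    settings = {
        "index_defaults": {
            "model": "ViT-L/14",
            "treat_urls_and_pointers_as_images": True,
            "text_preprocessing": {
                "split_length": 2,
                "split_overlap": 1,
                "split_method": "sentence",
            },
            "image_preprocessing": {"patch_method": "simple"},
        }
    }
    mq.create_index("my-chunked-index", settings_dict=settings)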

